Add is_primary_output column to process_flows.csv #657
Pull Request Overview
This PR adds an optional is_primary_output column to the process_flows.csv file to support identifying a primary output for each process. Key changes include:
- Adding an is_primary_output field to the ProcessFlow struct and updating its related tests.
- Enhancing CSV reading logic to parse the new column and infer/validate primary outputs.
- Updating sample CSV files to include the new column.
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/process.rs | Added is_primary_output field and updated tests accordingly |
| src/input/process/flow.rs | Updated CSV parser to include is_primary_output and added new primary output validation logic |
| src/input/process.rs | Adjusted tests to include is_primary_output in process flows |
| examples/simple_mc/process_flows.csv | Added is_primary_output column to CSV sample |
| examples/simple/process_flows.csv | Added is_primary_output column to CSV sample |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #657 +/- ##
==========================================
+ Coverage 88.47% 88.57% +0.09%
==========================================
Files 39 39
Lines 3558 3623 +65
Branches 3558 3623 +65
==========================================
+ Hits 3148 3209 +61
- Misses 219 221 +2
- Partials 191 193 +2
tsmbland left a comment
I think it would have been a lot cleaner to have this as a column in the processes.csv table (since there's one per process and it should never vary by region and year), but I guess this is ok too
Oh really? That was my initial idea, but then I remembered we do vary flows by region and year, so thought this was the best way to do it. It would be good to know what the "rules" are for varying flows, e.g. currently we allow users to vary what commodities are input/output by region and year, not just the magnitude of flows, which seems a bit odd.
Yeah, I think the primary output is core to the definition of the process, so it wouldn't make sense to vary this by region and year, and I think it's confusing to make this an option. I guess it doesn't make sense to vary the input/output commodities either, but someone could conceivably want to set one of these flows to zero in a particular year, which we don't permit, so the only option in this case would be to remove it.
Maybe let's revisit this later then. I'll open an issue.
I wonder if it might make for a cleaner interface if we forced users to provide the same flows for every year and region and just made an exception that, in this case, you are allowed to have a
Description
Although we don't need PACs any more, we do have to identify one output flow as the "primary" for each process for the investment appraisal step.
I've added an `is_primary_output` column to `process_flows.csv`, but it's an optional value. If there are no output flows, then no primary output is needed. If there's only one, we can infer that it's the primary. If there is more than one, the user can get away with just marking one as `true`.

I haven't added a helper to get the primary output yet, because I wasn't sure exactly what would be the most useful form (e.g. do we want to return a `ProcessFlow` or just a `Commodity` or something). It might make sense to sort the process flows so that the primary output is always the first element, which would make looking it up faster, but I haven't done this yet.

Closes #630.
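For reviewers, here is a rough sketch of the inference rule described above. It is not the code in this PR: `FlowRow`, `infer_primary_output` and `move_primary_first` are hypothetical names, the sign convention (positive flow = output) is an assumption, and so is the choice to reject a primary marker on a non-output flow. An explicit `false` on the only output is ignored here, which is also just one possible reading.

```rust
/// Hypothetical, simplified stand-in for one row of process_flows.csv
/// (not the real `ProcessFlow` struct).
#[derive(Debug, Clone, PartialEq)]
struct FlowRow {
    commodity_id: String,
    /// Assumed convention: positive = output, negative = input.
    flow: f64,
    /// `None` when the optional column is left blank.
    is_primary_output: Option<bool>,
}

/// Returns the index of the primary output among `flows`, if any.
///
/// - no outputs: `Ok(None)`
/// - exactly one output marked `true`: that one
/// - a single unmarked output: inferred as primary
/// - several outputs with zero or multiple marked `true`: error
fn infer_primary_output(flows: &[FlowRow]) -> Result<Option<usize>, String> {
    // Assumption: marking an input as the primary output is a user error.
    if let Some(bad) = flows
        .iter()
        .find(|f| f.flow <= 0.0 && f.is_primary_output == Some(true))
    {
        return Err(format!(
            "{} is marked as primary output but is not an output",
            bad.commodity_id
        ));
    }

    // Indices of all output flows.
    let outputs: Vec<usize> = flows
        .iter()
        .enumerate()
        .filter(|(_, f)| f.flow > 0.0)
        .map(|(i, _)| i)
        .collect();

    // Indices of outputs explicitly marked `true`.
    let marked: Vec<usize> = outputs
        .iter()
        .copied()
        .filter(|&i| flows[i].is_primary_output == Some(true))
        .collect();

    match (outputs.len(), marked.len()) {
        (0, _) => Ok(None),
        (_, 1) => Ok(Some(marked[0])),
        (1, 0) => Ok(Some(outputs[0])),
        (_, 0) => Err("multiple outputs but none marked as primary".to_string()),
        _ => Err("more than one output marked as primary".to_string()),
    }
}

/// Moves the primary output to index 0 while keeping the relative order
/// of the other flows, so later lookups can just take the first element.
fn move_primary_first(flows: &mut [FlowRow], primary: usize) {
    flows[..=primary].rotate_right(1);
}
```

If something like this were adopted, the sorting idea would amount to calling `move_primary_first` once after reading each process's flows, at which point a helper could return `flows.first()` whenever a primary output exists.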
Type of change
Key checklist
$ cargo test
$ cargo doc
Further checks