Skip to content

Use AssetID in output files#615

Merged
tsmbland merged 2 commits intomainfrom
asset_id_outputs
Jun 12, 2025
Merged

Use AssetID in output files#615
tsmbland merged 2 commits intomainfrom
asset_id_outputs

Conversation

@tsmbland
Copy link
Copy Markdown
Collaborator

@tsmbland tsmbland commented Jun 11, 2025

Description

Using the asset's ID in the output files, rather than displaying the full asset details every time. For now, I'm keeping asset ID's as integers. We could think about making these more descriptive in the future, as outlined in the issue (perhaps worth opening up another issue for this).

In theory, users could get back what they had before by doing a join between the table of interest (e.g. commodity_flows) and the assets table.

One other thing I noticed is that the assets table is creating a row for each asset every milestone year that the asset is active. I guess in most cases the user could then figure out the decommission year based on the last entry for the asset, but I think it would be cleaner to have a decommission year column, which will be modified as the simulation progresses, with just a single row per asset. Thoughts?

Fixes #581

Type of change

  • Bug fix (non-breaking change to fix an issue)
  • New feature (non-breaking change to add functionality)
  • Refactoring (non-breaking, non-functional change to improve maintainability)
  • Optimization (non-breaking change to speed up the code)
  • Breaking change (whatever its nature)
  • Documentation (improve or add documentation)

Key checklist

  • All tests pass: $ cargo test
  • The documentation builds and looks OK: $ cargo doc

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 11, 2025

Codecov Report

Attention: Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.34%. Comparing base (542baff) to head (61e4a5d).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
src/output.rs 77.77% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #615   +/-   ##
=======================================
  Coverage   85.34%   85.34%           
=======================================
  Files          38       38           
  Lines        3397     3399    +2     
  Branches     3397     3399    +2     
=======================================
+ Hits         2899     2901    +2     
  Misses        305      305           
  Partials      193      193           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tsmbland tsmbland marked this pull request as ready for review June 11, 2025 08:13
@tsmbland tsmbland requested a review from alexdewar June 11, 2025 08:13
Copy link
Copy Markdown
Member

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good! Makes the output files much tidier. As you say, it might be worth making a new issue now for making asset IDs something more meaningful than a number, but that can wait.

@alexdewar
Copy link
Copy Markdown
Member

One other thing I noticed is that the assets table is creating a row for each asset every milestone year that the asset is active. I guess in most cases the user could then figure out the decommission year based on the last entry for the asset, but I think it would be cleaner to have a decommission year column, which will be modified as the simulation progresses, with just a single row per asset. Thoughts?

It's true. The current approach is to just dump the contents of the active pool of assets to file on every milestone year, but that'll mean the data file is perhaps a bit repetitious.

I'm not sure if this is exactly what you're suggesting, but we could just have one entry per asset in assets.csv and calculate the actual decommission year (as opposed to what's returned by decommission_year(), which is actually the last possible year it can be decommissioned) for each. Then users could figure out which years each asset was active for if they wanted.

To do that, we'd need to keep tabs on when assets are actually decommissioned. This might be kinda fiddly. There are two places assets can be decommissioned: AssetPool::decommission_old() and AssetPool::replace_active_pool(). Maybe the cleanest way to do this would be to store decommissioned assets in a separate Vec and write an iter_decommissioned() method to AssetPool, then we can write all of the assets to file at the end of the simulation. What do you think?

@alexdewar
Copy link
Copy Markdown
Member

PS -- I've just realised there's a bug in replace_active_pool() that'll need fixing too (it doesn't sort the new active pool, but we rely on it being sorted by ID).

We could also change it to take a Vec as an argument rather than an Iterator then we can just replace the active pool with the new Vec (after sorting etc.).

@alexdewar
Copy link
Copy Markdown
Member

We could also change it to take a Vec as an argument rather than an Iterator then we can just replace the active pool with the new Vec (after sorting etc.).

Actually, this second part won't work. But we still need to fix the bug

@tsmbland
Copy link
Copy Markdown
Collaborator Author

One other thing I noticed is that the assets table is creating a row for each asset every milestone year that the asset is active. I guess in most cases the user could then figure out the decommission year based on the last entry for the asset, but I think it would be cleaner to have a decommission year column, which will be modified as the simulation progresses, with just a single row per asset. Thoughts?

It's true. The current approach is to just dump the contents of the active pool of assets to file on every milestone year, but that'll mean the data file is perhaps a bit repetitious.

I'm not sure if this is exactly what you're suggesting, but we could just have one entry per asset in assets.csv and calculate the actual decommission year (as opposed to what's returned by decommission_year(), which is actually the last possible year it can be decommissioned) for each. Then users could figure out which years each asset was active for if they wanted.

To do that, we'd need to keep tabs on when assets are actually decommissioned. This might be kinda fiddly. There are two places assets can be decommissioned: AssetPool::decommission_old() and AssetPool::replace_active_pool(). Maybe the cleanest way to do this would be to store decommissioned assets in a separate Vec and write an iter_decommissioned() method to AssetPool, then we can write all of the assets to file at the end of the simulation. What do you think?

I would say

  • one row per asset
  • create this row when the asset is commissioned, initially with the decommission_year field blank
  • update the row when the asset is decommissioned to fill in the the decommission year

We don't want to wait until the end of the simulation to write the data, in case the simulation doesn't finish, or the user wants an intermediate peek at the data

@tsmbland tsmbland merged commit c8b2743 into main Jun 12, 2025
7 checks passed
@tsmbland tsmbland deleted the asset_id_outputs branch June 12, 2025 12:42
@alexdewar
Copy link
Copy Markdown
Member

I would say

  • one row per asset
  • create this row when the asset is commissioned, initially with the decommission_year field blank
  • update the row when the asset is decommissioned to fill in the the decommission year

We don't want to wait until the end of the simulation to write the data, in case the simulation doesn't finish, or the user wants an intermediate peek at the data

In practice, I think the only way to achieve this would be to rewrite the file once per milestone year; I don't think there's an easy way to edit individual rows. I don't think that's the end of the world though: we're talking about small amounts of data, so the overhead is likely to be minimal.

We could write things so that the assets are always written when the simulation ends, even if an error has occurred, though that would mean that users wouldn't be able to look at it while the simulation is still running.

@tsmbland
Copy link
Copy Markdown
Collaborator Author

I would say

  • one row per asset
  • create this row when the asset is commissioned, initially with the decommission_year field blank
  • update the row when the asset is decommissioned to fill in the the decommission year

We don't want to wait until the end of the simulation to write the data, in case the simulation doesn't finish, or the user wants an intermediate peek at the data

In practice, I think the only way to achieve this would be to rewrite the file once per milestone year; I don't think there's an easy way to edit individual rows. I don't think that's the end of the world though: we're talking about small amounts of data, so the overhead is likely to be minimal.

We could write things so that the assets are always written when the simulation ends, even if an error has occurred, though that would mean that users wouldn't be able to look at it while the simulation is still running.

I think let's just rewrite every year. Like you say, the overhead will be minimal

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Represent assets with ID value in output files

2 participants