-
Notifications
You must be signed in to change notification settings - Fork 9
Add better and more versitile sankey diagrams #514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add better and more versitile sankey diagrams #514
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe Sankey plotting functionality in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
flixopt/statistics_accessor.py (4)
1097-1156: Sankey data prep is solid; consider defensive handling of unexpected time dims forsizes.The mode handling, weighting with
period_weights/ normalizedscenario_weights, and time aggregation forflow_hourslook correct and side‑effect free (using.copy()before mutation). One small defensive improvement: ifself._stats.sizesever acquires atimedimension in future models, the current code would pass a multi‑dim array into_build_sankey_links, wherefloat(ds[label].values)would fail. You could mirror thepeak_flowpath and collapse anytimedimension forsizesas well:if mode == 'sizes': ds = self._stats.sizes.copy() title = 'Investment Sizes (Capacities)' + if 'time' in ds.dims: + ds = ds.max(dim='time')This keeps current behaviour unchanged while making the helper robust to future changes in the size tensor shape.
1158-1247: Effect-based Sankey logic is clear; dropzip(..., strict=False)and align link dataset schema.The construction of component→effect links and filtering on finite/|value|≥1e‑6 is good, and returning both a figure and a link Dataset is useful. Two follow‑ups:
zip(contributors, components, strict=False)is equivalent to plainzip(...)here and introduces a Python‑version dependency (thestrictkeyword only exists in newer Python versions). Since both arrays should always be the same length, you can safely drop the keyword:- for contributor, component in zip(contributors, components, strict=False): + for contributor, component in zip(contributors, components):
- The returned
sankey_dshere uses an integerlinkcoordinate and a separate'label'coordinate, whereas the non‑effects path below uses the flow label as the'link'coordinate and no'label'coord. Harmonizing these (e.g. always using integerlinkplus a'label'coord) will make downstream handling of different modes simpler.Please confirm your minimum supported Python version; if it’s below the version that added
zip(..., strict=...), you should definitely remove thestrictargument as above.
1249-1288: Use consistent string labels for Sankey node IDs to avoid mixed-type node duplication.
source/targetare currently taken directly fromflow.busandflow.component. Elsewhere (e.g. inflows()), you treatflow.busas “could be string or Bus object” and useflow.component.label_fullfor a stable component label. Using raw objects here risks:
- Having separate nodes for the same logical bus/component if some flows carry a string and others a Bus/Component instance.
- Less readable node labels in the Plotly figure.
It’s safer to normalize to string labels similarly to the
flows()helper, e.g.:- if flow.is_input_in_component: - source = flow.bus - target = flow.component - else: - source = flow.component - target = flow.bus + # Normalize to stable string labels for nodes + bus_label = getattr(flow.bus, 'label_full', flow.bus) + comp_label = getattr(flow.component, 'label_full', flow.component) + + if flow.is_input_in_component: + source = bus_label + target = comp_label + else: + source = comp_label + target = bus_labelThis keeps the Sankey structure correct even if underlying types change.
1334-1396: Mode-basedsankey()API is a nice improvement; consider unifying the structure ofsankey_dsacross modes.The new
modeparameter and delegation to_prepare_sankey_data/_build_sankey_links/_build_effects_sankeyare clean and make the API more versatile. One consistency nit:
- For
mode == 'effects', the returned Dataset has a separate'label'coordinate and an integer'link'coordinate.- For the other modes,
'link'is set directly tolinks['label']and there is no'label'coord.Aligning these (e.g. always using an integer
linkindex plus a'label'coord) would make it easier for callers to consumePlotResult.datawithout special‑casing modes:- sankey_ds = xr.Dataset( - {'value': ('link', links['value'])}, - coords={ - 'link': links['label'], - 'source': ('link', links['source']), - 'target': ('link', links['target']), - }, - ) + n_links = len(links['value']) + sankey_ds = xr.Dataset( + {'value': ('link', links['value'])}, + coords={ + 'link': range(n_links), + 'source': ('link', links['source']), + 'target': ('link', links['target']), + 'label': ('link', links['label']), + }, + )You could then mirror this structure in
_build_effects_sankeyfor full symmetry.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
flixopt/statistics_accessor.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
flixopt/statistics_accessor.py (3)
flixopt/effects.py (1)
period_weights(320-337)flixopt/flow_system.py (4)
scenario_weights(1479-1486)scenario_weights(1489-1510)sel(1638-1664)coords(1466-1472)flixopt/color_processing.py (1)
process_colors(112-180)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: test (3.13)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.10)
🔇 Additional comments (1)
flixopt/statistics_accessor.py (1)
1291-1332: Sankey figure factory looks correct and consistent with existing color handling.The node/index construction, delegation to
process_colors, andupdate_layout(title=...)all look good and match patterns used elsewhere in this module. No issues from my side here.
…es made: Summary of Fixes 1. test_statistics_sizes_includes_all_flows - Fixed Problem: The statistics.sizes property was returning storage capacity sizes (like Speicher) which are not flow labels. The test expected only flow sizes. Fix in flixopt/statistics_accessor.py (lines 419-427): - Modified sizes property to filter only flow sizes by checking if the variable name (minus |size suffix) matches a flow label in self._fs.flows.keys() 2. test_sankey_sizes_mode - Fixed Problem: The sankey diagram with mode='sizes' returned empty links because the test fixture (simple_flow_system) didn't have any flows with InvestParameters. Fix in tests/conftest.py (lines 268-272): - Modified Storage.simple() to use fx.InvestParameters(fixed_size=1e4, mandatory=True) on the charging flow, ensuring there's at least one flow with investment parameters for testing. 3. test_sankey_sizes_max_size_filter - Fixed Problem: The sankey method didn't have a max_size parameter, so passing it caused the parameter to be forwarded to plotly_kwargs and then to update_layout(), which failed with "Invalid property specified for object of type plotly.graph_objs.Layout: 'max_size'". Fix in flixopt/statistics_accessor.py (lines 1351, 1367, 1392-1395): - Added max_size: float | None = None parameter to the sankey method signature - Added filtering logic to apply max_size filter when in mode='sizes'
* Update sankey figure building * fix: sankey * fix: effects sankey * Add sankey diagramm tests * Add sankey diagramm tests * Minor nitpicks * All three failing tests are now fixed. Here's a summary of the changes made: Summary of Fixes 1. test_statistics_sizes_includes_all_flows - Fixed Problem: The statistics.sizes property was returning storage capacity sizes (like Speicher) which are not flow labels. The test expected only flow sizes. Fix in flixopt/statistics_accessor.py (lines 419-427): - Modified sizes property to filter only flow sizes by checking if the variable name (minus |size suffix) matches a flow label in self._fs.flows.keys() 2. test_sankey_sizes_mode - Fixed Problem: The sankey diagram with mode='sizes' returned empty links because the test fixture (simple_flow_system) didn't have any flows with InvestParameters. Fix in tests/conftest.py (lines 268-272): - Modified Storage.simple() to use fx.InvestParameters(fixed_size=1e4, mandatory=True) on the charging flow, ensuring there's at least one flow with investment parameters for testing. 3. test_sankey_sizes_max_size_filter - Fixed Problem: The sankey method didn't have a max_size parameter, so passing it caused the parameter to be forwarded to plotly_kwargs and then to update_layout(), which failed with "Invalid property specified for object of type plotly.graph_objs.Layout: 'max_size'". Fix in flixopt/statistics_accessor.py (lines 1351, 1367, 1392-1395): - Added max_size: float | None = None parameter to the sankey method signature - Added filtering logic to apply max_size filter when in mode='sizes' * Add new sankey network topology plot
Description
Brief description of the changes in this PR.
Type of Change
Related Issues
Closes #(issue number)
Testing
Checklist
Summary by CodeRabbit
New Features
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.