Add better and more versatile sankey diagrams #514

FBumann · 2025-12-08T14:59:56Z

Description

Brief description of the changes in this PR.

Type of Change

Bug fix
New feature
Documentation update
Code refactoring

Related Issues

Closes #(issue number)

Testing

I have tested my changes
Existing tests still pass

Checklist

My code follows the project style
I have updated documentation if needed
I have added tests for new functionality (if applicable)

Summary by CodeRabbit

New Features
- Sankey diagrams now support multiple visualization modes ('flow_hours', 'sizes', 'peak_flow', 'effects') for viewing the same results from different analytical perspectives.
- Enhanced method parameters provide improved control and flexibility over diagram generation.
Refactor
- Internal restructuring of visualization pipeline for improved maintainability while preserving existing functionality.


coderabbitai · 2025-12-08T15:00:04Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The Sankey plotting functionality in statistics_accessor.py has been refactored to support multiple flow modes ('flow_hours', 'sizes', 'peak_flow', 'effects'). New helper methods prepare mode-specific data, construct node/link structures, and generate Plotly Sankey figures, replacing the previous single-path implementation. The public sankey method signature now includes a mode parameter and delegates to these helpers.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Sankey refactoring `flixopt/statistics_accessor.py` | Reworked sankey plotting flow with mode-based data preparation. Updated `sankey()` method signature to include `mode` parameter (supporting 'flow_hours', 'sizes', 'peak_flow', 'effects'). Introduced helper methods: `_prepare_sankey_data()` for mode-specific data prep, `_build_sankey_links()` to construct nodes/links from datasets, `_create_sankey_figure()` to generate Plotly figures, and `_build_effects_sankey()` for component-to-effect Sankey visualization. Method delegates to helpers based on selected mode; returns PlotResult with constructed figure and link dataset. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas requiring attention:
- Mode branching logic and conditional data preparation paths for 'flow_hours' vs. other modes
- Correctness of node/link structure construction in _build_sankey_links() and filtering behavior (min_value threshold)
- Effects Sankey building logic in _build_effects_sankey() for component-to-effect contribution mapping
- Return type consistency: tuple returns from helpers vs. PlotResult from main sankey() method
- Public API signature change and backward compatibility considerations
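The delegation pattern behind these review points (mode branching, tuple-returning helpers, one result object from the public method) can be sketched roughly as follows. This is not the code from `flixopt/statistics_accessor.py`: `PlotResultSketch`, `sankey_sketch`, the helper bodies, the default mode, and the input dictionaries are hypothetical stand-ins. Only the four mode names and the idea of a `min_value` threshold come from the walkthrough.

```python
from dataclasses import dataclass
from typing import Callable, Literal, get_args

SankeyMode = Literal['flow_hours', 'sizes', 'peak_flow', 'effects']
Series = dict[str, list[float]]


@dataclass
class PlotResultSketch:
    """Hypothetical stand-in for the PlotResult returned by sankey()."""

    title: str
    links: dict[str, float]


def _flow_hours(series: Series) -> tuple[dict[str, float], str]:
    # Helpers return a plain tuple: (value per flow, figure title).
    # Assumes every time step is one hour long.
    return {name: sum(values) for name, values in series.items()}, 'Flow hours'


def _peak_flow(series: Series) -> tuple[dict[str, float], str]:
    return {name: max(values) for name, values in series.items()}, 'Peak flow'


# Only two of the four modes are sketched here.
_HELPERS: dict[str, Callable[[Series], tuple[dict[str, float], str]]] = {
    'flow_hours': _flow_hours,
    'peak_flow': _peak_flow,
}


def sankey_sketch(
    series: Series, mode: SankeyMode = 'flow_hours', min_value: float = 1e-6
) -> PlotResultSketch:
    if mode not in get_args(SankeyMode):
        raise ValueError(f'Unknown mode {mode!r}; expected one of {get_args(SankeyMode)}')
    if mode not in _HELPERS:
        raise NotImplementedError(f'Mode {mode!r} is not part of this sketch')
    values, title = _HELPERS[mode](series)
    # Drop links below the threshold, then wrap the tuple in one result type.
    links = {name: v for name, v in values.items() if abs(v) >= min_value}
    return PlotResultSketch(title=title, links=links)


demo = {'Boiler(Q_th)': [2.0, 3.0, 0.0], 'Idle(Q_th)': [0.0, 0.0, 0.0]}
result = sankey_sketch(demo)  # calling without mode keeps working
peak = sankey_sketch(demo, mode='peak_flow')
```

Giving `mode` a default is one way to keep existing call sites valid after the signature change; whether the real method does this is not stated in the thread.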

Poem

🐰 Behold! The Sankey paths now dance with modes,
Where flow_hours, sizes, peaks, and effects erode
The tangled old ways—four helpers now spring,
To weave data flows with algorithmic zing!
The rabbit rejoices: complexity bends to design. ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The description is largely a template with placeholder text and unchecked sections; critical details about the changes, type of change, and testing are absent or incomplete. | Fill in the Description section with actual changes, check the appropriate Type of Change box, reference any related issues, confirm testing status, and ensure all checklist items are justified. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title describes the main change (improving Sankey diagrams with mode support) and is relevant to the changeset, though 'versatile' is somewhat general and doesn't specify the mode-based implementation details. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


FBumann · 2025-12-08T15:00:15Z

@coderabbitai review

coderabbitai · 2025-12-08T15:00:30Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

flixopt/statistics_accessor.py (4)
1097-1156: Sankey data prep is solid; consider defensive handling of unexpected time dims for sizes.

The mode handling, weighting with period_weights / normalized scenario_weights, and time aggregation for flow_hours look correct and side‑effect free (using .copy() before mutation). One small defensive improvement: if self._stats.sizes ever acquires a time dimension in future models, the current code would pass a multi‑dim array into _build_sankey_links, where float(ds[label].values) would fail. You could mirror the peak_flow path and collapse any time dimension for sizes as well:
        if mode == 'sizes':
            ds = self._stats.sizes.copy()
            title = 'Investment Sizes (Capacities)'
+           if 'time' in ds.dims:
+               ds = ds.max(dim='time')
This keeps current behaviour unchanged while making the helper robust to future changes in the size tensor shape.
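The weighting and time aggregation mentioned in this comment are not shown in the thread. One plausible reading, written here with plain lists instead of xarray, is: multiply each flow rate by its step duration, sum over time, then combine scenarios with weights normalized to sum to 1. The function names and the exact order of operations are assumptions, and `period_weights` are left out entirely.

```python
def normalize(weights: list[float]) -> list[float]:
    """Scale weights so they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError('weights must have a positive sum')
    return [w / total for w in weights]


def weighted_flow_hours(
    flow_rates: list[list[float]],
    hours_per_step: list[float],
    scenario_weights: list[float],
) -> float:
    """flow_rates[s][t] is the flow rate in scenario s at time step t."""
    norm = normalize(scenario_weights)
    # Time aggregation: rate times step duration, summed over all steps.
    per_scenario = [
        sum(rate * hours for rate, hours in zip(rates, hours_per_step))
        for rates in flow_rates
    ]
    # Scenario aggregation: weighted sum using the normalized weights.
    return sum(w * fh for w, fh in zip(norm, per_scenario))


# 0.25 * (10 + 20) + 0.75 * (30 + 40)
total = weighted_flow_hours(
    flow_rates=[[10.0, 20.0], [30.0, 40.0]],
    hours_per_step=[1.0, 1.0],
    scenario_weights=[1.0, 3.0],
)
```

Because the functions build new lists and never assign into their inputs, the caller's data stays untouched, which is the same property the review attributes to the `.copy()` calls.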

1158-1247: Effect-based Sankey logic is clear; drop zip(..., strict=False) and align link dataset schema.

The construction of component→effect links and filtering on finite/|value|≥1e‑6 is good, and returning both a figure and a link Dataset is useful. Two follow‑ups:

zip(contributors, components, strict=False) is equivalent to plain zip(...) here and introduces a Python‑version dependency (the strict keyword was added in Python 3.10). Since both arrays should always be the same length, you can safely drop the keyword:
-            for contributor, component in zip(contributors, components, strict=False):
+            for contributor, component in zip(contributors, components):
The returned sankey_ds here uses an integer link coordinate and a separate 'label' coordinate, whereas the non‑effects path below uses the flow label as the 'link' coordinate and no 'label' coord. Harmonizing these (e.g. always using integer link plus a 'label' coord) will make downstream handling of different modes simpler.

Please confirm your minimum supported Python version; if it’s below 3.10, the release that added zip(..., strict=...), you should definitely remove the strict argument as above.

1249-1288: Use consistent string labels for Sankey node IDs to avoid mixed-type node duplication.

source/target are currently taken directly from flow.bus and flow.component. Elsewhere (e.g. in flows()), you treat flow.bus as “could be string or Bus object” and use flow.component.label_full for a stable component label. Using raw objects here risks:

Having separate nodes for the same logical bus/component if some flows carry a string and others a Bus/Component instance.

Less readable node labels in the Plotly figure.

It’s safer to normalize to string labels similarly to the flows() helper, e.g.:
-        if flow.is_input_in_component:
-            source = flow.bus
-            target = flow.component
-        else:
-            source = flow.component
-            target = flow.bus
+        # Normalize to stable string labels for nodes
+        bus_label = getattr(flow.bus, 'label_full', flow.bus)
+        comp_label = getattr(flow.component, 'label_full', flow.component)
+
+        if flow.is_input_in_component:
+            source = bus_label
+            target = comp_label
+        else:
+            source = comp_label
+            target = bus_label
This keeps the Sankey structure correct even if underlying types change.
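The duplication risk is easy to reproduce in isolation. In this sketch `BusStub` is a hypothetical stand-in for any object exposing `label_full`, and `node_label` wraps the suggested `getattr` fallback in `str()` as an extra precaution that is not part of the suggestion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusStub:
    """Hypothetical stand-in for a Bus object exposing label_full."""

    label_full: str


def node_label(obj: object) -> str:
    # Objects with label_full use it; plain strings pass through unchanged.
    return str(getattr(obj, 'label_full', obj))


raw_nodes = ['Heat', BusStub('Heat'), 'Gas']

# Without normalization the string and the object count as different nodes.
unnormalized = list(dict.fromkeys(raw_nodes))

# With normalization both collapse onto the same label.
normalized = list(dict.fromkeys(node_label(n) for n in raw_nodes))
```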

1334-1396: Mode-based sankey() API is a nice improvement; consider unifying the structure of sankey_ds across modes.

The new mode parameter and delegation to _prepare_sankey_data / _build_sankey_links / _build_effects_sankey are clean and make the API more versatile. One consistency nit:

For mode == 'effects', the returned Dataset has a separate 'label' coordinate and an integer 'link' coordinate.

For the other modes, 'link' is set directly to links['label'] and there is no 'label' coord.

Aligning these (e.g. always using an integer link index plus a 'label' coord) would make it easier for callers to consume PlotResult.data without special‑casing modes:
-            sankey_ds = xr.Dataset(
-                {'value': ('link', links['value'])},
-                coords={
-                    'link': links['label'],
-                    'source': ('link', links['source']),
-                    'target': ('link', links['target']),
-                },
-            )
+            n_links = len(links['value'])
+            sankey_ds = xr.Dataset(
+                {'value': ('link', links['value'])},
+                coords={
+                    'link': range(n_links),
+                    'source': ('link', links['source']),
+                    'target': ('link', links['target']),
+                    'label': ('link', links['label']),
+                },
+            )
You could then mirror this structure in _build_effects_sankey for full symmetry.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dedfeb8 and 5dc46e8.

📒 Files selected for processing (1)

flixopt/statistics_accessor.py (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

flixopt/statistics_accessor.py (3)

flixopt/effects.py (1)

period_weights (320-337)

flixopt/flow_system.py (4)

scenario_weights (1479-1486)

scenario_weights (1489-1510)

sel (1638-1664)

coords (1466-1472)

flixopt/color_processing.py (1)

process_colors (112-180)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: test (3.13)
GitHub Check: test (3.12)
GitHub Check: test (3.11)
GitHub Check: test (3.10)

🔇 Additional comments (1)

flixopt/statistics_accessor.py (1)

1291-1332: Sankey figure factory looks correct and consistent with existing color handling.

The node/index construction, delegation to process_colors, and update_layout(title=...) all look good and match patterns used elsewhere in this module. No issues from my side here.
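For readers unfamiliar with the node/index construction mentioned here: a Plotly Sankey trace takes a list of node labels plus links whose `source` and `target` are integer positions in that list. A minimal sketch of that mapping, without importing Plotly and with a hypothetical `build_sankey_payload` helper and sample data, might look like this:

```python
def build_sankey_payload(links: list[tuple[str, str, float]]) -> dict:
    """Turn (source_label, target_label, value) triples into index-based links."""
    # Unique node labels in first-seen order.
    labels = list(dict.fromkeys(label for s, t, _ in links for label in (s, t)))
    index = {label: i for i, label in enumerate(labels)}
    return {
        'node': {'label': labels},
        'link': {
            'source': [index[s] for s, _, _ in links],
            'target': [index[t] for _, t, _ in links],
            'value': [v for _, _, v in links],
        },
    }


payload = build_sankey_payload([('Gas', 'Boiler', 10.0), ('Boiler', 'Heat', 9.0)])
```

With Plotly installed, the two dictionaries could be passed on as `go.Sankey(node=payload['node'], link=payload['link'])`; color handling via `process_colors` is outside the scope of this sketch.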

* Update sankey figure building
* fix: sankey
* fix: effects sankey
* Add sankey diagramm tests
* Add sankey diagramm tests
* Minor nitpicks
* All three failing tests are now fixed. Here's a summary of the changes made:

  Summary of Fixes

  1. test_statistics_sizes_includes_all_flows - Fixed
     Problem: The statistics.sizes property was returning storage capacity sizes (like Speicher) which are not flow labels. The test expected only flow sizes.
     Fix in flixopt/statistics_accessor.py (lines 419-427):
     - Modified sizes property to filter only flow sizes by checking if the variable name (minus |size suffix) matches a flow label in self._fs.flows.keys()
  2. test_sankey_sizes_mode - Fixed
     Problem: The sankey diagram with mode='sizes' returned empty links because the test fixture (simple_flow_system) didn't have any flows with InvestParameters.
     Fix in tests/conftest.py (lines 268-272):
     - Modified Storage.simple() to use fx.InvestParameters(fixed_size=1e4, mandatory=True) on the charging flow, ensuring there's at least one flow with investment parameters for testing.
  3. test_sankey_sizes_max_size_filter - Fixed
     Problem: The sankey method didn't have a max_size parameter, so passing it caused the parameter to be forwarded to plotly_kwargs and then to update_layout(), which failed with "Invalid property specified for object of type plotly.graph_objs.Layout: 'max_size'".
     Fix in flixopt/statistics_accessor.py (lines 1351, 1367, 1392-1395):
     - Added max_size: float | None = None parameter to the sankey method signature
     - Added filtering logic to apply max_size filter when in mode='sizes'
* Add new sankey network topology plot

FBumann added 3 commits December 8, 2025 15:42

Update sankey figure building

16016a3

fix: sankey

e2420a7

fix: effects sankey

5dc46e8

Add sankey diagramm tests

9862260

coderabbitai bot reviewed Dec 8, 2025


FBumann added 4 commits December 8, 2025 16:12

Add sankey diagramm tests

84a05f1

Minor nitpicks

0107340

Add new sankey network topology plot

2aa57c6

FBumann merged commit 52ad96e into feature/solution-storage-change Dec 9, 2025
4 of 8 checks passed
