Fix NetworkCollection statistics for networks with different snapshots #1636
Conversation
- Move `snapshot_weightings` from `return_from_first` to `vertical_concat` pattern and compute weighted sums per-network in `_aggregate_with_weights` (#1605)
- …ize to handle both simple and collection cases
```diff
+# Todo: here we leave out the weights, is that correct?
-return df.agg(agg)
+if agg != "sum":
+    return df.agg(agg)
```
Why can the special collection handling be skipped for everything except sum? We never reach the collection-aware path then.
That is fine because only sum needs to know about the different weights. "mean" does not go here, so only "max" and "min" remain (never thought about "std" though), and these can be returned straightforwardly.
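To illustrate the point with plain pandas (toy data, not PyPSA's API): order statistics like "max"/"min" pick a single snapshot's value, so snapshot weightings cannot change them, while "sum" integrates over snapshots and must scale each row by its weight.

```python
import pandas as pd

# Toy per-snapshot values and snapshot weightings (illustrative, not PyPSA API).
df = pd.DataFrame({"gen_a": [1.0, 3.0], "gen_b": [2.0, 4.0]}, index=["s1", "s2"])
weights = pd.Series([1.0, 2.0], index=["s1", "s2"])

# "max"/"min" select one snapshot's value, so weights are irrelevant:
print(df.agg("max"))  # gen_a 3.0, gen_b 4.0, regardless of weights

# "sum" integrates over snapshots, so each row is scaled by its weight:
print(weights @ df)   # gen_a: 1*1 + 3*2 = 7.0, gen_b: 2*1 + 4*2 = 10.0
```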
True, you are right. I think std and median would need weights as well, but I'm not sure it's worth adding ourselves, since pandas doesn't support it. Not sure if this should be noted somewhere. Maybe here: https://docs.pypsa.org/latest/user-guide/statistics/#grouping
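Since pandas' `.agg` has no weights argument, a weighted std would indeed have to be computed by hand. A rough sketch (illustrative names only, weighted *population* standard deviation; not something PyPSA currently provides):

```python
import numpy as np
import pandas as pd

def weighted_std(df: pd.DataFrame, weights: pd.Series) -> pd.Series:
    """Weighted (population) standard deviation per column -- illustrative only."""
    w = weights.reindex(df.index)
    mean = (w @ df) / w.sum()                # weighted mean per column
    var = (w @ (df - mean) ** 2) / w.sum()   # weighted variance per column
    return np.sqrt(var)

df = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
w = pd.Series([1.0, 1.0, 2.0], index=df.index)
print(weighted_std(df, w))
```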
Agreed — this is not ideal. For now, I added a TODO comment but we should fix this mid-term
```diff
@@ -530,7 +529,6 @@ def _get_method_patterns() -> dict[str, str]:
     "return_from_first": r"^("
     r"\S+_components|"
     r"snapshots|"
```
Merging snapshots as return_from_first will still lead to issues when not just the weightings but also the snapshots differ, as you write in the PR?
As already discussed, the collection was never designed to handle different dimensions. But mid-term we would need a way to define strictly shared dimensions (snapshots here) and a way to convert all input networks to them. For snapshot weightings it still makes sense to give them the MultiIndex.
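The vertical_concat behaviour for snapshot weightings can be sketched with plain pandas; the network and snapshot names below are made up for illustration:

```python
import pandas as pd

# Two networks with different snapshots and weightings (illustrative data).
w_a = pd.DataFrame({"objective": [1.0, 1.0]},
                   index=pd.Index(["t0", "t1"], name="snapshot"))
w_b = pd.DataFrame({"objective": [3.0]},
                   index=pd.Index(["t0"], name="snapshot"))

# Vertical concatenation with network keys yields a (network, snapshot)
# MultiIndex, so no network's weightings are silently dropped, unlike a
# return-from-first approach that would keep only w_a:
weightings = pd.concat({"net_a": w_a, "net_b": w_b}, names=["network"])
print(weightings)
```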
Good point, added a note in the statistics user guide
Co-authored-by: Lukas Trippe <lkstrp@pm.me>
Closes #1605.
Changes proposed in this Pull Request
`NetworkCollection` statistics methods (`supply()`, `opex()`, `energy_balance()`, etc.) failed or returned wrong results when networks had different snapshots or snapshot weightings. Two root causes were identified and fixed:

- `snapshot_weightings` used the `return_from_first` proxy pattern, returning only the first network's weights. This was changed to `vertical_concat`, producing a MultiIndex `(network, snapshot)` DataFrame with all networks' weightings.
- `_aggregate_with_weights` performed `weights @ df` where the weights index and DataFrame column index were misaligned for collections. A new `_weighted_sum_per_network` method splits the computation by network key, computes weighted sums independently per network, then concatenates results.

The `_aggregate_with_weights` method was also restructured to use early returns for better readability.

A `mock_single_bus_dispatch` test helper was added to `conftest.py` to create networks with hard-set dispatch results without invoking a solver, keeping the new tests fast.

Checklist

- … docs.
- A note in `docs/release-notes.md` of the upcoming release is included.
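The per-network weighted sum described in this PR can be sketched with plain pandas; the helper name, index layout, and data below are assumptions for illustration, not PyPSA's actual `_weighted_sum_per_network` implementation:

```python
import pandas as pd

def weighted_sum_per_network(df: pd.DataFrame, weights: pd.Series) -> pd.DataFrame:
    """Weighted sum over snapshots, computed independently per network.

    `df` rows and `weights` share a (network, snapshot) MultiIndex; networks
    may have different snapshot sets, so a single `weights @ df` would
    misalign. Splitting by network keeps each product well aligned.
    """
    parts = {}
    for net, sub in df.groupby(level="network"):
        w = weights.xs(net, level="network")
        vals = sub.droplevel("network")
        parts[net] = w.reindex(vals.index) @ vals  # Series indexed by df columns
    return pd.DataFrame(parts).T                   # one row per network

# Illustrative data: two networks with different snapshots and weightings.
idx = pd.MultiIndex.from_tuples(
    [("a", "t0"), ("a", "t1"), ("b", "t0")], names=["network", "snapshot"]
)
df = pd.DataFrame({"gen": [1.0, 2.0, 5.0]}, index=idx)
weights = pd.Series([1.0, 2.0, 3.0], index=idx)
print(weighted_sum_per_network(df, weights))  # a: 1*1 + 2*2 = 5.0; b: 5*3 = 15.0
```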