Support for sparse xarrays #248
(I might be able to submit a pull request for the proposed solution, but I first wanted to hear what you thought, @FabianHofmann)
Hey @staadecker, nice to hear from you and that you like the package :) I would very much like to see sparse object support in linopy, and I really cannot say how invasive the required changes would be. But you are most welcome to give it a try! PRs are always welcome.
Update on investigation
General thoughts + question
For more context, the lack of sparsity means that a 200 MB data table becomes a 200 GB DataArray, which results in an error being thrown due to lack of memory.
Just to add another perspective to the discussion, I believe the non-sparse nature of xarray is also impacting the memory footprint of solving pypsa-eur networks. Here is an example of the memory usage profile while solving a medium/high-resolution (35 clusters, 3000 time steps, 4th step in myopic foresight pathway optimisation) sector-coupled pypsa-eur network. This is after applying an ad-hoc memory optimisation that drops the model from memory during solving and reads it back in from a netcdf file afterwards (456052e), as suggested in #219. During the above optimisation, Gurobi reported using about 16 GB of memory, so the mentioned optimisation still leaves a memory overhead of about 9-10 GB. But the maximum memory usage spikes to above 100 GB during model preparation. And this was only measured at 30-second intervals, so it's possible that the real maximum was even higher. For me this has been especially bothersome since I'm working on a SLURM cluster where jobs have to be allocated a certain amount of memory and are killed if they exceed it. In an ideal world (in order to run as many jobs in parallel as possible) I'd allocate ~16 GB to the above optimisation, but more than 100 GB is actually needed... It's possible to group optimisations in a single job and start them in a staggered manner, but this adds quite a bit of complexity.
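For reference, a minimal sketch of the round-trip idea behind that optimisation (not the exact pypsa-eur patch; `build_model` is a hypothetical stand-in for however the model gets constructed), using linopy's netcdf IO:

```python
import gc

import linopy

m = build_model()  # hypothetical: the network preparation / model creation step

# Persist the model and free it while the external solver does its work;
# the referenced commit interleaves this with the solver call itself.
m.to_netcdf("model.nc")
del m
gc.collect()

# Restore afterwards to map the solution back onto the model.
m = linopy.read_netcdf("model.nc")
```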
Hmm, thanks @koen-vg for pointing this out. Memory spikes in the building phase seem to be a recurring issue, but they should be avoidable. New versions of xarray sometimes cause memory leaks. Last September, after #161, I had a memory log for PyPSA-Eur sector-coupled, 100 nodes, 4-hourly resolution that looked like this. I'll check how this particular test scenario looks for me with current PyPSA, linopy, and xarray versions.
Thanks for the quick response! Reassuring to know that a low memory footprint is actually possible. I might try with a few different xarray versions then; the above was using version …
@koen-vg that's quite weird, could you mention which linopy version you were using?
Could it have something to do with the IO to the temporary netcdf? Did you try it out without it?
It was based on linopy 0.3.8. When testing out 456052e, it seemed to reduce the overall memory footprint, but I'm about to do some more detailed testing (master branch linopy; xarray version) to see if I can pin the problem down to anything in particular. Hopefully I'll soon be able to either point at a faulty xarray/linopy version or just give you a pypsa network which consistently leads to a big memory spike.
Thanks for your deep-dive @koen-vg, this is quite important.
Regardless of memory spikes, I will just note that, to the best of my understanding, there is no reason in principle why the optimisation at 2050 should take more time and memory than the one at 2025; the reason is presumably that the later network contains many copies of non-extendable generators, links etc. with different build-years and capacities but otherwise identical function and capacity factors. For solving purposes, those should probably be aggregated and then disaggregated again (since information about how much capacity is built in each year is needed for phase-out purposes). But that's really a separate issue to memory spikes and should be discussed over at pypsa-eur.
Last thing before logging off for the weekend: here is the memory spike replicated with upstream pypsa-eur and the latest version of linopy. This is using pypsa-eur at commit 95805a8d and a freshly created conda env, resulting in linopy 0.3.8 and xarray 2024.2.0. The configuration file used to produce this example is based on the myopic test config, as follows:

```yaml
run:
  name: "mem-test"
  disable_progressbar: true
  shared_resources: false
  shared_cutouts: true

enable:
  retrieve_cutout: false

foresight: myopic

scenario:
  ll:
    - v1.5
  clusters:
    - 70
  sector_opts:
    - ""
  planning_horizons:
    - 2020
    - 2025
    - 2030
    - 2035
    - 2040
    - 2045
    - 2050

countries: ["DE", "NL", "BE"]

clustering:
  temporal:
    resolution_sector: "336H"

sector:
  central_heat_vent: true

electricity:
  extendable_carriers:
    Generator: [OCGT]
    StorageUnit: [battery]
    Store: [H2]
    Link: [H2 pipeline]
  renewable_carriers: [solar, onwind, offwind-ac, offwind-dc]

renewable:
  solar:
    cutout: "europe-2013-era5"

industry:
  St_primary_fraction:
    2030: 0.6
    2040: 0.5
    2050: 0.4

solving:
  solver:
    name: "gurobi"
  mem: 4000
```

Some observations:
A little bit of debugging
I tried using …
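The profiling tool named here was lost in the thread's formatting; purely as an illustration (not necessarily what was used), Python's standard-library tracemalloc can capture peak allocations around the model-building step, with `build_model` again a hypothetical stand-in:

```python
import tracemalloc

tracemalloc.start()
model = build_model()  # hypothetical: network preparation + linopy model creation
current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e9:.2f} GB, peak: {peak / 1e9:.2f} GB")
tracemalloc.stop()
```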
I'm not sure exactly how much time I'll have to look into this in the coming week or two, so if anyone else has the time/opportunity, feel free to dig in. Just some quick (almost administrative) notes:
Hey @koen-vg, thanks again for the update. I will try to catch up next week (this week is too busy).
As you can see in the PR mentioning this issue, I've implemented a kind of aggregation over build-years that solves pypsa-eur's performance problems. But it doesn't rule out the potential for memory spikes. Again, it wasn't entirely clear whether the memory spike was really coming from linopy or pypsa-eur, so the above PR isn't necessarily a final solution for the kind of memory spike I showed.
@koen-vg some news here: linopy now has a new LP-writing backend based on polars. It is very fast and memory-efficient. Could you have another try with the latest master, setting …
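The exact setting was cut off above; assuming the backend is selected through the `io_api` argument of `Model.solve`, a minimal sketch would be:

```python
# Assumption: the polars-based LP writer is chosen via io_api;
# the exact flag value was lost from the comment above.
m.solve(solver_name="gurobi", io_api="lp-polars")
```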
I'm starting to understand now... There are two major memory peaks:
The polars implementation addresses the second peak, but it does not do proper garbage collection, leading to a stronger increase of memory over time. And unfortunately, the polars implementation is not much faster. For smaller networks it looked like a great improvement, see #294 (comment). So, I think there are some possible partial solutions:
As mentioned only briefly before in this thread, I have been working (quite a bit) on alternative 1 in this PR: PyPSA/pypsa-eur#1056. It's non-trivial, but the results are worth it. Not only are the memory spikes entirely resolved (no need for …). However, some information is always lost. In particular, changing efficiencies and marginal costs are a challenge. I think the build-year aggregation is nearly always worth it, but it doesn't quite produce identical results, and since the implementation isn't trivial, there are chances for unexpected interactions with custom constraints etc. that I might have missed in my testing. Still, I will be using the build-year aggregation for myopic foresight in pypsa-eur regardless of the memory footprint of model building with linopy :)
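A condensed sketch of the aggregation idea (not the actual PyPSA/pypsa-eur#1056 code, and the table below is made up): group components that are identical up to build_year, solve on the aggregate, and keep the per-year capacities around for disaggregation afterwards:

```python
import pandas as pd

# Hypothetical generator table: rows identical except build_year / p_nom.
gens = pd.DataFrame({
    "bus": ["DE0"] * 3,
    "carrier": ["onwind"] * 3,
    "build_year": [2020, 2025, 2030],
    "p_nom": [1.0, 2.0, 1.5],
})

keys = ["bus", "carrier"]
# Remember per-year capacities for phase-out bookkeeping after the solve.
per_year = gens.set_index(keys + ["build_year"])["p_nom"]
# One representative component per group for the optimization itself.
aggregated = gens.groupby(keys, as_index=False)["p_nom"].sum()
```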
I really like the result of your aggregation strategy. I am just wondering if we could generalize a bit more. Perhaps we could finally create a new component column "active" in pypsa, which allows ignoring components in the optimization. Instead of removing the components to be aggregated, we simply set their activity status to False and add one representative component instead, which is dissolved and removed after the optimization.
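A sketch of how that workflow could look from the user side, given a pypsa network `n`; the `active` column is the hypothetical proposal above and does not exist in pypsa at the time of writing:

```python
# Hypothetical "active" flag; not an existing pypsa API.
to_aggregate = n.generators.index[n.generators.carrier == "onwind"]
n.generators.loc[to_aggregate, "active"] = False  # ignored by the optimizer

# One representative component carrying the aggregated capacity.
n.add("Generator", "onwind aggregated", bus="DE0",
      p_nom=n.generators.loc[to_aggregate, "p_nom"].sum())

# ... optimize ..., then dissolve the representative and restore the originals.
```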
Glad to hear it's appreciated :) Yeah, actually I can see the appeal of such an "active" column. Overall I guess this build-year aggregation business is just inherently a bit complex when you put together all the little details, so there's a limit to how much things can be simplified with the right abstractions. (P.S. I have some mixed experience with boolean values in PyPSA networks (looking at you, …).)
Apologies for jumping into an existing issue, but I have some thoughts related to sparsity, so I thought I'd add to this discussion rather than create a new one for now. I've just started using linopy (and not from a linear programming background either, but I have used xarray extensively in other contexts) and came across behaviour which I found unintuitive: the default behaviour of …

So the question / discussion point is: what are the reasons for …?

EDIT
On further thought, I see why … Perhaps also some indication in the …
@ollie-bell thanks for the explanation. Indeed, this behaviour is a bit confusing, and together with the representation you easily lose track. An example would be nice; I would be more than happy to review a PR.
Great. I probably won't get to it until next month now as I'm about to go on holiday but I'll set myself a reminder for when I'm back. |
First off, great work with the library so far!! This is significantly better than what currently exists out there.
Many energy system models can be extremely sparse. For example, in one recent case, a linear expression is composed of a sum over only 0.1% of the coordinates of an xarray. Hence, not supporting sparsity would increase our model size by ~900x. This effectively prohibits us from using linopy as is.
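To make the arithmetic concrete, a small illustration with the sparse library (the sizes are made up, not the actual model data):

```python
import sparse
import xarray as xr

# ~0.1% density, mirroring the case described above.
s = sparse.random((20_000, 20_000), density=0.001)
da = xr.DataArray(s, dims=("i", "j"))  # xarray can wrap sparse arrays directly

# Dense size computed arithmetically, without materializing the array.
dense_bytes = s.shape[0] * s.shape[1] * s.dtype.itemsize
print(f"sparse: {s.nbytes / 1e6:.0f} MB vs dense: {dense_bytes / 1e9:.1f} GB")
```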
Suggested solution
Ideally linopy could support xarrays based on the sparse library. I've tried the library, but it doesn't work with linopy, particularly when trying to combine sparse and non-sparse xarrays.
Rejected solutions
I tried using MultiIndices, which are natively supported in xarray; however, broadcasting doesn't work when combining a sub-dimension of a multi-index. In general, my gut tells me this is the wrong direction to take for the library.
I'm aware of the mask parameter for creating constraints and variables; however, this a) still requires a large memory allocation and b) doesn't help during the computation of linear expressions (it only helps at the variable/constraint creation stage).
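For context, a minimal sketch of the mask route being rejected here, using linopy's `mask` argument to `add_variables` (sizes made up); note the boolean mask itself is still a dense allocation over the full coordinate space, which is exactly point a):

```python
import numpy as np
import xarray as xr
import linopy

m = linopy.Model()
dims = ("i", "j")
coords = {"i": np.arange(2_000), "j": np.arange(2_000)}

# The mask is a dense boolean array over all coordinates -- the large
# allocation referred to in point a).
mask = xr.DataArray(np.random.rand(2_000, 2_000) < 0.001, coords=coords, dims=dims)
lower = xr.DataArray(np.zeros((2_000, 2_000)), coords=coords, dims=dims)

# Variables are only created where mask is True.
x = m.add_variables(lower=lower, name="x", mask=mask)
```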