Cache regridding weights if possible #2344
Conversation
Codecov Report
All modified and coverable lines are covered by tests.

Additional details and impacted files

@@           Coverage Diff            @@
##             main    #2344   +/-   ##
=========================================
+ Coverage   94.28%   94.29%   +0.01%
=========================================
  Files         246      246
  Lines       13511    13540      +29
=========================================
+ Hits        12739    12768      +29
  Misses        772      772

View full report in Codecov by Sentry.
looking good, bud! A couple of questions from me, please. Also, do you have a feel for how big those caches can get? I.e., would memory clogging get severe enough to forgo caching and just go for CPU time instead?
Thanks for reviewing, V! Memory usage should be minimal: as mentioned here, the weights should only be around 10 MiB for very high resolution grids (1000x1000), and much smaller for "normal" resolutions.
thanks for answering, Manu! Glad @bouweandela asked the same question about memory.
@bouweandela maybe you could have a look too, given that I took it you are not 100% convinced about the need for caching (reading from the issue) - and please merge if all's good by ye too
Yes, a 10 percent reduction in runtime in the best case doesn't seem like a huge gain, but it's nice to have, of course. I am concerned about the size of the cache, though: in the issue you talked about 1 GB #2341 (comment), and I saw that there is some discussion planned at the workshop about re-using weights.
I imagine they would be rather large arrays too if they are so expensive to compute. To avoid this turning into a memory leak, would it be an option to:
The 10% reduction in run time was only for this specific example; it's definitely not an upper bound. Also, the 1 GiB is a really extreme example (e.g., a 0.1°x0.1° grid would lead to a weights array of ~25 MiB).
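For a back-of-the-envelope check of numbers like these, a rough estimator helps. This is a sketch, not the actual ESMValCore/iris weight layout: real sparse regridding weights also carry integer index arrays, so under the stated assumptions this is a lower bound.

```python
def weights_size_bytes(n_lat, n_lon, weights_per_cell=1, bytes_per_weight=4):
    """Rough lower bound on regridding-weight memory for a target grid.

    Assumes `weights_per_cell` stored values per target cell (bilinear
    schemes typically keep ~4, plus index arrays not counted here) at
    `bytes_per_weight` bytes each.
    """
    return n_lat * n_lon * weights_per_cell * bytes_per_weight


# A global 0.1° x 0.1° grid has 1800 x 3600 cells; with a single
# float32 weight per cell that is already ~25 MiB:
mib = weights_size_bytes(1800, 3600) / 2**20
print(f"{mib:.1f} MiB")  # ~24.7 MiB
```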
That sounds very reasonable, I will do that!
from me own experience it's always better to read from file/memory if possible than it is to compute (yes, OK, call me Dr Obvious) - and given Manu's reassuring info on upper limits for memory intake, I think it's a go - but I do like Bouwe's suggestions too. My only concern, which I just thought of, is that maybe this would be better off sitting in iris? EDIT: then again, we have it, we use it - rather than wait a couple of centuries for iris
Also (sorry, I sat down and did MO crap today, so now my brain is free at last) - Manu, maybe it'd be worth plopping a worst-case-scenario test into the tests, i.e. build a super-high-res netCDF file on the fly and do the weights caching dance on it - we'd be able to monitor the memory using the test performance tool we have in GitHub Actions, and of course decorate it with e.g.
Actually this entire functionality is already in iris in the form of
I am honestly not sure if that's worth it, especially if we implement the option to turn off weights caching. What do we want to achieve with such a test? Kill the CI machines?
Thanks for making the changes @schlunma! @valeriupredoi Could you please do a final review of the new code, since you reviewed the original PR, and then we can merge.
On it, bud
Not a smart thing to do, especially since today Skynet's closer than ever. OK, leave the poor machines alone then
new stuff looks good! Cheers, gents @schlunma and @bouweandela
Description
This implements regridder weights caching, which may reduce regridding time dramatically (if many variables of the same dataset are analyzed).
I also modernized the existing regridding tests so that they now use pytest instead of unittest, and actually test the regridding instead of using mocks.
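As an illustration of the style shift described above (a toy sketch, not the actual test code from this PR), a plain pytest-style test exercises a real function on real arrays instead of asserting against mocks:

```python
import numpy as np


def nearest_regrid(values, src_points, tgt_points):
    """Toy 1-D nearest-neighbour 'regridder', used only for this example."""
    # For each target point, pick the value at the closest source point.
    idx = np.abs(np.subtract.outer(tgt_points, src_points)).argmin(axis=1)
    return values[idx]


def test_nearest_regrid_real_data():
    # pytest style: plain function, real data, assertions on real output.
    values = np.array([0.0, 10.0, 20.0])
    src = np.array([0.0, 1.0, 2.0])
    tgt = np.array([0.1, 1.9])
    result = nearest_regrid(values, src, tgt)
    np.testing.assert_allclose(result, [0.0, 20.0])


test_nearest_regrid_real_data()
```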
Closes #2341
Link to documentation: https://esmvaltool--2344.org.readthedocs.build/projects/ESMValCore/en/2344/recipe/preprocessor.html#horizontal-regridding
Before you get started
Checklist
It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.
To help with the number of pull requests: