Reduce memory usage of MapEvaluator #4989
Conversation
Codecov Report

@@            Coverage Diff            @@
##             main    #4989     +/-   ##
=========================================
  Coverage   75.69%   75.69%
=========================================
  Files         228      228
  Lines       33841    33851      +10
=========================================
+ Hits        25616    25624       +8
- Misses       8225     8227       +2

View full report in Codecov by Sentry.
Are you sure the difference comes from the exposure? The cutout should only return a view and not copy any values. However, the cutout geom re-computes the array of coordinates; I would presume the memory usage comes from this.
From what I understand, the cutout creates a new map with a new data array (gammapy/gammapy/maps/wcs/ndmap.py, lines 992 to 995 in 888905e), and the coordinates are not re-computed, as they are cached only once via the lru_cache of the original geom.
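To make the copy-vs-view distinction concrete, here is a minimal plain-NumPy sketch (illustrative only; the actual cutout logic lives in gammapy/maps/wcs/ndmap.py): a slice of the parent array is a view, while a cutout that builds a new map copies the data.

```python
import numpy as np

data = np.zeros((100, 200, 200))

# A plain slice is a view: no new data array is allocated.
view = data[:, 50:150, 50:150]
print(np.shares_memory(data, view))    # True

# Copying the sliced region, as a cutout that builds a new map does,
# allocates fresh memory for every cutout.
copied = data[:, 50:150, 50:150].copy()
print(np.shares_memory(data, copied))  # False
```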
I updated the reference plot from my previous comment (it was wrong because I forgot to cherry-pick one of the other memory patches in my reference branch).
Thanks, that's even worse. There is a memory leak...
Yes, but no... As we create a cutout geom for each source, it is a new object. The first time we access the coordinates on the cutout geom, they are re-computed and cached. In the worst case, with many sources and a large support, we duplicate all the coordinate information. In this case it might be better to create a cutout from the original, larger coordinate arrays. But this highly depends on the analysis scenario; for a few sources, computing the coordinates on the cutouts is probably much better.
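To illustrate why this duplicates memory, here is a minimal sketch of per-instance caching with functools.lru_cache on a hypothetical Geom class (illustrative names and shapes, not the actual gammapy API):

```python
from functools import lru_cache
import numpy as np

class Geom:
    """Hypothetical stand-in for a map geometry."""

    def __init__(self, shape):
        self.shape = shape

    @lru_cache(maxsize=None)
    def get_coord(self):
        # The cache keys on `self`, so every new cutout geom computes
        # and stores its own coordinate arrays. An unbounded cache also
        # keeps the geom objects alive, which can look like a leak.
        return np.indices(self.shape, dtype=float)

cutouts = [Geom((100, 50, 50)) for _ in range(32)]
coords = [g.get_coord() for g in cutouts]  # 32 separately cached arrays
print(sum(c.nbytes for c in coords) / 1e6, "MB held in caches")
```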
Yes, you are right. It might be worth working only with views, giving up on the not fully contained cutouts and trimming instead.
For this test I have 32 sources, and each new peak corresponds to the npred computation of one source.
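As a rough way to observe such per-source peaks, one can watch the traced memory grow with tracemalloc, here with a plain NumPy array as a stand-in for one source's npred computation (illustrative sizes only, not the actual evaluator code):

```python
import numpy as np
import tracemalloc

tracemalloc.start()
for i in range(32):
    npred = np.ones((100, 50, 50))  # stand-in for one source's npred
    current, peak = tracemalloc.get_traced_memory()
    print(f"source {i}: current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
tracemalloc.stop()
```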
Right, at least we could save memory on coordinate caching by returning a 2D meshgrid for lon, lat if the geom is regular, and a 1D array for the axes, instead of an ND meshgrid for each axis. I will try that in another PR.
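In the same spirit as this 2D lon/lat plus 1D axes proposal, a rough NumPy sketch of the saving uses sparse (broadcastable) meshgrids instead of dense ND grids, assuming a regular geom:

```python
import numpy as np

energy = np.arange(50)
lat = np.arange(200)
lon = np.arange(200)

# Dense ND meshgrid: three (50, 200, 200) arrays, ~48 MB in total.
dense = np.meshgrid(energy, lat, lon, indexing="ij")
print(sum(a.nbytes for a in dense) / 1e6)   # ~48.0

# Sparse version: broadcastable (50,1,1), (1,200,1), (1,1,200) arrays
# carry the same information and expand on the fly when combined.
sparse = np.meshgrid(energy, lat, lon, indexing="ij", sparse=True)
print(sum(a.nbytes for a in sparse) / 1e6)  # ~0.004
```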
OK, thanks for clarifying. I thought it was multiple npred evaluations, but it is only one with 32 sources.
Thanks @QRemy. This looks good. No comment from my side.
Avoid caching the exposure cutout on the MapEvaluator. The cutout creates a new array in memory for each source, so this does not scale well when the cutout region and the number of sources are large, while the cutout itself is very fast to compute.
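A rough sketch of the resulting pattern, with simplified NumPy stand-ins rather than the actual MapEvaluator code: the cutout is computed on the fly inside the evaluation instead of being cached as an attribute, so the temporary copy is freed after each npred and memory no longer grows with the number of sources.

```python
import numpy as np

class Evaluator:
    """Hypothetical, simplified stand-in for MapEvaluator."""

    def __init__(self, exposure, region):
        self.exposure = exposure  # reference to the full map, no copy
        self.region = region      # cutout slices for this source

    def npred(self, flux):
        # Cut out on the fly: the copy is cheap to compute and is
        # garbage-collected once npred is returned, instead of living
        # on the evaluator for the whole fit.
        exposure_cutout = self.exposure[self.region].copy()
        return flux * exposure_cutout

exposure = np.ones((50, 200, 200))
evaluators = [Evaluator(exposure, np.s_[:, i:i + 50, i:i + 50]) for i in range(32)]
npreds = [ev.npred(flux=1e-12) for ev in evaluators]
```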