IterativePSFPhotometry: high memory usage due to a deepcopy? #1706

Open
mwhosek opened this issue Feb 15, 2024 · 3 comments

mwhosek commented Feb 15, 2024

Thanks for developing the PSFPhotometry and IterativePSFPhotometry classes! I have been using them to extract stars from JWST images and they appear to be working quite well. However, I noticed that the IterativePSFPhotometry class takes up a lot of memory, making it very difficult to apply to dense star fields (~50,000 - 60,000 stars). It appears to use significantly more memory than v1.8's IterativelySubtractedPSFPhotometry class, which is now deprecated.

I think the issue might be that there is a deepcopy call inside IterativePSFPhotometry which duplicates the PSFPhotometry object after a round of star-fitting is completed. This effectively saves the output from that iteration and then the PSFPhotometry object is reused for the next round of star-fitting.

To show this, I attach a plot of memory usage vs. time for a single iteration of star-finding with the IterativePSFPhotometry object (maxiters=1, grouper=None, using a WebbPSF PSF model). This is run on a sub-image that contains ~5500 stars. The long positive slope from ~25 s to ~250 s is the PSF fitting via PSFPhotometry, which uses only ~25% more memory than the old IterativelySubtractedPSFPhotometry did in v1.8. Afterward, however, there is a sharp spike in memory due to the deepcopy. I also attach a screenshot of the line-by-line memory profile of IterativePSFPhotometry, which shows this.
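
For reference, this is roughly how such a profile can be produced with the memory_profiler package (mprof). It is a minimal sketch, not the actual script used for the attached plots; the finder, fit_shape, and aperture_radius values are placeholders:

```python
# Run "mprof run fit_stars.py" then "mprof plot" for the memory-vs-time curve,
# or "python -m memory_profiler fit_stars.py" for the line-by-line table.
from memory_profiler import profile

from photutils.detection import DAOStarFinder
from photutils.psf import IterativePSFPhotometry


@profile
def fit_stars(data, psf_model):
    # psf_model: e.g. a GriddedPSFModel built with WebbPSF; data: the image array
    finder = DAOStarFinder(threshold=5.0, fwhm=2.0)
    psfphot = IterativePSFPhotometry(psf_model, fit_shape=(5, 5), finder=finder,
                                     maxiters=1, aperture_radius=5)
    return psfphot(data)
```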

So, is there a way we can avoid the deepcopy of the PSFPhotometry object? Or, can we do the deepcopy of the PSFPhotometry object after it is initialized but before any fitting is done, so the fit outputs aren't duplicated as well? Currently I'm doing a hack where I initialize a new PSFPhotometry object for each star-finding iteration rather than calling (and overwriting the results of) the existing PSFPhotometry object. It isn't pretty but seems to work OK. Thanks!
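
For illustration, here is a rough sketch of that workaround. It assumes psf_model is the WebbPSF GriddedPSFModel and initial_sources is a table of initial x/y guesses; the fit_shape, finder parameters, and psf_shape values are placeholders, and the exact make_residual_image signature may differ between photutils versions:

```python
from photutils.detection import DAOStarFinder
from photutils.psf import PSFPhotometry

n_iterations = 3               # placeholder number of star-finding passes
init_params = initial_sources  # table with initial x/y (and optionally flux) guesses
residual = data.copy()         # data: the background-subtracted image array
results = []

for iteration in range(n_iterations):
    # Create a fresh PSFPhotometry object each pass, so the fit results from
    # the previous pass are not kept alive (and deep-copied) inside a single
    # long-lived IterativePSFPhotometry instance.
    psfphot = PSFPhotometry(psf_model, fit_shape=(5, 5),
                            finder=DAOStarFinder(threshold=5.0, fwhm=2.0),
                            aperture_radius=5)
    tbl = psfphot(residual, init_params=init_params)
    results.append(tbl)

    # Subtract the fitted stars so the next pass searches the residual image,
    # and let the finder detect new sources from here on.
    residual = psfphot.make_residual_image(residual, psf_shape=(25, 25))
    init_params = None
```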

Python: 3.10
photutils: 1.10.0
astropy: 6.0.0
numpy: 1.25.2
Operating system: macOS 12.5

[Figure: mprof memory usage vs. time profile of IterativePSFPhotometry]

larrybradley commented Feb 15, 2024

@mwhosek Many thanks for the detailed report. Can you please run your code and profiling again using the current dev version of Photutils (e.g., pip install -U "photutils[all] @ git+https://github.com/astropy/photutils.git") and report back?

There was a memory leak in copying GriddedPSFModel objects that I fixed in #1679. That fix hasn't made it into a release yet. That would explain why your PSFPhotometry objects are so large. They are supposed to be relatively lightweight (even for 60k sources).
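
If it helps to confirm where the memory is going, one quick check is to measure the allocation triggered by a deepcopy of the PSF model with the standard-library tracemalloc module. This is just a sketch, assuming psf_model is the GriddedPSFModel already in use:

```python
import copy
import tracemalloc

# psf_model: an existing photutils GriddedPSFModel (e.g., the WebbPSF grid
# from the report above).
tracemalloc.start()
model_copy = copy.deepcopy(psf_model)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"deepcopy of the PSF model allocated ~{peak / 1e6:.1f} MB at peak")
```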

mwhosek commented Feb 15, 2024

@larrybradley thanks for the response. The dev version (1.10.1.dev92+g232edaed) works great! The PSFPhotometry object only uses ~0.5 GB now compared to ~10 GB before. This makes life much easier :).

The dev version appears to run ~1.5x slower than 1.10.0, in case that is a concern, but I'd gladly trade the extra computing time for the memory improvement.

larrybradley commented

Thanks, @mwhosek. I released v1.11.0 on Friday with the memory fix. I'm curious about your slowdown. My test case (1000 GriddedPSFModels in a 4k x 4k image) with the new code actually ran ~1.4x faster. In any case, I have ideas for further performance improvements (incl. multiprocessing).
