-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not explicitly call gc.collect in CompImageHDU.compressed_data
#14576
Conversation
Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.
|
👋 Thank you for your draft pull request! Do you know that you can use |
I agree that users who run into issues should simply call gc.collect themselves - we shouldn't call it by default as this can have a large performance penalty. |
But people been relying on this indirectly for so long, so now suddenly they have to manually call |
@pllim - I'm not sure it would actually change anything for most users, we shouldn't actually have to call garbage collection explicitly and it's surprising that the code even includes this. It might have been added as a 'just in case' but the main use case it might have been intended for is opening many files in a small amount of time which is what @Cadair is doing but it seems to be hindering not helping. |
Is asking over at astropy-dev mailing list necessary, just in case? |
I can't imagine that even if anyone is relying on this they know it exists, so asking wont help. |
Should at least add a change log then? |
Sure. I was waiting to see if people a) rejected it outright and b) that all the tests passed 😀 |
For what it's worth, I just loaded 10800 FITS files with memmap enabled without this line and I didn't hit any issues. I am not sure when dask would have dropped the references to them, so I don't know how many were open at once, but it certainly wasn't keeping them around forever. |
I agree we should not call
Interesting, though problems may arise only in specific configurations, e.g. did you take care of not keeping references to I guess the other case to check, where people could be impacted by this change, is reading one or a few big files, keeping / not keeping references to |
I didn't but dask might have done.
I am not suggesting this is a comprehensive test. |
@@ -0,0 +1,2 @@ | |||
Do not call ``gc.collect()`` in ``CompImageHDU.compressed_data`` as it has |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically in the deleter right? Maybe rephrase to say when closing a ConpImageHDU since that is when people are most likely to call this indirectly?
c0de06b
to
0dbf5d5
Compare
@astrofrog @saimn I think this is good to go? |
I did a few quick tests and I'm confused, it seems we have a memory leak with the new code ? |
I am guessing this needs to be rebased on top of #14649 and re-reviewed after that one is merged first? |
0dbf5d5
to
23420f1
Compare
23420f1
to
7e310fc
Compare
Running again some tests after #14649, I don't see a difference with or without |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Cadair !
I have a nice pathological benchmark where I load 150 files and read all the data:
Before:
After:
We already discuss the reason why this line exists pretty comprehensively in the docs here: https://docs.astropy.org/en/stable/io/fits/appendix/faq.html#i-am-opening-many-fits-files-in-a-loop-and-getting-oserror-too-many-open-files
I am not really sure I see the benefit of this still being here?