Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close ascii files with unified file read interface #11809

Merged
merged 17 commits into from Jun 8, 2021

Conversation

larrybradley
Copy link
Member

@larrybradley larrybradley commented Jun 2, 2021

The regions package recently added the unified file read/write interface for regions (ascii) files. While running tests, I noticed that region files read with the unified I/O give a ResourceWarning about the file being unclosed. This exact issue was reported in #8673.

Starting with the hints provided by @taldcroft and @saimn in #8673 and #8675, I have a workaround/solution (based on code in click: https://github.com/pallets/click/blob/9da166957f5848b641231d485467f6140bca2bc0/src/click/_compat.py#L61). Essentially TextIOWrapper always closes the wrapped file when it is garbage collected and this somehow messes up the file closing by the context manager. The solution is to detach the underlying stream from TextIOWrapper before it is deleted (i.e., in its __del__ method). The same issue appears to apply to BufferedReader because both were necessary to close the files.

Those changes fixed most of the failing io.ascii tests due to files being left open (i.e., after no longer ignoring the ResourceWarnings in setup.cfg). The remaining test failures in the tests and documentation (doctests) were simple cases of files being opened, but not closed (fixed here).

There was one remaining test case that left a file open, but I could not fix. I removed this test because the fix requires a fix in an upstream package (jplephem). Essentially jplephem opens an invalid ephem file and then immediately raises a ValueError before closing the file. There's no way to fix that on the astropy side.

Closes #8673.

Addresses part of #9619 .

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applaud your bravery to jump into the resource warning rabbit hole. 👏

astropy/coordinates/tests/test_solar_system.py Outdated Show resolved Hide resolved
@@ -1047,8 +1047,10 @@ def test_guessing_file_object():
"""
Test guessing a file object. Fixes #3013 and similar issue noted in #3019.
"""
t = ascii.read(open('data/ipac.dat.bz2', 'rb'))
fd = open('data/ipac.dat.bz2', 'rb')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the cases that calls open, can we not use context manager so it still exists gracefully if something went wrong with the test itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably a good idea, but I didn't want to change the tests too much.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess we can always revisit in follow-up PR(s) if you don't want to address this here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a few at least. In one, the diff does become substantially larger as the whole test is indented. I decided not to follow through too much on the fits hdul tests...

astropy/io/fits/tests/test_image.py Show resolved Hide resolved
astropy/utils/data.py Show resolved Hide resolved
@pllim pllim requested a review from embray June 2, 2021 16:52
@mhvk mhvk mentioned this pull request Jun 2, 2021
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to actually have dived into this! But it still feels like an ugly hack! Also strange, that open itself has a closefd argument that would seem so helpful, but unfortunately BufferedReader does not. It still feels like that should be possible to use, but it got messy fast (I tried open(fileobj.fileno(), encoding=encoding, closefd=False) which helps for many cases, but then others do not have a working fileno(), etc.)

Anyway, I suggested an alternative in #11810, which is also hacky, but a bit more localized. Note that both PRs still have a failing test.

astropy/coordinates/tests/test_solar_system.py Outdated Show resolved Hide resolved
@larrybradley
Copy link
Member Author

larrybradley commented Jun 2, 2021

This turned out to be a bigger rabbit hole than I originally expected. All of the tests were passing for me locally, but I neglected to run them with remote data. :(. The tests that require remote data are the ones that are failing now. I fixed several of them in the last two commits.

There's a set of tests still failing due to coordinates/solar_system.py. The root cause of those goes back to jplephem again. The _get_kernel function opens the file, but never closes it:

https://github.com/astropy/astropy/blob/main/astropy/coordinates/solar_system.py#L170
and https://github.com/astropy/astropy/blob/main/astropy/coordinates/solar_system.py#L179

Adding a kernel.close() at the end of the _get_body_barycentric_posvel function fixed all but one of the failing tests. The remaining failing test complained about trying to operate on a closed file, so my solution wasn't correct (or something else is needed). Perhaps a coordinates maintainer can take a look? In any case the issues with coordinates/solar_system.py are unrelated to the original issue of the Unified I/O ascii read leaving files open.

Also, I'm happy to go with the alternate solution by @mhvk in #11810. Thanks, @mhvk!

@mhvk
Copy link
Contributor

mhvk commented Jun 2, 2021

@larrybradley - I'll have a look at solar_system_ephemeris. In principle, it does try to close the file, in the public method at

@classmethod
def get_kernel(cls, value):
# ScienceState only ensures the `_value` attribute is up to date,
# so we need to be sure any kernel returned is consistent.
if cls._kernel is None or cls._kernel.origin != value:
if cls._kernel is not None:
cls._kernel.daf.file.close()
cls._kernel = None
kernel = _get_kernel(value)
if kernel is not None:
kernel.origin = value
cls._kernel = kernel
return cls._kernel

But it sounds like code does not always get there.

On the solution, both are somewhat hacky, but most important seems to be to trace down the remaining problems... The one in io/fits/tests/test_table.py seems particularly tricky.

@mhvk
Copy link
Contributor

mhvk commented Jun 2, 2021

Ah, your pointer to _get_body_barycentric_posvel is very useful - it bypasses the solar_system_ephemeris system....

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2021

@larrybradley - OK, I have a fix for solar_system which makes remote tests pass (plus a small fix in test_solar_system for skyfield). Shall I push it to your branch? Or open a PR to yours? Or just put them on my alternative, so you can cherry-pick what you want?

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2021

p.s. One sad part of our tests taking so very long, is that I've taken to test only specific files, which often ends up missing bugs. Especially bad are the remote tests where for every invocation the full DE ephemeris has to be downloaded... I wish there was some better caching possible... But let me be smart and open a separate issue...

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2021

Darn, I had forgotten to include my fix for test_table in reverting to your method of dealing with detaching. Sigh.

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2021

Hmm, the windows/all-dependencies error still seems related. @larrybradley - do you want to take back control? If not, I'll try sometime in the coming days...

@pllim
Copy link
Member

pllim commented Jun 3, 2021

This is not looking very backportable. Can we re-milestone to 5.0?

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have another look once @larrybradley and @mhvk sort out the details. Thanks!

astropy/coordinates/tests/test_solar_system.py Outdated Show resolved Hide resolved
@@ -1047,8 +1047,10 @@ def test_guessing_file_object():
"""
Test guessing a file object. Fixes #3013 and similar issue noted in #3019.
"""
t = ascii.read(open('data/ipac.dat.bz2', 'rb'))
fd = open('data/ipac.dat.bz2', 'rb')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess we can always revisit in follow-up PR(s) if you don't want to address this here.

astropy/io/fits/tests/test_image.py Show resolved Hide resolved
@pllim
Copy link
Member

pllim commented Jun 3, 2021

Re: Windows failure -- Maybe you missed one?

def test_detect_zipped(self):
"""Test detection of a zip file when the extension is not .zip."""
zf = self._make_zip_file(filename='test0.fz')
assert len(fits.open(zf)) == 5

@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2021

@pllim - thanks, that looks right. Though why did it pass on non-windows? Anyway, will try!

docs/io/fits/index.rst Outdated Show resolved Hide resolved
@mhvk
Copy link
Contributor

mhvk commented Jun 3, 2021

Finally, tests pass!!

@larrybradley
Copy link
Member Author

This is ready to be merged.

@pllim
Copy link
Member

pllim commented Jun 7, 2021

I'll review in a few days unless someone else gets to it first.

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't grok the solar_system.py changes but the rest LGTM. As much as I itch to turn everything to use context manager, I understand it is out of scope here.

Thanks!

@pllim pllim modified the milestones: v4.0.6, v5.0 Jun 8, 2021
@pllim
Copy link
Member

pllim commented Jun 8, 2021

I moved the milestone as I don't think this will backport cleanly, especially the solar_system.py stuff. Merging now.

@pllim pllim merged commit 6387507 into astropy:main Jun 8, 2021
@larrybradley larrybradley deleted the close-files branch June 8, 2021 17:48
@taldcroft
Copy link
Member

Thank you so much @larrybradley and @mhvk and @pllim for finally solving this thorny issue! I tried once and failed (and really didn't want to touch this again), so it is great to see others step up and make it happen. 🎉

@mhvk
Copy link
Contributor

mhvk commented Jun 8, 2021

Indeed, thanks, @larrybradley, for starting this and for another nice collaborative effort! With as a bonus that the solar system stuff is a bit more logical and better too!

@larrybradley
Copy link
Member Author

Many thanks everyone, especially @mhvk who fixed the solar system stuff!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ResourceWarning when reading ascii tables
5 participants