New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve map read method #1246

Merged
merged 6 commits into from Jan 11, 2018

Conversation

Projects
None yet
3 participants
@woodmd
Member

woodmd commented Jan 10, 2018

This PR addresses #1242 by rewriting MapBase.read to allow it to work when called from a base class. Here are the ways the read method can be called:

  • When called from any of the non-base classes (WcsMapND, HpxMapND, HpxMapSparse) the method will instantiate an object of that class. This was the existing behavior of this method.
  • When called from an abstract base class (MapBase, WcsMap, or HpxMap) it will instantiate a non-sparse map appropriate to the format of the file (WcsMapND if the file is WCS format or HpxMapND for a HPX file).
  • When called from an abstract base class with the map_type argument it will try to instantiate a map of that type. If that map type is inconsistent with the file format then an exception is thrown.

@cdeil cdeil requested review from cdeil and registerrier Jan 11, 2018

@cdeil cdeil self-assigned this Jan 11, 2018

@cdeil cdeil added the feature label Jan 11, 2018

@cdeil cdeil added this to the 0.7 milestone Jan 11, 2018

@registerrier

This comment has been minimized.

Contributor

registerrier commented Jan 11, 2018

Thanks this looks good. It should avoid trying to create a base class from the read method.

@cdeil

@woodmd @registerrier - How about this?https://github.com/woodmd/gammapy/pull/1/files

I find it slightly simpler - the MapBase.read doesn't need to concern itself any more with a list of intermediate base classes, having "auto" instead of "None" is a matter of taste - I find uniformity in type helpful, makes documentation and configuration a little easier, even compared to the default of None.

I think the format discovery is a little better because it only reads FITS headers, no data. I'm not sure if the current implementation loaded the data twice (once for format discovery, once in the from_hdulist call. Probably not, the data is loaded once in format discovery and then stored on the hdulist object and just read again from memory in the from_hdulist call creating the map?

One more thing I saw: in many places you support two strings, e.g. "wcs" and the classname as a string like "MapWCSND". Suggest to remove the second option. Having two ways to do the same thing will just lead to more work for us (testing, documentation) and confusion for users (some use one string, others the second). @woodmd @registerrier - Thoughts on this point?

@woodmd

This comment has been minimized.

Member

woodmd commented Jan 11, 2018

Thanks for the suggestions. I agree your approach is cleaner so let's go with that. The current implementation was only reading the data once. My understanding is that astropy.io.fits generally defers loading data until you reference the data member of an HDU.

One more thing I saw: in many places you support two strings, e.g. "wcs" and the classname as a string like "MapWCSND". Suggest to remove the second option. Having two ways to do the same thing will just lead to more work for us (testing, documentation) and confusion for users (some use one string, others the second). @woodmd @registerrier - Thoughts on this point?

Agreed. I'll amend this PR now to remove the class string options.

Merge pull request #1 from cdeil/maps-read
Improve MapBase.read a bit in gammapy.maps
@cdeil

This comment has been minimized.

Member

cdeil commented Jan 11, 2018

My understanding is that astropy.io.fits generally defers loading data until you reference the data member of an HDU.

This is also my understanding.

Note that you're accessing hdu.data in your three find_hdu utillity functions in gammapy/maps/utils.py, e.g. here:

if hdu.data is not None and isinstance(hdu, fits.ImageHDU):

I think this could sometimes trigger unnecessary data reads from disk for users, specifically if the primary HDU contains data but is not what they want, e.g. in find_bintable_hdu. I don't know if it occurs in practice though.

A while ago I wrote this with a similar purpose, trying to add some conveniences to HDUList:
http://docs.gammapy.org/dev/api/gammapy.utils.fits.SmartHDUList.html
I'm not sure yet if that was a good idea / API or not.
I think you could use it instead of your helper functions in gammapy/maps/utils.py, if you're willing to couple gammapy.maps to the rest of Gammapy. At the moment I think it's still completely standalone, i.e. doesn't import from anywhere else like gammapy.utils. But I think over time gammapy.maps will probably import something from gammapy.utils or gammapy.data and loosely couple to the rest of Gammapy anyways, no?

@woodmd

This comment has been minimized.

Member

woodmd commented Jan 11, 2018

I think this could sometimes trigger unnecessary data reads from disk for users, specifically if the primary HDU contains data but is not what they want, e.g. in find_bintable_hdu. I don't know if it occurs in practice though.

Ah I didn't think of this but I suspect is not None doesn't trigger a read. In retrospect I don't recall why this check was necessary. Is there a prescribed way in astropy.io.fits to check if an HDU is empty?

I think you could use it instead of your helper functions in gammapy/maps/utils.py, if you're willing to couple gammapy.maps to the rest of Gammapy. At the moment I think it's still completely standalone, i.e. doesn't import from anywhere else like gammapy.utils. But I think over time gammapy.maps will probably import something from gammapy.utils or gammapy.data and loosely couple to the rest of Gammapy anyways, no?

I don't have any particular attachment to the helper functions in gammapy/maps/utils.py and I wasn't intentionally trying to keep gammapy.maps decoupled from the rest of gammapy. I agree it makes more sense to have functions like this in gammapy.utils so that it can be more easily shared with other submodules that need to do FITS I/O. If the functions you wrote in gammapy.utils already provide the same functionality as the ones in gammapy.maps feel free to open a PR to replace them.

@cdeil

This comment has been minimized.

Member

cdeil commented Jan 11, 2018

@woodmd - I came up with this to handle the empty HDU:

if hdu_object.is_image and len(hdu_object.shape) > 0:

I think I looked around the astropy.io.fits code and found that pattern somewhere else.

I'm pretty sure your hdu.data in the if hdu.data line in your utils function will trigger the data read from disk, and should be avoided.

I think using gammapy.utils.fits for maps and getting rid of your utils functions would be a good move. My idea was to use gammapy.utils.fits everywhere in Gammapy for FITS I/O if astropy.io.fits doesn't have some conveniences. Should I try to do it here via a commit, or merge now and I make a follow-up PR?

@cdeil

This comment has been minimized.

Member

cdeil commented Jan 11, 2018

I think (hope) hdu_object.shape comes from the FITS header information and doesn't trigger a HDU data read.

@woodmd

This comment has been minimized.

Member

woodmd commented Jan 11, 2018

Should I try to do it here via a commit, or merge now and I make a follow-up PR?

I suggest we do this in a separate PR.

@cdeil

This comment has been minimized.

Member

cdeil commented Jan 11, 2018

Yes, looks like the hdu.shape access doesn't load the data, i.e. is what we want here:
https://github.com/astropy/astropy/blob/1f01ba9d51616453246ad5440ddc9a2e611357b3/astropy/io/fits/hdu/image.py#L212

I suggest we do this in a separate PR.

So ready to merge here, no?
I'd suggest to keep #1242 open for a bit, as it still contains suggestions to improve docs.

@cdeil

cdeil approved these changes Jan 11, 2018

@woodmd

This comment has been minimized.

Member

woodmd commented Jan 11, 2018

So ready to merge here, no?
I'd suggest to keep #1242 open for a bit, as it still contains suggestions to improve docs.

Agreed on both counts.

@cdeil cdeil merged commit 466f0b0 into gammapy:master Jan 11, 2018

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment