Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce "gadf" format for RegionNDMap #3231

Merged
merged 11 commits into from
Feb 17, 2021
Merged

Conversation

adonath
Copy link
Member

@adonath adonath commented Feb 9, 2021

Description

This pull request introduces a proof of concept for a "gadf" format for RegionNDMap. It serialises as following:

The main question here is whether this is needed at all for now. The main use case I hand in mind was to be able to serialise a RegionNDMap with multiple axes. However this currently only applies to the EDispKernelMap and PSFMap, for both we can fall back to different solutions: they can be serialised as a WCS map with one spatial bin.

Beside the extra axes the serialisation will also keep all the region information on the maps, which might be interesting for provenance and it provides a simple solution to serialise both SpectrumDataset as well as SpectrumDatasetOnOff in single FITS file, which is probably a convenient alternative to the current ogip format.

The goal here is not to define any official DL4 data format, but just provide a simple and convenient way to serialise a RegionNDMap of any kind. The format will not be optimised by any means, but contain all the info needed to allow for round-tripping of a RegionNDMap, without loss of information.

If we decide this is a useful addition to our serialisation functionality, the only remaining thing is to discuss the structure, that is used to store the data. Beside the ImageHDU, with no associated WCS, it could be an BinTableHDU as well, similar to the IRF DL3 formats.

@adonath adonath marked this pull request as draft February 9, 2021 21:50
@adonath adonath self-assigned this Feb 9, 2021
@adonath adonath added the feature label Feb 9, 2021
@adonath adonath added this to To do in gammapy.maps via automation Feb 9, 2021
@adonath adonath added this to the v0.19 milestone Feb 9, 2021
@codecov
Copy link

codecov bot commented Feb 9, 2021

Codecov Report

Merging #3231 (376eaeb) into master (ded60ed) will decrease coverage by 0.00%.
The diff coverage is 94.62%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3231      +/-   ##
==========================================
- Coverage   93.91%   93.90%   -0.01%     
==========================================
  Files         144      144              
  Lines       17450    17478      +28     
==========================================
+ Hits        16388    16413      +25     
- Misses       1062     1065       +3     
Impacted Files Coverage Δ
gammapy/maps/core.py 86.92% <50.00%> (-0.17%) ⬇️
gammapy/maps/regionnd.py 92.22% <91.89%> (+<0.01%) ⬆️
gammapy/maps/geom.py 91.05% <96.77%> (-0.07%) ⬇️
gammapy/datasets/io.py 95.10% <100.00%> (ø)
gammapy/irf/core.py 91.10% <100.00%> (ø)
gammapy/maps/region.py 90.57% <100.00%> (+0.28%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ded60ed...376eaeb. Read the comment docs.

@registerrier
Copy link
Contributor

Thanks @adonath for starting this.

This pull request introduces a proof of concept for a "gadf" format for RegionNDMap. It serialises as following:

Technically, we need a specific format for map serialization but it should decoupled from data level issues. I am not sure it is relevant to have Map serialization as part of higher level data. After all, this is really basic FITS convention on how to store multidimensional quantities that is not fully supported by WCS.

  • Main HDU as an ImageHDU with the data, but no WCS info (to be defined / documented)

How about a simple table HDU?

Fine. As a side note, the region convention is a regular FITS convention, it is not connected to ogip, no?

OK

The main question here is whether this is needed at all for now. The main use case I hand in mind was to be able to serialise a RegionNDMap with multiple axes. However this currently only applies to the EDispKernelMap and PSFMap, for both we can fall back to different solutions: they can be serialised as a WCS map with one spatial bin.

I think we need it. While OGIP format can support the 1D spectral analysis. It is not fully compliant with all use cases (e.g. what if I want to serialize a SpectrumDataset with a FoVBackgroundModel).

Beside the extra axes the serialisation will also keep all the region information on the maps, which might be interesting for provenance and it provides a simple solution to serialise both SpectrumDataset as well as SpectrumDatasetOnOff in single FITS file, which is probably a convenient alternative to the current ogip format.

+1

The goal here is not to define any official DL4 data format, but just provide a simple and convenient way to serialise a RegionNDMap of any kind. The format will not be optimised by any means, but contain all the info needed to allow for round-tripping of a RegionNDMap, without loss of information.

Indeed. And again something like WcsNDMap or ``RegionNDMap` does not really belong to DL4. It is not describing a data format but rather a possible FITS convention.

If we decide this is a useful addition to our serialisation functionality, the only remaining thing is to discuss the structure, that is used to store the data. Beside the ImageHDU, with no associated WCS, it could be an BinTableHDU as well, similar to the IRF DL3 formats.

@adonath
Copy link
Member Author

adonath commented Feb 10, 2021

Thanks for the feedback @registerrier, I'll explore the use of a BinTableHDU for the data instead...

@adonath
Copy link
Member Author

adonath commented Feb 11, 2021

@registerrier Following your suggestion I changed to use a BinTableHDU and decided to merge the bands HDU with the data hdu (similar to the ogip specresp). The table has now the following structure:

image

I think together with the region HDU this is a very clean structure now. I'll start adding more tests and documentation now.

@adonath adonath marked this pull request as ready for review February 11, 2021 14:11
registerrier
registerrier previously approved these changes Feb 11, 2021
Copy link
Contributor

@registerrier registerrier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adonath . This looks good!
The proposed format is nice. I just wonder whether we should call it gadf for now.
I have left a few comments inline.

`~RegionNDMap.write()` and `~RegionNDMap.read()` methods. Currently
the following formats are supported:

- "gadf": a generic serialisation format with support for ND axes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call it gadf, if it is still internal?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choosing a different name for this format is possible but introduces the problem of defining multiple formats in MapDataset.write(). Consider the case where dataset.psf is a RegionGeom (assuming we want to support this) and one uses MapDataset.write(format="gadf"). I guess we don't really want to introduce something like ,write(format={"counts": "gadf", "background": "fgst-ccube", "psf": "gp-region"}), this would make the number of possible single fits file definitions for a MapDataset explode...


- "gadf": a generic serialisation format with support for ND axes
- "ogip" / "ogip-sherpa": an ogip-like counts spectrum with reconstructed energy axis
- "ogip-arf" / "ogip-sherpa": an ogip-like effective area with true energy axis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically you should also have the ogip-rmf for the Region EdispKernel no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is true. My proposal would be that I enable the format for IRF maps such as PSFMap and EDispKernelMap in a follow up PR.

return "wcs"
else:
return "region"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be tested?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I'll add a test...

@adonath
Copy link
Member Author

adonath commented Feb 11, 2021

Thanks a lot @registerrier, I have made a request here: open-gamma-ray-astro/gamma-astro-data-formats#170, if it is accepted, I think we can keep the "gadf" name, otherwise I'll choose a different one.

@adonath
Copy link
Member Author

adonath commented Feb 17, 2021

Following a suggestion by @registerrier in the developer call, I now changed to introducing a dedicated bin table HDU for the data as well. This allows to share a common bands HDU if multiple "spectra" are stored within the same FITS file (see e.g. serialisation of #3075)

@adonath
Copy link
Member Author

adonath commented Feb 17, 2021

@registerrier In order to make progress, I'll take the risk and merge this PR as is. The PR in the gammapy astro data format repo is open and any user can go there to look at the definition. I assume it will be merged in a some version close to the proposed one. If we decide the definition of the FITS conventions is not part of GADF, I guess we have to adapt the tag in Gammapy consistently anyway.

@adonath adonath merged commit d9749f0 into gammapy:master Feb 17, 2021
gammapy.maps automation moved this from In progress to Done Feb 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
gammapy.maps
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants