Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add `MapDataset.create()` method #2368

Merged
merged 9 commits into from Sep 18, 2019

Conversation

@AtreyeeS
Copy link
Member

commented Sep 15, 2019

This PR adds a MapDataset.create() which will be useful for stacking datasets.
We need to agree on the parameters. Instead of the ones I have now, we can have (as per @adonath original suggestion)
MapDataset.create(geom, energy_true_axis=None, migra_axis=None, rad_axis=None, binsz_irf=”0.2 deg”).
My issue with this is that we might want to compute the IRFs on not just a coarser binning but also a larger spatial map (to account for spill overs). In this case, we should specify a IRF_spatial_geom instead. This should be only in 2-d. This might be a bit confusing, so I have stuck with geom and geom_irf are both 3-d.
I suppose the users should not really have to worry about these binnings at all.
Maybe it might be helpful to have a helper function get_default_binnings(obs, geom) which returns rad, migra and e_true bins from the IRFs, and a coarse IRF geometry depending upon geom?

@registerrier @adonath @luca-giunti : Please opine!

@adonath adonath self-requested a review Sep 16, 2019
@AtreyeeS AtreyeeS added this to the 0.14 milestone Sep 16, 2019
Copy link
Member

left a comment

Thanks a lot @AtreyeeS! I've left a few comments...

psf=psf,
edisp=edisp,
background_model=background_model,
name="empty_",

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

I think it's OK to just stick with the empty string as a default.


gti = GTI.create([0 * u.s], [0 * u.s])

edisp, psf = None, None

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

I think I would change the logic here a little bit. I would always create a PSFMap and EDispMapby default using some default for the rad_axis and migra_axis. In most cases uses will not care about the binning of the rad_axis and migra_axis and I think we should not force them to define one...

This comment has been minimized.

Copy link
@AtreyeeS

AtreyeeS Sep 16, 2019

Author Member

If we want the default binning to be the binnings in the IRFs (as implemented in #2358), then its a bit difficult to do it inside MapDataset as the MapDataset as no idea about the observations and the IRFs being used to create the same. So then the binning here has to be completely arbitrary.
That is why I thought maybe it would be better to have a small helper function to create default binnings, and this is what is passed to the MapDataset.create() normally.

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

OK, I see. In https://github.com/gammapy/gammapy/blob/master/gammapy/cube/make.py#L332 you also copy the binning from the PSF3D and EnergyDispersion2D objects. Actually I think it might still be a good idea to introduce somewhere the default RAD_AXIS and MIGRA_AXIS objects, they could be used in both places then.

This comment has been minimized.

Copy link
@registerrier

registerrier Sep 16, 2019

Contributor

We can put some default value, but it is clear that for a regular (high level driven) analysis the geometry will be defined on initialization of the loop. Even if default values are entered in the config file, we should probably never use default values in this create methods but pass a default geometry deduced from IRF at the start.
The function in charge to create this default geometry has to know about the IRF used in the datastore, so this cannot be the job of Dataset.create

This comment has been minimized.

Copy link
@adonath

adonath Sep 17, 2019

Member

In my view, the .create() method should make it simple to create an empty MapDataset, without much configuration, just like we have for Map.create(). I think in this case it's even desired, that we make a choice on good default values. For this method I would also choose the most intuitive parametrisation (e.g. MapDataset.create(geom, energy_axis_true=None, rad_axis=None, migra_axis=None, bins_irf="0.2 deg", margin_irf="0.5 deg"). Most of the users will never specify rad_axis nor migra_axis or bins_irf...

I can see the point of copying the binning from the exported IRFs, but on the other hand I think it's not even consistent between e.g. HAP and HAP-France exported data. Also I think that the choice of the exported binning was rather arbitrary and not optimised in terms of memory vs. precision.

In addition we could maybe add MapDataset.from_geoms(geom, geom_true, geom_psf, geom_edisp), which is more flexible and offers full control. .create() would internally call .from_geoms().


mask = np.ones(geom.data_shape, dtype=bool)

gti = GTI.create([0 * u.s], [0 * u.s])

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

Just pass an empty list see e.g. https://github.com/gammapy/gammapy/blob/master/gammapy/spectrum/dataset.py#L397. By filling it with zeros you've already create a first GTI...

counts=counts,
exposure=exposure,
mask_fit=None,
psf=psf,

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

Note that psf should be a PSFMap object, that we have to create somewhere before. The PSFMap also contains the exposure map on the same grid as the psf map.

This comment has been minimized.

Copy link
@AtreyeeS

AtreyeeS Sep 16, 2019

Author Member

Ok. I was a bit unsure if we should finally intend to pass PSFMap.psf_map or PSFMap into MapDataset. I guess PSFMap.exposure_map will be required in the stacking, so yes, I will change this a bit

exposure=exposure,
mask_fit=None,
psf=psf,
edisp=edisp,

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

Note that edisp should be an EDispMap object, that we have to create somewhere before. The EDispMap also contains the exposure map on the same grid as the edisp map.


energy_axis = geom_irf.get_axis_by_name("ENERGY")

exposure_geom = geom.to_image().to_cube([energy_axis])

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

Here you copy the spatial part of the counts map, so if geom_irf is defined on a larger spatial footprint the information is lost here. Maybe just upsample the exposure?

This comment has been minimized.

Copy link
@AtreyeeS

AtreyeeS Sep 16, 2019

Author Member

This is the final exposure map, which I believe should be on the same spatial geom as the counts map, but with a true energy axis.

This comment has been minimized.

Copy link
@adonath

adonath Sep 18, 2019

Member

To handle spill-over of the PSF correctly at the edges of the predicted counts map, the exposure map also needs to be larger (or have a margin of the width of the PSF). So I think it would make more sense to create it from the geom_irf and upsample it to the binning of the counts map...

This comment has been minimized.

Copy link
@adonath

adonath Sep 18, 2019

Member

Sorry @AtreyeeS! Please ignore my previous comment, I realised our model evaluation list not yet ready to handle the case of the larger exposure map completely. So introducing it here, would require additional changes in the model evaluation. Let's do it a later PR consistently.

@@ -267,6 +268,63 @@ def npred(self):

return npred_total

@classmethod
def create(cls, geom, geom_irf=None, migra_axis=None, rad_axis=None):

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

Note that we also need a reference_time to create the empty GTI table see e.g. https://github.com/gammapy/gammapy/blob/master/gammapy/spectrum/dataset.py#L370.

This comment has been minimized.

Copy link
@AtreyeeS

AtreyeeS Sep 16, 2019

Author Member

If I make it with empty start and stop times instead of zeros, is the reference time still important?

This comment has been minimized.

Copy link
@adonath

adonath Sep 16, 2019

Member

Yes, it is part of the meta data of the GTI table, that's why we define it in the beginning. Alternatively one could probably define a logic to take the reference time from the first GTI, that is stacked into the table, but this seem much more complicated...

@adonath

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

I was wondering whether we should split the functionality into two methods, as we have for our Map API:

  • Add a MapDataset.from_geoms(geom, geom_exposure, geom_psf, geom_edisp), which just creates all the empty maps and instanciates the MapDataset. This also gives users full flexibility of the geometries...
  • Add an intuitive Map.create(geom, energy_axis_true=None, rad_axis=None, migra_axis=None, binsz_irf="0.2 deg", margin_irf="0.3 deg") method, which uses a simpler parametrisation and gives acmes to the most important parameters...
@adonath adonath self-assigned this Sep 16, 2019
@adonath adonath added the feature label Sep 16, 2019
@adonath

This comment has been minimized.

Copy link
Member

commented Sep 18, 2019

Thanks a lot @AtreyeeS! I've left one more comment about the size of the exposure map and one more comment in Slack about the defaults for rad_axis and migra_axis. Once those are addressed we can merge.

AtreyeeS added 6 commits Sep 15, 2019
@adonath adonath merged commit 2c05625 into gammapy:master Sep 18, 2019
7 of 8 checks passed
7 of 8 checks passed
continuous-integration/travis-ci/pr The Travis CI build failed
Details
Scrutinizer Analysis: No new issues – Tests: passed
Details
gammapy.gammapy Build #20190918.8 succeeded
Details
gammapy.gammapy (DevDocs) DevDocs succeeded
Details
gammapy.gammapy (Lint) Lint succeeded
Details
gammapy.gammapy (Test Python36) Test Python36 succeeded
Details
gammapy.gammapy (Test Windows36) Test Windows36 succeeded
Details
gammapy.gammapy (Test Windows37) Test Windows37 succeeded
Details
@adonath

This comment has been minimized.

Copy link
Member

commented Sep 18, 2019

Thanks @AtreyeeS! Travis fails are unrelated...

@adonath adonath changed the title Add MapDataset.create method Add `MapDataset.create()` method Sep 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.