
Implement MapDataset npred evaluation using cutouts #2071

Merged (8 commits), Mar 8, 2019

@adonath (Member) commented Mar 4, 2019:

This PR includes the following changes:

  • Create individual MapEvaluator objects for each model component
  • Implement the MapEvaluator.update() method, which updates the model evaluator by cutting out the corresponding part of the global exposure map and setting the edisp as well as the psf.
  • Implement the MapEvaluator.needs_update attribute, which checks whether the model component has drifted out of its cutout region. Currently the evaluator is updated when the model component has drifted more than 0.25 deg from the cutout center, which is approximately half a PSF kernel size. While this seems to work, it can surely be improved, e.g. by choosing a smaller update threshold for point sources, where the PSF is much more important. For extended sources one could probably choose a larger one.
  • Update the analysis_3d.ipynb notebook to run the fit on the whole image instead of a small cutout. This is now possible because the performance of the fit has improved a lot. I also removed the region mask, so that the background norm is better constrained.
  • Implement the likelihood option for MapDataset, which was documented but not implemented.
  • Minor changes to the hess.ipynb, fermi_lat.ipynb and analysis_3d_joint.ipynb notebooks.
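
The needs_update logic described above can be sketched roughly as follows. This is a minimal flat-sky toy with hypothetical names, not the gammapy implementation, which works on proper sky coordinates:

```python
import math

# Assumed value from the PR description: roughly half a PSF kernel size.
UPDATE_THRESHOLD = 0.25  # deg

def needs_update(model_pos, cutout_center):
    """Toy flat-sky drift check (hypothetical; gammapy uses real sky
    coordinates). Positions are (lon, lat) tuples in deg."""
    dlon = model_pos[0] - cutout_center[0]
    dlat = model_pos[1] - cutout_center[1]
    return math.hypot(dlon, dlat) > UPDATE_THRESHOLD

# A small drift keeps the cached cutout; a large one triggers an update.
assert not needs_update((83.6, 22.0), (83.6, 22.0))
assert needs_update((83.9, 22.0), (83.6, 22.0))
```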

@adonath adonath self-assigned this Mar 4, 2019

@adonath adonath added this to In progress in Modeling via automation Mar 4, 2019

@adonath adonath added this to the 0.11 milestone Mar 4, 2019

@adonath adonath requested a review from JouvinLea Mar 5, 2019

# TODO: look up the correct PSF for this component
width = np.max(psf.psf_kernel_map.geom.width) + 2 * self.model.evaluation_radius

self.exposure = exposure.cutout(position=self.model.position, width=width)

@JouvinLea (Contributor), Mar 5, 2019:

Maybe add a TODO comment saying that the cutout will also be applied to the PSF and EDISP once the IRF maker is validated and merged!

e_true_axis = geom_etrue.get_axis_by_name("energy")
e_reco_axis = geom.get_axis_by_name("energy")
e_true = e_true_axis.edges * e_true_axis.unit
e_reco = e_reco_axis.edges * e_reco_axis.unit

@JouvinLea (Contributor), Mar 5, 2019:

Why is this related to this PR?

@adonath (Author, Member), Mar 6, 2019:

It was a bug that I just fixed along with this PR, because it appeared in the npred evaluation. The edisp was not initialised with units on the energy axis, which led to incompatible map geometries (with incompatible units on the energy axis). So a simple addition of the background and the spread map was not possible. Hope that makes sense...
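
The geometry mismatch described here can be illustrated with a toy sketch (a hypothetical stand-in class, not the gammapy API; the real MapAxis carries astropy units): two energy axes only agree if both their edge values and their units match.

```python
class EnergyAxis:
    """Toy energy axis: edge values plus a unit string (hypothetical
    stand-in for gammapy's MapAxis)."""

    def __init__(self, edges, unit=""):
        self.edges = tuple(edges)
        self.unit = unit

    def __eq__(self, other):
        return self.edges == other.edges and self.unit == other.unit


with_unit = EnergyAxis([0.1, 1.0, 10.0], unit="TeV")
no_unit = EnergyAxis([0.1, 1.0, 10.0])  # the bug: axis built without units

# The geometries disagree, so maps defined on them cannot simply be added.
assert with_unit != no_unit
```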

@JouvinLea (Contributor) commented Mar 5, 2019:

Thanks a lot for this PR!!
From what I reviewed, the code is pretty clear to me.
As we discussed, cube/test/test_fits.py passed without changes, since in the tests the cutout is large enough.

We know that the appropriate approach for deciding how and when to update the cutout is not trivial, so I will leave it for another PR. I agree with your choice: the evaluator is updated when the model component has drifted more than 0.25 deg from the cutout center. Then indeed, we will have to think about a better strategy adapted to the different spatial models.
One piece maybe missing is to give the user the choice to evaluate the models on the whole sky map and not calculate any cutout. This could be needed when we go for a completely blind search.

@adonath adonath force-pushed the adonath:map_dataset_cutouts branch from eac94b3 to d05dafb Mar 5, 2019

@adonath (Author, Member) commented Mar 6, 2019:

Thanks @JouvinLea! I implemented the evaluation_mode option. Do you think more explanation of the difference between the options is required?

@JouvinLea (Contributor) commented Mar 6, 2019:

I think it doesn't need more explanation in the class; maybe an example in a notebook, so that users know the possibilities?

I was imagining that this option would be associated with individual SkyModels, whereas here, if I understood the code correctly, either you evaluate the models on the whole map or you apply a cutout to each of them, right? I was imagining that you might know specific sources in your observation and want to apply a cutout for them, and then add other unknown point-like sources to your global model, for example. Maybe it starts to be too complicated for this specific PR.

@registerrier (Contributor) left a comment:

Thanks @adonath! This looks very good.

My main comment concerns the update criterion, which seems too general now. Isn't the model.evaluation_radius a safer guess?

Likelihood function to use for the fit.
evaluation_mode : {"local", "global"}
Model evaluation mode. The "cutout" mode evaluates the model components on smaller grids

@registerrier (Contributor), Mar 6, 2019:

I don't find the distinction between local and global optimization algorithms very clear. Is it an issue of algorithms or available IRF granularity?
In a sense you could still have 2 PSF kernels for two distinct sources and still evaluate/convolve on the full grid.

@registerrier (Contributor), Mar 6, 2019:

Also, keep "local" rather than "cutout".

@adonath (Author, Member), Mar 6, 2019:

The problem that arises here is the caching of the coordinate grids. The way it is implemented now, the MapEvaluator computes the coordinate grids and caches them for model evaluation. If the MapEvaluator of every model component caches the full coordinate grid, this quickly becomes inefficient. The solution here is probably to compute the global coordinate grid on the MapDataset and make cutouts (as views) into the original array, which are then passed to the MapEvaluator. I'll try to change the implementation.

@adonath (Author, Member), Mar 6, 2019:

Or we implement the caching of the .get_coord() method on the MapGeom object...

@registerrier (Contributor), Mar 6, 2019:

OK. But if you are to perform oversampling, the initial coordinate arrays become irrelevant, right?
I agree that making MapGeom.get_coord() a @lazyproperty seems like a good idea.
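
A rough sketch of that caching idea, using functools.cached_property (Python 3.8+) in place of astropy's lazyproperty, with a toy Geom standing in for MapGeom:

```python
import functools

class Geom:
    """Toy stand-in for MapGeom (hypothetical, not the gammapy class)."""

    def __init__(self, shape):
        self.shape = shape

    @functools.cached_property
    def coords(self):
        # Expensive coordinate-grid computation, run once per instance;
        # subsequent accesses return the cached list.
        ny, nx = self.shape
        return [(ix, iy) for iy in range(ny) for ix in range(nx)]

geom = Geom((2, 3))
assert geom.coords is geom.coords  # second access hits the cache
assert len(geom.coords) == 6
```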

@adonath (Author, Member), Mar 7, 2019:

Yes, oversampling for the global mode will not be supported. However, I changed the implementation of the global mode to set a PSF / Edisp per model component, which is updated as well if the model is too far from its initial position. I've thought about it again, and as we only cache the spatial coordinates (and not the full 3D coordinate array), it's probably OK to keep the full 2D coordinate array per model component. It should not be a problem as long as the number of sources is ~10-20.
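
Rough memory arithmetic behind that estimate (the 500 x 500 pixel grid size is an assumption for illustration, not a figure from the PR): caching one 2D spatial coordinate grid per model component costs a few MB, so ~10-20 sources remain cheap.

```python
# Assumed spatial grid: 500 x 500 pixels, two cached float64 arrays
# (lon, lat) per model component.
npix = 500 * 500
bytes_per_component = 2 * npix * 8  # two float64 arrays, 8 bytes each
mb_per_component = bytes_per_component / 1e6
print(mb_per_component)  # 4.0 MB per component, so 20 components ~ 80 MB
```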

else:
self.parameters = Parameters(self.model.parameters.parameters)
raise ValueError("Not a valid model evaluation mode. Choose between 'cutout' and 'global'")

@registerrier (Contributor), Mar 6, 2019:

Should be "local", not "cutout".


@property
def _geom(self):
if self.counts is not None:

@registerrier (Contributor), Mar 6, 2019:

What is the use case of self.counts = None? I guess simulation, right?

@adonath (Author, Member), Mar 6, 2019:

Yes, I think it's a fair use case to instantiate the MapDataset without a counts map and then call .npred() to compute the expected counts from the model, randomise it and use it as counts data:

dataset = MapDataset(counts=None, ...)
npred = dataset.npred()
npred.data = np.random.poisson(npred.data)
dataset.counts = npred

Shall we maybe implement a .counts setter to make it safer? E.g. check whether counts has the correct geometry...
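
The suggested setter could be sketched like this (a hypothetical, simplified version: a toy Map carrying a geom tag stands in for gammapy's Map class):

```python
class Map:
    """Toy map: just a geometry tag and data (hypothetical stand-in)."""
    def __init__(self, geom, data=None):
        self.geom = geom
        self.data = data

class MapDataset:
    def __init__(self, counts=None):
        self._counts = counts

    @property
    def counts(self):
        return self._counts

    @counts.setter
    def counts(self, value):
        # Reject a counts map whose geometry differs from the existing one.
        if self._counts is not None and value.geom != self._counts.geom:
            raise ValueError("counts map has incompatible geometry")
        self._counts = value

dataset = MapDataset(counts=Map(geom="wcs-10x10"))
dataset.counts = Map(geom="wcs-10x10", data=[1, 2, 3])  # accepted
try:
    dataset.counts = Map(geom="wcs-5x5")  # rejected
except ValueError:
    pass
```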

log = logging.getLogger(__name__)


UPDATE_THRESHOLD = 0.25 * u.deg

@registerrier (Contributor), Mar 6, 2019:

Is it really safe to have a global update_threshold?

@registerrier (Contributor), Mar 6, 2019:

I would expect the criterion to depend on the size of the support and the expected variation scale of the IRFs.

@adonath (Author, Member), Mar 6, 2019:

In the few tests I did it seemed to work well, but I agree it might not be safe enough in general. I'll change it...

@adonath (Author, Member), Mar 7, 2019:

I replaced the fixed threshold by .evaluation_radius and added an additional CUTOUT_MARGIN.

----------
exposure : `Map`
Exposure map.
psf : `PSFMap`

@registerrier (Contributor), Mar 6, 2019:

At the moment this expects a PSFKernel and EnergyDispersion, right? If we lack time to properly include the PSFMap and EDispMap, I would keep the expected type in the docstring.

self.psf = psf

# TODO: look up the correct PSF for this component
width = np.max(psf.psf_kernel_map.geom.width) + 2 * self.model.evaluation_radius

@registerrier (Contributor), Mar 6, 2019:

So if the source has drifted by ~self.model.evaluation_radius, we should expect some leakage out of the boundary, right?

@adonath (Author, Member), Mar 7, 2019:

I added an additional CUTOUT_MARGIN, which allows the model to drift on the order of 0.1 deg for point sources without leakage. I think this is sufficient...

@adonath adonath force-pushed the adonath:map_dataset_cutouts branch 2 times, most recently from c0ae892 to 695bacb Mar 7, 2019

@adonath adonath force-pushed the adonath:map_dataset_cutouts branch from 695bacb to f49f323 Mar 7, 2019

@adonath adonath merged commit a0c17b9 into gammapy:master Mar 8, 2019

3 checks passed

Codacy/PR Quality Review: Up to standards. A positive pull request.
Scrutinizer: Analysis: 7 updated code elements. Tests: passed.
continuous-integration/travis-ci/pr: The Travis CI build passed.

Modeling automation moved this from In progress to Done Mar 8, 2019
