Implement MapDataset npred evaluation using cutouts #2071
Conversation
gammapy/cube/fit.py (outdated diff):

    # TODO: lookup correct PSF for this component
    width = np.max(psf.psf_kernel_map.geom.width) + 2 * self.model.evaluation_radius

    self.exposure = exposure.cutout(position=self.model.position, width=width)
Maybe I will add a TODO comment saying that the cutout will also be applied to the PSF and EDISP once the IRFmaker is validated and merged!
    e_true_axis = geom_etrue.get_axis_by_name("energy")
    e_reco_axis = geom.get_axis_by_name("energy")
    e_true = e_true_axis.edges * e_true_axis.unit
    e_reco = e_reco_axis.edges * e_reco_axis.unit
Why is this related to this PR?
It was a bug that I just fixed along with this PR, because it appeared in the npred evaluation. The edisp was not initialised with units on the energy axis, which led to incompatible map geometries (with incompatible units on the energy axis). So a simple addition of the background and spread maps was not possible. Hope that makes sense...
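The units bug described above can be illustrated with a minimal stand-in (the `EnergyAxis` class and its unit table below are hypothetical, not the actual gammapy `MapAxis` API): two energy axes whose edges carry different units only describe the same physical axis once the units are applied, so dropping the unit makes otherwise compatible geometries look incompatible.

```python
import numpy as np

# Hypothetical stand-in for an energy axis with a unit attached.
class EnergyAxis:
    SCALE_TO_TEV = {"TeV": 1.0, "GeV": 1e-3, "MeV": 1e-6}

    def __init__(self, edges, unit):
        self.edges = np.asarray(edges, dtype=float)
        self.unit = unit

    def edges_tev(self):
        # convert the raw edge values into a common unit (TeV)
        return self.edges * self.SCALE_TO_TEV[self.unit]

    def __eq__(self, other):
        # axes are equal only when their *physical* edges agree
        return np.allclose(self.edges_tev(), other.edges_tev())

a = EnergyAxis([1, 10, 100], "TeV")
b = EnergyAxis([1000, 10000, 100000], "GeV")

assert a == b                                  # same physical axis
assert not np.array_equal(a.edges, b.edges)    # raw numbers differ
```

Comparing only the raw `edges` arrays (i.e. forgetting the unit, as the edisp did) would wrongly report the two geometries as incompatible.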
Thanks a lot for this PR!! We know that defining the appropriate approach for how and when to update the cutout is not trivial, so I will leave it for another PR. I agree with your choice: the evaluator is updated when the model component drifts more than 0.25 deg from the cutout center. Then indeed, we will have to think about a better strategy adapted to the different spatial models.
Thanks @JouvinLea! I implemented the
I think it doesn't need more explanation in the class; maybe an example in a notebook, so that users know the possibilities? I was imagining that this option would be associated with SkyModels, whereas here, if I understood the code correctly, either you evaluate the models on the whole map, or you apply a cutout for each of them, right? I was imagining that you could know specific sources in your observation and want to apply a cutout for them, and then add to your global model another unknown point-like source, for example? Maybe that starts to be too complicated for this specific PR.
Thanks @adonath! This looks very good. My main comment concerns the update criterion, which seems too general now. Isn't the model.evaluation_radius a safer guess?
gammapy/cube/fit.py (outdated diff):

    Likelihood function to use for the fit.
    evaluation_mode : {"local", "global"}
        Model evaluation mode. The "cutout" mode evaluates the model components on smaller grids
I don't find the distinction between local and global optimization algorithms very clear. Is it an issue of algorithms or available IRF granularity?
In a sense you could still have 2 PSF kernels for two distinct sources and still evaluate/convolve on the full grid.
Also, keep "local" rather than "cutout".
The problem that arises here is the caching of the coordinate grids. The way it is implemented now, the MapEvaluator computes the coordinate grids and caches them for model evaluation. If the MapEvaluator for every model component caches the full coordinate grid, this quickly becomes inefficient. The solution here is probably to compute the global coordinate grid on the MapDataset and make cutouts (as views) into the original array, which are then passed to the MapEvaluator. I'll try to change the implementation.
Or we implement the caching of the .get_coord() method on the MapGeom object...
OK. But if you are to perform oversampling, the initial coordinate arrays become irrelevant, right? I agree that making MapGeom.get_coord() a @lazyproperty seems like a good idea.
Yes, oversampling for the global mode will not be supported. However, I changed the implementation of the global mode to set a PSF / Edisp per model component, which is also updated if the model drifts too far from its initial position. I've thought about it again, and as we only cache the spatial coordinates (and not the full 3D coordinate array), it's probably OK to keep the full 2D coordinate array per model component. It should not be a problem as long as the number of sources is ~10-20.
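The lazy caching idea discussed above can be sketched with the standard-library functools.cached_property, which behaves like the @lazyproperty mentioned here: the grid is built on first access and reused afterwards. The Geom class and attribute names below are illustrative stand-ins, not the actual gammapy MapGeom API.

```python
from functools import cached_property
import numpy as np

# Hypothetical geometry stand-in: the coordinate grid is computed once,
# on first access, and cached on the instance for later reuse.
class Geom:
    def __init__(self, shape):
        self.shape = shape
        self.n_computed = 0  # counts how often the grid is actually built

    @cached_property
    def coords(self):
        self.n_computed += 1
        # pixel coordinate grid; for shape (ny, nx) this has shape (2, ny, nx)
        return np.indices(self.shape)

geom = Geom((3, 4))
_ = geom.coords  # first access: grid is computed
_ = geom.coords  # second access: cached result is returned

assert geom.n_computed == 1
```

One design caveat, which may be why the per-component 2D cache was kept instead: a cached grid pins the full array in memory for the object's lifetime, so the memory argument in the comment above (spatial coordinates only, ~10-20 sources) still applies.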
gammapy/cube/fit.py (outdated diff):

    else:
        self.parameters = Parameters(self.model.parameters.parameters)
        raise ValueError("Not a valid model evaluation mode. Choose between 'cutout' and 'global'")
Should be "local", not "cutout".
    @property
    def _geom(self):
        if self.counts is not None:
What is the use case of self.counts = None? I guess simulation, right?
Yes, I think it's a fair use case to instantiate the MapDataset without a counts map and then call .npred() to compute the expected counts from the model, randomise it and use it as counts data:

    dataset = MapDataset(counts=None, ...)
    npred = dataset.npred()
    npred.data = np.random.poisson(npred.data)
    dataset.counts = npred
Shall we maybe implement a .counts setter, to make it safer? E.g. check whether counts is in the correct geometry...
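The suggested setter could look roughly like the sketch below. All names here (Map, Dataset, the string geometries) are hypothetical stand-ins for the real MapDataset / Map classes, just to show the guard: None stays allowed for the simulation use case, but a counts map with a mismatched geometry is rejected.

```python
# Minimal stand-in for a map carrying a geometry.
class Map:
    def __init__(self, geom, data=None):
        self.geom = geom
        self.data = data

class Dataset:
    def __init__(self, geom):
        self.geom = geom
        self._counts = None

    @property
    def counts(self):
        return self._counts

    @counts.setter
    def counts(self, value):
        # None remains valid, e.g. before npred() has been randomised
        if value is not None and value.geom != self.geom:
            raise ValueError("counts geometry does not match the dataset geometry")
        self._counts = value

dataset = Dataset(geom="wcs-10x10")
dataset.counts = Map(geom="wcs-10x10")  # accepted: geometries match
try:
    dataset.counts = Map(geom="wcs-5x5")  # rejected: geometry mismatch
except ValueError:
    pass
```

In the real code the comparison would be between MapGeom objects rather than strings, but the shape of the check is the same.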
gammapy/cube/fit.py (outdated diff):

    log = logging.getLogger(__name__)

    UPDATE_THRESHOLD = 0.25 * u.deg
Is it really safe to have a global update_threshold?
I would expect the criterion to depend on the size of the support and the expected variation scale of the IRFs.
In the few tests I did, it seemed to work well, but I agree it might not be safe enough in general. I'll change it...
I replaced the fixed threshold with .evaluation_radius and added an additional CUTOUT_MARGIN.
    ----------
    exposure : `Map`
        Exposure map.
    psf : `PSFMap`
At the moment this expects a PSFKernel and EnergyDispersion, right? If we lack time to properly include the PSFMap and EDispMap, I would keep the expected type in the docstring.
gammapy/cube/fit.py (outdated diff):

    self.psf = psf

    # TODO: lookup correct PSF for this component
    width = np.max(psf.psf_kernel_map.geom.width) + 2 * self.model.evaluation_radius
So if the source has drifted by ~self.model.evaluation_radius, we should expect some leakage out of the boundary, right?
I added an additional CUTOUT_MARGIN, which allows the model to drift on the order of 0.1 deg for point sources without leakage. I think this is sufficient...
This PR includes the following changes:

- MapEvaluator objects for each model component.
- A MapEvaluator.update() method to update the model evaluator by cutting out the corresponding part of the global exposure map and setting edisp as well as psf.
- A MapEvaluator.needs_update attribute, which checks whether the model component drifted out of its cutout region. Currently the evaluator is updated when the model component drifted more than 0.25 deg from the cutout center, which is approximately half a PSF kernel size. While this seems to work, it can surely be improved, e.g. by choosing a smaller update threshold for point sources, where the PSF is much more important. For extended sources one could probably choose a larger one.
- The analysis_3d.ipynb notebook now runs the fit on the whole image instead of a small cutout. This is now possible because the performance of the fit improved a lot. I also removed the region mask, so that the background norm is better constrained.
- The likelihood option for MapDataset, which was documented but not implemented.
- The hess.ipynb, fermi_lat.ipynb and analysis_3d_joint.ipynb notebooks.
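The overall evaluation scheme the PR describes (one evaluator per model component, lazily re-cut when the component drifts out of its cutout) can be sketched as below. Everything here is an illustrative stand-in, not the gammapy classes: the point is only the lazy-update loop inside the npred summation.

```python
# Toy evaluator: a real one would hold a cutout of exposure, a PSF
# kernel and an energy dispersion for its model component.
class Evaluator:
    def __init__(self, model):
        self.model = model      # dict stand-in for a sky model component
        self.updates = 0        # how often the cutout was (re)computed

    @property
    def needs_update(self):
        # update on first use, or after the component drifted too far
        return self.updates == 0 or self.model.get("drifted", False)

    def update(self):
        # in the real code: cut out exposure and set psf / edisp here
        self.updates += 1
        self.model["drifted"] = False

    def npred(self):
        # stand-in for the predicted-counts computation on the cutout
        return self.model["flux"]

def total_npred(evaluators):
    """Sum predicted counts, refreshing stale evaluators on the fly."""
    total = 0.0
    for ev in evaluators:
        if ev.needs_update:
            ev.update()
        total += ev.npred()
    return total

evaluators = [Evaluator({"flux": 1.0}),
              Evaluator({"flux": 2.0, "drifted": True})]
assert total_npred(evaluators) == 3.0
```

The benefit mirrored here is the one the PR claims for performance: between fit iterations most components have not drifted, so `needs_update` is False and the expensive cutout / PSF / edisp setup is skipped.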