Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify and move energy threshold computation #2039

Merged
merged 6 commits into from Feb 25, 2019

Conversation

3 participants
@adonath
Copy link
Member

adonath commented Feb 19, 2019

This PR moves the energy threshold computation to the EnergyDispersion class, so that it can be re-used for 3D analysis. It also adds tests and simplifies the algorithm to find the bias energy .

@adonath adonath self-assigned this Feb 19, 2019

@adonath adonath added this to the 0.11 milestone Feb 19, 2019

@adonath adonath added this to To do in Map analysis via automation Feb 19, 2019

@registerrier
Copy link
Contributor

registerrier left a comment

Thanks @adonath , seems a much better implementation like this, but how much do we loose by not having the reverse option? How is SpectrumObservation.compute_energy_thresholdaffected?

if emax is None:
emax = energy[-1]

bias_spectrum = TableModel(energy, values)

This comment has been minimized.

Copy link
@registerrier

registerrier Feb 20, 2019

Contributor

OK for the code change, but I really wonder why one needs to find the true energy value where the bias becomes larger than a given value. You would normally look for reco energy if you are to find a threshold, no?

This comment has been minimized.

Copy link
@adonath

adonath Feb 20, 2019

Author Member

The old implementation was using e_true as well (https://github.com/gammapy/gammapy/blob/master/gammapy/spectrum/observation.py#L239) so I didn't think about it, when I made the change. But I agree it makes more sense to use e_reco. I'll change it...

This comment has been minimized.

Copy link
@adonath

adonath Feb 21, 2019

Author Member

Done

if emax is None:
emax = self.energy.nodes[-1]

aeff_spectrum = TableModel(self.energy.nodes, self.data.data, values_scale="lin")

This comment has been minimized.

Copy link
@registerrier

registerrier Feb 20, 2019

Contributor

I find it a bit strange to have to import models here. Maybe at some point a utility function to perform the root search on a NDDataArray for instance would be more general.

This comment has been minimized.

Copy link
@adonath

adonath Feb 20, 2019

Author Member

I completely agree, that it's a "misuse" of the TableModel, but only in terms of API. Technically it solves the problem and reduces code duplication, which I found preferable in this case. Maybe it could be implemented as a helper function instead. I'll check if there is a good place for such a method...

@@ -246,55 +246,32 @@ def max_area(self):
cleaned_data = self.data.data[np.where(~np.isnan(self.data.data))]
return cleaned_data.max()

def find_energy(self, aeff, reverse=False):
def find_energy(self, aeff, emin=None, emax=None):

This comment has been minimized.

Copy link
@registerrier

registerrier Feb 20, 2019

Contributor

We the new implementation, we rely on brentq to find a root but we don't control where it will be located i.e. closest to lower bound or higher bound. Is this an issue? As far as I can see the reverse = True option is used in SpectrumObservation.compute_energy_threshold .

This comment has been minimized.

Copy link
@adonath

adonath Feb 20, 2019

Author Member

This is indeed an issue, that I wasn't sure how to deal with. The general problem is that bias as well as aeff are not necessarily invertable, because of noise or irregular shapes. I think it's really hard to find an algorithm that does it right for all cases. The old solution was suffering from this problem as well. Here are two examples from the test datasets we use:

Effective area:
aeff

In this case, let's say if you look for the 90% effective area from the high energy end, there are basically three solutions and you don't know which one to choose...

Energy dispersion:

edisp

In this case, when you look for the energy where the bias is ~3.1 it's again not unique. At the high energy end its even worse, because of noise.

In both cases I thought the best you could do is to provide the emin and emax arguments, that users can define the search interval (e.g. by looking at those plots first...). I guess in practice one can find reasonable default values for emin and emax (e.g. peak bias or peak aeff) such that finding the lower energy threshold is stable in 99% of the cases. For the high energy end I don't really know, if it's reasonable at all to work with the bias and aeff criteria. I'm happy for any suggestion how to deal with this...

CC @lmohrmann

This comment has been minimized.

Copy link
@lmohrmann

lmohrmann Feb 22, 2019

Contributor

This is indeed not a simple problem, because you never know what the curves may look like. I like the idea of using a restricted energy range -- ideally we would provide default values for emin and emax that work well in most cases (maybe you tried that already, see my other comment).

The question whether the high-energy bound should be determined from the effective area / bias histogram at all or if something else would be better is also valid. I don't have a good suggestions for an alternative definition, though. Simply use the energy up to which the effective area is provided?!

Map analysis automation moved this from To do to In progress Feb 21, 2019

@@ -218,10 +218,11 @@ def compute_energy_threshold(
# High threshold
if method_hi == "area_max":
aeff_thres = kwargs["area_percent_hi"] / 100 * self.aeff.max_area
thres_hi = self.aeff.find_energy(aeff_thres, reverse=True)
thres_hi = self.aeff.find_energy(aeff_thres, emin=e_center, emax=self.e_true[-1])

This comment has been minimized.

Copy link
@lmohrmann

lmohrmann Feb 22, 2019

Contributor

What's e_center? I can't find it in the code...? Is this an attempt to pick reasonable default values for emin and emax, so that the algorithm is as stable as possible?

This comment has been minimized.

Copy link
@adonath

adonath Feb 22, 2019

Author Member

Yes, this was a first try. I guess it's probably better to work with the peak aeff and peak bias values. For the high energy end the last energy bin might be a good choice as you suggested bellow. I'll make this changes...

@@ -246,55 +246,32 @@ def max_area(self):
cleaned_data = self.data.data[np.where(~np.isnan(self.data.data))]
return cleaned_data.max()

def find_energy(self, aeff, reverse=False):
def find_energy(self, aeff, emin=None, emax=None):

This comment has been minimized.

Copy link
@lmohrmann

lmohrmann Feb 22, 2019

Contributor

This is indeed not a simple problem, because you never know what the curves may look like. I like the idea of using a restricted energy range -- ideally we would provide default values for emin and emax that work well in most cases (maybe you tried that already, see my other comment).

The question whether the high-energy bound should be determined from the effective area / bias histogram at all or if something else would be better is also valid. I don't have a good suggestions for an alternative definition, though. Simply use the energy up to which the effective area is provided?!

@adonath adonath merged commit c245ed8 into gammapy:master Feb 25, 2019

3 checks passed

Codacy/PR Quality Review Up to standards. A positive pull request.
Details
Scrutinizer Analysis: 4 updated code elements – Tests: passed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

Map analysis automation moved this from In progress to Done Feb 25, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.