Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add light curve upper limits #1456

Merged
merged 8 commits into from Jul 10, 2018
79 changes: 79 additions & 0 deletions gammapy/stats/poisson.py
Expand Up @@ -17,6 +17,7 @@
'significance_on_off',
'excess_matching_significance',
'excess_matching_significance_on_off',
'Helene_ULs',
]

__doctest_skip__ = ['*']
Expand Down Expand Up @@ -413,6 +414,84 @@ def _significance_direct_on_off(n_on, n_off, alpha):

return significance

def Helene_ULs(excess, error, conf_level=95.45):
"""Compute Upper limit based on Helene et al., NIM 212 (1983) 319

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually give a URL to the paper in the docstring.
I think it's this one?
http://adsabs.harvard.edu/abs/1984NIMPA.228..120H
Can you please add it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Parameters
----------
excess : double
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Write always "float" instead of "double", that's what the type is called in Python.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, a too fast copy/paste. Corrected

Signal excess
error : double
Gaussian Signal error
conf_level: double
Confidence level in percentage (1sigma=68.27, 2sigma=95.45, 3sigma=99.73, 5sigma=99.999943)

Returns
-------
double
Upper limit for the excess
"""

conf_level1 = 1. - conf_level/100.

if error <= 0:
return 0.

from math import sqrt
from scipy.special import erf
if excess >= 0.:
zeta = excess/error
value = zeta/sqrt(2.)
integral = (1.+erf(value))/2.
integral2 = 1. - conf_level1*integral
value_old = value

# The 1st Loop is for Speed & 2nd For Precision
value_new = value_old+0.01
if integral > integral2:
value_new = 0.
integral = (1.+erf(value_new))/2.

while integral < integral2:
value_old = value_new
value_new = value_new+0.01
integral = (1.+erf(value_new))/2.
value_new = value_old+0.0000001
integral = (1. + erf(value_new))/2.

while integral < integral2:
value_old = value_new
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyCharm shows me that this value_old is unused; you can delete this line and the result will be the same.

This code is really hard to understand.
It looks like the excess > 0 and < 0 are the same or a similar algorithm?
If yes, can you put the algorithm in a helper functino and call it twice, instead of duplicating the ~ 10 lines of the algorithm?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on value_old is accepted and its removal is done.
For the different cases on excess, the algo is slightly different and a difference should be made. I have however simplify a bit the code as there are common ligns. Concerning its complexity, the code is coming from a C lib and optimised in this respect. I have not changed it and I do not plan to optimise it in a short term.

value_new = value_new+0.0000001
integral = (1.+erf(value_new))/2.
value_new = value_new*sqrt(2.)
conf_limit = (value_new+zeta)*error
return conf_limit
elif excess < 0.:
zeta = -excess/error
value = zeta/sqrt(2.)
integral = 1 - (1.+erf(value))/2.
integral2 = 1. - conf_level1*integral
value_old = value

# The 1st Loop is for Speed & 2nd For Precision
value_new = value_old+0.01
integral = (1.+erf(value_new))/2.
while integral < integral2:
value_old = value_new
value_new = value_new+0.01
integral = (1.+erf(value_new))/2.
value_new = value_old+0.0000001
integral = (1.+erf(value_new))/2.

while integral < integral2:
value_old = value_new
value_new = value_new+0.0000001
integral = (1.+erf(value_new))/2.
value_new = value_new*sqrt(2.)
conf_limit = (value_new-zeta)*error
return conf_limit
else:
return 0.

def excess_matching_significance(mu_bkg, significance, method='lima'):
r"""Compute excess matching a given significance.
Expand Down
4 changes: 4 additions & 0 deletions gammapy/stats/tests/test_poisson.py
Expand Up @@ -13,6 +13,7 @@
significance_on_off,
excess_matching_significance,
excess_matching_significance_on_off,
Helene_ULs,
)

pytest.importorskip('scipy')
Expand All @@ -37,6 +38,9 @@ def test_excess_error():
assert_allclose(excess_error(n_on=10, n_off=20, alpha=0.1), 3.1937439)
assert_allclose(excess_error(n_on=4, n_off=9, alpha=0.5), 2.5)

def test_Helene_ULs():
assert_allclose(Helene_ULs(excess=50, error=40, conf_level=99.73), 119.708038)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a different value in the test?!
@bkhelifi - does the test pass for you locally?

python setup.py test -V -t gammapy/stats/tests/test_poisson.py
_____________________________________________________________________________ test_Helene_ULs _____________________________________________________________________________

    def test_Helene_ULs():
>       assert_allclose(Helene_ULs(excess=50, error=40, conf_level=99.73), 119.708038)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=0
E       
E       (mismatch 100.0%)
E        x: array(162.727686)
E        y: array(119.708038)

gammapy/stats/tests/test_poisson.py:42: AssertionError
============================================================= 1 failed, 18 passed, 1 xfailed in 1.14 seconds ==============================================================

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the paper has plenty of "reference values" on what the result should be?
Could you pick one of those and leave a one line comment in the test saying where to find the test case in the paper (e.g. Table 1, 2 or 3?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this command, I see that the elif excess < 0.: case is never executed by the tests:

python setup.py test -V -t gammapy/stats/tests/test_poisson.py --coverage
open htmlcov/index.html 

Please add at least one test case, preferably using a test case given in the paper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a stupid typo. Thanks a lot to have re-check that. Corrected.
I have added a test to make a wider coverage of the code

assert_allclose(Helene_ULs(excess=10, error=6, conf_level=99.73), 20.280005)

def test_significance():
# Check that the Li & Ma limit formula is correct
Expand Down
29 changes: 22 additions & 7 deletions gammapy/time/lightcurve.py
Expand Up @@ -6,7 +6,7 @@
from astropy.table import Table
from astropy.time import Time
from ..spectrum.utils import CountsPredictor
from ..stats.poisson import excess_error
from ..stats.poisson import excess_error, Helene_ULs
from ..utils.scripts import make_path
from ..stats.poisson import significance_on_off

Expand Down Expand Up @@ -642,7 +642,7 @@ def make_time_intervals_min_significance(self, significance, significance_method
table['t_stop'] = Time(table['t_stop'], format='mjd', scale='tt')
return table

def light_curve(self, time_intervals, spectral_model, energy_range):
def light_curve(self, time_intervals, spectral_model, energy_range, sigma_ul_thres=3.):
"""Compute light curve.

Implementation follows what is done in:
Expand All @@ -659,6 +659,8 @@ def light_curve(self, time_intervals, spectral_model, energy_range):
Spectral model
energy_range : `~astropy.units.Quantity`
True energy range to evaluate integrated flux (true energy)
sigma_ul_thres : float
Threshold on detection significance to compute Flux Upper Limits

Returns
-------
Expand All @@ -667,7 +669,7 @@ def light_curve(self, time_intervals, spectral_model, energy_range):
"""
rows = []
for time_interval in time_intervals:
useinterval, row = self.compute_flux_point(time_interval, spectral_model, energy_range)
useinterval, row = self.compute_flux_point(time_interval, spectral_model, energy_range, sigma_ul_thres)
if useinterval:
rows.append(row)

Expand All @@ -691,7 +693,7 @@ def _make_lc_from_row_data(rows):

return LightCurve(table)

def compute_flux_point(self, time_interval, spectral_model, energy_range):
def compute_flux_point(self, time_interval, spectral_model, energy_range, sigma_ul_thres=3.):
"""Compute one flux point for one time interval.

Parameters
Expand All @@ -702,6 +704,8 @@ def compute_flux_point(self, time_interval, spectral_model, energy_range):
Spectral model
energy_range : `~astropy.units.Quantity`
True energy range to evaluate integrated flux (true energy)
sigma_ul_thres : float
Threshold on detection significance to compute Flux Upper Limits

Returns
-------
Expand Down Expand Up @@ -800,20 +804,31 @@ def compute_flux_point(self, time_interval, spectral_model, energy_range):

flux = measured_excess / predicted_excess.value
flux *= int_flux
flux_err = int_flux / predicted_excess.value

# Gaussian errors, TODO: should be improved
flux_err *= excess_error(n_on=n_on, n_off=n_off, alpha=alpha_mean)
flux_err = int_flux / predicted_excess.value
delta_excess = excess_error(n_on=n_on, n_off=n_off, alpha=alpha_mean)
flux_err *= delta_excess

flux_ul = -1
sigma = significance_on_off(n_on=n_on, n_off=n_off, alpha=alpha_mean, method='lima')
if sigma <= sigma_ul_thres:
flux_ul = Helene_ULs(n_on - alpha_mean * n_off, delta_excess, 99.73) #3 sigma ULs
flux_ul *= int_flux / predicted_excess.value
else:
flux = 0
flux_err = 0
flux_ul = -1
sigma = 0.

# Store measurements in a dict and return that
return useinterval, OrderedDict([
('time_min', Time(tmin, format='mjd')),
('time_max', Time(tmax, format='mjd')),
('flux', flux * u.Unit('1 / (s cm2)')),
('flux_err', flux_err * u.Unit('1 / (s cm2)')),

('flux_ul', flux_ul * u.Unit('1 / (s cm2)')),
('is_ul', sigma <= sigma_ul_thres)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bkhelifi - You're missing a comma here:

$ python setup.py test -V -t gammapy/time/tests/test_lightcurve.py
``

________________________________________________________________________ test_lightcurve_estimator ________________________________________________________________________

@requires_data('gammapy-extra')
@requires_dependency('scipy')
def test_lightcurve_estimator():
    spec_extract = spec_extraction()
    lc_estimator = LightCurveEstimator(spec_extract)

    # param
    intervals = []
    for obs in spec_extract.obs_list:
        intervals.append([obs.events.time[0], obs.events.time[-1]])

    model = PowerLaw(
        index=2.3 * u.Unit(''),
        amplitude=3.4e-11 * u.Unit('1 / (cm2 s TeV)'),
        reference=1 * u.TeV,
    )

    lc = lc_estimator.light_curve(
        time_intervals=intervals,
        spectral_model=model,
      energy_range=[0.5, 100] * u.TeV,
    )

gammapy/time/tests/test_lightcurve.py:206:


gammapy/time/lightcurve.py:672: in light_curve
useinterval, row = self.compute_flux_point(time_interval, spectral_model, energy_range, sigma_ul_thres)


self = <gammapy.time.lightcurve.LightCurveEstimator object at 0xb147e2da0>
time_interval = [, ]
spectral_model = PowerLaw(), energy_range = <Quantity [ 0.5, 100. ] TeV>, sigma_ul_thres = 3.0

def compute_flux_point(self, time_interval, spectral_model, energy_range, sigma_ul_thres=3.):
    """Compute one flux point for one time interval.

        Parameters
        ----------
        time_interval : `~astropy.time.Time`
            Time interval (2-element array, or a tuple of Time objects)
        spectral_model : `~gammapy.spectrum.models.SpectralModel`
            Spectral model
        energy_range : `~astropy.units.Quantity`
            True energy range to evaluate integrated flux (true energy)
        sigma_ul_thres : float
            Threshold on detection significance to compute Flux Upper Limits

        Returns
        -------
        useinterval : bool
            Is True if the time_interval produce a valid flux point
        measurements : dict
            Dictionary with flux point measurement in the time interval
        """
    tmin, tmax = time_interval[0], time_interval[1]
    livetime = 0
    alpha_mean = 0.
    alpha_mean_backup = 0.
    measured_excess = 0
    predicted_excess = 0
    n_on = 0
    n_off = 0
    useinterval = False

    # Loop on observations
    for t_index, obs in enumerate(self.obs_list):

        # discard observations not matching the time interval
        obs_start = obs.events.observation_time_start
        obs_stop = obs.events.observation_time_end
        if (tmin < obs_start and tmax < obs_start) or (tmin > obs_stop):
            continue
        useinterval = True
        # get ON and OFF evt list
        on, off, spec, e_reco = self._create_and_filter_onofflists(t_index=t_index, energy_range=energy_range,
                                                                   interval=[tmin, tmax], extra=True)
        n_on_obs = len(on.table)
        n_off_obs = len(off.table)

        # compute effective livetime (for the interval)
        if tmin >= obs_start and tmax <= obs_stop:
            # interval included in obs
            livetime_to_add = (tmax - tmin).to('s')
        elif tmin >= obs_start and tmax >= obs_stop:
            # interval min above tstart from obs
            livetime_to_add = (obs_stop - tmin).to('s')
        elif tmin <= obs_start and tmax <= obs_stop:
            # interval min below tstart from obs
            livetime_to_add = (tmax - obs_start).to('s')
        elif tmin <= obs_start and tmax >= obs_stop:
            # obs included in interval
            livetime_to_add = (obs_stop - obs_start).to('s')
        else:
            livetime_to_add = 0 * u.s

        # Take into account dead time
        livetime_to_add *= (1. - obs.observation_dead_time_fraction)

        # Compute excess
        obs_measured_excess = n_on_obs - spec.alpha * n_off_obs

        # Compute the expected excess in the range given by the user
        # but must respect the energy threshold of the observation
        # (to match the energy range of the measured excess)
        # We use the effective livetime and the right energy threshold
        e_idx = np.where(np.logical_and.reduce(
            (e_reco >= spec.lo_threshold,  # threshold
             e_reco <= spec.hi_threshold,  # threshold
             e_reco >= energy_range[0],  # user
             e_reco <= energy_range[-1])  # user
        ))[0]
        counts_predictor = CountsPredictor(
            livetime=livetime_to_add,
            aeff=spec.aeff,
            edisp=spec.edisp,
            model=spectral_model
        )
        counts_predictor.run()
        counts_predicted_excess = counts_predictor.npred.data.data[e_idx[:-1]]

        obs_predicted_excess = np.sum(counts_predicted_excess)

        # compute effective normalisation between ON/OFF (for the interval)
        livetime += livetime_to_add
        alpha_mean += spec.alpha * n_off_obs
        alpha_mean_backup += spec.alpha * livetime_to_add
        measured_excess += obs_measured_excess
        predicted_excess += obs_predicted_excess
        n_on += n_on_obs
        n_off += n_off_obs

    # Fill time interval information
    if useinterval:
        int_flux = spectral_model.integral(energy_range[0], energy_range[1])

        if n_off > 0.:
            alpha_mean /= n_off
        if livetime > 0.:
            alpha_mean_backup /= livetime
        if alpha_mean == 0.:  # use backup if necessary
            alpha_mean = alpha_mean_backup

        flux = measured_excess / predicted_excess.value
        flux *= int_flux

        # Gaussian errors, TODO: should be improved
        flux_err = int_flux / predicted_excess.value
        delta_excess = excess_error(n_on=n_on, n_off=n_off, alpha=alpha_mean)
        flux_err *= delta_excess

        flux_ul = -1
        sigma = significance_on_off(n_on=n_on, n_off=n_off, alpha=alpha_mean, method='lima')
        if sigma <= sigma_ul_thres:
            flux_ul = Helene_ULs(n_on - alpha_mean * n_off, delta_excess, 99.73) #3 sigma ULs
            flux_ul *= int_flux / predicted_excess.value
    else:
        flux = 0
        flux_err = 0
        flux_ul = -1
        sigma = 0.

    # Store measurements in a dict and return that
    return useinterval, OrderedDict([
        ('time_min', Time(tmin, format='mjd')),
        ('time_max', Time(tmax, format='mjd')),
        ('flux', flux * u.Unit('1 / (s cm2)')),
        ('flux_err', flux_err * u.Unit('1 / (s cm2)')),
        ('flux_ul', flux_ul * u.Unit('1 / (s cm2)')),
        ('is_ul', sigma <= sigma_ul_thres)
      ('livetime', livetime * u.s),
        ('alpha', alpha_mean),
        ('n_on', n_on),
        ('n_off', n_off),
        ('measured_excess', measured_excess),
        ('expected_excess', predicted_excess),
    ])

E TypeError: 'tuple' object is not callable

gammapy/time/lightcurve.py:832: TypeError
=================================================================== 1 failed, 16 passed in 6.83 seconds ===================================================================

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course! My configuration has not run all the tests, in particular the one of the class LightCurve. I missed indeed bugs, showing that the test functions are useful!
Corrected

('livetime', livetime * u.s),
('alpha', alpha_mean),
('n_on', n_on),
Expand Down