{{ message }}

# Add Feldman Cousins algorithm #367

Merged
merged 31 commits into from Oct 21, 2015
Merged

merged 31 commits into from Oct 21, 2015

## Conversation

### dlennarz commented Oct 11, 2015

 Adding Feldman Cousins to gammapy. Status: Done working on it, unless there are new comments / requests. There was a previous pull request (#311), but I messed up that branch when trying to merge changes from the master. That's why I deleted that branch and recreated it.
and others added 19 commits Aug 11, 2015
 Merge branch 'master' of https://github.com/gammapy/gammapy into fc 
 33a258f 
 Clean up file. Add standard documentation. Rearrange code. 
 28dc52a 
 First version of a test file. 
 263f869 
 Add Feldman Cousins to import. Order alphabetically. 
 e5b4332 
 Use logging instead of print statements. 
 9f6322e 
 Revert "Add Feldman Cousins to import. Order alphabetically." 
 0d2b6c2 
This reverts commit e5b4332.
 Address most of Christoph comments. Make the unit test work. 
 45dd42d 
 Make the unit test work. Add another unit test 
 5e1bcc6 
 Add scripts that produce Fig. 7 and 10 from the FC paper numerically … 
 720c36f 
…and then analytically to demonstrate that the numerical implementation works.
 Start documentation. 
 492e813 
 Address Christoph's comment and remove plotting option. 
 20baa66 
 Included feldman_cousins.rst in toctree. Fix a typo. 
 53f5010 
 More typos. 
 26cb78e 
 Fix more typos. 
 bfbb0e5 
 Add another test. Fix typos. 
 f72c96b 
 Fix typo. 
 9784d12 
 Merging master 
 6cdcd58 
 Merge branch 'master' into fc 
 0551ad5 
 Fixing more typos. 
 26dd2da 
added the label Oct 11, 2015
added this to the 0.5 milestone Oct 11, 2015
self-assigned this Oct 11, 2015
mentioned this pull request Oct 11, 2015
added 4 commits Oct 11, 2015
 Fix typos, remove unused variables. 
 eef9757 
 This is basically a rewrite of test_feldman_cousins.py. Reproduce num… 
 09f57d8 
…bers from the FC paper. Extend coverage, while trying to cut down on the computation time. Rename functions to be more logical. Impove function documentation. Fix bugs in previously untested classes.
 Finish documentation. Fix plotting. Add various examples with nice to… 
 ac2673f 
… have code.
 Last commit was incomplete. 
 d49c12c 

### dlennarz commented Oct 14, 2015

 I'm done working on it, unless there are new comments / requests.
modified the milestones: 0.4, 0.5 Oct 15, 2015

### cdeil commented Oct 15, 2015

 @dlennarz – Would you mind making your code PEP8 compliant? E.g. for gammapy/stats/feldman_cousins.py the pep8 tool points out this. Also, if you have the patience to go though and remove the leading f from variable names everywhere, that's a remnant from C++ class members (now we're at Python, no classes), that would be great. For example this: fBackground = 3.5 fNBinsX = 100 fCL = 0.90  should be this background = 3.5 n_bins_x = 100 cl = 0.90  If you don't have time for such nitpicky stuff, just let me know and I'll just do it after merging this PR.
 @@ -0,0 +1,20 @@ """Demonstrate the artefact that can arise if

#### cdeil Oct 15, 2015 Member

I'm not sure it's really better, but maybe put the results in a table to get a nicer printout?

"""
Demonstrate the artifact that can arise if
fc_fix_upper_and_lower_limit is not used.
"""
import numpy as np
from astropy.table import Table
from gammapy.stats import fc_find_acceptance_region_poisson

background = 3.5
cl = 0.90
x_bins = np.arange(0, 100)

table = Table()
table['mu'] = [0.745, 0.750, 0.755, 1.030, 1.035, 1.040, 1.045, 1.050, 1.055, 1.060, 1.065]
table['x_min'] = 0.0
table['x_max'] = 0.0

for row in table:
x_min, x_max = fc_find_acceptance_region_poisson(row['mu'], background, x_bins, cl)
row['x_min'] = x_min
row['x_max'] = x_max

table.pprint()

gives the output:

  mu  x_min x_max
----- ----- -----
0.745   0.0   8.0
0.75   1.0   8.0
0.755   1.0   8.0
1.03   1.0   8.0
1.035   0.0   8.0
1.04   0.0   8.0
1.045   0.0   8.0
1.05   0.0   8.0
1.055   0.0   8.0
1.06   1.0   9.0
1.065   1.0   9.0


#### dlennarz Oct 15, 2015 Author Contributor

I like it. Changed the example.

### cdeil commented Oct 15, 2015

 One more comment on documentation, and then I'm done with the review here. It would be nice if you could add an example at the end of the function docstrings. (see here as an example what I mean and here for the Numpy docstring syntax to do it. It might also be useful to link from the docstrings to the high-level docs page you've written for Feldman-Cousins. Then it's all cross-linked and users can go back and forth between high-level docs, API docs and looking at the code if they like. For the high-level docs page you've written, I also think it would be great to give at least one or two code examples directly on the page, with output. And for people like me that don't have the time to go read the FC paper, it would also be helpful to give some more absolute beginner info than "The first plot reproduces Fig. 7 from [Feldman1998](Poisson process with background 3.0).", i.e. something like: "Assume you have an observation corresponding to a Poisson process and observed 7 events and estimated a background of 3.0. Then using the FC method you can compute a confidence interval of (XXX, YYY) on the excess.". I.e. my point is: it would be great if gamma-ray astronomers get some description and an example how to compute FC limits for the two or three common cases directly, without having to go to the FC paper to understand what's going on.
 [1,8]. A lot of the fast algorithms that do not compute the full confidence belt will come to the conclusion that the 90% confidence interval is [0, 0.745] and thus the upper limit when zero is measured should be 0.745 (one example is TFeldmanCousins that comes with ROOT, but is has the additional bug of making

#### cdeil Oct 15, 2015 Member

Put TFeldmanCousins and ROOT in double backticks.
(for single backticks Sphinx tries to generate links)

#### dlennarz Oct 16, 2015 Author Contributor

Fixed

 @@ -79,6 +79,14 @@ with the Li \& Ma formula: TODO: More examples.

#### cdeil Oct 15, 2015 Member

Add one FC code example (just call one of the functions and show the output) on the top-level page?
(to make some "PR", i.e. make it easier for people to discover it / get started with it)

#### dlennarz Oct 20, 2015 Author Contributor

 Simplification of the fc_find_limit function. Change the return order… 
 606c5b8 
… of some functions. Simplify function names. Some functions now raise ValueErrors in case of bad input. Expand the tests to increase coverage. Adjust variable names (removing trailing f). Improved function documentation. Expand documentation with code examples. Add example to gammapy.stats index page. Progress bar for the two analytical examples. Nicer print out for fc_demonstrate_artefact.py example.

### dlennarz commented Oct 20, 2015

 Done working on the comments / requests. Coverage is 99% now and cannot be improved, because the two missing lines should never be called actually (it's still good to check for zero division). I expanded the documentation with code examples and improved the function documentation (linking to the high-level docs page).

### cdeil commented Oct 21, 2015

 @dlennarz – Thanks!!! I'm not quite sure why this error occurs: https://travis-ci.org/gammapy/gammapy/jobs/86494723#L1071 Can you please rebasing this branch on master (see git commands here) ... maybe this will resolve the issue.
 # Pass too few x_bins to reach confidence level. x_bins = np.linspace(-sigma, sigma, n_step, endpoint=True) assert_raises(ValueError, fc_find_acceptance_region_gauss, 0, 1, x_bins, cl)

#### cdeil Oct 21, 2015 Member

This currently fails in one of the travis-ci builds:
https://travis-ci.org/gammapy/gammapy/jobs/86494765#L1507

Instead of using numpy.testing.assert_raises, please use pytest.raises (see here):

with pytest.raises(ValueError):
fc_find_acceptance_region_gauss(0, 1, x_bins, cl)

You can also assert things about the exception (see example), but I don't think that's needed here.

#### dlennarz Oct 21, 2015 Author Contributor

Okay, done.

 paper.""" import numpy as np import matplotlib.pyplot as plt from scipy import stats

#### cdeil Oct 21, 2015 Member

Unuses import from scipy import stats. Remove.

#### dlennarz Oct 21, 2015 Author Contributor

Removed.

 paper.""" import numpy as np import matplotlib.pyplot as plt from scipy import stats

#### cdeil Oct 21, 2015 Member

Remove unused import from scipy import stats

#### dlennarz Oct 21, 2015 Author Contributor

Removed.

 LowerLimitAna = [] for mu in ProgressBar(mu_bins): goodChoice = fc_find_acceptance_region_poisson(mu, background, x_bins, fCL)

fCL -> cl

#### dlennarz Oct 21, 2015 Author Contributor

Fixed.

 Assume you measured 6 counts in a Poissonian counting experiment with an expected background :math:b = 3. Here's how you compute the 90% upper limit on the signal strength :math:\\mu:

#### cdeil Oct 21, 2015 Member

Isn't this the use case covered by the fc_find_acceptance_regino_poisson convenience function?
If yes, I'd suggest using that simpler version on this top-level gammapy.stats docs page, and moving the (equivalent?) code that takes several calls to more general fc* functions to the FC docs page. Or keep it here in addition, but also show how to call the simpler fc_find_acceptance_region_poisson function here, please.

#### dlennarz Oct 21, 2015 Author Contributor

You are mistaken about what gammapy.stats.fc_find_acceptance_region_poisson does. For a given background and mu it returns the acceptance interval (x_min, x_max) such that the integrated probability is the confidence level.

The case covered here is you measured a certain x and now you are looking for the corresponding confidence interval (mu_1, mu_2).

There is a difference between acceptance and confidence interval. Normally the user is interested in the confidence interval.

#### cdeil Oct 21, 2015 Member

Is it possible to add a "FC for dummies" functions for the common cases?
Something like this:

excess_lo, excess_hi = fc_excess_limits_poisson(n_on, mu_background, cl)
excess_lo, excess_hi = fc_excess_limits_poisson_on_off(n_on, n_off, alpha, cl)

I guess it's still a few lines to implement based on what's here now, and it's non-trivial to implement because some extra parameters like acceptance interval range and spacing has to be chosen so that it's reasonable efficient and accurate?

If you don't have time any more to add this, just let me know, and I'll try to code it up tonight.

Concerning the functions that compute acceptance intervals, it could be useful to add a sentence like:
"Acceptance intervals are used in the Feldman-Cousins computation of confidence intervals. Usually the confidence intervals are the result of interest."

#### dlennarz Oct 21, 2015 Author Contributor

The function fc_excess_limits_poisson basically combines:
fc_construct_acceptance_intervals_pdfs
fc_get_limits
fc_fix_limits
fc_find_limit

I think it's a bad idea to combine them into a function, because a) that function will be slow and b) we will have to choose a mu and x binning for the user. It would be much better to make this a class, so LowerLimit and UpperLimit aren't constantly recomputed when the function is called.

Things can always be improved.

 def fc_find_limit(x_value, x_values, y_values): r""" Find the limit for a given x measurment

#### cdeil Oct 21, 2015 Member

Typo: measurment -> measurement

#### dlennarz Oct 21, 2015 Author Contributor

Fixed.

 This can be easily converted to upper and lower limits (~gammapy.stats.fc_get_limits), which simply connect the outside 1s for different :math:\\mus. An upper or lower limit is obtained by drawing a vertical line at the measured value and calculating the intersection (~gammapy.stats.fc_find_limit).

#### cdeil Oct 21, 2015 Member

Here Sphinx gets confused and doesn't render the HTML output correctly (see here.

I think (but didn't verify) that the issue is the s you have directly following the math directive:

:math:\\mus


Can you try putting this instead

:math:\\mu values


and then see if the link to

~gammapy.stats.fc_find_limit


resolves correctly?

#### dlennarz Oct 21, 2015 Author Contributor

It seems to have fixed the issues.

 import logging log = logging.getLogger(__name__) __all__ = ['fc_find_acceptance_region_gauss',

#### cdeil Oct 21, 2015 Member

Feel free to ignore me here, but just as a question / suggestion: is it possible to make the function names shorter / simpler / more uniform without being cryptic?

One possible improvement would be to always use "interval" instead of sometimes "region".
FC, at least as implemented here and commonly used, only works for one parameter, so "interval" is a bit more clear, no?

And maybe fc_get_limits_poisson would be a bit better than fc_find_acceptance_interval_poisson, because it's shorter and it's a special case of the more general fc_get_limits function? (I didn't look closely if this is actually the case, if this is incorrect and you think the current function names are best, just ignore me here.)

#### dlennarz Oct 21, 2015 Author Contributor

I replaced all the "acceptance_region" with "acceptance_interval" (which is also what is used in the FC paper.

I already explained this in an earlier comment: fc_find_acceptance_interval_poisson does not give you an confidence interval, but an acceptance_interval. So I think the function names are fine.

added 4 commits Oct 21, 2015
 Merge branch 'master' into fc 
 f8583b6 
 Replace numpy.testing.assert_raises with pytest.raises. 
 c7d1b6b 
 Remove unused import. 
 50e3a37 
 Fixed typo. 
 26464b1 

### cdeil commented Oct 21, 2015

 @dlennarz Thanks for your patience with the review on this PR. Let me know when you've pushed the latest changes and I'll merge this.
added 3 commits Oct 21, 2015
 Fixed typo. 
 3e6ad27 
 Fix problem with math rendering. 
 2e2bb46 
 Rename acceptance_region to acceptance_interval. 
 48af6a4 

### dlennarz commented Oct 21, 2015

 Done working on the comments / requests.
merged commit 48af6a4 into gammapy:master Oct 21, 2015
1 check failed
1 check failed
continuous-integration/travis-ci/pr The Travis CI build failed
Details

### cdeil commented Oct 21, 2015

 Merged. Thanks! I added you to the contributor list in e7190bf and let my editor do some auto code formatting fixes in c993789 . (the failing tests on travis-ci are unrelated installation issues of h5py, that should be fixed soon)
mentioned this pull request Oct 21, 2015

### cdeil commented Oct 21, 2015

 Actually I'm pretty sure I won't find time in the near future to work on the excess limit convenience functionality, so I've made a reminder here: #370
deleted the dlennarz:fc branch Dec 8, 2016