New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Feldman Cousins algorithm #367

Merged
merged 31 commits into from Oct 21, 2015

Conversation

Projects
None yet
2 participants
@dlennarz
Contributor

dlennarz commented Oct 11, 2015

Adding Feldman Cousins to gammapy.

Status: Done working on it, unless there are new comments / requests.

There was a previous pull request (#311), but I messed up that branch when trying to merge changes from the master. That's why I deleted that branch and recreated it.

@cdeil cdeil added the feature label Oct 11, 2015

@cdeil cdeil added this to the 0.5 milestone Oct 11, 2015

@cdeil cdeil self-assigned this Oct 11, 2015

Dirk Lennarz added some commits Oct 11, 2015

Dirk Lennarz
This is basically a rewrite of test_feldman_cousins.py. Reproduce num…
…bers from the FC paper. Extend coverage, while trying to cut down on the computation time. Rename functions to be more logical. Impove function documentation. Fix bugs in previously untested classes.
@dlennarz

This comment has been minimized.

Show comment
Hide comment
@dlennarz

dlennarz Oct 14, 2015

Contributor

I'm done working on it, unless there are new comments / requests.

Contributor

dlennarz commented Oct 14, 2015

I'm done working on it, unless there are new comments / requests.

@cdeil cdeil modified the milestones: 0.4, 0.5 Oct 15, 2015

@cdeil

This comment has been minimized.

Show comment
Hide comment
@cdeil

cdeil Oct 15, 2015

Member

@dlennarz – Would you mind making your code PEP8 compliant?
E.g. for gammapy/stats/feldman_cousins.py the pep8 tool points out this.

Also, if you have the patience to go though and remove the leading f from variable names everywhere, that's a remnant from C++ class members (now we're at Python, no classes), that would be great.

For example this:

fBackground  = 3.5
fNBinsX      = 100
fCL          = 0.90

should be this

background = 3.5
n_bins_x = 100
cl = 0.90

If you don't have time for such nitpicky stuff, just let me know and I'll just do it after merging this PR.

Member

cdeil commented Oct 15, 2015

@dlennarz – Would you mind making your code PEP8 compliant?
E.g. for gammapy/stats/feldman_cousins.py the pep8 tool points out this.

Also, if you have the patience to go though and remove the leading f from variable names everywhere, that's a remnant from C++ class members (now we're at Python, no classes), that would be great.

For example this:

fBackground  = 3.5
fNBinsX      = 100
fCL          = 0.90

should be this

background = 3.5
n_bins_x = 100
cl = 0.90

If you don't have time for such nitpicky stuff, just let me know and I'll just do it after merging this PR.

@@ -0,0 +1,20 @@
"""Demonstrate the artefact that can arise if

This comment has been minimized.

@cdeil

cdeil Oct 15, 2015

Member

I'm not sure it's really better, but maybe put the results in a table to get a nicer printout?

"""
Demonstrate the artifact that can arise if
fc_fix_upper_and_lower_limit is not used.
"""
import numpy as np
from astropy.table import Table
from gammapy.stats import fc_find_acceptance_region_poisson

background = 3.5
cl = 0.90
x_bins = np.arange(0, 100)

table = Table()
table['mu'] = [0.745, 0.750, 0.755, 1.030, 1.035, 1.040, 1.045, 1.050, 1.055, 1.060, 1.065]
table['x_min'] = 0.0
table['x_max'] = 0.0

for row in table:
    x_min, x_max = fc_find_acceptance_region_poisson(row['mu'], background, x_bins, cl)
    row['x_min'] = x_min
    row['x_max'] = x_max

table.pprint()

gives the output:

  mu  x_min x_max
----- ----- -----
0.745   0.0   8.0
 0.75   1.0   8.0
0.755   1.0   8.0
 1.03   1.0   8.0
1.035   0.0   8.0
 1.04   0.0   8.0
1.045   0.0   8.0
 1.05   0.0   8.0
1.055   0.0   8.0
 1.06   1.0   9.0
1.065   1.0   9.0
@cdeil

cdeil Oct 15, 2015

Member

I'm not sure it's really better, but maybe put the results in a table to get a nicer printout?

"""
Demonstrate the artifact that can arise if
fc_fix_upper_and_lower_limit is not used.
"""
import numpy as np
from astropy.table import Table
from gammapy.stats import fc_find_acceptance_region_poisson

background = 3.5
cl = 0.90
x_bins = np.arange(0, 100)

table = Table()
table['mu'] = [0.745, 0.750, 0.755, 1.030, 1.035, 1.040, 1.045, 1.050, 1.055, 1.060, 1.065]
table['x_min'] = 0.0
table['x_max'] = 0.0

for row in table:
    x_min, x_max = fc_find_acceptance_region_poisson(row['mu'], background, x_bins, cl)
    row['x_min'] = x_min
    row['x_max'] = x_max

table.pprint()

gives the output:

  mu  x_min x_max
----- ----- -----
0.745   0.0   8.0
 0.75   1.0   8.0
0.755   1.0   8.0
 1.03   1.0   8.0
1.035   0.0   8.0
 1.04   0.0   8.0
1.045   0.0   8.0
 1.05   0.0   8.0
1.055   0.0   8.0
 1.06   1.0   9.0
1.065   1.0   9.0

This comment has been minimized.

@dlennarz

dlennarz Oct 15, 2015

Contributor

I like it. Changed the example.

@dlennarz

dlennarz Oct 15, 2015

Contributor

I like it. Changed the example.

@cdeil

This comment has been minimized.

Show comment
Hide comment
@cdeil

cdeil Oct 15, 2015

Member

One more comment on documentation, and then I'm done with the review here.

It would be nice if you could add an example at the end of the function docstrings.
(see here as an example what I mean and here for the Numpy docstring syntax to do it.
It might also be useful to link from the docstrings to the high-level docs page you've written for Feldman-Cousins. Then it's all cross-linked and users can go back and forth between high-level docs, API docs and looking at the code if they like.

For the high-level docs page you've written, I also think it would be great to give at least one or two code examples directly on the page, with output. And for people like me that don't have the time to go read the FC paper, it would also be helpful to give some more absolute beginner info than "The first plot reproduces Fig. 7 from [Feldman1998](Poisson process with background 3.0).", i.e. something like: "Assume you have an observation corresponding to a Poisson process and observed 7 events and estimated a background of 3.0. Then using the FC method you can compute a confidence interval of (XXX, YYY) on the excess.".
I.e. my point is: it would be great if gamma-ray astronomers get some description and an example how to compute FC limits for the two or three common cases directly, without having to go to the FC paper to understand what's going on.

Member

cdeil commented Oct 15, 2015

One more comment on documentation, and then I'm done with the review here.

It would be nice if you could add an example at the end of the function docstrings.
(see here as an example what I mean and here for the Numpy docstring syntax to do it.
It might also be useful to link from the docstrings to the high-level docs page you've written for Feldman-Cousins. Then it's all cross-linked and users can go back and forth between high-level docs, API docs and looking at the code if they like.

For the high-level docs page you've written, I also think it would be great to give at least one or two code examples directly on the page, with output. And for people like me that don't have the time to go read the FC paper, it would also be helpful to give some more absolute beginner info than "The first plot reproduces Fig. 7 from [Feldman1998](Poisson process with background 3.0).", i.e. something like: "Assume you have an observation corresponding to a Poisson process and observed 7 events and estimated a background of 3.0. Then using the FC method you can compute a confidence interval of (XXX, YYY) on the excess.".
I.e. my point is: it would be great if gamma-ray astronomers get some description and an example how to compute FC limits for the two or three common cases directly, without having to go to the FC paper to understand what's going on.

Show outdated Hide outdated docs/stats/feldman_cousins.rst Outdated
Show outdated Hide outdated docs/stats/index.rst Outdated
Dirk Lennarz
Simplification of the fc_find_limit function. Change the return order…
… of some functions. Simplify function names. Some functions now raise ValueErrors in case of bad input. Expand the tests to increase coverage. Adjust variable names (removing trailing f). Improved function documentation. Expand documentation with code examples. Add example to gammapy.stats index page. Progress bar for the two analytical examples. Nicer print out for fc_demonstrate_artefact.py example.
@dlennarz

This comment has been minimized.

Show comment
Hide comment
@dlennarz

dlennarz Oct 20, 2015

Contributor

Done working on the comments / requests.

Coverage is 99% now and cannot be improved, because the two missing lines should never be called actually (it's still good to check for zero division).

I expanded the documentation with code examples and improved the function documentation (linking to the high-level docs page).

Contributor

dlennarz commented Oct 20, 2015

Done working on the comments / requests.

Coverage is 99% now and cannot be improved, because the two missing lines should never be called actually (it's still good to check for zero division).

I expanded the documentation with code examples and improved the function documentation (linking to the high-level docs page).

@cdeil

This comment has been minimized.

Show comment
Hide comment
@cdeil

cdeil Oct 21, 2015

Member

@dlennarz – Thanks!!!

I'm not quite sure why this error occurs:
https://travis-ci.org/gammapy/gammapy/jobs/86494723#L1071
Can you please rebasing this branch on master (see git commands here) ... maybe this will resolve the issue.

Member

cdeil commented Oct 21, 2015

@dlennarz – Thanks!!!

I'm not quite sure why this error occurs:
https://travis-ci.org/gammapy/gammapy/jobs/86494723#L1071
Can you please rebasing this branch on master (see git commands here) ... maybe this will resolve the issue.

Show outdated Hide outdated examples/fc_gauss_analytical.py Outdated
Show outdated Hide outdated examples/fc_poisson_analytical.py Outdated
Show outdated Hide outdated examples/fc_poisson_analytical.py Outdated
Assume you measured 6 counts in a Poissonian counting experiment with an
expected background :math:`b = 3`. Here's how you compute the 90% upper limit
on the signal strength :math:`\\mu`:

This comment has been minimized.

@cdeil

cdeil Oct 21, 2015

Member

Isn't this the use case covered by the fc_find_acceptance_regino_poisson convenience function?
If yes, I'd suggest using that simpler version on this top-level gammapy.stats docs page, and moving the (equivalent?) code that takes several calls to more general fc* functions to the FC docs page. Or keep it here in addition, but also show how to call the simpler fc_find_acceptance_region_poisson function here, please.

@cdeil

cdeil Oct 21, 2015

Member

Isn't this the use case covered by the fc_find_acceptance_regino_poisson convenience function?
If yes, I'd suggest using that simpler version on this top-level gammapy.stats docs page, and moving the (equivalent?) code that takes several calls to more general fc* functions to the FC docs page. Or keep it here in addition, but also show how to call the simpler fc_find_acceptance_region_poisson function here, please.

This comment has been minimized.

@dlennarz

dlennarz Oct 21, 2015

Contributor

You are mistaken about what gammapy.stats.fc_find_acceptance_region_poisson does. For a given background and mu it returns the acceptance interval (x_min, x_max) such that the integrated probability is the confidence level.

The case covered here is you measured a certain x and now you are looking for the corresponding confidence interval (mu_1, mu_2).

There is a difference between acceptance and confidence interval. Normally the user is interested in the confidence interval.

@dlennarz

dlennarz Oct 21, 2015

Contributor

You are mistaken about what gammapy.stats.fc_find_acceptance_region_poisson does. For a given background and mu it returns the acceptance interval (x_min, x_max) such that the integrated probability is the confidence level.

The case covered here is you measured a certain x and now you are looking for the corresponding confidence interval (mu_1, mu_2).

There is a difference between acceptance and confidence interval. Normally the user is interested in the confidence interval.

This comment has been minimized.

@cdeil

cdeil Oct 21, 2015

Member

Is it possible to add a "FC for dummies" functions for the common cases?
Something like this:

excess_lo, excess_hi = fc_excess_limits_poisson(n_on, mu_background, cl)
excess_lo, excess_hi = fc_excess_limits_poisson_on_off(n_on, n_off, alpha, cl)

I guess it's still a few lines to implement based on what's here now, and it's non-trivial to implement because some extra parameters like acceptance interval range and spacing has to be chosen so that it's reasonable efficient and accurate?

If you don't have time any more to add this, just let me know, and I'll try to code it up tonight.

Concerning the functions that compute acceptance intervals, it could be useful to add a sentence like:
"Acceptance intervals are used in the Feldman-Cousins computation of confidence intervals. Usually the confidence intervals are the result of interest."

@cdeil

cdeil Oct 21, 2015

Member

Is it possible to add a "FC for dummies" functions for the common cases?
Something like this:

excess_lo, excess_hi = fc_excess_limits_poisson(n_on, mu_background, cl)
excess_lo, excess_hi = fc_excess_limits_poisson_on_off(n_on, n_off, alpha, cl)

I guess it's still a few lines to implement based on what's here now, and it's non-trivial to implement because some extra parameters like acceptance interval range and spacing has to be chosen so that it's reasonable efficient and accurate?

If you don't have time any more to add this, just let me know, and I'll try to code it up tonight.

Concerning the functions that compute acceptance intervals, it could be useful to add a sentence like:
"Acceptance intervals are used in the Feldman-Cousins computation of confidence intervals. Usually the confidence intervals are the result of interest."

This comment has been minimized.

@dlennarz

dlennarz Oct 21, 2015

Contributor

The function fc_excess_limits_poisson basically combines:
fc_construct_acceptance_intervals_pdfs
fc_get_limits
fc_fix_limits
fc_find_limit

I think it's a bad idea to combine them into a function, because a) that function will be slow and b) we will have to choose a mu and x binning for the user. It would be much better to make this a class, so LowerLimit and UpperLimit aren't constantly recomputed when the function is called.

Things can always be improved.

@dlennarz

dlennarz Oct 21, 2015

Contributor

The function fc_excess_limits_poisson basically combines:
fc_construct_acceptance_intervals_pdfs
fc_get_limits
fc_fix_limits
fc_find_limit

I think it's a bad idea to combine them into a function, because a) that function will be slow and b) we will have to choose a mu and x binning for the user. It would be much better to make this a class, so LowerLimit and UpperLimit aren't constantly recomputed when the function is called.

Things can always be improved.

Show outdated Hide outdated gammapy/stats/feldman_cousins.py Outdated
Show outdated Hide outdated docs/stats/feldman_cousins.rst Outdated
Show outdated Hide outdated gammapy/stats/feldman_cousins.py Outdated
@cdeil

This comment has been minimized.

Show comment
Hide comment
@cdeil

cdeil Oct 21, 2015

Member

@dlennarz Thanks for your patience with the review on this PR. Let me know when you've pushed the latest changes and I'll merge this.

Member

cdeil commented Oct 21, 2015

@dlennarz Thanks for your patience with the review on this PR. Let me know when you've pushed the latest changes and I'll merge this.

@dlennarz

This comment has been minimized.

Show comment
Hide comment
@dlennarz

dlennarz Oct 21, 2015

Contributor

Done working on the comments / requests.

Contributor

dlennarz commented Oct 21, 2015

Done working on the comments / requests.

@cdeil cdeil merged commit 48af6a4 into gammapy:master Oct 21, 2015

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
@cdeil

This comment has been minimized.

Show comment
Hide comment
@cdeil

cdeil Oct 21, 2015

Member

Merged. Thanks!

I added you to the contributor list in e7190bf and let my editor do some auto code formatting fixes in c993789 .

(the failing tests on travis-ci are unrelated installation issues of h5py, that should be fixed soon)

Member

cdeil commented Oct 21, 2015

Merged. Thanks!

I added you to the contributor list in e7190bf and let my editor do some auto code formatting fixes in c993789 .

(the failing tests on travis-ci are unrelated installation issues of h5py, that should be fixed soon)

@cdeil

This comment has been minimized.

Show comment
Hide comment
@cdeil

cdeil Oct 21, 2015

Member

Actually I'm pretty sure I won't find time in the near future to work on the excess limit convenience functionality, so I've made a reminder here: #370

Member

cdeil commented Oct 21, 2015

Actually I'm pretty sure I won't find time in the near future to work on the excess limit convenience functionality, so I've made a reminder here: #370

@dlennarz dlennarz deleted the dlennarz:fc branch Dec 8, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment