Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend documentation of quantities returned by estimators #3278

Merged
merged 7 commits into from Mar 24, 2021

Conversation

adonath
Copy link
Member

@adonath adonath commented Mar 19, 2021

Description

This pull request adds a proposal for the missing quantities returned by Estimator classes. Here is the current proposal:

Quantity Description
npred Predicted counts of the best fit hypothesis, equivalent to correlated counts for backward folding
npred_null Predicted counts of the null hypothesis
npred_signal Predicted counts of the signal over npred_null, equivalent to (npred - npred_null) and correlated excess for backward folding
stat Fit statistics value of the best fit hypothesis
stat_null Fit statistics value of the null hypothesis

I think the information is basically complete now. My proposed naming scheme would be npred uniformly and not introduce separate definitions like counts_corr and excess_corr for the case of backwards folding as done in the ExcessMapEstimator and instead just describe the equivalency in the table. I think in context of hypothesis testing this is reasonable.

@codecov
Copy link

codecov bot commented Mar 19, 2021

Codecov Report

Merging #3278 (e45e12f) into master (3f79b2c) will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3278      +/-   ##
==========================================
+ Coverage   93.79%   93.81%   +0.01%     
==========================================
  Files         144      144              
  Lines       17754    17754              
==========================================
+ Hits        16653    16656       +3     
+ Misses       1101     1098       -3     
Impacted Files Coverage Δ
gammapy/modeling/iminuit.py 95.65% <0.00%> (+3.26%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3f79b2c...e45e12f. Read the comment docs.

@adonath adonath self-assigned this Mar 19, 2021
@adonath adonath added the docs label Mar 19, 2021
@adonath adonath added this to In progress in gammapy.estimators via automation Mar 19, 2021
@adonath adonath added this to the v0.19 milestone Mar 19, 2021
@adonath
Copy link
Member Author

adonath commented Mar 19, 2021

I think one should add a bit more information on the definition of the null hypothesis, but this varies between the estimators, so it's better documented on the Estimator classes...

@adonath adonath changed the title Extend documentation of quantities return by Estimators Extend documentation of quantities returned by estimators Mar 19, 2021
AtreyeeS
AtreyeeS previously approved these changes Mar 19, 2021
Copy link
Member

@AtreyeeS AtreyeeS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adonath !
These changes look good to me.

Going through the full file, I think we can improve it a bit more:

  • Second paragraph:

The core of any estimator algorithm is hypothesis testing: a reference model or counts excess is tested against a null hypothesis. From the best fit reference model a flux is derived and a corresponding \sqrt{\Delta TS} value from the difference in fit statistics to the null hypothesis, assuming one degree of freedom. In this case \sqrt{\Delta TS} represents an approximation of the "classical significance".

This seems to imply that the flux or sqrt(del_ts) is valid only for one degree of freedom, but we start with models, which, naively, can have more dof.
Rather, change to -
"The core of any estimator algorithm is hypothesis testing: a reference model or counts excess is tested against a null hypothesis. From the best fit reference model a flux is derived and a corresponding \sqrt{\Delta TS} value from the difference in fit statistics to the null hypothesis. Assuming one degree of freedom, \sqrt{\Delta TS} represents an approximation of the "classical significance".

  • The tutorials subsection link to only the light curve notebooks. Add the other estimators as well?

stat Fit statistics value of the best fit hypothesis
stat_null Fit statistics value of the null hypothesis
================= =================================================
npred Predicted counts of the best fit hypothesis
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the GitHub rendering, these lines are not coming inside the table, but rather in a chaotic manner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, indeed this was because of mixture of tabs and whitespace. Fixed.

================= =================================================


To compute the assymetric errors as well as upper limits one can
specify the arguments ``n_sigma`` and ``n_sigma_ul``. The ``n_sigma``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe explain what n_sigma_ul is as well?

@adonath
Copy link
Member Author

adonath commented Mar 19, 2021

Thanks @AtreyeeS! Comments are addressed...

Copy link
Contributor

@registerrier registerrier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @adonath

My main comment is that excess is more explicit than signal. Or one could use excess_signal for more clarity, but it is a bit too long.

For instance the n_sig definition in the statistics doc is confusing for many who think it stands for number of sigma.

================= =================================================
npred Predicted counts of the best fit hypothesis
npred_null Predicted counts of the null hypothesis
npred_signal Predicted counts of the signal over `npred_bull`, equivalent to (`npred - npred_null`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think signal is misleading. Because npred_null can contain signal (in the sense of astrophysical photons). I still think excess is better, since it is really what it is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. My motivation was to stay in-line with what we define here: https://docs.gammapy.org/0.18.2/stats/index.html#notations but I now realise this requires some more thought. I just checked and npred is also define here https://gamma-astro-data-formats.readthedocs.io/en/latest/spectra/flux_points/index.html#normalization-representation, where it means the predicted counts from the model component only (so npred - npred_null in our definition...).

It's a bit similar with excess, because MapDataset.excess is already defined as counts - background and does not account for any additional source contributions. So that definition of excess is somewhat variable, in the sense that it can either include additional source contributions or not. I wonder whether we can find a fully consistent definition. I'll think about in once more and make a new proposal...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could rename sig -> excess consistently here: https://docs.gammapy.org/0.18.2/stats/index.html#notations, to get rid of the ambiguity with n sigma. But this would require adapting the API of MapDataset and CountsStatistic once again (fine from my side...if not now, we will not do it for v1.0...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the inconsistency between "our" npred (which means the total predicted counts...) and the definition in gadf cannot easily be resolved, it would require to either re-define in Gammapy, which I don't want to do, because MapDataset.npred() is already and establish part of our API (.npred_signal() possibly not that much...) or in gadf.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about if we just stay in line with the MapDataset and use npred, npred_background and npred_excess, but clearly state, that for estimators by default a "residual convention" like npred_background = dataset.npred() is used? In case there are no source models defined it is equivalent anyway. When looking at the map users will see whether sources are included in the background anyway...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On 2nd thought maybe npred, npred_excess and npred_null is also fine...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed to npred_excess...

value from the difference in fit statistics to the null hypothesis.
Assuming one degree of freedom, :math:`\sqrt{\Delta TS}` represents an
approximation (`Wilk's theorem <https://en.wikipedia.org/wiki/Wilks%27_theorem>`_)
of the "classical significance".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it an approximation of the significance or its definition in our domain?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I guess it's both and there is no either / or? Taken strictly it's an approximation according to Wilk's theorem and it's also the widely used definition in our domain...

:math:`\sqrt{\Delta TS}` represents an approximation of the
"classical significance".
value from the difference in fit statistics to the null hypothesis.
Assuming one degree of freedom, :math:`\sqrt{\Delta TS}` represents an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe one should mention explicitly that we use signed sqrt_ts

norm Best fit norm with respect to the reference spectral model
norm_err Symmetric error on the norm derived from the Hessian matrix
ts Difference in fit statistics (`stat - stat_null` )
sqrt_ts Square root of ts, in case of one degree of freedom, corresponds to significance (Wilk's theorem)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

squared root of ts multiplied by sign(excess)

================= =================================================
npred Predicted counts of the best fit hypothesis
npred_null Predicted counts of the null hypothesis
npred_signal Predicted counts of the signal over `npred_null`, equivalent to (`npred - npred_null`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because signal is ambiguous, I prefer npred_excess with the following definition:

npred_excess = npred - npred_null for forward folding estimators

And similarly:
excess = counts - npred_null for backward folding estimators.

@adonath
Copy link
Member Author

adonath commented Mar 24, 2021

@registerrier I'll go ahead and merge this now., but we should have follow up discussion on the n_sig and npred_signal() names...

@adonath adonath merged commit 569215f into gammapy:master Mar 24, 2021
gammapy.estimators automation moved this from In progress to Done Mar 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants