Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unused and untested code from stats/data.py #2233

Closed
registerrier opened this issue Jun 13, 2019 · 8 comments
Closed

Remove unused and untested code from stats/data.py #2233

registerrier opened this issue Jun 13, 2019 · 8 comments
Assignees
Labels
Milestone

Comments

@registerrier
Copy link
Contributor

In stats/data.py, the functions make_stats, combine_stats and compute_total_stats are unused and not tested.

I propose to remove them.
OK? @adonath @cdeil

@registerrier registerrier added this to the 0.13 milestone Jun 13, 2019
@registerrier registerrier self-assigned this Jun 13, 2019
@adonath
Copy link
Member

adonath commented Jun 13, 2019

👍

@cdeil
Copy link
Contributor

cdeil commented Jun 13, 2019

@registerrier - The combine_stats helper function probably was useful?

How do we weight a_on and a_off in stacked spectrum analysis (either when combining runs or grouping energy bins)? Could you please link to the code that does this in gammapy.spectrum?

Do you think we'll have on-off maps? If yes, that would be another caller of that helper function.

@adonath
Copy link
Member

adonath commented Jun 14, 2019

I'm with @registerrier here. Code that is currently unused and untested should be just removed. If it's still useful, we can re-introduce it later, when the use-case is on the table and clearly defined. I'm sure the code was introduced following the YAGNI principle...

@registerrier
Copy link
Contributor Author

I think that the use case of combine_stats is found in ObservationsStats.stack() which is used by SpectrumStats. So it is clearly duplicated there.

@adonath
Copy link
Member

adonath commented Jun 14, 2019

Resolved by #2235.

@adonath adonath closed this as completed Jun 14, 2019
@cdeil
Copy link
Contributor

cdeil commented Jun 14, 2019

I'm all for cleaning up and removing or improving code.

I think in this case maybe we should have improved, not removed.

The concern I have is about ObservationStats.stack but even more about SpectrumDatasetOnOffStacker.stack_backscal.

What exactly do these methods do?

They don't have good docstrings, explaining the method used, and they don't have good tests (I found this one).

Especially SpectrumDatasetOnOffStacker.stack_backscal is very important. I remember we had a lot of weird and wrong results in HAP SpectrumMaker and FitSpectrum in cases where we stacked spectra from runs where n_off was small or zero for some observations, and then added extra methods - I think we made a_on and a_off weighting by livetime the default, because while not theoretically "the right thing to do", it was in practice the reasonable thing to do because it always gave OK results, it didn't matter if n_off was zero.

So my suggestions here are to improve SpectrumDatasetOnOffStacker.stack_backscal code and docstring and tests, actually even to follow the nomenclature defined in https://docs.gammapy.org/0.12/stats/index.html for on/off stats, i.e. go away from this "backscal" and OGIP stuff and towards a codebase as we have it in the IACT codes - and just support OGIP as a FITS serialisation format. This is not required here, SpectrumDatasetOnOffStacker.stack_backscal and ObservationStats.stack could also be improved, no matter what variable nomenclature is used.

This might or might not re-introducing a utility function like gammapy.stats.combine_stats - of course with good tests and 1 or 2 callers. If some Sherpa or X-ray paper or note exists that gives the formula how they weight a_on and a_off that should be added as reference - I think it's by livetime, and that should also be the default for us.

I'm re-opening this issue now for discussion.

@registerrier as expert on this - Could you please have a look and comment what you think we could improve (if anything) here?

@cdeil cdeil reopened this Jun 14, 2019
@cdeil
Copy link
Contributor

cdeil commented Jun 17, 2019

Found this: https://heasarc.gsfc.nasa.gov/docs/asca/abc_backscal.html
Might be a useful reference to mention in the docstring.

@cdeil cdeil modified the milestones: 0.13, 0.14 Jul 18, 2019
@cdeil cdeil modified the milestones: 0.14, 0.15 Sep 27, 2019
@cdeil cdeil modified the milestones: 0.15, 1.0 Dec 2, 2019
@adonath
Copy link
Member

adonath commented Jan 30, 2020

The stacking method is now documented here https://docs.gammapy.org/0.15/api/gammapy.spectrum.SpectrumDatasetOnOff.html#gammapy.spectrum.SpectrumDatasetOnOff.stack

Closing the issue now.

@adonath adonath closed this as completed Jan 30, 2020
@adonath adonath modified the milestones: 1.0, 0.15 Jan 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants