Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[hist] Fix TH1::KolmogorovTest with option X when one histogram has zero errors #13822

Merged
merged 2 commits into from
Jan 17, 2024

Conversation

lmoneta
Copy link
Member

@lmoneta lmoneta commented Oct 6, 2023

Fix the case of using KS test when one of the histogram has zero errors (i.e is a function). Improve KS test adding possibility to specify number of toys by using option "X=number" for example "X=1000"

This PR fixes the issue #13697

@lmoneta lmoneta self-assigned this Oct 6, 2023
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac11/noimt, mac12arm/cxx20, windows10/default
How to customize builds

@github-actions
Copy link

github-actions bot commented Oct 6, 2023

Test Results

     9 files       9 suites   1d 20h 32m 46s ⏱️
 2 490 tests  2 488 ✅ 0 💤 2 ❌
21 378 runs  21 375 ✅ 0 💤 3 ❌

For more details on these failures, see this check.

Results for commit 66d3a77.

♻️ This comment has been updated with latest results.

@phsft-bot
Copy link
Collaborator

Build failed on windows10/default.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Failing tests:

h2Expt->Reset();
h1Expt->FillRandom(h1_cpy, (Int_t)esum1);
h2Expt->FillRandom(h1_cpy, (Int_t)esum2);
if (!afunc1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we introduce a comment here, explaining what happens in the loop over over toys and why we don't need to fill the histograms if afunc1 or afunc2 respectively? And Maybe it's also good to note that they can't be both false, which is checked at the beginning of the function. Otherwise one might wonder why the loop is even implemented also for this case.

Comment on lines 8250 to 8251
if (afunc1) Copy(*h1Expt);
if (afunc2) h2->Copy(*h2Expt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (afunc1) Copy(*h1Expt);
if (afunc2) h2->Copy(*h2Expt);

Is this even necessary? The h1Exp is already created as a clone of this, and the h2Exp could also be created as a clone of h2.

By the way, when you create the clones, I would use a lambda to reduce code duplication and also use std::unique_ptr, replacing the block a few lines before with:

      auto cloneHist = [&](TH1 const *hist) {
         return std::unique_ptr<TH1>{static_cast<TH1 *>((gDirectory ? gDirectory : gROOT)->CloneObject(hist, false))};
      };
      std::unique_ptr<TH1> h1_cpy = cloneHist(this);
      std::unique_ptr<TH1> h1Expt = cloneHist(this);
      std::unique_ptr<TH1> h2Expt = cloneHist(this /*or maybe h2 here?*/);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment. I will simplify the code using just copy and avoid un-neeeded clones of histograms

}
if (!afunc2) {
h2Expt->Reset();
h2Expt->FillRandom(h1_cpy, (Int_t)esum2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is h2Expt a randomized version of h1_cpy here? Should it not be drawn from h2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, here we need to find the distribution of the KS test statistics, dsEXPT for the NULL, i.e. when we draw two histograms from the same parent distribution.
However, you are right there is an error. If afunc1=TRUE we need to draw from h2. So the logic is not correct.

Copy link
Contributor

@guitargeek guitargeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Lorenzo, thanks for the PR! I have a few comments about code style, code commenting, redundant copying and a potential problem of randomizing h2.

…ero errors

Fix the case of using KS test when one of the histogram has zero errors (i.e is a function).
Improve KS test adding possibility to specify number of toys by using option
"X=number" for example "X=1000"

This fixes the issue root-project#13697
Use as parent distribution for generating the toys for the NULL distribution of the test statistics the histogram with higer statistics. If it is a function use that one as parent.
This commit takes into account the comment of Jonas in the PR
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-performance-centos8-multicore/soversion, ROOT-ubuntu2204/nortcxxmod, ROOT-ubuntu2004/python3, mac12arm/cxx20, windows10/default
How to customize builds

Copy link
Contributor

@guitargeek guitargeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks a lot for the fix and addressing the review comments!

@lmoneta lmoneta merged commit da7fa2d into root-project:master Jan 17, 2024
9 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants