Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RF] Re-implement multi-range fits #11455

Merged
merged 9 commits into from
Oct 3, 2022

Conversation

guitargeek
Copy link
Contributor

@guitargeek guitargeek commented Sep 28, 2022

Multi-range fits in RooFit are more complicated than they should be.

In principle, all that is required is to change the normalization range of the PDF to the union of the ranges.

There is a RooAbdPdf interface to suggest that this could be done easily like this:

pdf.setNormRange("range1,range2")

But this doesn't work well for RooAddPdfs, which is probably why it was chosen to implement mulit-range fits as a sum of separate RooNLLVars. But in this case, the PDFs are normalized separately, and extra terms need to be introduced to correct for that. This resulted in lots of complicated code, and still there are issues like #11447, i.e. is still doesn't work for simultaneous fits.

That's why I decided to fix the setNormRange() for RooAddPdfs, and then starting from that re-implement multi-ranged fits in both the new BatchMode and the old RooFit based on that.

Closes #11447.

@guitargeek guitargeek self-assigned this Sep 28, 2022
@guitargeek guitargeek changed the title [RF] Remove multi-range fit logic from RooAbsPdf [RF] Try to implement multi-range support for RooSimultaneous Sep 28, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@root-project root-project deleted a comment from phsft-bot Sep 29, 2022
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac1015/cxx17, mac11/cxx14, windows10/cxx14
How to customize builds

The function that created the RooAddPdf cache elements was completely
broken once either the AddPdf or one of its components had a custom
normalization set defined via `RooAbsPdf::setNormSet()`.

To make this work, I was rethinking and rewriting the cache creation
logic to hopefully cover more cases correctly.

There are some PRs coming up where I will use `RooAbsPdf::setNormSet()`
for all ranged fits, so the changes in this commit will get a lot of
test coverage from that.
In the `RooBinSamplingPdf`, the underlying PDF is evaluated with a call
to `getVal()` without normalization set. This is not a good idea,
becaues it implicitly assumes that the PDF was evaluated with the right
normalization range the last time and breaks the case where you set the
normalization range of the PDF with `setNormRange()`.

It is better to evaluate the PDF explicitly normalized and then declate
the RooBinSamplingPdf as self normalized.

Some unnecessary printouts in the corresponding unit tests are also
removed.
Now that defining the normalization range via `RooAbsPdf::setNormRange`
works, we can directly use it to create the right normalization integral
for the top-level PDF for the new BatchMode, and we don't need to create
any further correction integrals.
Multi-range fits in RooFit are more complicated than they should be.

In principle, all that is required is to change the normalization range
of the PDF to the union of the ranges.

There is a RooAbdPdf interface suggesting that this could be done easily
like this:
```C++
pdf.setNormRange("range1,range2")
```

But this didn't wor for RooAddPdfs, which is probably why it was chosen
to implement mulit-range fits as a sum of separate RooNLLVars. In the
old test statistics framework. But in this case, the PDFs are normalized
separately, and extra terms need to be introduced to correct for that.
This resulted in lots of complicated code, and still there are issues
like root-project#11447, i.e. is still doesn't work for simultaneous fits.

Now that in the previous commits the `RooAddPdf` behavior for
`setNormRange()` was fixed, one can actually use comma-separated
normalization ranges for a single PDF in a multi-range fit. With this
commit, this is done in the old RooFit test statistics classes.

This has sevaral advantages:

1. Fixes issues with multi-range simultaneous fits
2. It's now in harmony with the logic in the new BatchMode
3. Some speedup because there are less nodes in the computation graph
4. Less code required
5. Logic is easier to understand

Maybe there are also new bugs now, but they can be fixed later. I am
sure that already now the commit fixes more issues that it creates.
With the recent refactoring of ranged fits, the RooAddPdf configuration
when you do a ranged fits changed in the construction of the test
statistic.

Before: reset default range of the observables to the fitting range and
create named range for the coefficient normalization.

After: keep default range of the observables and coefficient
normalization, and only change the normalization range of the PDF.

With the refactoring, there is no test coverage for the first case
anymore. That's why this commit suggests a new unit test to cover the
RooAddPdf configurations that were covered by the test statistics
formally.
Before, the `checkIfRangesOverlap` function was used in
`RooAbsPdf::createNLL`, which is why it had some command arguments that
accomodated specifically for that usecase. This is not necessary
anymore.
The new implementation of multi-ranged fits is not relying on the
`RooAbsData::reduce()` function to work with multiple comma-separated
ranges. Hence, this is implemented now.
Now that the multi-ranged fits for RooSimultaneous got re-implemented in
a better way with hopefully less bugs, the closure check for multi-range
fits in `testRooSimultaneous` succeeds, so it should be done in the
test.

Closes root-project#11447.
@phsft-bot
Copy link
Collaborator

Starting build on ROOT-debian10-i386/soversion, ROOT-performance-centos8-multicore/cxx17, ROOT-ubuntu18.04/nortcxxmod, ROOT-ubuntu2004/python3, mac1015/cxx17, mac11/cxx14, windows10/cxx14
How to customize builds

@guitargeek
Copy link
Contributor Author

Just pushed again with the fixup commits squashed

@phsft-bot
Copy link
Collaborator

Build failed on ROOT-ubuntu2004/python3.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2022-10-03T07:18:59.108Z] FAILED: lib/modules.idx

@phsft-bot
Copy link
Collaborator

Build failed on mac1015/cxx17.
Running on macitois21.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

Copy link
Member

@lmoneta lmoneta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice improvements! Thank you Jonas!

@guitargeek guitargeek merged commit 529f330 into root-project:master Oct 3, 2022
@guitargeek guitargeek deleted the multirange branch October 3, 2022 13:29
guitargeek added a commit to guitargeek/root that referenced this pull request Sep 15, 2023
Almost a year ago, I fixed the support for comma-separated normalization
ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also
for RooAddPdfs.

As a result, the logic for multi-range likelihood fits was removed from
`createNLL()`, because the multi-range fit didn't have to be treated as
a special case anymore.

The same applies also to chi-square fits. In fact, the reason why the
`RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a
`RooAbsPdf` overload was *only* this workaround! For regular
RooAbsReals, the workaround was not necessary, because there is no
normalization.

Therefore, quite a few functions were removed in this commit.

This is a follow-up to PR root-project#11455 (commit fa10523 in particular),
where the same change was already make for regular likelihoods.
guitargeek added a commit to guitargeek/root that referenced this pull request Sep 15, 2023
Almost a year ago, I fixed the support for comma-separated normalization
ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also
for RooAddPdfs.

As a result, the logic for multi-range likelihood fits was removed from
`createNLL()`, because the multi-range fit didn't have to be treated as
a special case anymore.

The same applies also to chi-square fits. In fact, the reason why the
`RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a
`RooAbsPdf` overload was *only* this workaround! For regular
RooAbsReals, the workaround was not necessary, because there is no
normalization.

Therefore, quite a few functions were removed in this commit.

The multi-range chi2 fit is now also validated by the multi-range fit
unit test in `testRooAbsPdf`.

This is a follow-up to PR root-project#11455 (commit fa10523 in particular),
where the same change was already make for regular likelihoods.
guitargeek added a commit to guitargeek/root that referenced this pull request Sep 17, 2023
Almost a year ago, I fixed the support for comma-separated normalization
ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also
for RooAddPdfs.

As a result, the logic for multi-range likelihood fits was removed from
`createNLL()`, because the multi-range fit didn't have to be treated as
a special case anymore.

The same applies also to chi-square fits. In fact, the reason why the
`RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a
`RooAbsPdf` overload was *only* this workaround! For regular
RooAbsReals, the workaround was not necessary, because there is no
normalization.

Therefore, quite a few functions were removed in this commit.

The multi-range chi2 fit is now also validated by the multi-range fit
unit test in `testRooAbsPdf`.

This is a follow-up to PR root-project#11455 (commit fa10523 in particular),
where the same change was already make for regular likelihoods.
guitargeek added a commit that referenced this pull request Sep 18, 2023
Almost a year ago, I fixed the support for comma-separated normalization
ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also
for RooAddPdfs.

As a result, the logic for multi-range likelihood fits was removed from
`createNLL()`, because the multi-range fit didn't have to be treated as
a special case anymore.

The same applies also to chi-square fits. In fact, the reason why the
`RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a
`RooAbsPdf` overload was *only* this workaround! For regular
RooAbsReals, the workaround was not necessary, because there is no
normalization.

Therefore, quite a few functions were removed in this commit.

The multi-range chi2 fit is now also validated by the multi-range fit
unit test in `testRooAbsPdf`.

This is a follow-up to PR #11455 (commit fa10523 in particular),
where the same change was already make for regular likelihoods.
maksgraczyk pushed a commit to maksgraczyk/root that referenced this pull request Jan 12, 2024
Almost a year ago, I fixed the support for comma-separated normalization
ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also
for RooAddPdfs.

As a result, the logic for multi-range likelihood fits was removed from
`createNLL()`, because the multi-range fit didn't have to be treated as
a special case anymore.

The same applies also to chi-square fits. In fact, the reason why the
`RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a
`RooAbsPdf` overload was *only* this workaround! For regular
RooAbsReals, the workaround was not necessary, because there is no
normalization.

Therefore, quite a few functions were removed in this commit.

The multi-range chi2 fit is now also validated by the multi-range fit
unit test in `testRooAbsPdf`.

This is a follow-up to PR root-project#11455 (commit fa10523 in particular),
where the same change was already make for regular likelihoods.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RF] NLL from RooSimultaneous doesn't give the expected value in multi-range fits
3 participants