[RF] Use integral of PDF curves for pull plots and residuals #7239

Closed
hageboeck opened this issue Feb 18, 2021 · 6 comments

@hageboeck
Member

hageboeck commented Feb 18, 2021

Explain what you would like to see improved

When using pullHist, don't evaluate the curve at the bin centre; integrate the curve over the bin instead. That should fix the following pull plot:

Wrong                 | Correct
[image: oneToy_old]   | [image: oneToy_new]

Optional: share how it could be improved

  • Alter this part of the function:
    if (useAverage) {
      Double_t exl = GetErrorXlow(i);
      Double_t exh = GetErrorXhigh(i);
      if (exl <= 0) exl = GetErrorX(i);
      if (exh <= 0) exh = GetErrorX(i);
      if (exl <= 0) exl = 0.5 * getNominalBinWidth();
      if (exh <= 0) exh = 0.5 * getNominalBinWidth();
      yy = point - curve.average(x - exl, x + exh);
    } else {
      yy = point - curve.interpolate(x);
    }
  • Instead of interpolating the curve at the middle of the bin, find the bin edges from RooHist using the x errors. (Note that RooHist inherits from TGraphAsymmErrors.)
  • Then, find the corresponding points on the PDF curve.
  • Then, integrate the PDF curve between the first and last of those points.
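To see how much the two approaches can differ, here is a minimal standalone sketch that does not use ROOT. It assumes the Gaussian from the reproducer below (mean 1.5, sigma 0.5, 10000 events, bins of width 1). The function names `gaussPdf`, `normalCdf`, `expectedAtCentre` and `expectedFromIntegral` are hypothetical helpers introduced for illustration; they are not part of RooFit, and the integral is done analytically rather than on the plotted curve.

```cpp
#include <cmath>

// Standard normal cumulative distribution, written with the error function.
double normalCdf(double z) {
    return 0.5 * (1.0 + std::erf(z / std::sqrt(2.0)));
}

// Gaussian probability density with mean m and width s.
double gaussPdf(double x, double m, double s) {
    const double pi = std::acos(-1.0);
    const double z = (x - m) / s;
    return std::exp(-0.5 * z * z) / (s * std::sqrt(2.0 * pi));
}

// Expected bin content when the density is only evaluated at the bin centre.
double expectedAtCentre(double lo, double hi, double m, double s, double nEvents) {
    return nEvents * (hi - lo) * gaussPdf(0.5 * (lo + hi), m, s);
}

// Expected bin content when the density is integrated over the whole bin.
double expectedFromIntegral(double lo, double hi, double m, double s, double nEvents) {
    return nEvents * (normalCdf((hi - m) / s) - normalCdf((lo - m) / s));
}
```

For the bin [1, 2], which is centred on the mean, the centre evaluation predicts about 7979 events while the integral predicts about 6827. With a Poisson uncertainty of roughly 83 events, that difference alone corresponds to a pull of about 14 standard deviations in a narrow-peak bin, even for data generated from the plotted PDF.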

Code to produce such a plot

From here

// Observable with 20 bins of width 1 on [-10, 10].
RooRealVar x("x", "x", -10, 10);
x.setBins(20);
RooRealVar m("m", "mean",  1.5, -10, 10);
RooRealVar s("s", "sigma", 0.5, -10, 10);
RooGaussian gaus("gaus", "Gaussian distribution", x, m, s);

// Generate a toy dataset from the Gaussian.
RooDataSet* dataset = gaus.generate(x, 10000);
RooDataHist binnedDataset("binnedData", "BinnedData", x, *dataset);

TCanvas c("c", "Pull demo", 1200, 800);
c.Divide(2,2);

// Pad 1: data with the PDF curve overlaid.
c.cd(1);
auto frame = x.frame();
dataset->plotOn(frame);
gaus.plotOn(frame);
frame->Draw();

// Pad 3: pulls of the data with respect to the curve.
c.cd(3);
auto pulls = frame->pullHist();
pulls->Draw("AP"); // "A" draws the axes, since this pad has none yet
c.Draw();
@olantwin
Contributor

Thanks @hageboeck! I will give this a go when I find some time.

@etejedor
Contributor

@olantwin do you want this to be assigned to you then?

@etejedor etejedor removed this from Needs triage in Triage Feb 18, 2021
@olantwin
Contributor

I can't promise being able to look at this soon, but as long as that is understood, feel free to do so.

@hageboeck
Member Author

Stop, stop, stop!

It's much easier, all the code is already there! See this:

auto pulls = frame->pullHist();   | auto pulls = frame->pullHist(nullptr, nullptr, true);
[image: canvas-2]                 | [image: canvas]

So,

Revised version of what has to happen

  • Change the default arguments of pullHist and residHist to useAverage = true.
  • Update the documentation to clarify that by default, the curve will be averaged over each bin.
  • Mention the change in the release notes.

@guitargeek guitargeek self-assigned this Feb 18, 2021
@guitargeek
Contributor

Thanks @hageboeck! I'll take care of it.

guitargeek added a commit to guitargeek/root that referenced this issue Feb 18, 2021
When making residual or pull distributions with `RooPlot::residHist` or
`RooPlot::pullHist`, the histogram is now compared with the curve's
average values within a given bin by default, ensuring that residual and
pull distributions are valid for strongly curved distributions.

The old default behaviour was to interpolate the curve at the bin
centres, which can still be enabled by setting the useAverage
parameter of `RooPlot::residHist` or `RooPlot::pullHist` to `false`.

Fixes root-project#7239.
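
Written out, the two behaviours described in this commit message can be sketched as follows. The notation is introduced here for illustration and does not appear in the commit: n_i is the histogram content of bin i, x_i its centre, [x_i^lo, x_i^hi] its edges taken from the x errors, f the plotted curve, and sigma_i the uncertainty on the point, which is assumed to be the normalisation used for the pull.

```latex
\begin{align}
  \bar{f}_i &= \frac{1}{x_i^{\mathrm{hi}} - x_i^{\mathrm{lo}}}
               \int_{x_i^{\mathrm{lo}}}^{x_i^{\mathrm{hi}}} f(x)\,\mathrm{d}x
            && \text{(curve averaged over bin $i$)} \\
  r_i &= n_i - \bar{f}_i
            && \text{(residual, \texttt{useAverage = true}, new default)} \\
  r_i^{\mathrm{old}} &= n_i - f(x_i)
            && \text{(residual, \texttt{useAverage = false}, old default)} \\
  p_i &= \frac{r_i}{\sigma_i}
            && \text{(pull)}
\end{align}
```

The two residuals agree when f is close to linear across the bin and differ most where the curve bends strongly within a bin, such as at a narrow peak.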
@olantwin
Contributor

Thanks Jonas, Stephan!

nicknagi pushed a commit to nicknagi/root that referenced this issue Mar 30, 2021
@guitargeek guitargeek added this to Issues in Fixed in 6.26/00 via automation Apr 13, 2021
@guitargeek guitargeek added this to Issues in Fixed in 6.24/00 via automation Apr 13, 2021
@guitargeek guitargeek removed this from Issues in Fixed in 6.26/00 Apr 13, 2021