@JasMehta08
Contributor

@JasMehta08 JasMehta08 commented Aug 15, 2025

This Pull Request:

Changes or fixes:

This PR fixes an issue where the plotOn function incorrectly normalizes an extended PDF to the number of events in a dataset, rather than to its own predicted number of events. This behavior is misleading when the PDF's normalization is a key part of the model's prediction.

The default logic in RooAbsPdf::plotOn is modified to check if a PDF is extended by calling expectedEvents(). If it is, the function now uses the PDF's prediction for normalization; otherwise, it retains the old behavior of scaling to the data for non-extended PDFs.
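A minimal standalone sketch of the default choice described above, not the actual patch. The `ExtendMode` enum here is a local stand-in that mirrors the values of `RooAbsPdf::ExtendMode`, and `defaultNormEvents` is a hypothetical helper name used only for illustration. The extendability check models a later revision mentioned further down in this thread; the exact condition used in the merged code is an assumption.

```cpp
// Local stand-in mirroring RooAbsPdf::ExtendMode (assumption: the real check
// inside plotOn may be expressed differently).
enum class ExtendMode { CanNotBeExtended, CanBeExtended, MustBeExtended };

// Hypothetical helper: the number of events the plotted curve is scaled to
// when no explicit Normalization() argument is given.
double defaultNormEvents(ExtendMode mode, double expectedEvents, double dataEvents)
{
    // Extended PDFs carry their own yield prediction, so that prediction is used.
    const bool extended = (mode != ExtendMode::CanNotBeExtended);
    // Non-extended PDFs have no yield of their own and keep scaling to the data.
    return extended ? expectedEvents : dataEvents;
}
```

With the numbers from the reproducer below (prediction of 5000, dataset of 10000), the extended case gives 5000 and the non-extended case gives 10000.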

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

Testing

I have verified this fix locally with several tests:

  1. Reproducer Script: Confirmed that for an extended PDF, the plotted curve is now normalized to the PDF's expectedEvents().

    #include "RooRealVar.h"
    #include "RooUniform.h"
    #include "RooDataSet.h"
    #include "RooExtendPdf.h"
    #include "RooPlot.h"
    #include "TCanvas.h"
    #include "RooCurve.h"
    #include <iostream>
    #include <memory>
    
    void reproducer() {
        RooRealVar x{"x", "x", 0, 1};
        RooRealVar n{"n", "n", 5000, 0, 20000};
        x.setBins(1);
        RooUniform pdf{"pdf", "pdf", x};
        std::unique_ptr<RooAbsData> data{pdf.generateBinned(x, 10000)};
        RooExtendPdf extPdf{"ext_pdf", "ext_pdf", pdf, n};
        auto c1 = new TCanvas{"c1", "c1"};
        auto frame = x.frame();
        data->plotOn(frame);
        extPdf.plotOn(frame);
        std::cout << frame->getCurve()->GetY()[0] << std::endl;
        frame->Draw();
        c1->SaveAs("plot.png");
    }
image
  2. Regression Test: Confirmed that for a standard, non-extended RooGaussian, the plotted curve is still correctly scaled to the data.

    #include "RooRealVar.h"
    #include "RooGaussian.h"
    #include "RooDataSet.h"
    #include "RooPlot.h"
    #include "TCanvas.h"
    #include "RooConstVar.h"
    #include <memory>

    void regression_test() {
        // This script checks that non-extended PDFs still
        // correctly scale to the data.

        RooRealVar x("x", "x", -10, 90);
        x.setBins(10);

        // Create the mean and sigma as named RooConstVar objects first
        RooConstVar mean("mean", "mean", 6.0);
        RooConstVar sigma("sigma", "sigma", 3.0);

        // Now pass these named objects to the constructor
        RooGaussian pdf("pdf", "pdf", x, mean, sigma);

        // A dataset with 500 events.
        std::unique_ptr<RooDataSet> data{pdf.generate(x, 500)};

        // Plot both.
        TCanvas* c1 = new TCanvas("c1", "c1");
        RooPlot* frame = x.frame();
        data->plotOn(frame);
        pdf.plotOn(frame); // The curve should scale to the data.

        frame->Draw();
        c1->SaveAs("regression_test.png");
    }
image
  3. Comprehensive Test: A single script was run to verify the default behavior and the RelativeExpected and NumEvent normalization overrides. The results are shown in the plot below, and all components behave as expected.

    #include "RooRealVar.h"
    #include "RooGaussian.h"
    #include "RooExtendPdf.h"
    #include "RooDataSet.h"
    #include "RooPlot.h"
    #include "TCanvas.h"
    #include "RooConstVar.h"
    #include <memory>

    void reproducer() {
        TCanvas* c1 = new TCanvas("c1", "Comprehensive plotOn Test", 1200, 400);
        c1->Divide(3, 1); // Create a 3x1 grid of plots

        // --- Common variables for all tests ---
        RooRealVar x("x", "x", 0, 10);
        x.setBins(20);

        // --- Test 1: Default behavior for NON-EXTENDED PDF ---
        c1->cd(1); // Move to the first pad
        RooPlot* frame1 = x.frame(RooFit::Title("Test 1: Default (Non-Extended)"));

        RooConstVar mean1("mean1", "mean1", 5.0);
        RooConstVar sigma1("sigma1", "sigma1", 1.0);
        RooGaussian pdf1("pdf1", "pdf1", x, mean1, sigma1);

        std::unique_ptr<RooDataSet> data1{pdf1.generate(x, 1000)};

        data1->plotOn(frame1);
        pdf1.plotOn(frame1); // Should scale to 1000 events in data
        frame1->Draw();

        // --- Test 2: RelativeExpected behavior for EXTENDED PDF ---
        c1->cd(2); // Move to the second pad
        RooPlot* frame2 = x.frame(RooFit::Title("Test 2: RelativeExpected"));

        RooRealVar n2("n2", "n2", 2000); // PDF predicts 2000 events
        RooExtendPdf pdf2("pdf2", "pdf2", pdf1, n2);

        // Create dummy data with a DIFFERENT number of events
        std::unique_ptr<RooDataSet> data2{pdf1.generate(x, 5000)};

        data2->plotOn(frame2);
        // Explicitly ask to scale to the PDF's own prediction
        pdf2.plotOn(frame2, RooFit::Normalization(1.0, RooAbsPdf::RelativeExpected));
        frame2->Draw();

        // --- Test 3: NumEvent behavior ---
        c1->cd(3); // Move to the third pad
        RooPlot* frame3 = x.frame(RooFit::Title("Test 3: NumEvent"));

        // Use the same non-extended pdf1 and data1
        data1->plotOn(frame3);
        // Explicitly scale the PDF to a specific number of events (e.g., 2500)
        pdf1.plotOn(frame3, RooFit::Normalization(2500, RooAbsPdf::NumEvent));
        frame3->Draw();

        c1->SaveAs("full_test.png");
    }
image
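As a rough cross-check of the value printed by the first reproducer, the arithmetic can be sketched as follows. This assumes the plotted curve height is the PDF density times the number of events times the bin width, which is my understanding of how the curve is scaled rather than something stated in this PR. `curveHeight` is a hypothetical helper, not a RooFit function.

```cpp
// Assumed scaling of a plotted PDF curve: density * number of events * bin width.
constexpr double curveHeight(double density, double nEvents, double binWidth)
{
    return density * nEvents * binWidth;
}

// Values from the first reproducer: RooUniform on [0, 1] with x.setBins(1).
constexpr double uniformDensity = 1.0; // 1 / (xmax - xmin) for a uniform PDF
constexpr double binWidth = 1.0;       // (xmax - xmin) / number of bins

// Old default: scaled to the 10000 events in the dataset.
constexpr double heightScaledToData = curveHeight(uniformDensity, 10000.0, binWidth);
// New default: scaled to the 5000 events predicted by the extended PDF.
constexpr double heightScaledToPrediction = curveHeight(uniformDensity, 5000.0, binWidth);

static_assert(heightScaledToData == 10000.0, "data-based scaling");
static_assert(heightScaledToPrediction == 5000.0, "prediction-based scaling");
```

Under that assumption, the reproducer would print 10000 before the change and 5000 after it.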

Fixes #18929

@JasMehta08
Contributor Author

JasMehta08 commented Aug 15, 2025

I have figured out what's wrong with my pull request.

Not all PDFs can be extended, so the new logic causes trouble for those, and some of the tests produced the wrong output, so I will have to fix that too. I will do this shortly.

@github-actions

github-actions bot commented Aug 15, 2025

Test Results

    21 files      21 suites   3d 9h 33m 55s ⏱️
 3 378 tests  3 265 ✅  0 💤 113 ❌
69 212 runs  69 084 ✅ 15 💤 113 ❌

For more details on these failures, see this check.

Results for commit b0132b4.

♻️ This comment has been updated with latest results.

Signed-off-by: JasMehta08 <jasmehta805@gmail.com>
@JasMehta08 JasMehta08 force-pushed the fix-roofit-ploton-norm branch from ceca2bd to b0132b4 Compare August 17, 2025 03:47
@JasMehta08
Contributor Author

JasMehta08 commented Aug 17, 2025

I changed the code to first check whether the PDF is extendable. I then ran as many of the failing tutorial tests as I could, and they passed, so I think this should fix it.

(Also, this is my first time contributing. Could you help me understand how I can run all the CI tests locally, if that is even possible?)

@JasMehta08
Contributor Author

JasMehta08 commented Aug 19, 2025

@guitargeek I went through the log of the failing tests and figured out that the fix itself changes the plotted output, which is why the stress tests fail when they compare against the expected reference results. Could you take a look at my code and tell me whether you think it is right, so that I can go ahead and update the stress tests accordingly?

@JasMehta08
Contributor Author

Hi @guitargeek ,

I've pushed a new version of the code that I believe is a more robust fix for the CI failures. Could a maintainer please approve the workflows to run so I can verify the solution?

Thank you!

@JasMehta08
Contributor Author

JasMehta08 commented Aug 23, 2025

The CI tests are passing. The ones that did not pass report the following error:

Failed to open the S3 connection: You must provide an auth secret.

I think this is an error in the CI itself, not in the bug fix. Could you please look into it, @guitargeek?

Thank you!

Contributor

@guitargeek guitargeek left a comment

LGTM, thanks a lot @JasMehta08! This solution is super clean and exactly what we needed!

@guitargeek guitargeek merged commit a8ef8b0 into root-project:master Aug 27, 2025
21 of 26 checks passed
guitargeek added a commit to guitargeek/root that referenced this pull request Dec 2, 2025
Since root-project#19656, scaling the pdf to the expected number of events in the
plot is the default behavior, because this is what makes sense when
comparing predictions to data. Therefore, the explicit
`RelativeExpected` normalization flag can be removed.

Development

Successfully merging this pull request may close these issues.

[RF] Plotting extended pdf together with data unexpectedly scales pdf to number of events in data
