# P-values

_Valerio Ippolito - INFN Sezione di Roma_

In this part we compute the p-value of the background-only hypothesis.

## Local p-values

In [None]:
f = new TFile("../ws/ICTPws_test_combined_ICTPws_test_model.root");
w = dynamic_cast<RooWorkspace*>(f->Get("combined"));
mc = dynamic_cast<RooStats::ModelConfig*>(w->obj("ModelConfig"));
dataset = w->data("obsData");

In [None]:
w->var("mu_ttH")->setVal(0);
w->var("mu_ttH")->setConstant(kTRUE);
RooFitResult *r = w->pdf("simPdf")->fitTo(*dataset, RooFit::Save());

Get the minimum NLL of this conditional (background-only) fit:

In [None]:
nll_0 = r->minNll();
cout << nll_0 << endl;

Now perform the unconditional fit, in which the POI is free to float:

In [None]:
w->var("mu_ttH")->setVal(1);
w->var("mu_ttH")->setConstant(kFALSE);
RooFitResult *r_mu = w->pdf("simPdf")->fitTo(*dataset, RooFit::Save());

In [None]:
nll_mu = r_mu->minNll();
cout << nll_mu << endl;

In [None]:
cout << "delta NLL = " << nll_mu - nll_0 << endl;

In [None]:
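// q0 = 2 (NLL_0 - NLL): non-negative when the unconditional fit describes the data better
double q0 = 2 * (nll_0 - nll_mu);
// one-sided asymptotic p-value: half the chi2 tail probability for 1 degree of freedom
double p0 = 0.5 * TMath::Prob(q0, 1);
cout << p0 << endl;
double Z = RooStats::PValueToSignificance(p0);
cout << Z << endl;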

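The p-value calculation is very simple: it is based on the test statistic (https://arxiv.org/pdf/1007.1727.pdf)
$$q_0 = 2(NLL_0 - NLL)$$
where $NLL_0$ is the negative log-likelihood minimized with the POI fixed to zero (background-only hypothesis), and $NLL$ is its minimum when the POI is also free to float. In the asymptotic regime, the significance is $Z = \sqrt{q_0}$ and the p-value is $p_0 = 1 - \Phi(Z)$, where $\Phi$ is the cumulative distribution of the standard Gaussian.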
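As a cross-check of the numbers printed by the fits, the asymptotic conversion from the two NLL minima to a significance and a p-value can be sketched in plain C++, without ROOT. The function names `discoveryQ0`, `significanceFromQ0` and `pValueFromSignificance` are hypothetical helpers introduced here for illustration; the convention that $q_0 = 0$ when the best-fit signal strength is negative follows the paper linked above.

```cpp
#include <cassert>
#include <cmath>

// Discovery test statistic q0 = 2 (NLL_0 - NLL), set to zero when the
// best-fit signal strength is negative. Hypothetical helper, not part of ROOT.
double discoveryQ0(double nll0, double nllFree, double muHat) {
    if (muHat < 0.0) return 0.0;
    double q0 = 2.0 * (nll0 - nllFree);
    return q0 > 0.0 ? q0 : 0.0;  // guard against small negative values from fit noise
}

// Asymptotic significance Z = sqrt(q0).
double significanceFromQ0(double q0) {
    assert(q0 >= 0.0);
    return std::sqrt(q0);
}

// One-sided Gaussian tail probability p0 = 1 - Phi(Z).
double pValueFromSignificance(double z) {
    return 0.5 * std::erfc(z / std::sqrt(2.0));
}
```

For instance, an NLL difference of 12.5 with a positive best-fit signal strength gives $q_0 = 25$ and hence $Z = 5$.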

In [None]:
std::cout << "Significance is: " << calculator.GetSignificance()
          << ", p-value is: " << calculator.GetPvalue() << "\n";

Toys that repeat the calculation over many variations of the global observables can be used to check how likely it is to obtain a fluctuation larger than the observed one (as in the concept of _global p-value_), and can be run easily:

In [None]:
N_toys = 1000;

calculator.SetSeed(1337); // useful to run in batch and be sure to merge many independent outputs!
calculator.SetPrintoutFrequency(10); // -1 will disable the printout
calculator.CalculateSignificanceToys(w, mc, dataset, N_toys);

pValues = calculator.GetToysPvalues();
significances = calculator.GetToysSignificances();
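Once the toy significances are available, the fraction of toys that fluctuate at least as high as the observed value gives an empirical estimate of how likely such a fluctuation is. The sketch below is plain C++ and assumes the toy results can be copied into a `std::vector<double>` (the actual return type of `GetToysSignificances()` is not shown here); `fractionAtOrAbove` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of toys whose significance is at least as large as the observed one.
// Hypothetical helper: assumes the toy significances fit in a std::vector<double>.
double fractionAtOrAbove(const std::vector<double>& toySignificances, double observed) {
    assert(!toySignificances.empty());
    std::size_t count = 0;
    for (double z : toySignificances) {
        if (z >= observed) ++count;
    }
    return static_cast<double>(count) / static_cast<double>(toySignificances.size());
}
```

Note that with $N$ toys this estimate cannot resolve probabilities below $1/N$, so 1000 toys are only sensitive to fluctuations at the per-mille level or more frequent.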

Let's visualize the output

In [None]:
h_pval = new TH1F("pval", "pval", 100, 0, 1);
for (size_t i = 0; i < pValues.size(); i++) {
    cout << "toy " << i << ": pval " << pValues[i] << " sign " << significances[i] << endl;
    h_pval->Fill(pValues[i]);
}

c = new TCanvas("c", "c", 600, 600);
h_pval->Draw();
c->Draw();

The output may also be persisted to a ROOT file:

In [None]:
output_f = new TFile("my_pvalues.root", "RECREATE");
calculator.WriteResultsToROOTfile(output_f, "p0");
calculator.WriteToysToROOTfile(output_f, "toys");
output_f->Write();
delete output_f;

which can in turn be read back easily:

In [None]:
output_f = new TFile("my_pvalues.root");
output_f->ls();

In [None]:
t = dynamic_cast<TTree*>(output_f->Get("p0"));
t->Show(0);

In [None]:
c = new TCanvas("c", "c", 600, 600);
t = dynamic_cast<TTree*>(output_f->Get("toys"));
t->Draw("significance");
c->Draw();