## Hypothesis Test with a Study of the Look-Elsewhere Effect (LEE)

We repeat the same hypothesis test on the same problem, but now we let one (or all) of the signal parameters, such as the mass and the width of the signal, vary in the fit.

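We show that the null distribution of the test statistic no longer follows the $\chi^2$ distribution with one degree of freedom, so a p-value estimated from that asymptotic distribution is no longer reliable.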
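
The effect can be illustrated without RooFit. The following is a minimal sketch, not the notebook's model: it assumes that scanning the mass is equivalent to looking at `nWindows` independent search windows, each with a unit Gaussian fluctuation, and it uses the one-sided statistic q = z²/2 for z > 0 (the log-likelihood ratio convention). The function name `tailFraction` and all its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Fraction of background-only toys whose largest one-sided statistic over
// nWindows independent search windows exceeds qThreshold.
// Assumption (illustration only): each window fluctuates as a unit Gaussian z,
// and q = z*z/2 for z > 0, else 0 (log-likelihood ratio, not 2x).
double tailFraction(int nWindows, double qThreshold, int nToys, unsigned seed)
{
   assert(nWindows > 0 && nToys > 0);
   std::mt19937 gen(seed);
   std::normal_distribution<double> gauss(0.0, 1.0);
   int nAbove = 0;
   for (int t = 0; t < nToys; ++t) {
      double qMax = 0.0;
      for (int k = 0; k < nWindows; ++k) {
         const double z = gauss(gen);
         const double q = (z > 0.0) ? 0.5 * z * z : 0.0;
         qMax = std::max(qMax, q); // scanning = keeping the largest excess
      }
      if (qMax > qThreshold) ++nAbove;
   }
   return static_cast<double>(nAbove) / nToys;
}
```

For example, `tailFraction(1, 2.0, 20000, 1)` estimates the local tail probability of a 2σ excess (about 0.023), while `tailFraction(10, 2.0, 20000, 1)` comes out close to 1 - (1 - 0.023)^10 ≈ 0.21, the value expected for ten independent windows.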


In [None]:
using namespace RooStats;

In [None]:
HypoTestResult * result = nullptr;
ProfileLikelihoodTestStat * testStat = nullptr; 
ToyMCSampler * toymcs = nullptr; 
HypoTestPlot * plot = nullptr; 
// enable use of NLL offset for better minimizations
RooStats::UseNLLOffset(true);
//ROOT::Math::MinimizerOptions::SetDefaultMinimizer("Minuit2");

#### Reading the model (Workspace) from input file

The first part just opens the workspace file and retrieves the model and the data.

In [None]:
TString fileName ="HiggsBinModel.root";  
TString workspaceName = "w";
TString modelConfigName = "ModelConfig";
TString dataName = "data";
TString integrationType = "";  

In [None]:
auto file = TFile::Open(fileName);

In [None]:
auto w =  (RooWorkspace*) file->Get(workspaceName);
w->Print();
auto sbModel = (RooStats::ModelConfig*) w->obj(modelConfigName);
auto  data = w->data(dataName);
auto poi = (RooRealVar*) sbModel->GetParametersOfInterest()->first();
// save the current POI value from the workspace so it can be restored after the snapshots are set
int nevt_obs = poi->getVal(); 

##### Make the B model by cloning the S+B model and setting the parameter of interest to 0

In [None]:
auto bModel = (RooStats::ModelConfig*) sbModel->Clone();
sbModel->SetName("S+B Model");
poi->setVal(250);
sbModel->SetSnapshot( *poi);
bModel->SetName("B Model");
poi->setVal(0);
bModel->SetSnapshot( *poi  );
sbModel->Print();
bModel->Print();
poi->setVal(nevt_obs);

#### Let the signal parameters vary in the fit

To study the LEE we do not fix the mass of the signal peak; it must be left free in the fit. In this example the width is kept constant.

As before, we fix the background parameters to reduce the fitting time for the toys.

In [None]:
w->var("mass")->setConstant(false);
w->var("width")->setConstant(true);
w->var("a1")->setConstant(true);
w->var("a2")->setConstant(true);

### Run the Frequentist Calculator

We now run the FrequentistCalculator on the same model. The FrequentistCalculator uses test statistic distributions obtained from pseudo-experiments.

In [None]:
RooStats::FrequentistCalculator   fc(*data, *sbModel, *bModel);
// to enable Proof
RooStats::ProofConfig pc(*w, 0, "", kFALSE);

We configure the FrequentistCalculator by specifying the number of toys for the two hypotheses.

We also need to specify the type of test statistic. Here we use the profile likelihood test statistic.

In [None]:
testStat = new RooStats::ProfileLikelihoodTestStat(*sbModel->GetPdf());
// needed for PL test statistics
testStat->SetOneSidedDiscovery(true);
// to enable debug of fitting toys
//testStat->SetPrintLevel(-1);
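
For reference, the one-sided discovery statistic configured above can be written as follows. This is a sketch of the usual definition, assuming $\mu$ denotes the parameter of interest, $\theta$ the remaining floating parameters, and that the returned value is the log-likelihood ratio itself rather than twice it (as noted later in this notebook). The symbols are illustrative and are not names from the workspace.

$$
q_0 =
\begin{cases}
-\ln \lambda(0) & \hat{\mu} \ge 0 \\
0 & \hat{\mu} < 0
\end{cases}
\qquad
\lambda(0) = \frac{L(0, \hat{\hat{\theta}})}{L(\hat{\mu}, \hat{\theta})}
$$

When the mass floats, it has no influence on the likelihood at $\mu = 0$, which is the reason the usual asymptotic distribution of $q_0$ is not expected to hold.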

In [None]:
toymcs = (RooStats::ToyMCSampler*)fc.GetTestStatSampler();
toymcs->SetTestStatistic(testStat);
toymcs->SetGenerateBinned(true);
// toymcs->SetProofConfig(&pc);    // to use PROOF 

#### Set the number of pseudo-experiments

We generate toys only for the null hypothesis, since we are not interested here in the expected significance. The alternate hypothesis therefore gets a single toy.

In [None]:
fc.SetToys(2000,1);   
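
The number of null toys limits how small a p-value can be resolved. The sketch below is not the RooStats implementation: it only shows, with the hypothetical helpers `pValueFromToys` and `pValueError`, how a p-value can be estimated as the fraction of null toys at or above an observed value, together with its binomial uncertainty.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Fraction of null toys with a test statistic at least as large as qObs.
double pValueFromToys(const std::vector<double> &nullToys, double qObs)
{
   assert(!nullToys.empty());
   std::size_t nAbove = 0;
   for (double q : nullToys)
      if (q >= qObs) ++nAbove;
   return static_cast<double>(nAbove) / nullToys.size();
}

// Binomial standard error of a p-value estimated from nToys toys.
double pValueError(double p, std::size_t nToys)
{
   assert(nToys > 0);
   return std::sqrt(p * (1.0 - p) / nToys);
}
```

With 2000 null toys the smallest non-zero estimate is 1/2000 = 5e-4. If no toy exceeds the observed value, only an upper bound of that order can be quoted.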

#### Now run the calculator

This can take some time, so be patient.

In [None]:
tw = new TStopwatch(); tw->Start(); 
result = fc.GetHypoTest(); 
result->Print();
tw->Print();
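
A p-value is commonly quoted as a one-sided Gaussian significance. The conversion can be sketched with the standard library alone. The helper names below are hypothetical, and the bisection only covers significances between -10 and 10.

```cpp
#include <cassert>
#include <cmath>

// One-sided Gaussian tail probability P(x > z) for a significance z.
double pValueFromSignificance(double z)
{
   return 0.5 * std::erfc(z / std::sqrt(2.0));
}

// Invert the relation above by bisection on z in [-10, 10].
double significanceFromPValue(double p)
{
   assert(p > 0.0 && p < 1.0);
   double lo = -10.0, hi = 10.0;
   for (int i = 0; i < 200; ++i) {
      const double mid = 0.5 * (lo + hi);
      if (pValueFromSignificance(mid) > p)
         lo = mid; // tail still too large: the solution is at larger z
      else
         hi = mid;
   }
   return 0.5 * (lo + hi);
}
```

Inside ROOT, `RooStats::PValueToSignificance` provides this conversion directly.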

#### Now plot the test statistic distribution for the null hypothesis (B only)

In [None]:
plot = new RooStats::HypoTestPlot(*result);
plot->SetLogYaxis(true);
plot->Draw();
gPad->Draw();

We save the result in a file. We don't want to lose the result if the toys have been running for some time.

In [None]:
fileOut = TFile::Open("HypoTestResult.root","RECREATE");
result->Write();
fileOut->Close();

### Is the test statistic distribution like a chi-square distribution with n.d.f. = 1?

We fit the null test statistic distribution to check whether it is compatible with a $\chi^2$ distribution.

In [None]:
dist = result->GetNullDistribution();
vec = dist->GetSamplingDistribution();
cout << "number of generated null toys = " << vec.size() << endl;

hdist = new TH1D("hdist","Test Statistic distribution",200,0,10);
hdist->FillN(vec.size(),vec.data(),nullptr);
// merge the underflow bin (bin 0, failing fits) into the first bin (bin 1)
hdist->SetBinContent(1, hdist->GetBinContent(0)+hdist->GetBinContent(1));

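Create the fit function as a 1/2 $\chi^2$ distribution, with a special case for the first bin (x < 0.05), which also collects the half of the toys expected at exactly zero.
Note that the quantity plotted is the log-likelihood ratio and not 2 x the log-likelihood ratio, so the $\chi^2$ density is evaluated at 2x.
Here 0.05 is the histogram bin width.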

In [None]:
fchi2 = new TF1("chi2","[](double*x,double*p){ if (x[0] < 0.05) { return 0.5*p[0]+ 0.5*p[0]*ROOT::Math::chisquared_cdf(0.1,p[1]); } else { return 0.05*p[0]*ROOT::Math::chisquared_pdf(2*x[0],p[1]); } }",0.,10.,2,1);
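
The same expectation can be written without ROOT for the special case of one degree of freedom, where the $\chi^2$ density and cumulative distribution have closed forms. This is only a sketch mirroring the two branches of the function above: the names `chi2Pdf1`, `chi2Cdf1` and `expectedBinContent` are hypothetical, and the number of degrees of freedom is fixed to 1 here, whereas the notebook function keeps it as a parameter.

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// chi2 density and cumulative distribution for one degree of freedom.
double chi2Pdf1(double t)
{
   assert(t > 0.0);
   return std::exp(-0.5 * t) / std::sqrt(2.0 * kPi * t);
}
double chi2Cdf1(double t) { return std::erf(std::sqrt(0.5 * t)); }

// Expected content of the histogram bin containing x, for nToys toys,
// where x is the log-likelihood ratio (so the chi2 variable is 2x).
double expectedBinContent(double x, double nToys, double binWidth = 0.05)
{
   if (x < binWidth) // first bin: half the toys at zero plus the chi2 integral
      return 0.5 * nToys + 0.5 * nToys * chi2Cdf1(2.0 * binWidth);
   // other bins: bin width times the density of x, which is N * pdf(2x)
   return binWidth * nToys * chi2Pdf1(2.0 * x);
}
```

Summing `expectedBinContent` over the 200 bin centres gives approximately `nToys`, which is a quick check that the normalisation of the two branches is consistent.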

##### Comparison of test statistic distribution vs 1/2 $\chi^2$ distribution with *ndf = 1*

In [None]:
hdist->Draw();
fchi2->SetParameters(vec.size(),1);
fchi2->SetNpx(1000);
fchi2->SetLineColor(kGreen);
fchi2->DrawCopy("SAME");
fchi2->SetLineColor(kRed);
gPad->Draw();

#### Fit the obtained distribution with 1/2 $\chi^2$

In [None]:
// likelihood fit ("L") using the integral of the function over each bin ("I")
hdist->Fit(fchi2,"L I ","SAME");
gStyle->SetOptFit(1111);
gPad->Draw();