# Example event generation, fitting and ToyMC study

Load the ROOT and fitting modules.
Turn on JavaScript ROOT (JSROOT) for interactive plots.

In [1]:
import ROOT
ROOT.gROOT.ProcessLine(".x $HSCODE/hsfit/LoadFit.C")
%jsroot

Welcome to JupyROOT 6.16/00


Construct a ToyManager to generate the initial data set, which stands in for your real data. The argument 1 tells it to create only one data set per bin.

In [2]:
toy = ROOT.ToyManager(1)

Give an output directory for storing the "data", and name the branch that holds the event ID.

In [3]:
toy.SetUp().SetOutDir("outSimpleToys/");
toy.SetUp().SetIDBranchName("UID");

Declare your fit variable and its range

In [4]:
toy.SetUp().LoadVariable("Mmiss[0,10]");

Declare the PDF to generate from. Here the Signal is a Gaussian with mean 6 (allowed range 4 to 7) and width 0.2 (allowed range 0.0001 to 3).

LoadSpeciesPDF adds this PDF to the total PDF; the 100 is the expected number of events to generate. The actual number will include Poisson fluctuations.

In [5]:
toy.SetUp().FactoryPDF("Gaussian::Signal( Mmiss, Gmean[6,4,7], Gsigma[0.2,0.0001,3] )");
toy.SetUp().LoadSpeciesPDF("Signal",100);
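
To illustrate what a Poisson fluctuation of the requested yields means, here is a minimal standard-library sketch. It is not the ToyManager implementation: the `poisson` helper (Knuth's multiplication method) and the seed are assumptions made for illustration, and the BG yield of 200 is taken from the background cell of this notebook.

```python
import math
import random

def poisson(lam, rng):
    """Draw one Poisson-distributed integer with mean lam (Knuth's method)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(1234)  # fixed seed (hypothetical choice) for reproducibility
expected = {"Signal": 100, "BG": 200}  # expected yields used in this notebook
generated = {name: poisson(mean, rng) for name, mean in expected.items()}
print(generated, "total:", sum(generated.values()))
```

Each toy data set therefore has a slightly different size, scattered around 300 events in total.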

The background, BG, is a Chebychev polynomial with twice as many events (200) as the signal contribution.

In [6]:
toy.SetUp().FactoryPDF("Chebychev::BG(Mmiss,{a0[-0.1,-1,1],a1[0.1,-1,1]})");
toy.SetUp().LoadSpeciesPDF("BG",200);
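
As a rough picture of the background shape, the sketch below evaluates the unnormalised polynomial in plain Python and samples from it by accept-reject. It assumes the usual RooChebychev convention: an implicit constant term of 1, coefficients starting at the first-order term, and the variable range mapped linearly onto [-1, 1]. The function names and the `ceiling` parameter are hypothetical helpers, not part of the fitting package.

```python
import random

def chebychev_bg(x, a0=-0.1, a1=0.1, lo=0.0, hi=10.0):
    """Unnormalised 1 + a0*T1(t) + a1*T2(t), with x mapped linearly onto t in [-1, 1]."""
    t = 2.0 * (x - lo) / (hi - lo) - 1.0
    return 1.0 + a0 * t + a1 * (2.0 * t * t - 1.0)

def sample_bg(n, rng, lo=0.0, hi=10.0, ceiling=1.2):
    """Accept-reject sampling; ceiling must bound chebychev_bg on [lo, hi]."""
    events = []
    while len(events) < n:
        x = rng.uniform(lo, hi)
        if rng.random() * ceiling < chebychev_bg(x):
            events.append(x)
    return events

rng = random.Random(42)  # fixed seed (hypothetical choice)
bg_events = sample_bg(200, rng)
print(len(bg_events), min(bg_events), max(bg_events))
```

With these coefficients the shape is nearly flat, varying only between about 0.89 and 1.2 across the Mmiss range, so most trial points are accepted.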

If I had a real data set with additional branches, I could choose to split the data into bins before performing the fits. Here "Eg" would have to be another branch in the tree. Using it with toy data would just create 5 similar data sets, each of which would be fitted separately.

In [7]:
#toy.Bins().LoadBinVar("Eg",5,3,4);
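
The idea of splitting on a second branch can be sketched in plain Python. This assumes the arguments of `LoadBinVar("Eg",5,3,4)` mean 5 equal-width bins between 3 and 4, which is a reading of the call rather than something stated in this notebook. The `bin_index` helper and the example events are hypothetical.

```python
from collections import defaultdict

def bin_index(value, nbins=5, lo=3.0, hi=4.0):
    """Return the equal-width bin index for value, or None if outside [lo, hi)."""
    if not lo <= value < hi:
        return None
    return int((value - lo) / (hi - lo) * nbins)

# Made-up events, each with the fit variable and the binning variable.
events = [
    {"Mmiss": 5.9, "Eg": 3.05},
    {"Mmiss": 2.1, "Eg": 3.45},
    {"Mmiss": 6.1, "Eg": 3.95},
    {"Mmiss": 7.3, "Eg": 4.20},  # outside the binned range, so dropped
]

binned = defaultdict(list)
for event in events:
    index = bin_index(event["Eg"])
    if index is not None:
        binned[index].append(event)

for index in sorted(binned):
    print("bin", index, "holds", len(binned[index]), "event(s)")
```

Each group would then be fitted on its own, giving one set of fit results per bin.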

Generate the data!

In [8]:
ROOT.Here.Go(toy);

### Fitting the generated data

A ToyManager that has been used to generate data can be used directly to create a fitter with the same model. Alternatively, you can define a completely new fit model here.

In [9]:
fit0=toy.Fitter();

DataEvents::Load ToyData 1


Fit the data on one local CPU.

We should see the minimiser output, with the final fit plots at the end.

In [10]:
ROOT.Here.Go(fit0);

FiledTree::~FiledTree()  tree name ToyData 281 /work/Dropbox/HaSpect/dev/HASPECT6/tutorials/RooFitExamples/Generators/outSimpleToys///Toy0.root
RooDataSet::DataEvents[Mmiss,UID] = 281 entries
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:Minization --  The following expressions will be evaluated in cache-and-track mode: (Signal,BG)
Minuit2Minimizer: Minimize with max-calls 3000 convergence for edm < 1 strategy 1
MnSeedGenerator: for initial parameters FCN = -778.6580373386
MnSeedGenerator: Initial state:   - FCN =  -778.6580373386 Edm =      6.96339 NCalls =     25
VariableMetric: start iterating until Edm is < 0.001
VariableMetric: Initial state   - FCN =  -778.6580373386 Edm =      6.96339 NCalls =     25
VariableMetric: Iteration #   0 - FCN =  -778.6580373386 Edm =      6.96339 NCalls =     25
VariableMetric: Iteration #   1 - FCN =  -785.1912035932 Edm =     0.237473 NCalls =     39
VariableMetric: Iteration #   2 - FCN =  -785.492485

Error in <TTree::SetBranchStatus>: unknown branch -> UID
Info in Minuit2Minimizer::Hesse : Hesse is valid - matrix is accurate
Info in Minuit2Minimizer::Hesse : Hesse is valid - matrix is accurate


### Toy MC study

Now that I have successful fit results, I want to study the fit for bias and similar effects. I can do this by generating many data sets from my fit results and fitting each of them, to check that the extracted parameters are consistent each time.

The ToyMC model, initialised with the fit results from the previous cell, can be built into a ToyManager with one line of code from the fitter (fit0 here); the same call also specifies the number of toy data sets to generate (400). We then set a new output directory and generate the toy data sets.

In [None]:
toy2=ROOT.ToyManager.GetFromFit(400,fit0,"ResultsToy0HSMinuit2.root")
toy2.SetUp().SetOutDir("outSimpleToy2");
ROOT.Here.Go(toy2.get());
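
The logic of such a study can be sketched without ROOT. In the simplified stand-in below, the "fit" is just the sample mean of a Gaussian and its error is the standard error of the mean. The pull of each toy is (fitted value - generated value) / error. The function name, seed and event count are assumptions for illustration; the real study refits the full Signal plus BG model.

```python
import math
import random
import statistics

def toy_pulls(true_mean=6.0, true_sigma=0.2, n_events=100, n_toys=400, seed=1):
    """Generate n_toys samples, 'fit' each one, and return the pulls of the mean."""
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        sample = [rng.gauss(true_mean, true_sigma) for _ in range(n_events)]
        fitted = statistics.fmean(sample)                       # stand-in for the fit
        error = statistics.stdev(sample) / math.sqrt(n_events)  # its estimated error
        pulls.append((fitted - true_mean) / error)
    return pulls

pulls = toy_pulls()
print("mean pull ", statistics.fmean(pulls))
print("pull width", statistics.stdev(pulls))
```

For an unbiased estimator with correctly estimated errors, the pulls should be centred near 0 with a width near 1. A shifted mean points to bias, and a width away from 1 points to over- or under-estimated errors.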

There are going to be 400 toy fits, so let's run them in parallel with PROOF. The next cell is needed to initialise PROOF.

In [12]:
from ROOT import TProof

Get my fitter from the new ToyManager and run the fits on PROOF with 4 workers.

In [None]:
fit2=toy2.Fitter()
ROOT.Proof.Go(fit2,4)

Collect all the fits and create parameter distributions and pulls. This is automated by the ToyManager Summarise function.

In [14]:
toy2.Summarise()

Summarise ResultTree /work/Dropbox/HaSpect/dev/HASPECT6/tutorials/RooFitExamples/Generators/outSimpleToy2/
 ToyManager::Summarise() Initial Parameters
  1) 0x56492f5c9990 RooRealVar::      Gmean = 5.96906  L(4 - 7)  "Gmean"
  2) 0x56492f5d2120 RooRealVar::     Gsigma = 0.206953  L(0.0001 - 3)  "Gsigma"
  3) 0x56492f5950d0 RooRealVar::         a0 = -0.0433  L(-1 - 1)  "a0"
  4) 0x56492f52e730 RooRealVar::         a1 = 0.0877073  L(-1 - 1)  "a1"
  5) 0x56492f5ca1a0 RooRealVar:: Yld_Signal = 109.655  L(0 - 1e+12)  "Yld_Signal"
  6) 0x56492f5ca9a0 RooRealVar::     Yld_BG = 171.461  L(0 - 1e+12)  "Yld_BG"

Gmean 5.96715 +- 0.0237717 sigma 0.0241361 meanPull 0.00223759 sigmaPull 1.02776
      bias -0.00190544 bias Pull -0.0790702 sigma 1.02767


Gsigma 0.207819 +- 0.0204853 sigma 0.0225982 meanPull -0.115595 sigmaPull 1.1047
      bias 0.00086601 bias Pull -0.0722243 sigma 1.10013


a0 -0.0390713 +- 0.132536 sigma 0.140173 meanPull -0.000150034 sigmaPull 1.06967
      bias 0.00422866 bias Pu
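
As a cross-check on how to read this printout, the numbers are consistent with the reported bias being the mean of the toy fits minus the initial (generated) parameter value. That interpretation is inferred from the output rather than from the ToyManager source. The sketch below copies the printed values and compares them.

```python
# Values copied from the Summarise() printout above.
initial = {"Gmean": 5.96906, "Gsigma": 0.206953, "a0": -0.0433}
toy_mean = {"Gmean": 5.96715, "Gsigma": 0.207819, "a0": -0.0390713}
printed_bias = {"Gmean": -0.00190544, "Gsigma": 0.00086601, "a0": 0.00422866}

for name in initial:
    bias = toy_mean[name] - initial[name]
    # The printed inputs are rounded, so agreement is only expected to ~1e-5.
    print(name, "computed", round(bias, 6), "printed", printed_bias[name])
```

The biases are small compared with the quoted spread of each parameter (for example about 0.002 against 0.024 for Gmean).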

In [None]:
file=ROOT.TFile.Open("outSimpleToy2/ToySummary.root")
file.ls()

Draw the histograms from the Toy summary

In [16]:
canvas=ROOT.TCanvas()
canvas.Divide(4,2)
#Draw distributions
canvas.cd(1)
ROOT.gDirectory.Get("Gmean").Draw()
canvas.cd(2)
ROOT.gDirectory.Get("Gsigma").Draw()
canvas.cd(3)
ROOT.gDirectory.Get("a0").Draw()
canvas.cd(4)
ROOT.gDirectory.Get("a1").Draw()
#Draw pulls
canvas.cd(5)
ROOT.gDirectory.Get("Gmean_pull").Draw()
canvas.cd(6)
ROOT.gDirectory.Get("Gsigma_pull").Draw()
canvas.cd(7)
ROOT.gDirectory.Get("a0_pull").Draw()
canvas.cd(8)
ROOT.gDirectory.Get("a1_pull").Draw()

canvas.Draw()