# sPlot with simple Gaussian and polynomial PDFs, splitting the data into 4 separate fits

Fit a pseudo-data missing-mass distribution to produce sWeights, then use them to plot signal-weighted distributions of other variables.

Here we use the Eg variable to split the data into 4 separate datasets and fit each one, using ROOT PROOF to parallelise the fits. At the end, the weights are combined back into a single set for drawing the integrated distributions.

Note the weights are stored in a Weights object, which can then be loaded into other fits. [Weights.h](https://github.com/dglazier/brufit/blob/master/core/Weights.h)

Load brufit using ROOT python bindings and initialise jsroot for drawing histograms.

In [1]:
import ROOT
ROOT.gROOT.ProcessLine(".x $BRUFIT/macros/LoadBru.C")
%jsroot

Welcome to JupyROOT 6.16/00


First you will need to generate some data. This is done with a ROOT macro that generates random numbers from TF1 functions. It can be executed from the [GenerateData](GenerateData.ipynb) notebook.

Create the sPlot fit manager and set the output directory for fit results, plots and weights.

In [2]:
splot = ROOT.sPlot()
splot.SetUp().SetOutDir("outBins")

Define the fit variable as Mmiss which is the name of a branch in the tree and set the fit range to 0-10

In [3]:
splot.SetUp().LoadVariable("Mmiss[0,10]");

Set the name of the event ID variable. The input tree should have a double branch with a unique event ID number, in this case it is fgID.

In [4]:
splot.SetUp().SetIDBranchName("fgID")
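
The unique ID matters here because the data are fitted in separate bins and the resulting weights have to be matched back to the right events afterwards. The sketch below shows that idea in plain Python. It is a conceptual illustration only, not brufit's code: `merge_weights`, the dictionary layout and the numbers are all hypothetical.

```python
def merge_weights(per_bin_weights):
    """Combine per-bin {event_id: weights} maps into one map ordered by event ID.

    Hypothetical helper: raises if an ID shows up in more than one bin,
    which is what a non-unique ID branch would cause.
    """
    merged = {}
    for bin_name, weights in per_bin_weights.items():
        for event_id, species_weights in weights.items():
            if event_id in merged:
                raise ValueError(f"event ID {event_id} appears in more than one bin")
            merged[event_id] = species_weights
    return dict(sorted(merged.items()))


# Made-up weights for three events that landed in two different Eg bins.
per_bin = {
    "Eg3.12_": {2: {"Signal": 1.17, "BG": -0.17}},
    "Eg3.62_": {0: {"Signal": 1.11, "BG": -0.11},
                1: {"Signal": -0.44, "BG": 1.44}},
}
merged = merge_weights(per_bin)
print(list(merged))  # event IDs in ascending order
```

With unique IDs each event receives exactly one set of weights, whichever bin it was fitted in.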

Make a signal PDF. Here we will use a Gaussian distribution with mean and width parameters smean (initial value 6, allowed values between 4-7) and swidth (initial value 0.2, allowed values between 0.0001 and 3). The PDF is given the name Signal.

We then load it into the total fit PDF with LoadSpeciesPDF.

In [5]:
splot.SetUp().FactoryPDF("Gaussian::Signal( Mmiss, smean[6,4,7], swidth[0.2,0.0001,3] )");
splot.SetUp().LoadSpeciesPDF("Signal")
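
The bracket notation in the factory strings follows the RooFit workspace-factory convention: `name[value,min,max]` for a parameter with an initial value and limits, and `name[min,max]` for a range only, as used for `Mmiss[0,10]`. As a reading aid, here is a small plain-Python sketch that unpacks such a specification. `parse_var` is a hypothetical helper written for this note and is not part of brufit or RooFit.

```python
import re


def parse_var(spec):
    """Split 'name[a,b,c]' or 'name[a,b]' into name, initial value and limits."""
    match = re.fullmatch(r"\s*(\w+)\[([^\]]*)\]\s*", spec)
    if match is None:
        raise ValueError(f"not a variable specification: {spec!r}")
    name = match.group(1)
    numbers = [float(v) for v in match.group(2).split(",")]
    if len(numbers) == 3:
        value, lo, hi = numbers          # initial value, lower limit, upper limit
    elif len(numbers) == 2:
        value, (lo, hi) = None, numbers  # range only, no explicit initial value
    else:
        raise ValueError(f"expected 2 or 3 numbers in {spec!r}")
    return {"name": name, "value": value, "min": lo, "max": hi}


print(parse_var("smean[6,4,7]"))
print(parse_var("swidth[0.2,0.0001,3]"))
print(parse_var("Mmiss[0,10]"))
```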

Make a background PDF as a 2nd-degree Chebychev polynomial with coefficients a0 (initial value -0.1, allowed values between -1 and 1) and a1 (initial value 0.1, same allowed range). The PDF is given the name BG.

In [6]:
splot.SetUp().FactoryPDF("Chebychev::BG(Mmiss,{a0[-0.1,-1,1],a1[0.1,-1,1]})");
splot.SetUp().LoadSpeciesPDF("BG",1)
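
To get a feel for the background shape at the starting values, the sketch below evaluates the unnormalised polynomial in plain Python. It assumes the RooChebychev convention, in which the constant term has a fixed coefficient of 1, the supplied list starts at the first-order term, and the fit range is mapped onto [-1, 1]. `chebychev_shape` is a hypothetical helper, not a brufit or RooFit function.

```python
def chebychev_shape(x, coeffs, lo, hi):
    """Unnormalised Chebychev sum 1 + c0*T1(t) + c1*T2(t) + ... with t in [-1, 1]."""
    t = 2.0 * (x - lo) / (hi - lo) - 1.0  # map [lo, hi] onto [-1, 1]
    total = 1.0                           # T0 term, coefficient fixed to 1
    t_prev, t_curr = 1.0, t               # T0 and T1
    for c in coeffs:
        total += c * t_curr
        # Recurrence T_{n+1}(t) = 2 t T_n(t) - T_{n-1}(t)
        t_prev, t_curr = t_curr, 2.0 * t * t_curr - t_prev
    return total


start_coeffs = [-0.1, 0.1]  # initial values of a0 and a1 from the cell above
for x in (0.0, 5.0, 10.0):
    print(x, chebychev_shape(x, start_coeffs, 0.0, 10.0))
```

At the starting values the shape varies only mildly across the Mmiss range, so the fit begins from an almost flat background.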

Split the data into 4 bins based on the Eg variable. You can add any number of these splits based on different variables. You can also provide variable-width bins, e.g. ("Eg",4,{3,3.1,3.3,3.7,4}), where 4 is the number of bins.

Note this line is the only difference between running on split and unsplit data.

In [7]:
splot.Bins().LoadBinVar("Eg",4,3,4);
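
As a quick check of what this split means, the sketch below computes the centres of 4 equal-width bins between 3 and 4 and assigns a value to a bin. The `Eg%.2f_` label format is an inference from the output directory names in this notebook (`Eg3.12_`, `Eg3.38_`, ...), not something taken from the brufit source, and both helper functions are hypothetical.

```python
from bisect import bisect_right


def bin_centres(lo, hi, nbins):
    """Centres of nbins equal-width bins covering [lo, hi]."""
    width = (hi - lo) / nbins
    return [lo + (i + 0.5) * width for i in range(nbins)]


def find_bin(value, edges):
    """Index of the bin containing value, or None if it lies outside the edges."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


centres = bin_centres(3.0, 4.0, 4)
labels = ["Eg%.2f_" % c for c in centres]  # assumed naming, see note above
print(centres)
print(labels)

# The same lookup works for variable-width edges.
variable_edges = [3.0, 3.1, 3.3, 3.7, 4.0]
print(find_bin(3.25, variable_edges))
```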

Load the data giving the tree name then the file name.

In [8]:
dirname=ROOT.gSystem.pwd();
splot.LoadData("MyModel",dirname+"/Data.root")

SplitData /work/Dropbox/HaSpect/dev/brufit/outBins/ 0
Bins::RunBinTree Running bins from 0 to 4
Constructing Bin Tree /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.12_/TreeData
Constructing Bin Tree /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.38_/TreeData
Constructing Bin Tree /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.62_/TreeData
Constructing Bin Tree /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.88_/TreeData
On event 0 = 0%
BinTree::Save() /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.12_/TreeData
BinTree::Save() /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.38_/TreeData
BinTree::Save() /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.62_/TreeData
BinTree::Save() /work/Dropbox/HaSpect/dev/brufit/outBins//Eg3.88_/TreeData
FiledTree::~FiledTree()  tree name MyModel 100000 /work/Dropbox/HaSpect/dev/brufit/tutorials/sPlotSimple/Data.root
DataEvents::Load MyModel with 4 files


Error in <TTreeFormula::Compile>:  Empty String
Info in <HS::FIT::Bins:: Bins::Save()>:  Saving HSBins to /work/Dropbox/HaSpect/dev/brufit/outBins/DataBinsConfig.root


Run the fit with PROOF using 4 workers; this also creates the sWeights for each event. When it finishes, it displays a plot of the signal and background fit to Mmiss, together with the residual and pull plots between the fit and the data.


In [9]:
ROOT.Proof.Go(splot,4)

 Proof :: Go 0
 +++ Starting PROOF-Lite with 12 workers +++
PROOF set to parallel mode (12 workers)
PROOF set to parallel mode (4 workers)
PROOF set to parallel mode (4 workers)
15:04:47 10675 Wrk-0.0 | Info in <TProofServLite::HandleCache>: loading macro libRooStats.so ...

RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt

15:04:47 10683 Wrk-0.2 | Info in <TProofServLite::HandleCache>: loading macro libRooStats.so ...

RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt

15:04:47 10677 Wrk-0.1 | Info in <TProofServLite::HandleCache>: loading macro libRooStats.so ...

RooFit v

Opening connections to workers: OK (12 workers)
Setting up worker servers: 8 out of 12 (66 %) ...
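
For orientation, the sWeights produced by such a fit follow the standard sPlot prescription: the weight of species n for an event is sum_j V_nj f_j(x) / sum_k N_k f_k(x), where N_k are the fitted yields, f_k the PDF values and V the covariance matrix of the yields. The toy below is a plain-Python sketch of that formula and is not brufit's implementation. It assumes fixed shapes (a Gaussian peak on a flat background), fits only the two yields with a simple iterative update, and uses made-up event numbers and helper names.

```python
import math
import random


def gauss(x, mean, width):
    """Gaussian density."""
    return math.exp(-0.5 * ((x - mean) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def flat(x, lo=0.0, hi=10.0):
    """Uniform density on [lo, hi]."""
    return 1.0 / (hi - lo)


# Toy sample: 300 peak events on top of 700 flat events.
random.seed(1)
events = [random.gauss(6.0, 0.2) for _ in range(300)]
events += [random.uniform(0.0, 10.0) for _ in range(700)]
pdf_vals = [(gauss(x, 6.0, 0.2), flat(x)) for x in events]

# Maximum-likelihood yields for fixed shapes, found by repeated reassignment
# of each event's signal and background fractions.
yields = [500.0, 500.0]
for _ in range(200):
    updated = [0.0, 0.0]
    for f_sig, f_bg in pdf_vals:
        denom = yields[0] * f_sig + yields[1] * f_bg
        updated[0] += yields[0] * f_sig / denom
        updated[1] += yields[1] * f_bg / denom
    yields = updated

# Inverse covariance of the yields: V^-1_nj = sum_events f_n f_j / (sum_k N_k f_k)^2
a = b = c = 0.0
for f_sig, f_bg in pdf_vals:
    denom2 = (yields[0] * f_sig + yields[1] * f_bg) ** 2
    a += f_sig * f_sig / denom2
    b += f_sig * f_bg / denom2
    c += f_bg * f_bg / denom2
det = a * c - b * b
cov = [[c / det, -b / det], [-b / det, a / det]]  # inverse of the 2x2 matrix

# sWeights for every event.
w_sig, w_bg = [], []
for f_sig, f_bg in pdf_vals:
    denom = yields[0] * f_sig + yields[1] * f_bg
    w_sig.append((cov[0][0] * f_sig + cov[0][1] * f_bg) / denom)
    w_bg.append((cov[1][0] * f_sig + cov[1][1] * f_bg) / denom)

print("fitted yields:", yields)
print("sum of signal weights:", sum(w_sig))
```

Two properties are worth noting: the weights of one event sum to one across species, and the signal weights sum to the fitted signal yield. Individual weights can be negative or larger than one.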

To see the results we have to open each Results file and get the canvas.

In [10]:
ls outBins

DataBinsConfig.root    Eg3.38_/  HSFit.root           WeightsEg3.38_.root
DataWeightedTree.root  Eg3.62_/  Tweights.root        WeightsEg3.62_.root
Eg3.12_/               Eg3.88_/  WeightsEg3.12_.root  WeightsEg3.88_.root


Open the ROOT file and get the canvas showing Mmiss.

Note that the ROOT file must be closed after use so that it does not interfere with subsequent cells.

In [11]:
from ROOT import TFile
fileEg1=TFile.Open("outBins/Eg3.12_/ResultsHSMinuit2.root")
fileEg1.ls()
fileEg1.Get("Eg3.12__Mmiss").Draw()
fileEg1.Close()

TFile**		outBins/Eg3.12_/ResultsHSMinuit2.root	
 TFile*		outBins/Eg3.12_/ResultsHSMinuit2.root	
  KEY: RooDataSet	FinalParameters;1	HSMinuit2Results
  KEY: TProcessID	ProcessID0;1	a50d47b4-fa8e-11e9-9557-0101007fbeef
  KEY: TTree	ResultTree;1	ResultTree
  KEY: RooFitResult	MinuitResult;1	Result of fit of p.d.f. Eg3.12_TotalPDF to dataset DataEvents
  KEY: TCanvas	Eg3.12__Mmiss;1	Eg3.12__Mmiss


Now we will draw the resulting weighted distributions. First create canvases for drawing on.

In [12]:
from ROOT import TCanvas
canvas = TCanvas("WeightedPlots","WeightedPlots")
canvas.Divide(2,2);

Now get the truth tree so we can compare true to weighted distributions.

In [13]:
fileTree =  ROOT.FiledTree.Read("MyModel","Data.root")
trueTree = fileTree.Tree()

And draw the weighted distributions using sPlot::DrawWeighted. The first string takes standard ROOT TTree::Draw arguments. The second is the name of the PDF corresponding to the species you want to draw.

In this case the histograms integrate over the binned variable Eg, using the weights from the 4 different fits.

trueTree is just a normal ROOT TTree.

The plots will appear when canvas.Draw() is called, which is in the following cell after the background histograms have been created. The weighted distributions will be points with error bars, the true distributions will be solid line histograms.

In [14]:
canvas.cd(1)
splot.DrawWeighted("M1>>hM1(100,0,10)","Signal");
trueTree.Draw("M1","Sig==1","same hist");
canvas.cd(2)
splot.DrawWeighted("M2>>hM2(100,0,10)","Signal");
trueTree.Draw("M2","Sig==1","same hist");

Weights::Merge Merging Weights* in directory /work/Dropbox/HaSpect/dev/brufit/outBins
Insert species BG 0 0
Insert species Signal 1 0
Weights::SortWeights() Clone tree to save 
Weights HSsWeights contains 100000 events associated file is /work/Dropbox/HaSpect/dev/brufit/outBins//Tweights.root 
ID branch name : fgID
Species are : 
BG
Signal
The first ten entries are :
0 -0.107301 1.1073 
1 1.44268 -0.44269 
2 -0.168291 1.16834 
3 1.16659 -0.166587 
4 1.19305 -0.193045 
5 -0.0696369 1.06968 
6 1.44541 -0.445415 
7 0.04804 0.952 
8 1.45094 -0.450929 
9 1.45095 -0.450944 
These weights are combined from :
Eg3.62_
Eg3.88_
Eg3.12_
Eg3.38_
FiledTree::~FiledTree()  tree name MyModel 100000 /work/Dropbox/HaSpect/dev/brufit/tutorials/sPlotSimple/Data.root
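
A quick sanity check on the printout above: for a fit at its likelihood maximum, the sWeights of one event summed over all species should come out at one. The sketch below simply parses the ten printed rows and adds up the two weight columns. The printout does not label the columns, so no assumption is made about which one is Signal and which is BG.

```python
# The ten rows copied from the Weights printout: event ID followed by two weights.
log_rows = """\
0 -0.107301 1.1073
1 1.44268 -0.44269
2 -0.168291 1.16834
3 1.16659 -0.166587
4 1.19305 -0.193045
5 -0.0696369 1.06968
6 1.44541 -0.445415
7 0.04804 0.952
8 1.45094 -0.450929
9 1.45095 -0.450944
"""

sums = {}
for line in log_rows.splitlines():
    event_id, w_a, w_b = line.split()
    sums[int(event_id)] = float(w_a) + float(w_b)

max_dev = max(abs(s - 1.0) for s in sums.values())
print("largest deviation from 1:", max_dev)
```

The deviations are at the level of the printed precision, which is what one would expect from a converged fit.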


Now for our background weighted distributions.

In [15]:
canvas.cd(3)
splot.DrawWeighted("M1>>hM1_BG(100,0,10)","BG");
trueTree.Draw("M1","Sig==-1","same hist");
canvas.cd(4)
splot.DrawWeighted("M2>>hM2_BG(100,0,10)","BG");
trueTree.Draw("M2","Sig==-1","same hist");
canvas.Draw()
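
The error bars on weighted points deserve a comment. In a ROOT histogram that stores the sum of squared weights, the bin content is the sum of the weights and the bin error is the square root of the sum of their squares. The plain-Python sketch below reproduces that arithmetic. It assumes DrawWeighted fills ordinary weighted histograms in this way, which has not been checked against the brufit source, and `weighted_hist` and the sample numbers are hypothetical.

```python
import math


def weighted_hist(values, weights, nbins, lo, hi):
    """Return per-bin sums of weights and errors sqrt(sum of squared weights)."""
    width = (hi - lo) / nbins
    sum_w = [0.0] * nbins
    sum_w2 = [0.0] * nbins
    for value, weight in zip(values, weights):
        if lo <= value < hi:  # ignore under- and overflow in this sketch
            index = int((value - lo) / width)
            sum_w[index] += weight
            sum_w2[index] += weight * weight
    errors = [math.sqrt(s) for s in sum_w2]
    return sum_w, errors


# Three made-up events with sWeight-like weights, including a negative one.
contents, errors = weighted_hist([1.5, 1.7, 8.2], [1.1, -0.1, 0.9], 10, 0.0, 10.0)
print(contents[1], errors[1])
print(contents[8], errors[8])
```

A negative weight lowers the bin content but still increases the error, which is why sWeighted points often carry larger error bars than the corresponding true histogram would suggest.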

To save the weighted tree with all branches, including the weights, use the following line.

In [16]:
splot.DeleteWeightedTree()

FiledTree::~FiledTree()  tree name MyModel 100000 /work/Dropbox/HaSpect/dev/brufit/outBins/DataWeightedTree.root
