# sPlot with simple Gaussian and polynomial PDFs, splitting the data into 4 separate fits

Fit a pseudo-data missing-mass distribution to produce sWeights then use them to plot signal-weighted distributions for other variables.
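For orientation, the standard sPlot definition of an sWeight can be written as follows. The notation is introduced here for illustration and does not appear in the `brufit` code: $m_e$ is the `Mmiss` value of event $e$, $f_k$ is the fitted PDF of species $k$ (here Signal or BG), $N_k$ is its fitted yield, and $V$ is the covariance matrix of the yields.

```latex
w_n(m_e) = \frac{\sum_{j} V_{nj}\, f_j(m_e)}{\sum_{k} N_k\, f_k(m_e)},
\qquad
\left(V^{-1}\right)_{nj} = \sum_{e} \frac{f_n(m_e)\, f_j(m_e)}{\left(\sum_{k} N_k\, f_k(m_e)\right)^{2}}
```

Histogramming another variable with the weights $w_n$ reproduces the distribution of species $n$, provided that variable is independent of `Mmiss` within each species.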

Here we use the `Eg` variable to split the data into 4 separate datasets and perform a fit on each one, using ROOT PROOF to parallelise the fits. At the end, the weights are combined back into a single set for drawing the integrated distributions.

Note the weights are stored in a `Weights` object which then can be loaded into other fits. [Weights.h](https://github.com/dglazier/brufit/blob/master/core/Weights.h)

Load `brufit` using ROOT Python bindings and initialise `jsroot` for drawing histograms.

In [None]:
import ROOT
ROOT.gROOT.ProcessLine(".x $BRUFIT/macros/LoadBru.C")
%jsroot

First you will need to generate some data. This is done with a ROOT macro that generates random numbers from `TF1` functions. It can be executed from the [Generate Data](GenerateData.ipynb) notebook.

Create the `sPlot` fit manager and set the output directory for fit results, plots, and weights.

In [None]:
splot = ROOT.sPlot()
splot.SetUp().SetOutDir("outBins")

Define the fit variable as `Mmiss`, which is the name of a branch in the tree, and set the fit range to 0-10.

In [None]:
splot.SetUp().LoadVariable("Mmiss[0,10]")

Set the name of the event ID variable. The input tree should have a double branch containing a unique event ID number; in this case it is `fgID`.

In [None]:
splot.SetUp().SetIDBranchName("fgID")
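The role of the unique ID can be illustrated with a small pure-Python sketch. It is not the `brufit` `Weights` implementation: `combine_weights` and `attach_weights` are hypothetical helpers, and the dictionaries stand in for the weights produced by the separate fits. The point is only that weights from several fits can be merged and matched back to events as long as every event carries a unique ID.

```python
def combine_weights(per_fit_weights):
    """Merge {event_id: {species: weight}} maps from separate fits into one lookup."""
    combined = {}
    for weights in per_fit_weights:
        for event_id, species_weights in weights.items():
            if event_id in combined:
                # A repeated ID would make the event-to-weight match ambiguous.
                raise ValueError(f"event ID {event_id} appears in more than one fit")
            combined[event_id] = dict(species_weights)
    return combined


def attach_weights(events, combined, species, id_name="fgID"):
    """Pair each event with its weight for one species, skipping events with no weight."""
    return [
        (event, combined[event[id_name]][species])
        for event in events
        if event[id_name] in combined
    ]


# Toy weights from two separate fits (values are made up for illustration).
fit_a = {0.0: {"Signal": 0.9, "BG": 0.1}, 1.0: {"Signal": -0.2, "BG": 1.2}}
fit_b = {2.0: {"Signal": 0.4, "BG": 0.6}}
combined = combine_weights([fit_a, fit_b])

# Event 3.0 was outside every fit range, so it has no weight and is dropped.
events = [{"fgID": 0.0, "M1": 5.1}, {"fgID": 2.0, "M1": 6.3}, {"fgID": 3.0, "M1": 1.0}]
weighted = attach_weights(events, combined, "Signal")
```

Raising on a duplicate ID is a design choice of this sketch: it makes a non-unique ID branch fail loudly rather than silently overwrite a weight.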

Make a signal PDF. Here we will use a Gaussian distribution with mean and width parameters `smean` (initial value 6, allowed values between 4 and 7) and `swidth` (initial value 0.2, allowed values between 0.0001 and 3). The PDF is given the name `Signal`.

We then load it into the total fit PDF with `LoadSpeciesPDF`.

In [None]:
splot.SetUp().FactoryPDF("Gaussian::Signal( Mmiss, smean[6,4,7], swidth[0.2,0.0001,3] )")
splot.SetUp().LoadSpeciesPDF("Signal")
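The bracket notation in the factory string follows the RooFit workspace factory convention: `name[value,min,max]` gives an initial value and limits, while `name[min,max]` gives only a range. The sketch below is a hypothetical helper (`parse_var_spec` is not part of ROOT or `brufit`) that decodes such a specification, which can be handy for checking that the initial value lies inside the limits before running a fit.

```python
import re

# Matches specifications such as "smean[6,4,7]" or "Mmiss[0,10]".
_SPEC = re.compile(r"^\s*(\w+)\s*\[([^\]]*)\]\s*$")


def parse_var_spec(spec):
    """Return (name, initial, low, high); initial is None for a range-only spec."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"not a name[...] specification: {spec!r}")
    name = match.group(1)
    numbers = [float(token) for token in match.group(2).split(",")]
    if len(numbers) == 3:
        # name[value,min,max]
        initial, low, high = numbers
    elif len(numbers) == 2:
        # name[min,max]
        initial = None
        low, high = numbers
    else:
        raise ValueError(f"expected 2 or 3 numbers in {spec!r}")
    if low > high:
        raise ValueError(f"lower limit above upper limit in {spec!r}")
    if initial is not None and not low <= initial <= high:
        raise ValueError(f"initial value outside limits in {spec!r}")
    return name, initial, low, high


smean = parse_var_spec("smean[6,4,7]")
swidth = parse_var_spec("swidth[0.2,0.0001,3]")
mmiss = parse_var_spec("Mmiss[0,10]")
```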

Make a background PDF as a 2nd degree Chebychev polynomial with coefficients `a0` (starting at -0.1) and `a1` (starting at 0.1), both varying between -1 and +1. The PDF is given the name `BG`.

In [None]:
splot.SetUp().FactoryPDF("Chebychev::BG(Mmiss,{a0[-0.1,-1,1],a1[0.1,-1,1]})")
splot.SetUp().LoadSpeciesPDF("BG",1)
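To see what shape these coefficients describe, the following pure-Python sketch evaluates the unnormalised Chebychev form 1 + a0·T1(x') + a1·T2(x'), where x' is `Mmiss` mapped linearly from the fit range onto [-1, 1]. This is the convention used by `RooChebychev` as far as I am aware (the constant term is fixed to 1); the function `chebychev_shape` itself is a hypothetical helper, and normalisation over the fit range is deliberately left out.

```python
def chebychev_shape(x, coeffs, x_min, x_max):
    """Unnormalised 1 + sum_i coeffs[i-1] * T_i(x'), with x' mapped onto [-1, 1]."""
    xp = (2.0 * x - (x_min + x_max)) / (x_max - x_min)
    t_prev, t_curr = 1.0, xp  # T_0 and T_1
    total = 1.0               # coefficient of T_0 is fixed to 1
    for coeff in coeffs:
        total += coeff * t_curr
        # Chebychev recurrence: T_{n+1} = 2 x' T_n - T_{n-1}
        t_prev, t_curr = t_curr, 2.0 * xp * t_curr - t_prev
    return total


# Initial values from the cell above: a0 = -0.1, a1 = 0.1, fit range 0-10.
start = [-0.1, 0.1]
at_low = chebychev_shape(0.0, start, 0.0, 10.0)
at_mid = chebychev_shape(5.0, start, 0.0, 10.0)
at_high = chebychev_shape(10.0, start, 0.0, 10.0)
```

With these starting values the background is gently curved, highest at the lower edge of the range and lowest near the middle.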

Split the data into 4 bins based on the `Eg` variable. Note that you can add any number of these splits based on different variables. You can also provide variable-width bins, `("Eg",4,{3,3.1,3.3,3.7,4})`, where 4 is the number of bins and the list gives the 5 bin edges.

In this case, after performing the individual fits, all weights will be combined into a single file, `OUTDIR/Tweights.root`.

Note this line is the only difference between running split and non-split data.

In [None]:
splot.Bins().LoadBinVar("Eg",4,3,4)
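As a rough picture of what this binning means, the sketch below builds the 4 equal-width `Eg` bins between 3 and 4 and assigns values to them. The helpers `uniform_edges` and `find_bin` are hypothetical and written in plain Python. The directory-style labels are an assumption inferred from the `Eg3.12_` path used later in this notebook (bin centre printed to two decimals), not taken from the `brufit` source.

```python
from bisect import bisect_right


def uniform_edges(n_bins, low, high):
    """Edges of n_bins equal-width bins covering [low, high]."""
    width = (high - low) / n_bins
    return [low + i * width for i in range(n_bins + 1)]


def find_bin(value, edges):
    """Index of the bin containing value, or None if outside [edges[0], edges[-1])."""
    if not edges[0] <= value < edges[-1]:
        return None
    return bisect_right(edges, value) - 1


edges = uniform_edges(4, 3.0, 4.0)
centres = [0.5 * (lo + hi) for lo, hi in zip(edges, edges[1:])]

# Assumed naming convention: variable name + bin centre to 2 decimals + "_".
labels = [f"Eg{centre:.2f}_" for centre in centres]

# The same lookup works for variable-width bins given as explicit edges.
variable_edges = [3.0, 3.1, 3.3, 3.7, 4.0]
bin_uniform = find_bin(3.3, edges)
bin_variable = find_bin(3.5, variable_edges)
```

Events with `Eg` outside the overall range fall in no bin, so they take part in no fit and receive no weight.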

Load the data giving the tree name then the file name.

In [None]:
dirname=ROOT.gSystem.pwd()
splot.LoadData("MyModel",dirname+"/Data.root")

Run the fit with PROOF using 4 workers. This will also create the sWeights for each event. When it is finished, it will display a plot of the signal and background fit to `Mmiss`, along with the residual and pull plots between the fit and the data.

In [None]:
ROOT.Proof.Go(splot,4)
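For readers unfamiliar with the residual and pull plots, here is a minimal sketch of the quantities involved, assuming binned data and simple symmetric Poisson errors (the square root of the bin content). RooFit's own pull histograms may treat the errors differently, so take this as an illustration of the idea rather than a reproduction of the plots. `residuals_and_pulls` is a hypothetical helper.

```python
import math


def residuals_and_pulls(counts, expected):
    """Per-bin residual (data - fit) and pull (residual / sqrt(data))."""
    residuals, pulls = [], []
    for n, f in zip(counts, expected):
        residual = n - f
        residuals.append(residual)
        # The pull is undefined for an empty bin under the sqrt(n) error assumption.
        pulls.append(residual / math.sqrt(n) if n > 0 else None)
    return residuals, pulls


# Toy bin contents and fit expectations (made up for illustration).
data_counts = [100, 81, 0]
fit_values = [90.0, 90.0, 1.0]
residuals, pulls = residuals_and_pulls(data_counts, fit_values)
```

For a good fit the pulls should scatter around zero with a spread of roughly one.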

To see the results we have to open each results file and get the canvas.

In [None]:
ls outBins

Open the ROOT file and get the canvas showing `Mmiss`.

Note that the ROOT file must be closed after use so that it does not interfere with subsequent cells.

In [None]:
from ROOT import TFile
fileEg1=TFile.Open("outBins/Eg3.12_/ResultsHSMinuit2.root")
fileEg1.ls()
fileEg1.Get("Eg3.12__Mmiss").Draw()
fileEg1.Close()

Now we will draw the resulting weighted distributions. First create canvases for drawing on.

In [None]:
from ROOT import TCanvas
canvas = TCanvas("WeightedPlots","WeightedPlots")
canvas.Divide(2,2)

Now get the truth tree so we can compare true to weighted distributions.

In [None]:
fileTree = ROOT.FiledTree.Read("MyModel","Data.root")
trueTree = fileTree.Tree()

And draw the weighted distributions using `sPlot::DrawWeighted`. The first string takes standard ROOT `TTree::Draw` arguments. The second is the name of the PDF corresponding to the species you want to draw.

In this case the histograms integrate over the binned variable `Eg`, using the weights from the 4 different fits.

`trueTree` is just a normal ROOT TTree.

The plots will appear when `canvas.Draw()` is called, which is in the following cell after the background histograms have been created. The weighted distributions will be red points with error bars, the true distributions will be blue solid-line histograms.

In [None]:
canvas.cd(1)
splot.DrawWeighted("M1>>hM1(100,0,10)","Signal")
ROOT.gDirectory.Get("hM1").SetLineColor(ROOT.kRed + 1)
trueTree.Draw("M1","Sig==1","same hist")
canvas.cd(2)
splot.DrawWeighted("M2>>hM2(100,0,10)","Signal")
ROOT.gDirectory.Get("hM2").SetLineColor(ROOT.kRed + 1)
trueTree.Draw("M2","Sig==1","same hist")

Now the same for our background weighted distributions.

In [None]:
canvas.cd(3)
splot.DrawWeighted("M1>>hM1_BG(100,0,10)","BG")
ROOT.gDirectory.Get("hM1_BG").SetLineColor(ROOT.kRed + 1)
trueTree.Draw("M1","Sig==-1","same hist")
canvas.cd(4)
splot.DrawWeighted("M2>>hM2_BG(100,0,10)","BG")
ROOT.gDirectory.Get("hM2_BG").SetLineColor(ROOT.kRed + 1)
trueTree.Draw("M2","Sig==-1","same hist")
canvas.Draw()

If you want to save the weighted tree with all branches, including the weights, you need to use the following line:

In [None]:
splot.DeleteWeightedTree()