# Pileup Exercises 


## Measuring pileup
Before we get into mitigating pileup effects, let's first examine how pileup is measured. We will discuss event-by-event variables that characterize the pileup, which will give us some hints about how to deal with it.

If you are familiar with the ROOT command line (Cling), all of the quantities we want to look at can be computed interactively. However, to move things along, we have provided a set of Python commands that display the necessary information.

In [None]:
# Loads the ROOT environment and style
import ROOT as r
from collections import OrderedDict
from Analysis.JMEDAS.tdrstyle_mod14 import *

# Set the ROOT style
r.gROOT.Macro("rootlogon.C")
setTDRStyle()

In [None]:
# Settings for each of the pads in the canvas:
# (pad number, x min, x max, y min, y max, x-axis title, y-axis title)
settingsTMP = {'X'         : (4,0.0,60.0,0.0,0.04,"X","a.u."),
               'RhoVsNpv'  : (2,0.0,60.0,0.0,60.0,"N_{PV}","#rho"),
               'NpvVsTnpu' : (1,0.0,60.0,0.0,60.0,"#mu","N_{PV}"),
               'RhoVsTnpu' : (3,0.0,60.0,0.0,60.0,"#mu","#rho")
               }
settings = OrderedDict(sorted(settingsTMP.items(), key=lambda x:x[1], reverse=False))

# Create and draw the canvas
frames = []
for f, s in enumerate(settings) :
    frame = r.TH1D()
    frames.append(frame)
    frames[f].GetXaxis().SetLimits(settings[s][1],settings[s][2])
    frames[f].GetYaxis().SetRangeUser(settings[s][3],settings[s][4])
    frames[f].GetXaxis().SetTitle(settings[s][5])
    frames[f].GetYaxis().SetTitle(settings[s][6])
c = tdrCanvasMultipad("c",frames,14,10,2,2)

# Open the ROOT file with the ntuple
f = r.TFile("JECNtuple_MiniAOD.root")

# Access and store the necessary trees
tAK4PFchs   = f.Get("AK4PFCHSL1L2L3/t")

# Create some histograms
hAK4PFchs_npv        = r.TH1D("hAK4PFchs_npv","hAK4PFchs_npv",60,0,60)
hAK4PFchs_rho        = r.TH1D("hAK4PFchs_rho","hAK4PFchs_rho",60,0,60)
hAK4PFchs_npu        = r.TH1D("hAK4PFchs_npu","hAK4PFchs_npu",60,0,60)
hAK4PFchs_tnpu       = r.TH1D("hAK4PFchs_tnpu","hAK4PFchs_tnpu",60,0,60)
hAK4PFchs_rhovsnpv   = r.TH2F("hAK4PFchs_rhovsnpv","hAK4PFchs_rhovsnpv",60,0,60,60,0,60)
hAK4PFchs_npvvsnpu   = r.TH2F("hAK4PFchs_npvvsnpu","hAK4PFchs_npvvsnpu",60,0,60,60,0,60)
hAK4PFchs_rhovsnpu   = r.TH2F("hAK4PFchs_rhovsnpu","hAK4PFchs_rhovsnpu",60,0,60,60,0,60)

# Fill the histograms
tAK4PFchs.Draw("npv:tnpus[12]>>hAK4PFchs_npvvsnpu","","goff")
tAK4PFchs.Draw("rho:npv>>hAK4PFchs_rhovsnpv","","goff")
tAK4PFchs.Draw("rho:tnpus[12]>>hAK4PFchs_rhovsnpu","","goff")

tAK4PFchs.Draw("npv>>hAK4PFchs_npv","","goff")
tAK4PFchs.Draw("rho>>hAK4PFchs_rho","","goff")
tAK4PFchs.Draw("npus[12]>>hAK4PFchs_npu","","goff")
tAK4PFchs.Draw("tnpus[12]>>hAK4PFchs_tnpu","","goff")

# Normalize the histograms to unit area so that their shapes can be compared
hAK4PFchs_npv.Scale(1.0/hAK4PFchs_npv.Integral())
hAK4PFchs_rho.Scale(1.0/hAK4PFchs_rho.Integral())
hAK4PFchs_npu.Scale(1.0/hAK4PFchs_npu.Integral())
hAK4PFchs_tnpu.Scale(1.0/hAK4PFchs_tnpu.Integral())

#Draw the histograms
c.cd(1)
tdrDraw(hAK4PFchs_npvvsnpu,"colz")
c.cd(2)
tdrDraw(hAK4PFchs_rhovsnpv,"colz")
c.cd(3)
tdrDraw(hAK4PFchs_rhovsnpu,"colz")
c.cd(4)
tdrDraw(hAK4PFchs_npv,"HIST",r.kNone,r.kBlack,r.kSolid,-1,0,0)
c.cd(4)
tdrDraw(hAK4PFchs_rho,"HIST",r.kNone,r.kRed,r.kSolid,-1,0,0)
c.cd(4)
tdrDraw(hAK4PFchs_npu,"HIST",r.kNone,r.kGreen,r.kSolid,-1,0,0)
c.cd(4)
tdrDraw(hAK4PFchs_tnpu,"HIST",r.kNone,r.kBlue,r.kSolid,-1,0,0)

# Add entries to the legend and draw it
c.cd(4)
l_X = tdrLeg(0.8,0.65,0.9,0.9)
l_X.AddEntry(hAK4PFchs_npv,"N_{PV}","l")
l_X.AddEntry(hAK4PFchs_rho,"#rho","l")
l_X.AddEntry(hAK4PFchs_npu,"N_{PU}","l")
l_X.AddEntry(hAK4PFchs_tnpu,"#mu","l")
l_X.Draw("same")

c.Update()
c.Draw()

<font color='red'>Question 1: Why is the number of pileup interactions different from the number of primary vertices?</font><details>
<summary><font color='blue'>Show answer...</font></summary>
Not every pileup interaction yields a reconstructed vertex. The vertex finding efficiency was about 72% in Run I, which means that $N_{PV}\simeq0.72{\cdot}N_{PU}$
</details>
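
Once you have looked at the answer, the relation can be checked with a small toy. This is only a sketch and not part of the exercise code: it assumes that every pileup interaction is reconstructed independently with the 72% efficiency quoted in the answer, and the names `expected_npv` and `toy_npv` as well as the choice of 25 interactions are made up for illustration.

```python
import random

def expected_npv(n_pu, efficiency=0.72):
    # Average number of reconstructed vertices for n_pu pileup interactions
    return efficiency * n_pu

def toy_npv(n_pu, efficiency=0.72, rng=random):
    # Keep each interaction with probability `efficiency` (binomial thinning)
    return sum(1 for _ in range(n_pu) if rng.random() < efficiency)

rng = random.Random(42)   # fixed seed so the toy is reproducible
n_pu = 25                 # hypothetical number of pileup interactions
trials = [toy_npv(n_pu, rng=rng) for _ in range(10000)]
mean_npv = sum(trials) / float(len(trials))

print("expected N_PV: %.2f" % expected_npv(n_pu))
print("toy mean N_PV: %.2f" % mean_npv)
```

The toy mean should land close to the expected value, while individual events scatter around it. That scatter is part of the reason $N_{PV}$ is only correlated with the true pileup rather than equal to it.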

<font color='red'>Question 2: How many pileup interactions are simulated before and after the in-time bunch crossing?</font><details>
<summary><font color='blue'>Hint</font></summary>
Try running the command `tAK4PFchs.Scan("bxns:tnpus:npus")`
</details><details>
<summary><font color='blue'>Show answer...</font></summary>
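Pileup is simulated for 12 bunch crossings before the in-time crossing and 3 after it. This is why the in-time entry sits at index 12, as in `tnpus[12]`.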
</details>
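
The plotting code above always reads entry 12 of the per-crossing branches (`npus[12]`, `tnpus[12]`). A minimal sketch of why, assuming the `bxns` branch lists the bunch crossings in order from 12 before the in-time crossing to 3 after it (this layout is an assumption based on the answer, not something read from the ntuple here):

```python
# Assumed layout of the bxns branch: 12 early crossings, the in-time one (0), 3 late ones
bxns = list(range(-12, 4))

# Position of the in-time bunch crossing inside the per-crossing arrays
in_time_index = bxns.index(0)

# Count the out-of-time crossings on either side
n_before = sum(1 for bx in bxns if bx < 0)
n_after = sum(1 for bx in bxns if bx > 0)

print("in-time index: %d" % in_time_index)
print("crossings before: %d, after: %d" % (n_before, n_after))
```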

<font color='red'>Question 3: Rho is the measure of the density of the pileup in the event. It's measured in terms of GeV per unit area. Can you think of ways we can use this information to correct for the effects of pileup?</font><details>
<summary><font color='blue'>Show answer...</font></summary>
From the jet $p_{T}$ simply subtract off the average amount of pileup expected in a jet of that size. Thus $p_{T}^{corr}{\simeq}p_{T}^{reco}-\rho{\cdot}area$
</details>
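
The subtraction in the answer can be written out in a few lines. This is a sketch under stated assumptions: the jet area is approximated as $\pi R^{2}$ for a cone of radius 0.4, whereas in practice the area is determined jet by jet, and the input values (a 50 GeV jet, $\rho$ = 20 GeV per unit area) are invented for illustration. The helper names `jet_area` and `subtract_pileup` are hypothetical.

```python
import math

def jet_area(radius):
    # Idealized circular jet area in the eta-phi plane (an approximation)
    return math.pi * radius ** 2

def subtract_pileup(pt_reco, rho, area):
    # pT_corr ~ pT_reco - rho * area
    return pt_reco - rho * area

area = jet_area(0.4)                         # assumed cone radius of 0.4
pt_corr = subtract_pileup(50.0, 20.0, area)  # hypothetical jet pT and rho

print("area = %.3f, corrected pT = %.2f GeV" % (area, pt_corr))
```

Note that the correction scales with the jet area, so a larger jet loses proportionally more $p_{T}$ for the same $\rho$.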

<font color='red'>Question 4: This plot shows the jet composition. Generally, why do we observe this mixture of photons, neutral hadrons and charged hadrons?</font>
<img src="files/composition_combo_pt_pfpaper_final_v2.png" alt="Jet Composition Vs. Pt" width="400px" style="float: right;" />
<details>
<summary><font color='blue'>Show answer...</font></summary>
The majority of the constituents in a jet come from pions. Pions come in neutral ($\pi^{0}$) and charged ($\pi^{\pm}$) varieties. Naively, you would expect the composition to be two-thirds charged hadrons and one-third neutral hadrons. However, we know that the $\pi^{0}$ decays to two photons, which leads to a large photon fraction.
<img src="files/efracs_particles_8TeV.png" alt="Jet Composition MC" width="380px" style="float: left;" />
</details>

## Pileup Reweighting 
Here we are going to produce a file containing the weights used for pileup reweighting. Please note that this process can take quite a while. Be patient!

In the meantime, the first questions are asked here, at the beginning of this section, to give you a chance to think about the answers before you produce the plots. Ask yourself what pileup reweighting is doing. Try to answer the questions and do the exercise before looking at the answers.

<font color='red'>Question 5: How large do you expect the pileup weights to be?</font>

<font color='red'>Question 6: In what unit will the x-axis be plotted? Another way of asking this is what pileup variable can be measured in both data and MC and is fairly robust?</font><details>
<summary><font color='blue'>Show answer...</font></summary>
The x-axis is plotted in units of $\mu$, as this is a true measure of pileup (the number of additional interactions) and not just a variable that is correlated with pileup. Other options might have been $N_{PV}$, whose reconstruction efficiency is less than 100%, and $\rho$, which assumes that the pileup energy density is uniform. We also get different values of $\rho$ if we measure it in different regions of $\eta$ (e.g. $|\eta|<3$ or $|\eta|<5$).

<img src="files/Zmumu_npv.png" alt="Zmumu_npv" width="380px" style="float: left;" />
<img src="files/Zmumu_rho.png" alt="Zmumu_rho" width="380px" style="float: right;" />
<img src="files/Zmumu_npv_nputruth.png" alt="Zmumu_npv_nputruth" width="380px" style="float: left;" />
<img src="files/Zmumu_rho_nputruth.png" alt="Zmumu_rho_nputruth" width="380px" style="float: right;" />
</details>
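
While the script runs, it may help to see the idea behind the weights in isolation. The sketch below uses the usual definition of a pileup weight, the ratio of the normalized data distribution to the normalized MC distribution in each bin of $\mu$. It is not the implementation in `pileupCorr.py`; the bin contents are invented and the helper names `normalize` and `pileup_weights` are hypothetical.

```python
def normalize(hist):
    # Scale the bin contents so that they sum to one
    total = float(sum(hist))
    return [h / total for h in hist]

def pileup_weights(data_hist, mc_hist):
    # Per-bin weight = normalized data / normalized MC (0 where the MC bin is empty)
    data = normalize(data_hist)
    mc = normalize(mc_hist)
    return [d / m if m > 0 else 0.0 for d, m in zip(data, mc)]

# Toy pileup profiles in four bins of mu (made-up numbers)
data_hist = [1.0, 3.0, 4.0, 2.0]
mc_hist = [2.0, 2.0, 4.0, 2.0]

weights = pileup_weights(data_hist, mc_hist)

# Applying the weights bin by bin makes the MC shape match the data shape
reweighted_mc = [m * w for m, w in zip(normalize(mc_hist), weights)]

print(weights)
print(reweighted_mc)
```

In an analysis the weight is applied per simulated event, looked up from the bin that contains the event's $\mu$.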

In [None]:
%%capture
%%bash
python ../../python/pileupCorr.py -j ../../data/Cert_271036-284044_13TeV_PromptReco_Collisions16_JSON_NoL1T.txt -l ../../data/pileup_latest.txt -b 100

In [None]:
import array, math
def DtoF(file,get):
    # Retrieve the histogram named `get` from `file` and copy it into a new TH1F
    hD = file.Get(get)
    hF = r.TH1F()
    hD.Copy(hF)
    return hF

def hold_pointers_to_implicit_members( obj ):
    # Keep Python references to everything drawn on `obj` so the objects are not garbage collected
    if not hasattr( obj, '_implicit_members' ):
        obj._implicit_members = []
    if hasattr( obj, 'GetListOfPrimitives' ):
        for prim in obj.GetListOfPrimitives():
            obj._implicit_members.append( prim )

'''
Interactive:
from Analysis.JMEDAS.pileupCorr import *
c = MakeCanvas(filename="../data/PileupHistograms.root")
c.Draw()
'''
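def MakeCanvas(dataHists = None, MCHist = None, weightHists = None, filename = ""):
    # Avoid mutable default arguments, which would accumulate histograms across repeated calls
    if dataHists is None: dataHists = []
    if weightHists is None: weightHists = []
    if MCHist is None: MCHist = r.TH1F()
    if filename!='':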
        f = r.TFile.Open(filename,"READ")
        dataHists.append(DtoF(f,"data_pu_central"))
        dataHists[-1].SetDirectory(None)
        dataHists.append(DtoF(f,"data_pu_up"))
        dataHists[-1].SetDirectory(None)
        dataHists.append(DtoF(f,"data_pu_down"))
        dataHists[-1].SetDirectory(None)
        MCHist = DtoF(f,"hMC25ns")
        MCHist.SetDirectory(None)
        weightHists.append(DtoF(f,"pu_weights_central"))
        weightHists[-1].SetDirectory(None)
        weightHists.append(DtoF(f,"pu_weights_up"))
        weightHists[-1].SetDirectory(None)
        weightHists.append(DtoF(f,"pu_weights_down"))
        weightHists[-1].SetDirectory(None)

    setTDRStyle()

    frame = r.TH1D()
    frame.GetXaxis().SetLimits(0,100)
    frame.GetYaxis().SetRangeUser(1e-11,1e4)
    frame.GetXaxis().SetTitle("#mu")
    frame.GetYaxis().SetTitle("Weights / a.u.")

    c = tdrCanvas("pileupCanvas",frame,14,11,True)
    c.cd()
    c.SetLogy()

    #Reweight the MC histogram
    MCHist_reweighted = MCHist.Clone("hMC25ns_reweighted")
    MCHist_reweighted.SetDirectory(None)
    for b in range(1,int(MCHist.GetNbinsX()+1)):
        MCHist_reweighted.SetBinContent(b,MCHist_reweighted.GetBinContent(b)*weightHists[0].GetBinContent(b))

    #Make TGraphAsymmErrors for the statistical+systematic error bands
    #First up is for the data histograms
    x       = array.array('f',[])
    y       = array.array('f',[])
    ex_up   = array.array('f',[])
    ex_down = array.array('f',[])
    ey_up   = array.array('f',[])
    ey_down = array.array('f',[])
    for b in range(1,int(dataHists[0].GetNbinsX()+1)):
        x.append(dataHists[0].GetBinCenter(b))
        y.append(dataHists[0].GetBinContent(b))
        ex_up.append(0.5)
        ex_down.append(0.5)
        if dataHists[1].GetBinContent(b) > dataHists[0].GetBinContent(b):
            ey_up.append(math.sqrt((dataHists[1].GetBinContent(b)-dataHists[0].GetBinContent(b))**2+(dataHists[0].GetBinError(b))**2))
            ey_down.append(math.sqrt((dataHists[0].GetBinContent(b)-dataHists[2].GetBinContent(b))**2+(dataHists[0].GetBinError(b))**2))
        else:
            ey_up.append(math.sqrt((dataHists[2].GetBinContent(b)-dataHists[0].GetBinContent(b))**2+(dataHists[0].GetBinError(b))**2))
            ey_down.append(math.sqrt((dataHists[0].GetBinContent(b)-dataHists[1].GetBinContent(b))**2+(dataHists[0].GetBinError(b))**2))
    errorBandData = r.TGraphAsymmErrors(len(x)-1,x,y,ex_down,ex_up,ey_down,ey_up)

    #Then we do the weight histograms
    x       = array.array('f',[])
    y       = array.array('f',[])
    ex_up   = array.array('f',[])
    ex_down = array.array('f',[])
    ey_up   = array.array('f',[])
    ey_down = array.array('f',[])
    for b in range(1,int(weightHists[0].GetNbinsX()+1)):
        x.append(weightHists[0].GetBinCenter(b))
        y.append(weightHists[0].GetBinContent(b))
        ex_up.append(0.5)
        ex_down.append(0.5)
        if weightHists[1].GetBinContent(b) > weightHists[0].GetBinContent(b):
            ey_up.append(math.sqrt((weightHists[1].GetBinContent(b)-weightHists[0].GetBinContent(b))**2+(weightHists[0].GetBinError(b))**2))
            ey_down.append(math.sqrt((weightHists[0].GetBinContent(b)-weightHists[2].GetBinContent(b))**2+(weightHists[0].GetBinError(b))**2))
        else:
            ey_up.append(math.sqrt((weightHists[2].GetBinContent(b)-weightHists[0].GetBinContent(b))**2+(weightHists[0].GetBinError(b))**2))
            ey_down.append(math.sqrt((weightHists[0].GetBinContent(b)-weightHists[1].GetBinContent(b))**2+(weightHists[0].GetBinError(b))**2))
    errorBandWeight = r.TGraphAsymmErrors(len(x)-1,x,y,ex_down,ex_up,ey_down,ey_up)

    #Draw all histograms and graphs
    tdrDraw(errorBandData,"4",r.kNone,r.kNone,r.kNone,r.kNone,3144,r.kGray)
    tdrDraw(dataHists[0],"e1",r.kFullCircle,r.kBlack)
    tdrDraw(MCHist,"e1p",r.kFullCircle,r.kGreen,r.kNone,r.kGreen,r.kNone,r.kNone)
    tdrDraw(MCHist_reweighted,"e1p",r.kOpenCircle,r.kGreen,r.kNone,r.kGreen,r.kNone,r.kNone)
    tdrDraw(errorBandWeight,"4",r.kNone,r.kNone,r.kNone,r.kNone,3144,r.kRed+2)
    tdrDraw(weightHists[0],"e1",r.kFullTriangleUp,r.kRed)

    #Draw a legend
    leg = tdrLeg(0.5, 0.65, 0.88, 0.88)
    leg.AddEntry(dataHists[0],"Data","ep")
    leg.AddEntry(errorBandData,"Data Stat+Sys","f")
    leg.AddEntry(MCHist,"MC","ep")
    leg.AddEntry(MCHist_reweighted,"MC (Reweighted)","ep")
    leg.AddEntry(weightHists[0],"Weights","ep")
    leg.AddEntry(errorBandWeight,"Weight Stat+Sys","f")
    leg.Draw("same")

    #Set the canvas to own the histograms and graphs which are drawn
    hold_pointers_to_implicit_members(c)

    print("Canvas Successfully Made!")
    return c

In [None]:
c = MakeCanvas(filename="PileupHistograms.root")
c.Draw()
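
The error bands drawn by `MakeCanvas` combine, bin by bin, the shift between the central histogram and the up/down variations with the statistical bin error in quadrature. A stand-alone sketch of that per-bin logic, using plain numbers instead of histograms (the helper name `asymmetric_errors` and the example values are hypothetical):

```python
import math

def asymmetric_errors(central, up, down, stat):
    # Decide which variation lies above the central value, as MakeCanvas does
    if up > central:
        shift_high, shift_low = up - central, central - down
    else:
        shift_high, shift_low = down - central, central - up
    # Add the systematic shift and the statistical error in quadrature
    ey_down = math.sqrt(shift_low ** 2 + stat ** 2)
    ey_up = math.sqrt(shift_high ** 2 + stat ** 2)
    return ey_down, ey_up

# Hypothetical bin: central value 10, "up" variation 13, "down" variation 6, stat error 4
ey_down, ey_up = asymmetric_errors(10.0, 13.0, 6.0, 4.0)
print("ey_down = %.3f, ey_up = %.3f" % (ey_down, ey_up))
```

The `else` branch covers bins where the "up" variation actually falls below the central value, so the upper error always describes the upward shift.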