# Dilepton analysis - handling the MC background
A key point of any (particle) physics analysis is to make sure that we understand the background. By "background" we mean everything that shows up in the detector that is NOT the process we are looking for. For example, when colliding protons, there is a    

We start in the usual way: 

In [1]:
#include <algorithm> // std::find, used in the event loop
#include <iostream>
#include <map>
#include <string>
#include <vector>
#include <stdio.h>

In [2]:
%jsroot on

In [3]:
TChain *dataset = new TChain("mini"); 

We now make a vector of strings with the names of the files we want to include. Then we loop over these names to add the files to the TChain. 

In [4]:
vector<TString> backgrounds = { "mc_105985.WW.root", "mc_105986.ZZ.root", "mc_105987.WZ.root", "mc_117049.ttbar_had.root", 
"mc_117050.ttbar_lep.root", "mc_147770.Zee.root", "mc_147771.Zmumu.root", "mc_147772.Ztautau.root" };   

In [5]:
dataset->Reset(); 
for(const auto & i:backgrounds){
    dataset->Add("DataSamples/MC/"+i);
}

In [6]:
const int vs = 5; 

Int_t lepton_n = -1, lepton_charge[vs], lepton_type[vs], ID; 

Float_t lepton_pt[vs], lepton_E[vs], lepton_phi[vs], lepton_eta[vs], MET;

In [7]:
dataset->SetBranchAddress("lep_n",      &lepton_n);
dataset->SetBranchAddress("lep_charge", &lepton_charge);
dataset->SetBranchAddress("lep_type",   &lepton_type);
dataset->SetBranchAddress("lep_pt",     &lepton_pt);
dataset->SetBranchAddress("lep_eta",    &lepton_eta);
dataset->SetBranchAddress("lep_phi",    &lepton_phi);
dataset->SetBranchAddress("lep_E",      &lepton_E);
dataset->SetBranchAddress("met_et",     &MET); 
dataset->SetBranchAddress("channelNumber", &ID);

We need to classify the different backgrounds, and for each variable we want to plot we should make one histogram per background category. 

In [8]:
vector<TString> variables = {}; 
vector<TString> bgs = {}; 
vector<Int_t> Diboson = {}; 
vector<Int_t> Zjets = {}; 
vector<Int_t> ttbar = {}; 

In [9]:
variables = {"mll", "lep_pt", "met"}; 
bgs = {"Diboson", "ttbar", "Zjets"};
Diboson = {105985, 105986, 105987};
Zjets = {147770, 147771, 147772};
ttbar = {117049, 117050};
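
The classification can also be wrapped in a small lookup function. The following is a minimal sketch in plain C++ (no ROOT types), assuming the channel numbers listed above. The names `dibosonIds`, `zjetsIds`, `ttbarIds`, `contains` and `categoryOf` are hypothetical and not part of the notebook.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Channel numbers copied from the MC file names used in this notebook.
const std::vector<int> dibosonIds = {105985, 105986, 105987};
const std::vector<int> zjetsIds   = {147770, 147771, 147772};
const std::vector<int> ttbarIds   = {117049, 117050};

// Hypothetical helper: true if `id` appears in the list `ids`.
bool contains(const std::vector<int>& ids, int id) {
    return std::find(ids.begin(), ids.end(), id) != ids.end();
}

// Hypothetical helper: returns the background category for a channel
// number, or an empty string if the number is not in any list.
std::string categoryOf(int id) {
    if (contains(dibosonIds, id)) return "Diboson";
    if (contains(zjetsIds, id))   return "Zjets";
    if (contains(ttbarIds, id))   return "ttbar";
    return "";
}
```

Returning an empty string for unknown channel numbers makes it easy to spot a sample that has not been assigned to any category.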

A convenient way of handling all these histograms is to store them in [maps](http://www.cplusplus.com/reference/map/map/) (roughly the C++ equivalent of dictionaries in Python). Below we define one map for each variable. The keys are our background categories, while the mapped values are pointers to the histograms.

In [10]:
map<TString, TH1*> hist_mll; 
map<TString, TH1*> hist_lep_pt; 
map<TString, TH1*> hist_met;

In [11]:
for(const auto & i:bgs){
    hist_mll[i] = new TH1F(); 
    hist_lep_pt[i] = new TH1F(); 
    hist_met[i] = new TH1F();
}
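
One property of `std::map` is worth keeping in mind when the keys are typed by hand: `operator[]` silently inserts a missing key with a default value (a null pointer for pointer types), so a misspelled category name is not caught at the lookup. The sketch below illustrates this in plain C++. `Counter` is a hypothetical stand-in for a histogram, and `hasKey` and `sizeAfterBracketLookup` are illustrative names only.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a histogram: it only counts entries.
struct Counter { int entries = 0; };

// Checks for a key without modifying the map.
bool hasKey(const std::map<std::string, Counter*>& m, const std::string& key) {
    return m.find(key) != m.end();
}

// Works on a copy of the map and shows that operator[] adds the key
// (mapped to a null pointer) when it is not already present.
std::size_t sizeAfterBracketLookup(std::map<std::string, Counter*> m,
                                   const std::string& key) {
    m[key]; // inserts `key` with a nullptr value if it is missing
    return m.size();
}
```

With a map holding only the key "Zjets", `hasKey(m, "zjets")` is false, while `sizeAfterBracketLookup(m, "zjets")` returns 2 because the misspelled key has been inserted.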

In [12]:
for(const auto & i:bgs){
    // Append the category to the name so that every histogram gets a unique name
    hist_mll[i]->SetNameTitle("hist_mll_"+i, "Invariant mass"); 
    hist_lep_pt[i]->SetNameTitle("hist_lep_pt_"+i, "Lepton pT"); 
    hist_met[i]->SetNameTitle("hist_met_"+i, "Missing ET");
    hist_mll[i]->SetBins(20,0,500); 
    hist_lep_pt[i]->SetBins(20,0,1000);
    hist_met[i]->SetBins(20,0,500); 
}
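
To make the binning concrete: `SetBins(20,0,500)` gives 20 bins of width 25 GeV. The sketch below shows how a value maps to a bin number, assuming ROOT's convention that bin 0 is the underflow bin, bins 1 to `nbins` cover the axis range, and bin `nbins + 1` is the overflow bin. The functions `binWidth` and `findBin` are hypothetical plain C++ helpers, not ROOT calls.

```cpp
// Width of each bin for a uniformly binned axis.
double binWidth(int nbins, double xmin, double xmax) {
    return (xmax - xmin) / nbins;
}

// Bin number for a value x: 0 is underflow, 1..nbins are the regular
// bins, nbins + 1 is overflow. The upper edge itself counts as overflow.
int findBin(double x, int nbins, double xmin, double xmax) {
    if (x < xmin)  return 0;
    if (x >= xmax) return nbins + 1;
    return 1 + static_cast<int>(nbins * (x - xmin) / (xmax - xmin));
}
```

For example, an invariant mass of 91.2 GeV falls in bin 4 of the `mll` histogram, which covers 75 to 100 GeV.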

In [13]:
TLorentzVector l1, l2, dileptons; 

In [28]:
for(const auto & i:bgs){
    hist_mll[i]->Reset(); 
    hist_lep_pt[i]->Reset(); 
    hist_met[i]->Reset();
}

In [15]:
int nentries = (Int_t)dataset->GetEntries();

Now we loop over all events in the chain, but with a slight difference: when filling histograms we need to check which background category the event falls into, and then fill the appropriate histogram. (This only applies to MC background. For real data you only make one histogram per variable.)

In [29]:
for (int i = 0; i < nentries ; i++)
{
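    //if( i%1000000 == 0){ cout << i << " events processed" << endl;}    
    // Only every 100th event is processed, to keep the run time short. 
    // Remove this condition to run over the full sample. 
    if( i % 100 == 0 ){ 
    dataset->GetEntry(i); // We "pull out" the i'th entry in the chain. The variables are now 
                          // available through the names we have given them. 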
    
    // Cut #1: Require (exactly) 2 leptons
    if(lepton_n == 2)
    {
        // Cut #2: Require opposite charge
        if(lepton_charge[0] != lepton_charge[1])
        {
            // Cut #3: Require same flavour (2 electrons or 2 muons)
            if(lepton_type[0] == lepton_type[1])
            {
                l1.SetPtEtaPhiE(lepton_pt[0]/1000., lepton_eta[0], lepton_phi[0], lepton_E[0]/1000.);
                l2.SetPtEtaPhiE(lepton_pt[1]/1000., lepton_eta[1], lepton_phi[1], lepton_E[1]/1000.);
                // Variables are stored in the TTree with unit MeV, so we need to divide by 1000 
                // to get GeV, which is a more practical unit. 
                
                dileptons = l1 + l2;   
                
                if(std::find(Diboson.begin(), Diboson.end(), ID) != Diboson.end()){
                    hist_mll["Diboson"]->Fill(dileptons.M());
                    hist_lep_pt["Diboson"]->Fill(l1.Pt());
                    hist_lep_pt["Diboson"]->Fill(l2.Pt()); 
                    hist_met["Diboson"]->Fill(MET/1000); 
                }  
                
                if(std::find(Zjets.begin(), Zjets.end(), ID) != Zjets.end()){
                    hist_mll["Zjets"]->Fill(dileptons.M());
                    hist_lep_pt["Zjets"]->Fill(l1.Pt());
                    hist_lep_pt["Zjets"]->Fill(l2.Pt()); 
                    hist_met["Zjets"]->Fill(MET/1000); 
                }
                
                if(std::find(ttbar.begin(), ttbar.end(), ID) != ttbar.end()){
                    hist_mll["ttbar"]->Fill(dileptons.M());
                    hist_lep_pt["ttbar"]->Fill(l1.Pt());
                    hist_lep_pt["ttbar"]->Fill(l2.Pt()); 
                    hist_met["ttbar"]->Fill(MET/1000); 
                } 
                
            }
        }
    }
    }        
}
cout << "Loop finished!" << endl; 

Loop finished!
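
For reference, the invariant mass filled into `hist_mll` follows from the summed four-momentum, $m_{ll}^2 = (E_1 + E_2)^2 - |\vec{p}_1 + \vec{p}_2|^2$ (in units where $c = 1$). The following plain C++ sketch spells out what `SetPtEtaPhiE` and `M()` compute in the loop above. `FourVector`, `fromPtEtaPhiE` and `invariantMass` are hypothetical names used only for illustration, and returning a negative value when $m^2 < 0$ is an assumption about how one might handle that case.

```cpp
#include <cmath>

// Hypothetical minimal four-vector: momentum components and energy in GeV.
struct FourVector { double px, py, pz, e; };

// Converts detector coordinates (pT, eta, phi) and energy to Cartesian form.
FourVector fromPtEtaPhiE(double pt, double eta, double phi, double e) {
    return { pt * std::cos(phi), pt * std::sin(phi), pt * std::sinh(eta), e };
}

// Invariant mass of the two-particle system a + b.
// If rounding makes m^2 slightly negative, -sqrt(-m^2) is returned.
double invariantMass(const FourVector& a, const FourVector& b) {
    const double e  = a.e  + b.e;
    const double px = a.px + b.px;
    const double py = a.py + b.py;
    const double pz = a.pz + b.pz;
    const double m2 = e * e - (px * px + py * py + pz * pz);
    return m2 >= 0.0 ? std::sqrt(m2) : -std::sqrt(-m2);
}
```

Two massless leptons with $p_T = 45.6$ GeV at $\eta = 0$, emitted back to back in $\phi$, give an invariant mass of 91.2 GeV up to rounding.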


When we have filled all histograms we would typically like to plot the results, with all the different backgrounds for a given variable stacked on top of each other. To do this we can use the [THStack](https://root.cern.ch/doc/master/classTHStack.html) class. We define a so-called "stack", add the histograms to the stack and then plot it, as shown in the following. 

Before we do the stacking we need to give the different backgrounds different colors, to be able to tell them apart. Once again it is convenient to make a map with the different backgrounds and the corresponding colors. Colors are defined as integers (e.g. kRed = 632), meaning that the mapped value must also be an integer. More about the ROOT colors can be found in the [TColor](https://root.cern.ch/doc/master/classTColor.html) class reference.  

In [17]:
map<TString, Int_t> colors; 

In [18]:
colors["Diboson"] = kBlue; 
colors["Zjets"] = kRed; 
colors["ttbar"] = kGreen;

In [19]:
for(const auto h:bgs){
    hist_mll[h]->SetFillColor(colors[h]); 
    hist_met[h]->SetFillColor(colors[h]);
    hist_lep_pt[h]->SetFillColor(colors[h]);
    
    hist_mll[h]->SetLineColor(colors[h]); 
    hist_met[h]->SetLineColor(colors[h]);
    hist_lep_pt[h]->SetLineColor(colors[h]);
}

In [20]:
THStack *stack_mll = new THStack("Invariant mass", "");

In [21]:
THStack *stack_met = new THStack("Missing ET", ""); 

In [22]:
for(const auto h:bgs){
    stack_mll->RecursiveRemove(hist_mll[h]); // Remove previously stacked histograms  
    stack_met->RecursiveRemove(hist_met[h]);
    stack_mll->Add(hist_mll[h]); 
    stack_met->Add(hist_met[h]);
}    
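
Conceptually, stacking means that each histogram is drawn on top of the bin-by-bin sum of the histograms added before it. The sketch below illustrates this with plain vectors of bin contents. It is not how `THStack` is implemented, and the names `Bins` and `stackLayers` are hypothetical. It assumes that all histograms share the same binning.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical representation of a histogram: one content value per bin.
using Bins = std::vector<double>;

// Returns, for each input histogram, the running bin-by-bin sum of that
// histogram and all histograms before it. Layer k is the upper outline
// of the k-th histogram in the stacked plot.
std::vector<Bins> stackLayers(const std::vector<Bins>& hists) {
    std::vector<Bins> layers;
    Bins running;
    for (const Bins& h : hists) {
        if (running.empty()) running.assign(h.size(), 0.0);
        for (std::size_t b = 0; b < h.size() && b < running.size(); ++b) {
            running[b] += h[b];
        }
        layers.push_back(running);
    }
    return layers;
}
```

The last layer is the total background prediction in each bin, which is the quantity one would compare with data.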

Now we make a legend with the different backgrounds, and plot the stacks. 

In [23]:
gStyle->SetLegendBorderSize(0); // Remove (default) border around legend 
TLegend *legend = new TLegend(0.65, 0.70, 0.85, 0.85); 

In [24]:
legend->Clear();
for(const auto i:bgs){
    legend->AddEntry(hist_mll[i], i, "f");  // Add your histograms to the legend
} 

In [25]:
TCanvas *C = new TCanvas("c", "c", 600, 600);

In [26]:
gPad->SetLogy(); // Set logarithmic y-axis

In [27]:
stack_mll->Draw(); 
stack_mll->GetYaxis()->SetTitle("# events");
stack_mll->GetYaxis()->SetTitleOffset(1.3); 
stack_mll->GetXaxis()->SetTitle("m_{ll} (GeV)");
stack_mll->GetXaxis()->SetTitleOffset(1.3);
legend->Draw();
C->Draw();

In [31]:
stack_met->Draw(); 
stack_met->GetYaxis()->SetTitle("# events");
stack_met->GetYaxis()->SetTitleOffset(1.3); 
stack_met->GetXaxis()->SetTitle("E_{T}^{miss} (GeV)");
stack_met->GetXaxis()->SetTitleOffset(1.3);
stack_met->GetXaxis()->SetLimits(0,250);
legend->Draw();
C->Draw();