# Session 2: Find the invariant mass of the Z boson!

## Welcome to Session 2!
This notebook uses ATLAS Open Data http://opendata.atlas.cern to show you the steps to find and calculate the mass of the Z boson decaying to two leptons of the same flavour (either electrons or muons) and opposite charge.<br>

Example of a Z boson decaying to two electrons: <CENTER><img src="images/Z_ElectronPositron.png" style="width:40%"></CENTER>

**By the end of this notebook you will be able to:**
1. Work with ROOT from Python (PyROOT)
2. Manipulate the data and find the Z boson by yourself by applying relevant event selections 
3. Calculate the Z invariant mass by hand
4. Plot the Z invariant mass
5. Perform a fit on the mass

The whole notebook takes less than an hour to follow through!

## pyROOT
The library used is ROOT - a scientific software framework that provides all the functionalities needed to deal with big data processing, statistical analysis, visualization and storage.<br>

First of all, ROOT is imported to read the files in the .root data format.<br>

**Reminder from the previous session:** A .root file consists of a tree having branches and leaves.<br>

At this point you could also import other modules containing functions that you use often. For the sake of simplicity, we do not import anything else here:

In [1]:
import ROOT

Welcome to JupyROOT 6.26/02


To keep the code in this session short, we import **TMath** directly from ROOT. TMath offers a wide range of mathematical functions and operations that we will use extensively:

In [2]:
from ROOT import TMath

To display the histogram that we create later interactively, we switch on the JSROOT magic:

In [3]:
%jsroot on

## Where is my data?

Next, we open the data that we want to analyse, which is stored in a .root file:

In [None]:
## CHOOSE here which sample to use!!

filename = "mc_361106.Zee.1largeRjet1lep.root"
url = "https://atlas-opendata.web.cern.ch/atlas-opendata/samples/2020/1largeRjet1lep/MC/"
f = ROOT.TFile.Open(url+filename)

After the file is opened, we create a canvas on which we can draw a histogram. Without a canvas we cannot see our histogram at the end. Its name is **Canvas** and its title is **c**. The next two arguments define the width and the height of the canvas in pixels.

In [None]:
canvas = ROOT.TCanvas("Canvas", "c", 800, 600)

The next step is to retrieve the tree stored under the name **mini** in the .root file and assign it to the variable **tree**, so that we can read the data from it:

In [None]:
tree = f.Get("mini")
tree.GetEntries()

Now we define a histogram that will later be drawn on this canvas. Its name is **variable**, its title is **Mass of the Z boson**, the x axis is labelled **mass [GeV]** and the y axis is labelled **events**. The last three arguments specify that the histogram has 30 bins covering the range from 40 to 140 GeV.

In [None]:
hist = ROOT.TH1F("variable","Mass of the Z boson; mass [GeV]; events",30,40,140)
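
To get a feel for what 30 bins between 40 and 140 GeV means, the sketch below computes the bin width and the bin that a given mass would fall into. It is plain Python and does not use ROOT. The helper name `find_bin` is made up for this illustration and only mimics ROOT's numbering convention, in which bin 0 is the underflow bin, bins 1 to 30 hold the data and bin 31 is the overflow bin.

```python
N_BINS, X_MIN, X_MAX = 30, 40.0, 140.0  # same binning as the histogram above

bin_width = (X_MAX - X_MIN) / N_BINS  # about 3.33 GeV per bin

def find_bin(x):
    """Hypothetical helper: bin index of x, using ROOT-style numbering."""
    if x < X_MIN:
        return 0            # underflow
    if x >= X_MAX:
        return N_BINS + 1   # overflow (the upper edge is excluded)
    return 1 + int(N_BINS * (x - X_MIN) / (X_MAX - X_MIN))

print("bin width: {:.2f} GeV".format(bin_width))
print("bin for 91.2 GeV:", find_bin(91.2))
```

A Z peak near 91 GeV therefore lands close to the middle of the histogram, which leaves room on both sides to see the shape of the resonance.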

## Calculate the invariant mass!

The **selections** we require in each event, to ensure that the final state contains objects potentially coming from a Z boson, are the following:
1. Exactly two leptons
2. The two leptons have opposite charge (one positive and the other negative)
3. The two leptons are from the same family (same flavour), e.g. two electrons or two muons
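
The three requirements can be tried out on a few hand-made events before touching real data. The sketch below is plain Python: the toy events and the helper name `passes_z_selection` are invented for this illustration, and it assumes that `lep_type` holds 11 for electrons and 13 for muons and that `lep_charge` is +1 or -1.

```python
from types import SimpleNamespace

def passes_z_selection(event):
    """Hypothetical helper combining the three selections listed above."""
    if event.lep_n != 2:                                              # 1. exactly two leptons
        return False
    opposite_charge = event.lep_charge[0] * event.lep_charge[1] < 0   # 2. opposite charge
    same_flavour = event.lep_type[0] == event.lep_type[1]             # 3. same flavour
    return opposite_charge and same_flavour

# Toy events (made up); assumed convention: 11 = electron, 13 = muon
toy_events = {
    "e+ e-": SimpleNamespace(lep_n=2, lep_charge=[1, -1], lep_type=[11, 11]),
    "e+ mu-": SimpleNamespace(lep_n=2, lep_charge=[1, -1], lep_type=[11, 13]),
    "mu- mu-": SimpleNamespace(lep_n=2, lep_charge=[-1, -1], lep_type=[13, 13]),
    "3 leptons": SimpleNamespace(lep_n=3, lep_charge=[1, -1, 1], lep_type=[11, 11, 13]),
}

results = {name: passes_z_selection(ev) for name, ev in toy_events.items()}
for name, passed in results.items():
    print(name, "->", "keep" if passed else "reject")
```

Only the first toy event satisfies all three requirements; each of the others fails exactly one of them.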

After applying the selections above, the **calculation** of the invariant mass is done following these steps:
1. If the energies of the two leptons are lep_E[0] and lep_E[1], write the sum of the energies, sumE
2. Write the sum of the x-momenta, sumpx
3. Do the same for the y and z momenta (sumpy and sumpz)
4. Now you have the x, y, z components sumpx, sumpy and sumpz of the vector **sump** = (sumpx, sumpy, sumpz). Write the magnitude of the total momentum, sump.
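
The momentum components needed in steps 2 to 4 are usually not stored directly. Assuming that the file provides each lepton's transverse momentum $p_T$, azimuthal angle $\phi$ and pseudorapidity $\eta$ (the branches lep_pt, lep_phi and lep_eta in this dataset), the Cartesian components of each lepton follow from the standard relations:

$$p_x = p_T\cos\phi, \qquad p_y = p_T\sin\phi, \qquad p_z = p_T\sinh\eta$$

The magnitude of the summed momentum is then:

$$\text{sump} = \sqrt{\text{sumpx}^2 + \text{sumpy}^2 + \text{sumpz}^2}$$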

The invariant mass, M, of a parent particle decaying to two daughter particles is related to properties of the daughter particles by the formula:

$$M^2 = E^2 - p^2$$

where E is the total energy of the daughter particles and p is the magnitude of the vector sum of their momenta. This is written assuming c = 1 (natural units).
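
As a worked example of this formula, the sketch below computes the invariant mass of two made-up leptons in plain Python, without ROOT. The function name `invariant_mass_gev` and the lepton values are invented for illustration, and the inputs are assumed to be in MeV, so the result is divided by 1000 to obtain GeV.

```python
import math

def invariant_mass_gev(leptons):
    """Invariant mass in GeV of a list of (pt, eta, phi, E) tuples given in MeV."""
    sum_e = sum(e for _, _, _, e in leptons)
    sum_px = sum(pt * math.cos(phi) for pt, _, phi, _ in leptons)
    sum_py = sum(pt * math.sin(phi) for pt, _, phi, _ in leptons)
    sum_pz = sum(pt * math.sinh(eta) for pt, eta, _, _ in leptons)
    sum_p2 = sum_px**2 + sum_py**2 + sum_pz**2
    m2 = max(sum_e**2 - sum_p2, 0.0)  # guard against tiny negative rounding errors
    return math.sqrt(m2) / 1000.0     # MeV -> GeV

# Made-up example: two massless leptons emitted back to back in the transverse plane
lepton_1 = (45600.0, 0.0, 0.0, 45600.0)       # pt, eta, phi, E in MeV
lepton_2 = (45600.0, 0.0, math.pi, 45600.0)

mass = invariant_mass_gev([lepton_1, lepton_2])
print("M = {:.1f} GeV".format(mass))
```

Because the two momenta cancel, the total momentum is essentially zero and the invariant mass equals the total energy of 91.2 GeV.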

Time to fill the histogram defined previously, following the instructions above. First we define helper functions for the selections and for the mass calculation, and then we loop over the events:

In [None]:
def has_two_leptons(event):
    return event.lep_n == 2

def has_opposite_charge_leptons(event):
    return event.lep_charge[0] != event.lep_charge[1]

def has_same_family_leptons(event):
    return event.lep_type[0] == event.lep_type[1]

def compute_mll(event):
    # Total energy of the two leptons
    sumE = event.lep_E[0] + event.lep_E[1]

    # x components: px = pt * cos(phi)
    px_leading = event.lep_pt[0]*TMath.Cos(event.lep_phi[0])
    px_subleading = event.lep_pt[1]*TMath.Cos(event.lep_phi[1])

    # y components: py = pt * sin(phi)
    py_leading = event.lep_pt[0]*TMath.Sin(event.lep_phi[0])
    py_subleading = event.lep_pt[1]*TMath.Sin(event.lep_phi[1])

    # z components: pz = pt * sinh(eta)
    pz_leading = event.lep_pt[0]*TMath.SinH(event.lep_eta[0])
    pz_subleading = event.lep_pt[1]*TMath.SinH(event.lep_eta[1])

    # Components and magnitude of the summed momentum vector
    sumpx = px_leading + px_subleading
    sumpy = py_leading + py_subleading
    sumpz = pz_leading + pz_subleading
    sump = TMath.Sqrt(TMath.Power(sumpx,2) + TMath.Power(sumpy,2) + TMath.Power(sumpz,2))

    # Invariant mass from M^2 = E^2 - p^2; dividing by 1000 converts MeV to GeV
    Mll = TMath.Sqrt((TMath.Power(sumE,2) - TMath.Power(sump,2)))/1000.

    return Mll

# Main code
for event in tree:
    if has_two_leptons(event) and has_opposite_charge_leptons(event) and has_same_family_leptons(event):
        Mll = compute_mll(event)
        hist.Fill(Mll)

After filling the histogram we want to see the results of the analysis. First we draw the histogram on the canvas, and then we draw the canvas itself so that it is displayed.

In [None]:
hist.SetFillColor(8)  # colour index 8 is a shade of green
hist.Draw()

In [None]:
canvas.Draw()

## Fitting the histogram

Now try to fit the mass resonance of the Z boson using the "Fit" procedure from ROOT. For this we will use the **Breit-Wigner (BW)** function. The BW function describes how the invariant mass of the decay products of an unstable particle is distributed around the mass of that particle, with a spread set by its decay width.

In [None]:
# Define Breit-Wigner function
fBW = ROOT.TF1("fBW", "[0]/((x-[1])*(x-[1])+([2]*[2])/4)", 60, 120)

# Set initial parameter values
fBW.SetParameter(0, hist.GetMaximum()) # Overall normalization
fBW.SetParameter(1, 91.2) # Mass of the Z boson
fBW.SetParameter(2, 2.5)  # Width of the Z boson resonance

# Fit histogram with Breit-Wigner function
fit_result = hist.Fit("fBW","RS")

# Print the fit parameters
print("Normalization: {}".format(fBW.GetParameter(0)))
print("Mass: {}".format(fBW.GetParameter(1)))
print("Width: {}".format(fBW.GetParameter(2)))
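
The roles of the three parameters can be checked by evaluating the formula directly. The sketch below is plain Python and independent of ROOT; the function name `breit_wigner` and the numbers are chosen for illustration only. It shows that, for this parametrisation, the peak height at x = [1] is 4·[0]/[2]² and that the curve drops to half of its peak height at a distance [2]/2 from the peak, so [2] is the full width at half maximum.

```python
def breit_wigner(x, norm, mass, width):
    """Same expression as the TF1 formula: [0]/((x-[1])^2 + [2]^2/4)."""
    return norm / ((x - mass) ** 2 + width ** 2 / 4.0)

norm, mass, width = 100.0, 91.2, 2.5  # illustrative values, not fit results

peak = breit_wigner(mass, norm, mass, width)
half = breit_wigner(mass + width / 2.0, norm, mass, width)

print("peak height:", peak)               # equals 4 * norm / width**2
print("height at mass + width/2:", half)  # half of the peak height
```

One consequence is that a starting value of roughly `hist.GetMaximum() * width**2 / 4` for parameter 0 would match the peak height of the histogram more closely than `hist.GetMaximum()` itself.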

In [None]:
fBW.Draw("sames")

In [None]:
canvas.Draw()

Extending the range of the y-axis would provide a clearer view of the fitting curve:

In [None]:
hist.GetYaxis().SetRangeUser(0, hist.GetMaximum()+0.4*hist.GetMaximum())

In [None]:
canvas.Draw()

**Question:** How can we improve the fitting procedure above?