# Analysis of the di-muon spectrum using CMS Open Data in binder

This analysis processes events from the CMS experiment taken in 2012 and extracts a di-muon spectrum from the data.

The dataset contains the following columns.

| Column name | Data type | Description |
|-------------|-----------|-------------|
| `nMuon` | `unsigned int` | Number of muons in this event |
| `Muon_pt` | `float[nMuon]` | Transverse momentum of the muons stored as an array of size `nMuon` |
| `Muon_eta` | `float[nMuon]` | Pseudo-rapidity of the muons stored as an array of size `nMuon` |
| `Muon_phi` | `float[nMuon]` | Azimuth of the muons stored as an array of size `nMuon` |
| `Muon_charge` | `int[nMuon]` | Charge of the muons (either -1 or 1) stored as an array of size `nMuon` |
| `Muon_mass` | `float[nMuon]` | Mass of the muons stored as an array of size `nMuon` |
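
To make the column layout concrete, the sketch below models two events as plain Python dictionaries. The numbers are invented for illustration and are not taken from the dataset. It shows only that every `Muon_*` array in an event has exactly `nMuon` entries. The helper name `arrays_match_nmuon` is hypothetical.

```python
# Hypothetical events mirroring the column layout in the table above.
# All values are made up for illustration; they are not real CMS data.
events = [
    {
        "nMuon": 2,
        "Muon_pt": [10.8, 23.1],
        "Muon_eta": [1.07, -0.43],
        "Muon_phi": [-0.03, 2.71],
        "Muon_charge": [-1, 1],
        "Muon_mass": [0.10566, 0.10566],
    },
    {
        "nMuon": 1,
        "Muon_pt": [5.2],
        "Muon_eta": [0.31],
        "Muon_phi": [1.92],
        "Muon_charge": [1],
        "Muon_mass": [0.10566],
    },
]


def arrays_match_nmuon(event):
    """Return True if every per-muon array has exactly nMuon entries."""
    return all(
        len(values) == event["nMuon"]
        for name, values in event.items()
        if name.startswith("Muon_")
    )


for event in events:
    print(event["nMuon"], arrays_match_nmuon(event))
```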

In [None]:
import ROOT

## Create dataframe from NanoAOD files

In [None]:
df = ROOT.RDataFrame("Events", "root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root")

## Select events with exactly two muons

In [None]:
df_2mu = df.Filter("nMuon == 2", "Events with exactly two muons")

## Select events with two muons of opposite charge

In [None]:
df_os = df_2mu.Filter("Muon_charge[0] != Muon_charge[1]", "Muons with opposite charge")
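
The two `Filter` calls above are applied event by event. As a rough illustration of the selection logic only, and not of how RDataFrame evaluates it internally, the same decision can be written as a plain Python function. The function name `passes_selection` is hypothetical.

```python
def passes_selection(n_muon, muon_charge):
    """Plain-Python sketch of the two selection steps.

    n_muon      -- number of muons in the event (the nMuon column)
    muon_charge -- list of muon charges, each -1 or 1 (the Muon_charge column)
    """
    # First filter: keep events with exactly two muons.
    if n_muon != 2:
        return False
    # Second filter: the two muons must carry opposite charge.
    return muon_charge[0] != muon_charge[1]


print(passes_selection(2, [-1, 1]))
print(passes_selection(2, [1, 1]))
print(passes_selection(3, [-1, 1, 1]))
```

Checking the muon count first matters: indexing `muon_charge[1]` is only safe once the event is known to contain two muons, which is also why the charge filter is chained after the `nMuon == 2` filter.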

## Compute invariant mass of the dimuon system

The following code just-in-time compiles a C++ function that computes the invariant mass of the two muons, so that the function can be called in the `Define` node of the ROOT dataframe.

In [None]:
ROOT.gInterpreter.Declare(
"""
using namespace ROOT::VecOps;
float computeInvariantMass(const RVec<float>& pt, const RVec<float>& eta, const RVec<float>& phi, const RVec<float>& mass) {
    ROOT::Math::PtEtaPhiMVector m1(pt[0], eta[0], phi[0], mass[0]);
    ROOT::Math::PtEtaPhiMVector m2(pt[1], eta[1], phi[1], mass[1]);
    return (m1 + m2).mass();
}
""")
df_mass = df_os.Define("Dimuon_mass", "computeInvariantMass(Muon_pt, Muon_eta, Muon_phi, Muon_mass)")
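
For readers who want to see the arithmetic behind the four-vector sum, here is a minimal pure-Python sketch of the same calculation. It assumes natural units (c = 1, energies and momenta in GeV) and uses the standard conversion from (pt, eta, phi, mass) to Cartesian components. The function name `invariant_mass` is hypothetical and is not part of ROOT.

```python
import math


def invariant_mass(pt, eta, phi, mass):
    """Invariant mass of the first two particles in the input lists.

    Each particle is converted from (pt, eta, phi, mass) to (E, px, py, pz),
    the four-vectors are summed, and m = sqrt(E^2 - |p|^2) is returned.
    """
    e_tot = px_tot = py_tot = pz_tot = 0.0
    for i in range(2):
        px = pt[i] * math.cos(phi[i])
        py = pt[i] * math.sin(phi[i])
        pz = pt[i] * math.sinh(eta[i])
        energy = math.sqrt(px * px + py * py + pz * pz + mass[i] * mass[i])
        e_tot += energy
        px_tot += px
        py_tot += py
        pz_tot += pz
    m_squared = e_tot**2 - (px_tot**2 + py_tot**2 + pz_tot**2)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))


# Illustrative input: two back-to-back muons at eta = 0 with pt = 45 GeV each.
M_MU = 0.10566  # approximate muon mass in GeV
print(invariant_mass([45.0, 45.0], [0.0, 0.0], [0.0, math.pi], [M_MU, M_MU]))
```

In the back-to-back case the momenta cancel, so the result is simply the sum of the two muon energies.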

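## Restrict analysis to a subset of the full dataset

We restrict the analysis to a subset of the dataset to speed up the runtime of this example. `Range(1000000)` keeps only the first one million events that reach this point in the chain, that is, events that have already passed both filters.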

In [None]:
df_range = df_mass.Range(1000000)

## Book histogram of dimuon mass spectrum

In [None]:
bins = 30000 # Number of bins in the histogram
low = 0.25 # Lower edge of the histogram
up = 300.0 # Upper edge of the histogram
hist = df_range.Histo1D(ROOT.RDF.TH1DModel("", "", bins, low, up), "Dimuon_mass")
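
The histogram uses fixed-width bins, so the values of `bins`, `low` and `up` fully determine the mass resolution of the plot. The sketch below works out the bin width and the bin that a given mass falls into, using ROOT's 1-based numbering of regular bins. The helper `bin_index` is hypothetical and simply returns `None` where ROOT would use its underflow or overflow bin.

```python
bins = 30000  # Number of bins in the histogram
low = 0.25    # Lower edge of the histogram (GeV)
up = 300.0    # Upper edge of the histogram (GeV)

bin_width = (up - low) / bins


def bin_index(x):
    """1-based index of the fixed-width bin containing x, or None if out of range."""
    if x < low or x >= up:
        return None
    return int((x - low) / bin_width) + 1


print(f"bin width: {bin_width * 1000:.2f} MeV")
print(bin_index(low), bin_index(up))
```

Because the bins have equal width in GeV, they look progressively narrower toward high masses once the x axis is drawn on a logarithmic scale.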

## Request cut-flow report

In [None]:
report = df_range.Report()

## Create canvas for plotting

In [None]:
ROOT.gStyle.SetOptStat(0)
ROOT.gStyle.SetTextFont(42)
c = ROOT.TCanvas("c", "", 800, 700)
c.SetLogx()
c.SetLogy();

## Draw histogram

In [None]:
hist.GetXaxis().SetTitle("m_{#mu#mu} (GeV)")
hist.GetXaxis().SetTitleSize(0.04)
hist.GetYaxis().SetTitle("N_{Events}")
hist.GetYaxis().SetTitleSize(0.04)
hist.SetStats(False)
hist.Draw();

## Draw labels

In [None]:
label = ROOT.TLatex()
label.SetTextAlign(22)
label.DrawLatex(0.55, 1.2e3, "#eta")
label.DrawLatex(0.77, 2.5e3, "#rho,#omega")
label.DrawLatex(1.20, 1.5e3, "#phi")
label.DrawLatex(4.40, 5.0e3, "J/#psi")
label.DrawLatex(4.60, 6.0e2, "#psi'")
label.DrawLatex(12.0, 8.0e2, "Y(1,2,3S)")
label.DrawLatex(91.0, 6.0e2, "Z")
label.SetNDC(True)
label.SetTextAlign(11)
label.SetTextSize(0.04)
label.DrawLatex(0.10, 0.92, "#bf{CMS Open Data}")
label.SetTextAlign(31)
label.DrawLatex(0.90, 0.92, "#sqrt{s} = 8 TeV, L_{int} = 11.6 fb^{-1}");

## Display plot

In [None]:
%jsroot on
c.Draw()

## Print cut-flow report

In [None]:
report.Print()
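
A cut-flow report lists, for each named filter, how many events it saw and how many it passed. As a hedged illustration of how per-cut and cumulative efficiencies relate, the sketch below computes both from invented counts. The function `cutflow` and all numbers are hypothetical; they are not output from this analysis.

```python
def cutflow(total, passed_counts):
    """Compute efficiencies for a sequence of cuts applied one after another.

    total         -- number of events before any cut
    passed_counts -- list of (cut name, events passing that cut) in order
    Returns a list of (name, passed, seen, efficiency %, cumulative efficiency %).
    """
    rows = []
    seen = total
    for name, passed in passed_counts:
        eff = 100.0 * passed / seen if seen else 0.0
        cumulative = 100.0 * passed / total if total else 0.0
        rows.append((name, passed, seen, eff, cumulative))
        # The next cut only sees the events that survived this one.
        seen = passed
    return rows


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
rows = cutflow(
    1000,
    [
        ("Events with exactly two muons", 500),
        ("Muons with opposite charge", 400),
    ],
)
for name, passed, seen, eff, cumulative in rows:
    print(f"{name}: pass={passed} all={seen} eff={eff:.2f} % cumulative eff={cumulative:.2f} %")
```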