## Stage 2: Using Python

Python makes our lives easier but, if you are not careful, often also slower. To avoid this, the following example shows how to combine the best of both worlds using `RDataFrame` and ROOT's C++ interpreter `cling`.

In [None]:
import ROOT

## Create a ROOT dataframe in Python

The Python interface is mapped directly from the C++ API, so you can easily guess how to write the Python version of the C++ code introduced before. Since we talk to a C++ library in the background, PyROOT can create any C++ type in Python out of the box, so you can pass data to your C++ library. Here, you can see how this maps to a `std::vector<string>` used as a file list for the ROOT dataframe.

In [None]:
files = ROOT.std.vector("string")()
files.push_back("root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root")
files.push_back("root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012C_DoubleMuParked.root")
df = ROOT.RDataFrame("Events", files)

## Filter relevant events for this analysis

**Fill in the correct expressions to select ...**

1. Events with exactly two muons
2. Events with muons of opposite charge

Same as in the C++ example!

In [None]:
df_2mu = df.Filter("do something with nMuon", "Events with exactly two muons")
df_os = df_2mu.Filter("do something with Muon_charge", "Muons with opposite charge")

## Perform complex operations in Python, efficiently!

We still want to perform complex operations in Python, but plain Python code tends to be slow and is not thread-safe. Instead, you can inject C++ functions that do the work in your event loop at runtime. This mechanism uses the C++ interpreter `cling` shipped with ROOT, which makes this possible in a single line of code.

Note that here we use the `Define` node of the computation graph with a jitted expression that calls into the code declared with the interpreter. This allows you to implement the computationally expensive parts of your event loop in C++ directly from Python.

In [None]:
ROOT.gInterpreter.Declare(
    """
    using Vec_t = const ROOT::VecOps::RVec<float>&;
    float compute_mass(Vec_t pt, Vec_t eta, Vec_t phi, Vec_t mass) {
        ROOT::Math::PtEtaPhiMVector p1(pt[0], eta[0], phi[0], mass[0]);
        ROOT::Math::PtEtaPhiMVector p2(pt[1], eta[1], phi[1], mass[1]);
        return (p1 + p2).mass();
    }
    """)
df_mass = df_os.Define("Dimuon_mass", "compute_mass(Muon_pt, Muon_eta, Muon_phi, Muon_mass)")
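
To make the physics inside `compute_mass` explicit, here is a minimal plain-Python sketch of the same calculation: each muon's (pt, eta, phi, mass) is converted to Cartesian four-momentum components, the two four-vectors are added, and the invariant mass of the sum is returned. The names `to_cartesian` and `compute_mass_py` and the example muon values are made up for illustration and are not part of ROOT or the dataset. The sketch uses double precision, so results can differ slightly from the `float` version above.

```python
import math


def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to Cartesian components (px, py, pz, E)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return px, py, pz, energy


def compute_mass_py(pt, eta, phi, mass):
    """Invariant mass of the first two entries (hypothetical helper)."""
    px1, py1, pz1, e1 = to_cartesian(pt[0], eta[0], phi[0], mass[0])
    px2, py2, pz2, e2 = to_cartesian(pt[1], eta[1], phi[1], mass[1])
    # Add the two four-vectors component by component
    px, py, pz, e = px1 + px2, py1 + py2, pz1 + pz2, e1 + e2
    m2 = e * e - (px * px + py * py + pz * pz)
    # Clamp tiny negative values caused by floating-point rounding
    return math.sqrt(max(m2, 0.0))


# Made-up example: two back-to-back muons at eta = 0 with pt = 45 GeV each
MUON_MASS = 0.1057  # approximate muon mass in GeV
example_mass = compute_mass_py(
    [45.0, 45.0], [0.0, 0.0], [0.0, math.pi], [MUON_MASS, MUON_MASS]
)
print(example_mass)
```

A per-event Python function like this is useful for checking individual values by hand, but it is exactly the kind of code that would be slow inside the event loop, which is why the tutorial declares the C++ version instead.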

## Make a histogram and draw the result

The rest of the analysis translates directly to Python. Have a look at the code and compare it to the C++ version.

**Adjust the plotting range to match the C++ version.**

In [None]:
df_range = df_mass.Range(100000)

In [None]:
nbins = 30000
low = 100
up = 300
h = df_range.Histo1D(("Dimuon_mass", "Dimuon_mass", nbins, low, up), "Dimuon_mass")

In [None]:
report = df_range.Report()

In [None]:
%%time
ROOT.gStyle.SetOptStat(0); ROOT.gStyle.SetTextFont(42)
c = ROOT.TCanvas("c", "", 800, 700)
c.SetLogx(); c.SetLogy()
h.SetTitle("")
h.GetXaxis().SetTitle("m_{#mu#mu} (GeV)"); h.GetXaxis().SetTitleSize(0.04)
h.GetYaxis().SetTitle("N_{Events}"); h.GetYaxis().SetTitleSize(0.04)
h.Draw()

label = ROOT.TLatex(); label.SetNDC(True)
label.SetTextSize(0.040); label.DrawLatex(0.100, 0.920, "#bf{CMS Open Data}")
label.SetTextSize(0.030); label.DrawLatex(0.630, 0.920, "#sqrt{s} = 8 TeV, L_{int} = 11.6 fb^{-1}");

In [None]:
%jsroot on
c.Draw()

In [None]:
report.Print()