# Calculating a new variable from existing variables in the ROOT file

In this example, we calculate the $\Delta M = m(D^*) - m(D^0)$ variable in the $B^0 \to D^{*-} \pi^+ \pi^+ \pi^-$ Monte Carlo sample. In the $D^* \to D^0 \pi$ decay, there is very little phase space left over for the pion produced in the decay, because the mass difference between the $D^*$ and $D^0$ is only slightly more than the pion mass. Thus, we should see a strong peak in the $\Delta M$ distribution, which helps us to select $D^*$'s. 
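
To put a rough number on "very little phase space", here is a minimal sketch of the arithmetic. The masses are approximate PDG values typed in by hand (an assumption, they are not read from our ROOT files), so treat the result as indicative only:

```python
# Approximate PDG masses in MeV. These are assumed reference values,
# typed in by hand, not taken from the ROOT files used in this example.
M_DSTAR = 2010.26  # D*(2010)+
M_D0 = 1864.84     # D0
M_PI = 139.57      # charged pion

# Mass difference between the D* and the D0
delta_m = M_DSTAR - M_D0

# Energy left over once the pion mass has been paid for
q_value = delta_m - M_PI

print(f"Delta M = {delta_m:.2f} MeV")
print(f"Energy release = {q_value:.2f} MeV")
```

Only about 6 MeV is left to share between the decay products, which is why the $\Delta M$ peak is so narrow.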

First, we import the packages we need:

In [1]:
from root_pandas import read_root
import matplotlib.pyplot as plt
from bd2dst3pi.locations import loc
from bd2dst3pi.definitions import years, magnets
import numpy as np

#Gives us nice LaTeX fonts in the plots
from matplotlib import rc
rc('font',**{'family':'serif','serif':['Roman']})
rc('text', usetex=True)

Welcome to JupyROOT 6.22/02


Now we load the $B^0 \to D^{*-} \pi^+ \pi^+ \pi^-$ Monte Carlo sample, which is a pure sample of simulated decays:

In [3]:
file_list = []
for y in years:
    for m in magnets:
        file_list.append(f"{loc.MC}/Bd_Dst3pi_11266018_{y}_{m}_Sim09e-ReDecay01.root")
tree_name = "DecayTree"
# Only read the branches we need (named `columns` to avoid shadowing the built-in `vars`)
columns = ["D0_M","Dst_M"]
df = read_root(file_list, tree_name, columns=columns)

We can check how many MC events we have in our DataFrame:

In [4]:
n_events = len(df)
n_events

23692

Let's calculate the $\Delta M$ variable. In `pandas` this is super-simple - no loops like in traditional `ROOT`, no need for `SetBranchAddress` or declaring new branches! We simply define a new column in our DataFrame:

In [5]:
df["Delta_M"] = df["Dst_M"] - df["D0_M"]

We can check some properties of our new variable, like its mean:

In [6]:
mu = df["Delta_M"].mean()
print(f"Delta_M mu = {mu:.2f}")

Delta_M mu = 145.46


**Follow-up tasks**

- At an earlier stage, our files have had a $143 < \Delta M < 148$ MeV cut applied to them. Can you plot the `Delta_M` variable we just calculated to check this?
- How does the $\Delta M$ distribution compare in data and MC?
- Can you make a ratio of the data and MC distributions, to check how consistent they are? Try to make a `numpy` histogram for each one, and then you can divide them.
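
As a starting point for the last task, here is a minimal sketch of a histogram ratio. It uses toy Gaussian samples in place of the real `Delta_M` columns, and the names `data_dm` and `mc_dm`, the widths, and the 25-bin choice are all assumptions made for illustration. The key points are that both histograms share the same bin edges, and that empty MC bins are masked rather than divided by:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Toy stand-ins for the data and MC Delta_M columns (hypothetical values;
# in practice these would be e.g. the "Delta_M" column of each DataFrame)
data_dm = rng.normal(loc=145.45, scale=0.8, size=5000)
mc_dm = rng.normal(loc=145.45, scale=0.7, size=20000)

# Use identical bin edges for both samples, spanning the 143-148 MeV window
bins = np.linspace(143.0, 148.0, 26)  # 26 edges -> 25 bins
data_counts, _ = np.histogram(data_dm, bins=bins)
mc_counts, _ = np.histogram(mc_dm, bins=bins)

# Normalise each histogram to unit sum so that the shapes are compared,
# not the sample sizes
data_norm = data_counts / data_counts.sum()
mc_norm = mc_counts / mc_counts.sum()

# Divide bin by bin, leaving NaN wherever the MC histogram is empty
ratio = np.full(len(mc_norm), np.nan)
np.divide(data_norm, mc_norm, out=ratio, where=mc_norm > 0)

# Bin centres, useful as the x-values when plotting the ratio
centres = 0.5 * (bins[:-1] + bins[1:])
```

The ratio can then be drawn with, for example, `plt.step(centres, ratio, where="mid")`. A flat ratio close to 1 would indicate that the data and MC shapes agree.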