# 2. Accessing Truth Information

An important aspect at this stage in the design of EIC experiments is the comparison of the so-called 'truth' information with simulated variables. In this notebook we will go over how to access the truth information in an ATHENA event, which uses a data model similar to the EDM4hep-based data model used for the EIC Detector 1.

## Importing uproot

Depending on the versions of uproot and XRootD that you have installed, you may encounter a warning from uproot below. Because of the simple data format of the ROOT files, we can safely ignore this warning.

In [None]:
import uproot as ur
print('Uproot version: ' + ur.__version__)

## Opening a file with uproot

To test uproot, we will open a sample file (a single-particle simulation of interest to those who wish to study detector performance):

In [None]:
server = 'root://sci-xrootd.jlab.org//osgpool/eic/'
file = 'ATHENA/RECO/master/SINGLE/pi+/1GeV/3to50deg/pi+_1GeV_3to50deg.0001.root'

In [None]:
# Open the 'events' tree; the .array() calls below return awkward arrays by default
events = ur.open(server + file + ':events')
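The string passed to `ur.open` above is simply the server, the file path, and the tree name joined with a colon. If several files from the same production are needed, that string can be assembled programmatically. The sketch below is only an illustration: the helper `build_path` is hypothetical, and it assumes that further files follow the same zero-padded numbering as the `.0001.root` file used here, which is not confirmed by this notebook.

```python
# Hypothetical helper: assumes other files in the same directory follow
# the same zero-padded numbering as the '.0001.root' file used above.
server = 'root://sci-xrootd.jlab.org//osgpool/eic/'
directory = 'ATHENA/RECO/master/SINGLE/pi+/1GeV/3to50deg/'

def build_path(index, tree='events'):
    """Return a 'server + file + :tree' string of the form passed to ur.open."""
    filename = f'pi+_1GeV_3to50deg.{index:04d}.root'
    return f'{server}{directory}{filename}:{tree}'

# Build the paths for the first three (assumed) file indices
paths = [build_path(i) for i in range(1, 4)]
for path in paths:
    print(path)
```

Each entry of `paths` could then be opened in the same way as the single file above, provided the file actually exists on the server.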

## Truth information in the `mcparticles` branch

Truth information is stored in the `mcparticles` branch. This includes all *steps* in the simulation, subject to certain conditions. For the purpose of end-user analysis, the conditions are essentially that only steps by primary particles are included.

Several fields are available for the truth information:

In [None]:
events['mcparticles'].keys()

Besides the Particle Data Group code `pdgID`, the parent track `g4Parent`, and the generator status `genStatus`, you will also see the vertex (`v`) and momentum (`p`) at the start (`s`) and end (`e`) of each step. Thus, `ps.x` corresponds to the `x` component of the momentum at the start of a step. Let's retrieve these starting momenta, as well as the `pdgID`, `g4Parent`, and `genStatus` codes:

In [None]:
pdgID = events['mcparticles.pdgID'].array()
g4Parent = events['mcparticles.g4Parent'].array()
genStatus = events['mcparticles.genStatus'].array()
# Momentum components at the start of each step
psx = events['mcparticles.ps.x'].array()
psy = events['mcparticles.ps.y'].array()
psz = events['mcparticles.ps.z'].array()
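The naming convention described above (quantity `v` or `p`, followed by `s` or `e`, followed by the component) can be spelled out with a short loop. This is a sketch under an assumption: only `ps.x`, `ps.y`, and `ps.z` are used in this notebook, so the other names (`vs`, `ve`, `pe`) are inferred from the convention and should be checked against the output of `keys()`.

```python
# Build the branch names for vertex (v) and momentum (p) at the
# start (s) and end (e) of a step, following the naming convention.
# Names other than 'mcparticles.ps.*' are inferred, not verified.
quantities = {'v': 'vertex', 'p': 'momentum'}
points = {'s': 'start', 'e': 'end'}
components = ['x', 'y', 'z']

branch_names = {}
for q, q_label in quantities.items():
    for pt, pt_label in points.items():
        for c in components:
            name = f'mcparticles.{q}{pt}.{c}'
            branch_names[name] = (
                f'{c} component of the {q_label} at the {pt_label} of the step'
            )

for name, meaning in branch_names.items():
    print(f'{name:20s} {meaning}')
```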

As expected, for this file the `pdgID` corresponds to that of a pion.

In [None]:
pdgID[100]

In [None]:
print(g4Parent[100])
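To read the printed codes, it helps to translate PDG numbers into particle names; in the PDG numbering scheme a positive pion has code 211. The sketch below uses a small hand-written lookup table and made-up toy values rather than data from the file. It also assumes that `g4Parent == 0` marks a primary particle, which is the selection used in the plotting cells of this notebook.

```python
# Small, hand-written lookup of a few PDG Monte Carlo particle codes
PDG_NAMES = {
    11: 'e-', -11: 'e+', 22: 'gamma',
    111: 'pi0', 211: 'pi+', -211: 'pi-',
    2112: 'neutron', 2212: 'proton',
}

def pdg_name(code):
    """Return a readable name for a PDG code, or a marker if it is not in the table."""
    return PDG_NAMES.get(code, f'unknown ({code})')

# Toy stand-in for the entries of one event: made-up values, not read from the file
toy_pdgID = [211, 22, 11]
toy_g4Parent = [0, 1, 1]

# Keep only entries without a parent (assumed to be the primary particles)
primaries = [
    pdg_name(code)
    for code, parent in zip(toy_pdgID, toy_g4Parent)
    if parent == 0
]
print(primaries)
```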

And indeed, the magnitude of the momentum of this pion is 1 GeV. We import the `numpy` library to use its `sqrt` function.

In [None]:
import numpy as np
p = np.sqrt(psx**2 + psy**2 + psz**2)

In [None]:
p[100]
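The quantity `p` computed above is the magnitude of the three-momentum. The total energy additionally involves the particle mass through $E = \sqrt{p^2 + m^2}$ (in natural units, $c = 1$). A minimal sketch, assuming a charged pion mass of about 0.13957 GeV and a momentum of exactly 1 GeV as a stand-in for the value read from the file:

```python
import math

M_PION = 0.13957  # charged pion mass in GeV (rounded value, assumed here)

def total_energy(p, mass=M_PION):
    """Total energy in GeV for momentum magnitude p in GeV (natural units)."""
    return math.sqrt(p**2 + mass**2)

p_example = 1.0                     # stand-in for p[100], in GeV
energy = total_energy(p_example)    # total energy
kinetic = energy - M_PION           # kinetic energy
print(f'E = {energy:.4f} GeV, T = {kinetic:.4f} GeV')
```

For a 1 GeV pion the difference between momentum and total energy is at the percent level, so the distinction mostly matters for heavier particles or lower momenta.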

## Making a simple plot

We can now create a simple plot of the angular (theta) distribution of the generated particles. Note that we have to "flatten" the array again using `awkward` before we can plot it.

In [None]:
import awkward as ak
theta = np.arctan2(np.sqrt(psx**2 + psy**2), psz)
ak.flatten(theta[g4Parent == 0])

In [None]:
import matplotlib.pyplot as plt
plt.hist(ak.flatten(theta[g4Parent == 0]), bins = 50)
plt.xlabel('Initial Scattering Angle $\\theta$ [rad]')
plt.ylabel('Number of events')
plt.show()