## Uproot

The standard way to read and write ROOT files in Python is the well-known (or not) ROOT library itself, but it can be difficult to understand; at least, I had that problem when I started using it.

Fortunately, there is an alternative that I found easier to work with: Uproot. Uproot is a library for reading and writing ROOT files using only Python and NumPy, which most users know better.

It is an I/O (input/output) library, which means it is primarily designed for reading and writing data. Its focus is on getting data out of ROOT files and into machine learning libraries in Python.

Uproot does not depend on C++ ROOT. It works with the data using NumPy, "converting" the contents of the ROOT file into NumPy arrays.

Uproot is easy to install. You can install it with pip: `pip install uproot awkward`

The Awkward Array library is not strictly necessary, but it is recommended for manipulating JSON-like data with NumPy-like idioms (this is, in fact, how the library describes itself).

More about Uproot can be found at: https://uproot.readthedocs.io/en/latest/index.html

The getting started guide is at: https://uproot.readthedocs.io/en/latest/basic.html

## About ROOT

From the ROOT primer you'll see that:
"ROOT is a software framework for data analysis and I/O: a powerful tool to cope with the demanding tasks typical of state of the art scientific data analysis."

ROOT is best known as the software used to store the data of most Large Hadron Collider experiments, but it also has other features, such as an advanced graphical user interface used for analysis.

ROOT is an open source project coordinated by the European Organisation for Nuclear Research (CERN) in Geneva. It was created to satisfy the need for a tool that can, among other things, visualize and manipulate data, provide a powerful library of mathematical functions and procedures, visualize the uncertainties of experiments, analyze large volumes of collected data, offer easy-to-use and efficient methods for storing and handling data, generate and visualize frequency distributions, and simulate data (generation of pseudo-data).

Many ROOT functions are designed in C++, so a little knowledge of that language is needed. ROOT also has a Python equivalent whose workflow is very similar to the C++ one, so it is possible to work with the ROOT library from Python as well.

More about ROOT can be found at: https://root.cern/primer/

## About Delphes

Delphes is a C++ framework for simulating generic collider experiments. It performs a fast, multipurpose detector response simulation that includes a tracking system embedded in a magnetic field, calorimeters, and a muon system.

The Delphes framework works with standard file formats such as the Les Houches Event File (a standard that event generators use to define event listings in a common language) and HepMC, and it outputs observables that can be used for analyses. The simulation takes into account, among other things, the effect of the magnetic field.

Visualization of the final-state particles is built in, using the ROOT library. The output can be analyzed with the ROOT data analysis framework or, alternatively, with Uproot (which is what is done in this work). In most cases the analysis can be done with `TTree::Draw` or with ROOT macros.

##### The DELPHES 3 collaboration, de Favereau, J., Delaere, C. et al. DELPHES 3: a modular framework for fast simulation of a generic collider experiment. J. High Energ. Phys. 2014, 57 (2014). https://doi.org/10.1007/JHEP02(2014)057

## About TTrees

TTree is a class of the ROOT framework used to store structured sets of data in high energy physics. It helps manage huge data volumes by dividing them into independent branches (TBranch), which allows automatic compression and efficient reading of specific values without loading the whole file into memory.

A more complete reference for the TTree class in ROOT can be found at: https://root.cern.ch/doc/v636/classTTree.html

In [1]:
import uproot
import awkward as ak
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

In [2]:
# Import the .root file
file = uproot.open("/tf/Higgs-Boson-LHC-Collision-Detector/sigfcc_350.root")
# Check the file
print(file.keys())

['ProcessID0;1', 'ProcessID1;1', 'Delphes;1']


In [3]:
# Open the tree Delphes, which contains the necessary information of the simulation
tree = file["Delphes"]

# tree.show() 

In [4]:
# Check the keys of the tree so you know what you are going to work with
tree.keys()

['Event',
 'Event/Event.fUniqueID',
 'Event/Event.fBits',
 'Event/Event.Number',
 'Event/Event.ReadTime',
 'Event/Event.ProcTime',
 'Event/Event.ProcessID',
 'Event/Event.MPI',
 'Event/Event.Weight',
 'Event/Event.CrossSection',
 'Event/Event.CrossSectionError',
 'Event/Event.Scale',
 'Event/Event.AlphaQED',
 'Event/Event.AlphaQCD',
 'Event/Event.ID1',
 'Event/Event.ID2',
 'Event/Event.X1',
 'Event/Event.X2',
 'Event/Event.ScalePDF',
 'Event/Event.PDF1',
 'Event/Event.PDF2',
 'Event_size',
 'Weight',
 'Weight/Weight.fUniqueID',
 'Weight/Weight.fBits',
 'Weight/Weight.Weight',
 'Weight_size',
 'Particle',
 'Particle/Particle.fUniqueID',
 'Particle/Particle.fBits',
 'Particle/Particle.PID',
 'Particle/Particle.Status',
 'Particle/Particle.IsPU',
 'Particle/Particle.M1',
 'Particle/Particle.M2',
 'Particle/Particle.D1',
 'Particle/Particle.D2',
 'Particle/Particle.Charge',
 'Particle/Particle.Mass',
 'Particle/Particle.E',
 'Particle/Particle.Px',
 'Particle/Particle.Py',
 'Particle/Parti
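Since the key list is long, it can help to filter it with plain Python before picking branches. The sketch below uses a short hand-written list containing a few of the keys printed above; in the notebook you would presumably start from `tree.keys()` instead.

```python
# A few keys copied from the output above; in the notebook you would
# use `keys = tree.keys()` instead of this hand-written list.
keys = [
    "Event",
    "Event/Event.Number",
    "Event_size",
    "Particle",
    "Particle/Particle.PID",
    "Particle/Particle.E",
    "Particle/Particle.Px",
    "Particle/Particle.Py",
]

# Top-level branches have no "/" in their name.
top_level = [k for k in keys if "/" not in k]

# Sub-branches of "Particle", with the "Particle/Particle." prefix stripped.
prefix = "Particle/Particle."
particle_fields = [k[len(prefix):] for k in keys if k.startswith(prefix)]

print(top_level)        # ['Event', 'Event_size', 'Particle']
print(particle_fields)  # ['PID', 'E', 'Px', 'Py']
```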

In [None]:
# In this case we're going to check the distribution of the eta, phi and p_T values for all events
# First we check what's inside the Particle branch
tree["Particle"].show()

In [None]:
# This way we select the data that is important for us: Particle.Eta, Particle.Phi & Particle.PT:

eta = tree["Particle/Particle.Eta"].array(library="ak")
phi = tree["Particle/Particle.Phi"].array(library="ak")
pt = tree["Particle/Particle.PT"].array(library="ak")
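For readers who are not familiar with these three variables, the sketch below computes them from Cartesian momentum components using the standard collider definitions: $p_T = \sqrt{p_x^2 + p_y^2}$, $\phi = \operatorname{atan2}(p_y, p_x)$ and $\eta = \operatorname{arsinh}(p_z / p_T)$. The function name `kinematics` and the input values are made up for illustration; the values do not come from the Delphes file.

```python
import math


def kinematics(px, py, pz):
    """Return (pT, eta, phi) from Cartesian momentum components."""
    pt = math.hypot(px, py)    # transverse momentum
    phi = math.atan2(py, px)   # azimuthal angle in the transverse plane
    eta = math.asinh(pz / pt)  # pseudorapidity; requires pt > 0
    return pt, eta, phi


# Made-up momentum components, not taken from the Delphes file.
pt_ex, eta_ex, phi_ex = kinematics(px=3.0, py=4.0, pz=0.0)
print(pt_ex, eta_ex, phi_ex)
```

A particle with no momentum along the beam axis ($p_z = 0$) has $\eta = 0$, and $|\eta|$ grows as the particle direction gets closer to the beam.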

In [None]:
# If we check the data we'll see that it is grouped by event, with one value per particle in each event
eta

In [None]:
# Since we want the distribution over all the data, we flatten the arrays so that
# we can plot the particles from all events together

eta_all = ak.flatten(eta)
phi_all = ak.flatten(phi)
pt_all = ak.flatten(pt)

# Notice the difference
eta_all

In [None]:
# Now we graph the distribution of the data
# First we'll do it with the Matplotlib library

plt.hist(eta_all, bins = 3)
plt.xlabel(r"$\eta$")
plt.ylabel("Particles")
plt.show()

plt.hist(phi_all, bins = 40)
plt.xlabel(r"$\phi$")
plt.ylabel("Particles")
plt.show()

plt.hist(pt_all, bins = 100)
plt.xlabel(r"$p_T$")
plt.ylabel("Particles")
plt.show()
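The `bins` argument sets how many equal-width intervals the data range is divided into, so it controls how much detail the histogram can show. The sketch below illustrates this with `np.histogram` (which `plt.hist` uses to bin the data) on a made-up sample; the values and the range are assumptions chosen for illustration.

```python
import numpy as np

# Made-up sample with two separate clusters, one negative and one positive.
values = np.array([-2.4, -2.2, -1.7, -1.5, 1.6, 1.8, 2.2, 2.4, 2.6])

# 3 equal-width bins over [-3, 3]: edges at -3, -1, 1, 3.
coarse_counts, coarse_edges = np.histogram(values, bins=3, range=(-3.0, 3.0))

# 6 equal-width bins over the same range: edges at -3, -2, -1, 0, 1, 2, 3.
fine_counts, fine_edges = np.histogram(values, bins=6, range=(-3.0, 3.0))

print(coarse_counts)  # [4 0 5]
print(fine_counts)    # [2 2 0 0 2 3]
```

With more bins, the shape inside each cluster becomes visible; with very few bins, only the coarsest features of the distribution survive.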

In [None]:
# Now we'll plot the same distributions with the seaborn library

# First, since Seaborn expects a 1D NumPy array, we have to convert the Awkward arrays into NumPy arrays,
# which is actually pretty simple

eta_numpy = ak.to_numpy(eta_all)
phi_numpy = ak.to_numpy(phi_all)
pt_numpy = ak.to_numpy(pt_all)

# Then we can graph

sns.histplot(x=eta_numpy, bins = 3, kde = True)
plt.xlabel(r"$\eta$")
plt.ylabel("Particles")
plt.show()

sns.histplot(x=phi_numpy, bins = 40, kde = True)
plt.xlabel(r"$\phi$")
plt.ylabel("Particles")
plt.show()

sns.histplot(x = pt_numpy, bins = 100, kde = True)
plt.xlabel(r"$p_T$")
plt.ylabel("Particles")
plt.show()

In [None]:
# Since we want to make the distribution plots for all the particles, we can do this as follows:

