## Working with ROOT -- python

For interactive work, a Jupyter notebook is convenient for both the Python and the C++ interface.

ROOT home page: https://root.cern/

Detailed ROOT tutorial: https://www.nevis.columbia.edu/~seligman/root-class/

ROOT is not a default Python package; it requires a separate installation and setup:

Select the environment `root/6.24.02` when you start a session on *jupyterhub* at *https://workshop.physik.uni-muenchen.de*.

Then run `import ROOT` to get access to the ROOT classes and functions.

In [None]:
import ROOT
# enable interactive plotting features
%jsroot on 

### Python/ROOT as pocket calculator

In [None]:
sin(0.5) # does not work -- sin not known

In [None]:
import math # need module math
math.sin(0.5)

In [None]:
math.atan(1)*4

In [None]:
# or equivalent using ROOT.TMath funcs:
ROOT.TMath.Sin(0.5)

In [None]:
ROOT.TMath.ATan(1)*4

In [None]:
# check what functions are available
help(ROOT.TMath)

### ROOT classes
ROOT provides a huge class library for functions, histograms, random numbers, statistics, I/O, etc.

In practice, working with ROOT means that you create objects of the respective ROOT classes and then call methods of these objects.

Simple examples:

#### 1D Histogram ####

In [None]:
import ROOT

# book histogram
# arguments: name, title, number of bins, x-low, x-high
myhist1 = ROOT.TH1F("h1","Gauss Random Numbers",100,-3.,3.)
# random generator object
rng = ROOT.TRandom()
# fill histo in loop
for i in range(100000):
    xrnd = rng.Gaus() # Gaussian distributed Random number
    myhist1.Fill( xrnd ) # Fill random number in histogram

# need drawing canvas for jupyter
myc = ROOT.TCanvas("myc","myc",10,10,500,300)
myhist1.Draw() # draw histogram
myc.Draw() # draw canvas
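
To make clear what `Fill` does with each value, here is a minimal pure-Python sketch of the bin lookup. It assumes ROOT's usual numbering for fixed-width bins (0 = underflow, 1..N = regular bins, N+1 = overflow); the helper name `find_bin` is hypothetical and not part of ROOT.

```python
def find_bin(x, nbins, xlow, xhigh):
    """Bin index for x with fixed-width bins (hypothetical helper, not a ROOT call).

    Numbering convention assumed here: 0 = underflow,
    1..nbins = regular bins, nbins + 1 = overflow.
    """
    if x < xlow:
        return 0
    if x >= xhigh:
        return nbins + 1
    return int((x - xlow) * nbins / (xhigh - xlow)) + 1


# same binning as myhist1: 100 channels between -3 and 3
for value in (-4.0, -2.99, 0.01, 2.99, 3.5):
    print(value, "-> bin", find_bin(value, 100, -3.0, 3.0))
```

Values outside the booked range are not lost: they are counted in the underflow and overflow bins, which is why the number of entries can exceed the sum of the visible bins.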

#### 2D Histogram ####

In [None]:
# 2D hist example
myhist2 = ROOT.TH2F("h2","2D Gauss Random Numbers",100,-5.,5.,100,-5.,5.)
# random generator object
rng = ROOT.TRandom()
# fill histo in loop
for i in range(100000):
    xrnd = rng.Gaus() # Gaussian distributed Random number
    yrnd = rng.Gaus()
    myhist2.Fill( xrnd, xrnd+yrnd ) # Fill random number in histogram

# in Jupyter we need to create an explicit canvas ...
c = ROOT.TCanvas("c","myCanvas",50,50,300,200)
myhist2.Draw() # draw histogram as scatter plot
#myhist2.Draw('lego2') # lego plot
# ... and call Draw for Canvas
c.Draw()  

There are many options for drawing a 2D histogram; see https://root.cern/doc/master/draw2dopt_8C.html for more.
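
The 2D example above fills the pair (xrnd, xrnd+yrnd), so the two axes are correlated and the scatter plot shows a tilted ellipse. The following sketch checks this with plain Python instead of ROOT; it assumes unit-width Gaussians as in the example, and the helper `pearson` is hypothetical. For independent x and y of equal width the expected correlation coefficient is 1/sqrt(2).

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

n = 20000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x + random.gauss(0.0, 1.0) for x in xs]  # same construction as xrnd + yrnd


def pearson(a, b):
    """Sample correlation coefficient of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


r = pearson(xs, ys)
print(f"measured correlation: {r:.3f}, expected: {1 / math.sqrt(2):.3f}")
```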


### Drawing functions ###

In [None]:
#myc.Clear()
myc = ROOT.TCanvas("myc","myc",10,10,800,600)
myc.Divide(2,2) # make 2x2 subpads
myc.cd(1)
f1 = ROOT.TF1("f1","sin(x)", 0, 10 )
f1.Draw()
myc.cd(2) # next pad
f2 = ROOT.TF1("f2","cos(x)", 0, 10 )
f2.SetLineColor(1)
f2.Draw()
f1.Draw("same")
mypad=myc.cd(3)
mypad.SetLogy() # log y scale
mypad.SetGrid() # grid lines
f4 = ROOT.TF1("f4","abs(f1)*exp(x)", 0, 10 )
f4.Draw()
myc.Draw()





#### x-y Plot with error bars ####
Scientific data, in particular in physics, often have uncertainties associated with the y-coordinate (sometimes also with x).

In [None]:
import numpy as np
x = np.linspace(0, 10, 50)
y = np.sin(x) + 0.5*np.random.randn(50)
ey = np.ones(len(x))*0.7
tg = ROOT.TGraphErrors(len(x), x, y, 0, ey) # graph with error bars
tg.SetMarkerStyle(20)

c = ROOT.TCanvas("c","myCanvas",500,400)
tg.Draw('AP')
c.Draw()  





Let's define a short helper function that combines canvas creation and object drawing:

In [None]:
def nbdraw( myobj, myopts="", mysize=(500,400) ):
    "helper function for drawing in notebook"
    c = ROOT.TCanvas("myc","myCanvas",*mysize)
    myobj.Draw(myopts)
    c.Draw()  
    return c

    

In [None]:
nbdraw(tg, 'AP')

---
#### Simple fitting examples ####

1. Gaussian fit to a 1D histogram

In [None]:
fr = myhist1.Fit("gaus")
fit = myhist1.GetFunction("gaus")
mean = fit.GetParameter(1)
emean = fit.GetParError(1)
print ( f" {mean = :.5f}  +-  {emean:.5f}")
# need drawing canvas for jupyter    
#myhist1
ROOT.gStyle.SetOptFit(1) # global setting to have fit results in stat box
nbdraw(myhist1)

---
2nd fit example: polynomial fit to the x-y points in the TGraphErrors

*try different polynomial degrees*
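
When comparing polynomial degrees, a common figure of merit is chi2 per degree of freedom (ndf = number of points minus number of fit parameters). The sketch below computes it in plain Python for hypothetical toy points and hand-picked (not fitted) model parameters, so it only illustrates the definition. In ROOT the corresponding numbers for a fitted function are available via its `GetChisquare()` and `GetNDF()` methods.

```python
def chi2_ndf(xs, ys, errs, model, n_params):
    """Return (chi2, ndf) for a model evaluated on points with y-errors."""
    chi2 = sum(((y - model(x)) / e) ** 2 for x, y, e in zip(xs, ys, errs))
    ndf = len(xs) - n_params
    return chi2, ndf


# hypothetical toy points lying close to a straight line
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
errs = [0.2] * len(xs)


def const_model(x):
    """Degree 0: a constant at the mean of the y values (hand-picked, not fitted)."""
    return 5.02


def line_model(x):
    """Degree 1: straight line with hand-picked parameters (not fitted)."""
    return 1.0 + 2.0 * x


chi2_const, ndf_const = chi2_ndf(xs, ys, errs, const_model, n_params=1)
chi2_line, ndf_line = chi2_ndf(xs, ys, errs, line_model, n_params=2)
print(f"constant: chi2/ndf = {chi2_const / ndf_const:.2f}")
print(f"line:     chi2/ndf = {chi2_line / ndf_line:.2f}")
```

A value of chi2/ndf near 1 indicates that the model describes the points within their errors; much larger values point to a poor model, much smaller ones to overestimated errors or too many parameters.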

In [None]:
tgf = tg.Fit("pol3") # fit polynomial of degree 3
nbdraw(tg,'AP')



#### Exercises - 1 ####
ATLAS drift tube positions with ROOT

* Read the data from [rohr1.dat](http://www-static.etp.physik.uni-muenchen.de/kurs/comp20/uebungen/source/rohr1.dat) into ROOT

* **C++ IO**
```
ifstream data_file;
data_file.open("rohr1.dat");
double x;
// book histogram ...
while ( data_file >> x ) // reads next value into x
{ // Fill histo ...
}
```
* **Python IO**
```
import numpy as np
data = np.loadtxt('rohr1.dat') # read all data in numpy array
# book histo ....
for x in data: # loop over data
  # fill histo
```
Create a histogram and fill in the values. (Sample solution: .C, .py)

* Do the same for [rohr2.dat](http://www-static.etp.physik.uni-muenchen.de/kurs/comp20/uebungen/source/rohr2.dat) using a 2D histogram (*scatter plot*).

```
TH2F h2("h","mytitle",nx,xlow,xhigh,ny,ylow,yhigh);
...
h2.Fill(x,y);
```
* **C++ IO for x,y**
```
...
while ( data_file >> x >> y ) // reads next value pair into x and y
```
* **Python IO for x,y**
```
...
data2 = np.loadtxt('rohr2.dat') # read all data in numpy array
...
for x,y in data2: # loop over data2 entries 
```


---

### Simple data analysis ###

ROOT enables fast and efficient analysis of very large amounts of data.
In principle, data can of course be saved in simple ASCII or .csv formats and read in and processed with ordinary Python or C/C++ commands. However, this is not very efficient in terms of either storage space or I/O speed.
ROOT ntuples are much more efficient: the data and their properties (variables) are structured in the so-called tree format (trees).

ROOT trees are optimized for storing and efficiently processing particle physics event data.

Typically, events recorded in the detector are stored and processed in several formats of decreasing detail:

* **Raw data** contain all the detailed information of the sub-detectors, e.g. the drift times of each tube in the muon spectrometer that registered a signal. This requires a complex, deep tree structure:
   * ATLAS -> muon spectrometer -> chamber ID -> tube ID -> drift time



* **Reconstructed data:** tracks in the tracking detectors, blocks or clusters in the calorimeters, reconstructed decay vertices, ...
   * Still requires a complex tree structure


* **Summary data:** reconstructed abstract objects: jets, leptons, photons, missing momentum vector, ...
   * Still nested data: there can be 1, 2, ... n jets, leptons, ...


* **Global summary:** number of tracks or jets, energy in calorimeter, ... 
   * Flat table layout, fixed number of parameters (=columns) per event is sufficient for easy characterization.


HEP data analysis at the final stage usually uses the latter format.
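
The difference between nested summary data and a flat global summary can be sketched with plain Python dictionaries. All field names and values below are hypothetical and only illustrate the idea: a variable number of objects per event is reduced to a fixed set of columns.

```python
# hypothetical "summary data" event: variable-length lists of objects
event = {
    "jets": [{"pt": 45.2, "eta": 0.3}, {"pt": 30.1, "eta": -1.2}],
    "leptons": [{"pt": 25.0, "eta": 0.8, "charge": -1}],
}


def flatten(event):
    """Reduce a nested event to a fixed set of columns (global summary)."""
    jets = event["jets"]
    return {
        "n_jets": len(jets),
        "n_leptons": len(event["leptons"]),
        "leading_jet_pt": max((j["pt"] for j in jets), default=0.0),
        "sum_jet_pt": sum(j["pt"] for j in jets),
    }


row = flatten(event)
print(row)

# an event without any objects still yields the same columns
empty_row = flatten({"jets": [], "leptons": []})
print(empty_row)
```

Because every event produces the same columns, such rows fit into a simple table (or a flat tree with one branch per column), which is what makes the final analysis stage easy to handle.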


In [None]:
import ROOT

f=ROOT.TFile.Open("http://www.etp.physik.uni-muenchen.de/kurs/comp10/uebungen/Z0-Versuch/data/ntz0mhmc.root")
mytree = ROOT.gROOT.FindObject("h5000") # get access to tree

In [None]:
mytree.Print()

In [None]:
c = ROOT.TCanvas("c","myCanvas",500,400)
#mytree.Draw("Ncharged+N_ecal")   # fills a 1D histogram with the expression Ncharged+N_ecal (histogram is created automatically)
#mytree.Draw("N_ecal:Ncharged")   # fills a 2D histogram with N_ecal (y) vs Ncharged (x)
h1 = ROOT.TH1F("nc","N Tracks", 50, 0, 50) # book a 1D histogram
#mytree.Draw("Ncharged>>nc")  # fills the booked histogram with the variable Ncharged
mytree.Draw("Ncharged>>nc","E_ecal>10") # fills the booked histogram with Ncharged for entries that pass the cut
c.Draw()
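
Conceptually, drawing a variable into a booked histogram with a cut is a loop over all entries that fills only those passing the selection. The following pure-Python sketch mimics this for the expression `Ncharged` and the cut `E_ecal>10`; the entries are hypothetical values, and under/overflow handling is left out.

```python
# hypothetical entries; the real tree has many more variables and events
entries = [
    {"Ncharged": 22, "E_ecal": 45.0},
    {"Ncharged": 2, "E_ecal": 3.5},
    {"Ncharged": 18, "E_ecal": 60.2},
    {"Ncharged": 4, "E_ecal": 88.0},
]

# same binning as the booked histogram "nc": 50 bins from 0 to 50
nbins, xlow, xhigh = 50, 0.0, 50.0
counts = [0] * nbins  # regular bins only in this sketch

for entry in entries:
    if not entry["E_ecal"] > 10:  # selection, corresponds to "E_ecal>10"
        continue
    x = entry["Ncharged"]  # expression to histogram
    if xlow <= x < xhigh:
        counts[int((x - xlow) * nbins / (xhigh - xlow))] += 1

print("entries passing the cut:", sum(counts))
```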

#### Exercises - 2 ####

* Plot the distributions (1D histograms) of important quantities like Ncharged, E_ecal, Pcharged for the datasets with simulated events: ntz0mhmc.root (quark/hadron decays), ntz0eemc.root (electron/positron), ntz0mmmc.root (muon), ntz0ttmc.root (tau)

* Example notebook for processing the files/trees and creating/plotting the histograms: PyROOT-Z0-Ex1.ipynb

* Make 2D histograms for the three combinations of Ncharged, E_ecal, Pcharged for both the simulated datasets and the detector data (ntz0e4.root).

