## Introduction to ROOT

ROOT is a software framework for data analysis. It is a powerful tool for demanding tasks typical of state-of-the-art scientific data analysis. Its main features are a graphical user interface, ideal for interactive analysis; an interpreter for the C++ programming language, for rapid and efficient prototyping; and a persistency mechanism for C++ objects, which is also used to write the petabytes of data recorded by the Large Hadron Collider experiments every year.

In this class, we will illustrate the main features of ROOT that are relevant for typical data analysis problems: input and plotting of data from measurements, and fitting of analytical functions.

### Motivation

**Data Analysis**

The comparison of measurements to theoretical models is one of the standard tasks in experimental physics. In the simplest case, a "model" is just a function providing predictions of measured data. Very often, the model depends on parameters. Such a model may simply state "the current I is proportional to the voltage U", and the task of the experimentalist consists of determining the resistance, R, from a set of measurements.
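
As a minimal sketch of this task in plain Python (no ROOT needed), the resistance can be estimated by a least-squares fit of the model U = R * I, a straight line through the origin. The measurement values and the function name `fit_resistance` are invented for this illustration.

```python
# Hypothetical measurements: currents in ampere and voltages in volt.
currents = [0.1, 0.2, 0.3, 0.4, 0.5]
voltages = [1.02, 1.98, 3.05, 3.96, 5.01]

def fit_resistance(i_values, u_values):
    """Least-squares estimate of R for the model U = R * I (line through the origin)."""
    numerator = sum(i * u for i, u in zip(i_values, u_values))
    denominator = sum(i * i for i in i_values)
    return numerator / denominator

resistance = fit_resistance(currents, voltages)
print(f"estimated R = {resistance:.2f} ohm")
```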

<img src="https://github.com/monttj/computational-physics/blob/2021/figs/functions.png?raw=1">

The first step is to visualise the data. Next, some manipulations typically have to be applied, e.g. corrections or parameter transformations. Quite often, these manipulations are complex, and a powerful library of mathematical functions and procedures should be provided - think, for example, of an integral, a peak search, or a Fourier transformation applied to an input spectrum to obtain the actual measurement described by the model.

The inevitable uncertainties affecting each measurement are a specialty of experimental physics, and the visualisation tools have to include them. In subsequent analyses, the statistical nature of the errors must be handled properly.

In the last step, measurements are compared to models, and free model parameters need to be determined in the process. In the next chapters you will find an example of a function (model) fit to data points. Several standard methods are available, and a data analysis tool should provide easy access to more than one of them. Means to quantify the level of agreement between measurements and model must also be available.

Quite often, the data volume to be analysed is large - think of fine-granular measurements accumulated with the aid of computers. A usable tool therefore must contain easy-to-use and efficient methods for storing and handling data.

In quantum mechanics, models typically only predict the probability density function ("pdf") of measurements depending on a number of parameters, and the aim of the experimental analysis is to extract the parameters from the observed distribution of frequencies at which certain values of the measurement are observed. Measurements of this kind require means to generate and visualise frequency distributions, so-called histograms, and stringent statistical treatment to extract the model parameters from purely statistical distributions.

Simulation of expected data is another important aspect of data analysis. By repeated generation of "pseudo-data", which are analysed in the same manner as intended for the real data, analysis procedures can be validated or compared. In many cases, the distribution of the measurement errors is not precisely known, and simulation offers the possibility to test the effects of different assumptions.
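
The idea of validating an analysis with pseudo-data can be sketched in a few lines of plain Python. The example below is an illustration under stated assumptions, not a recipe from this guide: the true value, the Gaussian resolution, and the "analysis" (a simple sample mean) are all hypothetical choices.

```python
import random
import statistics

TRUE_VALUE = 10.0     # hypothetical true value of the measured quantity
RESOLUTION = 0.5      # assumed Gaussian measurement uncertainty
N_MEASUREMENTS = 25   # measurements per pseudo-experiment
N_EXPERIMENTS = 1000  # number of pseudo-experiments

def analyse(measurements):
    """The analysis under test: estimate the quantity by the sample mean."""
    return statistics.mean(measurements)

rng = random.Random(42)  # fixed seed so that the sketch is reproducible
estimates = []
for _ in range(N_EXPERIMENTS):
    # generate one set of pseudo-data and analyse it like real data
    pseudo_data = [rng.gauss(TRUE_VALUE, RESOLUTION) for _ in range(N_MEASUREMENTS)]
    estimates.append(analyse(pseudo_data))

spread = statistics.stdev(estimates)
expected_spread = RESOLUTION / N_MEASUREMENTS ** 0.5
print(f"mean of estimates:   {statistics.mean(estimates):.3f}")
print(f"spread of estimates: {spread:.3f} (expected about {expected_spread:.3f})")
```

If the mean of the estimates reproduces the true value and their spread matches the expected uncertainty, the analysis procedure behaves as intended under the assumed error model.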

A powerful software framework addressing all of the above requirements is ROOT, an open source project coordinated by the European Organisation for Nuclear Research, CERN, in Geneva.

ROOT is very flexible and provides both a programming interface to use in one's own applications and a graphical user interface for interactive data analysis. The purpose of this document is to serve as a beginner's guide and to provide extendable examples for your own use cases, based on typical problems addressed in student labs. This guide will hopefully lay the ground for more complex applications in your future scientific work, building on a modern, state-of-the-art tool for data analysis.

This guide, in the form of a tutorial, is intended to introduce you quickly to the ROOT package. This goal will be accomplished using concrete examples, according to the “learning by doing” principle. For this reason, this guide cannot cover all the complexity of the ROOT package. Nevertheless, once you feel confident with the concepts presented in the following chapters, you will be able to appreciate the ROOT Users Guide (The ROOT Users Guide 2015) and navigate through the Class Reference (The ROOT Reference Guide 2013) to find all the details you might be interested in. You can even look at the code itself, since ROOT is a free, open-source product. Use these documents in parallel to this tutorial!

The ROOT Data Analysis Framework itself is written in, and heavily relies on, the ```C++``` programming language, so some knowledge of ```C++``` is required. If you do not have any idea of what this language is about, take advantage of the immense literature available on ```C++```.


**Let’s dive into ROOT!**

To use ROOT in a Python notebook, we first need to import the ROOT module. During the import, all notebook-related functionalities are activated.

In [None]:
#import ROOT library 
import ROOT

Now we are ready to use [PyROOT](https://root.cern.ch/how/how-use-pyroot-root-python-bindings). For example, we create a histogram.

Frequency distributions in ROOT are handled by a set of classes derived from the histogram class TH1, in our case TH1F. The letter F stands for float, meaning that the data type float is used to store the entries in one histogram bin.

In [None]:
h = ROOT.TH1F("gauss","Example histogram",100,-4,4)

This line instantiates a histogram with a name, a title, and 100 equally sized bins covering the range from -4 to 4.
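
To make the binning concrete, the following plain-Python sketch computes the bin width and bin edges for 100 equal bins between -4 and 4 and locates the bin that contains a given value. It does not use ROOT; the helper `bin_index` and its 0-based numbering are choices made for this illustration (ROOT numbers its regular bins starting from 1).

```python
N_BINS, X_MIN, X_MAX = 100, -4.0, 4.0

bin_width = (X_MAX - X_MIN) / N_BINS
# N_BINS bins are delimited by N_BINS + 1 edges
edges = [X_MIN + i * bin_width for i in range(N_BINS + 1)]

def bin_index(x):
    """Return the 0-based index of the bin containing x, or None if x is outside the range."""
    if not X_MIN <= x < X_MAX:
        return None
    return min(int((x - X_MIN) / bin_width), N_BINS - 1)

print(f"bin width: {bin_width}")
print(f"number of edges: {len(edges)}")
print(f"x = 0.5 falls into bin {bin_index(0.5)}")
```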

To fill this histogram with data, we use another feature of ROOT: pseudo-random numbers generated with the method TH1::FillRandom, which is based on [```TF1::GetRandom```](https://root.cern.ch/doc/master/classTF1.html#ab44c5f63db88a3831d74c7c84dc6316b).
It can be used to randomly fill a histogram using the contents of an existing TF1 function or another TH1 histogram (for all dimensions).
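
Conceptually, random numbers that follow the shape of a function can be drawn by tabulating the cumulative integral of the function and inverting it with a uniform random number. The plain-Python sketch below illustrates that idea only; it is not ROOT's actual implementation, and the names `make_sampler` and `gaussian` are hypothetical.

```python
import bisect
import math
import random

def make_sampler(func, x_min, x_max, n_points=1000):
    """Tabulate the cumulative integral of func and return a function that samples from it."""
    step = (x_max - x_min) / n_points
    xs = [x_min + i * step for i in range(n_points + 1)]
    cumulative = [0.0]
    for i in range(n_points):
        # trapezoidal rule for the integral over one step
        area = 0.5 * (func(xs[i]) + func(xs[i + 1])) * step
        cumulative.append(cumulative[-1] + area)
    total = cumulative[-1]
    cumulative = [c / total for c in cumulative]  # normalise to 1

    def sample(rng):
        u = rng.random()  # uniform in [0, 1)
        i = min(bisect.bisect_right(cumulative, u) - 1, n_points - 1)
        # linear interpolation inside the step that contains u
        fraction = (u - cumulative[i]) / (cumulative[i + 1] - cumulative[i])
        return xs[i] + fraction * step

    return sample

def gaussian(x):
    """Unnormalised Gaussian with mean 0 and width 1."""
    return math.exp(-0.5 * x * x)

rng = random.Random(7)  # fixed seed so that the sketch is reproducible
draw = make_sampler(gaussian, -4.0, 4.0)
values = [draw(rng) for _ in range(5000)]
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / len(values)
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```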

In [None]:
h.FillRandom("gaus")

Next we create a canvas, the entity which holds graphics primitives in ROOT.

The Draw() method, called here without any parameters, draws the histogram. In a terminal session, a window showing the histogram should pop up after you type ```h.Draw()```; in the notebook environment, the plot is displayed below your code.

For the histogram to be displayed in the notebook, we also need to draw the canvas.

In [None]:
c = ROOT.TCanvas("myCanvasName","The Canvas Title",800,600)
h.Draw()
c.Draw()

JavaScript visualisation is not yet active by default, but it can be activated for testing purposes. The plot below will be interactive: click on it and discover the [JSROOT](https://root.cern.ch/js/) capabilities!

In [None]:
%jsroot on
c.Draw()

In [None]:
c = ROOT.TCanvas("c")
h = ROOT.TH1F("h","ROOT Histo;X;Y",64,-4,4)

Thanks to ROOT, it is possible to write cells in C++ within a Python notebook. This can be done using the ``%%cpp`` magic. Magics are a feature of Jupyter notebooks, and the ``%%cpp`` magic was registered when the ROOT module was imported.

In [None]:
%%cpp
cout << "This is a C++ cell" << endl;

Another example: drawing a histogram with C++!

In [None]:
%%cpp
h->FillRandom("gaus");
h->Draw();
c->Draw();

### ROOT as a function plotter

One of ROOT’s powerful classes, TF1, allows us to display a function of one variable, x. Try the following:

In [None]:
c1 = ROOT.TCanvas("example","sin([1]*x)/x",800,600)
f1 = ROOT.TF1("f1","sin(x)/x",0.,10.)

``` f1 ``` is an instance of the ``` TF1 ``` class, and the arguments are passed to its constructor. The first one, of type string, is a name to be entered in the internal ROOT memory management system; the second, also a string, defines the function, here sin(x)/x; and the two parameters of type double define the range of the variable x.

In [None]:
f1.Draw();
c1.Draw();

A slightly extended version of this example is the definition of a function with parameters, called [0], [1] and so on in the ROOT formula syntax. We now need a way to assign values to these parameters; this is achieved with the method [SetParameter](https://root.cern.ch/doc/master/classTF1.html#ade6e54171210c6b1b955c9f813040eb8)(``parameter_number``,``parameter_value``) of class TF1. Here is an example:

In [None]:
f2 = ROOT.TF1("f2","[0]*sin([1]*x)/x",0.,10.)

Try changing the parameter values in the cell below.

In [None]:
f2.SetParameter(0,1)
f2.SetParameter(1,1)
f2.Draw()
c1.Draw()
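
To see what the two parameters do without ROOT, the formula `[0]*sin([1]*x)/x` can be written as a plain Python function. This is only an illustration; the function name `f2_value` and the parameter values tried here are assumptions made for this sketch.

```python
import math

def f2_value(x, par):
    """Evaluate par[0] * sin(par[1] * x) / x, i.e. the formula "[0]*sin([1]*x)/x"."""
    return par[0] * math.sin(par[1] * x) / x

x = 2.0
for par in ([1.0, 1.0], [2.0, 1.0], [1.0, 3.0]):
    print(f"[0] = {par[0]}, [1] = {par[1]}: f2({x}) = {f2_value(x, par):+.4f}")
```

Parameter [0] scales the amplitude of the function, while [1] changes how quickly it oscillates.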

### Use of Python Functions

It is possible to mix Python functions with ROOT and perform such operations as plotting and fitting of histograms with them. In all cases, the procedure consists of instantiating a ROOT ```TF1```, ```TF2```, or ```TF3``` with the Python function and working with that ROOT object. There are some memory issues, so it is for example not yet possible to delete a ```TF1``` instance and then create another one with the same name. In addition, the Python function, once used for instantiating the ```TF1```, is never deleted.

Instead of a Python function, you can also use callable instances (e.g., an instance of a class that has implemented the ```__call__``` member function). The signature of the Python callable should provide for one or two arrays. The first array, which must always be present, contains the ```x```, ```y```, ```z```, and ```t``` values for the call. The second array, which is optional and whose size is given by the number of parameters passed to the ```TF1``` constructor, contains the values that parameterize the function. For more details, see the ```TF1``` documentation and the examples below.

In [None]:
from ROOT import TF1, TCanvas

# customized function: the Python callable receives two arguments, x and the parameters
def identity(x, par):
    return x[0]

# create an identity function on the range from -1 to 1 (it has no parameters)
f = TF1('pyf1', identity, -1., 1.)

# plot the function
c = TCanvas()
f.Draw()
c.Draw()

Because no number of parameters is given to the ```TF1``` constructor, ‘```0```’ (the default) is assumed. The ‘```identity```’ function therefore never reads its second argument, ‘```par```’, which would normally be used to pass the function parameters. Note that the argument ‘```x```’ is an array of size 4. The following is an example of a parameterized Python callable instance that is plotted on a default canvas:

In [None]:
from ROOT import TF1, TCanvas

class Linear:
    def __call__( self, x, par ):
        return par[0] + x[0]*par[1]

pyc = Linear()
# create a linear function with two parameters: offset (const) and pitch
f = TF1('pyf2', pyc, -1., 1., 2)
# mini task: set the parameters so that the function becomes y = 5 + 2x (about two lines)
#f.SetParameter(0,5)
#f.SetParameter(1,2)

# plot the function
c = TCanvas()
f.Draw()
c.Draw()

Note that this time the constructor is told that there are two parameters, and note in particular how these parameters are set. It is, of course, also possible (and preferable if you only use the function for plotting) to keep the parameters as data members of the callable instance and use and set them directly from Python.
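
The alternative mentioned above, keeping the parameters as data members of the callable, could look like the following sketch. The class name `LinearWithMembers` and the attribute names are hypothetical, and the instance is evaluated directly in Python here so that the example runs without ROOT.

```python
class LinearWithMembers:
    """Callable that stores its parameters as data members instead of reading par."""

    def __init__(self, offset, slope):
        self.offset = offset
        self.slope = slope

    def __call__(self, x, par):
        # x is array-like; par is ignored because the parameters live on the instance
        return self.offset + x[0] * self.slope

line = LinearWithMembers(offset=5.0, slope=2.0)
print(line([0.5], None))  # evaluate y = 5 + 2x at x = 0.5

line.slope = 3.0          # change a parameter directly from Python
print(line([0.5], None))  # now y = 5 + 3x at x = 0.5

# With ROOT available, the instance would be passed to TF1 like the callables above,
# e.g. TF1('pyf3', line, -1., 1.), where 'pyf3' is a hypothetical name.
```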

### Fitting Histograms

Fitting a histogram with a Python function is no more difficult than plotting: instantiate a ```TF1``` with the Python callable and supply that ```TF1``` as a parameter to the ```Fit()``` member function of the histogram. After the fit, you can retrieve the fit parameters from the ```TF1``` instance. For example:

In [None]:
from ROOT import TF1, TH1F, TCanvas
# create a histogram and fill it with 10000 entries distributed according to 6 + 4.5*x
h_data = TH1F('h_data','test',100,-1.,1.)
f2 = TF1('cf2','6.+x*4.5',-1.,1.)
h_data.FillRandom('cf2',10000)

# fit the histogram with the Python 'Linear' function f ('pyf2') defined in the previous cell
h_data.Fit(f)

# print results
par = f.GetParameters()
print('fit results: const =', par[0], ',pitch =', par[1])

##### answer:
```
fit results: const = 98.7442051575 ,pitch = 74.9120871231
```
Did you get the same answer? If not, why? Could the random number generator be the reason?
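
The magnitude of the fitted numbers can be checked by hand. The histogram is filled according to the shape of `cf2` only, so the fitted line describes the expected number of entries per bin rather than the original coefficients 6 and 4.5. The plain-Python sketch below works through that arithmetic, assuming 10000 entries and 100 bins on [-1, 1] as in the cell above; the variable names are chosen for this illustration.

```python
N_ENTRIES = 10000
N_BINS = 100
X_MIN, X_MAX = -1.0, 1.0

bin_width = (X_MAX - X_MIN) / N_BINS  # 0.02

# Integral of 6 + 4.5*x over [-1, 1]: the term linear in x cancels, leaving 6 * 2 = 12.
integral = 6.0 * (X_MAX - X_MIN)

# Expected bin content at x: N_ENTRIES * bin_width * (6 + 4.5*x) / integral
scale = N_ENTRIES * bin_width / integral

expected_const = scale * 6.0
expected_pitch = scale * 4.5
print(f"expected const = {expected_const:.1f}, expected pitch = {expected_pitch:.1f}")
```

The expected constant of 100 and pitch of 75 are close to the fitted values quoted above; the remaining difference is of the size one would expect from fluctuations of the random sample.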

### Task

In [None]:
# draw the histogram with the fit function: 3 lines


##### answer:
<img src="https://github.com/monttj/computational-physics/blob/2021/figs/c.png?raw=1" />

Did you get the same answer? If not, why?