# LCP Final Project

This is my personal notebook. It contains drafts of all the functions and code needed for my own implementation, together with any material deemed useful for that purpose.

GUI implementation and subsequent merging will be crucial additions.

## Introduction 

We are presented with a dataset collected by four drift-tube detectors during a beam test. The detectors are filled with a gas mixture and are divided into four layers, each made of cells with an anode wire at the center; their purpose is to measure the drift time of the $e^-$ that move towards the anode after the gas has been ionized by a particle passing through it.

The main purpose of our research is to study the properties of $\mu^+ - \mu^-$ pairs produced by positrons of energy $E=45$ GeV colliding with a Beryllium target.

The data-taking process consists of two phases: a calibration one, where only $\mu^+$ at given energies are employed and the sign of the current powering the magnet is changed so as to alternate the track directions, and a physics one.

## Outline

We briefly present the outline of our research contained in this document. The code related to the functions that have been created specifically for this project can be found in a separate module.

## Importing libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## Preliminary utilities

Below we copy the paths for calibration and physics data; the folders contain .txt files.

In [None]:
path_cal = "./Final Project (group6)/data/calibration/" # + "Run000260" & 261,262,263

In [None]:
path_phys = "./Final Project (group6)/data/physics/" # + "Run000331" & 332 to 339

In [None]:
# Cell dimensions
XCELL = 42.
ZCELL = 13.

# X coordinates translation
global_x_shifts = [994.2, 947.4, -267.4, -261.5]

# Z coordinates translations
local_z_shifts = [z * ZCELL for z in range(4)]
global_z_shifts = [823.5, 0, 823.5, 0]

**Transformations**

```python
x_global =  global_x_shifts[chamber] - x_local
```
(*the negative sign is due to the 180° rotation of the detectors*)

```python
z_local = local_z_shifts[layer - 1] + ZCELL / 2  # layers numbered 1-4, hit at the cell centre (as in read_data)
z_global = global_z_shifts[chamber] + z_local
```
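
As a worked example of the two transformations, the sketch below wraps them in a single helper. The name `local_to_global` is hypothetical, and the sketch assumes that layers are numbered 1 to 4 and that the hit sits at mid-height of its cell, which is how the `read_data` function further down treats them.

```python
# Geometry constants copied from the cell above
ZCELL = 13.
global_x_shifts = [994.2, 947.4, -267.4, -261.5]
local_z_shifts = [z * ZCELL for z in range(4)]
global_z_shifts = [823.5, 0, 823.5, 0]

def local_to_global(chamber, layer, x_local):
    """Hypothetical helper: map one local hit position to global (x, z).

    Assumes chamber in 0..3 and layer in 1..4.
    """
    x_global = global_x_shifts[chamber] - x_local     # sign flip from the 180° rotation
    z_local = local_z_shifts[layer - 1] + ZCELL / 2   # centre of the cell in that layer
    z_global = global_z_shifts[chamber] + z_local
    return x_global, z_global

# Invented hit: chamber 2, layer 3, local x = 100
x, z = local_to_global(chamber=2, layer=3, x_local=100.0)
print(x, z)
```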

## Functions
### Read data

* **Input**: 1 row of data_file.txt, i.e. 1 event, passed as a list
* **Output**: pandas `DataFrame` as in the Data Format paragraph, the event number, and the number of hits in the event

This function takes one event at a time as input and outputs a pandas `DataFrame` as described in the previous section. In addition, it performs the transformation from local to global coordinates.
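
The line layout that the function relies on can be illustrated with a made-up line. The sketch below assumes, as the `read_data` code does, that each line holds the event number, the number of hits, and then five fields per hit (chamber, layer, left position, right position, time). The sample values and the helper name `split_hits` are invented for illustration.

```python
import numpy as np

def split_hits(event):
    """Hypothetical helper: reshape the flat list of fields into one row per hit."""
    event_number = int(event[0])
    hits_number = int(event[1])
    # Fields from index 2 onward hold (chamber, layer, xl, xr, time) for each hit
    fields = np.array(event[2:2 + 5 * hits_number], dtype=float)
    return event_number, hits_number, fields.reshape(hits_number, 5)

# Invented sample line with two hits, not taken from the data files
sample = "7 2 0 1 10.5 31.5 120.0 0 2 12.0 30.0 95.0".split()
event_number, hits_number, hits = split_hits(sample)
print(hits[:, 0].astype(int))   # chamber of each hit
print(hits[:, 2])               # left local x of each hit
```

Reshaping once into a `(hits_number, 5)` array is an alternative to indexing the list with `2+5*i` for every column; both read the same fields.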


In [None]:
data_file = "./Final Project (group6)/data/calibration/Run000260/data_000000.txt"
with open(data_file) as f:
    # read up to the fifth line of the file and keep it
    for _ in range(5):
        line = f.readline()
event = line.split()
#event = [float(i) for i in event]
print(event)

In [None]:
def read_data(event):
    """Build the per-hit DataFrame (local and global coordinates) for one event."""
    event_number = int(event[0])
    hits_number  = int(event[1])
    hit       = np.arange(hits_number)
    chamber   = np.fromiter((event[2+5*i] for i in range(hits_number)), int)
    layer     = np.fromiter((event[3+5*i] for i in range(hits_number)), int)
    xl_local  = np.fromiter((event[4+5*i] for i in range(hits_number)), float)
    xr_local  = np.fromiter((event[5+5*i] for i in range(hits_number)), float)
    z_local   = np.fromiter((local_z_shifts[i-1]+ZCELL/2 for i in layer), float)
    time      = np.fromiter((event[6+5*i] for i in range(hits_number)), float)
    xl_global = np.fromiter((global_x_shifts[i] for i in chamber), float) - xl_local
    xr_global = np.fromiter((global_x_shifts[i] for i in chamber), float) - xr_local
    z_global  = np.fromiter((global_z_shifts[i] for i in chamber), float) + z_local
    dataframe = pd.DataFrame(
        { 'EvNumber' : event_number,
          'Hit'      : hit,
          'Chamber'  : chamber,
          'Layer'    : layer,
          'XL_local' : xl_local,
          'XR_local' : xr_local,
          'Z_local'  : z_local,
          'Time'     : time,
          'XL_global': xl_global,
          'XR_global': xr_global,
          'Z_global' : z_global,
        })
    #dataframe.set_index('Hit', inplace=True) # set as index the number of the hit 
    return dataframe, event_number, hits_number

In [None]:
ev, evNumber, hits = read_data(event)
ev["EvNumber"][0]

### Event selection

- **Input**: Pandas `DataFrame` (1 event)
- **Output**: True/False

The input of the function is the Pandas `DataFrame` made by the *Read Data* function. The output is a boolean value that flags good calibration events.
To find the best requirements for good events, we need to plot the histogram of the number of hits per event.

*Possible choice (to be evaluated)*: a good event requires at least 6 hits, in at least 3 different layers of each of the two chambers, on either the left or the right side of the detector, and no hits on the other side.
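
A possible way to build the histogram of the number of hits mentioned above is sketched here. It relies only on the second field of each line (the number of hits). The function name `hits_distribution` and the sample lines are made up; in practice the lines would come from the calibration files.

```python
import numpy as np

def hits_distribution(lines):
    """Hypothetical helper: count how many events have 0, 1, 2, ... hits."""
    n_hits = [int(line.split()[1]) for line in lines if line.strip()]
    # counts[k] is the number of events with exactly k hits
    return np.bincount(n_hits)

# Invented lines: only the first two fields matter here
sample_lines = ["1 0", "2 1 0 1 10.5 31.5 120.0", "3 0", "4 1 1 2 5.0 37.0 80.0"]
counts = hits_distribution(sample_lines)
for k, n in enumerate(counts):
    print(f"{k} hits: {n} events")
```

The resulting array can be passed to `plt.bar(range(len(counts)), counts)` to draw the histogram.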

In [None]:
def select_events(dataframe):
    # hits only on the left side
    if (dataframe['Chamber'] <= 1).all():
        # compute number of different layers in each chamber
        n_layer_ch0 = dataframe[dataframe['Chamber'] == 0]['Layer'].nunique()
        n_layer_ch1 = dataframe[dataframe['Chamber'] == 1]['Layer'].nunique()
        # require at least 3 different layers for each chamber
        return n_layer_ch0 >= 3 and n_layer_ch1 >= 3

    # hits only on the right side
    if (dataframe['Chamber'] >= 2).all():
        # compute number of different layers in each chamber
        n_layer_ch2 = dataframe[dataframe['Chamber'] == 2]['Layer'].nunique()
        n_layer_ch3 = dataframe[dataframe['Chamber'] == 3]['Layer'].nunique()
        # require at least 3 different layers for each chamber
        return n_layer_ch2 >= 3 and n_layer_ch3 >= 3

    # hits on both the left and the right side
    return False
    
#print(select_events(ev))            
#print(ev)

### Plot background

- **Input**: `None`
- **Output**: `list`[pyplot `Axes`] (global image + 4 detectors zooms)

Five plots are given as output: one image of the whole detector and one for each of the 4 chambers. The purpose of this function is to plot the background only once, instead of repeating it for every event.

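To plot the `Axes` in the `list` with a good layout, define `gridsize` and the figure before the call, since `plot_background` reads `gridsize` as a global variable: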
```python
gridsize = (5, 2)
fig = plt.figure(figsize = (12, 24))
axes = plot_background()
plt.show()
```

In [None]:
def plot_background():
    # create Pandas DataFrame for the chambers positions
    chamber_position = pd.DataFrame({
    'chamber' : [i for i in range(4)],
    'x_vertices' : [(global_x_shifts[i], global_x_shifts[i] - 720, global_x_shifts[i] - 720, global_x_shifts[i])
                    for i in range(4)],
    'y_vertices' : [(global_z_shifts[i], global_z_shifts[i], global_z_shifts[i] + 52, global_z_shifts[i] + 52)
                    for i in range(4)],
    })
    x_lim = [[-1000, 1000], # global detector
             [    0, 1000], # chamber 0
             [    0, 1000], # chamber 1
             [-1000,    0], # chamber 2
             [-1000,    0]] # chamber 3
    y_lim = [[-100, 1000],  # global detector
             [800 ,  900],  # chamber 0
             [ -25,   75],  # chamber 1
             [ 800,  900],  # chamber 2
             [ -25,   75]]  # chamber 3
    title = ["DETECTOR", "Chamber 0", "Chamber 1", "Chamber 2", "Chamber 3"]
    # create pyplot 'Axes' objects
    ax_global = plt.subplot2grid(gridsize, (0, 0), colspan=2, rowspan=2)
    ax_0 = plt.subplot2grid(gridsize, (2, 1), colspan=1, rowspan=1) # top-right
    ax_1 = plt.subplot2grid(gridsize, (3, 1), colspan=1, rowspan=1) # bottom-right
    ax_2 = plt.subplot2grid(gridsize, (2, 0), colspan=1, rowspan=1) # top-left
    ax_3 = plt.subplot2grid(gridsize, (3, 0), colspan=1, rowspan=1) # bottom-left
    
    axes = [ax_global, ax_0, ax_1, ax_2, ax_3]
    for index, ax in enumerate(axes):
        ax.set_xlim(x_lim[index])
        ax.set_ylim(y_lim[index])
        ax.set_xlabel("x [mm]")
        ax.set_ylabel("z [mm]")
        if index == 0: ax.set_title(title[index])
        else: ax.set_title(title[index], pad=-20)
        # plot the 4 chambers in each 'Axes'
        for j in range(4):
            chamber = chamber_position[chamber_position["chamber"] == j]
            ax.fill(chamber["x_vertices"].values[0], chamber["y_vertices"].values[0], color='gray', fill=False)
    return axes

In [None]:
gridsize = (5, 2)
fig = plt.figure(figsize = (12, 24))
axes = plot_background()
plt.show()

### Plot events

- **Input**: Pandas `DataFrame` (1 event) + event number
- **Output**: `list`\[pyplot `Axes`\] (global image + 4 detectors zooms)

The input of the function is the Pandas `DataFrame` made by the *Read Data* function, together with the event number (if there are no hits, the `DataFrame` is empty, so the event number cannot be retrieved from it). Five plots are given as output: one image of the whole detector and one for each of the 4 chambers. The hits recorded in the event are drawn as points in each image (left and right positions must have different colors).

In [None]:
def plot_events(dataframe, evNumber):
    # get the EvNumber as argument, because, if the dataframe is empty,
    # I can't get it from data
    plots = plot_background()
    plots[0].set_title("Event:"+str(evNumber), {'size':'18'})
    if not dataframe.empty:
        xL = dataframe["XL_global"]
        xR = dataframe["XR_global"]
        z  = dataframe["Z_global"]
        for image in plots:
            image.plot(xL, z, "bo", markersize=3)  # left positions in blue
            image.plot(xR, z, "ro", markersize=3)  # right positions in red
    return plots

In [None]:
gridsize = (5, 2)
fig = plt.figure(figsize = (12, 24))
axes = plot_events(ev, evNumber)  # plot_events already calls plot_background
plt.show()

### Linear fit

* **Input**: Pandas DataFrame (1 event)
* **Output**: [[slope, intercept] for each chamber]

The input of the function is the Pandas `DataFrame` made by the *Read Data* function. The output is a list of lists with the coefficients of the linear regression (e.g. `scipy.stats.linregress`) for each chamber.

The fit is only made for good events, i.e. those for which the *Event selection* function (`select_events`) returns `True`. If there are no hits in a chamber, the list returned for it should be `[False, False]`.

The fit has to consider all possible combinations of the left/right signals; the result is chosen by selecting the fit with the lowest $\chi^2$.
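
The left/right scan described above could be sketched as follows for the hits of a single chamber. This is only an illustration under several assumptions: the helper name `fit_left_right` is hypothetical, the track is modelled as $x = \text{slope} \cdot z + \text{intercept}$, every hit is given the same uncertainty `sigma`, and `numpy.polyfit` is used in place of `scipy.stats.linregress` to keep the example dependent on NumPy alone.

```python
import itertools
import numpy as np

def fit_left_right(xl, xr, z, sigma=1.0):
    """Hypothetical helper: try every left/right assignment and keep the best line.

    Returns (slope, intercept, chi2, choice), where choice[i] is 0 if hit i
    used xl and 1 if it used xr, or None when fewer than 2 hits are given.
    sigma is an assumed common uncertainty on x.
    """
    xl, xr, z = (np.asarray(a, dtype=float) for a in (xl, xr, z))
    if len(z) < 2:
        return None                                  # not enough hits to define a line
    candidates = np.column_stack((xl, xr))           # shape (n_hits, 2)
    rows = np.arange(len(z))
    best = None
    for choice in itertools.product((0, 1), repeat=len(z)):
        x = candidates[rows, list(choice)]           # pick xl or xr for each hit
        slope, intercept = np.polyfit(z, x, 1)       # straight-line least squares
        chi2 = np.sum(((x - (slope * z + intercept)) / sigma) ** 2)
        if best is None or chi2 < best[2]:
            best = (slope, intercept, chi2, choice)
    return best

# Invented hits lying on x = 0.5 * z + 100 for the assignment (left, right, left, left)
z_hits = [6.5, 19.5, 32.5, 45.5]
xl_hits = [103.25, 100.25, 116.25, 122.75]
xr_hits = [106.75, 109.75, 135.75, 129.25]
slope, intercept, chi2, choice = fit_left_right(xl_hits, xr_hits, z_hits)
print(slope, intercept, choice)
```

With $n$ hits the scan fits $2^n$ lines, which stays cheap for the few hits expected in one chamber.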

## Side notes

* As the error on $x=x(z)$ we prefer, over the one derived a posteriori from the fit, an error obtained from the $\sigma$ of the wire position computed by averaging $x_L$ and $x_R$.

* The local fit removes the ambiguity between $x_L$ and $x_R$; the global fit offers a way to test, whenever additional points are available, the goodness of the fit: we check that the error of $y_{fit}$ at the point left out is reasonably within 3$\sigma$.

* We assess the possible presence of a systematic error by checking that the Gaussian of the residuals of the $x(z)$ fit is centered at 0; the FWHM of that Gaussian should be the resolution (*to be verified*).

* Proposal for computing the error on the selection efficiency: $\sigma_{\varepsilon} = \sqrt{\frac{\varepsilon (1-\varepsilon)}{N_2}}$, where $\varepsilon = \frac{N_1}{N_2}$, $N_1$ being the number of events with 8 hits and $N_2$ the number with 7 hits.

* One dataframe for each calibration run; one dataframe with all the hits (merging is better than looping) so as to have an overall view (useful for frequency histograms, efficiency, ...); in Physics, different runs (taken under the same conditions) can be merged.

* When choosing the best $\chi^{2}$ we use only 3 points: should the fourth be excluded a priori? Or only after checking which combination actually has the lowest $\chi^{2}$?
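
The efficiency formula proposed in the notes above can be evaluated with a few lines. The helper name `efficiency_with_error` and the counts are invented for illustration, and the sketch assumes $0 \le N_1 \le N_2$, since the square root is not defined otherwise.

```python
import math

def efficiency_with_error(n_1, n_2):
    """Hypothetical helper: eps = n_1 / n_2 and sigma = sqrt(eps * (1 - eps) / n_2)."""
    if n_2 <= 0 or not 0 <= n_1 <= n_2:
        raise ValueError("expected 0 <= n_1 <= n_2 and n_2 > 0")
    eps = n_1 / n_2
    sigma = math.sqrt(eps * (1.0 - eps) / n_2)
    return eps, sigma

# Invented counts, for illustration only
eps, sigma = efficiency_with_error(n_1=80, n_2=100)
print(f"efficiency = {eps:.3f} +/- {sigma:.3f}")
```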