<h1 align='center'>Final assignment of "Laboratory of Computational Physics"</h1>
<img align='right' src='https://www.unidformazione.com/wp-content/uploads/2018/04/unipd-universita-di-padova.png' alt='Drawing' style='width:400px;'/>


<h2 align='left'>Search for flavor-changing neutral currents <br>in $t\bar{t}$ processes in multilepton final states in <br> proton-proton collisions with the CMS detector</h2>


<h3 align='left'>University of Padua - Physics of Data</h3>
<h4 align='left'>Dott. Alberto Zucchetta, Prof. Marco Zanetti</h4>

**Name** | **ID number** | **mail**@studenti.unipd.it
:-:|:-:|-:
Chiara Maccani | 2027591 | chiara.maccani
Samuele Piccinelli | 2027650 | samuele.piccinelli
Tommaso Stentella | 2027586 | tommaso.stentella
Cristina Venturini | 2022461 | cristina.venturini.5

<div class="alert-success">
<h2 align='center'>Event selection: constructing high level features and plotting</h2>
</div>

In [None]:
# - import libraries and useful dependencies

import ROOT
import ROOT.ROOT as rr
import uproot
import numpy as np
import pandas as pd
import os
from pathlib import Path
import awkward as ak
import random

import FilterFunctions as ff
import cpp

### 1. Loading data
Data samples used in this analysis were collected in the 2016, 2017 and 2018 runs, at a centre-of-mass energy of $13$ TeV.

The skimming step reduces the initial generic samples to a dataset optimized for this specific analysis. The data, stored as `.root` files, are split by tag into data, signal and individual MC processes, and loaded by means of the `loadData` function.

Each `TTree` object represents a columnar dataset and offers an easy interface with Python. The trees of each sample are concatenated in a `TChain`, which is then loaded into an `RDataFrame` object.

All background processes considered are listed in `listBkgDir`.
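
The prefix-based grouping of sample directories described above can be sketched in plain Python. The directory names below are hypothetical placeholders, and `dirs_with_prefix` and `group_by_tag` are illustrative helpers that are not part of the notebook.

```python
# Hypothetical directory names, for illustration only.
example_dirs = [
    'SingleMuon_Run2017B', 'TT_FCNC_sample',
    'TTTo2L2Nu_part1', 'TTTo2L2Nu_part2', 'ZZ_sample', 'ZG_sample',
]

def dirs_with_prefix(names, prefix):
    """Return the directory names that start with the given prefix."""
    return [name for name in names if name.startswith(prefix)]

def group_by_tag(names, tags):
    """Map each process tag to the list of directories it selects."""
    return {tag: dirs_with_prefix(names, tag) for tag in tags}

groups = group_by_tag(example_dirs, ['TTTo2L2Nu', 'ZZ', 'ZG'])
print(groups)
```

Because the match is on the prefix only, a tag that is itself the prefix of another tag would select the directories of both, so the tags should be chosen to be mutually exclusive.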

In [None]:
dirBasePath  = '/data/FCNC/'
dirOutPath = '/data/Skim/'
dirPlotPath = './Plots/'

listDir = os.listdir(dirBasePath)

In [None]:
def returnDir(string):
    return [filename for filename in listDir if filename.startswith(string)]

def loadData(chain, pathDirs, info=False):
    # Set up multi-threading capability of ROOT
    rr.EnableImplicitMT()
    
    for Dir in pathDirs:
        if info: print('>>> Process directory ', Dir)
        file_list = os.listdir(dirBasePath + Dir)
        for file in file_list:
            chain.AddFile(dirBasePath + Dir + '/' + file)
            
    return chain

def CountEvents(df, info=True):
    n = df.Count().GetValue()
    if info: print('\nNumber of events:', n, '\n')
    return n

# Retrieve a histogram from the input file based on the process and the variable name
def getHistogram(tfile, name, variable, tag=''):
    name = '{}_{}{}'.format(name, variable, tag)
    h = tfile.Get(name)
    if not h:
        raise Exception('Failed to load histogram {}.'.format(name))
    return h

In [None]:
# - Data + MC Signal
signalDirs = returnDir('SingleMuon')
signalMCDirs = returnDir('TT_FCNC')

# - MC backgrounds
listBkgDir = ['ST_', 'TTTT_Tune', 'TTTo2L2Nu', 'TTToHadronic', 'TTToSemiLeptonic', 'TTWJetsToLNu', 'TTZToLLNuNu',
              'WGToLNuG', 'WJetsToLNu', 'WWTo2L2Nu', 'WWW', 'WWZ', 'WZG', 'WZTo1L1Nu2Q', 'WZTo2L2Q', 'WZTo3LNu',
              'WZZ', 'WmWmJJ', 'WminusH', 'WpWpJJ', 'ZG', 'ZZ', 'tZq']

bkgMCDirs = {tag: returnDir(tag) for tag in listBkgDir}
bkgMCChain = {tag: ROOT.TChain('Events') for tag in listBkgDir}

In [None]:
# - Load Data + MC Signal

chainSig = ROOT.TChain('Events')
dfData = rr.RDataFrame(loadData(chainSig, signalDirs))
# CountEvents(dfData)

chainMC = ROOT.TChain('Events')
dfMCSig = rr.RDataFrame(loadData(chainMC, signalMCDirs))
# CountEvents(dfMCSig)

In [None]:
# - Load MC backgrounds

dfMCBkg = {}
for key, value in bkgMCDirs.items():
    dfMCBkg[key] = rr.RDataFrame(loadData(bkgMCChain[key], value))
    # CountEvents(dfMCBkg[key])

### 2. Data skimming

The first step of the analysis consists in applying cuts to the data in order to select events whose characteristics match those of the desired final states.<br>
In this initial stage we want to obtain agreement between data and background for the non-discriminating variables. For the iSkim2 category in particular, tighter cuts were needed to achieve this.

The table below shows the cuts applied in each category, as implemented in the `FSkim` functions contained in the `FilterFunctions.py` script.

|                  |iSkim1   |iSkim2   |iSkim3   |iSkim4   | 
|:-----------------|:------:|:-----:|:------:|:------:|
|Muon $p_T$ [GeV]| $>15$ | $>27$| $>15$ |$>15$  | 
|Electron $p_T$ [GeV]    | $-$    | $>20$  | $-$    |$>15$   |
|Muon / Electron Iso      | $<$ $0.15$| $<$ $0.1$ | $<$ $0.15$|$<$ $0.15$ |
|Muon / Electron $|\eta|$ | $<2.5$| $<2.4$| $<2.5$| $<2.5$|
|$\textbf{n Muon}$          | $\mathbf{2}$    | $\mathbf{1}$    | $\mathbf{3}$    | $\mathbf{2}$    |
|$\textbf{n Electron}$      | $\mathbf{0}$    | $\mathbf{1}$    | $\mathbf{0}$    | $\mathbf{1}$    |
|Jet $p_T$ [GeV]| $>30$ | $>30$| $>30$ |$>30$  |
|$\Delta R( $Jet, $\ell$ )| $>0.4$ | $>0.4$| $>0.4$ |$>0.4$  |
|Jet $|\eta|$ | $<2.5$| $<2.5$| $<2.5$| $<2.5$|
|$\textbf{n Clean Jet}$      | $\mathbf{\geq 4}$    | $\mathbf{\geq 4}$   | $\mathbf{\geq 2}$  | $\mathbf{\geq 2}$   |
|b-Jet Deep tag| $>0.6$| $>0.6$| $>0.6$| $>0.6$|
|$\textbf{n b-Jet}$      | $\mathbf{> 0}$, $\mathbf{\leq 2}$    | $\mathbf{> 0}$, $\mathbf{\leq 2}$   | $\mathbf{ \geq 1}$  | $\mathbf{\geq 1}$   |

Furthermore, for the tri-lepton categories a filter is applied that rejects events in which all muons have the same charge.
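
As an illustration of how the cuts in the table combine, here is a minimal sketch of the iSkim1 selection on a toy event. The event layout (dictionaries with `pt`, `eta`, `phi`, `iso`, `btag` keys) and all function names are assumptions made for this example, not the actual `FSkim` implementation. The sketch also assumes that the lepton multiplicities count leptons passing the quality cuts and that iSkim1 requires no electrons at all.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi

def delta_r(a, b):
    """Angular separation in the (eta, phi) plane."""
    return math.hypot(a['eta'] - b['eta'], delta_phi(a['phi'], b['phi']))

def good_muons(muons, pt_min=15.0, iso_max=0.15, eta_max=2.5):
    """Muons passing the pT, isolation and |eta| cuts of the iSkim1 column."""
    return [m for m in muons
            if m['pt'] > pt_min and m['iso'] < iso_max and abs(m['eta']) < eta_max]

def clean_jets(jets, leptons, pt_min=30.0, eta_max=2.5, dr_min=0.4):
    """Jets passing the kinematic cuts and separated from every selected lepton."""
    return [j for j in jets
            if j['pt'] > pt_min and abs(j['eta']) < eta_max
            and all(delta_r(j, lep) > dr_min for lep in leptons)]

def passes_iskim1(event, btag_min=0.6):
    """Exactly 2 good muons, no electrons, >= 4 clean jets, 1 or 2 b-tagged jets."""
    muons = good_muons(event['muons'])
    if len(muons) != 2 or len(event['electrons']) != 0:
        return False
    jets = clean_jets(event['jets'], muons)
    n_b = sum(1 for j in jets if j['btag'] > btag_min)
    return len(jets) >= 4 and 1 <= n_b <= 2

def not_all_same_charge(charges):
    """Tri-lepton filter: reject events where every lepton has the same charge."""
    return len(set(charges)) > 1

# Toy event with made-up values
toy_event = {
    'muons': [{'pt': 40.0, 'eta': 0.5, 'phi': 0.0, 'iso': 0.05},
              {'pt': 20.0, 'eta': -1.0, 'phi': 2.0, 'iso': 0.05}],
    'electrons': [],
    'jets': [{'pt': 50.0, 'eta': 1.5, 'phi': 1.0, 'btag': 0.9},
             {'pt': 45.0, 'eta': -0.5, 'phi': -2.0, 'btag': 0.1},
             {'pt': 40.0, 'eta': 2.0, 'phi': 3.0, 'btag': 0.2},
             {'pt': 35.0, 'eta': 0.0, 'phi': -1.0, 'btag': 0.3}],
}
print(passes_iskim1(toy_event))
```

The other categories would follow the same pattern with the thresholds and multiplicities of their own columns.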

### 3.  High level features (HLF) construction

The next step consists in constructing HLF in the hope that they prove discriminating. For this task we refer to the work of [M. Aaboud et al.](https://journals.aps.org/prd/abstract/10.1103/PhysRevD.98.032002).

#### 3.1. Flattening
Starting from the vector-valued branches, we select only the variables needed for the subsequent tasks, thereby flattening the tree-like structure of the data.

Each lepton is tagged with a number, the meaning of which is different for each iSkim category, as shown in the table below. 

| | 0|1|2|
|:-|:-:|:-:|:-:|
|iSkim1 |Muon w/ highest $p_T$ |Muon w/ lowest $p_T$| $$-$$|
|iSkim2 | Muon | Electron|$$-$$|
|iSkim3 |OS Muon | Muon w/ larger $\Delta R$ from Muon 0| Muon w/ smaller $\Delta R$ from Muon 0|
|iSkim4 |OS Lepton | Lepton w/ larger $\Delta R$ from Lepton 0| Lepton w/ smaller $\Delta R$ from Lepton 0|
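
The tagging convention of the tri-lepton rows can be sketched as follows. This is a minimal illustration under stated assumptions: leptons are plain dictionaries with hypothetical `charge`, `eta`, `phi` keys, the opposite-sign (OS) lepton is taken to be the one whose charge differs from the other two, and `tag_trilepton` is not a function of the notebook.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi

def delta_r(a, b):
    """Angular separation in the (eta, phi) plane."""
    return math.hypot(a['eta'] - b['eta'], delta_phi(a['phi'], b['phi']))

def tag_trilepton(leptons):
    """Return the leptons ordered as [tag 0, tag 1, tag 2]."""
    charges = [lep['charge'] for lep in leptons]
    total = sum(charges)
    if len(leptons) != 3 or abs(total) != 1:
        raise ValueError('expected three leptons that do not all share the same charge')
    # With charges (+,+,-) or (-,-,+) the OS lepton has charge equal to -total
    os_index = next(i for i, q in enumerate(charges) if q == -total)
    lep0 = leptons[os_index]
    others = [lep for i, lep in enumerate(leptons) if i != os_index]
    # Tag 1 is the lepton farther from lepton 0, tag 2 the closer one
    others.sort(key=lambda lep: delta_r(lep0, lep), reverse=True)
    return [lep0] + others

# Toy leptons with made-up values
toy = [
    {'name': 'a', 'charge': 1, 'eta': 0.0, 'phi': 0.0},
    {'name': 'b', 'charge': -1, 'eta': 1.0, 'phi': 1.0},
    {'name': 'c', 'charge': 1, 'eta': 1.2, 'phi': 1.1},
]
print([lep['name'] for lep in tag_trilepton(toy)])
```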

In the following subsections, we present the tables with the complete description of the variables we constructed. The functions used to define the new branches are implemented in C++ and contained in the `cpp.py` script. They are then called in the `DeclareVariables` functions contained in the `FilterFunctions.py` script.

#### 3.2. HLF of di-lepton categories (iSkim1, iSkim2)

Variable | Definition
:-:|:-:|
$$\eta_{max}$$ | max. absolute value of the pseudo-rapidity of the 2 leptons
$$m^{inv}$$ | invariant mass of the leptons
$$\Delta R_{hl/J}$$ | angular separation between the highest $p_T$ lepton direction and the axis of the nearest jet in the ($\eta$,$\phi$) plane
$$\Delta R_{ll/J}$$ | angular separation between the lowest $p_T$ lepton direction and the axis of the nearest jet in the ($\eta$,$\phi$) plane
$$\Delta R_{lep}$$ | angular separation between the 2 leptons
$$\Delta\phi_{MET/0}$$ | azimuthal angle difference between missing transverse energy (MET) and lepton with 0 tag
$$\Delta\phi_{MET/1}$$ | azimuthal angle difference between MET and lepton with 1 tag
$$ST$$| difference between MET and the scalar sum of all leptons and jets' $p_T$ 
$$m^{inv}_{Jet/Jet}$$ | invariant mass of 2 not b-tagged jets 


#### 3.3. HLF of tri-lepton categories (iSkim3, iSkim4)

Variable | Definition
:-:|:-:|
$$m^{inv}_{0/1}$$ | invariant mass of lepton with tag 0 and lepton with tag 1
$$m^{inv}_{0/2}$$ | invariant mass of lepton with tag 0 and lepton with tag 2
$$m^{inv}_{1/2}$$ | invariant mass of lepton with tag 1 and lepton with tag 2
$$m^{inv}_{3}$$ | invariant mass of all the 3 leptons
$$\Delta R_{1/J}$$ | angular separation between the lepton with tag 1 direction and the axis of the nearest jet in the ($\eta$,$\phi$) plane
$$\Delta R_{0/bJ}$$ | angular separation between the lepton with tag 0 direction and the axis of the nearest b-tagged jet in the ($\eta$,$\phi$) plane
$$\Delta R_{0/1}$$ | angular separation between the lepton with tag 0 direction and the lepton with tag 1 direction in the ($\eta$,$\phi$) plane
$$\Delta R_{0/2}$$ | angular separation between the lepton with tag 0 direction and the lepton with tag 2 direction in the ($\eta$,$\phi$) plane
$$ST$$| difference between MET and the scalar sum of all leptons and jets' $p_T$ 
$$\Delta\phi_{MET/0}$$ | azimuthal angle difference between missing transverse energy (MET) and lepton with 0 tag
$$\Delta\phi_{MET/1}$$ | azimuthal angle difference between missing transverse energy (MET) and lepton with 1 tag
$$\Delta\phi_{MET/2}$$ | azimuthal angle difference between missing transverse energy (MET) and lepton with 2 tag
$$m^{inv}_{Jet/Jet}$$ | invariant mass of 2 not b-tagged jets

#### 3.4. Additional Cuts
We applied further cuts based on the HLF just created.
##### Invariant Mass
In all 4 categories, events with reconstructed invariant masses $<15$ GeV are rejected.

##### Cut in the Z resonant region
In order to reduce the background contribution from resonant $Z$ production another filter is added:
   + **iSkim1**: events in which the two selected leptons have opposite charge and $|m^{inv} - 91.2 \text{ GeV}| < 10 \text{ GeV}$ are rejected
   + **iSkim2**: no rejection (the selected leptons have different flavour, so they cannot come from the same $Z$ decay)
   + **iSkim3**: events with $|m^{inv}_{0/1} - 91.2 \text{ GeV}| < 10 \text{ GeV}$ or $|m^{inv}_{0/2} - 91.2 \text{ GeV}| < 10 \text{ GeV}$ are rejected, so that only events with both masses outside the window are kept
   + **iSkim4**: events with $|m^{inv}_{0/1} - 91.2 \text{ GeV}| < 10 \text{ GeV}$ or $|m^{inv}_{0/2} - 91.2 \text{ GeV}| < 10 \text{ GeV}$ are rejected, so that only events with both masses outside the window are kept
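
A minimal sketch of the mass-based filters follows. It assumes, as the stated aim of suppressing resonant $Z$ production implies, that events inside the $\pm 10$ GeV window around $91.2$ GeV are the ones removed. The function names are hypothetical and do not come from `FilterFunctions.py`.

```python
Z_MASS = 91.2       # GeV, value quoted in the text
HALF_WIDTH = 10.0   # GeV, half-width of the veto window
MIN_MASS = 15.0     # GeV, low-mass threshold quoted in the text

def outside_z_window(mass):
    """True if the mass lies outside the window around the Z peak."""
    return abs(mass - Z_MASS) > HALF_WIDTH

def passes_low_mass_cut(mass):
    """Events with invariant mass below the threshold are rejected."""
    return mass >= MIN_MASS

def keep_dilepton(mass, charge0, charge1):
    """iSkim1 (assumed): veto opposite-charge pairs inside the Z window."""
    opposite_charge = charge0 * charge1 < 0
    return not (opposite_charge and not outside_z_window(mass))

def keep_trilepton(m01, m02):
    """iSkim3/iSkim4 (assumed): both pairs involving lepton 0 must be off-peak."""
    return outside_z_window(m01) and outside_z_window(m02)

print(keep_dilepton(91.0, +1, -1), keep_trilepton(50.0, 120.0))
```

In the tri-lepton case only the pairs 0/1 and 0/2 are tested because lepton 0 is the opposite-sign one, so these are the only opposite-charge pairs in the event.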
   
##### Same-sign requirement (di-lepton categories)
Once agreement between data and Monte Carlo simulations has been reached, a filter on the charge of the selected leptons is imposed in order to keep only same-sign (**SS**) pairs.

### 4. Produce histograms

The range of the histogram for each variable is declared in the `ff.SkimRanges` dictionary. Each entry consists of the variable name as key and a tuple specifying the histogram layout as value. The tuple sets the number of bins, the lower edge and the upper edge of the histogram.

The `make_hist` function loops over the outputs of the skimming step and produces the histograms required for the final plotting, which are saved in a ROOT file. Here, the dictionary entry corresponding to the iSkim category under study is called as `ff.DeclareVariables[nSkim]`, this time with the `save` flag set to `False`.
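
To make the `(number of bins, lower edge, upper edge)` layout concrete, the sketch below fills a fixed-width weighted histogram in plain Python. The entries of `example_ranges` are hypothetical values that only mimic the layout described above, and `bin_index` and `fill` are illustrative helpers. In the notebook this work is done by `Histo1D`.

```python
# Hypothetical entries mimicking the (n_bins, low, high) layout
example_ranges = {'inv_m01': (20, 0.0, 200.0), 'ST': (10, 0.0, 1000.0)}

def bin_index(value, n_bins, low, high):
    """Return the 0-based bin index, or None if the value is out of range."""
    if not (low <= value < high):
        return None
    index = int((value - low) / (high - low) * n_bins)
    return min(index, n_bins - 1)  # protect against floating-point round-up

def fill(values, weights, layout):
    """Accumulate the event weights into fixed-width bins."""
    n_bins, low, high = layout
    counts = [0.0] * n_bins
    for value, weight in zip(values, weights):
        index = bin_index(value, n_bins, low, high)
        if index is not None:  # under/overflow is simply dropped in this sketch
            counts[index] += weight
    return counts

counts = fill([5.0, 15.0, 15.5, 250.0], [1.0, 0.5, 0.5, 2.0], example_ranges['inv_m01'])
print(counts[:3])
```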

After inspecting the plots, we grouped the background processes into 6 categories according to their significance:

- Single top;
- $t\bar{t}$;
- $t\bar{t}\rightarrow \ell\nu \ell\nu$;
- $t\bar{t}\rightarrow$ semi-leptonic;
- Diboson;
- Others.

We excluded 2 categories of background processes, namely the Drell-Yan and QCD processes, since their proper handling requires sophisticated data-driven techniques (as shown [here](https://arxiv.org/abs/1110.1368)). Furthermore, their contributions were in clear excess with respect to the data.

The `bookHistogram` function takes into account the correct weighting of each event, stored in the `eventWeightLumi` variable. The MC background categories are stacked in the final plot. The uncertainty of the MC simulated events is computed individually for each process; the contributions are then combined by adding all MC histograms into a single histogram, of which we plot only the uncertainty.<br>
The signal is plotted with an arbitrary normalization, obtained by scaling it to the same area as the total background histogram, so that the shape of the distribution can be compared.<br>
The data histogram is then superimposed.
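
The two operations described above can be sketched without ROOT. The sketch assumes per-bin errors are combined in quadrature when histograms are added, and that all histograms share the same uniform binning, so that scaling to equal area is equivalent to scaling to equal summed content. The dictionary layout and function names are hypothetical.

```python
import math

def add_histograms(hists):
    """Sum bin contents and combine per-bin errors in quadrature."""
    n_bins = len(hists[0]['content'])
    content = [sum(h['content'][i] for h in hists) for i in range(n_bins)]
    error = [math.sqrt(sum(h['error'][i] ** 2 for h in hists)) for i in range(n_bins)]
    return {'content': content, 'error': error}

def scale_to_area(signal, target_area):
    """Rescale the signal bin contents so that they sum to target_area."""
    area = sum(signal)
    if area == 0:
        raise ValueError('cannot normalize an empty signal histogram')
    factor = target_area / area
    return [factor * c for c in signal]

# Toy background histograms with made-up contents and errors
bkg_a = {'content': [3.0, 4.0], 'error': [3.0, 0.0]}
bkg_b = {'content': [1.0, 2.0], 'error': [4.0, 1.0]}
total = add_histograms([bkg_a, bkg_b])
signal = scale_to_area([1.0, 3.0], sum(total['content']))
print(total, signal)
```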

In [None]:
ROOT.gROOT.SetBatch(True)

# Book a histogram for a specific variable: takes weights into account
def bookHistogram(df, variable, range_):
    return df.Histo1D(rr.RDF.TH1DModel(variable, variable, range_[0], range_[1], range_[2]), variable, 'eventWeightLumi')

# Write a histogram with a given name to the output ROOT file
def writeHistogram(h, name):
    h.SetName(name)
    h.Write()

# Main function of the histogramming step
def make_hist(nSkim):
    rr.EnableImplicitMT()
    ranges = ff.SkimRanges[nSkim]
    
    # Create output file
    tfile = ROOT.TFile(dirPlotPath + 'histogram_{}.root'.format(nSkim), 'RECREATE')
    variables = ranges.keys()
    
    fdfData = ff.DeclareVariables[nSkim](dfData, '', save=False)
    fdfMCSig = ff.DeclareVariables[nSkim](dfMCSig, '', save=False)
    
    # Loop through skimmed datasets and produce histograms of variables
    hists = {}
    for variable in variables:
        hists[variable] = bookHistogram(fdfData, variable, ranges[variable])

    hists_sig_mc = {}
    for variable in variables:
        hists_sig_mc[variable] = bookHistogram(fdfMCSig, variable, ranges[variable])

    # Write histograms to output file
    for variable in variables:
        writeHistogram(hists[variable], '{}_{}'.format('Data', variable))
    for variable in variables:
        writeHistogram(hists_sig_mc[variable], '{}_{}'.format('MCSig', variable))
        
    
    for key, value in dfMCBkg.items():
        fdfMCBkg = ff.DeclareVariables[nSkim](value, '', save=False)
        
        hists = {}
        for variable in variables:
            hists[variable] = bookHistogram(fdfMCBkg, variable, ranges[variable])
        for variable in variables:
            writeHistogram(hists[variable], '{}_{}'.format(key, variable))
    
    tfile.Close()

In [None]:
make_hist(4)

### 5. Implementation of the plotting step of the analysis

In [None]:
nSkim = 3
labels = ff.SkimLabels[nSkim]
    
# Specify the color for each process:
# - Signal
colors = {
        'Data': ROOT.TColor.GetColor('#BF2229'),
        'MCSig': ROOT.TColor.GetColor('#00A88F'),
        }
# - MC BKG
colorsBkg = {
    'ST_': ROOT.TColor.GetColor(100, 192, 232),
    'TTTT_Tune': ROOT.TColor.GetColor(155, 152, 204),
    'TTToHadronic': ROOT.TColor.GetColor(155, 152, 204),
    'TTWJetsToLNu': ROOT.TColor.GetColor(155, 152, 204),
    'TTZToLLNuNu': ROOT.TColor.GetColor(155, 152, 204),
    'TTTo2L2Nu': ROOT.TColor.GetColor(248, 206, 104),
    'TTToSemiLeptonic': ROOT.TColor.GetColor(250, 202, 255),
    'WGToLNuG': ROOT.TColor.GetColor(222, 90, 106),
    'WJetsToLNu': ROOT.TColor.GetColor(222, 90, 106),
    'WWTo2L2Nu': ROOT.TColor.GetColor(222, 90, 106),
    'WWW': ROOT.TColor.GetColor(222, 90, 106),
    'WWZ': ROOT.TColor.GetColor(222, 90, 106),
    'WZG': ROOT.TColor.GetColor(222, 90, 106),
    'WZTo1L1Nu2Q': ROOT.TColor.GetColor(222, 90, 106),
    'WZTo2L2Q': ROOT.TColor.GetColor(222, 90, 106),
    'WZTo3LNu': ROOT.TColor.GetColor(222, 90, 106),
    'WZZ': ROOT.TColor.GetColor(222, 90, 106),
    'WmWmJJ': ROOT.TColor.GetColor(222, 90, 106),
    'WminusH': ROOT.TColor.GetColor(222, 90, 106),
    'WpWpJJ': ROOT.TColor.GetColor(222, 90, 106),
    'ZG': ROOT.TColor.GetColor(222, 90, 106),
    'ZZ': ROOT.TColor.GetColor(222, 90, 106),
    'tZq': ROOT.TColor.GetColor(6, 138, 43),
}

In [None]:
# Main function of the plotting step
def plot_hist(variable, nSkim, mu=0):
    
    tfile = ROOT.TFile(dirPlotPath + 'histogram_{}.root'.format(nSkim), 'READ')

    # Styles
    ROOT.gStyle.SetOptStat(0)

    ROOT.gStyle.SetCanvasBorderMode(0)
    ROOT.gStyle.SetCanvasColor(ROOT.kWhite)
    ROOT.gStyle.SetCanvasDefH(600)
    ROOT.gStyle.SetCanvasDefW(600)
    ROOT.gStyle.SetCanvasDefX(0)
    ROOT.gStyle.SetCanvasDefY(0)

    ROOT.gStyle.SetPadTopMargin(0.08)
    ROOT.gStyle.SetPadBottomMargin(0.13)
    ROOT.gStyle.SetPadLeftMargin(0.16)
    ROOT.gStyle.SetPadRightMargin(0.05)

    ROOT.gStyle.SetHistLineColor(1)
    ROOT.gStyle.SetHistLineStyle(0)
    ROOT.gStyle.SetHistLineWidth(1)
    ROOT.gStyle.SetEndErrorSize(2)
    ROOT.gStyle.SetMarkerStyle(20)

    ROOT.gStyle.SetOptTitle(0)
    ROOT.gStyle.SetTitleFont(42)
    ROOT.gStyle.SetTitleColor(1)
    ROOT.gStyle.SetTitleTextColor(1)
    ROOT.gStyle.SetTitleFillColor(10)
    ROOT.gStyle.SetTitleFontSize(0.05)

    ROOT.gStyle.SetTitleColor(1, 'XYZ')
    ROOT.gStyle.SetTitleFont(42, 'XYZ')
    ROOT.gStyle.SetTitleSize(0.05, 'XYZ')
    ROOT.gStyle.SetTitleXOffset(1.00)
    ROOT.gStyle.SetTitleYOffset(1.60)

    ROOT.gStyle.SetLabelColor(1, 'XYZ')
    ROOT.gStyle.SetLabelFont(42, 'XYZ')
    ROOT.gStyle.SetLabelOffset(0.007, 'XYZ')
    ROOT.gStyle.SetLabelSize(0.04, 'XYZ')

    ROOT.gStyle.SetAxisColor(1, 'XYZ')
    ROOT.gStyle.SetStripDecimals(True)
    ROOT.gStyle.SetTickLength(0.03, 'XYZ')
    ROOT.gStyle.SetNdivisions(510, 'XYZ')
    ROOT.gStyle.SetPadTickX(1)
    ROOT.gStyle.SetPadTickY(1)

    ROOT.gStyle.SetPaperSize(20., 20.)
    ROOT.gStyle.SetHatchesLineWidth(5)
    ROOT.gStyle.SetHatchesSpacing(0.05)

    ROOT.TGaxis.SetExponentOffset(-0.08, 0.01, 'Y')
    
    legend = ROOT.TLegend(0.64, 0.70, 0.95, 0.91)
    legend.SetNColumns(2)

    # Data + MC
    data = getHistogram(tfile, 'Data', variable)
    MCSig = getHistogram(tfile, 'MCSig', variable)
    
    stack = ROOT.THStack('', '')
    seen, count, areaBkg, first = [], 0, 0, True
    titles = ['Single top', 't#bar{t}','t#bar{t}#rightarrow l#nu l#nu',
              't#bar{t}#rightarrow s-lep','Diboson', 'Others']
    for key, value in colorsBkg.items():
        histo = getHistogram(tfile, key, variable)
        histo.SetLineWidth(0)
        histo.SetFillColor(value)
        areaBkg += histo.Integral('width')
        stack.Add(histo)
        if first:
            bkgError = histo.Clone()
            first = False
        else:
            bkgError.Add(histo)
        if value not in seen:
            legend.AddEntry(histo, titles[count], 'f')
            count += 1
            seen.append(value)
    
    bkgError.SetFillStyle(3002)
    bkgError.SetFillColor(12)
    bkgError.SetMarkerSize(0)
    legend.AddEntry(bkgError, 'BKG err', 'f')
    
    if mu == 0:
        areaSig = MCSig.Integral('width')
        scale = areaBkg/areaSig
        MCSig.Scale(scale)
    else:
        MCSig.Scale(mu)
        MCSig.Add(bkgError)
        
    # Draw histograms
    data.SetMarkerStyle(20)
    data.SetLineColor(ROOT.kBlack)
    data.SetLineWidth(3)
    MCSig.SetLineColor(colors['MCSig'])
    MCSig.SetLineWidth(3)

    c = ROOT.TCanvas('', '', 600, 600)
    
    name = data.GetTitle()
    if name in labels:
        title = labels[name]
    else:
        title = name

    stack.Draw('HIST')
    bkgError.Draw('E2 SAME')
    MCSig.Draw('HIST SAME')
    data.Draw('E1P SAME')
    
    stack.GetXaxis().SetTitle(labels[variable])
    stack.GetYaxis().SetTitle('N_{Events}')
    stack.SetMaximum(max(bkgError.GetMaximum(), data.GetMaximum()) * 1.6)
    stack.SetMinimum(1.0)

    # Add legend
    legend.AddEntry(MCSig, 'FCNC', 'l')  # the signal is drawn as a line, so use a line entry
    legend.AddEntry(data, 'Data', 'lep')
    legend.SetBorderSize(0)
    legend.Draw()

    # Add title
    latex = ROOT.TLatex()
    latex.SetNDC()
    latex.SetTextSize(0.04)
    latex.SetTextFont(42)
    latex.DrawLatex(0.16, 0.935, '#bf{CMS FCNC}')

    # Save
#     c.SaveAs(dirPlotPath + '{}_{}'.format(variable, nSkim))
    c.SaveAs(dirPlotPath + '{}_final_histogram_{}.png'.format(variable, nSkim))

In [None]:
# Loop over all variable names and make a plot for each
for variable in labels.keys():
    plot_hist(variable, nSkim, 0.0112)

In [None]:
from IPython.display import Image, display
for image in sorted(os.listdir(dirPlotPath)):
    # str.endswith does not expand wildcards: match the plain extension
    if image.endswith('.png'):
        display(Image(filename=dirPlotPath + image))

[Plots: max_eta1, lep_eta0, lep_eta1, ST, inv_m01, jet_pt0, dR01, MET]
#### iSkim 1

<img align='right' src='https://www.unidformazione.com/wp-content/uploads/2018/04/unipd-universita-di-padova.png' alt='Drawing' style='width:450px;'/>
<img align='left' src='https://www.unidformazione.com/wp-content/uploads/2018/04/unipd-universita-di-padova.png' alt='Drawing' style='width:450px;'/>

### 6. Cut significance

Once agreement between data and MC background was reached, we introduced new cuts aimed at increasing the signal-over-background ratio $S/\sqrt{B}$. The `signif` function checks the effectiveness of each cut and appends the result to a file.
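
The figure of merit itself is simple to state in code. The sketch below uses hypothetical yields and a stand-alone `significance` helper that is not the notebook's `signif` function, which reads the yields from the histogram file.

```python
import math

def significance(signal_yield, background_yields):
    """Return S / sqrt(B), with B the sum of all background yields."""
    total_background = sum(background_yields)
    if total_background <= 0:
        raise ValueError('total background yield must be positive')
    return signal_yield / math.sqrt(total_background)

# Hypothetical yields before and after a candidate cut
before = significance(5.0, [60.0, 40.0])
after = significance(4.0, [20.0, 5.0])
print(round(before, 3), round(after, 3))
```

A cut is worth keeping when it removes enough background to raise the ratio even though some signal is lost, as in the toy numbers above.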

In [None]:
nSkim = 2
variable = 'lep_eta1'

def signif(nSkim, title, variable=variable):
    # Read the histograms produced by make_hist for this category
    tfile = ROOT.TFile(dirPlotPath + 'histogram_{}.root'.format(nSkim), 'READ')
    MCSig = getHistogram(tfile, 'MCSig', variable)

    # Total background yield: sum of the integrals of all MC background histograms
    areaBkg = 0
    for key in dfMCBkg.keys():
        histo = getHistogram(tfile, key, variable)
        areaBkg += histo.Integral('width')

    areaSig = MCSig.Integral('width')
    significance = round(areaSig/np.sqrt(areaBkg), 3)
    tfile.Close()

    # Append the result to the log file (the with-block closes it automatically)
    with open('signif.txt', 'a') as f:
        f.write('\niSkim' + str(nSkim) + '\t' + title + '\t' + str(significance))

### 7. Write the snapshots to disk

Through the `FilterFunctions.py` script, the variables of interest are written to disk in a new ROOT file by means of the `define.Snapshot` command.

In [None]:
%%bash
rm /data/Skim/*.root

In [None]:
%%bash
ls /data/Skim

In [None]:
for i in range(1,5): ff.DeclareVariables[i](dfData, 'Data', save=True)

In [None]:
for i in range(1,5): ff.DeclareVariables[i](dfMCSig, 'Signal', save=True)

In [None]:
for key, value in dfMCBkg.items():
    for i in range(1,5): ff.DeclareVariables[i](value, 'MC' + key, save=True)


#### To do:
- units of measurement on the plots
- final run: recreate the histograms + classifier
- tidy up the plots
- bibliography
- general code formatting
- hyperref
- slides