VSOP26HandsOnEx

Sezen Sekmen edited this page Dec 11, 2020 · 7 revisions

VSOP26 analysis methods hands-on exercise with ADL and CutLang

In these hands-on exercises, we will look into events simulated for different physics processes and analyze their basic properties. This task normally requires knowledge of one or more programming languages and of sophisticated analysis frameworks. For these exercises, however, we will use a newly developed approach and setup that allows more direct interaction with the data and requires neither programming nor extensive knowledge of frameworks (beyond being able to run a few commands with the ROOT analysis framework). The approach is called Analysis Description Language (ADL), and the technical setup is called CutLang. In this approach, the analysis tasks are written with an English-like syntax in a human-readable text file, rather than being programmed into a framework (e.g. using C++ or Python). This text file, which we call the ADL file, is then automatically interpreted and run on events by the CutLang framework. Note that even though CutLang is itself a framework, you will not need to do any programming with it; you will only use it to run on events. A few very basic programming commands will be useful only when looking into and working with the results produced by CutLang, which are written to a ROOT file. These results include histogrammed distributions of event quantities.

ADL and CutLang will be briefly introduced at the beginning of the first hands-on session, but since we have very limited time, we recommend that you take a preliminary look at the ADL website and, in particular, check the introductory presentation at the CMS Open Data for Theorists workshop linked here.

Exercise setup

There are two options for performing these hands-on exercises and working with CutLang:

  • Installing CutLang on your personal computer or university cluster: If you have a Unix or macOS system and you are accustomed to working in a terminal, this option is recommended.
    • Pros: More flexible system where you will have more control and easier interaction with ROOT. You will have a setup ready for future explorations.
    • Cons: You will have to do the installation and compiling yourself.
  • Using binder and jupyter notebook interface: If you have a Windows OS, or are not accustomed to working with terminals, you can proceed with this option.
    • Pros: No installation required. Only a browser window is needed.
    • Cons: Less flexible.

Installing CutLang on your personal computer or university cluster

1. Install ROOT

CutLang is based on the ROOT analysis framework, so the first step is to install ROOT. There are several ways of installing ROOT, all explained at https://root.cern/install/. Please choose the installation method that is appropriate for your OS and most convenient for you. ROOT can be installed straightforwardly on Linux and Mac systems. There is also a beta release for Windows. Since CutLang has not been tested on Windows, we recommend the binder/jupyter option to Windows users. However, if you would like to familiarize yourself with ROOT, we recommend you try it!

At the time of writing, the latest stable ROOT release is 6.22/02, but any release from 6.18 onward is acceptable. If you already have ROOT 6.18 or higher installed, you do not need to update to 6.22/02.
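If you are unsure which ROOT release you have, the version requirement above can be checked from the terminal. The following is a minimal sketch, not part of the official setup: it assumes that ROOT's root-config tool is on your PATH and that it prints the version as either 6.22/02 or 6.22.02. The helper name root_version_ok is made up for this illustration.

```shell
# Hypothetical helper: succeed if the given ROOT version string
# (e.g. "6.22/02" or "6.22.02") is at least 6.18.
root_version_ok() {
    ver=$(printf '%s' "$1" | tr '/' '.')   # "6.22/02" -> "6.22.02"
    major=${ver%%.*}                       # text before the first dot
    rest=${ver#*.}                         # text after the first dot
    minor=${rest%%.*}                      # second field
    # Reject anything that is not purely numeric.
    case "$major" in ''|*[!0-9]*) return 1 ;; esac
    case "$minor" in ''|*[!0-9]*) return 1 ;; esac
    minor=${minor#0}                       # drop one leading zero ("08" -> "8")
    minor=${minor:-0}
    [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 18 ]; }
}

# Report on the local installation, if any.
if command -v root-config >/dev/null 2>&1; then
    if root_version_ok "$(root-config --version)"; then
        echo "ROOT version is 6.18 or newer."
    else
        echo "ROOT is older than 6.18; please update."
    fi
else
    echo "root-config not found; is ROOT installed and set up?"
fi
```

If the last message appears even though ROOT is installed, the ROOT environment has probably not been set up in the current terminal session.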

2. Check flex and bison

CutLang is a so-called runtime interpreter: it directly reads and interprets your analysis description, written in the easy syntax of ADL, without the need for compilation. To do this, CutLang relies on the tools flex and bison, which are available by default on most Unix systems. Please check that these tools are available by running:

which flex
which bison

If these do not exist, please check the availability of lex and yacc:

which lex
which yacc

In the extremely unlikely case that these tools do not exist, you would need to install them.
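The two checks above can also be combined into one short script. This is only a convenience sketch: the helper name first_available is made up for this illustration, and it does nothing beyond what the which commands above already do.

```shell
# Hypothetical helper: print the first command from the argument list
# that is found on PATH; fail if none of them is found.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Prefer flex/bison, fall back to lex/yacc as described above.
lexer=$(first_available flex lex) || echo "No lexer found: please install flex."
parser=$(first_available bison yacc) || echo "No parser generator found: please install bison."
echo "lexer: ${lexer:-none}, parser generator: ${parser:-none}"
```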

3. Install CutLang

The CutLang source code is hosted at https://github.com/unelg/CutLang . To install:

 git clone https://github.com/unelg/CutLang.git
 cd CutLang/CLA
 make
 cd ../runs

This is the only time you will need to compile CutLang. The runs directory in CutLang is where we will run everything.

4. Test your CutLang installation

Make sure you are still in the runs directory in CutLang. Get a sample events file:

wget https://www.dropbox.com/s/zza28peyjy8qgg6/T2tt_700_50.root

Run CutLang using one of the basic example ADL files, exHistos.adl:

./CLA.sh T2tt_700_50.root DELPHES -i exHistos.adl

You should see a table with some results printed in the terminal, and an output ROOT file called histoOut-exHistos.root should appear in the runs directory. Also, look into exHistos.adl to see what simple steps were taken.

Using binder and jupyter notebook interface

Open a browser and simply click here: https://mybinder.org/v2/gh/unelg/CutLang/master?urlpath=/lab/tree/binder%2Fexample1.ipynb

This will bring up a binder installation of CutLang. After a short wait, the jupyter notebook example1.ipynb will come up. It shows how CutLang can be used for analysis. Execute the cells in the notebook.

Please familiarize yourself with the menus of binder. Learn how to navigate between files and notebooks.

Introduction to ROOT

Your first run with CutLang should have given a ROOT file with a few histograms in it. Now we will learn the very basics of looking into that ROOT file and plotting histograms. Here is a nice reference tutorial from Fermilab for ROOT basics.

  • If you have a CutLang installation, you can directly run ROOT macros or scripts on your computer.

Running either the macro or the script will output two PDF files with plots: hjetpt.pdf and hmetjet1pt.pdf

One practical thing you can do is to look directly into the ROOT file with a GUI browser. In the runs directory, execute

root -l histoOut-exHistos.root

At the ROOT C++ interpreter prompt, type

TBrowser b

This will open up the browser GUI window. You can click on the file name in the left side menu, and start browsing. The browser allows you to do many simple operations by mouse clicks. You can find a nice description of what one can do with the browser in the relevant section of the Fermilab tutorial.

  • If you are using binder, you will interact with ROOT via jupyter notebooks. The notebooks can be seen in the left side menu.
    • C++ users: execute ROOTintroCpp.ipynb
    • Python users: execute ROOTintroPython.ipynb

IMPORTANT: Please take a very careful look inside these scripts or notebooks and understand the commands used for opening and navigating files and plotting histograms. You will be making your plots using these commands during the hands-on sessions.

The Z(ee)Z(mumu) exercise

We will look at events with two Z bosons, where the Z bosons decay to either two electrons or two muons, and plot the invariant masses. We will also see the effect of applying a kinematic selection on the electrons and muons.

If you are using CutLang directly, go to the CutLang/runs directory. If you are using binder, go to the example1.ipynb notebook and execute the first cell, which takes you to the CutLang/runs directory.

For this exercise, we will use events from ATLAS open data. Copy the file using

wget http://opendata.atlas.cern/release/samples/MC/mc_105986.ZZ.root

or, for binder

!wget http://opendata.atlas.cern/release/samples/MC/mc_105986.ZZ.root

We will work with a mostly-written ADL file. Similarly, copy the ADL file from here:

https://www.dropbox.com/s/ykf14n40u30f3g9/VSOPZZ.adl

Complete the assignments in the ADL file and run CutLang. Here is the command to run CutLang:

./CLA.sh mc_105986.ZZ.root ATLASOD -i VSOPZZ.adl

To run on a limited number of events, add -e followed by the number of events to the end of the command, e.g.

./CLA.sh mc_105986.ZZ.root ATLASOD -i VSOPZZ.adl -e 10000

(Note that binder crashes when running over many events due to memory issues, so there you can only run on a limited number of events.)

Make sure to observe the CutLang text output, which gives the cutflow.

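The next task is to make some plots. For this, there are three options. With a direct installation of CutLang, pick whichever of the first two you prefer:

  • The ROOT macro runs/plotting/ROOTVSOPZZ.C, within a direct installation of CutLang.
  • The Python script runs/plotting/ROOTVSOPZZ.py, within a direct installation of CutLang.
  • The notebook ROOTVSOPZZ.ipynb in binder.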

The SUSY exercise

We will implement a simple new physics (SUSY) search.

The signal and background files are here:

The ADL file is here: https://www.dropbox.com/s/y2vnse0alq6lct8/VSOPSUSY.adl

Running CutLang for signal:

./CLA.sh T2tt_700_50.root DELPHES -i VSOPSUSY.adl
cp histoOut-VSOPSUSY.root histoOut-VSOPSUSY_sg.root

Running CutLang for background:

./CLA.sh mc_117050.ttbar_lep.root ATLASOD -i VSOPSUSY.adl
cp histoOut-VSOPSUSY.root histoOut-VSOPSUSY_bg.root

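Note that CutLang writes its output to the same file, histoOut-VSOPSUSY.root, in both runs. Since we have two processes, i.e. a signal and a background, we copy the output to a file with a different name after each run, so that the second run does not overwrite the results of the first.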
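If you find yourself repeating the run-and-copy steps often, they can be wrapped in a small shell function. This is a sketch under assumptions, not part of CutLang: the function name run_sample and the CLA variable are made up for this illustration, and the output name histoOut-<ADL name>.root is inferred from the examples on this page.

```shell
# Hypothetical helper: run CutLang on one sample and keep a tagged copy
# of the histogram file.
# Arguments: events file, input type (e.g. DELPHES or ATLASOD), ADL file, tag.
# The CLA variable (an assumption of this sketch) can point to a different
# CutLang run script; by default ./CLA.sh in the current directory is used.
run_sample() {
    events=$1
    intype=$2
    adl=$3
    tag=$4
    base=$(basename "$adl" .adl)          # "VSOPSUSY.adl" -> "VSOPSUSY"
    "${CLA:-./CLA.sh}" "$events" "$intype" -i "$adl" || return 1
    # Assumed output naming, as seen in the examples: histoOut-<ADL name>.root
    cp "histoOut-$base.root" "histoOut-${base}_$tag.root"
}

# Usage, from the CutLang runs directory (same commands as above):
#   run_sample T2tt_700_50.root DELPHES VSOPSUSY.adl sg
#   run_sample mc_117050.ttbar_lep.root ATLASOD VSOPSUSY.adl bg
```

The function stops without copying anything if the CutLang run itself fails, so a stale histoOut file from an earlier run is not mistaken for a new result.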

There are two assignments:

  1. Defining a 0-lepton signal region, and running CutLang over signal and background events to apply the event selection in the signal region. Then, we will compare the distributions of signal vs. background.
    • Instructions for defining the signal region are in the ADL file.
    • The scripts for analyzing the output ROOT files to make signal-background comparison plots are runs/plotting/ROOTVSOPSUSYsgbg.py, runs/plotting/ROOTVSOPSUSYsgbg.C and binder/ROOTVSOPSUSYsgbg.ipynb
  2. (Optional) Defining a 1-lepton control region, and comparing the background distributions in the signal region and the control region.
    • Instructions for defining the control region are in the ADL file.
    • A prepared script/notebook does not exist for comparing the background distributions in the signal region with background distribution in the control region. However the ROOTVSOPSUSYsgbg script/notebook can be modified to produce such plots.