Reducible Background Studies

This repo contains scripts to:

Skim NTuples.
Retrieve Data and MC info.
Calculate lepton fake rates.
Remove the WZ contribution.
Estimate the total non-ZZ background contribution.
Plot the resulting distributions.

Other scripts:

Compare specific events between analyzers:
- UFHZZAnalyzer (xBF)
- CJLST Analyzer
- Multiple quartets per event (Jake)
  - sidequests/scripts/print_commonanduniqueevts_jakevsbbf.py
  - sidequests/scripts/compare_jakesel_bbfsel.py
  - sidequests/scripts/run_bbfana_using_runlumievent.py
  - sidequests/scripts/count_bbf_or_jake_events.py
Run Sequentially:
```
# Create fake rates with WZ removed.
python WZremoval_from_FR_comp.py

# Plot fake rates.
python scripts/plotters/plot_fakerate_hists.py \
   -d "Hist_Data_ptl3_Data_2016.root" \
   -w "Hist_Data_ptl3_WZremoved_2016.root" \
   -o "fakerates_2016UL.pdf" \
   -y 2016
```

Skim NTuples

Use the UFHZZ4LAnalyzer to skim the MiniAOD files (Data or MC).
Combine files (using hadd) of the same process (e.g., MuonEG runs A-D) with:
- scripts/hadders/haddfiles_on_slurm_step01_runstogether.py
  - Submits hadd jobs to SLURM.
  - NOTE: If you get an error like the one below then first get rid of extraneous branches using skimmers/skim_useless_branches.C; then resume hadding.
```
Error in <TBufferFile::WriteByteCount>: bytecount too large (more than 1073741822)
```
Skim any extraneous branches to reduce file size:
- skimmers/skim_useless_branches_onslurm.py
- NOTE: Make sure you keep the branches you want in the skimmer template:
  - skimmers/skim_useless_branches_template.C
Now hadd together the "AllRun" files with:
- scripts/hadders/haddfiles_on_slurm_step02_datasetstogether.py
  - Submits the one hadd job to SLURM.
Remove duplicate events (same Run:Lumi:Event) with one of:
- skimmers/remove_duplicates.py
- skimmers/remove_duplicates_Filippo.C
- NOTE: The Python script may be faster than the C++ one:
  
  Script user time sys time TOTAL time
  
  Python 19m50.211s 0m30.203s 20m20.413s
  
  C++ 21m22.529s 0m21.924s 21m44.453s
Combine Data files into a single file (e.g. Data2018_NoDuplicates.root).
- If you get the 'bytecount too large' error, consider first trimming branches, then hadd together, and THEN remove duplicates.

Retrieving Data and MC Info

Cross Sections for MC

You will find many different values for cross section. After some comparisons and asking the experts, it seems it is best to use the values recommended by the Higgs MC working group.

For historical purposes, let it be noted that the CJLST group uses these values.

You can also find cross sections associated with the generated samples, found on McM:

Go to https://cms-pdmv.cern.ch/mcm/ > Request > Output Dataset
Search for your data set by providing the data set name.
Under 'PrepId' column, copy the name of the file.
Under 'Dataset name' column, click on the icon to the right of the name of the data set (it looks like a file explorer).
Show all results (bottom-righthand corner).
Search (Ctrl-F) for the PrepId. Even if a few options are found, they should all have the same LHEGS file. Click this LHEGS file name.
Then Select View > Generator parameters. The cross section is in the 'Generator Parameters' column to the right.

Put cross sections in constants/analysis_params.py.

Retrieve Integrated Lumi (L_int) for Data

crab report -d <crab_dir>

This produces the file <crabdir>/results/processedLumis.json. Then do:

brilcalc lumi -c web -i <crabdir>/results/processedLumis.json

Add up the L_int for all data sets of a single kind (e.g. only SingleMuon). Each data set should give a similar value as the other data sets. Take the average or the min and put this number in constants/analysis_params.py

Sum GenWeights for MC

Get the effective number of MC events (the sum of gen weights) with:

scripts/helpers/print_sumWeights.py

h = f.Get('Ana/sumWeights')
h.GetBinContent(1)

Put the number of events in constants/analysis_params.py.

Calculate Fake Rates

Use the Z+L background control region (CR) to calculate how often a non-prompt lepton passes tight selection:

python scripts/main_FR_CR.py

Produces a .root file which contains histograms of the total number of leptons which passed tight selection, total number of loose leptons, and their ratio (fake rate) in bins of pT(lep3).
Calls analyzeZX.py.
Uses MC_composition.py to call PartOrigin().
- Checks the type of fake procedure (checking the ID of the parent).

WZ Removal

Since the WZ process produces 3 prompt leptons, we must subtract this from the Z+L CR:

python WZremoval_from_FR_comp.py

Plot the fake rate histograms (before and after WZ removal) with:

scripts/plotters/plot_fakerate_hists.py

Estimate Reducible Background

Use Data and Monte Carlo ZZ->4l (irreducible background) to estimate the RB. Choose a framework to work with:

Vukasin's Framework.
Jake's Framework.

Jake's Framework

Select "OS Method" events (2P2F and 3P1F) with:

skimmers/select_evts_OSmethod_multiquartetperevt.py

Vukasin's Framework

python main_estimateZX_ntuples.py

Calls estimateZX.py.
Produces final distributions.

Print out the estimates (integrals) within the histograms using:

python estimate_final_numbers_macro.py

Plot the 2P2F/3P1F distributions

python plotting_macros.py

Name		Name	Last commit message	Last commit date
Latest commit History 131 Commits
classes		classes
constants		constants
scripts		scripts
sidequests		sidequests
skimmers		skimmers
.gitignore		.gitignore
README.md		README.md
setup_hpg.sh		setup_hpg.sh
setup_lxplus.sh		setup_lxplus.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reducible Background Studies

Skim NTuples

Retrieving Data and MC Info

Cross Sections for MC

Retrieve Integrated Lumi (L_int) for Data

Sum GenWeights for MC

Calculate Fake Rates

WZ Removal

Estimate Reducible Background

Jake's Framework

Vukasin's Framework

Plot the 2P2F/3P1F distributions

About

Releases

Packages

Contributors 2

Languages

Script	user time	sys time	TOTAL time
Python	19m50.211s	0m30.203s	20m20.413s
C++	21m22.529s	0m21.924s	21m44.453s

rosedj1/ZplusXpython

Folders and files

Latest commit

History

Repository files navigation

Reducible Background Studies

Skim NTuples

Retrieving Data and MC Info

Cross Sections for MC

Retrieve Integrated Lumi (L_int) for Data

Sum GenWeights for MC

Calculate Fake Rates

WZ Removal

Estimate Reducible Background

Jake's Framework

Vukasin's Framework

Plot the 2P2F/3P1F distributions

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages