In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import correctionlib
import os

## Read CSV files

First, we open the .csv files saved during the event selection process and store their contents in a dictionary. For each sample, the same dictionary also holds the number of events (`N_gen`) that we found in yesterday's background modeling lesson. The dictionary keys are the types of data samples used in this search.

In [None]:
datasets = ['collision', 'signal_M2000', 'ttsemilep', 'tthadronic', 'ttleptonic', 'Wjets']

datadict = {}
for dataset in datasets:
    df = pd.read_csv(f'SUMMED_{dataset}.csv')

    # Take N_gen from the first row of the column
    N_gen = df['N_gen'].iloc[0]

    datadict[dataset] = {'df':df, 'N_gen': N_gen}

In [None]:
## This is ok because we used a "leading order" wjets sample!
datadict['Wjets']['N_gen'] = 80958227  
datadict['Wjets']['df']['N_gen'] = 80958227

## Open correctionlib files

Now we will load the two correctionlib JSON files that we used in the earlier examples, and access the specific corrections that we need.

In [None]:
import gzip

# Pileup weights
with gzip.open("POG/LUM/2016postVFP_UL/puWeights.json.gz",'rt') as file:
    data = file.read().strip()
    evaluator = correctionlib._core.CorrectionSet.from_string(data)

# Muon scale factors
with gzip.open("POG/MUO/2016postVFP_UL/muon_Z.json.gz",'rt') as file:
    data = file.read().strip()
    evaluatorMU = correctionlib._core.CorrectionSet.from_string(data)

# Pick the specific corrections we need out of each set
pucorr = evaluator["Collisions16_UltraLegacy_goldenJSON"]
mucorr = evaluatorMU["NUM_TightID_DEN_TrackerMuons"]

## Store data for histograms

Now we use the data read from the .csv files to evaluate the two corrections and their uncertainties. We keep only the variables needed to build the final Z' mass histograms and collect everything in one final dictionary.

In [None]:
histoData = {}
for sample in datadict.keys():
    df = datadict[sample]['df']
    histoData[sample] = {
        "N_gen": datadict[sample]['N_gen'],
        # Keep only the sign (+1 or -1) of the generator weight
        "genWeight": df['weight'].values/np.abs(df['weight'].values),
        "mtt": df['mtt'].values,
        # Pileup weight: nominal value and up/down variations
        "pu_weight": [pucorr.evaluate(n,"nominal") for n in df['pileup']],
        "pu_weight_up": [pucorr.evaluate(n,"up") for n in df['pileup']],
        "pu_weight_dn": [pucorr.evaluate(n,"down") for n in df['pileup']],
        # Muon ID weight: nominal value and up/down variations
        "muId_weight": [mucorr.evaluate(eta,pt,"nominal") for pt,eta in zip(df['mu_pt'],df['mu_abseta'])],
        "muId_weight_up": [mucorr.evaluate(eta,pt,"systup") for pt,eta in zip(df['mu_pt'],df['mu_abseta'])],
        "muId_weight_dn": [mucorr.evaluate(eta,pt,"systdown") for pt,eta in zip(df['mu_pt'],df['mu_abseta'])]
    }
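
To make the stored quantities more concrete, here is a minimal sketch with made-up numbers. It is plain Python and independent of the notebook's variables. It shows that `weight/abs(weight)` keeps only the sign of the generator weight, and it assumes that the per-event weights will later be combined by multiplying them together. That combination is an assumption about the downstream histogramming step, not something this notebook does, and the names `sign` and `event_weights` are hypothetical helpers.

```python
# Toy per-event values (made up for illustration, not taken from any sample).
gen_weights = [0.37, -0.37, 0.37, 0.37]    # raw generator weights
pu_nominal = [1.02, 0.95, 1.10, 0.99]      # pileup weight, nominal
pu_up = [1.05, 0.97, 1.15, 1.00]           # pileup weight, "up" variation
mu_id_nominal = [0.98, 0.99, 0.97, 1.00]   # muon ID weight, nominal


def sign(w):
    """Return +1.0 or -1.0 for a nonzero weight, the same as w / abs(w)."""
    return w / abs(w)


def event_weights(gen, pu, mu_id):
    """Multiply the generator-weight sign by each correction, event by event."""
    return [sign(g) * p * m for g, p, m in zip(gen, pu, mu_id)]


nominal = event_weights(gen_weights, pu_nominal, mu_id_nominal)
pileup_up = event_weights(gen_weights, pu_up, mu_id_nominal)

# Summed weights: the negative-weight event subtracts from the total.
print(sum(nominal), sum(pileup_up))
```

Varying one correction at a time while holding the others at their nominal values, as in `pileup_up`, is one common way to see how much a single uncertainty source moves the result.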

## Save histogram data

We need to open this histogram data in ROOT, so we will move over to our other Docker container. Let's write `histoData` to a pickle file that we can copy to our other folder. When you're done with this notebook, go back to the lesson page for the next script!

In [None]:
import pickle
with open('hists_for_ROOT.p','wb') as f:
    pickle.dump(histoData,f)
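
Before copying the file, it can be reassuring to check that a pickle round trip returns what was written. The sketch below uses a toy dictionary and a temporary file, so the real `hists_for_ROOT.p` is left untouched. The name `toy_histo_data` and its values are hypothetical; only the nesting mirrors `histoData`.

```python
import os
import pickle
import tempfile

# Toy stand-in for histoData (hypothetical values, same nesting as the real one).
toy_histo_data = {
    "signal_M2000": {
        "N_gen": 1000,
        "mtt": [1950.0, 2010.5],
        "pu_weight": [1.01, 0.98],
    },
}

# Work in a temporary directory that is removed when the block exits.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "toy_hists.p")

    with open(path, "wb") as f:
        pickle.dump(toy_histo_data, f)

    # Read back in binary mode, mirroring the 'wb' used for writing.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == toy_histo_data)
print(sorted(restored["signal_M2000"].keys()))
```

The real `histoData` holds NumPy arrays (for example `mtt`), so the environment that loads the file also needs NumPy installed in order to unpickle it.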