# Running the TTGammaProcessor

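This cell loads the autoreload extension and imports the coffea modules used below. The test files are read directly from eos through the root:// paths in the fileset, so nothing needs to be copied to your local area.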

In [None]:
%load_ext autoreload
from coffea import util, processor
from coffea.nanoevents import NanoEventsFactory, NanoAODSchema

The fileset variable lists the samples to be run on: it is a dictionary mapping each sample name to its list of input files.

In [None]:
fileset = {
    "TTGamma_SingleLept": [
        "root://cmseos.fnal.gov//store/user/cmsdas/2020/long_exercises/TTGamma/TestFiles/TTGamma_1l.root"
    ],
    "TTbarPowheg_Semilept": [
        "root://cmseos.fnal.gov//store/user/cmsdas/2020/long_exercises/TTGamma/TestFiles/TTbar_1l.root"
    ],
    "W4jets": [
        "root://cmseos.fnal.gov//store/user/cmsdas/2020/long_exercises/TTGamma/TestFiles/W4Jets.root"
    ],
    "WGamma_01J_5f": [
        "root://cmseos.fnal.gov//store/user/cmsdas/2020/long_exercises/TTGamma/TestFiles/WGamma.root"
    ],
    "ZGamma_01J_5f_lowMass": [
        "root://cmseos.fnal.gov//store/user/cmsdas/2020/long_exercises/TTGamma/TestFiles/ZGamma.root"
    ],
}

Run the TTGammaProcessor on the list of files included in fileset.

You can specify the chunksize and the maximum number of chunks to process from each sample. Choosing a small chunksize and a single chunk makes coffea process only a subset of the events, which is useful for quick debugging.

In [None]:
#autoreload forces the kernel to reload the processor to include any new changes
%autoreload 2
from ttgamma import TTGammaProcessor
import awkward as ak

import time
tstart = time.time()

#Run Coffea code using uproot
output = processor.run_uproot_job(
    fileset,
    "Events",
    TTGammaProcessor(isMC=True),
    processor.iterative_executor,
    executor_args={'schema': NanoAODSchema,'workers': 4},
    chunksize=10000,
    maxchunks=1,
)

elapsed = time.time() - tstart
print("Total time: %.1f seconds"%elapsed)
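
As a rough illustration of how chunksize and maxchunks bound the work per sample, the sketch below splits a file's entries into chunks and keeps only the first few. The helper `chunk_ranges` and the 25,000-entry file are hypothetical, and this is not coffea's implementation: coffea may balance chunk sizes differently, so the real number of events in one chunk is only approximately equal to chunksize.

```python
def chunk_ranges(n_entries, chunksize, maxchunks=None):
    """Return (start, stop) entry ranges. Hypothetical helper, not coffea's API."""
    ranges = [
        (start, min(start + chunksize, n_entries))
        for start in range(0, n_entries, chunksize)
    ]
    # maxchunks=None keeps every chunk; otherwise keep only the first ones
    return ranges if maxchunks is None else ranges[:maxchunks]


# A made-up file with 25,000 entries
all_chunks = chunk_ranges(25_000, chunksize=10_000)
debug_chunks = chunk_ranges(25_000, chunksize=10_000, maxchunks=1)
n_debug_events = sum(stop - start for start, stop in debug_chunks)

print(all_chunks)
print(debug_chunks, n_debug_events)
```

With maxchunks=1 only the first range survives, which is why the debugging settings above finish quickly.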

In [None]:
output

In [None]:
import hist
import matplotlib.pyplot as plt

# Group MC histograms
histList = list(output.values())
outputHist = processor.accumulate(histList)

h = outputHist['photon_chIso']
h = h[{'lepFlavor':sum}]
h = h[{'category':sum}]
h = h[{'systematic':'nominal'}]

h


In [None]:
h.plot1d(overlay='dataset')
plt.yscale('log')
plt.ylim(1e-4, 1000)

plt.legend()

In [None]:
expected = util.load("output_Expected.coffea")

hname = "photon_chIso"
for samp, histo in expected.items():
    mine = output[samp][hname].values()
    other = expected[samp][hname].values()
    # Report any sample whose bin contents differ from the reference output
    if not ak.all(mine == other):
        print(samp, ak.all(mine == other))
print('done!')
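
When a bin-by-bin comparison fails, it helps to see which bins differ and by how much. The sketch below uses two made-up NumPy arrays as stand-ins for the `.values()` arrays above; the numbers are illustrative and are not real histogram contents.

```python
import numpy as np

# Illustrative stand-ins for output[samp][hname].values() and the reference
mine = np.array([10.0, 4.0, 1.5, 0.0])
other = np.array([10.0, 4.0, 2.5, 0.0])

exact_match = np.array_equal(mine, other)          # strict equality in every bin
close_match = np.allclose(mine, other, rtol=1e-6)  # equality up to rounding
bad_bins = np.flatnonzero(mine != other)           # indices of differing bins

print(exact_match, close_match)
for i in bad_bins:
    print(f"bin {i}: mine={mine[i]}, expected={other[i]}")
```

An exact comparison is appropriate when the same code runs on the same inputs. A tolerance-based check such as `np.allclose` is an alternative if small floating-point differences are expected.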

# Accessing Arrays Interactively

Below is an example of loading a NanoAOD file interactively. This can be very useful for developing the code and debugging any issues. Use this area to build your intuition for working with Coffea and awkward arrays!

In [None]:
import awkward as ak
from coffea.nanoevents import NanoEventsFactory, NanoAODSchema

fname = fileset["TTGamma_SingleLept"][0]
events = NanoEventsFactory.from_root(fname, schemaclass=NanoAODSchema).events()

Once you have opened the file, you can explore its contents using the 'fields' syntax, for example events.fields or events.Muon.fields.

In [None]:
events["Photon", "charge"] = 0
leadingMuon = ak.firsts(events.Muon)
leadingPhoton = ak.firsts(events.Photon)
leadingElectron = ak.firsts(events.Electron)

In [None]:
(leadingMuon + leadingPhoton).mass
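
For intuition, the mass of a sum of two candidates is the invariant mass of the summed four-vectors. The sketch below computes it by hand from (pt, eta, phi, mass) in plain Python. The function names and the kinematic values are made up for illustration and are not part of coffea.

```python
import math


def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return energy, px, py, pz


def invariant_mass(p1, p2):
    """Invariant mass of the sum of two (pt, eta, phi, mass) candidates."""
    e, px, py, pz = (
        a + b for a, b in zip(to_cartesian(*p1), to_cartesian(*p2))
    )
    m2 = e**2 - px**2 - py**2 - pz**2
    # Guard against tiny negative values caused by floating-point rounding
    return math.sqrt(max(m2, 0.0))


# Made-up back-to-back muon and photon, each with pt = 40 GeV
muon = (40.0, 0.0, 0.0, 0.1057)
photon = (40.0, 0.0, math.pi, 0.0)
m_mugamma = invariant_mass(muon, photon)
print(m_mugamma)
```

For this back-to-back configuration the momenta cancel, so the result is close to the sum of the two energies, about 80 GeV.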

In [None]:
mugammapairs = ak.cartesian({"mu":events.Muon, "gamma":events.Photon})
(mugammapairs.mu + mugammapairs.gamma).mass
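
`ak.cartesian` builds every muon-photon combination within each event, never across events. A plain-Python analogue looks like the sketch below, which uses made-up per-event lists of labels in place of awkward arrays.

```python
from itertools import product

# Three made-up events: per-event lists of muons and photons
muons = [["mu0", "mu1"], [], ["mu0"]]
photons = [["g0"], ["g0"], ["g0", "g1"]]

# One list of (muon, photon) pairs per event
pairs = [list(product(mu, gamma)) for mu, gamma in zip(muons, photons)]
print(pairs)
```

An event with no muons (or no photons) yields an empty list of pairs, so the per-event structure is preserved.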

In [None]:
leadingMuon

In [None]:
print(leadingMuon)

In [None]:
type(leadingMuon)

In [None]:
from ttgamma.scalefactors import mu_trig_err

mu_trig_err

In [None]:
mu_trig_err(1.2, 36)

In [None]:
events.Photon.matched_gen.fields

In [None]:
events.GenPart.fields

There is also a docstring for each of these variables in NanoAOD, which you can access using '?':

In [None]:
events.Jet.rawFactor?