`TTbarResCoffeaOutputs`: a notebook that produces Coffea output files for an all-hadronic $t\bar{t}$ analysis, for anyone who chooses not to run the .py script with condor.  The outputs are written to the corresponding **CoffeaOutputs** directory.

In [None]:
import time
import copy
import scipy.stats as ss
from coffea import hist, processor, nanoevents, util
from coffea.nanoevents.methods import candidate
from coffea.nanoevents import NanoAODSchema, BaseSchema

import awkward as ak
import numpy as np
import glob
import itertools
import pandas as pd
from numpy.random import RandomState

from dask.distributed import Client

In [None]:
ak.behavior.update(candidate.behavior)

In [None]:
#from columnservice.client import ColumnClient
#cc = ColumnClient("coffea-dask.fnal.gov")
#client = cc.get_dask()

#from distributed import Client
#client = Client('coffea-dask.fnal.gov:8786')

In [None]:
#from columnservice.client import FileManager
#FileManager.open_file(TTbarResProcessor.py)

This notebook mimics the script, apart from subtle changes in the imports.

Notice that we import the processor from `TTbarResProcessor_nb.py`, where the `_nb` suffix indicates a slightly modified version of the `TTbarResProcessor.py` script.  This is because this coffea notebook is run from within the **TTbarAllHadUproot** directory, rather than from outside of it, which is what the script version (`TTbarResCoffeaOutputs.py`) requires in order to communicate with **lpc_dask**.

The same logic applies wherever you come across `<filename>_nb` or `<filename>_nb.py`.

In [None]:
from TTbarResProcessor_nb import TTbarResProcessor

In [None]:
from Filesets_nb import filesets

If 'unweighted' coffea output files are already available and you wish to use them (for a demonstration, to get quickly to the step of importing lookup tables, to test or weight these files, and so on), you can simply load them from the directory **CoffeaOutputs/UnweightedOutputs/** (or from wherever you keep your coffea outputs).

Set the following switch to `True` to load the premade files, or to `False` to use `processor.run_uproot_job` to generate new files.

If making your own files, note that the `util.save` line in the processing cell is commented out; uncomment it and choose the name the files are saved under in
`util.save(output, 'Whatever_Name_You_Want.coffea')`
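
As a minimal sketch of how the file names could be kept consistent between saving and loading, the helper below builds one path per dataset. The function name `unweighted_output_path`, its `tag` parameter, and the dataset name `"QCD"` are assumptions made for illustration; they do not come from the analysis code.

```python
from pathlib import Path


def unweighted_output_path(name, tag, base_dir="CoffeaOutputs/UnweightedOutputs"):
    """Build the path of one unweighted output file for dataset `name`.

    `tag` is a free-form label (hypothetical parameter), e.g. executor and date.
    """
    filename = f"TTbarResCoffea_{name}_unweighted_output_{tag}.coffea"
    return Path(base_dir) / filename


# Hypothetical dataset name and tag, for illustration only
path = unweighted_output_path("QCD", "futures_3-24-21_btagSF_trial")
print(path.as_posix())
```

The same path could then be passed to both `util.save(output, str(path))` and `util.load(str(path))`, so that the save and load branches of the loop cannot drift apart.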

In [None]:
LoadingUnweightedFiles = False
# -- include another switch for using dask here later... -- #

In [None]:
tstart = time.time()

outputs_unweighted = {}

seed = 1234577890
prng = RandomState(seed)
Chunk = [100000, 10] # [chunksize, maxchunks]

for name,files in filesets.items(): 
    if not LoadingUnweightedFiles:        
        print('Processing', name)
        output = processor.run_uproot_job({name:files},
                                          treename='Events',
                                          processor_instance=TTbarResProcessor(UseLookUpTables=False,
                                                                               ModMass=False, 
                                                                               RandomDebugMode=False,
                                                                               prng=prng),
                                          #executor=processor.dask_executor,
                                          #executor=processor.iterative_executor,
                                          executor=processor.futures_executor,
                                          executor_args={
                                              #'client': client,
                                              'skipbadfiles':False,
                                              'schema': BaseSchema, #NanoAODSchema,
                                              'workers': 2},
                                          chunksize=Chunk[0], maxchunks=Chunk[1]
        )

        elapsed = time.time() - tstart
        outputs_unweighted[name] = output
        print(output)
        #util.save(output, 'CoffeaOutputs/UnweightedOutputs/TTbarResCoffea_' + name + '_unweighted_output_futures_3-24-21_btagSF_trial.coffea')

    else:
        output = util.load('CoffeaOutputs/UnweightedOutputs/TTbarResCoffea_' + name + '_unweighted_output_futures_3-10-21_trial.coffea')

        outputs_unweighted[name] = output
        print(name + ' unweighted output loaded')
        elapsed = time.time() - tstart

In [None]:
print('Elapsed time = ', elapsed, ' sec.')
print('Elapsed time = ', elapsed/60., ' min.')
print('Elapsed time = ', elapsed/3600., ' hrs.') 

In [None]:
for name,output in outputs_unweighted.items(): 
    print("-------Unweighted " + name + "--------")
    for i,j in output['cutflow'].items():        
        print( '%20s : %12d' % (i,j) )

First, run the `TTbarResLookUpTables` module by simply importing it.  If it works, it prints various pandas dataframes with information about the mistag rates and finally prints the `luts` multi-dictionary.

In [None]:
import TTbarResLookUpTables_nb

Next, import that multi-dictionary `luts`, as it is needed for the processor to create output files.  These new output files will have the necessary datasets weighted by their corresponding mistag rate.
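
To illustrate the idea of weighting by a looked-up mistag rate, here is a toy sketch of a nested dictionary and a bin lookup. The real layout of `luts` is defined in the lookup-table module and may differ; the keys, bin edges, rates, and the function `mistag_weight` below are all made up for illustration.

```python
from bisect import bisect_right

# Hypothetical layout: toy_luts[dataset][category] -> bin edges and one rate per bin.
# Every name and number here is invented; this is not the real `luts`.
toy_luts = {
    "QCD": {
        "cat0": {
            "edges": [400.0, 600.0, 1000.0, 2000.0],  # jet momentum bin edges (GeV)
            "rates": [0.02, 0.03, 0.05],               # mistag rate in each bin
        }
    }
}


def mistag_weight(luts, dataset, category, momentum):
    """Return the mistag rate of the bin containing `momentum`, or 0.0 if out of range."""
    table = luts[dataset][category]
    edges, rates = table["edges"], table["rates"]
    index = bisect_right(edges, momentum) - 1  # index of the bin whose lower edge is <= momentum
    if index < 0 or index >= len(rates):
        return 0.0
    return rates[index]


weight = mistag_weight(toy_luts, "QCD", "cat0", 750.0)
```

Each bin is treated as closed on its lower edge and open on its upper edge, so a value exactly at the last edge falls outside the table and gets a weight of zero.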

In [None]:
# True: only build the lookup tables and skip the weighted processing below.
OnlyCreateLookupTables = True

In [None]:
from TTbarResLookUpTables_nb import luts

Ensure that the necessary files have been included in the `TTbarResLookUpTables_nb` process before running the next processor, as the mistag procedure is found within that module.  For details about the categories used to write the mistag procedure, refer to the `TTbarResProcessor` module.

In [None]:
from Filesets_nb import filesets_forweights

In [None]:
tstart = time.time()

seed = 1234577890
outputs_weighted = {}
prng = RandomState(seed)
Chunk = [100000, 100] # [chunksize, maxchunks]

for name,files in filesets_forweights.items(): 
    
    if not OnlyCreateLookupTables:
        print(name)
        output = processor.run_uproot_job({name:files},
                                          treename='Events',
                                          processor_instance=TTbarResProcessor(UseLookUpTables=True,
                                                                               ModMass = True,
                                                                               RandomDebugMode = False,
                                                                               CalcEff_MC=False,
                                                                               lu=luts,
                                                                               prng=prng),
                                          #executor=processor.dask_executor,
                                          #executor=processor.iterative_executor,
                                          executor=processor.futures_executor,
                                          executor_args={
                                              #'client': client,  # only for dask_executor; `client` is defined only in the commented-out dask cells
                                              'skipbadfiles':False,
                                              'schema': BaseSchema, #NanoAODSchema,
                                              'workers': 2},
                                          chunksize=Chunk[0], maxchunks=Chunk[1]
        )

        elapsed = time.time() - tstart
        outputs_weighted[name] = output
        print(output)
        util.save(output, 'CoffeaOutputs/WeightedModMassOutputs/TTbarResCoffea_' + name + '_ModMass_weighted_output_futures_3-24-21_btagSF_trial.coffea')

    else:
        continue


In [None]:
print('Elapsed time = ', elapsed, ' sec.')
print('Elapsed time = ', elapsed/60., ' min.')
print('Elapsed time = ', elapsed/3600., ' hrs.') 

In [None]:
if not OnlyCreateLookupTables:
    for name,output in outputs_weighted.items(): 
        print("-------Weighted " + name + "--------")
        for i,j in output['cutflow'].items():        
            print( '%20s : %12d' % (i,j) )
else:
    print('We\'re done here')