# Study Systematics Flip

The purpose of this notebook is to check whether systematic variations can be implemented as part of a vector of events without increasing memory usage.

```Muon_pt_1```, ```Muon_pt_2```, ```Electron_pt_1```, ```Electron_pt_2``` are the variables that also have systematics counterparts.

```PV_x```, ```PV_y```, ```PV_z``` do not have systematics counterparts.

In [3]:
import uproot
import pickle
import pandas as pd
import numpy as np

In [4]:
# Open root file and ttree

file = uproot.open('test_sys_signal.root')
tree = file['Events']
tree.show()

name                 | typename                 | interpretation                
---------------------+--------------------------+-------------------------------
Muon_pt_1            | float                    | AsDtype('>f4')
Muon_pt_2            | float                    | AsDtype('>f4')
Electron_pt_1        | float                    | AsDtype('>f4')
Electron_pt_2        | float                    | AsDtype('>f4')
Muon_pt_1_Up         | float                    | AsDtype('>f4')
Muon_pt_2_Up         | float                    | AsDtype('>f4')
Electron_pt_1_Up     | float                    | AsDtype('>f4')
Electron_pt_2_Up     | float                    | AsDtype('>f4')
Muon_pt_1_Down       | float                    | AsDtype('>f4')
Muon_pt_2_Down       | float                    | AsDtype('>f4')
Electron_pt_1_Down   | float                    | AsDtype('>f4')
Electron_pt_2_Down   | float                    | AsDtype('>f4')
PV_x                 | float                    | AsDtype(

In [5]:
nominals_with_sys = ['Muon_pt_1', 'Muon_pt_2', 'Electron_pt_1', 'Electron_pt_2']
nominals_without_sys = ['PV_x', 'PV_y', 'PV_z']
systematics = ['', '_Up', '_Down']

variables = nominals_with_sys + nominals_without_sys

## Numpy

In [None]:
branches_pd = tree.arrays(library='pd')

In [None]:
branches_pd

In [None]:
np.array(branches_pd['Muon_pt_1'])

In [None]:
# Make structured array for Muon_pt_1
muon_pt_1 = np.array([(branches_pd.loc[i, 'Muon_pt_1'], branches_pd.loc[i, 'Muon_pt_1_Up'], branches_pd.loc[i, 'Muon_pt_1_Down']) for i in range(len(branches_pd))], 
                     dtype=[('nominal', 'f4'), ('up', 'f4'), ('down', 'f4')])

In [None]:
muon_pt_1.shape
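
The row-by-row loop above can probably be avoided. As a minimal sketch, assuming the three columns are available as a plain 2D float array (the values below are stand-ins, not data from the tree), `numpy.lib.recfunctions.unstructured_to_structured` reinterprets each row as one structured element:

```python
import numpy as np
import numpy.lib.recfunctions as rfn

# Stand-in columns; in the notebook they would come from branches_pd.
nominal = np.array([17.4, 50.1, 24.7], dtype=np.float32)
columns = np.column_stack([nominal, nominal + 0.5, nominal - 0.5])  # shape (3, 3)

sys_dtype = np.dtype([('nominal', 'f4'), ('up', 'f4'), ('down', 'f4')])

# Each row of `columns` becomes one (nominal, up, down) element.
muon_pt_1 = rfn.unstructured_to_structured(columns, dtype=sys_dtype)

print(muon_pt_1.shape)   # one entry per event
print(muon_pt_1['up'])   # field access still works
```

With the pandas dataframe used here, the 2D input could presumably be taken from `branches_pd[['Muon_pt_1', 'Muon_pt_1_Up', 'Muon_pt_1_Down']].to_numpy()`.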

In [None]:
'''
Attempt: make structured arrays for the variables with systematics and flat arrays for the others.
Unfortunately, this fails when concatenating the systematic and non-systematic arrays.
The only workaround would be https://stackoverflow.com/questions/39754658/concatenate-arrays-with-mixed-types,
but then we lose the ability to access fields with arr['nominal'] etc.
'''

df_sys = np.vstack(tuple(np.array([tuple(branches_pd.loc[i, name] for name in list(map(lambda suf: var + suf, systematics))) 
                               for i in range(len(branches_pd))], 
                              dtype=[('nominal', 'f4'), ('up', 'f4'), ('down', 'f4')]) for var in nominals_with_sys)).T

'''
df_nosys = np.vstack(tuple(np.array([((branches_pd.loc[i, var]))
                                     for i in range(len(branches_pd))],
                                    dtype=[('nominal', 'f4')]) for var in nominals_without_sys)).T
'''

df_nosys = np.vstack(tuple(np.array(branches_pd[var]) for var in nominals_without_sys)).T

In [None]:
'''Attempt: df_sys is supposed to be an n_evs x len(nominals_with_sys) matrix where every element is an n_sys-dimensional array.
This does not work because, when the n_evs x n_sys arrays for the different nominals_with_sys variables are concatenated,
numpy flattens them.
'''
df_sys = np.vstack(tuple(np.array([np.array([branches_pd.loc[i, name] for name in list(map(lambda suf: var + suf, systematics))])
                                   for i in range(len(branches_pd))]) for var in nominals_with_sys))

In [None]:
np.array([branches_pd.loc[i, name] for name in list(map(lambda suf: 'Muon_pt_1' + suf, systematics))])

In [None]:
a = np.array([np.array([branches_pd.loc[i, name] for name in list(map(lambda suf: 'Muon_pt_1' + suf, systematics))]) for i in range(len(branches_pd))])
b = np.array([np.array([branches_pd.loc[i, name] for name in list(map(lambda suf: 'Electron_pt_1' + suf, systematics))]) for i in range(len(branches_pd))])

In [None]:
'''Attempt: df_sys is a 3D array. It can't be concatenated with a 2D array for df_nosys unless we fill the gaps with nominals.
This would probably increase the memory usage.
'''
df_sys = np.array([np.array([np.array([branches_pd.loc[i, name] for name in list(map(lambda suf: var + suf, systematics))])
                                   for i in range(len(branches_pd))]) for var in nominals_with_sys])

In [None]:
df_sys.shape
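
The memory cost of padding can be estimated with simple arithmetic. The sketch below uses zero-filled arrays and a made-up event count (`n_events` is an assumption, not the size of the tree) to compare a flat layout, with one `float32` column per existing branch, against a 3D layout in which every variable gets a slot for every systematic:

```python
import numpy as np

n_events = 1000  # hypothetical event count, for illustration only
nominals_with_sys = ['Muon_pt_1', 'Muon_pt_2', 'Electron_pt_1', 'Electron_pt_2']
nominals_without_sys = ['PV_x', 'PV_y', 'PV_z']
systematics = ['', '_Up', '_Down']

# Flat layout: one float32 column per branch that exists in the tree.
n_branches = len(nominals_with_sys) * len(systematics) + len(nominals_without_sys)
flat = np.zeros((n_events, n_branches), dtype=np.float32)

# Padded layout: PV_x, PV_y and PV_z each carry two redundant slots.
n_variables = len(nominals_with_sys) + len(nominals_without_sys)
padded = np.zeros((n_events, n_variables, len(systematics)), dtype=np.float32)

bytes_flat = flat.nbytes // n_events
bytes_padded = padded.nbytes // n_events
print(bytes_flat, bytes_padded)  # bytes per event in each layout
```

For this set of variables the padded layout needs 84 bytes per event instead of 60, i.e. 40% more.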

### Attempt: double structured array

The dataframe ends up being a nested structured array. In the outermost one, the fields are the variables (```Muon_pt_1```, ```Electron_pt_1```, ```PV_x```, etc.). For the variables that have systematics, the field type is another structured array (```sys_type``` below); for the others, it is a one-field structured array holding a single ```float``` (```nom_type``` below).

Of course, the type would have to be generalized if the types that we retrieve from the tree are not simple floats.

```sys_type``` fields are the systematics themselves, plus ```Nominal```.

In [33]:
branches = tree.arrays(library='ak')

In [34]:
branches

<Array [{Muon_pt_1: 17.4, ... PV_z: 7.01}] type='22838 * {"Muon_pt_1": float32, ...'>

In [35]:
len(branches)

22838

#### Generalize

In [36]:
sys_type = np.dtype([('Nominal', 'float32') if sys == '' else (sys, 'float32') for sys in systematics])

In [37]:
sys_type

dtype([('Nominal', '<f4'), ('_Up', '<f4'), ('_Down', '<f4')])

In [38]:
nom_type = np.dtype([('Nominal', 'float32')])

In [39]:
nom_type

dtype([('Nominal', '<f4')])

In [40]:
table_type = np.dtype([(var, sys_type) for var in nominals_with_sys] + [(var, nom_type) for var in nominals_without_sys])

In [41]:
table_type

dtype([('Muon_pt_1', [('Nominal', '<f4'), ('_Up', '<f4'), ('_Down', '<f4')]), ('Muon_pt_2', [('Nominal', '<f4'), ('_Up', '<f4'), ('_Down', '<f4')]), ('Electron_pt_1', [('Nominal', '<f4'), ('_Up', '<f4'), ('_Down', '<f4')]), ('Electron_pt_2', [('Nominal', '<f4'), ('_Up', '<f4'), ('_Down', '<f4')]), ('PV_x', [('Nominal', '<f4')]), ('PV_y', [('Nominal', '<f4')]), ('PV_z', [('Nominal', '<f4')])])
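
Since the goal is to avoid extra memory, it is worth checking how many bytes one element of this nested dtype occupies. The sketch below rebuilds the same dtype and compares its `itemsize` with the cost of storing the 15 branches as separate `float32` columns; it assumes NumPy's default packed (unaligned) layout:

```python
import numpy as np

nominals_with_sys = ['Muon_pt_1', 'Muon_pt_2', 'Electron_pt_1', 'Electron_pt_2']
nominals_without_sys = ['PV_x', 'PV_y', 'PV_z']
systematics = ['', '_Up', '_Down']

sys_type = np.dtype([('Nominal' if s == '' else s, 'float32') for s in systematics])
nom_type = np.dtype([('Nominal', 'float32')])
table_type = np.dtype([(v, sys_type) for v in nominals_with_sys]
                      + [(v, nom_type) for v in nominals_without_sys])

# Cost of the same information stored as independent float32 columns.
n_branches = len(nominals_with_sys) * len(systematics) + len(nominals_without_sys)
bytes_per_event_flat = n_branches * np.dtype('float32').itemsize

print(table_type.itemsize, bytes_per_event_flat)
```

Both numbers are 60 bytes per event, so the nesting itself adds no storage overhead.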

In [42]:
%%time

final = np.array([tuple([np.array(tuple([branches[name][i] for name in list(map(lambda suf: var + suf, systematics))]), sys_type) for var in nominals_with_sys]
                        + [np.array(tuple([branches[var][i]]), nom_type) for var in nominals_without_sys])
                        #+ [branches[var][i] for var in nominals_without_sys]) 
                  for i in range(len(branches))], dtype=table_type)

CPU times: user 8.69 s, sys: 68.7 ms, total: 8.76 s
Wall time: 8.76 s
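
Most of the time above is likely spent in the per-event Python loop. A possible alternative, sketched here with random stand-in columns (the `columns` dict and `n_events` are assumptions replacing the arrays read from the tree), is to allocate the table once and fill it field by field:

```python
import numpy as np

nominals_with_sys = ['Muon_pt_1', 'Muon_pt_2', 'Electron_pt_1', 'Electron_pt_2']
nominals_without_sys = ['PV_x', 'PV_y', 'PV_z']
systematics = ['', '_Up', '_Down']

sys_type = np.dtype([('Nominal' if s == '' else s, 'float32') for s in systematics])
nom_type = np.dtype([('Nominal', 'float32')])
table_type = np.dtype([(v, sys_type) for v in nominals_with_sys]
                      + [(v, nom_type) for v in nominals_without_sys])

# Stand-in for the tree: one flat float32 column per branch name.
rng = np.random.default_rng(0)
n_events = 5
names = [v + s for v in nominals_with_sys for s in systematics] + nominals_without_sys
columns = {name: rng.random(n_events).astype(np.float32) for name in names}

# Allocate once, then copy whole columns into the nested fields.
final = np.empty(n_events, dtype=table_type)
for var in nominals_with_sys:
    for suffix in systematics:
        field = 'Nominal' if suffix == '' else suffix
        final[var][field] = columns[var + suffix]  # final[var] is a view, so this writes through
for var in nominals_without_sys:
    final[var]['Nominal'] = columns[var]

print(final['Muon_pt_1']['_Up'])
```

With the Awkward array used in this notebook, each column would presumably be obtained with something like `ak.to_numpy(branches[name])`.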


In [43]:
final

array([((17.382437, 17.882437, 16.882437), (13.2383995, 13.7383995, 12.7383995), (44.744175 , 45.244175 , 44.244175 ), (58.94207  , 59.44207  , 58.44207  ), (0.24211372,), (0.39511582,), (  6.090504 ,)),
       ((50.051613, 50.551613, 49.551613), (41.024353 , 41.524353 , 40.524353 ), (15.9591465, 16.459146 , 15.4591465), ( 5.622648 ,  6.122648 ,  5.122648 ), (0.24053279,), (0.39375338,), ( -6.5441937,)),
       ((24.70227 , 25.20227 , 24.20227 ), (41.462658 , 41.962658 , 40.962658 ), ( 5.042018 ,  5.542018 ,  4.542018 ), ( 7.9949474,  8.494947 ,  7.4949474), (0.24565527,), (0.39248142,), (-12.015012 ,)),
       ...,
       ((33.58854 , 34.08854 , 33.08854 ), (35.16421  , 35.66421  , 34.66421  ), (14.531511 , 15.031511 , 14.031511 ), (12.232145 , 12.732145 , 11.732145 ), (0.24721472,), (0.3938958 ,), ( -1.3571125,)),
       ((59.00309 , 59.50309 , 58.50309 ), (21.31192  , 21.81192  , 20.81192  ), ( 5.1863894,  5.6863894,  4.6863894), (14.721363 , 15.221363 , 14.221363 ), (0.24585512,), 

#### Try some operations

In [44]:
final.shape

(22838,)

In [45]:
# Slicing

for i, evt in enumerate(final[0:3]):
    print('Event {}: {}'.format(i, evt))

Event 0: ((17.382437, 17.882437, 16.882437), (13.2383995, 13.7383995, 12.7383995), (44.744175, 45.244175, 44.244175), (58.94207, 59.44207, 58.44207), (0.24211372,), (0.39511582,), (6.090504,))
Event 1: ((50.051613, 50.551613, 49.551613), (41.024353, 41.524353, 40.524353), (15.9591465, 16.459146, 15.4591465), (5.622648, 6.122648, 5.122648), (0.24053279,), (0.39375338,), (-6.5441937,))
Event 2: ((24.70227, 25.20227, 24.20227), (41.462658, 41.962658, 40.962658), (5.042018, 5.542018, 4.542018), (7.9949474, 8.494947, 7.4949474), (0.24565527,), (0.39248142,), (-12.015012,))


In [46]:
final['Muon_pt_1']

array([(17.382437, 17.882437, 16.882437),
       (50.051613, 50.551613, 49.551613),
       (24.70227 , 25.20227 , 24.20227 ), ...,
       (33.58854 , 34.08854 , 33.08854 ),
       (59.00309 , 59.50309 , 58.50309 ),
       (42.957253, 43.457253, 42.457253)],
      dtype=[('Nominal', '<f4'), ('_Up', '<f4'), ('_Down', '<f4')])

In [47]:
final['Muon_pt_1']['Nominal']

array([17.382437, 50.051613, 24.70227 , ..., 33.58854 , 59.00309 ,
       42.957253], dtype=float32)

In [48]:
for i, evt in enumerate(final['Muon_pt_1'][0:3]):
    print('Event {}: Nominal {}, Up {}, Down {}'.format(i, evt['Nominal'], evt['_Up'], evt['_Down']))

Event 0: Nominal 17.382436752319336, Up 17.882436752319336, Down 16.882436752319336
Event 1: Nominal 50.051612854003906, Up 50.551612854003906, Down 49.551612854003906
Event 2: Nominal 24.7022705078125, Up 25.2022705078125, Down 24.2022705078125


#### Taggers

In [49]:
import pickle

In [50]:
# unpickle trained model
with open('classifier.pkl', 'rb') as f:
    bdt = pickle.load(f)

First we study how to add a new column to a structured NumPy array.

In [51]:
new_col = np.array([tuple(np.array([(0,0,0)], dtype=sys_type)) for i in range(len(final))], dtype=[('Predicted', sys_type)])

In [52]:
new_col['Predicted']['Nominal']

array([0., 0., 0., ..., 0., 0., 0.], dtype=float32)

In [53]:
import numpy.lib.recfunctions as rfn

In [None]:
#final_added = rfn.merge_arrays((final, new_col), flatten=True)

In [54]:
# Making a whole new one

new_dtype = np.dtype(final.dtype.descr + [('Predicted', sys_type)])

In [55]:
new_df = np.empty(final.shape, dtype=new_dtype)

In [56]:
for var in nominals_with_sys + nominals_without_sys:
    new_df[var] = final[var]
new_df['Predicted'] = new_col['Predicted']

In [57]:
new_df

array([((17.382437, 17.882437, 16.882437), (13.2383995, 13.7383995, 12.7383995), (44.744175 , 45.244175 , 44.244175 ), (58.94207  , 59.44207  , 58.44207  ), (0.24211372,), (0.39511582,), (  6.090504 ,), (0., 0., 0.)),
       ((50.051613, 50.551613, 49.551613), (41.024353 , 41.524353 , 40.524353 ), (15.9591465, 16.459146 , 15.4591465), ( 5.622648 ,  6.122648 ,  5.122648 ), (0.24053279,), (0.39375338,), ( -6.5441937,), (0., 0., 0.)),
       ((24.70227 , 25.20227 , 24.20227 ), (41.462658 , 41.962658 , 40.962658 ), ( 5.042018 ,  5.542018 ,  4.542018 ), ( 7.9949474,  8.494947 ,  7.4949474), (0.24565527,), (0.39248142,), (-12.015012 ,), (0., 0., 0.)),
       ...,
       ((33.58854 , 34.08854 , 33.08854 ), (35.16421  , 35.66421  , 34.66421  ), (14.531511 , 15.031511 , 14.031511 ), (12.232145 , 12.732145 , 11.732145 ), (0.24721472,), (0.3938958 ,), ( -1.3571125,), (0., 0., 0.)),
       ((59.00309 , 59.50309 , 58.50309 ), (21.31192  , 21.81192  , 20.81192  ), ( 5.1863894,  5.6863894,  4.6863894

In [58]:
def perform_inference(df, clf, nominals, systematics, new_column):
    '''Given a classifier that takes n columns as input, apply it once per
    entry in systematics, each time to the n columns obtained by appending
    that suffix to the names in nominals. Predictions are stored in the
    column new_column + suffix.
    '''
    model_features = clf.get_booster().feature_names
    for sys in systematics:
        columns = list(map(lambda pref: pref + sys, nominals))
        df[new_column + sys] = clf.predict(df.rename(
            columns=dict(zip(columns, model_features)), inplace=False)[model_features])
        df = df.rename(columns=dict(zip(model_features, columns)))
    return df

In [None]:
# First we try to produce a new_column with the results of the application of the booster to final
p = pd.DataFrame(final)
df = perform_inference(p, bdt, nominals_with_sys, systematics, new_column='Y')

In [None]:
# Create 2D array from final in order to feed the classifier
inp = np.array([[final[var][i] for var in nominals_with_sys] for i in range(len(final))])

In [None]:
inp[:, 0]

In [88]:
final[['Muon_pt_1', 'Muon_pt_2']][0]

((17.382437, 17.882437, 16.882437), (13.2383995, 13.7383995, 12.7383995))
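
One way to feed the classifier from the nested structured array is to build a plain 2D feature matrix for each systematic and write the predictions into the matching sub-field. The sketch below is only an illustration: `ThresholdClassifier` is a hypothetical stand-in for the pickled BDT (only its `predict` method is mimicked), and the table is reduced to two input variables with made-up values:

```python
import numpy as np

class ThresholdClassifier:
    """Hypothetical stand-in for the trained model; not the real BDT."""
    def predict(self, X):
        # Label an event 1 if its first feature exceeds 25, else 0.
        return (X[:, 0] > 25).astype(np.float32)

sys_type = np.dtype([('Nominal', 'f4'), ('_Up', 'f4'), ('_Down', 'f4')])
features = ['Muon_pt_1', 'Muon_pt_2']
table_type = np.dtype([(v, sys_type) for v in features] + [('Predicted', sys_type)])

events = np.zeros(3, dtype=table_type)
events['Muon_pt_1']['Nominal'] = [17.4, 50.1, 24.7]
events['Muon_pt_2']['Nominal'] = [13.2, 41.0, 41.5]
for var in features:
    events[var]['_Up'] = events[var]['Nominal'] + 0.5
    events[var]['_Down'] = events[var]['Nominal'] - 0.5

clf = ThresholdClassifier()
for sys_name in sys_type.names:
    # Shape (n_events, n_features): one column per input variable.
    X = np.column_stack([events[var][sys_name] for var in features])
    events['Predicted'][sys_name] = clf.predict(X)

print(events['Predicted'])
```

In this toy input the third event is tagged 0 for the nominal and down variations but 1 for the up variation, which is the kind of flip the notebook is meant to study.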

## Awkward

In [28]:
import awkward1 as ak

In [29]:
branches = tree.arrays(library='ak')

In [30]:
branches

<Array [{Muon_pt_1: 17.4, ... PV_z: 7.01}] type='22838 * {"Muon_pt_1": float32, ...'>

In [31]:
ak.type(branches)

22838 * {"Muon_pt_1": float32, "Muon_pt_2": float32, "Electron_pt_1": float32, "Electron_pt_2": float32, "Muon_pt_1_Up": float32, "Muon_pt_2_Up": float32, "Electron_pt_1_Up": float32, "Electron_pt_2_Up": float32, "Muon_pt_1_Down": float32, "Muon_pt_2_Down": float32, "Electron_pt_1_Down": float32, "Electron_pt_2_Down": float32, "PV_x": float32, "PV_y": float32, "PV_z": float32}

In [8]:
len(branches)

22838

In [9]:
# Quick example of the structure we want
a = ak.Array([{'Muon': {'Nominal': 0, 'Up': 1, 'Down': -1}, 'PV': 3}, {'Muon': {'Nominal': 0, 'Up': 10, 'Down': -10}, 'PV': 30}])

In [10]:
ak.type(a)

2 * {"Muon": {"Nominal": int64, "Up": int64, "Down": int64}, "PV": int64}

### Generalize

First way: the dataframe is an array of records in which each variable with systematics is itself a record, while the other variables are plain numbers.

In [32]:
%%time

final = ak.Array([{var: val for (var, val) in [(var, {k: v for (k, v) in [(
    'Nominal' if sys == '' else sys.replace('_', ''), evt[var + sys]) for sys in systematics]}) for var in nominals_with_sys] 
                   + [(var, evt[var]) for var in nominals_without_sys]} for evt in branches])

CPU times: user 1.57 s, sys: 0 ns, total: 1.57 s
Wall time: 1.57 s


In [33]:
ak.type(final)

22838 * {"Muon_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Muon_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "PV_x": float64, "PV_y": float64, "PV_z": float64}

Second way: the dataframe is an array of records in which every variable is itself a record; the variables without systematics only have the "Nominal" field.

In [38]:
%%time

def get_systematics_record(event, var, systematics):
    systematics_record = {}
    for sys in systematics:
        if sys == "":
            placeholder = "Nominal"
        else:
            placeholder = sys.replace("_", "")
        key = "{}{}".format(var, sys)
        if key in event.fields:
            systematics_record[placeholder] = event[key]
    return systematics_record

def get_variables_record(event, variables, systematics):
    variables_record = {}
    for var in variables:
        variables_record[var] = get_systematics_record(event, var, systematics)
    return variables_record

final = ak.Array([get_variables_record(evt, variables, systematics) for evt in branches])

CPU times: user 2.96 s, sys: 32.5 ms, total: 2.99 s
Wall time: 2.99 s


In [39]:
final.type

22838 * {"Muon_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Muon_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "PV_x": {"Nominal": float64}, "PV_y": {"Nominal": float64}, "PV_z": {"Nominal": float64}}

### Try some operations

In [15]:
# Slicing

for i, evt in enumerate(final[0:3]):
    print('Event {}: {}'.format(i, evt))

Event 0: ... Nominal: 58.9, Up: 59.4, Down: 58.4}, PV_x: 0.242, PV_y: 0.395, PV_z: 6.09}
Event 1: ... Nominal: 5.62, Up: 6.12, Down: 5.12}, PV_x: 0.241, PV_y: 0.394, PV_z: -6.54}
Event 2: ... Nominal: 7.99, Up: 8.49, Down: 7.49}, PV_x: 0.246, PV_y: 0.392, PV_z: -12}


In [17]:
for i, evt in enumerate(final['Muon_pt_1'][0:3]):
    print('Event {}: Nominal {}, Up {}, Down {}'.format(i, evt['Nominal'], evt['Up'], evt['Down']))

Event 0: Nominal 17.382436752319336, Up 17.882436752319336, Down 16.882436752319336
Event 1: Nominal 50.051612854003906, Up 50.551612854003906, Down 49.551612854003906
Event 2: Nominal 24.7022705078125, Up 25.2022705078125, Down 24.2022705078125


### Taggers

In [11]:
import pickle

In [12]:
# unpickle trained model
with open('classifier.pkl', 'rb') as f:
    bdt = pickle.load(f)

First we study how an Awkward array can be fed to an XGBoost classifier.

In [14]:
%%time
# Mine

up = ak.Array({var: up for (var, up) in [(var, final[var]['Up']) for var in final.fields if 'Up' in final[var].fields]})

CPU times: user 1.99 ms, sys: 0 ns, total: 1.99 ms
Wall time: 2.01 ms


In [15]:
up

<Array [{Muon_pt_1: 17.9, ... ] type='22838 * {"Muon_pt_1": float64, "Muon_pt_2"...'>

In [18]:
%%time
# Jim (https://github.com/scikit-hep/awkward-1.0/discussions/614)

up = ak.zip({field: final[field, "Up"] for field in ak.fields(final) if "Up" in ak.fields(final[field])})

CPU times: user 2.1 ms, sys: 98 µs, total: 2.2 ms
Wall time: 2.22 ms


In [19]:
up

<Array [{Muon_pt_1: 17.9, ... ] type='22838 * {"Muon_pt_1": float64, "Muon_pt_2"...'>

In [46]:
ak_up = ak.to_numpy(up)

In [47]:
ak_up

array([(17.88243675, 13.73839951, 45.24417496, 59.44207001),
       (50.55161285, 41.52435303, 16.4591465 ,  6.12264776),
       (25.20227051, 41.96265793,  5.54201794,  8.49494743), ...,
       (34.08853912, 35.66421127, 15.03151131, 12.73214531),
       (59.5030899 , 21.81192017,  5.68638945, 15.22136307),
       (43.4572525 , 40.81328583,  7.15603828,  9.40444851)],
      dtype=[('Muon_pt_1', '<f8'), ('Muon_pt_2', '<f8'), ('Electron_pt_1', '<f8'), ('Electron_pt_2', '<f8')])

In [62]:
%%time

sys = 'Up'
up = ak.Array([[evt[var][sys] for var in evt.fields if hasattr(evt[var], 'fields') and sys in evt[var].fields] for evt in final])

CPU times: user 5.58 s, sys: 0 ns, total: 5.58 s
Wall time: 5.58 s


In [67]:
%%time

sys = 'Up'
up = np.array([[evt[var][sys] for var in evt.fields if hasattr(evt[var], 'fields') and sys in evt[var].fields] for evt in final])

CPU times: user 5.51 s, sys: 14.8 ms, total: 5.52 s
Wall time: 5.52 s


In [70]:
prediction_up = bdt.predict(up)

In [107]:
prediction_up.shape

(22838,)

In [73]:
prediction_up_ak = ak.from_numpy(prediction_up)

In [112]:
prediction_up_ak

<Array [1, 1, 0, 1, 1, 1, ... 1, 1, 0, 1, 1, 1] type='22838 * float64'>

In [127]:
for i in prediction_up_ak[:3]:
    print(i)

1.0
1.0
0.0


In [76]:
# Study how a new complex value can be added to final

from copy import copy

In [132]:
cp = copy(final)

In [133]:
cp.type

22838 * {"Muon_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Muon_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "PV_x": float64, "PV_y": float64, "PV_z": float64}

In [135]:
type(cp['Muon_pt_1'])

awkward.highlevel.Array

In [147]:
systematics = ['Up']
d = {'Up': prediction_up_ak}
def make_evt_dict(d, evt):
    return {k: v[evt] for (k, v) in d.items()}
    
cp['Predict'] = ak.Array([make_evt_dict(d, evt) for evt in range(len(prediction_up_ak))])
#cp['Predict'] = ak.Array([{k: prediction_up_ak for k in systematics} for i in range(len(prediction_up_ak))])

In [148]:
cp.type

22838 * {"Muon_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Muon_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "PV_x": float64, "PV_y": float64, "PV_z": float64, "Predict": {"Up": float64}}

### Tagger class

In [28]:
class Tagger:
    def __init__(self, clf, variables, systematics, prediction):
        self.clf = clf
        self.variables = variables
        self.systematics = systematics
        self.prediction = prediction
        
    def predict(self, df):
        def get_predicted_array(df, sys):
            np_arr_input = np.array([[evt[var][sys] for var in self.variables] for evt in df])
            np_arr_output = self.clf.predict(np_arr_input)
            return ak.from_numpy(np_arr_output)
        def make_evt_dict(d, evt):
            return {k: v[evt] for (k, v) in d.items()}
        predictions = {sys: arr for (sys, arr) in [(sys, get_predicted_array(df, sys)) for sys in self.systematics]}
        df[self.prediction] = ak.Array([make_evt_dict(predictions, evt) for evt in range(len(df))])
        return df

In [29]:
tagger = Tagger(bdt, nominals_with_sys, ['Nominal', 'Up', 'Down'], 'Y')

In [30]:
%%time
predicted = tagger.predict(final)

CPU times: user 10.1 s, sys: 34.3 ms, total: 10.2 s
Wall time: 6.54 s


In [32]:
predicted.type

22838 * {"Muon_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Muon_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "PV_x": float64, "PV_y": float64, "PV_z": float64, "Y": {"Nominal": float64, "Up": float64, "Down": float64}}

### Selections

In [79]:
selected = predicted[predicted['Muon_pt_1']['Nominal'] > 25]

In [80]:
selected.type

9270 * {"Muon_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Muon_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_1": {"Nominal": float64, "Up": float64, "Down": float64}, "Electron_pt_2": {"Nominal": float64, "Up": float64, "Down": float64}, "PV_x": float64, "PV_y": float64, "PV_z": float64, "Y": {"Nominal": float64, "Up": float64, "Down": float64}}