# Object Selection Example

This notebook looks at how you can apply selections on NanoAOD objects with Awkward Array. We'll do this in a single event, as well as across multiple events simultaneously.

As in the event counting notebook, you will need to have run the appropriate `voms-proxy-init` command to be able to access the ROOT file used in this notebook.

We'll start by getting the events from one file, like in the previous example.

In [1]:
import awkward as ak
import numpy as np
from coffea.nanoevents import NanoEventsFactory
from coffea.analysis_tools import PackedSelection

redirector = "root://cmsxrootd.fnal.gov//"
file = redirector+"/store/mc/RunIISummer20UL17NanoAODv9/Z2JetsToNuNu_M-50_LHEFilterPtZ-400ToInf_MatchEWPDG20_TuneCP5_13TeV-amcatnloFXFX-pythia8/NANOAODSIM/106X_mc2017_realistic_v9-v2/2510000/012B57B0-38DB-8246-B875-2E5E4625C134.root"
events = NanoEventsFactory.from_root(
        file+":Events",
    ).events()
print("Events loaded")

Events loaded


## One Event

Let's grab the first event, and then just its AK4 jets. We can see the $p_t$ of the jets in the event as before.

In [2]:
event0 = events[0]
jets0 = event0.Jet
print(f"p_t of AK4 jets in first event: {jets0.pt.compute()}")

p_t of AK4 jets in first event: [489, 72.1, 45.2, 41.8, 35.7, 22.8, 20, 15.7, 15.6]


Suppose we want only jets with more than 30 GeV of $p_t$. We can create a filter as follows. The filter is a boolean array with one entry per jet: an entry is True if the corresponding jet has $p_t$ > 30 GeV, and False otherwise.

In [3]:
pt_filter0 = jets0.pt > 30
print(f"Our p_t filter for the first event: {pt_filter0.compute()}")

Our p_t filter for the first event: [True, True, True, True, True, False, False, False, False]


Now we can use the filter to keep only the jets we want: indexing the jets with the boolean filter returns the jets whose entry is True. We'll print the filtered jets' $p_t$ to confirm that this worked.

In [4]:
filtered_jets0 = jets0[pt_filter0]
print(f"p_t of AK4 jets in first event after filtering: {filtered_jets0.pt.compute()}")
print(f"eta of AK4 jets in first event after filtering: {filtered_jets0.eta.compute()}")

p_t of AK4 jets in first event after filtering: [489, 72.1, 45.2, 41.8, 35.7]
eta of AK4 jets in first event after filtering: [0.811, -2.01, -1.83, -2.32, 1.71]


We can further filter on eta. Suppose we only want jets with $|\eta|$ < 2.0.

In [5]:
twice_filtered_jets0 = filtered_jets0[abs(filtered_jets0.eta) < 2.0]
print(f"p_t of AK4 jets in first event after filtering twice: {twice_filtered_jets0.pt.compute()}")
print(f"eta of AK4 jets in first event after filtering twice: {twice_filtered_jets0.eta.compute()}")

p_t of AK4 jets in first event after filtering twice: [489, 45.2, 35.7]
eta of AK4 jets in first event after filtering twice: [0.811, -1.83, 1.71]


We could also have applied both requirements at once by combining the filters with a logical AND (`&`). For example, this gives the same set of jets as above:

In [6]:
also_twice_filtered_jets0 = jets0[(jets0.pt > 30) & (abs(jets0.eta) < 2.0)]
print(f"p_t of AK4 jets in first event with both filters: {also_twice_filtered_jets0.pt.compute()}")
print(f"eta of AK4 jets in first event with both filters: {also_twice_filtered_jets0.eta.compute()}")

p_t of AK4 jets in first event with both filters: [489, 45.2, 35.7]
eta of AK4 jets in first event with both filters: [0.811, -1.83, 1.71]


## Multiple Events

To apply the same selections on multiple events, we can use the same syntax as above, just replacing `event0` with `events`.

In [8]:
jets = events.Jet
pt_filter = jets.pt > 30
eta_filter = abs(jets.eta) < 2.0
filtered_jets = jets[pt_filter & eta_filter]

To see that this worked, let's look at the $p_t$ and $\eta$ of the jets in the first event before and after filtering. The filtered values should match what we got in the last section.

In [9]:
print(f"p_t of AK4 jets in first event before filtering: {jets[0].pt.compute()}")
print(f"eta of AK4 jets in first event before filtering: {jets[0].eta.compute()}")
print(f"p_t of AK4 jets in first event after filtering: {filtered_jets[0].pt.compute()}")
print(f"eta of AK4 jets in first event after filtering: {filtered_jets[0].eta.compute()}")

p_t of AK4 jets in first event before filtering: [489, 72.1, 45.2, 41.8, 35.7, 22.8, 20, 15.7, 15.6]
eta of AK4 jets in first event before filtering: [0.811, -2.01, -1.83, -2.32, 1.71, -3.26, 1.29, 2.1, 1.15]
p_t of AK4 jets in first event after filtering: [489, 45.2, 35.7]
eta of AK4 jets in first event after filtering: [0.811, -1.83, 1.71]


The same selections were applied to the second event and to every other event. Looking at the second event:

In [11]:
print(f"p_t of AK4 jets in second event before filtering: {jets[1].pt.compute()}")
print(f"eta of AK4 jets in second event before filtering: {jets[1].eta.compute()}")
print(f"p_t of AK4 jets in second event after filtering: {filtered_jets[1].pt.compute()}")
print(f"eta of AK4 jets in second event after filtering: {filtered_jets[1].eta.compute()}")

p_t of AK4 jets in second event before filtering: [346, 76.1, 29.6, 21.5, 18.9, 17.8, 16.3, 15.7, 15.6]
eta of AK4 jets in second event before filtering: [-0.796, -1.54, -0.0024, -4.39, 1.91, -0.403, 4.64, -3.08, -0.805]
p_t of AK4 jets in second event after filtering: [346, 76.1]
eta of AK4 jets in second event after filtering: [-0.796, -1.54]


## Masking

Awkward Array provides another way to select objects, `ak.mask`, which does not lose information about where the dropped objects were or how many were dropped.

Instead of dropping jets that don't meet the requirements, they are replaced with None. This can become useful when combining multiple selections for more complicated requirements. We give an example of this with the first event in our file.

In [27]:
jets0 = events.Jet[0]
pt_filter0 = jets0.pt > 30
masked_jets0 = ak.mask(jets0,pt_filter0)
print(f"Masked array of jets in event 1: {masked_jets0.compute()}")
print(f"pt of the masked array of jets in event 1: {masked_jets0.pt.compute()}")

Masked array of jets in event 1: [Jet, ...]
pt of the masked array of jets in event 1: [489, 72.1, 45.2, 41.8, 35.7, None, None, None, None]
