# `coffea.NanoEvents` for PHYSLITE

This notebook is a quick demonstration of the `PHYSLITE` schema for [`coffea.NanoEvents`](https://github.com/CoffeaTeam/coffea/tree/master/coffea/nanoevents). `DAOD_PHYSLITE` is a small analysis format for ATLAS.

The PHYSLITE schema and the corresponding behavior classes are still under development; [CoffeaTeam/coffea#540](https://github.com/CoffeaTeam/coffea/issues/540) tracks the progress of some TODO items.

For more information on `NanoEvents`, see the [NanoEvents tutorial](https://github.com/CoffeaTeam/coffea/blob/master/binder/nanoevents.ipynb) or [Nick Smith's presentation](https://youtu.be/udzkE6t4Mck) at [PyHEP 2020](https://indico.cern.ch/event/882824).

First, download a `DAOD_PHYSLITE` example file:

In [1]:
import urllib.request  # the request submodule must be imported explicitly
import shutil
import os
import warnings

url = "https://github.com/CoffeaTeam/coffea/blob/6d548538653e7003281a572f8eec5d68ca57b19f/tests/samples/DAOD_PHYSLITE_21.2.108.0.art.pool.root?raw=true"
filename = "DAOD_PHYSLITE_21.2.108.0.art.pool.root"

if not os.path.exists(filename):
    with urllib.request.urlopen(url) as response, open(filename, 'wb') as f:
        shutil.copyfileobj(response, f)

In [2]:
from coffea.nanoevents import NanoEventsFactory, PHYSLITESchema
import awkward as ak

Until uproot uses `AwkwardForth` to deserialize vector branches, we need to use our custom routines for efficient reading. The following patches the function that extracts a column from a ROOT file:

In [3]:
from physlite_experiments.deserialization_hacks import patch_nanoevents
patch_nanoevents()

Let's put `UprootSourceMapping` into debug mode so that we can see when new branches are loaded:

In [4]:
from coffea.nanoevents.mapping import UprootSourceMapping
UprootSourceMapping._debug = True

Now, create the `NanoEventsFactory` object and get `events`, an awkward array representation of the event structure.

In [5]:
with warnings.catch_warnings():
    # lots of warnings for unreadable branches, ignore for now
    warnings.simplefilter("ignore")
    factory = NanoEventsFactory.from_root(filename, "CollectionTree", schemaclass=PHYSLITESchema)
    events = factory.events()

In [6]:
events

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AntiKt10TruthSoftDropBeta100Zcut10JetsAuxDyn.Tau1_wta', '!load', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['egammaClustersAuxDyn.calE', '!load', '!offsets']


<NanoEventsArray [...] type='40 * event'>

All collections are accessible as fields of this object:

In [7]:
events.fields

['METAssoc_METAux',
 'AntiKt10TruthSoftDropBeta100Zcut10JetsAux',
 'AntiKt10UFOCSSKJetsAux',
 'AntiKt4TruthDressedWZJetsAux',
 'Jets',
 'LargeRJets',
 'Photons',
 'AntiKt10TruthSoftDropBeta100Zcut10Jets',
 'AntiKt10UFOCSSKJets',
 'AntiKt4TruthDressedWZJets',
 'BTagging_AntiKt4EMPFlow_201903',
 'BornLeptons',
 'CaloCalTopoClusters',
 'CombinedMuonTrackParticles',
 'ExtrapolatedMuonTrackParticles',
 'GSFTrackParticles',
 'HardScatterParticles',
 'HardScatterVertices',
 'InDetTrackParticles',
 'MET_Core_MET',
 'MET_Truth',
 'MuonSpectrometerTrackParticles',
 'PrimaryVertices',
 'TruthBoson',
 'TruthBosonsWithDecayParticles',
 'TruthBosonsWithDecayVertices',
 'TruthBottom',
 'TruthElectrons',
 'TruthEvents',
 'TruthNeutrinos',
 'TruthPhotons',
 'TruthPrimaryVertices',
 'TruthTaus',
 'TruthTop',
 'egammaClusters',
 'HLT_mu24_iloose',
 'HLT_mu24_imedium',
 'HLT_mu24_ivarloose',
 'HLT_mu24_ivarmedium',
 'HLT_mu26_ivarmedium',
 'HLT_mu40',
 'Muons',
 'TruthMuons',
 'Electrons',
 'HLT_e24_lhtig

In [8]:
events.Electrons

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.trackParticleLinks', '!load', '!offsets']


<ElectronArray [[], [], ... Electron], [Electron]] type='40 * var * electron'>

All columns from the `Aux` and `AuxDyn` branches are available and grouped under the collections:

In [9]:
events.Electrons.fields

['DFCommonElectronsECIDS',
 'DFCommonElectronsECIDSResult',
 'DFCommonElectronsLHLoose',
 'DFCommonElectronsLHLooseBL',
 'DFCommonElectronsLHLooseBLIsEMValue',
 'DFCommonElectronsLHLooseIsEMValue',
 'DFCommonElectronsLHMedium',
 'DFCommonElectronsLHMediumIsEMValue',
 'DFCommonElectronsLHTight',
 'DFCommonElectronsLHTightIsEMValue',
 'DFCommonElectronsLHVeryLoose',
 'DFCommonElectronsLHVeryLooseIsEMValue',
 'OQ',
 '_eventindex',
 'ambiguityLink.m_persIndex',
 'ambiguityLink.m_persKey',
 'ambiguityType',
 'author',
 'caloClusterLinks',
 'charge',
 'eta',
 'firstEgMotherPdgId',
 'firstEgMotherTruthOrigin',
 'firstEgMotherTruthParticleLink.m_persIndex',
 'firstEgMotherTruthParticleLink.m_persKey',
 'firstEgMotherTruthType',
 'm',
 'phi',
 'pt',
 'ptcone20_TightTTVA_pt1000',
 'ptcone20_TightTTVA_pt500',
 'ptvarcone20',
 'ptvarcone20_TightTTVA_pt1000',
 'ptvarcone30_TightTTVA_pt1000',
 'ptvarcone30_TightTTVA_pt500',
 'ptvarcone40',
 'topoetcone20',
 'topoetcone20ptCorrection',
 'trackParticl

In [10]:
events.Electrons.pt

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.pt', '!load', '!content', '!data']


<Array [[], [], ... 3.81e+04], [6.7e+04]] type='40 * var * float32[parameters={"...'>

Cross references work transparently:

In [11]:
events.Electrons.trackParticles

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['GSFTrackParticlesAuxDyn.phi', '!load', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['GSFTrackParticlesAuxDyn.phi', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.pt', '!load', '!eventindex', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.trackParticleLinks', '!load', '!content', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.trackParticleLinks', '!load', '!content', '!content', 'm_persIndex', '!item', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.trackParticleLinks', '!load', '!content', '!content', 'm_persKey', '!item', '!data']


<TrackParticleArray [[], [], ... TrackParticle]]] type='40 * var * var * ?trackP...'>

In [12]:
events.Electrons.trackParticles.fields

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['GSFTrackParticlesAuxDyn.phi', '!load', '!eventindex', '!content', '!data']


['_eventindex',
 'chiSquared',
 'd0',
 'definingParametersCovMatrix',
 'phi',
 'qOverP',
 'theta',
 'vertexLink.m_persIndex',
 'vertexLink.m_persKey',
 'vz',
 'z0']

Some fields are accessible via the defined behavior, e.g. the `pt` for track particles will be calculated from the track parameters:

In [13]:
events.Electrons.trackParticles.pt

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['GSFTrackParticlesAuxDyn.qOverP', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['GSFTrackParticlesAuxDyn.theta', '!load', '!content', '!data']


<Array [[], [], ... [[2.78e+04, 1.87e+04]]] type='40 * var * var * ?float32'>
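As a rough illustration of what such a derived property amounts to: the debug output above shows that only `qOverP` and `theta` are read, and the transverse momentum follows from them as `sin(theta) / |qOverP|`. The sketch below uses plain Python with a hypothetical helper name `track_pt` and made-up input values (assuming MeV units); it shows the underlying relation and is not the coffea behavior code itself.

```python
import math


def track_pt(q_over_p, theta):
    """Transverse momentum from charge-over-momentum and polar angle.

    Hypothetical helper for illustration only.
    """
    # For unit charge, |p| = 1 / |q/p|; pt is the projection of p
    # onto the plane transverse to the beam axis.
    return math.sin(theta) / abs(q_over_p)


# Made-up track: |p| = 20000 (assumed MeV), perpendicular to the beam axis.
print(track_pt(q_over_p=-1.0 / 20000.0, theta=math.pi / 2))
```

The sign of `q_over_p` carries the charge, which is why the absolute value is taken before inverting.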

Cross referencing will even work when the array is sliced, selected or reshuffled:

In [14]:
events.Electrons[events.Electrons.pt > 10000].trackParticles

<TrackParticleArray [[], [], ... TrackParticle]]] type='40 * var * var * ?trackP...'>

In [15]:
events[[2, 3]].Electrons.trackParticles.pt.tolist()

[[[70525.125]],
 [[13800.5927734375,
   3641.297119140625,
   3632.580810546875,
   15255.3095703125,
   2126.81982421875]]]

In [16]:
events[[3, 2]].Electrons.trackParticles.pt.tolist()

[[[13800.5927734375,
   3641.297119140625,
   3632.580810546875,
   15255.3095703125,
   2126.81982421875]],
 [[70525.125]]]

... and also when referencing into multiple collections:

In [17]:
events.TruthElectrons.parents.pdgId

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.prodVtxLink/TruthElectronsAuxDyn.prodVtxLink.m_persKey', '!load', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.parentLinks', '!load', '!content', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.parentLinks', '!load', '!content', '!content', 'm_persKey', '!item', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.prodVtxLink/TruthTausAuxDyn.prodVtxLink.m_persKey', '!load', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.prodVtxLink/TruthTausAuxDyn.prodVtxLink.m_persKey', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.prodVtxLink/TruthElectronsAuxDyn.prodVtxLink.m_persKey', '!load', '!eventindex', '!content', '

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.decayVtxLink/TruthBosonAuxDyn.decayVtxLink.m_persKey', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.e', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.m', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.parentLinks', '!load', '!content', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.parentLinks', '!load', '!content', '!content', 'm_persIndex', '!item', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.parentLinks', '!load', '!content', '!content', 'm_persKey', '!item', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthBosonAuxDyn.pdgId', '!load', '!

<Array [[[15]], [[]], ... -5, -5], [-5, -5]]] type='40 * var * var * ?int32'>
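The debug output shows that both `m_persKey` and `m_persIndex` are read for each link. One way to picture this, as a simplified sketch and not the actual coffea implementation, is that the key selects the target collection and the index selects the element within it. The integer keys, the `resolve` helper and the toy collections below are all hypothetical.

```python
# Hypothetical per-event collections, addressed by made-up integer keys.
collections = {
    101: [{"pdgId": 15}, {"pdgId": -15}],  # stand-in for a tau collection
    202: [{"pdgId": 23}],                  # stand-in for a boson collection
}


def resolve(links, collections):
    """Follow (key, index) links; links with an unknown key resolve to None."""
    resolved = []
    for key, index in links:
        target = collections.get(key)
        resolved.append(target[index] if target is not None else None)
    return resolved


# Two links into different collections, plus one whose target is not stored.
links = [(101, 1), (202, 0), (999, 0)]
parents = resolve(links, collections)
print([p["pdgId"] if p is not None else None for p in parents])
# prints [-15, 23, None]
```

In this picture, a link whose target collection is not available yields a missing value, which would be consistent with the option type (`?int32`) in the output above.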

Behavior will be attached to the resulting arrays, so one can also follow "cyclic" references:

In [18]:
events.TruthElectrons.parents.children.parents.children.pdgId

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.Classification', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.childLinks', '!load', '!content', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.childLinks', '!load', '!content', '!content', 'm_persIndex', '!item', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.childLinks', '!load', '!content', '!content', 'm_persKey', '!item', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthTausAuxDyn.prodVtxLink/TruthTausAuxDyn.prodVtxLink.m_persKey', '!load', '!eventindex', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthPhotonsAuxDyn.prodVtxLink/TruthPhotonsAuxDyn.prodVtxLink.m_persKey', '!load', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef 

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthMuonsAuxDyn.etcone20', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.etcone20', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthMuonsAuxDyn.m', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.m', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthMuonsAuxDyn.nPhotons_dressed', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthElectronsAuxDyn.nPhotons_dressed', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['TruthMuonsAuxDyn.parentLinks', '!load', '!content', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 

<Array [[[[[[16, 11, -12, ... -12, 22, 22]]]]]] type='40 * var * var * option[va...'>

In [19]:
events.TruthElectrons.parents.children.parents.children.ndim

6

LorentzVector arithmetic also works:

In [20]:
(events.Jets[:, 0] + events.Jets[:, 1]).mass

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisJetsAuxDyn.pt', '!load', '!offsets']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisJetsAuxDyn.phi', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisJetsAuxDyn.pt', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisJetsAuxDyn.eta', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisJetsAuxDyn.m', '!load', '!content', '!data']


<Array [1.29e+05, 8.83e+04, ... 7.75e+04] type='40 * float32'>
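To make explicit what the sum and `.mass` compute, here is a minimal plain-Python sketch for a single pair of vectors given as `(pt, eta, phi, m)`, the four columns read in the debug output above. The helper names `to_cartesian` and `invariant_mass` and the input values are assumptions for illustration; coffea performs the equivalent operation array-wise through its vector behaviors.

```python
import math


def to_cartesian(pt, eta, phi, m):
    """Convert (pt, eta, phi, m) to Cartesian components (px, py, pz, E)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + m**2)
    return px, py, pz, energy


def invariant_mass(v1, v2):
    """Invariant mass of the component-wise sum of two four-vectors."""
    px, py, pz, energy = (
        a + b for a, b in zip(to_cartesian(*v1), to_cartesian(*v2))
    )
    m2 = energy**2 - px**2 - py**2 - pz**2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


# Made-up example: two massless, back-to-back objects in the transverse plane.
jet1 = (50000.0, 0.0, 0.0, 0.0)
jet2 = (50000.0, 0.0, math.pi, 0.0)
print(invariant_mass(jet1, jet2))
```

For this back-to-back configuration the momenta cancel, so the invariant mass is close to the sum of the two energies.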

In [21]:
events.Electrons.delta_r(events.Electrons.nearest(events.Jets))

Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.eta', '!load', '!content', '!data']
Gettting: 5a5bf972-b48c-11eb-ae42-0101007fbeef /CollectionTree;1 0 40 ['AnalysisElectronsAuxDyn.phi', '!load', '!content', '!data']


<Array [[], [], ... 0.0157], [0.0221]] type='40 * var * ?float32'>
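For reference, the quantity behind `delta_r` is the distance in the (eta, phi) plane, with the azimuthal difference wrapped into [-pi, pi). The sketch below works through it for one event in plain Python; the helper names and the `(eta, phi)` tuples are hypothetical, and the per-event matching in coffea is done on arrays instead of Python lists.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi


def delta_r(a, b):
    """Distance in the (eta, phi) plane between two (eta, phi) tuples."""
    return math.hypot(a[0] - b[0], delta_phi(a[1], b[1]))


def nearest(obj, candidates):
    """Closest candidate in delta_r, or None if there are no candidates."""
    return min(candidates, key=lambda c: delta_r(obj, c), default=None)


# Made-up event: the first jet is close to the electron across the phi wrap.
electron = (0.5, 3.1)
jets = [(0.5, -3.1), (0.0, 0.0)]
closest = nearest(electron, jets)
print(closest, delta_r(electron, closest))
```

Without the wrap-around in `delta_phi`, the first jet would appear to be about 6.2 away in phi and the second jet would be picked instead.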