# The Jet Collection

The jet collection is one of the more sophisticated collections in ATLAS. Below we'll look at:

* Getting jet attributes like $p_T$ and $\eta$
* Getting the constituents that went into building the jet
* How calibrations are applied, and how one can get at the systematic variations

In [None]:
from config import ds_zee as ds
from config import ds_jz3_exot15
import matplotlib.pyplot as plt
import awkward as ak
import numpy as np
from func_adl_servicex_xaodr21.xAOD.jet_v1 import cpp_float

In [None]:
jets = (ds
        .SelectMany(lambda e: e.Jets("AntiKt4EMPFlowJets"))
        .Where(lambda j: (j.pt() / 1000) > 30)
        .Select(lambda j: j.pt() / 1000.0)
        .AsAwkwardArray('JetPt')
        .value())

In [None]:
plt.hist(jets.JetPt, bins=100, range=(0, 100))
plt.xlabel('Jet $p_T$ [GeV]')
plt.ylabel('Number of jets')
_ = plt.title('Jet $p_T$ distribution for $Z\\rightarrow ee$ events')

## Jet Constituents

Jets are composed of `TopoClusters` in ATLAS. Unfortunately, the clusters are often skimmed away: they are not present, for example, in `DAOD_PHYS`, which is the skim being used here. We'll have to pull from a different file instead.

In [None]:
topo_clusters = (ds_jz3_exot15
                    .SelectMany(lambda e: e.Jets("AntiKt4EMTopoJets", truth_jets="AntiKt4TruthJets"))
                    .SelectMany(lambda j: j.getConstituents())
                    .Select(lambda tc: tc.pt())
                    .AsAwkwardArray('JetClusterPt')
                    .value()
                )

In [None]:
plt.hist(topo_clusters.JetClusterPt/1000.0, bins=100, range=(0, 20))
plt.xlabel('Jet Cluster $p_T$ [GeV]')
plt.ylabel('Number of clusters')
_ = plt.title('Jet Cluster $p_T$ distribution for jets in JZ3 events')

## Jet Attributes

In [None]:
jets = (ds
        .SelectMany(lambda e: e.Jets("AntiKt4EMPFlowJets"))
        .Where(lambda j: (j.pt() / 1000) > 30)
        .Select(lambda j: j.getAttribute[cpp_float]('moment_name'))
        .Select(lambda v: v+1)
        .AsAwkwardArray('moment')
        .value())

## Calibration

By default the jets we pulled from above are calibrated, and the best central value for the jet collection you request is returned. This section shows you how to:

* Pull out the raw, uncalibrated jets
* Get a particular systematic variation

Because we want to compare the two, and the jet corrections change the number of jets, we need to match jets, which has to be done within each event. Let's get the default calibration with eta and phi:

In [None]:
jets = (ds
        .Select(lambda e: e.Jets("AntiKt4EMPFlowJets")
                           .Where(lambda j: (j.pt() / 1000) > 30))
        .Select(lambda jets: {
            'pt': jets.Select(lambda j: j.pt() / 1000.0),
            'eta': jets.Select(lambda j: j.eta()),
            'phi': jets.Select(lambda j: j.phi()),
        })
        .AsAwkwardArray()
        .value())
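
The difference between the flat `SelectMany` queries used earlier and the per-event `Select` used here can be illustrated with plain Python lists. This is only a sketch: the `events` list and its values are made up, and nested lists stand in for the awkward arrays that the query returns.

```python
from itertools import chain

# Hypothetical jet pT values [GeV] for three events (illustrative numbers only).
events = [[45.2, 38.1], [], [102.7, 61.3, 33.9]]

# SelectMany-style: one flat sequence of jets, event boundaries are lost.
flat_jets = list(chain.from_iterable(events))

# Select-style: one entry per event, so jets can be matched within an event.
per_event_jets = [list(jets) for jets in events]

n_events = len(per_event_jets)                      # counts events
n_jets = sum(len(jets) for jets in per_event_jets)  # counts jets
print(flat_jets)
print(n_events, n_jets)
```

Note that the length of the per-event structure is the number of events, not the number of jets.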

To grab the raw jets (without calibration), we set the `calibration` argument to `None` (there is rarely a reason to do this in normal use):

In [None]:
raw_jets = (ds
           .Select(lambda e: e.Jets("AntiKt4EMPFlowJets", calibration=None)
                              .Where(lambda j: (j.pt() / 1000) > 30))
           .Select(lambda jets: {
                'pt': jets.Select(lambda j: j.pt() / 1000.0),
                'eta': jets.Select(lambda j: j.eta()),
                'phi': jets.Select(lambda j: j.phi()),
           })
           .AsAwkwardArray()
           .value())

The number of raw jets is quite different from the number of calibrated jets, so we'll need to match them in $\eta$ and $\phi$:

In [None]:
# Total jets across all events (len() would count events, not jets)
ak.count(raw_jets.pt), ak.count(jets.pt)

In [None]:
def match(jets, jets_to_match):
    'Find the closest eta/phi jet in jets_to_match for each jet in jets'

    to_match_pt = jets_to_match.pt
    to_match_eta = jets_to_match.eta
    to_match_phi = jets_to_match.phi
    jet_eta = jets.eta
    jet_phi = jets.phi

    pair_eta = ak.cartesian([jet_eta, to_match_eta], axis=1, nested=True)
    pair_phi = ak.cartesian([jet_phi, to_match_phi], axis=1, nested=True)

    delta_eta = np.abs(pair_eta[:, :, :]["0"] - pair_eta[:, :, :]["1"])
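    # Wrap delta phi into [0, pi] so jets on either side of +/-pi still match
    delta_phi = np.abs(pair_phi[:, :, :]["0"] - pair_phi[:, :, :]["1"])
    delta_phi = ak.where(delta_phi > np.pi, 2 * np.pi - delta_phi, delta_phi)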

    delta = delta_eta**2 + delta_phi**2

    # TODO: remove anything larger than 0.2*0.2
    best_match = ak.argmin(delta, axis=2)

    return ak.Record({"eta": to_match_eta[best_match], "phi": to_match_phi[best_match], "pt": to_match_pt[best_match]})

raw_jets_matched = match(jets, raw_jets)
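
The `match` function above works on whole awkward arrays. The same nearest-neighbour idea, including wrapping $\Delta\phi$ across $\pm\pi$ and the 0.2 cut on the match distance noted in its comments, can be sketched for a single event in plain Python. The helper names `delta_phi` and `match_one_event`, and the jet coordinates, are hypothetical and chosen only for illustration.

```python
import math

def delta_phi(phi1, phi2):
    """Signed azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2 * math.pi) - math.pi

def match_one_event(jets, candidates, max_dr=0.2):
    """For each (eta, phi) in jets, return the index of the closest candidate
    within max_dr, or None if no candidate is close enough."""
    matches = []
    for eta, phi in jets:
        best_index, best_dr2 = None, max_dr ** 2
        for i, (c_eta, c_phi) in enumerate(candidates):
            dr2 = (eta - c_eta) ** 2 + delta_phi(phi, c_phi) ** 2
            if dr2 < best_dr2:
                best_index, best_dr2 = i, dr2
        matches.append(best_index)
    return matches

# Hypothetical jets: the first sits just below +pi, its partner just above -pi.
jets = [(0.50, 3.10), (-1.20, 0.40)]
candidates = [(-1.21, 0.42), (0.52, -3.12), (2.00, 1.00)]
print(match_one_event(jets, candidates))
```

Without the wrap-around, the first jet and its partner would appear to be about 6.2 radians apart in $\phi$ and would fail the cut.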

In [None]:
# pt was already converted to GeV in the queries above, so no further scaling is needed
plt.hist(ak.flatten(jets.pt - raw_jets_matched.pt), bins=100, range=(-100, 100))
plt.xlabel('$\\Delta p_T$ for calibrated jets matched to their raw jets [GeV]')
plt.ylabel('Number of jets')
_ = plt.title('The effect of jet calibration on jet $p_T$ in $Z\\rightarrow ee$ events')

If we instead want a particular systematic error, we need only name that error to get it back. The names of the systematic errors, however, cannot be determined programmatically ahead of time. See the Further Information section at the end of this chapter for a link to the ATLAS jet calibration TWiki.

In [None]:
sys_jets = (ds
           .Select(lambda e: e.Jets("AntiKt4EMPFlowJets", calibration="JET_Pileup_PtTerm__1up")
                              .Where(lambda j: (j.pt() / 1000) > 30))
           .Select(lambda jets: {
                'pt': jets.Select(lambda j: j.pt() / 1000.0),
                'eta': jets.Select(lambda j: j.eta()),
                'phi': jets.Select(lambda j: j.phi()),
           })
           .AsAwkwardArray()
           .value())
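
The name passed above ends in `__1up`. Assuming the usual pairing of a `__1up` and a `__1down` variation for each systematic (an assumption here; check the recommendation pages for which names actually exist), a small hypothetical helper can build both names from a base name:

```python
def variation_names(base):
    """Return the (up, down) variation names for a systematic base name.

    Assumes the '__1up' / '__1down' suffix convention; whether both
    variations exist for a given systematic is not checked here.
    """
    return f"{base}__1up", f"{base}__1down"

up, down = variation_names("JET_Pileup_PtTerm")
print(up, down)
```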

In [None]:
sys_jets_matched = match(jets, sys_jets)

In [None]:
# pt was already converted to GeV in the queries above, so no further scaling is needed
plt.hist(ak.flatten(jets.pt - sys_jets_matched.pt), bins=100, range=(-5, 5))
plt.xlabel('$\\Delta p_T$ for calibrated jets matched to their JET_Pileup_PtTerm__1up jets [GeV]')
plt.ylabel('Number of jets')
_ = plt.title('The effect of a jet calibration sys error on jet $p_T$ in $Z\\rightarrow ee$ events')

Currently you only get the calibration constants that are built into the release. If you need to pin a particular calibration, get in touch.

## The Data Model

The data model when this documentation was last built was:

In [None]:
from func_adl_servicex_xaodr21.xAOD.jet_v1 import Jet_v1
help(Jet_v1)

In [None]:
from func_adl_servicex_xaodr21.xAOD.jetconstituent import JetConstituent
help(JetConstituent)

## Further Information

* The [`xAOD::Jet_v1` C++ header file](https://gitlab.cern.ch/atlas/athena/-/blob/21.2/Event/xAOD/xAODJet/xAODJet/versions/Jet_v1.h) with all the inline documentation.
* The [`xAOD::JetConstituent` C++ header file](https://gitlab.cern.ch/atlas/athena/-/blob/21.2/Event/xAOD/xAODJet/xAODJet/JetConstituentVector.h) with all the inline documentation.
* The [Jet ET-Miss Recommendation Pages for R21](https://twiki.cern.ch/twiki/bin/view/AtlasProtected/JetEtmissRecommendationsR21) on the ATLAS TWiki