# `func_adl` advanced xAOD Usage

The `func_adl` language is a functional, SQL-like language, loosely based on the C# language feature called `LINQ`.

Particle physics data is treated as a stream of small databases - each one containing all the information for an event. Operations are applied to each of these mini-databases. For example, an operation might extract all electrons with a $p_T$ greater than 35 GeV.

The key to understanding and thinking about `func_adl` and the data flow through it is to think of it like a stream of buckets, and each bucket contains data. A bucket could contain all the data in an event, or maybe it contains all the data associated with an electron or a jet, etc.

With that in mind, let's go back to our first getting-started example:

In [1]:
from func_adl_servicex import ServiceXSourceXAOD

dataset_xaod = "mc15_13TeV:mc15_13TeV.361106.PowhegPythia8EvtGen_AZNLOCTEQ6L1_Zee.merge.DAOD_STDM3.e3601_s2576_s2132_r6630_r6264_p2363_tid05630052_00"
ds = ServiceXSourceXAOD(dataset_xaod)
data = ds \
    .Select('lambda e: e.Jets("AntiKt4EMTopoJets")') \
    .Select('lambda jets: jets.Where(lambda j: (j.pt()/1000)>30)') \
    .Select('lambda good_jets: good_jets.Select(lambda j: j.pt()/1000.0)') \
    .AsAwkwardArray(["JetPt"]) \
    .value()

The `ServiceXSourceXAOD` is a data source - each bucket contains all the information in an event.

The `Select` function is a transform. It applies the `lambda` function to each bucket and passes the result on as the bucket for the next step in the chain. In this case, the first `.Select` transforms the event, `e`, into a list of jets. Thus, an event enters, and a list of jets exits.

Note the nested structure of the third line (the second `.Select`). We know that the `lambda` argument `jets` will contain a list of jets from each event. Thus the `jets.Where(...)` goes down a level. Inside that expression, each bucket is a single jet, so the `Where` lambda applies to a single jet. In this case, the `lambda` returns true if the jet $p_T$ is greater than 30 GeV. The `.Select` on that third line then takes a bucket with a list of all jets as input, and as output it has a list of all jets that have $p_T$ > 30 GeV.

The fourth line is similar - it transforms the list of jets in each bucket into a list of jet $p_T$ values, in GeV, in each bucket.

You can quickly see how composable this is, and how explicitly it expresses the level at which each operation works.
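To make the bucket picture concrete, here is a minimal plain-Python analogy of the query above. It does not use `func_adl` or ServiceX: the toy `events` list, its `"jets"` key, and the $p_T$ values (in MeV) are made up for illustration, and list comprehensions stand in for `Select` and `Where`.

```python
# Toy stand-in for a stream of events; each dict is one event "bucket".
# Jet pT values are in MeV, mirroring the /1000 conversion in the query.
events = [
    {"jets": [45000.0, 28000.0, 61000.0]},
    {"jets": [12000.0]},
    {"jets": [33000.0, 90000.0]},
]

# First Select: event -> list of jets (one list per event).
jets_per_event = [e["jets"] for e in events]

# Second Select with a nested Where: keep jets with pT > 30 GeV in each event.
good_jets_per_event = [
    [j for j in jets if j / 1000 > 30] for jets in jets_per_event
]

# Third Select with a nested Select: jet -> pT in GeV.
jet_pt_gev = [
    [j / 1000.0 for j in good_jets] for good_jets in good_jets_per_event
]

print(jet_pt_gev)
```

In this sketch the second event still produces a bucket, just an empty one: the nested `Where` removes jets, not events.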

## Where

Each statement can operate at any level. For example, let's say we wanted all jets with $p_T$ above 30 GeV, as above, but only in events with missing $E_T$ larger than 50 GeV. Missing transverse energy is an event-level quantity, so we want to eliminate buckets that contain events with less missing transverse energy. We might end up writing this:

In [None]:
ds = ServiceXSourceXAOD(dataset_xaod)
data = ds \
    .Where('lambda e: e.MissingET("MissingET")/1000.0 > 50') \
    .Select('lambda e: e.Jets("AntiKt4EMTopoJets")') \
    .Select('lambda jets: jets.Where(lambda j: (j.pt()/1000)>30)') \
    .Select('lambda good_jets: good_jets.Select(lambda j: j.pt()/1000.0)') \
    .AsAwkwardArray(["JetPt"]) \
    .value()
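
The difference between the two levels of `Where` can be sketched in plain Python. This is an analogy only, not `func_adl` itself: the toy `events`, the `"met"` and `"jets"` keys, and all values (in MeV) are hypothetical.

```python
# Toy events: each bucket carries an event-level quantity ("met")
# and a list of jet pT values, all in MeV.
events = [
    {"met": 72000.0, "jets": [45000.0, 28000.0]},
    {"met": 20000.0, "jets": [61000.0, 35000.0]},
    {"met": 55000.0, "jets": [15000.0]},
]

# Event-level Where: drops whole buckets with missing ET <= 50 GeV.
selected_events = [e for e in events if e["met"] / 1000.0 > 50]

# Jet-level Where inside a Select: drops jets, never events.
jet_pt_gev = [
    [j / 1000.0 for j in e["jets"] if j / 1000 > 30]
    for e in selected_events
]

print(jet_pt_gev)
```

The second toy event disappears entirely because it fails the event-level cut, even though both of its jets would pass the jet cut. The third event survives the event-level cut but ends up with an empty jet list.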

## SelectMany

## Async

## What happens when you screw up

Of course, it depends on how you screw up. Common things that happen:

- Your dataset identifier is not correct
- You reference something that does not exist in the xAOD
- ...

This is where the abstraction leaks! You'll get back a dump of what happened. Unfortunately, you'll have to figure out from there what you did wrong.

Here is an example where we ask for a leaf name, `ptt` instead of `pt`, that does not exist in the ATLAS xAOD.

In [None]:
ds = ServiceXSourceXAOD(dataset_xaod)
data = ds \
    .Select('lambda e: e.Jets("AntiKt4EMTopoJets")') \
    .Select('lambda good_jets: good_jets.Select(lambda j: j.ptt()/1000.0)') \
    .AsAwkwardArray(["JetPt"]) \
    .value()

Error transforming file: root://fax.mwt2.org:1094//pnfs/uchicago.edu/atlaslocalgroupdisk/rucio/mc15_13TeV/8a/f1/DAOD_STDM3.05630052._000001.pool.root.1
  -> error: Failed to transform input file root://fax.mwt2.org:1094//pnfs/uchicago.edu/atlaslocalgroupdisk/rucio/mc15_13TeV/91/60/DAOD_STDM3.05630052._000006.pool.root.1: Output file /home/atlas/root:::fax.mwt2.org:1094::pnfs:uchicago.edu:atlaslocalgroupdisk:rucio:mc15_13TeV:91:60:DAOD_STDM3.05630052._000006.pool.root.1 was not found -- errors: 
  -> Configured GCC from: /opt/lcg/gcc/8.3.0-eda0e/x86_64-centos7/bin/gcc
  -> Configured AnalysisBase from: /usr/AnalysisBase/21.2.102/InstallArea/x86_64-centos7-gcc8-opt
  -> Configured GCC from: /opt/lcg/gcc/8.3.0-eda0e/x86_64-centos7/bin/gcc
  -> Configured AnalysisBase from: /usr/AnalysisBase/21.2.102/InstallArea/x86_64-centos7-gcc8-opt
  -> xAOD::Init                INFO    Environment initialised for data access
  -> SampleHandler with 1 files
  -> Sample:name=ANALYSIS,tags=()
  -> root:/