# CoDaS-HEP Columnar Data Analysis, hands-on project

This is the third of four notebooks on [columnar data analysis](https://indico.cern.ch/event/1151367/timetable/#41-columnar-data-analysis), presented at CoDaS-HEP at 12:30pm on August 3, 2022 by Jim Pivarski and Ioana Ifrim.

See the [GitHub repo](https://github.com/jpivarski-talks/2022-08-03-codas-hep-columnar-tutorial#readme) for instructions on how to run it.

<br><br><br><br><br>

## Project: H → ZZ → 4ℓ

In this exercise, we'll reconstruct Z masses and the Higgs mass from four leptons (4μ, 4e, 2μ2e).
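
As a reminder of what "reconstructing a mass" means, here is a minimal plain-Python sketch of the invariant-mass calculation for leptons given as `(pt, eta, phi, mass)` tuples. The helper names `to_cartesian` and `invariant_mass` are made up for this illustration and are not part of Vector or Awkward; Vector performs the equivalent calculation on whole arrays at once. The two input tuples are rounded from the first two muons of the first event printed later in this notebook.

```python
import math

def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz). Hypothetical helper."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return energy, px, py, pz

def invariant_mass(*leptons):
    """Invariant mass of the sum of the given four-momenta. Hypothetical helper."""
    e_sum = px_sum = py_sum = pz_sum = 0.0
    for lepton in leptons:
        e, px, py, pz = to_cartesian(*lepton)
        e_sum += e
        px_sum += px
        py_sum += py
        pz_sum += pz
    m_squared = e_sum**2 - px_sum**2 - py_sum**2 - pz_sum**2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m_squared, 0.0))

# (pt, eta, phi, mass), rounded from the first event's first two muons
muon_1 = (63.04, -0.7187, 2.968, 0.1057)
muon_2 = (38.12, -0.8795, -1.0325, 0.1057)
print(invariant_mass(muon_1, muon_2))
```

For this pair the result should come out at roughly 89.5 GeV, which is near the Z mass. Passing four tuples instead of two gives a four-lepton mass in the same way.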

We'll use Vector, which requires Awkward 1. Both versions of Awkward are installed, but the installed version of Uproot only supports Awkward 2.

So we'll get the data from a Parquet file, which does not require Uproot.

In [1]:
import awkward as ak   # version 1
import vector
vector.register_awkward()

In [3]:
raw_data = ak.from_parquet("data/SMHiggsToZZTo4L.parquet")
raw_data

<Array [{run: 1, ... MET_phi: 1.65}] type='299973 * {"run": int32, "luminosityBl...'>

(In Awkward 1, we can't call `raw_data.show()` or `raw_data.type.show()`. ☹)

In [7]:
{field_name: raw_data[field_name].type for field_name in raw_data.fields}

{'run': 299973 * int32,
 'luminosityBlock': 299973 * int64,
 'event': 299973 * uint64,
 'PV_npvs': 299973 * int32,
 'PV_x': 299973 * float32,
 'PV_y': 299973 * float32,
 'PV_z': 299973 * float32,
 'nMuon': 299973 * int64,
 'Muon_pt': 299973 * var * float32,
 'Muon_eta': 299973 * var * float32,
 'Muon_phi': 299973 * var * float32,
 'Muon_mass': 299973 * var * float32,
 'Muon_charge': 299973 * var * int32,
 'Muon_pfRelIso03_all': 299973 * var * float32,
 'Muon_pfRelIso04_all': 299973 * var * float32,
 'Muon_dxy': 299973 * var * float32,
 'Muon_dxyErr': 299973 * var * float32,
 'Muon_dz': 299973 * var * float32,
 'Muon_dzErr': 299973 * var * float32,
 'nElectron': 299973 * int64,
 'Electron_pt': 299973 * var * float32,
 'Electron_eta': 299973 * var * float32,
 'Electron_phi': 299973 * var * float32,
 'Electron_mass': 299973 * var * float32,
 'Electron_charge': 299973 * var * int32,
 'Electron_pfRelIso03_all': 299973 * var * float32,
 'Electron_dxy': 299973 * var * float32,
 'Electron_dxyE

<br><br><br><br><br>

Vector requires arrays of records with fields named `pt`, `phi`, `eta`, and `mass`, and with the record name `"Momentum4D"`.

The records don't need a `charge` field, but extra fields are allowed.

[ak.zip](https://awkward-array.readthedocs.io/en/latest/_auto/ak.zip.html) can build records like that.

In [12]:
events = ak.zip({
    "muons": ak.zip({
        "pt": raw_data["Muon_pt"],
        "phi": raw_data["Muon_phi"],
        "eta": raw_data["Muon_eta"],
        "mass": raw_data["Muon_mass"],
    }, with_name="Momentum4D"),
    "electrons": ak.zip({
        "pt": raw_data["Electron_pt"],
        "phi": raw_data["Electron_phi"],
        "eta": raw_data["Electron_eta"],
        "mass": raw_data["Electron_mass"],
    }, with_name="Momentum4D"),
}, depth_limit=1)   # zip only at the event level: muon and electron counts differ per event

events

<Array [{muons: [{pt: 63, ... mass: 0.0185}]}] type='299973 * {"muons": var * Mo...'>

<br><br><br><br><br>

Without `.show()`, we can get a sense of the structure by converting the first few events into lists and dicts.

In [14]:
events[:3].tolist()

[{'muons': [{'pt': 63.04386901855469,
    'phi': 2.968005895614624,
    'eta': -0.7186822295188904,
    'mass': 0.10565836727619171},
   {'pt': 38.12034606933594,
    'phi': -1.0324749946594238,
    'eta': -0.8794569969177246,
    'mass': 0.10565836727619171},
   {'pt': 4.04868745803833,
    'phi': 1.0385035276412964,
    'eta': -0.320764422416687,
    'mass': 0.10565836727619171}],
  'electrons': []},
 {'muons': [],
  'electrons': [{'pt': 21.902679443359375,
    'phi': 0.1339961737394333,
    'eta': -0.7021886706352234,
    'mass': 0.00543835386633873},
   {'pt': 42.63296890258789,
    'phi': -1.8634047508239746,
    'eta': -0.9796805381774902,
    'mass': 0.008667422458529472},
   {'pt': 78.01239013671875,
    'phi': -2.2078325748443604,
    'eta': -0.9338527917861938,
    'mass': 0.018527036532759666},
   {'pt': 23.835430145263672,
    'phi': -0.6215649247169495,
    'eta': -1.362490177154541,
    'mass': 0.008162532933056355}]},
 {'muons': [],
  'electrons': [{'pt': 11.571166992187

<br><br><br><br><br>

HERE

<br><br><br><br><br>

### 4 leptons of the same flavor
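
One way to think about this step, shown here as a hedged plain-Python sketch for a single event rather than as the columnar solution: every unordered group of four leptons of one flavor is a candidate. The function name `four_lepton_candidates` and the string labels are invented for illustration. In columnar form, `ak.combinations(events.muons, 4)` plays the same role for all events at once.

```python
from itertools import combinations

def four_lepton_candidates(leptons):
    """All unordered groups of 4 leptons from one event's list of a single flavor.

    Hypothetical helper: events with fewer than 4 leptons yield an empty list.
    """
    return list(combinations(leptons, 4))

# Placeholder labels standing in for the muon records of one event
event_muons = ["mu0", "mu1", "mu2", "mu3", "mu4"]
for group in four_lepton_candidates(event_muons):
    print(group)
```

An event with five muons has five such groups (5 choose 4), and an event with exactly four has one.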

<br><br><br><br><br>

### Opposite charges
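
A Z candidate is a pair of leptons with opposite charges, so a four-lepton group needs two positive and two negative leptons, and there are then two ways to split it into Z candidates. The sketch below illustrates that bookkeeping in plain Python for one group. The function name `opposite_charge_pairings` is hypothetical, and the charges are assumed to come from `Muon_charge` or `Electron_charge` in `raw_data`, which the `events` array built above does not include.

```python
def opposite_charge_pairings(charges):
    """Ways to split four leptons into two opposite-charge pairs.

    `charges` is a sequence of four values, each +1 or -1. Returns a list of
    pairings, each a tuple of two (positive_index, negative_index) pairs, or an
    empty list if the group does not have two positive and two negative leptons.
    Hypothetical helper for illustration.
    """
    positive = [i for i, q in enumerate(charges) if q > 0]
    negative = [i for i, q in enumerate(charges) if q < 0]
    if len(charges) != 4 or len(positive) != 2 or len(negative) != 2:
        return []
    p1, p2 = positive
    n1, n2 = negative
    return [((p1, n1), (p2, n2)), ((p1, n2), (p2, n1))]

print(opposite_charge_pairings([+1, -1, +1, -1]))
print(opposite_charge_pairings([+1, +1, +1, -1]))
```

Which of the two pairings to keep (for example, the one with a pair mass closest to the Z mass) is a separate analysis choice that this sketch leaves open.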

<br><br><br><br><br>

### On your own: the H → ZZ → 2μ2e case

<br><br><br><br><br>

# Next stop: break time!

When we get back, we'll look at [part-4.ipynb](part-4.ipynb), but DO NOT go there yet.