# PURSUE Python for HEP: Combinatorics

## Review: Awkward Arrays

* Before we continue, let's make sure we understand how slicing works in Awkward Arrays. Slicing these objects is a generalization of slicing in NumPy.

In [None]:
import awkward as ak

In [None]:
array = ak.Array([[0.0, 1.1, 2.2], [], [3.3, 4.4], [5.5], [6.6, 7.7, 8.8, 9.9]])
array

**Exercise:** Let's go through these together and work out what each one does. One of them fails. Why?

In [None]:
array[2]

In [None]:
array[-1, 1]

In [None]:
array[2:, 0]

In [None]:
array[2:, 1:]

In [None]:
# Why does this one fail?
array[:, 0]

In [None]:
array[[True, False, True, False, True]]

In [None]:
array > 4

In [None]:
array

In [None]:
array[[2, 3, 3, 1]]

In [None]:
ak.num(array)

In [None]:
ak.num(array) > 0

In [None]:
array[ak.num(array) > 0]

In [None]:
array[ak.num(array) > 0][0]

In [None]:
array[ak.num(array) > 0, 0]

In [None]:
array[ak.num(array) > 0][:, 0]

In [None]:
# A jagged array of booleans!
cut = (array * 10 % 2) == 0
cut

In [None]:
array[cut]
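
The slicing patterns above can be mimicked with plain Python lists, which may help when reasoning about what each expression returns. This is a minimal sketch only: the helpers `num`, `mask_events`, and `mask_items` are hypothetical names introduced here for illustration, and this is not how Awkward implements slicing internally.

```python
# Plain-Python lists standing in for the Awkward array used above.
data = [[0.0, 1.1, 2.2], [], [3.3, 4.4], [5.5], [6.6, 7.7, 8.8, 9.9]]


def num(events):
    """Length of each sublist, analogous to ak.num(array)."""
    return [len(event) for event in events]


def mask_events(events, keep):
    """Keep whole sublists where a flat boolean mask is True."""
    return [event for event, flag in zip(events, keep) if flag]


def mask_items(events, jagged_mask):
    """Keep individual items where a jagged boolean mask is True."""
    return [
        [x for x, flag in zip(event, flags) if flag]
        for event, flags in zip(events, jagged_mask)
    ]


# Taking event[0] from every sublist would raise an IndexError on the empty
# one, so the empty sublists are dropped first
# (compare array[ak.num(array) > 0][:, 0]).
non_empty = [n > 0 for n in num(data)]
firsts = [event[0] for event in mask_events(data, non_empty)]

# A jagged mask filters items but keeps one (possibly empty) sublist per event
# (compare array[array > 4]).
above_four = mask_items(data, [[x > 4 for x in event] for event in data])

print(firsts)
print(above_four)
```

A flat mask changes the number of sublists, while a jagged mask only changes the contents of each sublist.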

## Particle Selection

* Run the following cell to load the data we were using before.

In [None]:
import uproot

file = uproot.open(
    "root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root"
)
tree = file["Events"]

In [None]:
muons = tree.arrays(entry_stop=10)

* Before, we selected events by requiring `tree["nMuon"] == 2`. This operation created a flat array of booleans (one per event), which we then applied to our data. But what if we want to require that the $p_T$ of the muons be larger than some threshold? In that case, we need to select by particle rather than by event. We can do this as follows:

In [None]:
muonpt = muons["Muon_pt"]
muonpt

In [None]:
ptcut = muons["Muon_pt"] > 20 # GeV
muonpt[ptcut]

* If we want to select events that have *at least one muon with $p_T$ over 20 GeV*, we can use the function `ak.any`, which applies the OR operator to all of the boolean values in each subarray (or along any specified axis). This gives us a flat array of booleans that we can then use to filter by event.

In [None]:
ptcut

In [None]:
event_cut = ak.any(ptcut, axis=1)
event_cut

In [None]:
muons[event_cut]
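
The step from a per-particle cut to a per-event cut can be sketched with plain Python lists. The $p_T$ values below are made up for illustration (they are not read from the dataset), and the variable names are assumptions of this sketch.

```python
# Made-up pT values in GeV, one sublist per event (not taken from the dataset).
muon_pt = [[10.8, 15.7], [], [23.1], [57.0, 18.2], [3.3]]
threshold = 20.0  # GeV

# Per-particle cut: a jagged list of booleans with the same shape as muon_pt.
pt_cut = [[pt > threshold for pt in event] for event in muon_pt]

# Per-event cut: OR of the booleans within each event, similar in spirit to
# ak.any(ptcut, axis=1). any([]) is False, so an event with no muons fails.
event_cut = [any(flags) for flags in pt_cut]

# Apply the flat mask to keep whole events.
selected = [event for event, keep in zip(muon_pt, event_cut) if keep]

print(event_cut)
print(selected)
```

The per-event mask keeps every muon in a passing event, including those below the threshold; applying `pt_cut` instead would drop individual muons.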

**Exercise:** Create the exact same `event_cut` using `ak.max`. Keep in mind that you will have to use the `axis` argument, similar to how it was done for `ak.any`.

In [None]:
# Answer
event_cut = ak.max(muonpt, axis=1) > 20
muons[event_cut]
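
One caveat when swapping `ak.any` for `ak.max`: an event with no muons has no maximum. By default `ak.max` returns a missing value (`None`) for an empty sublist, so the comparison is also missing rather than `False`, and the two cuts agree only for events with at least one muon. The plain-Python sketch below uses made-up $p_T$ values and treats a missing maximum as failing the cut; in Awkward, wrapping the comparison in `ak.fill_none(..., False)` is one way to get the same effect.

```python
# Made-up pT values in GeV; the second event has no muons.
muon_pt = [[10.8, 15.7], [], [23.1], [57.0, 18.2]]
threshold = 20.0  # GeV

# Leading (maximum) pT per event; None marks an event with no muons.
leading_pt = [max(event, default=None) for event in muon_pt]

# Treat "no maximum" as failing the cut so the mask contains only booleans.
cut_from_max = [pt is not None and pt > threshold for pt in leading_pt]

# The same cut written with any(), for comparison.
cut_from_any = [any(pt > threshold for pt in event) for event in muon_pt]

print(leading_pt)
print(cut_from_max == cut_from_any)
```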

## Combinatorics

* Because we don’t know exactly which process each detected particle originated from (i.e., whether it is part of the signal or background), we have to use combinatorics to consider all possible combinations of particles. This approach, combined with other physical considerations such as conservation of charge and energy, allows us to reconstruct the properties of parent particles, such as their mass, while effectively reducing the influence of background events.
* Awkward offers us two key functions:
  * `ak.cartesian()`: Computes a Cartesian product (i.e. cross product) of data from a set of arrays.
  * `ak.combinations()`: Computes a Cartesian product (i.e. cross product) of `array` with itself that is restricted to combinations sampled without replacement.
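
To make the physics motivation concrete, here is a minimal plain-Python sketch of building dimuon candidates in a single event: form every pair of muons, keep only opposite-charge pairs, and compute each pair's invariant mass. The muon values are made up, the tuple layout and the `pair_mass` helper are hypothetical, and the formula $m^2 = 2\,p_{T,1}\,p_{T,2}\,(\cosh\Delta\eta - \cos\Delta\phi)$ neglects the muon mass.

```python
import math
from itertools import combinations

# Hypothetical event: each muon is (pt [GeV], eta, phi [rad], charge).
# These values are made up for illustration.
event = [
    (45.0, 0.5, 0.1, +1),
    (40.0, -0.3, 3.0, -1),
    (12.0, 1.2, -1.5, +1),
]


def pair_mass(mu1, mu2):
    """Invariant mass of two muons in GeV, neglecting the muon mass."""
    pt1, eta1, phi1, _ = mu1
    pt2, eta2, phi2, _ = mu2
    m2 = 2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


# All unordered pairs, restricted to opposite charge (total charge zero).
candidates = [
    (mu1, mu2, pair_mass(mu1, mu2))
    for mu1, mu2 in combinations(event, 2)
    if mu1[3] + mu2[3] == 0
]

for mu1, mu2, mass in candidates:
    print(f"pt = ({mu1[0]}, {mu2[0]}) GeV -> mass = {mass:.1f} GeV")
```

With three muons there are three possible pairs; the charge requirement removes the same-sign pair, leaving two candidates whose masses can then be histogrammed.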

### `ak.cartesian()`

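* This computes the Cartesian product of one array with another, event by event: every element of a sublist in the first array is paired with every element of the sublist at the same position in the second array.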

In [None]:
numbers = ak.Array([[1, 2, 3], [], [5, 7], [11]])
letters = ak.Array([["a", "b"], ["c"], ["d"], ["e", "f"]])

pairs = ak.cartesian((numbers, letters))
pairs

* To get the first element of each pair, we pass a "string index".

In [None]:
pairs["0"]

In [None]:
pairs["1"]

* Note that this is different from passing an integer index, which selects the first sublist of pairs instead:

In [None]:
pairs[0]

* If we want to separate the first element of each pair into one array and the second element into another (something we will later see is quite useful), we can use `ak.unzip`:

In [None]:
lefts, rights = ak.unzip(pairs)
print(lefts)
print(rights)
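
The per-event pairing and the unzip step can be sketched with `itertools.product` on plain Python lists. This is only an illustration of the semantics under the assumption that sublists at the same position are paired; it is not how Awkward computes the result.

```python
from itertools import product

numbers = [[1, 2, 3], [], [5, 7], [11]]
letters = [["a", "b"], ["c"], ["d"], ["e", "f"]]

# Per-event Cartesian product: pair every number with every letter
# found in the sublist at the same position.
pairs = [list(product(ns, ls)) for ns, ls in zip(numbers, letters)]

# "Unzip": split the pairs into two jagged lists with matching shapes.
lefts = [[left for left, _ in event] for event in pairs]
rights = [[right for _, right in event] for event in pairs]

print(pairs)
print(lefts)
print(rights)
```

The second event yields no pairs because its list of numbers is empty, even though it has a letter.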

### `ak.combinations`

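* This forms, within each sublist, all combinations of the array's elements with each other, without repetition: an element is never paired with itself, and each unordered pair appears only once.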

In [None]:
pairs = ak.combinations(numbers, 2)
pairs

In [None]:
# Because the elements in the sub-arrays line up once they are divided using `ak.unzip`, we can do computations with them
lefts, rights = ak.unzip(pairs)
lefts * rights
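
The same computation can be sketched on plain Python lists with `itertools.combinations`, which also picks items without replacement. This illustrates the semantics only and is not Awkward's implementation.

```python
from itertools import combinations

numbers = [[1, 2, 3], [], [5, 7], [11]]

# All unordered pairs within each event; events with fewer than two
# items produce an empty list.
pairs = [list(combinations(event, 2)) for event in numbers]

# Element-wise product of the two halves of each pair.
products = [[left * right for left, right in event] for event in pairs]

print(pairs)
print(products)
```

Events with fewer than two elements contribute nothing, which is why the single-element event `[11]` gives an empty list.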

* We can change the size of the combinations we allow.

In [None]:
ak.combinations(numbers, 3)
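
The number of combinations in each event follows the binomial coefficient $\binom{n}{k}$, where $n$ is the number of elements in the sublist and $k$ is the combination size. The plain-Python sketch below checks this count against `itertools.combinations`; the helper name `count_combinations` is hypothetical.

```python
import math
from itertools import combinations

numbers = [[1, 2, 3], [], [5, 7], [11]]


def count_combinations(events, size):
    """Expected number of combinations per event: n choose size."""
    return [math.comb(len(event), size) for event in events]


for size in (2, 3):
    explicit = [list(combinations(event, size)) for event in numbers]
    counts = count_combinations(numbers, size)
    # The explicit enumeration and the binomial coefficient must agree.
    assert [len(event) for event in explicit] == counts
    print(size, counts)
```

The count grows quickly with multiplicity, so it can be worth applying particle-level cuts before forming combinations.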