# Example Usage

Start by getting a dummy file:

In [1]:
from ftag import get_dummy_file
fname, f = get_dummy_file()
jets = f['jets']

### Cuts

The `Cuts` class provides an interface for applying selections to structured NumPy arrays loaded from HDF5 files.
To take a look, first import `Cuts`:


In [2]:
from ftag import Cuts

Instances of `Cuts` can be defined from lists of strings, or from tuples of strings and values. For example:

In [3]:
kinematic_cuts = Cuts.from_list(["pt > 20e3", "abs_eta < 2.5"])
flavour_cuts = Cuts.from_list([("HadronConeExclTruthLabelID", "==", 5)])
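
To make the meaning of these cut strings concrete, here is a minimal sketch of the equivalent selection written in plain NumPy. It does not use `ftag` and is not how `Cuts` is implemented; the toy array and its values are made up for illustration.

```python
import numpy as np

# Toy structured array standing in for the `jets` dataset (values are invented).
jets = np.array(
    [(15e3, 0.3, 5), (45e3, 1.2, 5), (60e3, 2.9, 0), (30e3, 0.1, 4)],
    dtype=[("pt", "f4"), ("abs_eta", "f4"), ("HadronConeExclTruthLabelID", "i4")],
)

# "pt > 20e3" and "abs_eta < 2.5" written as one boolean mask
kinematic_mask = (jets["pt"] > 20e3) & (jets["abs_eta"] < 2.5)

# ("HadronConeExclTruthLabelID", "==", 5) written as a boolean mask
flavour_mask = jets["HadronConeExclTruthLabelID"] == 5

print(kinematic_mask.tolist())  # [False, True, False, True]
print(flavour_mask.tolist())    # [True, True, False, False]
```

Each cut corresponds to a comparison on one field of the structured array, and a list of cuts corresponds to the logical AND of those comparisons.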

Cuts can be combined with `+`:

In [4]:
combined_cuts = kinematic_cuts + flavour_cuts

The combined cuts can then be applied to a structured array:

In [5]:
idx, selected_jets = combined_cuts(jets)

Both the selected indices and the selected jets are returned. The indices can be used to reapply the same selection to another array (e.g. tracks). The return values `idx` and `values` can also be accessed directly:

In [6]:
idx = combined_cuts(jets).idx
selected_jets = combined_cuts(jets).values
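
The idea of reusing the indices on a second array can be sketched in plain NumPy. This is an illustration under assumptions, not the `ftag` implementation: the arrays are toy data, the field name `label` is hypothetical, and the tracks are assumed to be stored with one row per jet so that the first axis lines up with the jets.

```python
import numpy as np

# Toy jets: (pt, label); values are invented.
jets = np.array(
    [(15e3, 5), (45e3, 5), (60e3, 0)],
    dtype=[("pt", "f4"), ("label", "i4")],
)

# Toy tracks: 4 track slots per jet, aligned with `jets` on the first axis.
tracks = np.zeros((3, 4), dtype=[("deta", "f4"), ("dphi", "f4")])
tracks["deta"] = np.arange(12).reshape(3, 4)

# Build the selection once on the jets ...
mask = (jets["pt"] > 20e3) & (jets["label"] == 5)
idx = np.where(mask)[0]

# ... then reuse the same indices on both arrays.
selected_jets = jets[idx]
selected_tracks = tracks[idx]

print(idx.tolist())           # [1]
print(selected_tracks.shape)  # (1, 4)
```

Because both arrays are indexed with the same `idx`, the selected tracks stay aligned with the selected jets.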

### Flavours

A list of predefined flavours is provided, and each flavour can be accessed as an attribute of `Flavours`:

In [7]:
from ftag import Flavours
Flavours.bjets

Flavour(name='bjets', label='$b$-jets', cuts=['HadronConeExclTruthLabelID == 5'], colour='#1f77b4')

`dict`-like access is also supported:

In [8]:
Flavours["qcd"]

Flavour(name='qcd', label='QCD', cuts=['R10TruthLabel_R22v1 == 10'], colour='#38761D')

As you can see from the output, each flavour has a `name`, a `label` and `colour` (used for plotting), and a `Cuts` instance, which can be used to select jets of the given flavour.
For example:

In [9]:
bjets = Flavours.bjets.cuts(jets).values

Probability names are also accessible using `.px`:

In [10]:
[f.px for f in Flavours]

['pb', 'pc', 'pu', 'ptau', 'phbb', 'phcc', 'ptop', 'pqcd']
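
The outputs above suggest a simple naming rule for `.px`: `bjets` gives `pb` and `qcd` gives `pqcd`. The sketch below reproduces that pattern with a hypothetical `ToyFlavour` dataclass. The class and the rule are assumptions inferred from the printed values, not the `ftag` source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFlavour:
    """Hypothetical stand-in for a flavour entry; not the ftag class."""

    name: str
    label: str

    @property
    def px(self) -> str:
        # Assumed rule: prefix "p" and drop a trailing "jets" from the name.
        return "p" + self.name.removesuffix("jets")


flavours = [ToyFlavour("bjets", "$b$-jets"), ToyFlavour("qcd", "QCD")]
print([fl.px for fl in flavours])  # ['pb', 'pqcd']
```

Note that `str.removesuffix` requires Python 3.9 or later.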

### H5Reader

`H5Reader` allows batched reading from one or more HDF5 files.
Variables are specified as `dict[str, list[str]]`, mapping each group name to the list of variables to load from it.
By default, the reader randomly accesses chunks in the file, giving you a weakly shuffled stream of jets.
For example, to read three batches of 100 jets:


In [11]:
from ftag import H5Reader

reader = H5Reader(fname, batch_size=100)
stream = reader.stream({"jets": ["pt", "eta"]}, num_jets=300)
for batch in stream:
    print(len(batch["jets"]))

100
100
100
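
What "weakly shuffled" means can be illustrated with a small generator that visits fixed-size chunks in random order. This is a sketch under assumptions, not the `H5Reader` implementation: `stream_chunks` is a hypothetical helper, and an in-memory array stands in for the file.

```python
import numpy as np


def stream_chunks(data, batch_size, num_jets, rng):
    """Yield up to `num_jets` entries, one randomly chosen chunk at a time."""
    starts = np.arange(0, len(data), batch_size)
    rng.shuffle(starts)  # randomise the order in which chunks are visited
    remaining = num_jets
    for start in starts:
        if remaining <= 0:
            break
        # Each batch is a contiguous slice, trimmed to the remaining quota.
        batch = data[start : start + batch_size][:remaining]
        remaining -= len(batch)
        yield batch


data = np.arange(1000)  # stand-in for 1000 jets stored in a file
rng = np.random.default_rng(0)
batches = list(stream_chunks(data, batch_size=100, num_jets=300, rng=rng))
print([len(b) for b in batches])  # [100, 100, 100]
```

The order of the chunks is random, but the entries inside each chunk keep their on-disk order, which is why the shuffling is only weak. Reading contiguous slices is also what keeps this access pattern cheap compared with sampling individual entries.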


To transparently load jets across several files, `fname` can also be a pattern including wildcards (`*`).
Behind the scenes, the files are globbed and merged into a [virtual dataset](https://docs.h5py.org/en/stable/vds.html).
So the following also works:

In [13]:
from pathlib import Path

sample_dir = Path(fname).parent
reader = H5Reader(sample_dir / "*.h5", batch_size=100)

You can read jets and tracks at the same time, and access the relevant group from the `batch` dictionary.

In [20]:
stream = reader.stream({"jets": ["pt", "eta"], "tracks": ["deta", "dphi"]}, num_jets=300)
batch = next(stream)
batch["tracks"].dtype

dtype([('deta', '<f4'), ('dphi', '<f4')])

You can specify cuts to apply to the jets as they are loaded. For example, to stream jets with $p_T > 20$ GeV:

In [26]:
stream = reader.stream({"jets": ["pt"]}, num_jets=300, cuts=Cuts.from_list(["pt > 20e3"]))
batch = next(stream)
assert batch["jets"]["pt"].min() > 20e3

If you are not interested in working with batches 



### H5Writer
