# Practical Considerations, Challenges and Constraints in Applying Deep Learning to High Energy Physics
### aka a biased point of view on why we need to think hard before using neural nets
by Michela Paganini (Yale) and 

### Yielding batches can be a bottleneck

### Feature normalization

In [None]:
# show how training fails if you don't normalize the features
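
As a placeholder for the cell above, here is a minimal sketch of the effect. It is not the final demo: the data are synthetic, the "network" is a plain linear model trained by gradient descent, and `standardize`, `gd_linear_fit`, the learning rate and the feature scales (one unit-scale feature, one feature with values in the tens of thousands, as a momentum in MeV might have) are all assumptions chosen for illustration.

```python
import numpy as np

def standardize(X, eps=1e-8):
    """Shift each feature to zero mean and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps), mean, std

def gd_linear_fit(X, y, lr=0.1, n_steps=200):
    """Fit a linear model with plain gradient descent on the mean squared
    error and return the final loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_steps):
        residual = X.dot(w) + b - y
        w = w - lr * 2.0 * X.T.dot(residual) / len(y)
        b = b - lr * 2.0 * residual.mean()
    return np.mean((X.dot(w) + b - y) ** 2)

rng = np.random.RandomState(0)
# Synthetic features on very different scales (hypothetical, for illustration)
X_raw = np.column_stack([rng.normal(0.0, 1.0, 500),
                         rng.normal(50000.0, 20000.0, 500)])
y = X_raw[:, 0] + X_raw[:, 1] / 20000.0 + rng.normal(0.0, 0.1, 500)

# Same model, same learning rate: raw inputs first, standardized inputs second
with np.errstate(over='ignore', invalid='ignore'):
    loss_raw = gd_linear_fit(X_raw, y)
X_std, mean, std = standardize(X_raw)
loss_std = gd_linear_fit(X_std, y)

print('final loss, raw features:          {}'.format(loss_raw))
print('final loss, standardized features: {}'.format(loss_std))
```

With the raw inputs the large-scale feature dominates the gradient and the updates blow up, while the standardized inputs train without trouble at the same learning rate. In practice the mean and standard deviation should be computed on the training set only and reused for validation and test data.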

### Data in ROOT files

In [None]:
# root_numpy (uproot? others?)

### No natural order in sequence

This section uses the "Jet Flavor dataset" from UCI, available at: http://mlphysics.ics.uci.edu/data/hb_jet_flavor_2016/

> **Description**:
Each line is a data sample. Each sample contains variables related to jet kinematics, tracks, vertex and high level features.
Each track contains 20 variables and each sample has a variable number of tracks.
Each vertex contains 8 variables and each sample has a variable number of vertices.
For each sample there are 14 high level variables.

>The variables can be subdivided into 3 categories; low level, medium level and high level. Each one of these is constructed strictly using information from the previous level so low level < medium level < high level.
The low level variables are the tracks + jet kinematics variables.
The medium level variables are tracks + vertices + jet kinematics variables.
The high level are high level features + jet kinematics variables.


>**Label variable**:

>0: Light Jet (Background)

>4: Charm Jet (Background)

>5: Bottom Jet (Signal)


>**Feature description**

>Each sample looks like this:
`jet_pt, jet_eta, flavor, {high level track variables}, {high level vertex variables},
[{{track_variables}, {track_covariance}, {track_weight}, {vertex_variables}}, …]`.

>*High level variables*:
The high level variables are subdivided as `{{high_level_track_variables}, {high_level_vertex_variables}}`,
where the `{high_level_track_variables}` are:
`track_2_d0_significance, track_3_d0_significance,
track_2_z0_significance, track_3_z0_significance,
n_tracks_over_d0_threshold, jet_prob, jet_width_eta, jet_width_phi`,
and the `{high_level_vertex_variables}` are:
`vertex_significance, n_secondary_vertices, n_secondary_vertex_tracks,
delta_r_vertex, vertex_mass, vertex_energy_fraction`

>The track and vertex variables are organized as
`{{track_variables}, {track_covariance}, {track_weight}, {vertex_variables}}`

>The track variables are given by:
`D0`, `Z0`, `PHI`, `THETA`, `QOVERP`

>Track covariance is given by:
`D0D0`, 
`Z0D0`, `Z0Z0`, 
`PHID0`, `PHIZ0`, `PHIPHI`, 
`THETAD0`, `THETAZ0`, `THETAPHI`, `THETATHETA`,
`QOVERPD0`, `QOVERPZ0`, `QOVERPPHI`, `QOVERPTHETA`, `QOVERPQOVERP`

>The track_weight is related to how strongly the track is associated with the corresponding vertex; a higher value means the track is a better fit to the vertex.

>Vertex variables are taken from the vertex that the track is associated to, and can therefore be identical for several tracks. The variables are as follows: `mass, displacement, delta_eta_jet, delta_phi_jet, displacement_significance, n_tracks, energy_fraction`

I preprocessed the data for you, so now we only have 1000 jets, and each jet is only defined by its track and vertex variables.

In [4]:
import numpy as np

In [27]:
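# The file holds an array of per-jet arrays (object dtype), which recent
# NumPy versions only load when allow_pickle=True is passed explicitly.
data = np.load('data.npy', allow_pickle=True)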

The dataset is a NumPy array with logical dimensions `(n_jets, n_tracks, n_features)`, where `n_jets=1000`, `n_tracks` varies from jet to jet, and `n_features=28`. Because `n_tracks` is variable, the data is stored as a one-dimensional array of per-jet arrays, so its shape prints as `(n_jets,)`.

In [28]:
data.shape

(1000,)

In [29]:
print(data[0].shape, data[1].shape, data[999].shape)

(2, 28) (6, 28) (4, 28)
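
To see where the one-dimensional shape comes from without loading the file, here is a small sketch that builds the same kind of container from synthetic jets. The track counts `(2, 6, 4)` simply mirror the output above, and the random values are stand-ins, not real data.

```python
import numpy as np

rng = np.random.RandomState(0)
n_features = 28
# One 2-D array per jet, each with its own number of tracks
jets = [rng.normal(size=(n_tracks, n_features)) for n_tracks in (2, 6, 4)]

# A ragged collection cannot be a regular 3-D array, so each jet is stored
# as one element of a 1-D array with dtype=object
ragged = np.empty(len(jets), dtype=object)
for i, jet in enumerate(jets):
    ragged[i] = jet

print(ragged.shape)                        # only the jet axis is visible
print([jet.shape for jet in ragged])       # per-jet shapes still differ
n_tracks_per_jet = np.array([len(jet) for jet in ragged])
```

The per-jet track counts in `n_tracks_per_jet` are what any padding or truncation scheme has to deal with before the data can go into a fixed-size network input.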


The 28 variables describing each track are: `D0, Z0, PHI, THETA, QOVERP, D0D0, Z0D0, Z0Z0, PHID0, PHIZ0, PHIPHI, THETAD0, THETAZ0, THETAPHI, THETATHETA, QOVERPD0, QOVERPZ0, QOVERPPHI, QOVERPTHETA, QOVERPQOVERP, track_weight, mass, displacement, delta_eta_jet, delta_phi_jet, displacement_significance, n_tracks, energy_fraction`

We want to classify these jets based on their labels:

In [42]:
y = np.load('labels.npy')
print('Number of entries in y:', len(y))
print('Available flavors:', np.unique(y))

Number of entries in y: 1000
Available flavors: [0 4 5]
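
The flavor codes 0, 4 and 5 are not contiguous, so most classifiers will want them remapped first. Below is a sketch of two possible encodings, using a short synthetic label array as a stand-in for `labels.npy`. Whether to train a binary (bottom vs. everything else) or a three-class model is left open here; both are shown only as options.

```python
import numpy as np

y_example = np.array([0, 4, 5, 5, 0, 4])  # synthetic stand-in for labels.npy

# Binary target: bottom jets (label 5) are signal, light and charm are background
is_bottom = (y_example == 5).astype(int)

# Three-class target: map the flavor codes {0, 4, 5} to indices {0, 1, 2}
flavors, class_index = np.unique(y_example, return_inverse=True)

# One-hot version of the three-class target
one_hot = np.eye(len(flavors), dtype=int)[class_index]

print(is_bottom)
print(class_index)
```

`flavors[class_index]` recovers the original codes, which is handy when turning predictions back into flavor labels.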


In [None]:
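# Option 1: zero-pad, impose an order, feed all tracks to a feed-forward network (FFNN)
# Option 2: zero-pad, impose an order, feed all tracks to a recurrent network (RNN)
# Option 3: Deep Sets (permutation-invariant, so no order is needed)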
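
Until the cell above is filled in, here is a rough sketch of the data handling behind those options, on synthetic jets. The helper names `pad_jets` and `sum_pool`, the choice of sorting by descending absolute value of column 0 (`D0` in the variable list above), and `max_tracks=5` are all assumptions made for illustration. The pooling function shows only the permutation-invariant sum at the heart of a Deep Sets model, not the per-track and post-pooling networks around it.

```python
import numpy as np

def pad_jets(jets, max_tracks, sort_feature=None):
    """Zero-pad (or truncate) every jet to max_tracks tracks.

    Returns (padded, mask): padded has shape (n_jets, max_tracks, n_features)
    and mask is True where a real track is present.
    """
    n_features = jets[0].shape[1]
    padded = np.zeros((len(jets), max_tracks, n_features))
    mask = np.zeros((len(jets), max_tracks), dtype=bool)
    for i, jet in enumerate(jets):
        if sort_feature is not None:
            # Impose an order: descending absolute value of the chosen column
            jet = jet[np.argsort(-np.abs(jet[:, sort_feature]))]
        n = min(len(jet), max_tracks)  # jets with extra tracks are truncated
        padded[i, :n] = jet[:n]
        mask[i, :n] = True
    return padded, mask

def sum_pool(padded, mask):
    """Sum over the real tracks of each jet; the result does not depend on
    the order of the tracks."""
    return (padded * mask[:, :, None]).sum(axis=1)

rng = np.random.RandomState(0)
jets = [rng.normal(size=(n, 28)) for n in (2, 6, 4)]  # synthetic stand-in for `data`

# Padded and ordered inputs: 3-D for an RNN, flattened for a feed-forward net
padded, mask = pad_jets(jets, max_tracks=5, sort_feature=0)
flat = padded.reshape(len(jets), -1)

# Shuffling the tracks changes unsorted padded inputs but not the pooled sum
shuffled = [jet[rng.permutation(len(jet))] for jet in jets]
pooled_original = sum_pool(*pad_jets(jets, max_tracks=6))
pooled_shuffled = sum_pool(*pad_jets(shuffled, max_tracks=6))
```

Sorting makes the padded input reproducible, but the ordering criterion is arbitrary, which is exactly the issue this section is about. The mask matters for recurrent or pooled models, which should ignore the padded rows rather than treat them as tracks with all-zero features.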

### Heavily preprocessed images with constant structure

### Need to decorrelate models from a given feature 

### Learning with noisy labels

### Weights handling

### Picking the right metric

### Issues with GANs

### Domain knowledge in the training

### Rolling convolutions