# Convolutional Neural Network

## CNN2D

### Inputs (3)

|Feature|Description|
|:--:|:--:|
|j1_etarot|rotated eta of each constituent|
|j1_phirot|rotated phi of each constituent|
|j1_ptrel|ratio of the pT of each constituent to the pT of the jet|
|(j_index)|Dropped before training|

MaxParticles: 100

### Labels (5)

Label|Description
:--:|:--:
j_g|Gluon jet
j_q|Light-quark jet
j_w|W-boson
j_z|Z-boson
j_t|Top-quark
(j_index)|Dropped before training

### Preprocessing

    2D feature map in (etarot, phirot), weighted by ptrel
    binning: 40×40; range: [-1.0, 1.0] in etarot, [-0.8, 0.8] in phirot (as set in the code below)
    Each jet is pixelated into an image that is the input to the 2D CNN.
    The jet image can also be used as input to ResNet-50.

### Model structure

    Model: "model"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_1 (InputLayer)         [(None, 40, 40, 1)]       0         
    _________________________________________________________________
    conv1_relu (Conv2D)          (None, 40, 40, 8)         976       
    _________________________________________________________________
    conv2_relu (Conv2D)          (None, 20, 20, 4)         292       
    _________________________________________________________________
    conv3_relu (Conv2D)          (None, 10, 10, 2)         74        
    _________________________________________________________________
    flatten (Flatten)            (None, 200)               0         
    _________________________________________________________________
    dense (Dense)                (None, 32)                6432      
    _________________________________________________________________
    output_softmax (Dense)       (None, 5)                 165       
    =================================================================
    Total params: 7,939
    Trainable params: 7,939
    Non-trainable params: 0
    _________________________________________________________________
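
The parameter counts in the summary can be checked by hand. A Conv2D layer with bias has (kernel height × kernel width × input channels + 1) × filters parameters, and a Dense layer has (inputs + 1) × units. A minimal sketch of that arithmetic follows; the helper names `conv2d_params` and `dense_params` are illustrative and not part of Keras.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a Conv2D layer."""
    kh, kw = kernel
    return (kh * kw * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (inputs + 1) * units


counts = {
    "conv1_relu": conv2d_params((11, 11), in_channels=1, filters=8),
    "conv2_relu": conv2d_params((3, 3), in_channels=8, filters=4),
    "conv3_relu": conv2d_params((3, 3), in_channels=4, filters=2),
    "dense": dense_params(10 * 10 * 2, 32),  # flattened 10x10x2 feature map
    "output_softmax": dense_params(32, 5),
}
total_params = sum(counts.values())
print(counts, total_params)
```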

#### Input Shape: (40, 40, 1)

#### Conv2d Layers (3)

    Kernel Size:            (11,11) + (3,3) + (3,3)
    Strides:                (1,1) + (2,2) + (2,2)
    Number of Filters:      8 + 4 + 2
    Activation function:    ReLU
    Kernel initializer:     he_normal
    Padding:                Same
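
With `same` padding, the spatial output size of a convolution is the input size divided by the stride, rounded up, independent of the kernel size. The sketch below applies that rule to the three strides listed above to recover the 40 → 40 → 20 → 10 progression and the flattened length of 200. The helper name `same_padding_output` is made up for illustration.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a convolution with 'same' padding."""
    return math.ceil(size / stride)


size = 40
sizes = []
for stride in (1, 2, 2):
    size = same_padding_output(size, stride)
    sizes.append(size)

# The last Conv2D layer has 2 filters, so Flatten sees size * size * 2 values
flattened = sizes[-1] * sizes[-1] * 2
print(sizes, flattened)
```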

#### Flatten Layers (1)

#### Dense Layers (1)

    Perceptrons:            32
    Activation function:    ReLU

#### Output layer (1)

    Output:                 5-class Classification
    Activation function:    Softmax
    Kernel initializer:     lecun_uniform

##### Learning rate: 1e-4

##### Optimizer: Adam

##### Loss function: categorical_crossentropy
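
For reference, categorical cross-entropy on softmax outputs is the mean over jets of −Σ y·log(p), where y is the one-hot label and p the predicted probabilities. The following is a minimal NumPy sketch of that formula, not Keras's implementation; the function names, the clipping constant `eps`, and the toy numbers are assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0), then average -sum(y * log(p)) over samples
    y_prob = np.clip(y_prob, eps, 1.0)
    return -np.sum(y_true * np.log(y_prob), axis=1).mean()


# Two toy "jets" with 5 class scores each
logits = np.array([[2.0, 0.5, 0.1, 0.1, 0.1],
                   [0.0, 0.0, 0.0, 0.0, 0.0]])
y_true = np.array([[1, 0, 0, 0, 0],
                   [0, 0, 0, 1, 0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1), loss)
```

A uniform prediction over five classes, as in the second row, contributes log(5) to the loss regardless of the true class.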

##### Metrics: Accuracy

Welcome to the CNN2D tutorial. The input to a 2D CNN is a 2D image, which is simply a 2D array with one value per pixel, so we will preprocess each jet into such an array.
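
To make the idea concrete, here is a minimal sketch that turns one jet into a 40×40 image with `np.histogram2d`. The constituents are random numbers rather than real data, and the common range of [-0.8, 0.8] on both axes is an assumption chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for one jet's constituents (not real data)
n_constituents = 30
etarot = rng.normal(0.0, 0.2, n_constituents)
phirot = rng.normal(0.0, 0.2, n_constituents)
ptrel = rng.random(n_constituents)
ptrel /= ptrel.sum()  # relative pT sums to 1 over the jet

# 40 bins need 41 edges
edges = np.linspace(-0.8, 0.8, 41)

# Each pixel holds the summed ptrel of the constituents falling into it
image, _, _ = np.histogram2d(etarot, phirot, bins=(edges, edges), weights=ptrel)
print(image.shape)
```

Constituents that fall outside the chosen range are dropped by the histogram, so the pixel sum can be smaller than the total `ptrel`.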

In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Flatten, Conv2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt
import h5py
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, auc
from tqdm import tqdm

## Data Preprocessing

As before, we first load the features and labels we need.

In [None]:
# To use one data file:
h5File = h5py.File('data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_withPars_truth_0.z', 'r')
treeArray = h5File['t_allpar_new'][()]

h5File.close()

print(treeArray.shape)

# List of features to use
features = ['j1_etarot', 'j1_phirot', 'j1_ptrel', 'j_index']

# List of labels to use
labels = ['j_g', 'j_q', 'j_w', 'j_z', 'j_t', 'j_index']

# Convert to dataframe
features_labels_df = pd.DataFrame(treeArray,columns=list(set(features+labels)))
features_labels_df = features_labels_df.drop_duplicates()

features_df = features_labels_df[features]
labels_df = features_labels_df[labels]
labels_df = labels_df.drop_duplicates()

# Convert to numpy array 
features_val = features_df.values
labels_val = labels_df.values     

if 'j_index' in features:
    features_val = features_val[:,:-1] # drop the j_index feature
if 'j_index' in labels:
    labels_val = labels_val[:,:-1] # drop the j_index label
    print(labels_val.shape)
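
The role of `drop_duplicates()` on the labels is easier to see on a toy table. The sketch below assumes, as the code above suggests, that the array holds one row per constituent, so a jet's label columns repeat on every one of its rows. The values are invented for illustration.

```python
import pandas as pd

# Toy table: jet 0 has three constituents, jet 1 has two
toy = pd.DataFrame({
    "j_index":  [0, 0, 0, 1, 1],
    "j1_ptrel": [0.5, 0.3, 0.2, 0.6, 0.4],
    "j_g":      [1, 1, 1, 0, 0],
    "j_q":      [0, 0, 0, 1, 1],
})

# Label columns are identical within a jet, so dropping duplicates
# leaves exactly one label row per jet
toy_labels = toy[["j_g", "j_q", "j_index"]].drop_duplicates()
print(len(toy), len(toy_labels))
```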

Here we convert the constituents of each jet into a 40x40 2D array (a jet image).

In [None]:
BinsX = 40
MinX = -0.8
MaxX = 0.8
BinsY = 40
MinY = -1.0
MaxY = 1.0
# Bin edges are the same for every jet, so compute them once
xbins = np.linspace(MinX,MaxX,BinsX+1)
ybins = np.linspace(MinY,MaxY,BinsY+1)

features_2dval = np.zeros((len(labels_df), BinsX, BinsY, 1))
for i in tqdm(range(0, len(labels_df))):
    # Select the constituents that belong to jet i
    features_df_i = features_df[features_df['j_index']==labels_df['j_index'].iloc[i]]

    x = features_df_i[features[1]]  # j1_phirot
    y = features_df_i[features[0]]  # j1_etarot
    w = features_df_i[features[2]]  # j1_ptrel

    # The ptrel-weighted 2D histogram is the jet image
    hist, xedges, yedges = np.histogram2d(x, y, weights=w, bins=(xbins,ybins))
    features_2dval[i,:,:,0] = hist
features_val = features_2dval
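
The loop above filters the full DataFrame once per jet, which gets slow for many jets. One possible alternative is to let pandas group the constituents by `j_index` in a single pass. This is a sketch on toy data, not the tutorial's method, and it assumes the groups come out in the same first-appearance order as the label rows.

```python
import numpy as np
import pandas as pd

# Toy constituents for two jets (values invented for illustration)
toy = pd.DataFrame({
    "j_index":   [7, 7, 7, 9, 9],
    "j1_etarot": [0.0, 0.1, -0.2, 0.05, 0.3],
    "j1_phirot": [0.0, -0.1, 0.2, 0.0, -0.3],
    "j1_ptrel":  [0.5, 0.3, 0.2, 0.7, 0.3],
})

xbins = np.linspace(-0.8, 0.8, 41)  # phirot edges
ybins = np.linspace(-1.0, 1.0, 41)  # etarot edges

# sort=False keeps jets in order of first appearance
groups = toy.groupby("j_index", sort=False)
images = np.zeros((groups.ngroups, 40, 40, 1))
for i, (_, jet) in enumerate(groups):
    hist, _, _ = np.histogram2d(jet["j1_phirot"], jet["j1_etarot"],
                                weights=jet["j1_ptrel"], bins=(xbins, ybins))
    images[i, :, :, 0] = hist
print(images.shape)
```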

Then we split the data into training and test sets with train_test_split, as before.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(features_val, labels_val, test_size=0.2, random_state=42)
 
if 'j_index' in labels:
    labels = labels[:-1]

## Training

Notice that, compared with the 1D convolution, the kernel size and strides are now 2D tuples rather than single values, and the input shape is the image resolution with a single channel: (40, 40, 1).

In [None]:
Inputs = Input(shape=(40, 40, 1,))
x = Conv2D(filters=8, kernel_size=(11,11), strides=(1,1), padding='same',
           kernel_initializer='he_normal', use_bias=True, name='conv1_relu',
           activation = 'relu')(Inputs)
x = Conv2D(filters=4, kernel_size=(3,3), strides=(2,2), padding='same',
           kernel_initializer='he_normal', use_bias=True, name='conv2_relu',
           activation = 'relu')(x)
x = Conv2D(filters=2, kernel_size=(3,3), strides=(2,2), padding='same',
           kernel_initializer='he_normal', use_bias=True, name='conv3_relu',
           activation = 'relu')(x)
x = Flatten()(x)
x = Dense(32, activation='relu')(x)
predictions = Dense(5, activation='softmax', kernel_initializer='lecun_uniform', name='output_softmax')(x)
model = Model(inputs=Inputs, outputs=predictions)
model.summary()  # prints the summary itself; wrapping it in print() adds a stray "None"

In [None]:
adam = Adam(learning_rate=0.0001)  # 'lr' is deprecated in favor of 'learning_rate'
model.compile(optimizer=adam, loss=['categorical_crossentropy'], metrics=['accuracy'])

In [None]:
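# use_multiprocessing/workers only applied to generator inputs and are
# not accepted by newer Keras versions, so they are omitted for NumPy arrays
history = model.fit(X_train, y_train, batch_size = 1024, epochs = 100,
                    validation_split = 0.25, shuffle = True, callbacks = None)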

## Evaluation

In [None]:
def learningCurveLoss(history):
    plt.figure()
    plt.plot(history.history['loss'], linewidth=1)
    plt.plot(history.history['val_loss'], linewidth=1)
    plt.title('Model Loss over Epochs')
    plt.legend(['training sample loss','validation sample loss'])
    plt.xlabel('epoch')
    plt.ylabel('loss')
    plt.show()

In [None]:
learningCurveLoss(history)

In [None]:
def makeRoc(features_val, labels_val, labels, model, outputSuffix=''):
    labels_pred = model.predict(features_val)
    df = pd.DataFrame()
    fpr = {}
    tpr = {}
    auc1 = {}
    plt.figure()       
    for i, label in enumerate(labels):
        df[label] = labels_val[:,i]
        df[label + '_pred'] = labels_pred[:,i]
        fpr[label], tpr[label], threshold = roc_curve(df[label],df[label+'_pred'])
        auc1[label] = auc(fpr[label], tpr[label])
        plt.plot(fpr[label],tpr[label],label='%s tagger, AUC = %.1f%%'%(label.replace('j_',''),auc1[label]*100.))
    plt.xlabel("Background Efficiency")
    plt.ylabel("Signal Efficiency")
    plt.xlim([-0.05, 1.05])
    plt.ylim(0.001,1.05)
    plt.grid(True)
    plt.legend(loc='lower right')
    plt.title('%s ROC Curve'%(outputSuffix))
    plt.savefig('%s_ROC_Curve.png'%(outputSuffix))
    return labels_pred
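
Each curve drawn by `makeRoc` treats one class as signal and the remaining classes as background. As a minimal sketch of what a single point on such a curve means, the code below computes the signal and background efficiency for one class at one threshold. The three-class toy scores and the helper name `efficiencies` are assumptions for illustration only.

```python
import numpy as np

# Toy one-hot labels and predicted scores for 4 jets and 3 classes
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])
y_score = np.array([[0.7, 0.2, 0.1],
                    [0.4, 0.5, 0.1],
                    [0.3, 0.6, 0.1],
                    [0.2, 0.2, 0.6]])


def efficiencies(y_true, y_score, class_idx, threshold):
    """One-vs-rest signal (TPR) and background (FPR) efficiency at a threshold."""
    is_signal = y_true[:, class_idx] == 1
    selected = y_score[:, class_idx] >= threshold
    tpr = selected[is_signal].mean()   # fraction of signal jets kept
    fpr = selected[~is_signal].mean()  # fraction of background jets kept
    return tpr, fpr


tpr, fpr = efficiencies(y_true, y_score, class_idx=0, threshold=0.35)
print(tpr, fpr)
```

Scanning the threshold from 1 down to 0 traces out the full curve that `roc_curve` returns.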

In [None]:
y_pred = makeRoc(X_test, y_test, labels, model, outputSuffix='Conv2d')

### Exercise

Try other resolutions, such as 80x80, and see whether the performance improves.

In [None]:
# TODO
