# NN Training

In this example notebook, we use the `wvz_ml_framework` module to:

1) Load data in the 4l-DF signal region.

2) Train a neural network to pick out signal events.

## Load data

A single utility function already loads the data and generates the training, test, and validation sets for NN training.

In [3]:
import sys
sys.path.append('../')

from wvz_ml_framework.nn_training import data_management

We must specify the data paths, the training features, and the file used for rescaling. We can specify the training features separately from the features to be rescaled, which is useful if we have a feature that we don't want to rescale.

If we don't already have a file to be used for rescaling, we can generate one first:

In [43]:
data_paths = {
    'Signal': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_VVZ.arrow',
    'Other': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_others.arrow',
    'ttZ': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_ttZ.arrow',
    'tWZ': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_tWZ.arrow',
    'tZ': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_tZ.arrow',
    'WZ': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_WZ.arrow',
    'Zgamma': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_Zgamma.arrow',
    'Zjets': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_Zjets.arrow',
    'ZZ': '/home/grabanal/WVZ/gabriel_ML_data/20220301_ELReLMIs54_MUReLMIs31_btag77_ZZ.arrow'
}

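# Read one feature name per line, skipping any blank lines.
with open('training_features.txt', 'r') as file:
    training_features = [line.strip() for line in file if line.strip()]

# Rescale every training feature except 'SR'.
rescale_features = [feat for feat in training_features if feat not in ['SR']]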

data_management.generate_scale_params_file(data_paths, rescale_features, 'rescaling_parameters.json')
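
The source does not show what `generate_scale_params_file` writes. As a rough sketch only: the rescaled values shown later in this notebook fall between 0 and 1, which would be consistent with min-max scaling, so the example below assumes that. The helper names (`compute_minmax_params`, `rescale_row`), the JSON layout, and the toy events are all hypothetical and are not taken from `wvz_ml_framework`.

```python
import json
import os
import tempfile


def compute_minmax_params(rows, features):
    """Return {feature: {"min": ..., "max": ...}} over all rows (hypothetical layout)."""
    params = {}
    for feat in features:
        values = [row[feat] for row in rows]
        params[feat] = {"min": min(values), "max": max(values)}
    return params


def rescale_row(row, params):
    """Map each feature that has parameters to [0, 1]; other features pass through."""
    scaled = dict(row)
    for feat, p in params.items():
        span = p["max"] - p["min"]
        # Guard against constant features to avoid dividing by zero.
        scaled[feat] = (row[feat] - p["min"]) / span if span else 0.0
    return scaled


# Toy events; 'SR' is left out of the rescaled features, as in the cell above.
events = [
    {"MET": 20.0, "Njet": 0, "SR": 2},
    {"MET": 120.0, "Njet": 4, "SR": 2},
    {"MET": 70.0, "Njet": 1, "SR": 1},
]
params = compute_minmax_params(events, ["MET", "Njet"])

# Write the parameters to a JSON file and read them back, as a stand-in
# for the rescaling file used in this notebook.
path = os.path.join(tempfile.mkdtemp(), "toy_rescaling_parameters.json")
with open(path, "w") as f:
    json.dump(params, f, indent=2)
with open(path) as f:
    loaded = json.load(f)

scaled_events = [rescale_row(ev, loaded) for ev in events]
print(scaled_events[2])
```

Storing the parameters in a file means the same scaling can be reapplied later, for example when evaluating a trained model on new data.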

Now we can load the data for NN training:

In [52]:
x_train, y_train, w_train, x_test, y_test, w_test, x_val, y_val, w_val \
    = data_management.get_train_test_val_data(data_paths=data_paths, 
                                              train_feats=training_features,
                                              sr_to_train='DF',
                                              test_prop=0.2,
                                              val_prop=0.1,
                                              rescale_filepath='rescaling_parameters.json',
                                              rescale_feats=rescale_features
                                             )

Data loaded...
Data scaled...
Data cut down to DF signal region...
Splits generated... Finished.
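
For intuition about what `test_prop` and `val_prop` control, here is a minimal sketch of a shuffled three-way split. It is not the implementation of `get_train_test_val_data`, which the source does not show; the function name `train_test_val_split`, the seed, and the rounding rule are assumptions made for illustration.

```python
import random


def train_test_val_split(items, test_prop, val_prop, seed=0):
    """Shuffle items and cut them into train, test, and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility

    n_test = round(len(shuffled) * test_prop)
    n_val = round(len(shuffled) * val_prop)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]  # everything left over is training data
    return train, test, val


events = list(range(1000))  # stand-in for event rows
train, test, val = train_test_val_split(events, test_prop=0.2, val_prop=0.1)
print(len(train), len(test), len(val))  # 700 200 100
```

The training set receives whatever remains after the test and validation sets are carved out, so here it is 70% of the events.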


We can verify that the training data has been scaled appropriately:

In [47]:
x_train.head()

| Unnamed: 0 | HT | MET | METPhi | METSig | Njet | Nlep | SR | Wlep1_ambiguous | Wlep1_dphi | Wlep1_eta | ... | phi_1 | phi_2 | phi_3 | phi_4 | pt_1 | pt_2 | pt_3 | pt_4 | pt_4l | total_HT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 69008 | 0.0 | 4e-05 | 0.820966 | 0.018378 | 0.0 | 0.0 | 2 | 0.0 | 0.914435 | 0.518686 | ... | 0.214376 | 0.634932 | 0.620066 | 0.900982 | 0.000101 | 0.119961 | 0.187219 | 0.132414 | 4e-05 | 0.000195 |
| 69295 | 0.0 | 1.1e-05 | 0.800249 | 0.00707 | 0.0 | 0.0 | 2 | 0.0 | 0.989696 | 0.749059 | ... | 0.66851 | 0.222197 | 0.262986 | 0.859758 | 2.3e-05 | 0.062273 | 0.078476 | 0.067732 | 9e-06 | 6.7e-05 |
| 53637 | 0.062606 | 7e-06 | 0.584074 | 0.002546 | 0.107143 | 0.333333 | 2 | 0.0 | 0.868365 | 0.186055 | ... | 0.341462 | 0.6427 | 0.400171 | 0.838771 | 5.3e-05 | 0.046632 | 0.064664 | 0.081993 | 6e-05 | 0.000176 |
| 103645 | 0.118793 | 4.3e-05 | 0.373225 | 0.012539 | 0.035714 | 0.0 | 2 | 0.0 | 0.198775 | 0.312964 | ... | 0.422225 | 0.488499 | 0.740492 | 0.311498 | 8.4e-05 | 0.0751 | 0.154655 | 0.037449 | 0.00012 | 0.000304 |
| 40007 | 0.0 | 2.9e-05 | 0.923061 | 0.019082 | 0.0 | 0.0 | 2 | 0.0 | 0.834832 | 0.502581 | ... | 0.525657 | 0.170426 | 0.968463 | 0.432287 | 2.9e-05 | 0.048493 | 0.059724 | 0.07149 | 2.9e-05 | 6.6e-05 |


And that the datasets are the correct size:

In [56]:
total_size = len(x_train) + len(x_test) + len(x_val)

print('Training proportion: %.2f'%(len(x_train) / total_size))
print('Test proportion: %.2f'%(len(x_test) / total_size))
print('Validation proportion: %.2f'%(len(x_val) / total_size))

Training proportion: 0.70
Test proportion: 0.20
Validation proportion: 0.10


## Train neural network

The `nn_training` module provides a utility for training the neural networks that we have been using.

In [48]:
from wvz_ml_framework.nn_training import nn_training

2022-05-26 18:01:35.314121: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1


The utility trains a 3-layer model with a specified number of nodes and dropout rate per layer, using the Adam optimizer. It takes the training and validation data along with the hyperparameters, and saves the trained model to the specified folder.

The training and validation data must each be passed as a tuple of the form (features, labels, weights).
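
To illustrate why each tuple carries weights, the sketch below shows one common way per-event weights enter a binary cross-entropy loss: each event's loss is multiplied by its weight before averaging. Whether `make_and_train_model` uses the weights in exactly this way is an assumption, as is the normalization by the number of events. The function `weighted_binary_crossentropy` and the toy numbers are hypothetical.

```python
import math


def weighted_binary_crossentropy(labels, predictions, weights, eps=1e-7):
    """Average of per-event binary cross-entropy, each term scaled by its weight."""
    total = 0.0
    for y, p, w in zip(labels, predictions, weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        per_event = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += w * per_event
    # Assumed normalization: divide by the number of events.
    return total / len(labels)


labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.6, 0.7]

unit = weighted_binary_crossentropy(labels, predictions, [1.0, 1.0, 1.0, 1.0])
# Down-weighting the badly predicted last event reduces its pull on the loss.
downweighted = weighted_binary_crossentropy(labels, predictions, [1.0, 1.0, 1.0, 0.1])
print(unit, downweighted)
```

With weighting of this kind, events with small weights contribute little to the loss, so the overall loss value can be much smaller than an unweighted cross-entropy.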

In [57]:
nn_training.make_and_train_model(
    training_data=(x_train, y_train, w_train),
    validation_data=(x_val, y_val, w_val),
    batch_size=512,
    num_nodes=64,
    dropout=0.1,
    learn_rate=1e-4,
    epochs=15,
    patience=5,
    model_dir='models/',
    model_name='example'
)

2022-05-26 18:07:28.580228: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-05-26 18:07:28.583597: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-05-26 18:07:28.710857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:65:00.0 name: Quadro RTX 4000 computeCapability: 7.5
coreClock: 1.545GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 387.49GiB/s
2022-05-26 18:07:28.710954: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
2022-05-26 18:07:28.715748: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2022-05-26 18:07:28.715852: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10

Epoch 1/15
 1/76 [..............................] - ETA: 49s - loss: 0.0027 - accuracy: 0.4766

2022-05-26 18:07:33.851028: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10


Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15


2022-05-26 18:07:39.010525: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.


INFO:tensorflow:Assets written to: models/example/assets
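
The log above ends at epoch 13 of the 15 requested. With `patience=5`, that would be consistent with early stopping on a validation metric, although the source does not confirm how `make_and_train_model` uses `patience`. The sketch below shows the usual convention under that assumption; `run_with_early_stopping` and the validation losses are made up for illustration.

```python
def run_with_early_stopping(val_losses, patience):
    """Return (last epoch run, best loss), stopping after `patience` epochs without improvement."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best  # stop early
    return len(val_losses), best  # ran all epochs


# Made-up validation losses: the best value occurs at epoch 3.
toy_val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.39, 0.30]
stopped_at, best_loss = run_with_early_stopping(toy_val_losses, patience=5)
print(stopped_at, best_loss)  # 8 0.35
```

In this toy run, training stops at epoch 8, five epochs after the last improvement, and never reaches the lower loss at epoch 9. A larger `patience` trades longer training for a lower chance of stopping too soon.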
