## Summary of notebook:

This notebook shows the tuning process to obtain the optimal model architecture and hyperparameters for the MPRI model using the **keras-tuner** package.

Install the package from a terminal: ```pip install keras-tuner```

Key details of the tuning process:
- Dataset used: Dataset augmented with Gaussian noise (augmented_features_10_ue1_v2_ds.npy and augmented_labels_10_ue1_v2_ds.npy)
- Tuner: RandomSearch
- Max trials: 50

Parameters to be tuned:
- mpriupperhalf_filterno: Number of filters in the 1x1 bottleneck convolution at the start of the upper half of the MPRI module (32, 64 or 128)
- output_convno: Number of filters in the convolution layer of the output module (32, 64 or 128)
- fc_dropout: Whether to add a Dropout layer (rate = 0.5) just before the softmax output layer to reduce overfitting
- learning_rate: Learning rate of the Adam optimiser (0.01, 0.001 or 0.0001)
- pooling_dropout: Whether to add a Dropout layer (rate = 0.2) after every MaxPool2D layer to reduce overfitting
- l2_conv2d: L2 regularisation factor on the weights of the Conv2D layers to reduce overfitting (0.0 or 0.01)
- batch_size: Batch size used during training (16, 32 or 64)
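
As a rough illustration of what the `l2_conv2d` factor controls: an L2 kernel regulariser adds `factor * sum(w**2)` over a layer's weights to the training loss, so a factor of 0.0 switches the penalty off. The sketch below is plain Python with made-up weights (`toy_kernel` and `l2_penalty` are hypothetical names, not part of the MPRI model).

```python
def l2_penalty(weights, factor):
    """Return factor * sum of squared weights, the term an L2 regulariser adds to the loss."""
    return factor * sum(w * w for w in weights)


# Hypothetical flattened kernel values, for illustration only.
toy_kernel = [0.5, -1.0, 2.0]

# The two candidate factors searched in this notebook.
penalty_off = l2_penalty(toy_kernel, 0.0)
penalty_on = l2_penalty(toy_kernel, 0.01)

print(f"l2_conv2d = 0.0  -> penalty {penalty_off}")
print(f"l2_conv2d = 0.01 -> penalty {penalty_on:.4f}")
```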

Optimal hyperparameters (trial 20, val_loss of about 0.174):
- mpriupperhalf_filterno: 128
- output_convno: 128
- fc_dropout: True
- learning_rate: 0.0001
- pooling_dropout: False
- l2_conv2d: 0.0
- batch_size: 64

In [4]:
# Configure and test GPU
import tensorflow as tf
from tensorflow.python.client import device_lib

# Prevent automatic GPU memory pre-allocation
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    print(gpu)
    tf.config.experimental.set_memory_growth(gpu, True)

print(tf.__version__)
# print(device_lib.list_local_devices())

PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')
2.9.1


In [9]:
# Whole network composed of 63 layers, approximately 2.6m total no. of parameters
import numpy as np
import matplotlib.pyplot as plt
import os

import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Conv2D, BatchNormalization, ReLU, MaxPool2D,\
                                    GlobalAvgPool2D, Dense, Add, Concatenate, Input,\
                                    Dropout
from tensorflow.keras import Model

# Note: tf version 2.9.1 does not have Identity layer. Implement our own identity layer which is argument insensitive
# and returns its inputs argument as output

In [12]:
import keras_tuner as kt

class HyperModel(kt.HyperModel):
    
    def build(self, hp):
        
        # Hyperparameters that should be tuned
        mpriupperhalf_convno = hp.Choice('mpriupperhalf_filterno', [32, 64, 128])
        output_convno = hp.Choice('output_convno', [32, 64, 128])
        fc_dropout = hp.Boolean('fc_dropout', default = False)
        lr = hp.Choice('learning_rate', [0.01, 0.001, 0.0001])

        # Hyperparameters that may be dropped from the search later
        pooling_dropout = hp.Boolean('pooling_dropout', default = False)
        l2_conv2d = hp.Choice('l2_conv2d', [0.0, 0.01])

        # The nested functions below read mpriupperhalf_convno, output_convno,
        # fc_dropout, pooling_dropout and l2_conv2d from this scope; lr is used at compile time
        def input_module(x):

            # Set no. of filters to 256 to match the output of Add layer at the end of
            # upper half of MPRI module
            x = Conv2D(filters = 256, kernel_size = 3, strides = 1,
                       padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(x)
            x = BatchNormalization()(x)
            x = ReLU()(x)

            # strides = 2 halves the spatial dimensions here; the stride-1 variant
            # tried earlier to match output shapes is no longer used
            x = MaxPool2D(pool_size= 3, strides = 2, padding = 'same')(x)
            
            if pooling_dropout:
                x = Dropout(rate = 0.2)(x)
            return x
        
        def mpri_upperhalf(x):

            # Save input as another variable since need to add input of mpri
            # with output of mpri
            input_tensor = x

            # Bottleneck layer with 1x1 conv filter
            bottlenecked_tensor = Conv2D(filters = mpriupperhalf_convno, kernel_size = 1, strides = 1,
                                         padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(x)

            # First path
            firstpath_tensor = BatchNormalization()(bottlenecked_tensor)
            firstpath_tensor = ReLU()(firstpath_tensor)
            firstpath_tensor = Conv2D(filters = 64, kernel_size = 1, strides = 1,
                                      padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(firstpath_tensor)

            # Second path
            secondpath_tensor = BatchNormalization()(bottlenecked_tensor)
            secondpath_tensor = ReLU()(secondpath_tensor)
            secondpath_tensor = Conv2D(filters = 32, kernel_size = (5,1), strides = 1,
                                       padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(secondpath_tensor)
            secondpath_tensor = Conv2D(filters = 32, kernel_size = (1,3), strides = 1,
                                       padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(secondpath_tensor)
            
            # Third path
            # Normally, strides = 2 to reduce the dimensions of the input
            # In this case, experiment with strides = 1 to fit desired output shape for concatenation layer
            thirdpath_tensor = MaxPool2D(pool_size = 3, strides = 1, padding = 'same')(bottlenecked_tensor)
            
            if pooling_dropout:
                thirdpath_tensor = Dropout(rate = 0.2)(thirdpath_tensor)
            
            thirdpath_tensor = BatchNormalization()(thirdpath_tensor)
            thirdpath_tensor = ReLU()(thirdpath_tensor)
            thirdpath_tensor = Conv2D(filters = 32, kernel_size = 3, strides = 1,
                                      padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(thirdpath_tensor)
            
            # Fourth path
            fourthpath_tensor = BatchNormalization()(bottlenecked_tensor)
            fourthpath_tensor = ReLU()(fourthpath_tensor)
            fourthpath_tensor = Conv2D(filters = 32, kernel_size = 1, strides = 1,
                                       padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(fourthpath_tensor)

            fourthpath_tensor = BatchNormalization()(fourthpath_tensor)
            fourthpath_tensor = ReLU()(fourthpath_tensor)
            fourthpath_tensor = Conv2D(filters = 128, kernel_size = 1, strides = 1,
                                       padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(fourthpath_tensor)
            
            # Depth concatenate the output from the four paths
            concatenated_tensor = Concatenate()([firstpath_tensor, secondpath_tensor, thirdpath_tensor, fourthpath_tensor])

            # Add the depth concatenated layer and input tensor
            # To add successfully, input tensor must have 256 channels as well to match the shape of
            # the concatenated tensor
            output_tensor = Add()([input_tensor, concatenated_tensor])

            return output_tensor
        
        def mpri_lowerhalf(x):

            def conv3x3_block(x):
                x = BatchNormalization()(x)
                x = ReLU()(x)
                x = Conv2D(filters = 256, kernel_size = 3, strides = 1,
                           padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(x)
                return x

            def conv1x1_block(x):
                x = BatchNormalization()(x)
                x = ReLU()(x)
                x = Conv2D(filters = 256, kernel_size = 1, strides = 1,
                           padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(x)
                return x

            # --- First layer ---
            upperpath_pooledtensor = MaxPool2D(pool_size = 3, strides = 2, padding = 'same')(x)
            if pooling_dropout:
                upperpath_pooledtensor = Dropout(rate = 0.2)(upperpath_pooledtensor)
        
            upperpath_tensor = conv3x3_block(upperpath_pooledtensor)

            lowerpath_pooledtensor = MaxPool2D(pool_size = 3, strides = 2, padding = 'same')(x)
            if pooling_dropout:
                lowerpath_pooledtensor = Dropout(rate = 0.2)(lowerpath_pooledtensor)
            
            lowerpath_tensor = conv1x1_block(lowerpath_pooledtensor)

            upperpath_tensor = Add()([upperpath_pooledtensor, upperpath_tensor, lowerpath_tensor])
            lowerpath_tensor = Add()([lowerpath_pooledtensor, lowerpath_tensor, upperpath_tensor])

            # --- Second layer ---
            upperpath_pooledtensor = MaxPool2D(pool_size = 3, strides = 2, padding = 'same')(upperpath_tensor)
            if pooling_dropout:
                upperpath_pooledtensor = Dropout(rate = 0.2)(upperpath_pooledtensor)
                
            upperpath_tensor = conv3x3_block(upperpath_pooledtensor)

            lowerpath_pooledtensor = MaxPool2D(pool_size = 3, strides = 2, padding = 'same')(lowerpath_tensor)
            if pooling_dropout:
                lowerpath_pooledtensor = Dropout(rate = 0.2)(lowerpath_pooledtensor)
                
            lowerpath_tensor = conv1x1_block(lowerpath_pooledtensor)

            upperpath_tensor = Add()([upperpath_pooledtensor, upperpath_tensor, lowerpath_tensor])
            lowerpath_tensor = Add()([lowerpath_pooledtensor, lowerpath_tensor, upperpath_tensor])

            # --- Third layer ---
            upperpath_pooledtensor = MaxPool2D(pool_size = 3, strides = 2, padding = 'same')(upperpath_tensor)
            if pooling_dropout:
                upperpath_pooledtensor = Dropout(rate = 0.2)(upperpath_pooledtensor)
            upperpath_tensor = conv3x3_block(upperpath_pooledtensor)

            lowerpath_pooledtensor = MaxPool2D(pool_size = 3, strides = 2, padding = 'same')(lowerpath_tensor)
            if pooling_dropout:
                lowerpath_pooledtensor = Dropout(rate = 0.2)(lowerpath_pooledtensor)
            lowerpath_tensor = conv1x1_block(lowerpath_pooledtensor)

            upperpath_tensor = Add()([upperpath_pooledtensor, upperpath_tensor, lowerpath_tensor])
            lowerpath_tensor = Add()([lowerpath_pooledtensor, lowerpath_tensor, upperpath_tensor])

            # Final layer - Add upper and lower path tensors
            output_tensor = Add()([upperpath_tensor, lowerpath_tensor])

            return output_tensor
        
        def output_module(x, num_classes = 1000):
            
            x = Conv2D(filters = output_convno, kernel_size = 3, strides = 1,
                       padding = 'same', kernel_regularizer = keras.regularizers.L2(l2_conv2d))(x)
            x = BatchNormalization()(x)
            x = ReLU()(x)
            x = GlobalAvgPool2D()(x)
            if fc_dropout:
                x = Dropout(rate = 0.5)(x)
            x = Dense(units = num_classes, activation = 'softmax')(x)

            return x
        
        model_input = Input(shape = (193,16,1))
        model_output = output_module(mpri_lowerhalf(mpri_upperhalf(input_module(model_input))), num_classes = 3876)
        mpri_model = Model(model_input, model_output)
        
        optimizer = tf.keras.optimizers.Adam(lr)
        mpri_model.compile(optimizer = optimizer,
                      loss = tf.keras.losses.SparseCategoricalCrossentropy(),
                      metrics = ['accuracy'])
   
        # mpri_model.summary()
    
        return mpri_model
    
    def fit(self, hp, model, X_train, y_train, validation_data = None, **kwargs):

        return model.fit(X_train, y_train,
                        validation_data = validation_data,
                        batch_size = hp.Choice('batch_size', [16,32,64]),
                        **kwargs,
                        )
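
The shape constraints mentioned in the comments above can be checked by hand. The following is a minimal sketch in plain Python, assuming that a layer with `padding = 'same'` produces `ceil(size / stride)` outputs along each spatial axis; the helper names (`same_pad_out`, `pool_hw`) are illustrative and do not appear in the notebook.

```python
import math


def same_pad_out(size, stride):
    """Spatial size after a conv/pool layer that uses padding='same'."""
    return math.ceil(size / stride)


def pool_hw(hw, stride):
    """Apply the same-padding rule to a (height, width) pair."""
    return tuple(same_pad_out(s, stride) for s in hw)


# Model input is 193 x 16 x 1, as in Input(shape = (193, 16, 1)).
hw = (193, 16)

# input_module: Conv2D (stride 1) keeps the size, MaxPool2D (stride 2) halves it.
hw_after_input = pool_hw(hw, 2)

# mpri_upperhalf: every layer uses stride 1, so the spatial size is unchanged.
hw_after_upper = pool_hw(hw_after_input, 1)

# mpri_lowerhalf: three stages, each starting with a stride-2 MaxPool2D.
hw_after_lower = hw_after_upper
for _ in range(3):
    hw_after_lower = pool_hw(hw_after_lower, 2)

# Filters produced by the four paths of mpri_upperhalf before Concatenate.
path_filters = {"first": 64, "second": 32, "third": 32, "fourth": 128}
concat_channels = sum(path_filters.values())

print("after input_module:", hw_after_input)
print("after mpri_lowerhalf:", hw_after_lower)
print("concatenated channels:", concat_channels)
```

The concatenated channel count equals the 256 filters of the input module regardless of `mpriupperhalf_filterno`, because the tuned bottleneck only feeds the four paths and their filter counts are fixed. That is why the residual `Add` stays valid for every trial.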

In [18]:
tuner = kt.RandomSearch(
        HyperModel(),
        objective = 'val_loss',
        max_trials = 50)

In [19]:
tuner.search_space_summary()

Search space summary
Default search space size: 6
mpriupperhalf_filterno (Choice)
{'default': 32, 'conditions': [], 'values': [32, 64, 128], 'ordered': True}
output_convno (Choice)
{'default': 32, 'conditions': [], 'values': [32, 64, 128], 'ordered': True}
fc_dropout (Boolean)
{'default': False, 'conditions': []}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}
pooling_dropout (Boolean)
{'default': False, 'conditions': []}
l2_conv2d (Choice)
{'default': 0.0, 'conditions': [], 'values': [0.0, 0.01], 'ordered': True}
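
The summary above lists six hyperparameters; `batch_size` is declared inside `HyperModel.fit`, so it appears to be registered only once a trial runs. Counting all seven gives a sense of how much of the grid 50 random trials can cover. This is a small sketch in plain Python with the candidate values copied from the `build` and `fit` methods; `search_space` is just an illustrative dictionary, not a keras-tuner object.

```python
import math

# Candidate values copied from HyperModel.build and HyperModel.fit.
search_space = {
    "mpriupperhalf_filterno": [32, 64, 128],
    "output_convno": [32, 64, 128],
    "fc_dropout": [False, True],
    "learning_rate": [0.01, 0.001, 0.0001],
    "pooling_dropout": [False, True],
    "l2_conv2d": [0.0, 0.01],
    "batch_size": [16, 32, 64],  # declared in fit, not in build
}

# Size of the full grid is the product of the number of choices per hyperparameter.
total_combinations = math.prod(len(values) for values in search_space.values())

max_trials = 50
coverage = max_trials / total_combinations

print(f"Total combinations: {total_combinations}")
print(f"Share covered by {max_trials} trials: {coverage:.1%}")
```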


In [20]:
print(os.getcwd())
os.chdir('../datasets')
print(os.getcwd())

/home/jovyan/committed_git/datasets
/home/jovyan/committed_git/datasets


In [22]:
from sklearn.model_selection import train_test_split

# Import dataset
features = np.load('augmented_features_10_ue1_v2_ds.npy')
labels = np.load('augmented_labels_10_ue1_v2_ds.npy')

print(f'Shape of features np array: {features.shape}')
print(f'Shape of labels np array: {labels.shape}')

X = features
y = labels

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, shuffle=True)

Shape of features np array: (89628, 193, 16)
Shape of labels np array: (89628,)
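
For reference, the split sizes and the number of training steps per epoch follow from these shapes. The sketch below is plain Python and assumes the test split is rounded up to a whole sample, which is how `train_test_split` is understood to behave for a fractional `test_size`; the variable names are illustrative.

```python
import math

n_samples = 89628   # first dimension of the features array
test_size = 0.2     # fraction passed to train_test_split

# Assumption: the test split is rounded up, the training split gets the rest.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Keras runs ceil(n_train / batch_size) steps per epoch for in-memory arrays.
steps_per_epoch = {
    batch_size: math.ceil(n_train / batch_size) for batch_size in (16, 32, 64)
}

print(f"train samples: {n_train}, test samples: {n_test}")
for batch_size, steps in steps_per_epoch.items():
    print(f"batch_size = {batch_size}: {steps} steps per epoch")
```

The value for `batch_size = 32` agrees with the `2241` steps shown in the progress bar of a trial that used that batch size.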


In [23]:
print(os.getcwd())
os.chdir('../mpri')
print(os.getcwd())

/home/jovyan/committed_git/datasets
/home/jovyan/committed_git/mpri
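
The search in the next cell passes an `EarlyStopping` callback with `monitor='val_loss'` and `patience=5`. The sketch below mimics that rule in plain Python on a made-up loss curve, to show why most trials stop well before 100 epochs. It is a simplification: `epochs_run` and `toy_val_losses` are hypothetical, and options such as `min_delta` or `restore_best_weights` are ignored.

```python
def epochs_run(val_losses, patience=5):
    """Return the number of epochs trained before early stopping triggers.

    Training stops once val_loss has failed to improve on its best value
    for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)


# Made-up validation losses: the best value (0.6) is reached at epoch 3.
toy_val_losses = [1.0, 0.8, 0.6, 0.7, 0.65, 0.61, 0.9, 0.62, 0.5]

stopped_at = epochs_run(toy_val_losses, patience=5)
print(f"Training stops after epoch {stopped_at}")
```

In this toy curve the run ends five epochs after the best value, so the later improvement at epoch 9 is never seen. A larger `patience` trades longer trials for fewer premature stops.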


In [24]:
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
tuner.search(X_train, y_train,
            validation_data = (X_test, y_test),
            epochs = 100,
            callbacks = [stop_early])

Trial 49 Complete [00h 25m 35s]
val_loss: 4.464103698730469

Best val_loss So Far: 0.17371131479740143
Total elapsed time: 16h 35m 26s

Search: Running Trial #50

Value             |Best Value So Far |Hyperparameter
64                |128               |mpriupperhalf_filterno
128               |128               |output_convno
True              |True              |fc_dropout
0.0001            |0.0001            |learning_rate
False             |False             |pooling_dropout
0                 |0                 |l2_conv2d
32                |64                |batch_size

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
 325/2241 [===>..........................] - ETA: 30s - loss: 3.6445 - accuracy: 0.2507

IOPub message rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_msg_rate_limit`.

Current values:
NotebookApp.iopub_msg_rate_limit=1000.0 (msgs/sec)
NotebookApp.rate_limit_window=3.0 (secs)



Epoch 24/100

Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
 177/2241 [=>............................] - ETA: 32s - loss: 0.4101 - accuracy: 0.8713

Epoch 51/100
Epoch 52/100

Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100

## Analysis of results

In [25]:
tuner.results_summary()

Results summary
Results in ./untitled_project
Showing 10 best trials
Objective(name="val_loss", direction="min")

Trial 20 summary
Hyperparameters:
mpriupperhalf_filterno: 128
output_convno: 128
fc_dropout: True
learning_rate: 0.0001
pooling_dropout: False
l2_conv2d: 0.0
batch_size: 64
Score: 0.17371131479740143

Trial 49 summary
Hyperparameters:
mpriupperhalf_filterno: 64
output_convno: 128
fc_dropout: True
learning_rate: 0.0001
pooling_dropout: False
l2_conv2d: 0.0
batch_size: 32
Score: 0.17775847017765045

Trial 12 summary
Hyperparameters:
mpriupperhalf_filterno: 64
output_convno: 128
fc_dropout: False
learning_rate: 0.001
pooling_dropout: False
l2_conv2d: 0.0
batch_size: 16
Score: 0.18610873818397522

Trial 24 summary
Hyperparameters:
mpriupperhalf_filterno: 64
output_convno: 64
fc_dropout: True
learning_rate: 0.001
pooling_dropout: False
l2_conv2d: 0.0
batch_size: 16
Score: 0.21404226124286652

Trial 08 summary
Hyperparameters:
mpriupperhalf_filterno: 128
output_convno: 64
fc_drop

In [26]:
best_hp = tuner.get_best_hyperparameters()[0]

In [27]:
print(best_hp.values)

{'mpriupperhalf_filterno': 128, 'output_convno': 128, 'fc_dropout': True, 'learning_rate': 0.0001, 'pooling_dropout': False, 'l2_conv2d': 0.0, 'batch_size': 64}
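
A natural follow-up is to rebuild the model with these values and retrain it. The helpers below are a minimal sketch, assuming the `tuner` object from the search is still in memory; `rebuild_best_model` and `retrain` are hypothetical names, while `get_best_hyperparameters`, `hypermodel.build` and `hypermodel.fit` are the keras-tuner calls they rely on. Going through `hypermodel.fit` rather than `model.fit` means the tuned `batch_size` is applied as well.

```python
def rebuild_best_model(tuner):
    """Return (best_hp, freshly built model) from a finished keras-tuner search."""
    best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
    # build() creates a new, untrained model with the best hyperparameters.
    model = tuner.hypermodel.build(best_hp)
    return best_hp, model


def retrain(tuner, best_hp, model, X_train, y_train, **fit_kwargs):
    """Train via HyperModel.fit so that the tuned batch_size is used."""
    return tuner.hypermodel.fit(best_hp, model, X_train, y_train, **fit_kwargs)


# Example usage (not executed here; requires the tuner and data from this notebook):
# best_hp, best_model = rebuild_best_model(tuner)
# history = retrain(tuner, best_hp, best_model, X_train, y_train,
#                   validation_data=(X_test, y_test), epochs=100,
#                   callbacks=[stop_early])
```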
