# **Hyperparameter Search**

# **Student Identification**

Student Name       | Student Email
-------------------|------------------
Daniel Branco      | r20191230@novaims.unl.pt
Filipe Dias        | r20181050@novaims.unl.pt
Gonçalo Lourenço   | r20191097@novaims.unl.pt
Inês Santos        | r20191184@novaims.unl.pt
Manuel Marreiros   | r20191223@novaims.unl.pt

# **Data Source**

Brain Tumor Classification (MRI) Dataset: https://www.kaggle.com/datasets/sartajbhuvaji/brain-tumor-classification-mri

Drive with data: https://drive.google.com/file/d/1P3hcUss5Kqb28_WQUcTsuTW2VjNTX4pd/view?usp=share_link

# **Notebook Summary**





The choice of hyperparameters often has a significant impact on model performance in problems such as this one. Tuning them by trial and error, as we have done so far, only goes so far, so it is worth applying hyperparameter optimization algorithms to the models we consider best in order to improve them further. In our case, the model that showed the best performance in the handcraft notebook is our Convolutional Neural Network V6 (CNN_V6).

To find good hyperparameter values we decided to use Keras Tuner. That way, we only had to define a search space of parameters (such as the number of filters, kernel size, dropout rate and learning rate) and could then rely on the search methods included in Keras Tuner. More specifically, we tried Random Search, Bayesian Optimization and Hyperband.

# **References**

1. [Guide to Hyperparameters Search For Deep Learning Models](https://blog.floydhub.com/guide-to-hyperparameters-search-for-deep-learning-models/)
2. [Random Search](https://keras.io/api/keras_tuner/tuners/random/#randomsearch-class)
3. [Bayesian Optimization](https://keras.io/api/keras_tuner/tuners/bayesian/#bayesianoptimization-class)
4. [Hyperband](https://keras.io/api/keras_tuner/tuners/hyperband/#hyperband-class)
5. [Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization](https://arxiv.org/pdf/1603.06560.pdf)
6. [A Gentle Introduction to Dropout for Regularizing Deep Neural Networks](https://machinelearningmastery.com/dropout-for-regularizing-deep-neural-networks/)

# **Imports**

In [None]:
pip install keras-tuner tensorflow-addons --quiet

In [None]:
import os
import time
import math
import random 
import zipfile
import shutil

import numpy as np
import pandas as pd

import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from matplotlib.colors import ListedColormap

import tensorflow as tf
from tensorflow.keras import datasets
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras import Sequential, Model, layers, initializers, regularizers, optimizers, metrics 
from tensorflow.keras.initializers import GlorotNormal
import tensorflow_addons as tfa

import keras
from keras_tuner import Objective
import keras_tuner as kt
from keras_tuner import RandomSearch, BayesianOptimization, Hyperband
from keras_tuner import HyperParameters
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, accuracy_score, precision_score, recall_score

# **Things needed from previous notebooks**

In [None]:
#EXPLORATION

# Set the machine
gdrive = True
# Set the connection string
path = "/content/drive/MyDrive/Deep Learning/Projeto/"
main_folder, training_folder, testing_folder = "brain_tumor_data/", "Training/", "Testing/"
# If using Google Drive
if gdrive:
    # Setup drive
    from google.colab import drive
    drive.mount('/content/drive')        
    # Transfer zip dataset to the current virtual machine
    t0 = time.time()
    shutil.copyfile(path + 'brain_tumor_data.zip', 'brain_tumor_data.zip')
    # Extract files
    zip_ = zipfile.ZipFile('brain_tumor_data.zip')
    zip_.extractall()
    zip_.close()
    print("File transfer completed in %0.3f seconds" % (time.time() - t0))
    path = ""

classes = ["no_tumor", "glioma_tumor", "meningioma_tumor", "pituitary_tumor"]

# Create empty lists to store the number of instances and class names
n_train = []
class_names = []

# Loop through each class in the dataset
for c in classes:
    # Get the number of instances in the training set for the current class
    n_train_c = len(os.listdir(path + main_folder + training_folder + f"/{c}"))
    # Append the number of instances and class name to their respective lists
    n_train.append(n_train_c)
    class_names.append(c)

image_size=(128, 128)
crop_to_aspect_ratio=True
color_mode='grayscale'
batch_size=64
label_mode="categorical"
validation_split=0.2
shuffle=True
seed=0

ds_train, ds_val = image_dataset_from_directory(path + main_folder + training_folder, 
                                                image_size=image_size,
                                                crop_to_aspect_ratio=crop_to_aspect_ratio,
                                                color_mode=color_mode,
                                                batch_size=batch_size,
                                                label_mode=label_mode,
                                                subset='both',
                                                validation_split=validation_split, 
                                                shuffle=shuffle,
                                                seed=seed)

iter_train = iter(ds_train)
batch_x_train, batch_y_train = next(iter_train)

n_classes = len(classes)
total_samples = np.sum(n_train)

#PREPROCESSING

class_weights = {}
for i in range(n_classes):
    w = total_samples / (2.0 * n_train[i])
    class_weights[i] = w

print('Class counts:', n_train)
print('Class weights:', class_weights)

input_shape = tuple(batch_x_train.shape)
rescaling = layers.Rescaling(1./255)
batchnormalization = layers.BatchNormalization()

rotation_layer = layers.RandomRotation(factor=0.05)
zoom_layer = layers.RandomZoom(height_factor=0.05, width_factor=0.05)
contrast_layer = layers.RandomContrast(factor=0.10)
brightness_layer = layers.RandomBrightness(factor=0.05)
noise_layer = layers.GaussianNoise(0.05)
flip_layer = layers.RandomFlip(mode='horizontal')
crop_layer = layers.RandomCrop(height=300, width=300)
translation_layer = layers.RandomTranslation(height_factor=0.1, width_factor=0.1)

def augmentation(inputs):
    x = rotation_layer(inputs)
    x = zoom_layer(x)
    x = contrast_layer(x)
    x = brightness_layer(x)
    x = noise_layer(x)
    return x


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
File transfer completed in 2.394 seconds
Found 2870 files belonging to 4 classes.
Using 2296 files for training.
Using 574 files for validation.
Class counts: [395, 826, 822, 827]
Class weights: {0: 3.632911392405063, 1: 1.7372881355932204, 2: 1.745742092457421, 3: 1.735187424425635}


# **Hyperparameter search**
First, below is a simple function to print the best parameters in each model.

In [None]:
def print_hyperparameters_table(best_hyperparameters):
    # Hyperparameters that may appear in a search, in display order
    names = ['filters_1', 'filters_2', 'dropout_rate',
             'kernel_height_width', 'learning_rate', 'n_conv2d']
    # Keep only the hyperparameters that exist in this search
    hyperparams_dict = {name: best_hyperparameters.get(name)
                        for name in names if name in best_hyperparameters}

    # Print the header row of the table
    print("| Hyperparameter | Value |")

    # Print each hyperparameter in its own row of the table
    for param, value in hyperparams_dict.items():
        print(f"| {param} | {value} |")


## **Random Search**

### CNN_V6_HYPER_1
The first hyperparameter search method used in this project is random search, chosen for its low computational cost. We gave this tuner more trials than the others because it relies on random sampling: the more trials, the higher the chance of finding good parameters to use as a starting point.

Our code performs a random search over the hyperparameter space defined in the build_model function, with the goal of finding the set of hyperparameters that maximizes "val_F1-Score", the macro-averaged F1-score on the validation data. We chose this metric because, with imbalanced classes, it is more robust than accuracy.
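To illustrate why macro F1 can be more informative than accuracy when one class is rare (here, 395 no_tumor images against roughly 825 for each other class), the sketch below computes both metrics by hand. The labels are a hypothetical toy example, not results from this project, and the helper names are ours.

```python
def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0 here
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: class 0 is rare (2 of 10 samples)
y_true = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
# A classifier that never predicts the rare class
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

toy_accuracy = accuracy(y_true, y_pred)
toy_macro_f1 = macro_f1(y_true, y_pred, n_classes=2)
print(f"accuracy = {toy_accuracy:.3f}, macro F1 = {toy_macro_f1:.3f}")
```

In this toy case accuracy looks acceptable even though the rare class is never detected, while macro F1 is pulled down because every class contributes equally to the average.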

The search runs for at most 10 trials, each corresponding to a different set of hyperparameters. If a trial fails due to an error, it is retried at most 3 times [2].
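The idea behind random search can be sketched in plain Python. The search space below mirrors the ranges passed to build_model, but `toy_score` is a hypothetical stand-in for "train the model and read its validation F1-score", and the loop is a simplification: Keras Tuner's own RandomSearch also handles retries, checkpoints and duplicate configurations.

```python
import random

# Discrete values each hyperparameter can take (mirrors the build_model ranges)
search_space = {
    "filters_1": list(range(6, 129, 16)),
    "filters_2": list(range(6, 129, 16)),
    "dropout_rate": [0.1, 0.2, 0.3, 0.4, 0.5],
    "kernel_height_width": list(range(3, 12, 2)),
    "learning_rate": [1e-2, 1e-3, 1e-4],
}


def toy_score(config):
    """Hypothetical objective; a real run would train a model here."""
    return (1.0
            - abs(config["dropout_rate"] - 0.3)
            - abs(config["kernel_height_width"] - 3) / 10)


def random_search(space, score_fn, max_trials, seed=0):
    """Sample max_trials configurations at random and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(max_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Size of the full grid, to compare with the number of trials
n_combinations = 1
for values in search_space.values():
    n_combinations *= len(values)

best_config, best_score = random_search(search_space, toy_score, max_trials=10)
print(n_combinations, best_config, round(best_score, 3))
```

With these ranges the full grid has 4800 combinations, so 10 trials cover only a small fraction of it, which is why the outcome depends on how many trials are allowed.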


In [None]:
class CNN_V6_HYPER_1(tf.keras.Model):
    def __init__(self, seed=0, filters_1=32, filters_2=64, dropout_rate=0.25 , kernel_height_width=3):
        super().__init__()
        self.augmentation = augmentation
        self.batchnormalization = layers.BatchNormalization()
        # Build the layers from the constructor arguments so the tuned values take effect
        self.conv1 = layers.Conv2D(
            filters=filters_1*input_shape[-1], 
            kernel_size=(kernel_height_width, kernel_height_width), 
            kernel_initializer=initializers.GlorotNormal(seed=seed),
            kernel_regularizer=regularizers.l2(0.001),
        )
        self.relu = layers.Activation("relu")
        self.maxpool = layers.MaxPooling2D(pool_size=(2, 2))          
        self.conv2 = layers.Conv2D(
            filters=filters_2*input_shape[-1], 
            kernel_size=(kernel_height_width, kernel_height_width), 
            kernel_initializer=initializers.GlorotNormal(seed=seed),
            kernel_regularizer=regularizers.l2(0.001), 
        )
        self.dropout = layers.Dropout(dropout_rate) 
        self.flatten = layers.Flatten()
        self.dense1 = layers.Dense(
            units=4, 
            activation="softmax", 
            kernel_initializer=initializers.GlorotNormal(seed=seed),
            kernel_regularizer=regularizers.l2(0.001), 
            activity_regularizer=regularizers.l1(0.001) 
        ) 

    def call(self, inputs):
        x = self.augmentation(inputs)
        x = self.batchnormalization(x)
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.dropout(x)
        x = self.flatten(x)
        x = self.dense1(x)
        return x

def build_model(hp):
    cnn_v6_hyper_1 = CNN_V6_HYPER_1(
        seed=seed,
        filters_1=hp.Int('filters_1', min_value=6, max_value=128, step=16),
        filters_2=hp.Int('filters_2', min_value=6, max_value=128, step=16),
        dropout_rate=hp.Float('dropout_rate', min_value=0.1, max_value=0.5, step=0.1),
        kernel_height_width=hp.Int('kernel_height_width', min_value=3, max_value=11, step=2)
    )
    cnn_v6_hyper_1.build(input_shape)
    cnn_v6_hyper_1.compile(
        optimizer=keras.optimizers.RMSprop(hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])),
        loss="categorical_crossentropy",
        metrics=[metrics.CategoricalAccuracy(name='accuracy'),
        tfa.metrics.F1Score(num_classes=4, average='macro', name='F1-Score')])

    return cnn_v6_hyper_1
# Use Keras Tuner library to search for the best hyperparameters
tuner_cnn_v6_hyper_1 = RandomSearch(
    build_model,
    objective=Objective('val_F1-Score', direction='max'),
    max_trials=10,
    overwrite=True,
    max_retries_per_trial=3
    )

tuner_cnn_v6_hyper_1.search(ds_train, epochs=5, validation_data=ds_val,class_weight = class_weights)

# Train the model with the best hyperparameters
best_model_cnn_v6_hyper_1 = tuner_cnn_v6_hyper_1.get_best_models(num_models=1)[0]
best_hyperparameters_cnn_v6_hyper_1 = tuner_cnn_v6_hyper_1.get_best_hyperparameters(num_trials=1)[0]

best_model_cnn_v6_hyper_1.fit(ds_train, epochs=10, validation_data=ds_val,class_weight = class_weights)

Trial 10 Complete [00h 00m 26s]
val_F1-Score: 0.807634711265564

Best val_F1-Score So Far: 0.807634711265564
Total elapsed time: 00h 03m 44s
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fc2fc61e190>

In [None]:
print_hyperparameters_table(best_hyperparameters_cnn_v6_hyper_1)

| Hyperparameter | Value |
| filters_1 | 86 |
| filters_2 | 54 |
| dropout_rate | 0.2 |
| kernel_height_width | 3 |
| learning_rate | 0.001 |


In [None]:
best_model_cnn_v6_hyper_1.summary()

Model: "cnn_v6_hyper_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 batch_normalization (BatchN  multiple                 4         
 ormalization)                                                   
                                                                 
 conv2d (Conv2D)             multiple                  320       
                                                                 
 activation (Activation)     multiple                  0         
                                                                 
 max_pooling2d (MaxPooling2D  multiple                 0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           multiple                  18496     
                                                                 
 dropout (Dropout)           multiple               

## **Bayesian Optimization**
At this point, we realized that computational power was not a constraint, so we started to explore more computationally demanding hyperparameter search methods in pursuit of better performance. In addition, as the name suggests, random search samples configurations at random, so its results are not guaranteed to be the best ones.

Below, we explore the Bayesian Optimization technique, keeping all the previous model configurations.

### CNN_V6_HYPER_2

In [None]:
# Use Keras Tuner library to search for the best hyperparameters
tuner_cnn_v6_hyper_2 = BayesianOptimization(
    build_model,
    objective=Objective('val_F1-Score', direction='max'),
    max_trials=5,
    overwrite=True,
    max_retries_per_trial=3
    )

tuner_cnn_v6_hyper_2.search(ds_train, epochs=5, validation_data=ds_val,class_weight = class_weights)

# Train the model with the best hyperparameters
best_model_cnn_v6_hyper_2 = tuner_cnn_v6_hyper_2.get_best_models(num_models=1)[0]
best_hyperparameters_cnn_v6_hyper_2 = tuner_cnn_v6_hyper_2.get_best_hyperparameters(num_trials=1)[0]

best_model_cnn_v6_hyper_2.fit(ds_train, epochs=10, validation_data=ds_val,class_weight = class_weights)

Trial 5 Complete [00h 00m 20s]
val_F1-Score: 0.8100018501281738

Best val_F1-Score So Far: 0.8100018501281738
Total elapsed time: 00h 01m 40s
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fc2fcbe5c40>

In [None]:
print_hyperparameters_table(best_hyperparameters_cnn_v6_hyper_2)

| Hyperparameter | Value |
| filters_1 | 102 |
| filters_2 | 54 |
| dropout_rate | 0.30000000000000004 |
| kernel_height_width | 11 |
| learning_rate | 0.001 |


### CNN_V6_HYPER_3


With the results from the first approaches using Random Search and Bayesian Optimization, our group gathered some useful insights regarding the hyperparameters. We found that the dropout_rate is always close to 0.3, the kernel_height_width is around 3, the learning_rate is near 0.001, and the difference between the numbers of filters is normally small, although both take high values.


Furthermore, to explore a little more, we changed the structure of our model by adding a loop inside the class that creates blocks of Conv2D, Activation and MaxPooling2D layers. The number of blocks is treated as a hyperparameter named 'n_conv2d', ranging from 2 to 4.
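Each extra block shrinks the feature map, so the number of blocks and the kernel size interact. The helper below is a rough sketch (the function name is ours) that tracks the side length of a 128x128 input through the stacked blocks, assuming stride-1 convolutions with 'valid' padding and 2x2 max pooling, which are the Keras defaults for the layers as configured in this notebook.

```python
def spatial_size_after_blocks(input_size, kernel_size, n_blocks):
    """Side length of the feature map after n_blocks of Conv2D + MaxPooling2D.

    Assumes stride-1 'valid' convolutions and 2x2 max pooling.
    Returns None as soon as a layer would produce an empty output.
    """
    size = input_size
    for _ in range(n_blocks):
        size = size - kernel_size + 1  # 'valid' convolution, stride 1
        if size < 1:
            return None
        size = size // 2               # 2x2 max pooling, stride 2
        if size < 1:
            return None
    return size


# Every (kernel size, number of blocks) pair in the ranges searched here
sizes = {
    (k, n): spatial_size_after_blocks(128, k, n)
    for k in range(2, 7)
    for n in range(2, 5)
}

for (k, n), size in sorted(sizes.items()):
    print(f"kernel={k}, blocks={n} -> {size}x{size}")
```

Under these assumptions every combination of 2 to 4 blocks and kernel sizes 2 to 6 still leaves a non-empty feature map (the smallest is 3x3). The same helper can be used to check deeper stacks before widening the n_conv2d range.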


In [None]:
class CNN_V6_HYPER_3(tf.keras.Model):
    def __init__(self, seed=0, filters_1=64, dropout_rate=0.3, kernel_height_width=2, n_conv2d=2):
        super().__init__()
        self.augmentation = augmentation
        self.batchnormalization = layers.BatchNormalization()
        self.conv_layers = []
        for i in range(n_conv2d):
            self.conv_layers.append(layers.Conv2D(filters=filters_1*input_shape[-1], 
                                                  kernel_size=(kernel_height_width,kernel_height_width), 
                                                  kernel_initializer=initializers.GlorotNormal(seed=seed),
                                                  kernel_regularizer=regularizers.l2(0.001)))
            self.conv_layers.append(layers.Activation("relu"))
            self.conv_layers.append(layers.MaxPooling2D(pool_size=(2, 2)))
        self.dropout = layers.Dropout(dropout_rate) 
        self.flatten = layers.Flatten()
        self.dense1 = layers.Dense(units=4, 
                                   activation="softmax", 
                                   kernel_initializer=initializers.GlorotNormal(seed=seed),
                                   kernel_regularizer=regularizers.l2(0.001), 
                                   activity_regularizer=regularizers.l1(0.001) 
                                   ) 

    def call(self, inputs):
        x = self.augmentation(inputs)
        x = self.batchnormalization(x)
        for layer in self.conv_layers:
            x = layer(x)
        x = self.dropout(x)
        x = self.flatten(x)
        x = self.dense1(x)
        return x

def build_model(hp):
    cnn_v6_hyper_3 = CNN_V6_HYPER_3(
        seed=seed,
        filters_1=hp.Int('filters_1', min_value=64, max_value=128, step=8),
        dropout_rate=hp.Float('dropout_rate', min_value=0.3, max_value=0.5, step=0.05),
        kernel_height_width=hp.Int('kernel_height_width', min_value=2, max_value=6, step=1),
        n_conv2d=hp.Int('n_conv2d', min_value=2, max_value=4, step=1)
    )
    cnn_v6_hyper_3.build(input_shape)
    cnn_v6_hyper_3.compile(
        optimizer=keras.optimizers.RMSprop(hp.Choice('learning_rate', values=[ 0.0005,0.001,0.005])),
        loss="categorical_crossentropy",
        metrics=[metrics.CategoricalAccuracy(name='accuracy'),
        tfa.metrics.F1Score(num_classes=4, average='macro', name='F1-Score')])

    return cnn_v6_hyper_3
# Use Keras Tuner library to search for the best hyperparameters
tuner_cnn_v6_hyper_3 = BayesianOptimization(
    build_model,
    objective=Objective('val_F1-Score', direction='max'),
    max_trials=5,
    overwrite=True,
    max_retries_per_trial=3
    )

tuner_cnn_v6_hyper_3.search(ds_train, epochs=5, validation_data=ds_val,class_weight = class_weights)

# Train the model with the best hyperparameters
best_model_cnn_v6_hyper_3 = tuner_cnn_v6_hyper_3.get_best_models(num_models=1)[0]
best_hyperparameters_cnn_v6_hyper_3 = tuner_cnn_v6_hyper_3.get_best_hyperparameters(num_trials=1)[0]

best_model_cnn_v6_hyper_3.fit(ds_train, epochs=10, validation_data=ds_val,class_weight = class_weights)

Trial 5 Complete [00h 00m 22s]
val_F1-Score: 0.7964640855789185

Best val_F1-Score So Far: 0.7964640855789185
Total elapsed time: 00h 02m 45s
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fc320282160>

In [None]:
print_hyperparameters_table(best_hyperparameters_cnn_v6_hyper_3)

| Hyperparameter | Value |
| filters_1 | 64 |
| dropout_rate | 0.35 |
| kernel_height_width | 3 |
| learning_rate | 0.001 |
| n_conv2d | 2 |


In [None]:
best_model_cnn_v6_hyper_3.summary()

Model: "cnn_v6_hyper_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 batch_normalization (BatchN  multiple                 4         
 ormalization)                                                   
                                                                 
 conv2d (Conv2D)             multiple                  640       
                                                                 
 activation (Activation)     multiple                  0         
                                                                 
 max_pooling2d (MaxPooling2D  multiple                 0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           multiple                  36928     
                                                                 
 activation_1 (Activation)   multiple               

### CNN_V6_HYPER_4


From analysing the previous model, we saw that it was overfitting, and one remedy for this kind of problem is the use of Dropout layers. Dropout acts as a regularization technique: at each training iteration it randomly sets a fraction of the input units to 0, which keeps the model from relying too heavily on any single input unit.

In short, we added a Dropout layer to each block of Conv2D, Activation and MaxPooling2D layers.
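As a minimal illustration of what a Dropout layer does, the sketch below applies "inverted" dropout to a flat list of activations: during training each unit is zeroed with probability `rate` and the kept units are scaled by 1 / (1 - rate), while at inference the input passes through unchanged. This is a plain-Python approximation of the behaviour of the Keras layer, not its actual implementation, and the function name is ours.

```python
import random


def dropout(values, rate, training, rng):
    """Inverted dropout on a flat list of activations."""
    if not training or rate == 0.0:
        # At inference time dropout is the identity
        return list(values)
    keep_prob = 1.0 - rate
    # Zero a unit with probability `rate`; rescale the kept ones
    return [v / keep_prob if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)
activations = [1.0] * 10_000

dropped = dropout(activations, rate=0.35, training=True, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)   # close to the rate
mean_after = sum(dropped) / len(dropped)            # close to the original mean

unchanged = dropout(activations, rate=0.35, training=False, rng=rng)
print(round(zero_fraction, 3), round(mean_after, 3), unchanged == activations)
```

The rescaling keeps the expected activation the same in training and inference, which is why no correction is needed when the model is evaluated.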


In [None]:
class CNN_V6_HYPER_4(tf.keras.Model):
    def __init__(self, seed=0, filters_1=32, dropout_rate=0.25, kernel_height_width=3, n_conv2d=4):
        super().__init__()
        self.augmentation = augmentation
        self.batchnormalization = layers.BatchNormalization()
        self.conv_layers = []
        for i in range(n_conv2d):
            self.conv_layers.append(layers.Conv2D(filters=filters_1*input_shape[-1],
                                                  kernel_size=(kernel_height_width,kernel_height_width),
                                                  kernel_initializer=initializers.GlorotNormal(seed=seed),
                                                  kernel_regularizer=regularizers.l2(0.001)))
            self.conv_layers.append(layers.Activation("relu"))
            self.conv_layers.append(layers.MaxPooling2D(pool_size=(2, 2)))
            self.conv_layers.append(layers.Dropout(dropout_rate))
        self.flatten = layers.Flatten()
        self.dense1 = layers.Dense(units=4, 
                                   activation="softmax", 
                                   kernel_initializer=initializers.GlorotNormal(seed=seed),
                                   kernel_regularizer=regularizers.l2(0.001), 
                                   activity_regularizer=regularizers.l1(0.001) 
                                   ) 

    def call(self, inputs):
        x = self.augmentation(inputs)
        x = self.batchnormalization(x)
        for layer in self.conv_layers:
            x = layer(x)
        x = self.flatten(x)
        x = self.dense1(x)
        return x

def build_model(hp):
    cnn_v6_hyper_4 = CNN_V6_HYPER_4(
        seed=seed,
        filters_1=hp.Int('filters_1', min_value=64, max_value=128, step=8),
        dropout_rate=hp.Float('dropout_rate', min_value=0.3, max_value=0.5, step=0.05),
        kernel_height_width=hp.Int('kernel_height_width', min_value=2, max_value=6, step=1),
        n_conv2d=hp.Int('n_conv2d', min_value=2, max_value=5, step=1)
    )
    cnn_v6_hyper_4.build(input_shape)
    cnn_v6_hyper_4.compile(
        optimizer=keras.optimizers.RMSprop(hp.Choice('learning_rate', values=[ 0.0005,0.001,0.005])),
        loss="categorical_crossentropy",
        metrics=[metrics.CategoricalAccuracy(name='accuracy'),
        tfa.metrics.F1Score(num_classes=4, average='macro', name='F1-Score')])

    return cnn_v6_hyper_4

# Use Keras Tuner library to search for the best hyperparameters
tuner_cnn_v6_hyper_4 = BayesianOptimization(
    build_model,
    objective=Objective('val_F1-Score', direction='max'),
    max_trials=5,
    overwrite=True,
    max_retries_per_trial=3
    )

tuner_cnn_v6_hyper_4.search(ds_train, epochs=5, validation_data=ds_val,class_weight = class_weights)

# Train the model with the best hyperparameters
best_model_cnn_v6_hyper_4 = tuner_cnn_v6_hyper_4.get_best_models(num_models=1)[0]
best_hyperparameters_cnn_v6_hyper_4 = tuner_cnn_v6_hyper_4.get_best_hyperparameters(num_trials=1)[0]

best_model_cnn_v6_hyper_4.fit(ds_train, epochs=10, validation_data=ds_val,class_weight = class_weights)


Trial 5 Complete [00h 00m 44s]
val_F1-Score: 0.7226176261901855

Best val_F1-Score So Far: 0.7226176261901855
Total elapsed time: 00h 02m 45s
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fc2fe7ec9d0>

In [None]:
print_hyperparameters_table(best_hyperparameters_cnn_v6_hyper_4)

| Hyperparameter | Value |
| filters_1 | 104 |
| dropout_rate | 0.35 |
| kernel_height_width | 5 |
| learning_rate | 0.0005 |
| n_conv2d | 2 |


In [None]:
best_model_cnn_v6_hyper_4.summary()

Model: "cnn_v6_hyper_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 batch_normalization (BatchN  multiple                 4         
 ormalization)                                                   
                                                                 
 conv2d (Conv2D)             multiple                  2704      
                                                                 
 activation (Activation)     multiple                  0         
                                                                 
 max_pooling2d (MaxPooling2D  multiple                 0         
 )                                                               
                                                                 
 dropout (Dropout)           multiple                  0         
                                                                 
 conv2d_1 (Conv2D)           multiple               

## **Hyperband**

Our group still believed that better results were possible, and since the previous model neither performed very well nor solved our overfitting problem, we decided to try another hyperparameter tuning technique called Hyperband.

Hyperband combines random search with adaptive resource allocation to search the hyperparameter space efficiently. It allocates resources (the number of epochs) dynamically to different hyperparameter configurations based on their performance. It starts by training multiple models with random hyperparameter configurations for a small number of epochs, and then selects a fraction of the top-performing models (based on their validation performance) to continue training for more epochs. This process is repeated in successive rounds, with fewer models and more epochs at each round. In this way Hyperband spends most of its budget on promising configurations and converges towards good hyperparameter values faster.
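The round-by-round elimination described above can be sketched as a single successive-halving bracket. This is a simplified illustration, not Keras Tuner's implementation: full Hyperband runs several such brackets with different starting budgets, and `toy_score` is a hypothetical stand-in for training a configuration for a given number of epochs.

```python
import random


def successive_halving(configs, score_fn, min_epochs, max_epochs, factor=3):
    """Run one bracket: score all configs, keep the top 1/factor, repeat."""
    survivors = list(configs)
    epochs = min_epochs
    history = []  # (number of configs, epochs given to each) per round
    while True:
        history.append((len(survivors), epochs))
        ranked = sorted(survivors, key=lambda c: score_fn(c, epochs), reverse=True)
        if len(ranked) == 1 or epochs >= max_epochs:
            return ranked[0], history
        # Keep the best fraction and give the survivors a larger budget
        survivors = ranked[: max(1, len(ranked) // factor)]
        epochs = min(max_epochs, epochs * factor)


def toy_score(config, epochs):
    """Hypothetical validation score: grows with epochs, capped by 'quality'."""
    return config["quality"] * (1.0 - 0.5 ** epochs)


rng = random.Random(0)
configs = [{"id": i, "quality": rng.random()} for i in range(9)]

best, history = successive_halving(
    configs, toy_score, min_epochs=1, max_epochs=10, factor=3
)
total_epochs = sum(n * e for n, e in history)
print(history, total_epochs, best["id"])
```

With 9 starting configurations and factor=3 the rounds are 9 models for 1 epoch, 3 models for 3 epochs and 1 model for 9 epochs. Counting each round separately, that is 27 epochs in total, against 90 if all nine configurations were trained for 10 epochs.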



### CNN_V6_HYPER_5


In [None]:
class CNN_V6_HYPER_5(tf.keras.Model):
    def __init__(self, seed=0, filters_1=32, dropout_rate=0.25, kernel_height_width=3, n_conv2d=5):
        super().__init__()
        self.augmentation = augmentation
        self.batchnormalization = layers.BatchNormalization()
        self.conv_layers = []
        for i in range(n_conv2d):
            self.conv_layers.append(layers.Conv2D(filters=filters_1*input_shape[-1],
                                                  kernel_size=(kernel_height_width,kernel_height_width),
                                                  kernel_initializer=initializers.GlorotNormal(seed=seed),
                                                  kernel_regularizer=regularizers.l2(0.001)))
            self.conv_layers.append(layers.Activation("relu"))
            self.conv_layers.append(layers.MaxPooling2D(pool_size=(2, 2)))
            self.conv_layers.append(layers.Dropout(dropout_rate))
        self.flatten = layers.Flatten()
        self.dense1 = layers.Dense(units=4, 
                                   activation="softmax", 
                                   kernel_initializer=initializers.GlorotNormal(seed=seed),
                                   kernel_regularizer=regularizers.l2(0.001), 
                                   activity_regularizer=regularizers.l1(0.001) 
                                   ) 

    def call(self, inputs):
        x = self.augmentation(inputs)
        x = self.batchnormalization(x)
        for layer in self.conv_layers:
            x = layer(x)
        x = self.flatten(x)
        x = self.dense1(x)
        return x

def build_model(hp):
    cnn_v6_hyper_5 = CNN_V6_HYPER_5(
        seed=seed,
        filters_1=hp.Int('filters_1', min_value=64, max_value=128, step=8),
        dropout_rate=hp.Float('dropout_rate', min_value=0.3, max_value=0.5, step=0.05),
        kernel_height_width=hp.Int('kernel_height_width', min_value=2, max_value=6, step=1),
        n_conv2d=hp.Int('n_conv2d', min_value=2, max_value=5, step=1)
    )
    cnn_v6_hyper_5.build(input_shape)
    cnn_v6_hyper_5.compile(
        optimizer=keras.optimizers.RMSprop(hp.Choice('learning_rate', values=[ 0.0005,0.001,0.005])),
        loss="categorical_crossentropy",
        metrics=[metrics.CategoricalAccuracy(name='accuracy'),
        tfa.metrics.F1Score(num_classes=4, average='macro', name='F1-Score')])

    return cnn_v6_hyper_5
# Use Keras Tuner library to search for the best hyperparameters
tuner_cnn_v6_hyper_5 = Hyperband(
    build_model,
    objective=Objective('val_F1-Score', direction='max'),
    max_epochs=10,
    factor=3,
    overwrite=True,
    max_retries_per_trial=3
)

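# Hyperband sets the number of epochs of each trial itself (up to max_epochs),
# so the epochs argument passed here is overridden during the search
tuner_cnn_v6_hyper_5.search(ds_train, epochs=5, validation_data=ds_val,class_weight = class_weights)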

# Train the model with the best hyperparameters
best_model_cnn_v6_hyper_5 = tuner_cnn_v6_hyper_5.get_best_models(num_models=1)[0]
best_hyperparameters_cnn_v6_hyper_5 = tuner_cnn_v6_hyper_5.get_best_hyperparameters(num_trials=1)[0]

best_model_cnn_v6_hyper_5.fit(ds_train, epochs=10, validation_data=ds_val,class_weight = class_weights)


Trial 28 Complete [00h 01m 12s]
val_F1-Score: 0.7593338489532471

Best val_F1-Score So Far: 0.8430914282798767
Total elapsed time: 00h 13m 27s
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f327132c670>

In [None]:
print_hyperparameters_table(best_hyperparameters_cnn_v6_hyper_5)

| Hyperparameter | Value |
| filters_1 | 112 |
| dropout_rate | 0.3 |
| kernel_height_width | 4 |
| learning_rate | 0.001 |
| n_conv2d | 4 |


In [None]:
best_model_cnn_v6_hyper_5.summary()

Model: "cnn_v6_hyper_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 batch_normalization (BatchN  multiple                 4         
 ormalization)                                                   
                                                                 
 conv2d (Conv2D)             multiple                  1904      
                                                                 
 activation (Activation)     multiple                  0         
                                                                 
 max_pooling2d (MaxPooling2D  multiple                 0         
 )                                                               
                                                                 
 dropout (Dropout)           multiple                  0         
                                                                 
 conv2d_1 (Conv2D)           multiple               