# Deep Learning - Hyperparameter-Tuning
### *Facial age prediction - a SML Regression problem*

# 1. References

1. Introduction to the Keras Tuner, 2022, [official documentation](https://www.tensorflow.org/tutorials/keras/keras_tuner)
2. Hyperband Tuner, [official documentation](https://keras.io/api/keras_tuner/tuners/hyperband/)
3. The base Tuner class, [official documentation](https://keras.io/api/keras_tuner/tuners/base_tuner/#tuner-class)
4. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization, 2018, [link article](https://jmlr.org/papers/v18/16-558.html)
5. HyperBand and BOHB: Understanding State of the Art Hyperparameter Optimization Algorithms, 2023, [blog link](https://neptune.ai/blog/hyperband-and-bohb-understanding-state-of-the-art-hyperparameter-optimization-algorithms)

# 2. Initial Treatment

## 2.1. Configurations and import Libraries

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Installs the keras-tuner.
%pip install -q -U keras-tuner

Note: you may need to restart the kernel to use updated packages.


In [None]:
import matplotlib.pyplot as plt
import matplotlib
import matplotlib.image as mpimg
from matplotlib.colors import ListedColormap
import seaborn as sns

import numpy as np
import pandas as pd

import tensorflow as tf
from tensorflow.keras import datasets
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras import layers, initializers, regularizers, optimizers, metrics, losses
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras import backend as K
import keras_tuner as kt

import os
import time
from pathlib import Path
import pickle

import warnings
warnings.filterwarnings("ignore")

## 2.2. Auxiliary functions

Collection of all user-defined functions used in this notebook.

In [None]:
def train_best_model(model, training, test, epochs, batch_size):
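  '''Train a model built from the best hyperparameters and return its training history.

  Args:
  --
      model (tf.keras.Model): A compiled Keras model, e.g. built with tuner.hypermodel.build().
      training: The training data (here a Keras data generator).
      test: The data passed as validation_data during training.
      epochs (int): The maximum number of epochs to train the model.
      batch_size (int): The batch size to use during training.

  Returns:
  --
      The Keras History object returned by model.fit(). Its .history
      attribute holds the per-epoch metrics as a dictionary.
  '''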
  tf.keras.backend.clear_session()
  
  early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
  print(f"Start training of {model.__class__.__doc__}")
  start = time.time()

  history = model.fit(training, 
                    epochs=epochs, 
                    validation_data = test,
                    batch_size=batch_size,
                    callbacks=[early_stopping])
  print("Training time: {:.4f}s".format(time.time() - start), end="\n\n")

  return history
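
The early-stopping rule used in `train_best_model` (`patience=5`, `restore_best_weights=True`) can be sketched in plain Python. This is a simplified illustration of a patience-based rule, not the Keras implementation; the function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed, for a patience-based rule.

    Training stops once the loss has failed to improve on its best value for
    `patience` consecutive epochs. The best epoch is the one whose weights
    would be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best value: remember it and reset the patience counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # The loss list ended before patience ran out.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, made up for illustration only.
demo_val_losses = [120.0, 110.0, 105.0, 106.0, 107.0, 108.0, 109.0, 111.0, 104.0]
stop_epoch, best_epoch = early_stopping_epoch(demo_val_losses, patience=5)
print(stop_epoch, best_epoch)
```

In this made-up run the last value (104.0) is never reached, because the five epochs after the best one bring no improvement.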

## 2.3. Import dataset

The cells below define all paths used to import the data.

In [None]:
# Create folder for saving the hypertuned models.
!mkdir -p /content/drive/MyDrive/FacialAgeProject/models/hypertune

In [None]:
# Define the path, where the dataset should be saved
vm_path = "/content"
path = "/content/drive/MyDrive/FacialAgeProject/"

data_path = os.path.join(path, 'data')
metadata_path = os.path.join(path, 'metadata')
dataset_path = os.path.join(data_path, "facial_age_dataset_unsplit/")

preprocessed_path = os.path.join(data_path, 'facial_age_dataset_preprocessed')

train_path = Path(os.path.join(preprocessed_path, 'train'))
test_path = Path(os.path.join(preprocessed_path, 'test'))

metadata_csv_path = os.path.join(metadata_path, 'images_metadata.csv')
# metadata_csv_path = path + 'metadata/images_metadata.csv'

In [None]:
# Load the metadata of all image files and shuffle the rows
images_df = pd.read_csv(metadata_csv_path)
images_df = images_df.sample(frac = 1.0).reset_index(drop = True)
images_df.head()

# Train dataset to be used in the generator
train_file_names = pd.Series(list(train_path.glob(r'**/*.png')),name = 'file_names').astype('str')
train_file_names = train_file_names.apply(lambda x : x.split("/")[-1])
train_images_df = images_df[images_df['file_name'].isin(train_file_names)]
train_images_df['file_name'] = '..' + f'{train_path}'+ '/' + train_images_df['age_label'].apply(lambda x : f"{x:03d}") + '/' + train_images_df['file_name']
# train_images_df['weights'] = train_images_df['age_label'].apply(lambda x : weights_dict[x])

# Test dataset to be used in the generator
test_file_names = pd.Series(list(test_path.glob(r'**/*.png')),name = 'file_names').astype('str')
test_file_names = test_file_names.apply(lambda x : x.split('/')[-1])
test_images_df = images_df[images_df['file_name'].isin(test_file_names)]
test_images_df['file_name'] = '..' + f'{test_path}'+ '/' + test_images_df['age_label'].apply(lambda x : f"{x:03d}") + '/' + test_images_df['file_name']

print('Train files and dataset length (must match): {} vs {}'.format(len(train_file_names),train_images_df.shape[0]))
print('Test files and dataset length (must match): {} vs {}'.format(len(test_file_names),test_images_df.shape[0]))

In [None]:
# Defining global variables
batch_size = 64
seed = 0
epochs = 10
input_shape = (None, 200, 200, 3)

In [None]:
# Data generators and parameters
train_generator = tf.keras.preprocessing.image.ImageDataGenerator(
    validation_split=0.2
)

test_generator = tf.keras.preprocessing.image.ImageDataGenerator()

generate_params = {
    'target_size' : (200,200),
    'color_mode' : 'rgb',
    'class_mode' : 'raw',
    'batch_size' : batch_size,
    'seed' : seed
}

In [None]:
# Generator instances and data splitting
ds_train = train_generator.flow_from_dataframe(
    dataframe=train_images_df,
    x_col='file_name',
    y_col='age_label',
    #weight_col = 'weights',
    shuffle=True,
    subset='training',
    **generate_params
)

ds_val = train_generator.flow_from_dataframe(
    dataframe=train_images_df,
    x_col='file_name',
    y_col='age_label',
    shuffle=True,
    subset='validation',
    **generate_params
)

ds_test = test_generator.flow_from_dataframe(
    dataframe=test_images_df,
    x_col='file_name',
    y_col='age_label',
    **generate_params
)

Found 5607 validated image filenames.
Found 1401 validated image filenames.
Found 1752 validated image filenames.
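
The training and validation counts printed above follow from `validation_split=0.2` applied to the 7008 training rows. A minimal sketch of the arithmetic, assuming the split index is obtained by truncating `n_rows * validation_split` (this assumption is consistent with the printed counts); `split_counts` is a hypothetical helper:

```python
def split_counts(n_rows, validation_split):
    """Return (n_training, n_validation) for a fractional validation split."""
    n_validation = int(n_rows * validation_split)  # truncates towards zero
    return n_rows - n_validation, n_validation


print(split_counts(7008, 0.2))  # (5607, 1401)
```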


# 3. Hyperparameter-Tuning



- In the following, we apply hyperparameter tuning to our best LeNet and AlexNet architectures.
- As a reminder, the two models are:
  - LeNet-V7 Enhanced Architecture + L2 Regularization (0.001) + `Adam` optimizer 
  - AlexNet-V4 Less Complex Architecture 1 + Dropout Rate (0.25) + L2 Regularization (0.001)
- As the tuning technique we use `keras_tuner.Hyperband()`. For its configuration and implementation in Keras we followed [1][2][3].


**Hyperband**

Hyperband is a hyperparameter optimization algorithm designed for efficiency. Its main advantage is speed: because it stops unpromising configurations early, it can reach a good configuration with less total training than random search or Bayesian optimization [4][5].

It is based on SuccessiveHalving, which works as follows: 
> "...uniformly allocate a budget to a set of hyperparameter configurations, evaluate the performance of all configurations, throw out the worst half, and repeat until one configuration remains. The algorithm allocates exponentially more resources to more promising configurations" [4]

The authors report that Hyperband can be: 
> "... 5x to 30x faster than popular Bayesian optimization algorithms ..." [4]

Other resources also report that: 
> "... HyperBand has better performance in comparison with random search." [5]

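The SuccessiveHalving idea quoted above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Keras Tuner implementation: the names `successive_halving` and `toy_loss`, the doubling budget and the candidate learning rates are all hypothetical.

```python
def successive_halving(configs, evaluate, min_budget=1):
    """Keep the better half of `configs` each round while doubling the budget.

    `evaluate(config, budget)` must return a loss (lower is better).
    Returns the last surviving configuration.
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Evaluate every surviving configuration with the current budget.
        scored = sorted(survivors, key=lambda config: evaluate(config, budget))
        # Throw out the worst half.
        survivors = scored[: max(1, len(scored) // 2)]
        # Give the more promising configurations more resources next round.
        budget *= 2
    return survivors[0]


def toy_loss(config, budget):
    """Made-up objective: best at lr=0.01, and lower with more budget."""
    return abs(config["lr"] - 0.01) + 1.0 / budget


candidates = [{"lr": lr} for lr in (0.3, 0.1, 0.03, 0.02, 0.01, 0.003, 0.001, 0.0003)]
best = successive_halving(candidates, toy_loss)
print(best)
```

With eight candidates the loop runs three rounds (8, 4, 2 configurations) with budgets 1, 2 and 4. Hyperband itself runs several such brackets with different trade-offs between the number of configurations and the budget per configuration.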

## 3.1. LeNet-V7

In this part of the project we take our best handcrafted LeNet class and tune its hyperparameters. 

For this class, the search parameters are: 
- the number of filters in the convolutional layers
- the size of the kernels in the convolutional layers
- the strength of the regularizer applied in the layers: we always use L2 regularization, and its factor is set to the same value as the learning rate being searched.

In [None]:
class LeNet_v7(tf.keras.Model):
    '''LeNet-V7 Enhanced Architecture + L2 Regularization (0.001) + Adam optimizer'''
    def __init__(self, seed=0,
                conv_1_filters = 6, 
                conv_2_filters = 16,
                kernel = (5,5),
                regularizer = None
                 ):
        super().__init__()
        # Convolutional layers (with learnable parameters)
        self.resize = layers.Rescaling(1./255)
        self.conv_1 = layers.Conv2D(filters=conv_1_filters, kernel_size=kernel, 
                                    kernel_initializer=initializers.GlorotNormal(seed=seed), padding="same",
                                    kernel_regularizer=regularizer)
        self.conv_2 = layers.Conv2D(filters=conv_2_filters, kernel_size=kernel, 
                                    kernel_initializer=initializers.GlorotNormal(seed=seed), padding="same",
                                    kernel_regularizer=regularizer) 
        self.dense_1 = layers.Dense(units=120, activation="relu", kernel_regularizer=regularizer,
                                   kernel_initializer=initializers.GlorotNormal(seed=seed))
        self.dense_2 = layers.Dense(units=84, activation="relu", kernel_regularizer=regularizer,
                                   kernel_initializer=initializers.GlorotNormal(seed=seed))
        self.output_layer = layers.Dense(units=1, activation="linear", 
                                   kernel_initializer=initializers.GlorotNormal(seed=seed))
        # Non-learnable layers (define only once)
        self.relu = layers.Activation("relu")
        self.maxpool2x2 = layers.MaxPooling2D(pool_size=2, strides=2)
        self.flatten = layers.Flatten()
        
    def call(self, inputs):
        # Orderly flows the inputs through the network's components
        x = self.resize(inputs)
        x = self.conv_1(x)
        x = self.relu(x)
        x = self.maxpool2x2(x)
        x = self.conv_2(x)
        x = self.relu(x)
        x = self.maxpool2x2(x)
        x = self.flatten(x)
        x = self.dense_1(x)
        x = self.dense_2(x)
        x = self.output_layer(x)

        return x

### 3.1.1. Setup tuner

In this step, we will specify the hyperparameters to be tuned and define the search spaces, which determine the range of values that the hyperparameters can take during the tuning process.

The search space for the best LeNet class: 

- number of filters in the first convolutional layer: 5 to 25 in steps of 5.
- number of filters in the second convolutional layer: 20 to 40 in steps of 5.
- kernel size: (3,3), (5,5) or (10,10).
- learning rate: 0.01 or 0.001 (the same value is also used as the L2 regularization factor).
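
To get a feeling for the size of this search space, the combinations listed above can be enumerated directly. The sketch assumes that the upper bounds of the integer ranges are inclusive; the variable names are chosen for illustration.

```python
from itertools import product

conv_1_filters = range(5, 25 + 1, 5)     # 5, 10, 15, 20, 25
conv_2_filters = range(20, 40 + 1, 5)    # 20, 25, 30, 35, 40
kernel_sizes = [3, 5, 10]
learning_rates = [1e-2, 1e-3]

lenet_grid = list(product(conv_1_filters, conv_2_filters, kernel_sizes, learning_rates))
print(len(lenet_grid))  # 150
```

An exhaustive grid search would therefore need 150 full training runs, which is why a budget-aware method such as Hyperband is attractive even for this small model.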

In [None]:
def lenet_v7_builder(hp):
    conv_1_filters = hp.Int("conv_1_filters", min_value=5, max_value=25, step=5)
    conv_2_filters = hp.Int("conv_2_filters", min_value=20, max_value=40, step=5)

    kernel_size = hp.Choice("kernel", [3,5,10])
    kernel = (kernel_size, kernel_size)

    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3])

    model = LeNet_v7(
        seed=0,
        regularizer=regularizers.l2(hp_learning_rate),
        conv_1_filters=conv_1_filters,
        conv_2_filters=conv_2_filters,
        kernel=kernel
    )
    model.compile(
        optimizer=optimizers.Adam(learning_rate=hp_learning_rate),
        loss = losses.MeanSquaredError(),
        metrics=[metrics.MeanAbsolutePercentageError(name='MAPE')]
        )
    return model
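
The builder compiles the model with mean squared error as the loss and mean absolute percentage error (MAPE) as the metric. As a reading aid, both can be written out in plain Python. This is a sketch of the usual definitions, with a small `eps` guarding against division by zero; the ages below are made up, and the `val_loss` reported by Keras additionally includes the L2 penalty terms.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_percentage_error(y_true, y_pred, eps=1e-7):
    """Average absolute error relative to the true value, in percent."""
    ratios = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical ages in years, for illustration only.
ages_true = [10, 20, 40]
ages_pred = [12, 18, 40]

mse = mean_squared_error(ages_true, ages_pred)
rmse = mse ** 0.5  # back on the scale of years
mape = mean_absolute_percentage_error(ages_true, ages_pred)
print(round(mse, 2), round(rmse, 2), round(mape, 2))
```

Taking the square root puts the loss back on the scale of the target: an MSE near 100, for example, corresponds to a typical error of about 10 years.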

### 3.1.2. Build and start tuner 

In this step we set the folder in which all trials of the hyperparameter search are saved. 

As mentioned before, we use the `Hyperband` class from `keras_tuner`.

In [None]:
# Creates a new folder in hypertune, to save the LeNet-V7 results and tuner.
project_name = "lenet_v7_1"
path_save_tuning_lenet = "/content/drive/MyDrive/FacialAgeProject/models/hypertune/" + project_name

In [None]:
# Initialize the Hyperband tuner.
tuner_lenet = kt.Hyperband(lenet_v7_builder,
                            objective='val_loss',
                            max_epochs=10,
                            factor=3,
                            seed=seed,
                            directory=path_save_tuning_lenet,
                            project_name=project_name)
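
With `max_epochs=10` and `factor=3`, the number of trials is fixed by the Hyperband schedule. The sketch below mirrors the bracket arithmetic as we understand it from the Keras Tuner Hyperband oracle. The formulas are an assumption based on our reading, not documented API behaviour, and `hyperband_schedule` is a hypothetical helper. It does reproduce the 30 trials reported in the search logs of this notebook.

```python
import math


def hyperband_schedule(max_epochs, factor, min_epochs=1):
    """Return a list of brackets; each bracket is a list of (n_models, epochs) rounds."""
    # Count how many times max_epochs can be divided by factor before
    # falling below min_epochs.
    n_brackets = 0
    epochs = max_epochs
    while epochs >= min_epochs:
        epochs /= factor
        n_brackets += 1

    end_size_bracket0 = math.ceil(1 + math.log(max_epochs, factor))
    schedule = []
    for bracket in reversed(range(n_brackets)):  # most aggressive bracket first
        end_size = end_size_bracket0 / (bracket + 1)
        rounds = []
        for rnd in range(bracket + 1):
            n_models = math.ceil(end_size * factor ** (bracket - rnd))
            n_epochs = math.ceil(max_epochs / factor ** (bracket - rnd))
            rounds.append((n_models, n_epochs))
        schedule.append(rounds)
    return schedule


schedule = hyperband_schedule(max_epochs=10, factor=3)
for rounds in schedule:
    print(rounds)
total_trials = sum(n_models for rounds in schedule for n_models, _ in rounds)
print("total trials:", total_trials)
```

Under these assumptions the first bracket starts 12 models for 2 epochs each, continues the best 4 up to epoch 4 and the best 2 up to epoch 10. Only 8 of the 30 trials train for the full 10 epochs.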

In [None]:
# Stop a trial early if the validation loss has not improved for 5 epochs.
stop_early = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Start the hyperparameter search. search() returns None; the results are stored in the tuner.
tuner_lenet.search(
    ds_train,
    validation_data=ds_val,
    epochs=epochs,
    batch_size=batch_size,
    callbacks=[stop_early]
)

Trial 30 Complete [00h 02m 29s]
val_loss: 499.698486328125

Best val_loss So Far: 103.35916137695312
Total elapsed time: 00h 27m 24s
INFO:tensorflow:Oracle triggered exit


The best hyperparameters found for LeNet are returned by the cell below; its output was not recorded in this notebook.

With them we reached a validation loss of about 103.

In [None]:
# .values returns the best hyperparameters as a plain dictionary.
tuner_lenet.get_best_hyperparameters(1)[0].values

## 3.2. AlexNet-V4

In this part of the project we take our best handcrafted AlexNet class and tune its hyperparameters. 

For this class, the search parameters are: 
- the number of filters in the convolutional layers
- the size of the kernels in the convolutional layers
- the size of the strides
- the dropout rate
- the number of units in the dense layers
- the strength of the regularizer applied in the layers: we always use L2 regularization, and its factor is set to the same value as the learning rate being searched.

In [None]:
class AlexNet_v4(tf.keras.Model):
    '''AlexNet-V4 Less Complex Architecture 1 + Dropout Rate (0.25) + L2 Regularization (0.001)'''
    def __init__(self, 
                 regularizer=None, 
                 seed=0,
                 conv1_filter=64,
                 conv2_filter=128,
                 conv3_filter=192,
                 conv4_filter=192,
                 conv5_filter=128,
                 kernel=(3,3),
                 strides=(2,2),
                 dense1_units=2048,
                 dense2_units=2048,
                 dropout_rate=0.25
                 ):
        super().__init__()
        
        # Rescaling.
        self.rescaling = layers.Rescaling(1./255)

        # Convolutional layers (with learnable parameters).
        self.conv1 = layers.Conv2D(filters=conv1_filter, kernel_size=kernel, strides=strides, 
                                    kernel_initializer=initializers.GlorotNormal(seed=seed),
                                    kernel_regularizer=regularizer)
        self.conv2 = layers.Conv2D(filters=conv2_filter, kernel_size=kernel, strides=strides, padding="same",
                                 kernel_initializer=initializers.GlorotNormal(seed=seed),
                                 kernel_regularizer=regularizer)   
        self.conv3 = layers.Conv2D(filters=conv3_filter, kernel_size=kernel, strides=strides, padding="same",
                                 kernel_initializer=initializers.GlorotNormal(seed=seed),
                                 kernel_regularizer=regularizer)
        self.conv4 = layers.Conv2D(filters=conv4_filter, kernel_size=kernel, strides=strides, padding="same",
                                 kernel_initializer=initializers.GlorotNormal(seed=seed),
                                 kernel_regularizer=regularizer)
        self.conv5 = layers.Conv2D(filters=conv5_filter, kernel_size=kernel, strides=strides, padding="same",
                                 kernel_initializer=initializers.GlorotNormal(seed=seed),
                                 kernel_regularizer=regularizer)

        # Batch normalization layers (with learnable parameters, gamma and beta).
        self.bn0 = layers.BatchNormalization()
        self.bn1 = layers.BatchNormalization() 
        self.bn2 = layers.BatchNormalization()
        self.bn3 = layers.BatchNormalization()
        self.bn4 = layers.BatchNormalization()
        self.bn5 = layers.BatchNormalization()
        
        # Classifier's head.
        self.dense1 = layers.Dense(units=dense1_units,
                                   kernel_initializer=initializers.GlorotNormal(seed=seed),
                                   kernel_regularizer=regularizer)
        self.dense2 = layers.Dense(units=dense2_units,
                                   kernel_initializer=initializers.GlorotNormal(seed=seed),
                                   kernel_regularizer=regularizer)
        self.output_layer = layers.Dense(units=1, activation="linear", 
                                   kernel_initializer=initializers.GlorotNormal(seed=seed),
                                   kernel_regularizer=regularizer)
        
        # Non-learnable layers (define only once)
        self.relu = layers.Activation("relu")
        self.maxpool3x3 = layers.MaxPooling2D(pool_size=(3,3), strides=(2,2))
        self.dropout = layers.Dropout(dropout_rate)
        self.flatten = layers.Flatten()

        
    def call(self, inputs):
        # Orderly flows the inputs through the network's components
        x = self.rescaling(inputs)
        x = self.bn0(x)
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool3x3(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)
        x = self.maxpool3x3(x)
        x = self.conv3(x)
        x = self.bn3(x)
        x = self.relu(x)
        x = self.conv4(x)
        x = self.bn4(x)
        x = self.relu(x)
        x = self.conv5(x)
        x = self.bn5(x)
        x = self.relu(x)
        x = self.flatten(x)
        x = self.dense1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.dense2(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.output_layer(x)
        return x

### 3.2.1. Setup tuner

In this step, we will specify the hyperparameters to be tuned and define the search spaces, which determine the range of values that the hyperparameters can take during the tuning process.

As the handcrafted models showed, decreasing the number of filters in the AlexNet architecture improved our metric. We therefore let the tuner reduce the filters in the convolutional layers further: each range starts at a lower value and increases up to an upper bound taken from the handcrafted models. We applied the same strategy to the number of units in the dense layers.

We also tried to vary the stride size, but this was very costly, so we kept the standard value of 2.

Finally, to better control overfitting, we tune the dropout rate over the values 0.05, 0.25 and 0.5.
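
One plausible reason for the cost of smaller strides, offered here as an assumption rather than a measured result, is the size of the feature map that reaches the first dense layer. The sketch below traces the spatial size through the `AlexNet_v4` layers with their default arguments (200x200 input, 3x3 kernels, 128 filters in the last convolution, 2048 units in the first dense layer), using the usual output-size formulas for "valid" and "same" padding. The helper names are hypothetical.

```python
import math


def conv_out(n, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    if padding == "same":
        return math.ceil(n / stride)
    return (n - kernel) // stride + 1  # "valid" padding


def flattened_features(stride, size=200, kernel=3, last_filters=128):
    """Number of values entering the first dense layer for a given conv stride."""
    n = conv_out(size, kernel, stride, "valid")  # conv1 (default padding)
    n = conv_out(n, 3, 2, "valid")               # maxpool3x3, stride 2
    n = conv_out(n, kernel, stride, "same")      # conv2
    n = conv_out(n, 3, 2, "valid")               # maxpool3x3, stride 2
    for _ in range(3):                           # conv3, conv4, conv5
        n = conv_out(n, kernel, stride, "same")
    return n * n * last_filters


for stride in (1, 2):
    features = flattened_features(stride)
    print(f"stride={stride}: {features} features, "
          f"{features * 2048} weights in the first dense layer")
```

With a stride of 2 the first dense layer has about one million weights. With a stride of 1 it would have roughly 600 million, before counting the extra convolution work on the larger feature maps.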

**The search space for the best AlexNet class:**

- number of filters in the first convolutional layer: 32 to 128 in steps of 8.
- number of filters in the second convolutional layer: 64 to 256 in steps of 16.
- number of filters in the third convolutional layer: 96 to 384 in steps of 32.
- number of filters in the fourth convolutional layer: 96 to 384 in steps of 32.
- number of filters in the fifth convolutional layer: 64 to 256 in steps of 16.
- kernel size: (3,3), (4,4) or (5,5).
- stride size: fixed at (2,2).
- units in the first dense layer: 1024 to 2480 in steps of 256.
- units in the second dense layer: 1024 to 2480 in steps of 256.
- dropout rate: 0.05, 0.25 or 0.5.
- learning rate: 0.01, 0.001 or 0.0001 (the same value is also used as the L2 regularization factor).
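
The number of distinct configurations in this search space can be counted from the ranges used in `alexnet_v4_builder`. The sketch assumes that integer ranges include their upper bound when it lies on the step grid, and that only values of the form `min + k * step` are sampled; `n_int_values` is a hypothetical helper.

```python
from math import prod


def n_int_values(min_value, max_value, step):
    """Number of values min, min + step, ... that do not exceed max_value."""
    return len(range(min_value, max_value + 1, step))


alexnet_choices = {
    "conv1_filter": n_int_values(32, 128, 8),
    "conv2_filter": n_int_values(64, 256, 16),
    "conv3_filter": n_int_values(96, 384, 32),
    "conv4_filter": n_int_values(96, 384, 32),
    "conv5_filter": n_int_values(64, 256, 16),
    "kernel": 3,          # 3, 4 or 5
    "strides": 1,         # fixed at 2
    "dense1_units": n_int_values(1024, 2480, 256),
    "dense2_units": n_int_values(1024, 2480, 256),
    "dropout_rate": 3,    # 0.05, 0.25 or 0.5
    "learning_rate": 3,   # 1e-2, 1e-3 or 1e-4
}

alexnet_total = prod(alexnet_choices.values())
print(alexnet_choices)
print(f"{alexnet_total:,} configurations")
```

The 30 trials of one Hyperband run cover only a tiny fraction of these roughly 214 million combinations, so the result should be read as a good configuration found under a small budget, not as the optimum of the search space.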

In [None]:
def alexnet_v4_builder(hp):
    conv1_filter = hp.Int('conv1_filter', min_value=32, max_value=128, step=8)
    conv2_filter = hp.Int('conv2_filter', min_value=64, max_value=256, step=16)
    conv3_filter = hp.Int('conv3_filter', min_value=192 // 2, max_value=192 * 2, step=32)
    conv4_filter = hp.Int('conv4_filter', min_value=192 // 2, max_value=192 * 2, step=32)
    conv5_filter = hp.Int('conv5_filter', min_value=64, max_value=256, step=16)

    kernel_size = hp.Choice("kernel", [3,4,5])
    kernel = (kernel_size, kernel_size)

    stride_size = hp.Choice("strides", [2])
    strides = (stride_size, stride_size)

    dense1_units = hp.Int('dense1_units', min_value=1024, max_value=2480, step=256)
    dense2_units = hp.Int('dense2_units', min_value=1024, max_value=2480, step=256)

    dropout_rate = hp.Choice('dropout_rate', values=[0.05, 0.25, 0.5])
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    tf.keras.backend.clear_session()

    model = AlexNet_v4(
        regularizer=regularizers.l2(hp_learning_rate),
        seed=0,
        conv1_filter=conv1_filter,
        conv2_filter=conv2_filter,
        conv3_filter=conv3_filter,
        conv4_filter=conv4_filter,
        conv5_filter=conv5_filter,
        kernel=kernel,
        strides=strides,
        dense1_units=dense1_units,
        dense2_units=dense2_units,
        dropout_rate=dropout_rate
    )
    model.compile(
        optimizer=optimizers.Adam(learning_rate=hp_learning_rate),
        loss = losses.MeanSquaredError(),
        metrics=[metrics.MeanAbsolutePercentageError(name='MAPE')]
        )
    return model

### 3.2.2. Build and start tuner 

In this step we set the folder in which all trials of the hyperparameter search are saved. 

As mentioned before, we use the `Hyperband` class from `keras_tuner`.

In [None]:
# Creates a new folder in hypertune, to save the AlexNet-V4 results and tuner.
project_name = "alexnet_v4"
path_save_tuning_alexnet = "/content/drive/MyDrive/FacialAgeProject/models/hypertune/" + project_name

In [None]:
# Initialize the Hyperband tuner.
tuner_alexnet = kt.Hyperband(alexnet_v4_builder,
                            objective='val_loss',
                            max_epochs=10,
                            factor=3,
                            seed=seed,
                            directory=path_save_tuning_alexnet,
                            project_name=project_name)

In [None]:
# Stop a trial early if the validation loss has not improved for 5 epochs.
stop_early = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Start the hyperparameter search. search() returns None; the results are stored in the tuner.
tuner_alexnet.search(
    ds_train,
    validation_data=ds_val,
    epochs=epochs,
    batch_size=batch_size,
    callbacks=[stop_early]
)

Trial 30 Complete [00h 02m 15s]
val_loss: 84.8982162475586

Best val_loss So Far: 82.37137603759766
Total elapsed time: 00h 27m 53s
INFO:tensorflow:Oracle triggered exit


The best hyperparameters found for AlexNet are returned by the cell below; its output was not recorded in this notebook.

With them we reached a validation loss of about 82, the best of all our models.

In [None]:
# .values returns the best hyperparameters as a plain dictionary.
tuner_alexnet.get_best_hyperparameters(1)[0].values

## 3.3. Train best models on full training dataset

### 3.3.1. Create datasets

We rebuild the generators so that the full training set is used for training, without a validation split. The test set is used as validation data during this final training.

In [None]:
# Data generators and parameters
train_generator = tf.keras.preprocessing.image.ImageDataGenerator()

test_generator = tf.keras.preprocessing.image.ImageDataGenerator()

generate_params = {
    'target_size' : (200,200),
    'color_mode' : 'rgb',
    'class_mode' : 'raw',
    'batch_size' : batch_size,
    'seed' : seed
}

In [None]:
train_images = train_generator.flow_from_dataframe(
    dataframe=train_images_df,
    x_col='file_name',
    y_col='age_label',
    shuffle=True,
    **generate_params
)

test_images = test_generator.flow_from_dataframe(
    dataframe=test_images_df,
    x_col='file_name',
    y_col='age_label',
    shuffle=True,
    **generate_params
)

Found 7008 validated image filenames.
Found 1752 validated image filenames.


In [None]:
# Define the save path of the models.
path_save_model =  "/content/drive/MyDrive/FacialAgeProject/models/best_models"

### 3.3.2. Train LeNet-V7

In [None]:
# Get the best hyperparameters and build model.
best_hps_lenet = tuner_lenet.get_best_hyperparameters(1)[0]
best_lenet = tuner_lenet.hypermodel.build(best_hps_lenet)

In [None]:
history_lenet = train_best_model(model=best_lenet,
                                   training=train_images,
                                   test = test_images,
                                   epochs=epochs,
                                   batch_size=batch_size)

Start training of LeNet-V7 Enhanced Architecture + L2 Regularization (0.001) + `Adam` optimizer 
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Training time: 151.0221s



In [None]:
# Saves the best model in the TensorFlow SavedModel format (a directory).
history_lenet.model.save(filepath=path_save_model + "/tuned_LeNet-V7", 
                    overwrite=True, 
                    save_format="tf")

print(f"LeNet saved successfully into {path_save_model}")



INFO:tensorflow:Assets written to: C:/Users/jkick/OneDrive - NOVAIMS/Sem. 2/Deep Learning/Project/FacialAgeProject/models/best_models/LeNet-V7\assets


LeNet saved successfully into C:/Users/jkick/OneDrive - NOVAIMS/Sem. 2/Deep Learning/Project/FacialAgeProject/models/best_models


In [None]:
# Save the history dictionary of our best model.
name_file = 'LeNet-V7-history.pkl'
save_object = history_lenet.history
save_path = os.path.join(path_save_model, "tuned_LeNet-V7", name_file)
with open(save_path, 'wb') as fp:
    pickle.dump(save_object, fp)
    print(f'{name_file} saved successfully into {path_save_model}')

LeNet-V7-history.pkl saved successfully into C:/Users/jkick/OneDrive - NOVAIMS/Sem. 2/Deep Learning/Project/FacialAgeProject/models/best_models
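
To reuse a saved history later, for example to plot learning curves, the pickle file can be loaded back into a dictionary. The sketch below shows the round trip with a temporary directory and a placeholder history. The loss values are made up for illustration and are not results from this notebook.

```python
import os
import pickle
import tempfile

# Placeholder with the same structure as History.history; values are made up.
history_dict = {"loss": [150.0, 120.0], "val_loss": [160.0, 130.0]}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "LeNet-V7-history.pkl")

    # Save the dictionary, as in the cell above.
    with open(demo_path, "wb") as fp:
        pickle.dump(history_dict, fp)

    # Load it back.
    with open(demo_path, "rb") as fp:
        loaded_history = pickle.load(fp)

# Index of the epoch with the lowest validation loss (0-indexed).
val_loss = loaded_history["val_loss"]
best_epoch_index = min(range(len(val_loss)), key=val_loss.__getitem__)
print(loaded_history, best_epoch_index)
```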


### 3.3.3. Train AlexNet-V4

In [None]:
# Get the best hyperparameters and build model.
best_hps_alexnet = tuner_alexnet.get_best_hyperparameters(1)[0]
best_alexnet = tuner_alexnet.hypermodel.build(best_hps_alexnet)

In [None]:
history_alexnet = train_best_model(model=best_alexnet,
                                   training=train_images,
                                   test = test_images,
                                   epochs=epochs,
                                   batch_size=batch_size)

Start training of AlexNet-V9 Less Complex Architecture 1 + Dropout Rate (0.25) + L2 Regularization (0.001)
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Training time: 177.9209s



In [None]:
# Saves the best model in the TensorFlow SavedModel format (a directory).
history_alexnet.model.save(filepath=path_save_model + "/tuned_AlexNet-V4", 
                    overwrite=True, 
                    save_format="tf")

print(f"AlexNet saved successfully into {path_save_model}")



INFO:tensorflow:Assets written to: C:/Users/jkick/OneDrive - NOVAIMS/Sem. 2/Deep Learning/Project/FacialAgeProject/models/best_models/AlexNet-V9\assets


AlexNet saved successfully into C:/Users/jkick/OneDrive - NOVAIMS/Sem. 2/Deep Learning/Project/FacialAgeProject/models/best_models


In [None]:
# Save the history dictionary of our best model.
name_file = 'AlexNet-V4-history.pkl'
save_object = history_alexnet.history
save_path = os.path.join(path_save_model, "tuned_AlexNet-V4",  name_file)
with open(save_path, 'wb') as fp:
    pickle.dump(save_object, fp)
    print(f'{name_file} saved successfully into {path_save_model}')

AlexNet-V9-history.pkl saved successfully into C:/Users/jkick/OneDrive - NOVAIMS/Sem. 2/Deep Learning/Project/FacialAgeProject/models/best_models


# 4. Conclusion

Hyperparameter tuning is a crucial aspect of building high-performing machine learning models. As we observed during the tuning phase, finding the best solution can be time-consuming and computationally costly.

Another important lesson was to limit the search space to the parameters most relevant to the problem at hand. This reduced the search space and let the tuning finish in the time available. 

Tuning the hyperparameters of our two best-performing models, LeNet and AlexNet, led to only a slight improvement in their performance. This shows that hyperparameter tuning alone cannot solve the problems of a machine learning project. In this case it is necessary to explore other strategies to improve the models, such as different architectures or different approaches like transfer learning.