# Fine-Tuning

To improve model performance, I turned to fine-tuning: adjusting the parameters of a pre-trained model so that it better fits a specific dataset or task.

I took a structured approach to model creation and hyperparameter optimization. Using the Keras Tuner library, I wrote a custom model-building function that constructs networks with a tunable number of dense layers, each optionally regularized with an L1 or L2 penalty. This function lets the tuner explore the hyperparameter space and identify a model architecture suited to the CIFAR-10 dataset.
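To make the two penalties concrete, here is a minimal pure-Python sketch of what an L1 and an L2 kernel regularizer add to the loss. It follows the documented behaviour of `keras.regularizers.l1` and `keras.regularizers.l2` (the rate times the sum of absolute or squared weights). The helper names and the toy weights are invented for illustration and are not part of the notebook.

```python
def l1_penalty(weights, rate):
    """L1 penalty: rate * sum(|w|). Tends to push weights to exactly zero."""
    return rate * sum(abs(w) for w in weights)


def l2_penalty(weights, rate):
    """L2 penalty: rate * sum(w^2). Shrinks large weights the most."""
    return rate * sum(w * w for w in weights)


# Hypothetical flattened kernel of a small dense layer.
toy_weights = [0.5, -1.0, 2.0]

l1 = l1_penalty(toy_weights, rate=0.01)  # 0.01 * (0.5 + 1.0 + 2.0)
l2 = l2_penalty(toy_weights, rate=0.01)  # 0.01 * (0.25 + 1.0 + 4.0)
print(f"L1 penalty: {l1:.4f}, L2 penalty: {l2:.4f}")
```

The penalty is added to the training loss, so a larger rate trades fit on the training data for smaller weights.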

To fine-tune the network, I froze the first 633 layers of the InceptionResNetV2 base model and left the remaining layers trainable. This selective freezing preserves the pre-learned representations in the lower layers while allowing the higher-level features to adapt to the new task. The aim is to balance the generic features learned from ImageNet against the specifics of CIFAR-10.
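The freezing step can be checked on a small scale before applying it to a large base model. The sketch below is an illustration under assumptions, not the notebook's code: `freeze_first_layers` and the three-layer `toy_model` are hypothetical stand-ins, and the cutoff of 2 plays the role that 633 plays for InceptionResNetV2.

```python
from tensorflow import keras


def freeze_first_layers(model, cutoff):
    """Freeze model.layers[:cutoff], unfreeze the rest, return both counts."""
    for layer in model.layers[:cutoff]:
        layer.trainable = False
    for layer in model.layers[cutoff:]:
        layer.trainable = True
    frozen = sum(not layer.trainable for layer in model.layers)
    return frozen, len(model.layers) - frozen


# Hypothetical stand-in for a pre-trained base: three small dense layers.
toy_model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu", name="low_level"),
    keras.layers.Dense(16, activation="relu", name="mid_level"),
    keras.layers.Dense(4, activation="softmax", name="high_level"),
])

frozen, trainable = freeze_first_layers(toy_model, cutoff=2)
print(f"{frozen} frozen, {trainable} trainable")
# Only the last layer's kernel and bias are left for the optimizer to update.
print(len(toy_model.trainable_weights), "trainable weight tensors")
```

Counting `trainable_weights` after freezing is a quick way to confirm that the optimizer will only touch the layers you intended.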

Hyperparameter optimization guided the search for a good model configuration. The Hyperband algorithm explores the hyperparameter space efficiently: it trains many candidate configurations for a few epochs, then spends further epochs only on those with the best validation accuracy. Through this process, I identified the best combination of hyperparameters, including the number of units in the dense layers, the learning rate, the dropout rate, and the regularization strength.
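The idea behind Hyperband can be sketched with a single successive-halving bracket, the building block that Hyperband repeats with different starting budgets. This is a simplified illustration, not Keras Tuner's actual scheduling: `successive_halving` and `toy_score` are hypothetical names, and `toy_score` is an invented stand-in for validation accuracy whose peak is placed by hand, so no model is trained.

```python
import math
from itertools import product


def successive_halving(configs, evaluate, min_epochs=1, max_epochs=10, factor=3):
    """Run one successive-halving bracket.

    Returns the best surviving config and the (num_configs, epochs) schedule.
    """
    survivors = list(configs)
    epochs = min_epochs
    schedule = []
    while True:
        schedule.append((len(survivors), epochs))
        # Rank the survivors at the current budget (higher score is better).
        ranked = sorted(survivors, key=lambda c: evaluate(c, epochs), reverse=True)
        if len(ranked) == 1 or epochs >= max_epochs:
            return ranked[0], schedule
        # Keep roughly the top 1/factor and give them factor times the budget.
        survivors = ranked[:max(1, len(ranked) // factor)]
        epochs = min(max_epochs, epochs * factor)


def toy_score(config, epochs):
    """Invented stand-in for validation accuracy; ignores the epoch budget."""
    lr_gap = abs(math.log10(config["learning_rate"]) + 4)  # peak at 1e-4
    dropout_gap = abs(config["dropout"] - 0.3)             # peak at 0.3
    return -(lr_gap + dropout_gap)


candidates = [
    {"learning_rate": lr, "dropout": dr}
    for lr, dr in product([1e-2, 1e-3, 1e-4], [0.1, 0.3, 0.5])
]
best, schedule = successive_halving(candidates, toy_score)
print(best)
print(schedule)
```

With nine candidates and `factor=3`, the bracket evaluates nine configurations at 1 epoch, three at 3 epochs, and one at 9 epochs, so most of the budget goes to the candidates that looked best early on.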

After identifying the best hyperparameters, I rebuilt the model with them and trained it for 10 epochs. The trained model showed improved accuracy on both the training and validation data; the best trial in the search reached a validation accuracy of 0.9447. Note that the CIFAR-10 test split serves as the validation set throughout. I then saved the trained model to disk so that it is available for later deployment and inference.

In summary, fine-tuning adapts a pre-trained neural network to a new task or dataset. By combining selective layer freezing, hyperparameter optimization, and a final training run, I improved the performance of the InceptionResNetV2 model on the CIFAR-10 dataset.

In [7]:
import tensorflow as tf
from tensorflow import keras
import keras_tuner as kt
import tensorflow.keras as K
import numpy as np

# Scale pixels to [-1, 1] for InceptionResNetV2 and one-hot encode the 10 class labels
def preprocess_data(X, Y):
    X = tf.keras.applications.inception_resnet_v2.preprocess_input(X)
    y = tf.keras.utils.to_categorical(Y, 10)
    return X, y

# Load CIFAR-10
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Preprocess the CIFAR-10 train and test splits
X_train, y_train = preprocess_data(X_train, y_train)
X_test, y_test = preprocess_data(X_test, y_test)

# Model-building function for Keras Tuner
def model_builder(hp):
    base_model = keras.applications.InceptionResNetV2(weights='imagenet',
                                                       include_top=False,
                                                       input_shape=(299, 299, 3))
    
    
    # Freeze the first 633 layers; leave the remaining layers trainable
    for layer in base_model.layers[:633]:
        layer.trainable = False

    for layer in base_model.layers[633:]:
        layer.trainable = True
    
    # Add one dense layer whose kernel regularizer is chosen by the tuner.
    # The hyperparameter names are shared, so every dense layer in a trial
    # uses the same number of units, regularization type, and rate.
    def add_regularization(layer, hp):
        reg_type = hp.Choice('regularization_type', ['l2', 'l1', 'none'])
        if reg_type == 'l2':
            regularizer = keras.regularizers.l2(hp.Choice('l2_rate', [0.001, 0.01, 0.1]))
        elif reg_type == 'l1':
            regularizer = keras.regularizers.l1(hp.Choice('l1_rate', [0.001, 0.01, 0.1]))
        else:
            regularizer = None
        return keras.layers.Dense(
            units=hp.Int('units', min_value=32, max_value=512, step=32),
            activation='relu',
            kernel_regularizer=regularizer,
        )(layer)
        
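    # Upscale the 32x32 CIFAR-10 images to the 299x299 input the base model expects
    inputs = keras.Input(shape=(32, 32, 3))
    resized = keras.layers.Lambda(lambda image: tf.image.resize(image, (299, 299)))(inputs)

    # Run the base model in inference mode so its BatchNormalization layers
    # keep their ImageNet statistics, then pool the feature maps
    x = base_model(resized, training=False)
    x = keras.layers.GlobalAveragePooling2D()(x)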

    # Add regularized dense layers
    for _ in range(hp.Int('num_layers', 1, 5)):
        x = add_regularization(x, hp)
        x = keras.layers.Dropout(hp.Float('dropout', min_value=0.0, max_value=0.5, step=0.1))(x)
    
    outputs = keras.layers.Dense(10, activation='softmax')(x)

    model = keras.Model(inputs, outputs)

    # Compile the model
    model.compile(optimizer=keras.optimizers.Adam(
                    hp.Choice('learning_rate',
                              values=[1e-2, 1e-3, 1e-4, 1e-5])),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model

# Instantiate the tuner and perform hyperparameter search
tuner = kt.Hyperband(model_builder,
                     objective='val_accuracy',
                     max_epochs=10,
                     factor=3,
                     directory='my_dir',
                     project_name='intro_to_kt')

# Search for the best hyperparameters
tuner.search(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

# Get the best hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

# Print the best hyperparameters
print("Best Hyperparameters:")
print("Number of units in the first densely-connected layer:", best_hps.get('units'))
print("Learning rate:", best_hps.get('learning_rate'))
print("Dropout rate:", best_hps.get('dropout'))
print("L2 regularization rate:", best_hps.get('l2_rate'))

# Build the model with the optimal hyperparameters and train it
model = tuner.hypermodel.build(best_hps)
history = model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

# save best model
model.save('cifar10_best.h5')


Trial 26 Complete [01h 08m 37s]
val_accuracy: 0.9447000026702881

Best val_accuracy So Far: 0.9447000026702881
Total elapsed time: 11h 58m 20s

The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is 288 and the optimal learning rate for the optimizer
is 0.0001.

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [8]:
print("Best Hyperparameters:")
print("Number of units in the first densely-connected layer:", best_hps.get('units'))
print("Learning rate:", best_hps.get('learning_rate'))
print("Dropout rate:", best_hps.get('dropout'))
print("L2 regularization rate:", best_hps.get('l2_rate'))

Best Hyperparameters:
Number of units in the first densely-connected layer: 288
Learning rate: 0.0001
Dropout rate: 0.30000000000000004
L2 regularization rate: 0.001
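
The dropout rate prints as 0.30000000000000004 rather than 0.3 because 0.1 has no exact binary floating-point representation. Assuming the tuner builds the value from multiples of the 0.1 step (an assumption, not something the notebook confirms), the same number can be reproduced in plain Python, along with one way to tidy it for display:

```python
step = 0.1
dropout = 3 * step          # 0.1 cannot be stored exactly in binary
print(dropout)              # 0.30000000000000004
print(dropout == 0.3)       # False
print(round(dropout, 1))    # 0.3
```

The difference is far too small to affect training; rounding only matters when reporting the value.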
