# Lab 5: Hyperparameter tuning, transfer learning, and fine-tuning


The focus of this lab is on improving the results of a model. In particular, we are interested in exploring how using a pre-trained model can help improve results when we do not have a lot of data.

Grading
Individual grades will be assigned for this lab as Part 1 is not a group activity.

For Parts 2-4, marks will be deducted for any extraneous code.

What to submit
One zipped file (NOT .rar) containing:

- A copy of this notebook with:
  - Error-free code in Python/Keras
  - All code cells executed and output visible
- Images of each group member's Coursera certificate
An image (jpeg or png) in the same directory as your notebook can be embedded using the following code in a markdown cell:
<img src="my_image.jpg" width=600 align="center">

# Part 1: Learning Keras Tuner (25 marks)

Each group member should successfully complete the Coursera course Hyperparameter Tuning with Keras Tuner and embed an image of their course completion certificate in the cell below.

This is a free course that only requires registration (also free) with Coursera.

NOTE: you may need to use `import keras_tuner` instead of `import kerastuner` as noted in the course files.

In [1]:
# import image module
from IPython.display import Image
  
# get the image
Image(url="c1.jpg", width=400, height=400)

In [2]:
Image(url="c2.jpg", width=400, height=400)

# Part 2: Hyperparameter tuning (25 marks)

Apply what you have learned in Part 1 to tune a convolutional neural network that you create for the Fashion MNIST data set. Make sure that:

- your search includes at least 4 hyperparameters,
- you print out the results of your search,
- you print a summary of the best model, and
- you quote the test accuracy of the best model.

In [3]:
pip install keras-tuner

Note: you may need to restart the kernel to use updated packages.


In [4]:
import keras_tuner
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# Fashion MNIST: 28x28 grayscale images in 10 classes
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

def create_model(hp):
    if hp:
        dropout_rate = hp.Float('dropout_rate', min_value=0.1, max_value=0.5)
        num_units = hp.Choice('num_units', values=[8, 16, 32])
        learning_rate = hp.Float('learning_rate', min_value=0.0001, max_value=0.1)
        num_hidden_layers = hp.Choice('num_hidden_layers', values=[1, 2, 3])
    else:
        dropout_rate = 0.1
        num_units = 8
        learning_rate = 1e-4
        num_hidden_layers = 1
    
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
    model.add(tf.keras.layers.Lambda(lambda x: x/255.))
    
    for _ in range(0, num_hidden_layers):
        model.add(tf.keras.layers.Dense(num_units, activation='relu'))
        model.add(tf.keras.layers.Dropout(dropout_rate))
    
    model.add(tf.keras.layers.Dense(10, activation='softmax'))

    model.compile(
        loss='sparse_categorical_crossentropy',
        optimizer=tf.keras.optimizers.Adam(learning_rate),
        metrics=['accuracy']
    )
    
    return model

In [5]:
create_model(None).summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 lambda (Lambda)             (None, 784)               0         
                                                                 
 dense (Dense)               (None, 8)                 6280      
                                                                 
 dropout (Dropout)           (None, 8)                 0         
                                                                 
 dense_1 (Dense)             (None, 10)                90        
                                                                 
Total params: 6,370
Trainable params: 6,370
Non-trainable params: 0
_________________________________________________________________
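The parameter counts in the summary above can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The following is a minimal sketch of that arithmetic; the helper name `dense_params` is introduced here for illustration and is not a Keras function.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


flattened = 28 * 28                     # Flatten output: 784 features
hidden = dense_params(flattened, 8)     # dense:   784 -> 8
output = dense_params(8, 10)            # dense_1: 8 -> 10
total = hidden + output                 # Flatten, Lambda and Dropout add no parameters

print(hidden, output, total)            # 6280 90 6370
```

These values match the `Param #` column and the `Total params` line of the summary.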


In [6]:
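class CustomTuner(keras_tuner.tuners.BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # Tune the batch size alongside the model hyperparameters
        kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 32, 128, step=32)
        # Return the result so that newer keras_tuner versions can record the trial
        return super().run_trial(trial, *args, **kwargs)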
        

In [7]:
tuner = CustomTuner(
    create_model,
    objective='val_accuracy',
    max_trials=20,
    directory='logs',
    project_name='fashion_mnist',
    overwrite=True,
)

In [8]:
tuner.search_space_summary(1)

Search space summary
Default search space size: 4
dropout_rate (Float)
{'default': 0.1, 'conditions': [], 'min_value': 0.1, 'max_value': 0.5, 'step': None, 'sampling': None}
num_units (Choice)
{'default': 8, 'conditions': [], 'values': [8, 16, 32], 'ordered': True}
learning_rate (Float)
{'default': 0.0001, 'conditions': [], 'min_value': 0.0001, 'max_value': 0.1, 'step': None, 'sampling': None}
num_hidden_layers (Choice)
{'default': 1, 'conditions': [], 'values': [1, 2, 3], 'ordered': True}
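To make the search space above concrete, here is a minimal sketch that draws a few configurations from it using only the standard library. Uniform random sampling is a stand-in chosen for illustration: the tuner in this notebook uses Bayesian optimization, which picks trials differently. The function name `sample_config` is hypothetical. The `batch_size` entry mirrors the `Int('batch_size', 32, 128, step=32)` call in `CustomTuner.run_trial`, which is registered only when a trial runs and therefore does not appear in the summary.

```python
import random


def sample_config(rng: random.Random) -> dict:
    """Draw one configuration from the search space listed above."""
    return {
        "dropout_rate": rng.uniform(0.1, 0.5),
        "num_units": rng.choice([8, 16, 32]),
        # Linear sampling, as in the summary ('sampling': None)
        "learning_rate": rng.uniform(0.0001, 0.1),
        "num_hidden_layers": rng.choice([1, 2, 3]),
        # Added per trial by CustomTuner.run_trial: 32, 64, 96 or 128
        "batch_size": rng.choice(list(range(32, 128 + 1, 32))),
    }


rng = random.Random(0)  # fixed seed so the draws are repeatable
configs = [sample_config(rng) for _ in range(3)]
for config in configs:
    print(config)
```

Because the learning rate is sampled linearly between 0.0001 and 0.1, most draws land in the upper part of that range; a logarithmic scale would spread trials more evenly across orders of magnitude.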


In [9]:
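# Run the search (validation_split and epochs are assumed values), then print the results
tuner.search(x_train, y_train, validation_split=0.2, epochs=5, verbose=0)
tuner.results_summary()

In [10]:
# Summarise the best model and report its accuracy on the test set
model = tuner.get_best_models(num_models=1)[0]
model.summary()
model.evaluate(x_test, y_test)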

# Part 3: Transfer learning (25 marks)

In [11]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import plot_model

In [12]:
# Augment only the training images; validation and test images are just rescaled
train = ImageDataGenerator(rescale=1./255., rotation_range=40)
validate = ImageDataGenerator(rescale=1./255.)
test = ImageDataGenerator(rescale=1./255.)

In [13]:
# Create batch iterators over the image folders; 'sparse' yields integer class labels
train_data = train.flow_from_directory("C:/Users/santo/Downloads/images/train",
                                       target_size=(180, 180),
                                       batch_size=32,
                                       class_mode='sparse')
validation_data = validate.flow_from_directory("C:/Users/santo/Downloads/images/validate",
                                               target_size=(180, 180),
                                               batch_size=32,
                                               class_mode='sparse')
# NOTE: the output below reports 11 classes here but 5 for train and validate,
# so the test folder layout needs checking before the evaluation can be trusted
testing_data = test.flow_from_directory("C:/Users/santo/Downloads/images/test",
                                        target_size=(180, 180),
                                        batch_size=32,
                                        class_mode='sparse')

Found 350 images belonging to 5 classes.
Found 50 images belonging to 5 classes.
Found 200 images belonging to 11 classes.


In [14]:
train_data.class_indices

{"'black cat'": 0,
 "'black dog'": 1,
 "'brown cat'": 2,
 "'brown dog'": 3,
 'cat': 4}
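The mapping above has five classes, so the generators yield integer labels from 0 to 4. The following is a minimal sketch, in plain Python rather than Keras, of why binary cross-entropy is only meaningful for 0/1 labels, whereas sparse categorical cross-entropy handles class indices directly. The probability values are made up for illustration.

```python
import math


def binary_crossentropy(y_true: float, p: float) -> float:
    """Binary cross-entropy for one sample; only meaningful when y_true is 0 or 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def sparse_categorical_crossentropy(label: int, probs: list) -> float:
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[label])


# Label 3 ('brown dog' in the mapping above) fed to a single sigmoid output of 0.9
bad = binary_crossentropy(3, 0.9)

# The same label against a hypothetical 5-way softmax output
probs = [0.05, 0.05, 0.10, 0.70, 0.10]
good = sparse_categorical_crossentropy(3, probs)

print(f"binary: {bad:.3f}, sparse categorical: {good:.3f}")
```

With a label outside {0, 1} the binary formula returns a negative "loss", which gives the optimizer nothing sensible to minimise. A softmax output with one unit per class, paired with the sparse categorical loss, stays non-negative and reaches zero only when the true class gets probability 1.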

In [15]:
# Create a small CNN baseline with one softmax output per class
num_classes = len(train_data.class_indices)
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(180, 180, 3)),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPool2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax'),
])

In [16]:
# Compile with the Adam optimizer; integer class labels call for the sparse categorical loss
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [17]:
# Fit the model
model_fit = model.fit(train_data,
                      steps_per_epoch = 3,
                      epochs = 3,
                      validation_data= validation_data)

Epoch 1/3
Epoch 2/3
Epoch 3/3
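A quick check of how much data each epoch above actually covers, using the image count reported by `flow_from_directory` and the `batch_size` and `steps_per_epoch` values from the cells above. This is only arithmetic, not part of the training code.

```python
import math

num_train_images = 350   # from the flow_from_directory output
batch_size = 32
steps_per_epoch = 3      # value passed to model.fit above

# Batches needed to see every training image once
batches_per_full_pass = math.ceil(num_train_images / batch_size)

# Upper bound on the images drawn in one epoch with the current setting
images_seen_per_epoch = min(steps_per_epoch * batch_size, num_train_images)

print(batches_per_full_pass, images_seen_per_epoch)   # 11 96
```

With `steps_per_epoch = 3`, each epoch draws at most 96 of the 350 training images. Leaving `steps_per_epoch` unset lets Keras run one full pass (11 batches here) per epoch.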


In [18]:
scores=model.evaluate(testing_data)



For this part, you can choose any pre-trained network available in keras.applications, except VGG16 or VGG19. Be sure to verify that the chosen network can be used for classification. Following what we did in class, apply this model, with data augmentation, to the data you created for Lab 4. Compare the accuracy on test data to what you achieved in Lab 4.

In [19]:
# Augment the dataset with RandomRotation, to vary the angle from which an image is viewed, and RandomZoom, to vary how closely it is viewed
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

In [22]:
# Get a ResNet50 model

from tensorflow.keras.applications import ResNet50

base_model = ResNet50(input_shape=(180, 180,3), include_top=False, weights="imagenet")
for layer in base_model.layers:
    layer.trainable = False

In [23]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

# Stack a new classification head on the frozen ResNet50 base from the previous cell
num_classes = len(train_data.class_indices)
resnet_model = Sequential([
    tf.keras.Input(shape=(180, 180, 3)),
    base_model,
    GlobalAveragePooling2D(),
    Dense(num_classes, activation='softmax'),
])

In [27]:
resnet_model.summary()

In [None]:
# Integer class labels with a softmax head call for the sparse categorical loss
resnet_model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss='sparse_categorical_crossentropy', metrics=['acc'])
resnet_model.fit(train_data, validation_data=validation_data, steps_per_epoch=10, epochs=5)

In [None]:
resnet_model.evaluate(testing_data)

# Part 4: Fine-tuning (25 marks)

Following what we did in class, fine-tune the model that you trained in Part 3. Compare the accuracy on test data to what you achieved in Part 3.

In [None]:
# Unfreeze all layers and recompile with a much smaller learning rate, reusing the Part 3 loss
resnet_model.trainable = True
resnet_model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.0001), loss=resnet_model.loss, metrics=['acc'])
resnet_model.fit(train_data, validation_data=validation_data, epochs=5)
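The fine-tuning cell above lowers the SGD learning rate from 0.01 (Part 3) to 0.0001. The following is a minimal sketch of the plain SGD update rule showing what that change means for a single weight. The weight and gradient values are hypothetical, and momentum is ignored.

```python
def sgd_step(weight: float, gradient: float, learning_rate: float) -> float:
    """One plain SGD update: w <- w - learning_rate * gradient."""
    return weight - learning_rate * gradient


pretrained_weight = 0.50   # hypothetical pre-trained value
gradient = 2.0             # hypothetical gradient on the new data

after_part3_lr = sgd_step(pretrained_weight, gradient, 0.01)
after_part4_lr = sgd_step(pretrained_weight, gradient, 0.0001)

shift_part3 = abs(after_part3_lr - pretrained_weight)
shift_part4 = abs(after_part4_lr - pretrained_weight)

print(shift_part3, shift_part4)
```

For the same gradient, the Part 4 step is about 100 times smaller, so the pre-trained weights are adjusted gently rather than overwritten by a few large updates on a small data set.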