<a href="https://colab.research.google.com/github/adeabio21/ADENIKE/blob/main/ADEYEMI_ADENIKE%5D_%5BID%5D_ADS2_Assignment_1_Deep_Learning_With_Keras.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Assignment 1 - Deep Learning With Keras - 40%

**IMPORTANT NOTE**: By default, this notebook is set to a CPU runtime, to help prevent you getting locked out of Google Colab. When training your models, you will need to switch to a GPU runtime, otherwise the training will take a very long time.

**Deadline**: 21 Mar 2023, 23:59

**Submission Requirements**: You should submit your Colab Notebook, with all the outputs printed, and a sharing link to the notebook in your Drive. You should also submit a 2-page report in PDF or DOCX format.

**Learning Outcomes**

This Assignment assesses the following module Learning Outcomes (from Definitive Module Document):

* have knowledge of and understand how GPUs can accelerate data processing
* be able to write data processing pipelines that exploit Tensorflow
* have knowledge of and understand how to develop GPU-accelerated data processing pipelines using the Tensorflow and RAPIDS frameworks

**Assignment Details**

This assignment will test your ability to implement and test a deep neural network using Keras. By varying the properties of the model and the data input pipeline, you will explore and analyse the training time and performance of the model. There will be four tasks for you to complete, each of which requires you to complete a range of tests on the model, visualise the results, and comment on them in a short report. Your report should focus on explaining and critically analysing the results: you will be assessed not just on your ability to show what is happening, but also on your ability to explain WHY it is happening.

All coding work for this assignment should be done inside a copy of the Colab Notebook provided on this page. Any submissions not in this format will not be marked.

**Task 1**: A model description is provided in the Colab Notebook for this assignment. Implement this model, ensuring that you have the correct output shapes for each of the layers and the correct number of model parameters. Train the model on the dataset provided in the notebook—initial training settings are provided also. Create plots of the losses and metrics of the training and validation data, and a plot that shows example images from each class that have been correctly AND incorrectly labelled by the model. Analyse these results in your report.

**Task 2**: Select two additional optimizers. Including the one provided in the initial training settings, test your model with each of these optimizers using a range of different learning rates. You may need to train the model for more epochs to ensure that it converges on a solution. Create plots that show the losses and metrics for each of these runs, and comment on the results in your report. Select the optimizer and learning rate that provided the best results, and move on to the next task.

**Task 3**: The batch size can heavily influence the amount of time it takes to train a model. Vary the batch size used to train the model and, utilising the Early Stopping callback provided, create plots that show how the time per epoch and total training time changes. Comment on these results in your report.

**Task 4**: The model as provided does not contain any regularisation techniques. Edit the model architecture to include at least two examples of regularisation. Retrain the model using the new architecture, and repeat the analysis performed in task 1. In your report, compare and contrast the results from this task, with those from the initial model configuration.



In [None]:
# Module Imports - Add any additional modules here
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
# Import everything from tensorflow.keras so that the standalone `keras`
# package and the copy bundled with TensorFlow are not mixed.
from tensorflow.keras import (layers, models, optimizers, losses, callbacks,
                              regularizers)
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.optimizers import SGD, Adam

In [None]:
# Loading the Dataset. Here we use the CIFAR-10 dataset of labelled images

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Rescale the pixel values
x_train = x_train/255.
x_test = x_test/255.

# List of label names
class_names = ['plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


# Task 1 - Initial Model

Implement the model architecture detailed below, using the Keras Functional API, ensuring that you have the correct output shapes for each of the layers.

Train the model on the CIFAR-10 dataset.

Create plots of the losses and metrics of the training and validation data, and plots that show example test images from each class that have been correctly AND incorrectly labelled by the model.

Analyse these results in your report.

**Model Architecture**

A summary of the model architecture is given here, which shows the layers of the model, the output shapes of those layers, and the activation functions used. You will need to work out the other settings used to produce the model, such as the kernel sizes, padding schemes, and stride lengths. You should ensure that the output shapes and total number of parameters in your model match the summary here.

```
Model: "cifar_model"
_________________________________________________________________
 Layer (type)                Output Shape              Activation   
=================================================================
 Input (InputLayer)          [(None, 32, 32, 3)]       None         
                                                                 
 conv_1 (Conv2D)             (None, 32, 32, 16)        ReLU       
                                                                 
 conv_2 (Conv2D)             (None, 32, 32, 16)        ReLU      
                                                                 
 pool_1 (MaxPooling2D)       (None, 16, 16, 16)        None         
                                                                 
 conv_3 (Conv2D)             (None, 16, 16, 32)        ReLU      
                                                                 
 conv_4 (Conv2D)             (None, 16, 16, 32)        ReLU      
                                                                 
 pool_2 (MaxPooling2D)       (None, 8, 8, 32)          None         
                                                                 
 conv_5 (Conv2D)             (None, 8, 8, 64)          ReLU     
                                                                 
 conv_6 (Conv2D)             (None, 8, 8, 64)          ReLU     
                                                                 
 pool_3 (MaxPooling2D)       (None, 4, 4, 64)          None         
                                                                 
 flat (Flatten)              (None, 1024)              None         
                                                                 
 fc_1 (Dense)                (None, 512)               ReLU    
                                                                 
 Output (Dense)              (None, 10)                SoftMax      
                                                                 
=================================================================
Total params: 602,010
Trainable params: 602,010
Non-trainable params: 0
_________________________________________________________________
```
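
One way to check a candidate set of kernel sizes against this summary is to count the parameters by hand. The sketch below assumes 3x3 kernels and a bias term on every layer (an assumption, since the summary does not state the kernel size). Under that assumption a `Conv2D` layer has `k*k*c_in*c_out + c_out` parameters and a `Dense` layer has `n_in*n_out + n_out`. The helper names are illustrative only.

```python
def conv2d_params(kernel, c_in, c_out):
    """Weights (kernel x kernel x c_in x c_out) plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights (n_in x n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


KERNEL = 3  # assumed kernel size; not stated in the summary

# (input channels, output channels) for conv_1 ... conv_6, read off the summary
conv_channels = [(3, 16), (16, 16), (16, 32), (32, 32), (32, 64), (64, 64)]

layer_params = {
    f"conv_{i}": conv2d_params(KERNEL, c_in, c_out)
    for i, (c_in, c_out) in enumerate(conv_channels, start=1)
}
layer_params["fc_1"] = dense_params(4 * 4 * 64, 512)  # flattened 4x4x64 = 1024 inputs
layer_params["Output"] = dense_params(512, 10)

total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:8s}{count:>10,d}")
print(f"{'total':8s}{total_params:>10,d}")
```

With these assumptions the total comes to 602,010, which matches the summary. Pooling and flatten layers contribute no parameters.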



In [None]:
### Create the model using the provided architecture

def cifar_model():
    inputs = layers.Input(shape=(32,32,3), name='Input')
    
    x = layers.Conv2D(16, kernel_size=(3,3), padding='same', activation='relu', name='conv_1')(inputs)
    x = layers.Conv2D(16, kernel_size=(3,3), padding='same', activation='relu', name='conv_2')(x)
    x = layers.MaxPooling2D(pool_size=(2,2), name='pool_1')(x)
    
    x = layers.Conv2D(32, kernel_size=(3,3), padding='same', activation='relu', name='conv_3')(x)
    x = layers.Conv2D(32, kernel_size=(3,3), padding='same', activation='relu', name='conv_4')(x)
    x = layers.MaxPooling2D(pool_size=(2,2), name='pool_2')(x)
    
    x = layers.Conv2D(64, kernel_size=(3,3), padding='same', activation='relu', name='conv_5')(x)
    x = layers.Conv2D(64, kernel_size=(3,3), padding='same', activation='relu', name='conv_6')(x)
    x = layers.MaxPooling2D(pool_size=(2,2), name='pool_3')(x)
    
    x = layers.Flatten(name='flat')(x)
    x = layers.Dense(512, activation='relu', name='fc_1')(x)
    
    outputs = layers.Dense(10, activation='softmax', name='Output')(x)
    
    model = models.Model(inputs=inputs, outputs=outputs, name='cifar_model')
    
    return model


In [None]:
### Compile the model using the SGD optimizer, with default learning rate,
### Sparse Categorical Crossentropy, and accuracy metric.
model = cifar_model()

# SGD with its default learning rate (0.01 in Keras)
optimizer = optimizers.SGD()

model.compile(optimizer=optimizer, loss=losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train, batch_size=64, epochs=20,
                    validation_data=(x_test, y_test))



In [None]:
### Train the model for 50 epochs, with a batch size of 128. Include the test
### data for model validation. Store the losses and metrics in a history object.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.datasets import mnist

# NOTE: this cell switches from CIFAR-10 to MNIST and replaces `cifar_model`
# with a small fully connected network. The plotting cell below uses the
# `model`, `history`, `x_test` and `y_test` defined here.

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each 28x28 image to a 784-vector and rescale pixels to [0, 1]
x_train = x_train.reshape(-1, 784) / 255.0
x_test = x_test.reshape(-1, 784) / 255.0

# Define the model architecture
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

# Compile the model (Adam with its default learning rate)
model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])

# Train the model and keep the per-epoch losses and metrics
history = model.fit(x_train, y_train, batch_size=128, epochs=50, validation_data=(x_test, y_test))



In [None]:
### Create plots of the losses and metrics of the training and validation data,
### and plots that show example test images from each class that have been
### correctly AND incorrectly labelled by the model.

import numpy as np
import matplotlib.pyplot as plt

# Use the `history` object stored by the training cell above. Calling
# model.fit() again here would continue training and overwrite that history.

# Plot the training and validation loss
plt.plot(history.history['loss'], label='Training loss')
plt.plot(history.history['val_loss'], label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Plot the training and validation accuracy
plt.plot(history.history['accuracy'], label='Training accuracy')
plt.plot(history.history['val_accuracy'], label='Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

# Predicted class = index of the largest softmax output
y_pred = model.predict(x_test)
y_pred_classes = np.argmax(y_pred, axis=1)

# True integer class labels as a flat array (no one-hot round trip needed)
y_true = np.ravel(y_test)

# Collect up to 10 correctly and 10 incorrectly labelled images per true class
correct_images = {}
incorrect_images = {}
for i in range(len(y_true)):
    label = int(y_true[i])
    bucket = correct_images if y_pred_classes[i] == label else incorrect_images
    examples = bucket.setdefault(label, [])
    if len(examples) < 10:
        examples.append(x_test[i])


def plot_examples(images_by_class, description):
    # One row of example images per class, in class order
    for label in sorted(images_by_class):
        images = images_by_class[label]
        print(description, "images for class", label)
        # squeeze=False keeps `axes` two-dimensional even for a single image
        fig, axes = plt.subplots(nrows=1, ncols=len(images), figsize=(10, 2),
                                 squeeze=False)
        for ax, image in zip(axes[0], images):
            ax.imshow(image.reshape(28, 28), cmap='gray')
            ax.axis('off')
        plt.show()


plot_examples(correct_images, "Correctly labelled")
plot_examples(incorrect_images, "Incorrectly labelled")



# Task 2 - Testing Optimizers

Select two additional optimizers. Including the SGD algorithm already used, test all three of these optimizers with a range of different learning rates.

You may need to train the model for more or fewer epochs to ensure that it converges on a solution.

Create plots that show the losses and metrics for each of these runs, and comment on the results in your report.

Select the optimizer and learning rate that provided the best results, and move on to the next task.

**Note**: You should reset the weights of the model between each test. A function is provided to perform this task. Store the losses and metrics of each run under a different variable name, so that they can all be plotted together.
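
A minimal sketch of one way to keep the runs apart, assuming each run's `history.history` dictionary is stored in a plain dict keyed by `(optimizer_name, learning_rate)`. The names `histories` and `best_run` are hypothetical, and ranking runs by their final-epoch `val_accuracy` is only one possible selection rule.

```python
def best_run(histories, metric="val_accuracy"):
    """Return the key of the run with the highest final-epoch value of `metric`.

    `histories` maps a run key, e.g. ("SGD", 0.01), to a dict of per-epoch
    lists in the same layout as Keras' `History.history`.
    """
    if not histories:
        raise ValueError("no runs recorded")
    return max(histories, key=lambda key: histories[key][metric][-1])
```

After each `model.fit(...)` call, a run could be recorded with `histories[("SGD", 0.01)] = history.history`, and `best_run(histories)` called once all runs have finished. For a loss, where lower is better, the comparison would need to be reversed.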

In [None]:
# Utility function that resets the weights of your model. Call this before
# recompiling your model with updated settings, to ensure you train the model
# from scratch.

def reinitialize(model):
    # Loop over the layers of the model
    for l in model.layers:
        # Check if the layer has initializers
        if hasattr(l,"kernel_initializer"):
            # Reset the kernel weights
            l.kernel.assign(l.kernel_initializer(tf.shape(l.kernel)))
        if hasattr(l,"bias_initializer"):
            # Reset the bias
            l.bias.assign(l.bias_initializer(tf.shape(l.bias)))

# Function modified from here: https://stackoverflow.com/questions/63435679/reset-all-weights-of-keras-model

In [None]:
### Test the SGD Optimizer, plus two others of your choice, with a range of
### learning rates.
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.optimizers import SGD, Adam, RMSprop

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# Convert labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)

# Define hyperparameters. The list is not called `optimizers`, so that it does
# not shadow the `optimizers` module imported at the top of the notebook.
# RMSprop is included so that SGD plus two other optimizers are tested.
optimizer_classes = [SGD, Adam, RMSprop]
learning_rates = [0.01, 0.001]


def build_model():
    # A new model is built for every run, so each run starts from fresh weights
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(128, activation="relu"))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation="softmax"))
    return model


# Train once per (optimizer, learning rate) pair and keep every history
histories = {}
for optimizer_class in optimizer_classes:
    for lr in learning_rates:
        model = build_model()
        model.compile(optimizer=optimizer_class(learning_rate=lr),
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        history = model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=1,
                            validation_data=(x_test, y_test))
        histories[(optimizer_class.__name__, lr)] = history.history
        test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
        print(f"Optimizer: {optimizer_class.__name__}, Learning Rate: {lr}")
        print(f"Test Loss: {test_loss:.4f}, Test Accuracy: {test_acc:.4f}")


In [None]:
### Create plots that show the losses and metrics for each of these runs, and
### comment on the results in your report.
import matplotlib.pyplot as plt

# One figure per run, using the histories stored by the previous cell
for (optimizer_name, lr), run in histories.items():
    plt.figure(figsize=(12, 4))

    # Training and validation loss
    plt.subplot(1, 2, 1)
    plt.plot(run["loss"], label="Training Loss")
    plt.plot(run["val_loss"], label="Validation Loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.title(f"Optimizer: {optimizer_name}, Learning Rate: {lr}")

    # Training and validation accuracy
    plt.subplot(1, 2, 2)
    plt.plot(run["accuracy"], label="Training Accuracy")
    plt.plot(run["val_accuracy"], label="Validation Accuracy")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.legend()
    plt.title(f"Optimizer: {optimizer_name}, Learning Rate: {lr}")

    plt.show()



# Task 3 - Testing Batch Sizes

The batch size can heavily influence the amount of time it takes to train a model. Vary the batch size used to train the model and, utilising the Early Stopping callback provided, create plots that show how the time per epoch and total training time changes.
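
The stopping rule itself can be sketched in plain Python. The function below approximates a patience-based early-stopping check on the validation loss; it is an illustration of the idea, not the Keras implementation, and the name `epochs_before_stop` and its default arguments are hypothetical.

```python
def epochs_before_stop(val_losses, patience=3, min_delta=0.0):
    """Return how many epochs run before patience-based early stopping triggers.

    Training stops once `patience` consecutive epochs fail to improve on the
    best validation loss seen so far by more than `min_delta`. If that never
    happens, every epoch in `val_losses` is run.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)
```

The number of epochs returned, multiplied by the time per epoch, gives the total training time, which is why both quantities are worth plotting against batch size.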

Comment on these results in your report—consider both how the batch size influences the number of epochs it takes to reach a solution, and how long each epoch takes to run. Why is this the case?
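
Part of the answer is simple arithmetic: the number of weight updates per epoch is the number of training samples divided by the batch size, rounded up. A small sketch, assuming the 50,000-image CIFAR-10 training split (the function name is illustrative):

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches (and therefore weight updates) in one epoch."""
    return math.ceil(n_samples / batch_size)


N_TRAIN = 50_000  # size of the CIFAR-10 training split

for batch_size in [8, 16, 32, 64, 128, 256, 512, 1024]:
    steps = steps_per_epoch(N_TRAIN, batch_size)
    print(f"batch_size={batch_size:5d} -> {steps:5d} steps per epoch")
```

Larger batches mean fewer steps, which usually makes each epoch faster on a GPU, but each epoch then applies fewer weight updates, so more epochs may be needed to converge.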

In [None]:
### Train the model with a range of different batch sizes, resetting the weights
### each time. Use an Early Stopping callback to prevent the model training for
### too long.

import time

import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import callbacks
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.optimizers import SGD

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# Convert labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)

# Define hyperparameters
batch_sizes = [8, 16, 32, 64, 128, 256, 512, 1024]

# Timing results, one entry per batch size
total_times = []
epoch_times = []

# Loop over hyperparameters
for batch_size in batch_sizes:
    # Building a new model for each run resets the weights
    model = Sequential()
    model.add(Flatten(input_shape=(28, 28)))
    model.add(Dense(128, activation="relu"))
    model.add(Dropout(0.2))
    model.add(Dense(10, activation="softmax"))
    model.compile(optimizer=SGD(learning_rate=0.01),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    early_stop = callbacks.EarlyStopping(monitor='val_loss', patience=3)

    # Time the whole fit() call, including validation at the end of each epoch
    start_time = time.perf_counter()
    history = model.fit(x_train, y_train, epochs=100, batch_size=batch_size, verbose=1,
                        validation_data=(x_test, y_test), callbacks=[early_stop])
    total_time = time.perf_counter() - start_time

    # Mean time per epoch = total time / number of epochs actually run
    n_epochs = len(history.history['loss'])
    total_times.append(total_time)
    epoch_times.append(total_time / n_epochs)

    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"Batch Size: {batch_size}")
    print(f"Test Loss: {test_loss:.4f}, Test Accuracy: {test_acc:.4f}")
    print(f"Total Training Time: {total_time:.2f}s, Time Per Epoch: {total_time/n_epochs:.2f}s")

    # Plot the training and validation losses and metrics
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(history.history["loss"], label="Training Loss")
    plt.plot(history.history["val_loss"], label="Validation Loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.title(f"Batch Size: {batch_size}")

    plt.subplot(1, 2, 2)
    plt.plot(history.history["accuracy"], label="Training Accuracy")
    plt.plot(history.history["val_accuracy"], label="Validation Accuracy")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.legend()
    plt.title(f"Batch Size: {batch_size}")

    plt.show()

# Plot how the time per epoch and the total training time change with batch size
plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.plot(batch_sizes, epoch_times, marker="o")
plt.xscale("log", base=2)
plt.xlabel("Batch Size")
plt.ylabel("Time Per Epoch (s)")
plt.title("Time Per Epoch vs Batch Size")

plt.subplot(1, 2, 2)
plt.plot(batch_sizes, total_times, marker="o")
plt.xscale("log", base=2)
plt.xlabel("Batch Size")
plt.ylabel("Total Training Time (s)")
plt.title("Total Training Time vs Batch Size")

plt.show()

 



# Task 4 - Adding Regularisation

The model as provided does not contain any regularisation techniques. Edit the model architecture to include at least two examples of regularisation. Retrain the model using the new architecture, and repeat the analysis performed in task 1.

In your report, compare and contrast the results from this task, with those from the initial model configuration. Explain HOW and WHY the results are different, with consideration to the predicted classifications, losses and metrics.
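
To make two common options concrete, here is a NumPy-only sketch of what an L2 weight penalty and (inverted) dropout compute. It is illustrative rather than the Keras implementation; the function names and the values of `lam` and `rate` are assumptions. In Keras the corresponding building blocks are `regularizers.l2(...)` passed as a layer's `kernel_regularizer`, and the `layers.Dropout` layer.

```python
import numpy as np


def l2_penalty(weights, lam=1e-4):
    """L2 penalty added to the loss: lam * sum of squared weights."""
    return lam * float(np.sum(np.square(weights)))


def dropout(activations, rate=0.5, training=True, rng=None):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest.

    Rescaling by 1 / (1 - rate) keeps the expected activation unchanged, so
    nothing needs to be done at inference time (training=False).
    """
    if not training or rate == 0.0:
        return activations
    rng = np.random.default_rng() if rng is None else rng
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)            # fixed seed for repeatability
weights = rng.normal(size=(4, 3))         # stand-in for a layer's kernel
activations = np.ones((2, 8))             # stand-in for a layer's output

print("L2 penalty:", l2_penalty(weights))
print("Dropout (training):\n", dropout(activations, rate=0.5, rng=rng))
print("Dropout (inference):\n", dropout(activations, training=False))
```

The L2 term penalises large weights during training, while dropout randomly silences units so the network cannot rely on any single one. Both typically narrow the gap between training and validation curves, which is the comparison to look for against the Task 1 plots.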

In [None]:
### Update the model architecture to include at least two types of regularization.
### Train the model using the ideal settings found in previous tasks.


In [None]:
### Repeat your analysis from task 1, creating plots of the losses, metrics AND
### predicted classifications of images in the test set.
