<p style="text-align: center;">
    <strong>All work and rights by Noorullah Khan (47197404) for COMP3420 Artificial Intelligence for Text and Vision, Macquarie University, Session 2, 2024.</strong>
</p>

<p style="text-align: center; font-size: 28px; font-weight: bold; color: #4A90E2; margin-bottom: 10px;">
    COMP3420 - Artificial Intelligence for Text and Vision
</p>

<p style="text-align: center; font-size: 22px; font-weight: bold; color: #7F8C8D; margin-top: 0; margin-bottom: 20px;">
    Assignment 1, Part 2
</p>

<p style="text-align: left; font-size: 24px; font-weight: bold; margin-bottom: 40px;">
    Noorullah Khan, 47197404
</p>

<p style="text-align: right; font-size: 22px; font-weight: bold; margin-bottom: 5px;">
    Macquarie University,
</p>

<p style="text-align: right; font-size: 22px; font-weight: bold; margin-top: 0;">
    Session 2, 2024
</p>


#### **Note on Reproducibility and Dynamic Results**

This notebook uses Python code cells to generate some of its markdown, specifically the sections that report the optimal hyperparameters and the model's accuracy. Because both training and the `keras_tuner` search are randomised (weight initialisation, data shuffling, and the hyperparameter combinations that are sampled), the optimal hyperparameters and accuracy figures may vary slightly from run to run.

To keep the report consistent with the code, the reported metrics are read directly from the tuner and model outputs and inserted into the markdown cells, so the text always reflects the most recent execution.
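
If tighter run-to-run consistency is wanted, one option is to fix the random seeds before any model is built. The sketch below is not part of the original notebook: the helper name `set_seeds` and the seed value are illustrative, and it assumes `tf.keras.utils.set_random_seed` is available (TensorFlow 2.7 or later). NumPy and TensorFlow are seeded only if they can be imported.

```python
import random


def set_seeds(seed: int = 42) -> None:
    """Seed every random number generator available in this environment."""
    random.seed(seed)  # Python's built-in RNG
    try:
        import numpy as np
        np.random.seed(seed)  # NumPy's global RNG
    except ImportError:
        pass
    try:
        import tensorflow as tf
        # Seeds Python, NumPy and TensorFlow in one call (assumes TF >= 2.7)
        tf.keras.utils.set_random_seed(seed)
    except (ImportError, AttributeError):
        pass


set_seeds(42)
```

Even with fixed seeds, some GPU operations can remain non-deterministic, so small differences between runs are still possible. `kt.RandomSearch` also accepts a `seed` argument that controls which hyperparameter combinations are sampled.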


<p style="text-align: center; font-size: 22px; font-weight: bold;">
    Setup & Initialisation
</p>

### Imports
Imports the required libraries and the `build_deep_nn` function from Part 1.

In [1]:
# Importing required libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
import keras_tuner as kt
import shutil

# Importing the build_deep_nn function from a1.py file
from a1 import build_deep_nn

print("Libraries imported successfully.")

Libraries imported successfully.


In [2]:
# Ensuring function was imported successfully
print(build_deep_nn)
print("build_deep_nn function imported successfully.")

<function build_deep_nn at 0x00000254AE4B7BE0>
build_deep_nn function imported successfully.


#### Resetting the Tuner
Deleting the tuner directory ensures that each hyperparameter search starts fresh. Otherwise `keras_tuner` would reload the trials saved in `tuner_dir` by a previous run, and the reported optimum might not reflect the current model configuration.

In [4]:
shutil.rmtree('tuner_dir', ignore_errors=True)
print("Previous Models Reset")

Previous Models Reset
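
The reset above relies on `ignore_errors=True`, which makes the call safe on a first run when `tuner_dir` does not exist yet. A minimal sketch of that behaviour on a throwaway directory (the helper `reset_dir` and the directory name `tuner_dir_demo` are hypothetical):

```python
import os
import shutil
import tempfile


def reset_dir(path: str) -> bool:
    """Delete `path` if present; return True when it no longer exists afterwards."""
    shutil.rmtree(path, ignore_errors=True)  # no exception if the directory is missing
    return not os.path.exists(path)


# Work inside a scratch directory instead of touching the real 'tuner_dir'
scratch = tempfile.mkdtemp()
demo_dir = os.path.join(scratch, "tuner_dir_demo")
os.makedirs(os.path.join(demo_dir, "mnist_tuning"))

first_reset = reset_dir(demo_dir)   # removes the populated directory
second_reset = reset_dir(demo_dir)  # safe to repeat: nothing is left to delete
shutil.rmtree(scratch, ignore_errors=True)  # clean up the scratch directory
```

An alternative that avoids manual deletion is to pass `overwrite=True` when constructing the tuner, which tells Keras Tuner to discard existing results for the project.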


<p style="text-align: center; font-size: 24px; font-weight: bold;">
    Loading and Preparing the MNIST Dataset
</p>

#### Loading MNIST Dataset

In [5]:
# Step 1b (Part 1): Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Print the shapes to verify the data is loaded correctly
print(f"x_train shape: {x_train.shape}, y_train shape: {y_train.shape}")
print(f"x_test shape: {x_test.shape}, y_test shape: {y_test.shape}")

x_train shape: (60000, 28, 28), y_train shape: (60000,)
x_test shape: (10000, 28, 28), y_test shape: (10000,)


#### Preparing MNIST Dataset for Training

In [6]:
# Prepare the MNIST dataset for training

# Reshape the data to add the channel dimension (since MNIST images are grayscale)
x_train = x_train.reshape((x_train.shape[0], 28, 28, 1))
x_test = x_test.reshape((x_test.shape[0], 28, 28, 1))

# Normalize the pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# Convert the labels to categorical format (one-hot encoding for 10 classes)
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Print the shapes to verify everything is correct after preparation
print(f"x_train shape: {x_train.shape}, y_train shape: {y_train.shape}")
print(f"x_test shape: {x_test.shape}, y_test shape: {y_test.shape}")

x_train shape: (60000, 28, 28, 1), y_train shape: (60000, 10)
x_test shape: (10000, 28, 28, 1), y_test shape: (10000, 10)
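
To make the two transformations above concrete, here is a dependency-free sketch of what normalisation and one-hot encoding do to individual values. The helpers `one_hot` and `normalise` are illustrative stand-ins, not the Keras implementation of `to_categorical`.

```python
def one_hot(label: int, num_classes: int = 10) -> list[float]:
    """Return a vector of zeros with a single 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def normalise(pixels: list[int]) -> list[float]:
    """Scale 8-bit pixel intensities from 0..255 to 0.0..1.0."""
    return [p / 255 for p in pixels]


print(one_hot(3))
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(normalise([0, 51, 255]))
# [0.0, 0.2, 1.0]
```

This is why `y_train` changes shape from `(60000,)` to `(60000, 10)`: each integer label becomes a vector of length 10.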


<p style="text-align: center; font-size: 24px; font-weight: bold;">
    Hyperparameter Search
</p>

#### Define the Model Builder Function for Hyperparameter Tuning

In [7]:
def model_builder(hp):

    # Define the hyperparameters to search for
    num_layers = hp.Int('num_layers', min_value=1, max_value=3)
    layer_size = hp.Int('units', min_value=32, max_value=512, step=32)
    dropout_rate = hp.Float('dropout', min_value=0.0, max_value=0.5, step=0.1)

    # Create the layer options for build_deep_nn
    print(f"Building layer options with num_layers={num_layers}, layer_size={layer_size}, dropout_rate={dropout_rate}")
    layer_options = [(layer_size, 'relu', 0.0) for _ in range(num_layers - 1)]
    layer_options.append((layer_size, 'relu', dropout_rate))

    # Build the model using the build_deep_nn function
    model = build_deep_nn(28, 28, 1, layer_options)

    # Print the model summary
    print("Model Summary:")
    model.summary()

    # Compile the model
    print("Compiling model...")
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
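
The `layer_options` list built in `model_builder` applies dropout only to the last hidden layer; every earlier hidden layer gets a dropout rate of 0.0. A minimal standalone sketch of that construction follows (the helper name `make_layer_options` is hypothetical and not part of `a1.py`):

```python
def make_layer_options(num_layers: int, layer_size: int, dropout_rate: float):
    """Mirror the list built in model_builder: (size, activation, dropout) per layer."""
    # All but the last hidden layer use no dropout
    options = [(layer_size, 'relu', 0.0) for _ in range(num_layers - 1)]
    # The final hidden layer carries the tuned dropout rate
    options.append((layer_size, 'relu', dropout_rate))
    return options


print(make_layer_options(3, 64, 0.2))
# [(64, 'relu', 0.0), (64, 'relu', 0.0), (64, 'relu', 0.2)]
print(make_layer_options(1, 416, 0.0))
# [(416, 'relu', 0.0)]
```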

#### Set Up Keras Tuner for Hyperparameter Tuning

In [8]:
# Set up the tuner
print("Setting up the tuner...")
tuner = kt.RandomSearch(
    model_builder,                   # Pass the model builder function
    objective='val_accuracy',        # Objective to optimize (validation accuracy)
    max_trials=10,                   # Number of different hyperparameter combinations to try
    directory='tuner_dir',           # Directory to save tuner results
    project_name='mnist_tuning'      # Name of the project
)

print("Tuner set up successfully. Ready to start hyperparameter search.")

Setting up the tuner...
Building layer options with num_layers=1, layer_size=32, dropout_rate=0.0
Model Summary:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 32)                25120     
                                                                 
 dense_1 (Dense)             (None, 10)                330       
                                                                 
Total params: 25,450
Trainable params: 25,450
Non-trainable params: 0
_________________________________________________________________
Compiling model...
Tuner set up successfully. Ready to start hyperparameter search.
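
The parameter counts in the summary above can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below assumes the structure shown in the summary (a flatten layer, the hidden dense layers, then a 10-unit output layer) and that dropout adds no parameters; `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(input_size: int, hidden_sizes: list[int], num_classes: int) -> int:
    """Total weights and biases for a stack of dense layers ending in an output layer."""
    total = 0
    previous = input_size
    for size in hidden_sizes + [num_classes]:
        total += previous * size + size  # weights + biases for this layer
        previous = size
    return total


# Default model built by the tuner: 784 -> 32 -> 10
print(dense_param_count(28 * 28 * 1, [32], 10))   # 25450
# Same structure with a single 416-unit hidden layer: 784 -> 416 -> 10
print(dense_param_count(28 * 28 * 1, [416], 10))  # 330730
```

The first result matches the `Total params: 25,450` line in the summary (25,120 for the hidden layer plus 330 for the output layer).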


#### Run the Hyperparameter Search

In [9]:
# Run the hyperparameter search
print("Starting hyperparameter search...")
tuner.search(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
print("Hyperparameter search completed successfully.")

Trial 10 Complete [00h 00m 27s]
val_accuracy: 0.980400025844574

Best val_accuracy So Far: 0.9818000197410583
Total elapsed time: 00h 04m 23s
Hyperparameter search completed successfully.
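
For context on how much of the search space these 10 trials cover, the number of possible combinations can be counted from the ranges declared in `model_builder`. This is a rough sketch: the helper names are hypothetical, and it assumes every stepped value, including both endpoints, is a valid choice.

```python
def int_choices(min_value: int, max_value: int, step: int = 1) -> int:
    """Number of integer values from min_value to max_value inclusive."""
    return len(range(min_value, max_value + 1, step))


def float_choices(min_value: float, max_value: float, step: float) -> int:
    """Number of stepped float values from min_value to max_value inclusive."""
    return round((max_value - min_value) / step) + 1


n_layers = int_choices(1, 3)              # 1, 2, 3
n_units = int_choices(32, 512, 32)        # 32, 64, ..., 512
n_dropout = float_choices(0.0, 0.5, 0.1)  # 0.0, 0.1, ..., 0.5

total = n_layers * n_units * n_dropout
coverage = 10 / total  # max_trials=10 in the tuner setup

print(n_layers, n_units, n_dropout)  # 3 16 6
print(total, f"{coverage:.1%}")      # 288 3.5%
```

With roughly 3.5% of the combinations sampled, the "optimal" hyperparameters are best read as the best of the sampled trials rather than a global optimum.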


<p style="text-align: center; font-size: 22px; font-weight: bold;">
    Model Optimisation
</p>


#### Extract the Best Hyperparameters

In [10]:
# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print("Optimal hyperparameters found:")
print(f"Best number of layers: {best_hps.get('num_layers')}")
print(f"Best layer size: {best_hps.get('units')}")
print(f"Best dropout rate: {best_hps.get('dropout')}")

Optimal hyperparameters found:
Best number of layers: 1
Best layer size: 416
Best dropout rate: 0.0


#### Build the Model with Optimal Hyperparameters

In [11]:
# Build the model with the optimal hyperparameters
optimal_layers = [(best_hps.get('units'), 'relu', 0.0) for _ in range(best_hps.get('num_layers') - 1)] + \
                 [(best_hps.get('units'), 'relu', best_hps.get('dropout'))]

model = build_deep_nn(28, 28, 1, optimal_layers)

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

print("Model built and compiled successfully with optimal hyperparameters.")

Model built and compiled successfully with optimal hyperparameters.


#### Train the Model with Optimal Hyperparameters

In [12]:
# Train the model
history = model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))

print("Model training completed.")

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Model training completed.
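
Both the search and the final fit above pass the test set as `validation_data`, so the test set also guides hyperparameter selection. One common alternative is to hold out part of the training set for validation and keep the test set for the final evaluation only. The sketch below shows such a split on indices alone; the helper `split_indices`, the 10% fraction, and the seed are assumptions for illustration, not part of the original notebook.

```python
import random


def split_indices(n_samples: int, val_fraction: float = 0.1, seed: int = 0):
    """Shuffle 0..n_samples-1 and split into (train_indices, val_indices)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, leaves the global state alone
    n_val = round(n_samples * val_fraction)
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(60000, val_fraction=0.1)
print(len(train_idx), len(val_idx))  # 54000 6000
```

With NumPy arrays, the held-out data would be `x_train[val_idx]` and `y_train[val_idx]`. A shorter route is to pass `validation_split=0.1` to `tuner.search` and `model.fit`, which reserves the last 10% of the training data without shuffling it first.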


<p style="text-align: center; font-size: 22px; font-weight: bold;">
    Conclusion
</p>

#### Evaluating the Optimal Model

In [15]:
# Evaluate the model
test_loss, test_accuracy = model.evaluate(x_test, y_test)
# Print the test accuracy
print(f"Test accuracy: {test_accuracy}")

Test accuracy: 0.9805999994277954


In [17]:
# Display the results in markdown format
from IPython.display import Markdown, display

display(Markdown(f"""
**Evaluating the Model:**

The model was evaluated on the test set, and the following results were obtained:

- **Test Loss:** `{test_loss:.4f}`  
  The loss metric indicates how well the model's predictions align with the actual labels. A lower loss generally implies better model performance. In this case, the model's test loss of `{test_loss:.4f}` suggests that it has learned to generalize well to unseen data.

- **Test Accuracy:** `{test_accuracy:.4f}`  
  The test accuracy of `{test_accuracy:.4f}` (or `{test_accuracy*100:.2f}%`) indicates that the model correctly classified approximately `{test_accuracy*100:.2f}%` of the images in the test set. This high accuracy reflects the model's effectiveness in recognizing and classifying the digits from the MNIST dataset with a high level of precision.

Overall, these results demonstrate that the model is performing exceptionally well. Achieving a test accuracy close to 98% on the MNIST dataset is considered very good, indicating that the model has successfully learned to identify the digits with high accuracy.
"""))


**Evaluating the Model:**

The model was evaluated on the test set, and the following results were obtained:

- **Test Loss:** `0.0679`  
  The loss metric indicates how well the model's predictions align with the actual labels. A lower loss generally implies better model performance. In this case, the model's test loss of `0.0679` suggests that it has learned to generalize well to unseen data.

- **Test Accuracy:** `0.9806`  
  The test accuracy of `0.9806` (or `98.06%`) indicates that the model correctly classified approximately `98.06%` of the images in the test set. This high accuracy reflects the model's effectiveness in recognizing and classifying the digits from the MNIST dataset with a high level of precision.

Overall, these results demonstrate that the model is performing exceptionally well. Achieving a test accuracy close to 98% on the MNIST dataset is considered very good, indicating that the model has successfully learned to identify the digits with high accuracy.
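
To illustrate what the loss value measures, categorical cross-entropy for a single sample is the negative log of the probability the model assigns to the correct class. The sketch below is a plain-Python illustration with made-up three-class predictions, not the Keras implementation; the `eps` clamp is an assumption to avoid `log(0)`.

```python
import math


def categorical_crossentropy(y_true: list[float], y_pred: list[float], eps: float = 1e-7) -> float:
    """Cross-entropy for one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


true_label = [0.0, 0.0, 1.0]   # one-hot label for class 2
confident = [0.05, 0.05, 0.90]  # most probability on the correct class
unsure = [0.30, 0.30, 0.40]     # correct class only narrowly ahead

print(round(categorical_crossentropy(true_label, confident), 4))  # 0.1054
print(round(categorical_crossentropy(true_label, unsure), 4))     # 0.9163
```

Both predictions pick the correct class, so they count equally towards accuracy, but the confident one incurs a much lower loss. The reported test loss is the average of this quantity over the test set.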


In [18]:
# Dynamically inserting the optimal hyperparameters into the markdown
best_layers = best_hps.get('num_layers')
best_units = best_hps.get('units')
best_dropout = best_hps.get('dropout')

display(Markdown(f"""
#### **Hyperparameters of the Optimal Model**

The optimal hyperparameters for the model, as determined by `keras_tuner`, are:

- **Number of hidden layers:** `{best_layers}`
- **Size of the hidden layer:** `{best_units} units`
- **Dropout rate of the final hidden layer:** `{best_dropout}`

These hyperparameters were found to yield the highest validation accuracy during the tuning process.
"""))



#### **Hyperparameters of the Optimal Model**

The optimal hyperparameters for the model, as determined by `keras_tuner`, are:

- **Number of hidden layers:** `1`
- **Size of the hidden layer:** `416 units`
- **Dropout rate of the final hidden layer:** `0.0`

These hyperparameters were found to yield the highest validation accuracy during the tuning process.


In [19]:
# Dynamically inserting the accuracy results from the evaluation cell into the markdown
test_accuracy_percent = test_accuracy * 100

display(Markdown(f"""
#### **Accuracy Results of the Optimal Model**

After training the model with the optimal hyperparameters, the model was evaluated on the test set. The accuracy results are as follows:

- **Test set accuracy:** `{test_accuracy_percent:.2f}%` (`{test_accuracy:.4f}`)

This indicates that the model correctly classified `{test_accuracy_percent:.2f}%` of the images in the test set, demonstrating its effectiveness at recognizing handwritten digits from the MNIST dataset.
"""))



#### **Accuracy Results of the Optimal Model**

After training the model with the optimal hyperparameters, the model was evaluated on the test set. The accuracy results are as follows:

- **Test set accuracy:** `98.06%` (`0.9806`)

This indicates that the model correctly classified `98.06%` of the images in the test set, demonstrating its effectiveness at recognizing handwritten digits from the MNIST dataset.
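
As a worked check of what this percentage means in absolute terms, the accuracy can be converted back to image counts using the test set size of 10,000 shown earlier. The helper name `classification_counts` is hypothetical.

```python
def classification_counts(accuracy: float, n_samples: int) -> tuple[int, int]:
    """Convert an accuracy fraction into (correct, incorrect) sample counts."""
    correct = round(accuracy * n_samples)
    return correct, n_samples - correct


correct, incorrect = classification_counts(0.9805999994277954, 10000)
print(correct, incorrect)  # 9806 194
```

For the run reported here, that is 9,806 images classified correctly and 194 misclassified.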


