**AASD 4015 Advanced Mathematical Concepts for Deep Learning
Project 1: MNIST Handwriting Classification
The project is conducted by Cheuk Yin Li 101386432, Eman Elsefy 101428470, Shui Hei Yung 101409756
The dataset is extracted from MNIST.**

The project aims to classify images of handwritten digits using a convolutional neural network (CNN) with hyperparameter tuning and fine-tuning techniques. The dataset consists of 60,000 training samples and 10,000 testing samples across 10 classes (the digits 0-9). The model consists of convolutional layers, a fully connected layer, a dropout layer, and a softmax output layer. To optimize the model's performance, hyperparameter tuning is applied, including adjusting the optimizer and the number of training epochs.

In addition to hyperparameter tuning, fine-tuning is employed to further improve the model's accuracy. Fine-tuning takes a neural network that was pre-trained on a related task and adjusts its parameters to improve its performance on the current task. In this project, we use a VGG16 network with weights pre-trained on ImageNet and fine-tune it to classify handwritten digits.

By fine-tuning the pre-trained model on the current dataset, we can leverage the knowledge captured in the pre-trained weights and reduce the time and resources needed to reach high accuracy. The process involves replacing the top of the network with new fully connected layers, training them while most of the pre-trained layers stay frozen, and then unfreezing more of the pre-trained layers and continuing training at a lower learning rate.

Through this project, we aim to showcase the effectiveness of fine-tuning techniques in improving the performance of deep learning models, specifically in the context of image classification tasks.

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.applications import VGG16
import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import plot_model

In [None]:
# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [None]:
# Print the shape of the training and testing data
print("Training data shape:", x_train.shape)
print("Training labels shape:", y_train.shape)
print("Testing data shape:", x_test.shape)
print("Testing labels shape:", y_test.shape)

In [None]:
# Plot the first 25 images in the training set
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(x_train[i], cmap="gray")
    plt.title(str(y_train[i]))
    plt.axis("off")
plt.show()

In [None]:
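# The classes are nearly evenly distributed, so class imbalance is unlikely to bias the model.
import seaborn as sns
labels, counts = np.unique(y_train, return_counts=True)  # images per digit
sns.barplot(x=labels, y=counts)
plt.xlabel("label")
plt.ylabel("count")
plt.title("Class distribution in the training set")
plt.show()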

In [None]:
# Project a random sample of 500 training images onto two principal components.
# Coloring the points by label shows how the classes are distributed in this space
# and whether any clear clusters separate them from each other.
from sklearn.decomposition import PCA

idx = np.random.randint(x_train.shape[0], size=500)
x_train_sample = x_train[idx]
y_train_sample = y_train[idx]

# Fit PCA model to the flattened images
pca = PCA(n_components=2)
x_train_pca = pca.fit_transform(x_train_sample.reshape(x_train_sample.shape[0], -1))

# Plot the scatterplot
plt.scatter(x_train_pca[:, 0], x_train_pca[:, 1], c=y_train_sample, cmap='tab10')
plt.colorbar()
plt.show()
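
To judge how much structure a 2-D scatterplot can show, it helps to check the fraction of variance that the two components capture. The sketch below reproduces the projection with a plain NumPy SVD on random stand-in data (an assumption, so that the snippet runs without downloading MNIST); the helper name `pca_project` is hypothetical. On the real sample, `pca.explained_variance_ratio_` reports the same quantity.

```python
import numpy as np

def pca_project(x, n_components=2):
    """Project the rows of x onto the top principal components via SVD."""
    x_centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value
    _, s, vt = np.linalg.svd(x_centered, full_matrices=False)
    components = vt[:n_components]
    # Share of the total variance carried by each kept component
    explained_ratio = (s[:n_components] ** 2) / np.sum(s ** 2)
    return x_centered @ components.T, components, explained_ratio

# Stand-in for 500 flattened 28x28 images (hypothetical data, not MNIST)
rng = np.random.default_rng(0)
x_sample = rng.random((500, 784))

projected, components, explained_ratio = pca_project(x_sample)
print(projected.shape)
print(explained_ratio.sum())
```

A low total ratio means that overlapping clusters in the plot do not necessarily overlap in the full 784-dimensional space.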

In [None]:
# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

In [None]:
# Reshape input data to match VGG16 input shape
x_train = tf.image.grayscale_to_rgb(tf.expand_dims(x_train, axis=-1))
x_test = tf.image.grayscale_to_rgb(tf.expand_dims(x_test, axis=-1))
x_train = tf.image.resize(x_train, (48, 48))
x_test = tf.image.resize(x_test, (48, 48))
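
VGG16 expects three-channel input, so the cell above copies the single grayscale channel into R, G and B before resizing. A minimal NumPy sketch of the channel step alone, on a small random stand-in batch (hypothetical data; the resize to 48x48 is omitted here):

```python
import numpy as np

# Stand-in batch of four 28x28 grayscale images (hypothetical data)
batch = np.random.default_rng(0).random((4, 28, 28)).astype("float32")

# Add a trailing channel axis, then repeat it three times
rgb = np.repeat(batch[..., np.newaxis], 3, axis=-1)

print(rgb.shape)  # (4, 28, 28, 3)
# Every channel holds the same values as the original image
print(np.array_equal(rgb[..., 0], rgb[..., 2]))
```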

In [None]:
# Print the new shape of the data
print("Training data shape:", x_train.shape)
print("Testing data shape:", x_test.shape)

In [None]:
# Load pre-trained VGG16 model without top layers
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(48, 48, 3))


In [None]:
# Freeze the first 15 layers of the base model; the remaining layers stay trainable
for layer in base_model.layers[:15]:
    layer.trainable = False

In [None]:
# Create a new model by adding additional layers on top of the base model
model = Sequential()
model.add(base_model)
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
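
As a sanity check on the size of the new head, its parameter counts can be worked out by hand. The arithmetic below assumes the standard VGG16 layout (convolutions that preserve spatial size, five 2x2 max-pooling stages, 512 channels in the last block), so a 48x48 input is reduced to a 1x1x512 feature map before `Flatten`. `model.summary()` gives the authoritative numbers.

```python
# Spatial size after VGG16's five 2x2 max-pooling stages (floor division)
size = 48
for _ in range(5):
    size //= 2          # 48 -> 24 -> 12 -> 6 -> 3 -> 1

flat_features = size * size * 512        # output of Flatten()

# A Dense layer has (inputs * units) weights plus one bias per unit
dense_256 = flat_features * 256 + 256
dense_10 = 256 * 10 + 10

print(flat_features)          # 512
print(dense_256)              # 131328
print(dense_10)               # 2570
print(dense_256 + dense_10)   # 133898
```

The dropout layer adds no parameters, so the head is small compared with the convolutional base.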

In [None]:
# Compile the model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
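
`sparse_categorical_crossentropy` takes integer class labels (as stored in `y_train`) rather than one-hot vectors, and averages the negative log-probability that the model assigns to the true class. A minimal NumPy sketch of that computation follows; the probabilities are made up for illustration, and the clipping constant is an assumption rather than Keras' exact behavior.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability of the true class."""
    y_prob = np.clip(y_prob, eps, 1.0)  # avoid log(0)
    true_class_prob = y_prob[np.arange(len(y_true)), y_true]
    return float(-np.mean(np.log(true_class_prob)))

# Hypothetical softmax outputs for two samples over 10 classes
uniform = np.full((2, 10), 0.1)
labels = np.array([3, 7])

loss = sparse_categorical_crossentropy(labels, uniform)
print(loss)  # ln(10), about 2.3026: the loss of an uninformed guess
```

A training loss near 2.3 therefore indicates that the model has not yet learned anything beyond guessing uniformly over the ten digits.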


In [None]:
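# Define data augmentation for the training images.
# Note: the augmentation only takes effect if train_datagen.flow(...) is passed to model.fit.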
train_datagen = ImageDataGenerator(rotation_range=10, zoom_range=0.1, width_shift_range=0.1, height_shift_range=0.1)
train_datagen.fit(x_train)

In [None]:
# Train the model
# Keep the returned History object so the curves can be plotted later
history = model.fit(x_train, y_train, batch_size=32, epochs=10, validation_data=(x_test, y_test))



In [None]:
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

In [None]:
# Plot the accuracy over epochs
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test loss:', test_loss)
print('Test accuracy:', test_acc)

In [None]:
# Plot training and validation accuracy, then loss, over epochs
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])

plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train','test'], loc='upper left')
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])

plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train','test'], loc='upper left')
plt.show()

In [None]:
# Fine-tune the model by unfreezing more layers
for layer in base_model.layers[10:]:
    layer.trainable = True
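
To see which layers the two slices (`[:15]` for the initial freeze, `[10:]` for fine-tuning) affect, the standard VGG16 layout without its top can be listed by index. The layer names below follow the usual Keras naming and are written out by hand as an assumption (the input layer's name in particular varies between versions); `[layer.name for layer in base_model.layers]` gives the authoritative list.

```python
# VGG16 with include_top=False: one input layer plus five conv blocks
layer_names = ["input"]
for block, n_convs in enumerate([2, 2, 3, 3, 3], start=1):
    layer_names += [f"block{block}_conv{i}" for i in range(1, n_convs + 1)]
    layer_names.append(f"block{block}_pool")

# Start from Keras' default (everything trainable), then replay both cells
trainable = {name: True for name in layer_names}
for name in layer_names[:15]:     # initial freeze
    trainable[name] = False
for name in layer_names[10:]:     # fine-tuning step
    trainable[name] = True

frozen = [name for name in layer_names if not trainable[name]]
print(len(layer_names))   # 19
print(frozen[-1])         # block3_conv3
```

Under this layout, the first training stage updates only block 5 and the new head, and the fine-tuning step additionally unfreezes block 4 (plus `block3_pool`, which has no weights).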

In [None]:
# Re-compile the model
model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5), metrics=['accuracy'])


In [None]:
# Train the model again with a lower learning rate
history = model.fit(x_train, y_train, batch_size=32, epochs=10, validation_data=(x_test, y_test))

In [None]:
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

In [None]:
# Plot the accuracy over epochs again
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy (Fine-tuning)')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

In [None]:
# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test loss:', test_loss)
print('Test accuracy:', test_acc)

In [None]:
# Plot training and validation accuracy, then loss, for the fine-tuning run
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])

plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train','test'], loc='upper left')
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])

plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train','test'], loc='upper left')
plt.show()