<a href="https://colab.research.google.com/github/Ibrahim-Ayaz/FashionMNIST-Computer-Vision-Project/blob/main/fashion_mnist_classification_project_final.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## FashionMNIST Computer Vision Object Classification Project

In this notebook, we'll train image classifiers on the FashionMNIST dataset, which contains clothing images from 10 different classes. The dataset can be found here: https://www.tensorflow.org/datasets/catalog/fashion_mnist

The paper describing the dataset is available here: https://arxiv.org/abs/1708.07747


### Check for a GPU

In [None]:
# Make sure we have access to a GPU
!nvidia-smi

### Downloading helper functions script for our modelling experiments

In [None]:
# Download script
!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/refs/heads/main/extras/helper_functions.py

In [None]:
# Import the necessary functions from script
from helper_functions import create_tensorboard_callback, plot_loss_curves, compare_historys, make_confusion_matrix

In [None]:
import tensorflow_datasets as tfds
import tensorflow as tf

# Check if our target dataset is within TensorFlow datasets
all_datasets = tfds.list_builders()
print('fashion_mnist' in all_datasets)

> 🔑 **Note:** Most TensorFlow datasets (TFDS) have already been split into training and test sets, so there's no need to create a splitting function or to split the dataset manually.

In [None]:
# Download the target dataset
(train_ds, test_ds), ds_info = tfds.load(name = 'fashion_mnist', split = ['train', 'test'], as_supervised = True, with_info = True)

### Visualising one or more random samples from our dataset

Before building or picking a pretrained computer vision model, it's important to visualise **random** samples from our training data to get an idea of what the clothing images look like. This can hint at where the model might get confused when predicting the target class because two or more classes look similar.

The data explorer's motto: visualise, visualise, visualise!

To visualise one or more random samples from the training dataset, there are two options:
* The short way: we can use the `tfds.show_examples()` function from TensorFlow Datasets: https://www.tensorflow.org/datasets/api_docs/python/tfds/visualization/show_examples
* Or we can create a visualisation function to plot one or more random samples if we want

In [None]:
# Visualise a sample of clothing from the train dataset (using the tfds.show_examples() function)
tfds.show_examples(train_ds, ds_info)

In [None]:
import matplotlib.pyplot as plt

def view_random_image(train_data, ds_info):
    """Plot 5 randomly chosen images from `train_data` with their class names."""
    class_names = ds_info.features['label'].names

    # Shuffle first so the 5 samples are random rather than always the first 5
    for image, label in train_data.shuffle(1000).take(5):
        plt.figure(figsize = (7, 7))

        # Handle grayscale vs RGB
        img = tf.squeeze(image).numpy()
        if img.ndim == 2:
            plt.imshow(img, cmap = 'gray')
        else:
            plt.imshow(img.astype('uint8'))

        plt.title(class_names[int(label)])
        plt.axis(False)
        plt.show()

In [None]:
# Test function
view_random_image(train_data = train_ds, ds_info = ds_info)

### Creating preprocessing functions and data pipelines for faster execution

Now that we've seen what our training dataset looks like, it's time to create preprocessing functions and efficient data pipelines so that our GPU is kept as busy as possible.

In [None]:
# Create data augmentation layer (for diversity)
from tensorflow.keras import layers

data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
], name = 'data_augmentation')

In [None]:
# Define a simplified preprocessing function
def preprocess_simple(image, label, image_size = 224):
    # Resize to the input size expected by the pretrained models
    image = tf.image.resize(image, [image_size, image_size])
    image = tf.cast(image, dtype = tf.float32)
    # FashionMNIST is grayscale, while the ImageNet models expect 3 colour channels
    if image.shape[-1] == 1:
        image = tf.image.grayscale_to_rgb(image)
    return image, label

In [None]:
batch_size = 32

# Map the preprocessing function over the training data in parallel, then shuffle, batch and prefetch
train_ds = train_ds.map(map_func = preprocess_simple, num_parallel_calls = tf.data.AUTOTUNE)
train_ds = train_ds.shuffle(1000).batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Validation/test pipeline
test_ds = test_ds.map(map_func = preprocess_simple, num_parallel_calls = tf.data.AUTOTUNE)
test_ds = test_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

train_ds, test_ds
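
Note that `shuffle(1000)` does not shuffle all of the training images at once: `tf.data` fills a buffer of 1,000 elements and samples from that buffer. The pure-Python sketch below imitates this behaviour to show the effect of the buffer size. The name `buffered_shuffle` is made up for illustration, and this is not TensorFlow's internal implementation.

```python
import random

def buffered_shuffle(items, buffer_size, seed = None):
    """Yield items in the order a fixed-size shuffle buffer would emit them."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        # Once the buffer is full, emit one randomly chosen element
        if len(buffer) == buffer_size:
            yield buffer.pop(rng.randrange(len(buffer)))
    # Input exhausted: drain what is left in random order
    while buffer:
        yield buffer.pop(rng.randrange(len(buffer)))

# A small buffer only shuffles locally, a buffer covering the data shuffles fully
print(list(buffered_shuffle(range(10), buffer_size = 3, seed = 42)))
print(list(buffered_shuffle(range(10), buffer_size = 10, seed = 42)))
```

For a uniform shuffle of the whole training set, the buffer would need to be at least as large as the dataset (for example `ds_info.splits['train'].num_examples`), at the cost of more memory.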

### Creating a ModelCheckpoint callback

In [None]:
# Setup model checkpoint path to save the best weights seen during training
checkpoint_path = 'model_checkpoint.weights.h5'
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath = checkpoint_path, save_weights_only = True, monitor = 'val_accuracy', save_best_only = True)
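
Because the same `checkpoint_path` is passed to every experiment in this notebook, weights saved for one model can be overwritten by a later one. One way around this, sketched below, is to derive a separate path per experiment. The helper name `checkpoint_path_for` and the `checkpoints` directory are hypothetical choices, not part of the original notebook.

```python
from pathlib import Path

def checkpoint_path_for(experiment_name, checkpoint_dir = 'checkpoints'):
    """Build a per-experiment weights path (hypothetical helper)."""
    # Recent Keras versions expect weights-only checkpoints to end in `.weights.h5`
    return (Path(checkpoint_dir) / f'{experiment_name}.weights.h5').as_posix()

for name in ['resnet101v2_feature_extraction', 'resnet152v2_feature_extraction']:
    print(checkpoint_path_for(name))
```

Each path could then be passed as `filepath` to its own `tf.keras.callbacks.ModelCheckpoint` instance, one per model, with the other arguments unchanged.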

### Mixed precision training

In [None]:
from tensorflow.keras import mixed_precision

mixed_precision.set_global_policy('mixed_float16')
mixed_precision.global_policy()
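
Under the `mixed_float16` policy, layers compute in float16 while keeping their variables in float32. float16 is faster on supported GPUs but holds only about three significant decimal digits and a narrow range. The standard-library sketch below rounds a few values to float16 to show what is lost; the helper name `to_float16` is made up for illustration.

```python
import struct

def to_float16(value):
    """Round a Python float to the nearest float16 and return it as a float."""
    # 'e' is the struct format code for IEEE 754 half precision
    return struct.unpack('e', struct.pack('e', value))[0]

# 1.0 is exact, 0.1 and 2049.0 are rounded, 1e-8 underflows to zero
for value in [1.0, 0.1, 2049.0, 1e-8]:
    print(value, '->', to_float16(value))
```

This loss of precision for very small probabilities is why the Keras mixed precision guide recommends keeping a model's final softmax output in float32.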

## Model 0 (baseline): Pretrained ResNet101V2

Our first modelling experiment starts with a pretrained ResNet101V2 feature extractor as the baseline. We'll then move on to other ImageNet-pretrained architectures, as we have plenty of modelling experiments to cover.

In [None]:
from tensorflow.keras import layers

# Setup input shape for base model
input_shape = (224, 224, 3)

# Get number of classes
num_classes = ds_info.features['label'].num_classes

# Create model using functional API
base_model = tf.keras.applications.ResNet101V2(include_top = False, weights = 'imagenet', input_shape = input_shape)
base_model.trainable = False

inputs = layers.Input(shape = input_shape, name = 'input_layer')
x = data_augmentation(inputs)
x = base_model(x) # pass the augmented images (not the raw inputs) to the base model
x = layers.GlobalAveragePooling2D(name = 'global_average_pooling_layer')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(256, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(num_classes, activation = 'softmax', name = 'output_layer')(x)
model_0 = tf.keras.Model(inputs, outputs)

# Compile baseline model
model_0.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer = tf.keras.optimizers.Adam(),
                metrics = ['accuracy'])

# Get model summary
model_0.summary()

In [None]:
# Fit feature extractor model to data
history_model_0 = model_0.fit(train_ds, epochs = 5, steps_per_epoch = len(train_ds), validation_data = test_ds, validation_steps = int(0.15 * len(test_ds)), callbacks = [create_tensorboard_callback(dir_name = 'tensorboard_logs', experiment_name = 'resnet101v2_feature_extraction'), model_checkpoint])
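
As a sanity check on the `fit()` arguments above, the step counts can be worked out by hand. The sketch below assumes the standard FashionMNIST split sizes (60,000 training and 10,000 test images) and the `batch_size` of 32 used earlier.

```python
import math

train_examples, test_examples = 60_000, 10_000  # standard FashionMNIST split sizes
batch_size = 32

# Number of batches per split, i.e. what len(train_ds) and len(test_ds) report
train_batches = math.ceil(train_examples / batch_size)
test_batches = math.ceil(test_examples / batch_size)

# Validation during training only covers 15% of the test batches
validation_steps = int(0.15 * test_batches)

print(train_batches)                  # 1875
print(test_batches)                   # 313
print(validation_steps)               # 46
print(validation_steps * batch_size)  # 1472 images validated per epoch
```

So the `val_accuracy` reported during training is an estimate from roughly 15% of the test set, while `model.evaluate(test_ds)` uses all 10,000 test images.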

In [None]:
# Evaluate model
model_0.evaluate(test_ds)

In [None]:
# Make predictions with model
model_0_preds = model_0.predict(test_ds, verbose = 1)
model_0_preds, model_0_preds.shape

In [None]:
# Convert prediction probabilities to a tensor, then to predicted labels
model_0_preds_tensor = tf.convert_to_tensor(model_0_preds)
labels = tf.argmax(model_0_preds_tensor, axis = 1)
labels
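
To make the conversion concrete: each row of the prediction array holds one probability per class, and `argmax` returns the index of the largest one. Below is a pure-Python sketch with two made-up probability rows (illustrative values, not model outputs) and the FashionMNIST class names.

```python
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Two made-up prediction rows, one probability per class
example_preds = [
    [0.02, 0.01, 0.05, 0.02, 0.10, 0.00, 0.75, 0.00, 0.05, 0.00],
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.08, 0.00, 0.12, 0.00, 0.80],
]

def argmax(row):
    """Return the index of the largest value in `row`."""
    return max(range(len(row)), key = row.__getitem__)

pred_labels = [argmax(row) for row in example_preds]
print(pred_labels)                            # [6, 9]
print([class_names[i] for i in pred_labels])  # ['Shirt', 'Ankle boot']
```

In the notebook itself, `ds_info.features['label'].names` provides the same list of class names.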

In [None]:
# Plot model loss curves
plot_loss_curves(history = history_model_0)

In [None]:
import numpy as np

# Get true labels from the test dataset
true_labels = []
for images, labels_batch in test_ds.unbatch():
    true_labels.append(labels_batch.numpy())

true_labels = np.array(true_labels)

# Plot model 0's confusion matrix
make_confusion_matrix(y_true = true_labels, y_pred = labels)

Judging by `model_0`'s experiment tracking and results (the metrics logged while fitting the model to the training data, and its loss and accuracy curves), the baseline performs very well.

From the confusion matrix above, we can see which classes (labels) the model confuses with one another and where it gets confused the **most**.
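
The most-confused class pairs can also be listed directly instead of being read off the plot. This is a minimal sketch using only the standard library; the function name `most_confused` and the tiny label lists are made up for illustration and are not real model output.

```python
from collections import Counter

def most_confused(y_true, y_pred, class_names, top_k = 3):
    """Return the `top_k` most frequent (true, predicted, count) misclassifications."""
    # Count only the pairs where the prediction differs from the true label
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return [(class_names[t], class_names[p], count)
            for (t, p), count in errors.most_common(top_k)]

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Tiny made-up example (not real model output)
y_true = [6, 6, 6, 0, 4, 2, 9]
y_pred = [0, 0, 6, 0, 2, 2, 7]
print(most_confused(y_true, y_pred, class_names))
```

With the notebook's variables, a call along the lines of `most_confused(true_labels, labels.numpy(), ds_info.features['label'].names)` should give the same kind of summary for `model_0`.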

## Model 1: Pretrained ResNet152V2

Let's see if a pretrained ResNet152V2 feature extractor can beat our pretrained ResNet101V2 baseline.

In [None]:
input_shape = (224, 224, 3)

num_classes = ds_info.features['label'].num_classes

base_model = tf.keras.applications.ResNet152V2(include_top = False, weights = 'imagenet', input_shape = input_shape)
base_model.trainable = False

inputs = layers.Input(shape = input_shape, name = 'input_layer')
x = data_augmentation(inputs)
x = base_model(x) # pass the augmented images (not the raw inputs) to the base model
x = layers.GlobalAveragePooling2D(name = 'global_average_pooling_layer')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(256, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(num_classes, activation = 'softmax', name = 'output_layer')(x)
model_1 = tf.keras.Model(inputs, outputs)

# Compile
model_1.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer = tf.keras.optimizers.Adam(),
                metrics = ['accuracy'])

model_1.summary()

In [None]:
# Fit
history_model_1 = model_1.fit(train_ds, epochs = 5, steps_per_epoch = len(train_ds), validation_data = test_ds, validation_steps = int(0.15 * len(test_ds)), callbacks = [create_tensorboard_callback(dir_name = 'tensorboard_logs', experiment_name = 'resnet152_feature_extraction'), model_checkpoint])

In [None]:
# Evaluate model
model_1.evaluate(test_ds)

In [None]:
# Make predictions with model
model_1_preds = model_1.predict(test_ds, verbose = 1)
model_1_preds

In [None]:
# Convert preds to pred labels
model_1_preds_tensor = tf.convert_to_tensor(model_1_preds)
labels = tf.argmax(model_1_preds_tensor, axis = 1)
labels

In [None]:
# Plot loss curves
plot_loss_curves(history = history_model_1)

The loss curves for `model_1` look much the same as `model_0`'s, with the training and validation curves closely aligned. However, `model_0` still very slightly outperforms `model_1` on `val_accuracy`.
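
A comparison like this can also be made numerically from the `history` dictionaries that `fit()` returns (one list of per-epoch values per metric). The sketch below uses placeholder numbers purely for illustration, not results from this notebook, and the helper name `best_epoch` is hypothetical.

```python
def best_epoch(history_dict, metric = 'val_accuracy'):
    """Return (epoch_number, value) for the highest `metric`, counting epochs from 1."""
    values = history_dict[metric]
    best_index = max(range(len(values)), key = values.__getitem__)
    return best_index + 1, values[best_index]

# Placeholder numbers for illustration only, not results from this notebook
history_a = {'val_accuracy': [0.80, 0.84, 0.86, 0.85, 0.87]}
history_b = {'val_accuracy': [0.79, 0.85, 0.86, 0.86, 0.86]}

print(best_epoch(history_a))  # (5, 0.87)
print(best_epoch(history_b))  # (3, 0.86)
```

In the notebook, the same function could be called as `best_epoch(history_model_0.history)` and `best_epoch(history_model_1.history)`.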

In [None]:
# Plot model_1's confusion matrix
make_confusion_matrix(y_true = true_labels, y_pred = labels)

### Comparing `model_0` and `model_1` histories

We're now going to compare the histories of both `model_0` and `model_1`.

> 🔑 **Note:** We haven't fine-tuned either model since they're already performing very well, so the curve/graph for fine-tuning won't be displayed.

In [None]:
# Compare model_0 and model_1 histories
compare_historys(original_history = history_model_0, new_history = history_model_1)

## Model 2: Pretrained ResNet50V2

In [None]:
input_shape = (224, 224, 3)

num_classes = ds_info.features['label'].num_classes

base_model = tf.keras.applications.ResNet50V2(include_top = False, weights = 'imagenet', input_shape = input_shape)
base_model.trainable = False

inputs = layers.Input(shape = input_shape, name = 'input_layer')
x = data_augmentation(inputs)
x = base_model(x) # pass the augmented images (not the raw inputs) to the base model
x = layers.GlobalAveragePooling2D(name = 'global_average_pooling_layer')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(256, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(num_classes, activation = 'softmax', name = 'output_layer')(x)
model_2 = tf.keras.Model(inputs, outputs)

# Compile
model_2.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer = tf.keras.optimizers.Adam(),
                metrics = ['accuracy'])

model_2.summary()

In [None]:
# Fit
history_model_2 = model_2.fit(train_ds, epochs = 5, steps_per_epoch = len(train_ds), validation_data = test_ds, validation_steps = int(0.15 * len(test_ds)), callbacks = [create_tensorboard_callback(dir_name = 'tensorboard_logs', experiment_name = 'resnet50_feature_extraction'), model_checkpoint])

In [None]:
# Evaluate model
model_2.evaluate(test_ds)

In [None]:
# Make predictions with model
model_2_preds = model_2.predict(test_ds, verbose = 1)
model_2_preds

In [None]:
# Convert preds to labels
model_2_preds_tensor = tf.convert_to_tensor(model_2_preds)
labels = tf.argmax(model_2_preds_tensor, axis = 1)
labels

In [None]:
# Plot loss curves
plot_loss_curves(history = history_model_2)

In [None]:
# Plot model's confusion matrix
make_confusion_matrix(y_true = true_labels, y_pred = labels)

## Model 3: Pretrained EfficientNetV2B0

In [None]:
input_shape = (224, 224, 3)

num_classes = ds_info.features['label'].num_classes

base_model = tf.keras.applications.EfficientNetV2B0(include_top = False, weights = 'imagenet', input_shape = input_shape)
base_model.trainable = False

inputs = layers.Input(shape = input_shape, name = 'input_layer')
x = data_augmentation(inputs)
x = base_model(x) # pass the augmented images (not the raw inputs) to the base model
x = layers.GlobalAveragePooling2D(name = 'global_average_pooling_layer')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(256, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(num_classes, activation = 'softmax', name = 'output_layer')(x)
model_3 = tf.keras.Model(inputs, outputs)

# Compile
model_3.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer = tf.keras.optimizers.Adam(),
                metrics = ['accuracy'])

model_3.summary()

In [None]:
# Fit
history_model_3 = model_3.fit(train_ds, epochs = 5, steps_per_epoch = len(train_ds), validation_data = test_ds, validation_steps = int(0.15 * len(test_ds)), callbacks = [create_tensorboard_callback(dir_name = 'tensorboard_logs', experiment_name = 'efficientnetv2b0_feature_extraction'), model_checkpoint])

In [None]:
# Evaluate model
model_3.evaluate(test_ds)

In [None]:
# Make predictions with model
model_3_preds = model_3.predict(test_ds, verbose = 1)
model_3_preds

In [None]:
# Convert preds to labels
model_3_preds_tensor = tf.convert_to_tensor(model_3_preds)
labels = tf.argmax(model_3_preds_tensor, axis = 1)
labels

In [None]:
# Plot loss curves
plot_loss_curves(history = history_model_3)

In [None]:
# Plot model's confusion matrix
make_confusion_matrix(y_true = true_labels, y_pred = labels)

## Model 4: Pretrained EfficientNetV2M (M = medium sized)

In [None]:
input_shape = (224, 224, 3)

num_classes = ds_info.features['label'].num_classes

# Create an instance of the EfficientNetV2M model
base_model = tf.keras.applications.EfficientNetV2M(include_top = False, weights = 'imagenet', input_shape = input_shape)
base_model.trainable = False

inputs = layers.Input(shape = input_shape, name = 'input_layer')
x = data_augmentation(inputs)
x = base_model(x) # pass the augmented images (not the raw inputs) to the base model
x = layers.GlobalAveragePooling2D(name = 'global_average_pooling_layer')(x)
x = layers.BatchNormalization()(x)
x = layers.Dense(256, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(num_classes, activation = 'softmax', name = 'output_layer')(x)
model_4 = tf.keras.Model(inputs, outputs)

model_4.compile(loss = tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer = tf.keras.optimizers.Adam(),
                metrics = ['accuracy'])

model_4.summary()

In [None]:
# Fit model 4
history_model_4 = model_4.fit(train_ds, epochs = 5, steps_per_epoch = len(train_ds), validation_data = test_ds, validation_steps = int(0.15 * len(test_ds)), callbacks = [create_tensorboard_callback(dir_name = 'tensorboard_logs', experiment_name = 'efficientnetv2m_feature_extraction'), model_checkpoint])

In [None]:
# Evaluate model
model_4.evaluate(test_ds)

In [None]:
# Make predictions with model
model_4_preds = model_4.predict(test_ds, verbose = 1)
model_4_preds

In [None]:
# Convert preds to labels
model_4_preds_tensor = tf.convert_to_tensor(model_4_preds)
labels = tf.argmax(model_4_preds_tensor, axis = 1)
labels

In [None]:
# Plot loss curves
plot_loss_curves(history = history_model_4)

In [None]:
# Plot model's confusion matrix
make_confusion_matrix(y_true = true_labels, y_pred = labels)

## Viewing our modelling experiments in TensorBoard

We're now done with the modelling experiments in this notebook. Let's load our results into TensorBoard to view all of our models' training logs.

In [None]:
# Load the models' training logs into TensorBoard
!pip install -q tensorboard
%load_ext tensorboard
%tensorboard --logdir=/content/tensorboard_logs
%reload_ext tensorboard