
<img src="https://raw.githubusercontent.com/zalandoresearch/fashion-mnist/master/doc/img/fashion-mnist-sprite.png" width="500" height="250" align="center"/>

<br>
<h1 style = "font-size:30px; font-weight : bold; color : blue; text-align: center; border-radius: 10px 15px;"> Fashion MNIST: Image Classification with Convolutional Neural Networks </h1>
<br>

---

## Overview

This is my first notebook on image classification using Convolutional Neural Networks. My motivation for creating it was to put into practice what I’ve learned in Kaggle’s Computer Vision course.

I’ve built two CNN models: a simple one that serves as a baseline, and an improved version with more convolutional layers and dropout layers for regularization.

Validation Accuracy (20% of train data):

- Baseline CNN: 91.86%
- Improved CNN: 93.58%

Test Accuracy:

- Improved CNN: 93.94%

To prepare the data, I used the preprocessing function found in [Gabriel Preda’s](https://www.kaggle.com/gpreda) notebook [‘CNN with Tensorflow|Keras for Fashion MNIST’](https://www.kaggle.com/gpreda/cnn-with-tensorflow-keras-for-fashion-mnist). 


# <a id='0'>Content</a>

- <a href='#1'>Dataset Information</a>  
- <a href='#2'>Importing Packages and Exploring the Data</a>  
- <a href='#3'>Baseline Model</a>  
- <a href='#4'>Improved Model</a>
- <a href='#5'>Predictions on the Test Set</a>

# <a id="1">Dataset Information</a> 

### Context
Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

### Content
Each image is 28 pixels in height and 28 pixels in width, for a total of 784 pixels. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255. The training and test data sets have 785 columns. The first column consists of the class labels, and represents the article of clothing. The rest of the columns contain the pixel-values of the associated image.

#### Labels
Each training and test example is assigned to one of the following labels:

- 0: T-shirt/top
- 1: Trouser
- 2: Pullover
- 3: Dress
- 4: Coat
- 5: Sandal
- 6: Shirt
- 7: Sneaker
- 8: Bag
- 9: Ankle boot

## <center>If you find this notebook useful, support with an upvote!</center>

# <a id="2">Importing Packages and Exploring the Data</a> 

In [None]:
import pandas as pd       
import matplotlib as mat
import matplotlib.pyplot as plt    
import numpy as np
import seaborn as sns
%matplotlib inline

import random
import os

from numpy.random import seed
seed(42)

random.seed(42)
os.environ['PYTHONHASHSEED'] = str(42)
os.environ['TF_DETERMINISTIC_OPS'] = '1'

from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.metrics import accuracy_score

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import callbacks
from tensorflow.keras.models import Model

from tensorflow.random import set_seed
set_seed(42)

In [None]:
df_train = pd.read_csv('../input/fashionmnist/fashion-mnist_train.csv')
df_test = pd.read_csv('../input/fashionmnist/fashion-mnist_test.csv')

In [None]:
df_train

In [None]:
df_test

Both datasets have 785 columns. The first column holds the class label for each article of clothing, while the remaining columns contain the pixel values. As described in the dataset information, each image is 28 pixels in height and 28 pixels in width (for a total of 784 pixels).

We will separate the labels from each dataset and reshape their columns from 784 columns to a 28x28x1 format (Height, Width and Number of Channels) using the preprocessing function found in Preda’s notebook.  

In [None]:
#https://www.kaggle.com/gpreda/cnn-with-tensorflow-keras-for-fashion-mnist

#784 pixels(columns) -> 28x28 (height and width in pixels)
IMG_ROWS = 28
IMG_COLS = 28
NUM_CLASSES = 10

def data_preprocessing(raw):
    out_y = keras.utils.to_categorical(raw.label, NUM_CLASSES) 
    num_images = raw.shape[0]
    x_as_array = raw.values[:,1:]
    
    # Reshaping to (num_images, Height, Width, Colour Channels)
    x_shaped_array = x_as_array.reshape(num_images, IMG_ROWS, IMG_COLS, 1)
    
    #Normalizing the values (pixel-value is an integer between 0 and 255)
    out_x = x_shaped_array / 255
    return out_x, out_y
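
To make the three steps inside `data_preprocessing` concrete, here is a minimal NumPy-only sketch on two synthetic rows. The array `fake_rows` and its values are made up for illustration, and `np.eye` is used as a stand-in for `keras.utils.to_categorical`, assuming only that both yield a one-hot row per label.

```python
import numpy as np

IMG_ROWS, IMG_COLS, NUM_CLASSES = 28, 28, 10

# Two made-up rows in the CSV layout: 1 label column + 784 pixel columns.
rng = np.random.default_rng(0)
fake_labels = np.array([3, 9])
fake_pixels = rng.integers(0, 256, size=(2, IMG_ROWS * IMG_COLS))
fake_rows = np.column_stack([fake_labels, fake_pixels])

# One-hot encode the labels (np.eye stands in for to_categorical here).
y = np.eye(NUM_CLASSES)[fake_rows[:, 0]]

# Reshape the flat pixels to (num_images, height, width, channels)
# and scale them from [0, 255] to [0, 1].
x = fake_rows[:, 1:].reshape(-1, IMG_ROWS, IMG_COLS, 1) / 255

print(x.shape, y.shape)              # (2, 28, 28, 1) (2, 10)
print(y[0].argmax(), y[1].argmax())  # 3 9
```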

In [None]:
X, Y = data_preprocessing(df_train)
X_test, Y_test = data_preprocessing(df_test)

In [None]:
print('X_train shape: ', X.shape)
print('X_test shape: ', X_test.shape)

Let’s plot a few samples from each dataset to see what kind of images we have.

In [None]:
labels_list = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal'
              ,'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

plt.figure(figsize=(12,12))

for i in range(0, 36):
    plt.subplot(6,6,i + 1)
    plt.imshow(X[i], cmap=plt.get_cmap('gray'))
    plt.title(labels_list[df_train.iloc[i, 0]])
    plt.axis("off")

plt.tight_layout()

plt.show()

In [None]:
plt.figure(figsize=(12,12))

for i in range(0, 36):
    plt.subplot(6,6,i + 1)
    plt.imshow(X_test[i], cmap=plt.get_cmap('gray'))
    plt.title(labels_list[df_test.iloc[i, 0]])
    plt.axis("off")

plt.tight_layout()

plt.show()

Before we move on to the modelling stage, we will check how many samples each class has.

In [None]:
print('Train Set Class Distribution:\n')
print(df_train['label'].value_counts())

print('\nTest Set Class Distribution:\n')
print(df_test['label'].value_counts())

Both datasets are perfectly balanced, with the samples equally distributed among the classes.
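
The same balance check can be expressed as class shares instead of raw counts. The sketch below uses a hypothetical label vector in place of `df_train['label']`, and `class_shares` is an illustrative helper that is not part of the notebook.

```python
import numpy as np

def class_shares(labels, num_classes=10):
    """Return the fraction of samples that falls into each class."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

# Hypothetical labels: 6 samples for each of the 10 classes.
labels = np.repeat(np.arange(10), 6)
shares = class_shares(labels)

print(shares)                    # every class holds the same share
print(np.allclose(shares, 0.1))  # True
```

With the real data, passing `df_train['label'].to_numpy()` should give 0.1 for every class if the set is balanced as described.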

# <a id="3">Baseline Model</a> 

First, we will split the train dataset into train/validation.

In [None]:
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size = 0.2, random_state = 42
                                                    , stratify = Y)
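
To illustrate what `stratify` buys us, here is a rough NumPy-only sketch of a stratified split. It is not scikit-learn's implementation: `stratified_split_indices` is a hypothetical helper, and it works on integer labels rather than the one-hot `Y` used above.

```python
import numpy as np

def stratified_split_indices(labels, val_fraction=0.2, seed=42):
    """Split indices so each class contributes the same fraction to validation."""
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for cls in np.unique(labels):
        # Shuffle the indices of this class, then cut off the validation share.
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        n_val = int(round(val_fraction * len(cls_idx)))
        val_idx.extend(cls_idx[:n_val])
        train_idx.extend(cls_idx[n_val:])
    return np.array(train_idx), np.array(val_idx)

# Hypothetical labels: 10 classes with 50 samples each.
labels = np.repeat(np.arange(10), 50)
train_idx, val_idx = stratified_split_indices(labels)

print(len(train_idx), len(val_idx))  # 400 100
print(np.bincount(labels[val_idx]))  # 10 validation samples per class
```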

Now, let's create our baseline model with two convolutional layers.

In [None]:
def get_model():
    
    #Input shape = [height, width, color channels]
    inputs = layers.Input(shape=(IMG_ROWS, IMG_COLS, 1))
    
    # Block One
    x = layers.BatchNormalization()(inputs)
    x = layers.Conv2D(filters=32, kernel_size=3, activation='relu', padding='same')(x)
    x = layers.MaxPool2D()(x)

    # Block Two
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(filters=64, kernel_size=3, activation='relu', padding='same')(x)
    x = layers.MaxPool2D()(x)

    # Head
    x = layers.BatchNormalization()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation='relu')(x)
    
    #Final Layer (Output)
    output = layers.Dense(NUM_CLASSES, activation='softmax')(x)
    
    model = keras.Model(inputs=[inputs], outputs=output)
    
    return model

In [None]:
#Setting early_stopping callback
early_stopping = callbacks.EarlyStopping(
    monitor='val_accuracy',
    patience=15,
    min_delta=0.0000001,
    restore_best_weights=True,
)
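
As a rough mental model of how `patience` and `min_delta` interact, the sketch below replays the stopping rule on a made-up list of validation accuracies. It is a simplification written for this explanation, not Keras's actual code, and `simulate_early_stopping` is a hypothetical helper.

```python
def simulate_early_stopping(values, patience, min_delta=0.0):
    """Return (epoch where training stops, epoch with the best value)."""
    best = float("-inf")
    best_epoch = -1
    wait = 0
    for epoch, value in enumerate(values):
        if value - min_delta > best:
            # Improvement: remember it and reset the patience counter.
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training runs through the last epoch.
    return len(values) - 1, best_epoch

# Made-up validation accuracies, with a short patience for readability.
val_accuracy = [0.80, 0.85, 0.84, 0.85, 0.83]
stopped, best_epoch = simulate_early_stopping(val_accuracy, patience=2)
print(stopped, best_epoch)  # 3 1
```

Training stops at epoch 3 because epochs 2 and 3 did not beat the best value from epoch 1. With `restore_best_weights=True`, the weights from that best epoch are the ones kept.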

In [None]:
keras.backend.clear_session()

model_1 = get_model()
model_1.compile(loss='categorical_crossentropy'
              , optimizer = keras.optimizers.Adam(learning_rate=0.001), metrics=['accuracy'])

model_1.summary()
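
The numbers printed by `summary()` can be cross-checked by hand. The sketch below redoes the arithmetic for the baseline model in plain Python, assuming the Keras defaults used above: 2x2 max pooling, `'same'` padding, and four parameters per channel for batch normalization (two trainable, two moving statistics). The helper names are illustrative only.

```python
def conv_params(kernel, in_ch, filters):
    return kernel * kernel * in_ch * filters + filters  # weights + biases

def dense_params(in_units, out_units):
    return in_units * out_units + out_units  # weights + biases

def batchnorm_params(channels):
    return 4 * channels  # gamma, beta, moving mean, moving variance

size, ch = 28, 1
total = batchnorm_params(ch)              # batch norm on the raw input
for filters in (32, 64):
    total += conv_params(3, ch, filters)  # 'same' padding keeps height and width
    ch = filters
    size //= 2                            # 2x2 max pooling halves height and width
    total += batchnorm_params(ch)         # batch norm before the next block / head

flat = size * size * ch                   # units after Flatten
total += dense_params(flat, 64) + dense_params(64, 10)
print(size, flat, total)  # 7 3136 220622
```

Most of the parameters sit in the first dense layer (3136 x 64 weights plus 64 biases), not in the convolutions.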

Let’s train our baseline model and use the validation set for a first assessment of its performance.

In [None]:
history = model_1.fit(X_train, Y_train,
          batch_size = 128, epochs = 100,
          validation_data=(X_val, Y_val),
          callbacks=[early_stopping]);

In [None]:
fig, ax = plt.subplots(figsize=(20,8))
sns.lineplot(x = history.epoch, y = history.history['loss'])
sns.lineplot(x = history.epoch, y = history.history['val_loss'])
ax.set_title('Learning Curve (Loss)')
ax.set_ylabel('Loss')
ax.set_xlabel('Epoch')
ax.legend(['train', 'val'], loc='best')
plt.show()

In [None]:
fig, ax = plt.subplots(figsize=(20,8))
sns.lineplot(x = history.epoch, y = history.history['accuracy'])
sns.lineplot(x = history.epoch, y = history.history['val_accuracy'])
ax.set_title('Learning Curve (Accuracy)')
ax.set_ylabel('Accuracy')
ax.set_xlabel('Epoch')
ax.legend(['train', 'val'], loc='best')
plt.show()

In [None]:
score = model_1.evaluate(X_val, Y_val, verbose = 0)
print('Val loss:', score[0])
print('Val accuracy:', score[1])

Our baseline model reached a validation accuracy of 91.86%, which is a good start. However, the learning curves (specifically the loss curves) show that the model starts overfitting right after the first 5 epochs. This suggests that adding regularization, such as dropout layers, can probably improve our model’s performance.

# <a id="4">Improved Model</a> 

Now, let’s build a new, better model. As previously stated, adding dropout layers alone could improve the validation accuracy. I’ve tested this and found a validation accuracy of 93.36% (not shown in this notebook). There is still room to build a more robust CNN. After a few tests, I’ve decided to include a third convolutional block with 2 convolutional layers. Let’s check the performance of this new model.

In [None]:
def get_model2():
    
    #Input shape = [height, width, color channels]
    inputs = layers.Input(shape=(IMG_ROWS, IMG_COLS, 1))
    
    # Block One
    x = layers.BatchNormalization()(inputs)
    x = layers.Conv2D(filters=32, kernel_size=3, activation='relu', padding='same')(x)
    x = layers.MaxPool2D()(x)
    x = layers.Dropout(0.3)(x)

    # Block Two
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(filters=64, kernel_size=3, activation='relu', padding='same')(x)
    x = layers.MaxPool2D()(x)
    x = layers.Dropout(0.3)(x)

    # Block Three
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding='same')(x)
    x = layers.Conv2D(filters=128, kernel_size=3, activation='relu', padding='same')(x)
    x = layers.MaxPool2D()(x)
    x = layers.Dropout(0.3)(x)

    # Head
    x = layers.BatchNormalization()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation='relu')(x)
    x = layers.Dropout(0.3)(x)
    
    #Final Layer (Output)
    output = layers.Dense(NUM_CLASSES, activation='softmax')(x)
    
    model = keras.Model(inputs=[inputs], outputs=output)
    
    return model
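
For intuition about what `Dropout(0.3)` does during training, here is a small NumPy sketch of inverted dropout: a random 30% of the activations are zeroed and the survivors are scaled by 1 / (1 - 0.3) so the expected value is unchanged. `dropout_train` is a hypothetical helper for illustration, not the Keras layer itself.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Inverted dropout: zero roughly a `rate` fraction of units, rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones(10_000)
dropped = dropout_train(activations, rate=0.3, rng=rng)

print((dropped == 0).mean())  # close to 0.3
print(dropped.mean())         # close to 1.0, so the expected activation is preserved
```

At inference time the layer is a no-op, which is why `evaluate` and `predict` use all units.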

In [None]:
keras.backend.clear_session()

model_2 = get_model2()
model_2.compile(loss='categorical_crossentropy'
              , optimizer = keras.optimizers.Adam(learning_rate=0.001), metrics=['accuracy'])

model_2.summary()

In [None]:
history = model_2.fit(X_train, Y_train,
          batch_size = 128, epochs = 100,
          validation_data=(X_val, Y_val),
          callbacks=[early_stopping]);

In [None]:
fig, ax = plt.subplots(figsize=(20,8))
sns.lineplot(x = history.epoch, y = history.history['loss'])
sns.lineplot(x = history.epoch, y = history.history['val_loss'])
ax.set_title('Learning Curve (Loss)')
ax.set_ylabel('Loss')
ax.set_xlabel('Epoch')
ax.legend(['train', 'val'], loc='best')
plt.show()

In [None]:
fig, ax = plt.subplots(figsize=(20,8))
sns.lineplot(x = history.epoch, y = history.history['accuracy'])
sns.lineplot(x = history.epoch, y = history.history['val_accuracy'])
ax.set_title('Learning Curve (Accuracy)')
ax.set_ylabel('Accuracy')
ax.set_xlabel('Epoch')
ax.legend(['train', 'val'], loc='best')
plt.show()

In [None]:
score = model_2.evaluate(X_val, Y_val, verbose = 0)
print('Val loss:', score[0])
print('Val accuracy:', score[1])

The effect of the dropout layers can be observed in the learning curves. The validation loss decreases for a longer period, reaches lower values than in our baseline model, and then stabilizes. The new model reached a validation accuracy of 93.58%, beating the baseline model.

# <a id="5">Predictions on the Test Set</a> 

Now, let’s use our model to make predictions on the test set and check our results.

In [None]:
predictions = model_2.predict(X_test)
pred_labels = predictions.argmax(axis=-1) #From probabilities to class labels
print("Test Accuracy: ", accuracy_score(df_test['label'], pred_labels))
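
The step from probabilities to class labels, and from labels to accuracy, can be seen on a tiny example. The probabilities and labels below are made up, with 3 classes instead of 10 for readability.

```python
import numpy as np

# Made-up softmax outputs for 4 images and 3 classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
])
true_labels = np.array([0, 2, 1, 1])

pred_labels = probs.argmax(axis=-1)             # most probable class per row
accuracy = (pred_labels == true_labels).mean()  # fraction of correct predictions

print(pred_labels)  # [0 2 1 0]
print(accuracy)     # 0.75
```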

In [None]:
print(metrics.classification_report(df_test['label'], pred_labels, target_names=labels_list))

We reached a test accuracy of 93.94% with the new model. As seen in the classification report, our model was almost perfect at predicting Trousers. Its performance on Sandals and Bags is also notable, with a 99% F1-Score. On the other hand, the model had some difficulty with Shirts, with a Recall around 80% and a Precision below 85%. This will be explored in a future version of this notebook.
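
For readers less familiar with the columns of the classification report, the sketch below computes Precision, Recall and F1-Score for a single class from raw labels. The labels are made up: class 6 plays the role of Shirt and class 0 the role of T-shirt/top. `per_class_scores` is a hypothetical helper that assumes the class appears at least once in both arrays.

```python
import numpy as np

def per_class_scores(y_true, y_pred, cls):
    """Precision, recall and F1 for one class, from raw label arrays."""
    tp = np.sum((y_pred == cls) & (y_true == cls))  # predicted cls, and correct
    fp = np.sum((y_pred == cls) & (y_true != cls))  # predicted cls, but wrong
    fn = np.sum((y_pred != cls) & (y_true == cls))  # missed a true cls sample
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up labels: one Shirt predicted as T-shirt/top, one T-shirt/top as Shirt.
y_true = np.array([6, 6, 6, 6, 6, 0, 0, 0, 0, 0])
y_pred = np.array([6, 6, 6, 6, 0, 0, 0, 0, 0, 6])

precision, recall, f1 = per_class_scores(y_true, y_pred, cls=6)
print(precision, recall, f1)  # all three are 0.8 up to floating-point rounding
```

A low Recall for Shirts means real shirts are being assigned to other classes, while a low Precision means other items are being labelled as shirts.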
