# Multiclass Classification - Clothing Type Classification

To put everything learned from the classification problems together, I am going to use an actual dataset to cover multiclass classification. A multiclass classification problem has more than two classes to choose from (with exactly two classes, it is binary classification).

For this example, I am going to build a neural network to classify images of different items of clothing from the Fashion-MNIST dataset.

* https://www.tensorflow.org/datasets/catalog/fashion_mnist
* https://github.com/zalandoresearch/fashion-mnist

In [None]:
import itertools
import os
import random
import sys

module_path = os.path.abspath(os.path.join('../..'))
if module_path not in sys.path:
    sys.path.append(module_path)

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.utils import plot_model

from src import utils

In [None]:
# Data has already been split
(train_data, train_labels), (test_data, test_labels) = fashion_mnist.load_data()

# Create a small list so we can index onto our training labels so they're human readable
# Grabbed from dataset github
class_names = [
    'T-shirt/top',
    'Trouser',
    'Pullover',
    'Dress',
    'Coat',
    'Sandal',
    'Shirt',
    'Sneaker',
    'Bag',
    'Ankle boot',
]
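
As a quick illustration of how this list is used, an integer label indexes straight into `class_names`. The labels below are made-up sample values, not read from the dataset.

```python
# Class names in label order, as listed in the Fashion-MNIST repository
class_names = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot',
]

# Hypothetical integer labels (illustrative values only)
sample_labels = [9, 0, 3]

# Each integer label is the index of its human-readable name
sample_names = [class_names[label] for label in sample_labels]
print(sample_names)
```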

## Visualize Data

In [None]:
# Looking at a single X value
train_data[0]

In [None]:
train_labels[0]

In [None]:
train_data[0].shape, train_data.dtype, train_labels[0].shape, train_labels[0].dtype

In [None]:
# Looking at some images
utils.plot.plot_images([0, 1, 2, 3], train_data, train_labels, class_names, black_and_white=True)

In [None]:
utils.plot.plot_image(35, train_data, train_labels, class_names, black_and_white=True)

In [None]:
# Look at a handful of random images
# randrange excludes the upper bound, so every index is valid
indexes = [random.randrange(len(train_data)) for _ in range(4)]
utils.plot.plot_images(indexes, train_data, train_labels, class_names, black_and_white=True)

## Building Multiclass Classification Model

* Input shape = 28 x 28
* Output shape = 10
* Loss function = CategoricalCrossentropy (if one-hot encoded) or SparseCategoricalCrossentropy (if integers)
* Output activation = SoftMax
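
To make the loss function choice above concrete, here is a minimal NumPy sketch (not the Keras implementation) of softmax and both cross-entropy variants. The scores and labels are made up; the point is only that integer labels and their one-hot encoding give the same loss value.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(labels, probs):
    # Integer labels: pick the predicted probability of the true class
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

def categorical_crossentropy(one_hot, probs):
    # One-hot labels: the sum keeps only the true-class term
    return -(one_hot * np.log(probs)).sum(axis=1).mean()

# Made-up scores for 2 samples and 10 classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 10))
probs = softmax(logits)

labels = np.array([3, 7])      # integer labels
one_hot = np.eye(10)[labels]   # the same labels, one-hot encoded

sparse_loss = sparse_categorical_crossentropy(labels, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(sparse_loss, dense_loss)
```

Since the Fashion-MNIST labels are integers, the sparse variant is the one that fits this notebook without any extra encoding step.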

In [None]:
# Need to rename my data to be more consistent with other notebooks I've done
X_train, y_train = train_data, train_labels
X_test, y_test = test_data, test_labels

In [None]:
# Normalizing my data
X_train_norm = X_train / X_train.max()
X_test_norm = X_test / X_train.max()
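
The same scaling can be checked on a tiny array. This is a sketch with synthetic pixel values (the `fake_*` names are illustrative); both splits are divided by the training maximum so they end up on one scale.

```python
import numpy as np

# Synthetic stand-ins for the real image arrays (illustrative values only)
fake_train = np.array([[0, 128], [255, 64]], dtype=np.uint8)
fake_test = np.array([[32, 255]], dtype=np.uint8)

# Scale both splits by the training maximum so they share one scale
scale = fake_train.max()
fake_train_norm = fake_train / scale
fake_test_norm = fake_test / scale

print(fake_train_norm.min(), fake_train_norm.max())
```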

In [None]:
# Setting seed to compare models nicely
tf.random.set_seed(27)

# 1. Create Model
# NOTE: Our data needs to be flattened into a single vector
model_1 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(4, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)
])

# 2. Compile Model
model_1.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer=tf.keras.optimizers.legacy.Adam(),
                metrics=['accuracy'])

# 3. Fit Model
history_1 = model_1.fit(X_train_norm, y_train, epochs=10, validation_data=(X_test_norm, y_test))

In [None]:
model_1.summary()

In [None]:
# Looking at model
utils.visualize.visualize_model(model_1)

## Finding the Ideal Learning Rate

In [None]:
# Rerunning model_1, but going to make a learning rate scheduler to help find optimal learning rate

# Setting seed to compare models nicely
tf.random.set_seed(27)

# 1. Create Model
model_2 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(4, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)
])

# 2. Compile Model
model_2.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer=tf.keras.optimizers.legacy.Adam(),
                metrics=['accuracy'])

lr_scheduler = tf.keras.callbacks.LearningRateScheduler(
    utils.learning_rate.exponential_decay_callback(decay_step=20, decay_factor=10))

# 3. Fit Model w/ Learning Rate Scheduler
history_2 = model_2.fit(
    X_train_norm,
    y_train,
    epochs=40,
    validation_data=(X_test_norm, y_test),
    callbacks=[lr_scheduler])

In [None]:
history_2_df = pd.DataFrame(history_2.history)
history_2_df.head()

In [None]:
utils.plot.plot_history(history_2)

In [None]:
# Loss vs epoch
# Plot learning rate versus the loss
lrs = 1e-3 * (10 ** (tf.range(40) / 20))
utils.plot.plot_learning_rate_versus_loss(lrs, history_2.history['loss'])
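
The `utils.learning_rate.exponential_decay_callback` helper is project-specific and its body is not shown in this notebook. Judging from the `lrs` formula above, the schedule starts at 1e-3 and multiplies the learning rate by 10 every 20 epochs. A minimal sketch under that assumption (the function name `make_exponential_schedule` and the `initial_lr` parameter are hypothetical):

```python
def make_exponential_schedule(initial_lr=1e-3, step=20, factor=10):
    """Return a function that maps an epoch number to a learning rate.

    Hypothetical stand-in for the project's helper, assumed to match
    lr = initial_lr * factor ** (epoch / step).
    """
    def schedule(epoch, lr=None):
        # lr (the current rate) is accepted but unused, so the function
        # also works with schedulers that pass it as a second argument
        return initial_lr * factor ** (epoch / step)
    return schedule

schedule = make_exponential_schedule()
lrs = [schedule(epoch) for epoch in range(40)]
print(lrs[0], lrs[20], lrs[39])
```

A function of this shape could be passed to `tf.keras.callbacks.LearningRateScheduler`, which calls it once per epoch.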

#### Ideal Learning Rate

After looking at the plot above, it looks like our ideal learning rate is between 0.001 and 0.002, so I am going to use 0.002.

### Model w/ Ideal Learning Rate

In [None]:
# Setting seed to compare models nicely
tf.random.set_seed(27)

# 1. Create Model
# NOTE: Our data needs to be flattened into a single vector
model_3 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(4, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(4, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(10, activation=tf.keras.activations.softmax)
])

# 2. Compile Model
model_3.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                optimizer=tf.keras.optimizers.legacy.Adam(learning_rate=0.002),
                metrics=['accuracy'])

# 3. Fit Model
history_3 = model_3.fit(X_train_norm, y_train, epochs=20, validation_data=(X_test_norm, y_test))

### Evaluating our Multi-class Classification Model

To evaluate our multi-class classification model we could:
* Evaluate its performance using other classification metrics (such as a confusion matrix).
* Assess some of its predictions (through visualisations).
* Improve its results (by training it for longer or changing the architecture).
* Save and export it for use in an application.

In [None]:
# Predicting values from model_3
y_pred_3 = model_3.predict(X_test_norm)
y_pred_3[:5]

In [None]:
# Need to convert all to integers
y_pred_3_ints = y_pred_3.argmax(axis=1)
y_pred_3_ints
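
For reference, this is what the `argmax` step does on a small example. The probabilities are made up; each row stands for one sample's softmax output.

```python
import numpy as np

# Made-up softmax outputs for 2 samples and 3 classes
fake_probs = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.2, 0.3],
])

# argmax along axis 1 returns the index of the most likely class per row
fake_pred_ints = fake_probs.argmax(axis=1)
print(fake_pred_ints)
```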

In [None]:
# Create a confusion matrix.
utils.plot.plot_confusion_matrix(y_test, y_pred_3_ints, cell_text_size=8, classes=class_names)
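
The plotting helper is project-specific, so here is a minimal NumPy sketch of the counts a confusion matrix holds, assuming the common convention of true labels on the rows and predicted labels on the columns. The labels are made up and `confusion_counts` is a hypothetical name.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    # Rows are true labels, columns are predicted labels
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # add.at accumulates correctly even when index pairs repeat
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

# Made-up labels for 6 samples and 3 classes
y_true_demo = np.array([0, 0, 1, 2, 2, 2])
y_pred_demo = np.array([0, 1, 1, 2, 0, 2])

cm = confusion_counts(y_true_demo, y_pred_demo, n_classes=3)

# Fraction of each true class that was predicted correctly
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(per_class_recall)
```

Off-diagonal cells show which classes get mistaken for each other, which is often more informative than a single accuracy number.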

In [None]:
# Getting predictions of model_3
y_pred_probabilities = model_3.predict(X_test_norm)

In [None]:
utils.plot.plot_random_image_label_and_prediction(images=X_test_norm,
                                                  true_labels=y_test,
                                                  pred_probabilities=y_pred_probabilities,
                                                  class_names=class_names,
                                                  black_and_white = True)

## What Patterns is the Model Learning?

In [None]:
model_3.layers

In [None]:
# Get the patterns in a network by looking at the weights and biases of the first hidden layer
weights, biases = model_3.layers[1].get_weights()

weights, weights.shape, biases, biases.shape

In [None]:
# NOTE: the "Param #" column counts the trainable weights and biases in each layer:
# flatten layer: 0 (just reshapes the data from 28x28 to 784)
# Dense (Hidden Layer 1): 3140 (784 * 4 = 3136 weights, 4 biases (one for each neuron in layer))
# Dense (Hidden Layer 2): 20 (4 * 4 = 16 weights, 4 biases (one for each neuron in layer))
# Dense (Output Layer): 50 (4 * 10 = 40 weights, 10 biases (one for each neuron in layer))
model_3.summary()
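
The arithmetic in the comments above can be reproduced with a few lines of plain Python. `dense_params` is a hypothetical helper name; the layer sizes come from `model_3`.

```python
def dense_params(n_inputs, n_units):
    # One weight per input-unit pair, plus one bias per unit
    return n_inputs * n_units + n_units

# Flattened input size followed by the units in each Dense layer
layer_sizes = [28 * 28, 4, 4, 10]

# Pair each layer's input size with its unit count
params = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(params)
print(params, total)
```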

## Trial-4: Testing Model Weight Saving Idea

Going to test a new sequential model real quick.

In [None]:
class MemorySequentialModel(tf.keras.models.Sequential):
    