In [1]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import plotly.express as px
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
%matplotlib inline
%config InlineBackend.figure_format='retina'



## Load data

The [Fashion MNIST](https://keras.io/api/datasets/fashion_mnist/) dataset is loaded using `keras.datasets`.  
It is already split into a training set of 60,000 images and a test set of 10,000 images.

In [2]:
from tensorflow.keras.datasets import fashion_mnist

(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

In [3]:
print(f"Shape of training data: {X_train.shape}")
print(f"Shape of testing data: {X_test.shape}")

Shape of training data: (60000, 28, 28)
Shape of testing data: (10000, 28, 28)


Because the labels in the training and test sets are integers from 0 to 9, we need a `category_list` to look up the category name that corresponds to each integer label.  
<br>
Based on the dataset description, the labels of the categories are as follows:

| Category | Label |
|:-|:-:|
|T-shirt/top|0|
|Trouser|1|
|Pullover|2|
|Dress|3|
|Coat|4|
|Sandal|5|
|Shirt|6|
|Sneaker|7|
|Bag|8|
|Ankle boot|9|

Therefore, we create the following list, which holds the *categories* as its items, so that each item's index is the corresponding *label*.


In [4]:
category_list = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
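
As a minimal sketch of how this lookup can be used, the snippet below maps several integer labels to category names in one step. The `sample_labels` array is a hypothetical stand-in for a slice of `y_train`, and `category_array` is an illustrative name that does not appear elsewhere in the notebook.

```python
import numpy as np

# Same category names as above; the list index is the integer label.
category_list = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                 "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical labels standing in for a few entries of y_train.
sample_labels = np.array([9, 0, 3, 3, 7], dtype=np.uint8)

# Indexing an array of names with an array of labels maps all labels at once.
category_array = np.array(category_list)
sample_names = category_array[sample_labels]

print(sample_names.tolist())
# ['Ankle boot', 'T-shirt/top', 'Dress', 'Dress', 'Sneaker']
```

For a single label, plain list indexing such as `category_list[3]` is enough; the array form is only convenient when many labels are converted together.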

## Visualize sample images

Randomly pick 10 images from the training set and show them in two rows of 5.

In [5]:
# Pick 10 random indices
idx = np.random.randint(0, len(X_train), 10)
facet_labels = y_train[idx]

# Work-around to get titles in correct order for plotting in plotly
facet_col_wrap = len(idx) // 2
facet_labels = np.hstack((facet_labels[facet_col_wrap:], facet_labels[:facet_col_wrap]))

# Plot images
fig = px.imshow(X_train[idx], binary_string=True, facet_col=0,
                facet_col_wrap=facet_col_wrap,
                labels={'facet_col':'category'})

# Show category titles
for i, label in enumerate(facet_labels):
    fig.layout.annotations[i]['text'] = f"category = {category_list[label]}"

fig.show()
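
The title work-around above swaps the two halves of the label array before the titles are assigned. The sketch below shows that reordering on its own, using a hypothetical array of ten labels in place of `y_train[idx]`. It assumes, as the work-around implies, that the figure's annotations list the second row of panels before the first.

```python
import numpy as np

# Hypothetical labels for ten images, in the order the images were picked.
labels = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# Two rows of five: the wrap width is half the number of images.
half = len(labels) // 2

# Move the second half (second row) in front of the first half (first row).
reordered = np.hstack((labels[half:], labels[:half]))

print(reordered.tolist())
# [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
```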

## Data Preprocessing

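Add the *channel* dimension to the images, so that each grayscale image has shape `(28, 28, 1)`. This is the channels-last layout that Keras convolutional layers expect by default.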

In [6]:
X_train = X_train[:, :, :, np.newaxis]
X_test = X_test[:, :, :, np.newaxis]

print(f"Shape of training data: {X_train.shape}")
print(f"Shape of testing data: {X_test.shape}")

Shape of training data: (60000, 28, 28, 1)
Shape of testing data: (10000, 28, 28, 1)
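
The same trailing axis can be added in a few equivalent ways. The sketch below compares them on a small zero-filled array that stands in for the image data; the names `images`, `with_newaxis`, `with_ellipsis`, and `with_expand` are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for a batch of grayscale images: 4 images of 28x28.
images = np.zeros((4, 28, 28), dtype=np.uint8)

# Explicit slicing, as used in the cell above.
with_newaxis = images[:, :, :, np.newaxis]

# Ellipsis form: append an axis without spelling out every dimension.
with_ellipsis = images[..., np.newaxis]

# Function form: insert a new axis at the last position.
with_expand = np.expand_dims(images, axis=-1)

print(with_newaxis.shape, with_ellipsis.shape, with_expand.shape)
# (4, 28, 28, 1) (4, 28, 28, 1) (4, 28, 28, 1)
```

All three add a dimension of length 1 without changing any pixel values, so the choice between them is mostly a matter of readability.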
