#### To run the notebook, either press the Run button above, or select each cell in order and press Shift+Enter.

We will build a convolutional neural network to classify images from the Fashion-MNIST dataset.

The Fashion-MNIST dataset is a drop-in replacement for the classic MNIST digit recognition dataset. Fashion-MNIST contains 28x28 greyscale images of articles of clothing.

Each image is labelled with a digit from 0 to 9, with the following meanings:


In [None]:
label_meanings={0:"T-Shirt/Top", 
                1:"Trouser", 
                2:"Pullover", 
                3:"Dress", 
                4:"Coat",
                5:"Sandal",
                6:"Shirt",
                7:"Sneaker",
                8:"Bag",
                9:"Ankle Boot"
               }

We will start by importing the packages we will be using.

In [None]:
import keras
import gzip
from os import path
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

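The following code reads the data from the /demo_data directory, which we mounted into the container as a volume, into NumPy arrays. The offset arguments skip the file headers: 8 bytes for the label files and 16 bytes for the image files.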

In [None]:
files = ['train-labels-idx1-ubyte.gz', 'train-images-idx3-ubyte.gz',
         't10k-labels-idx1-ubyte.gz', 't10k-images-idx3-ubyte.gz']
# Build the full path to each archive inside the mounted volume
paths = [path.join('/demo_data', fname) for fname in files]
    
with gzip.open(paths[0], 'rb') as lbpath:
    train_labels = np.frombuffer(lbpath.read(), np.uint8, offset=8)
with gzip.open(paths[1], 'rb') as imgpath:
    train_data = np.frombuffer(imgpath.read(), np.uint8, offset=16).reshape(len(train_labels), 28, 28)
with gzip.open(paths[2], 'rb') as lbpath:
    test_labels = np.frombuffer(lbpath.read(), np.uint8, offset=8)
with gzip.open(paths[3], 'rb') as imgpath:
    test_data = np.frombuffer(imgpath.read(), np.uint8, offset=16).reshape(len(test_labels), 28, 28)
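The hard-coded offsets above work because the files use the IDX layout: a 4-byte magic number followed by one big-endian 32-bit size per dimension. As a rough sketch, the header can also be parsed so that the array shape comes from the file itself. The helper name `read_idx` and the synthetic in-memory file are assumptions made for illustration; they are not part of the notebook.

```python
import gzip
import io
import struct

import numpy as np


def read_idx(fileobj):
    """Parse an unsigned-byte IDX stream into a NumPy array (hypothetical helper)."""
    # Magic number: two zero bytes, a dtype code (0x08 = unsigned byte), number of dimensions.
    zero, dtype_code, ndim = struct.unpack('>HBB', fileobj.read(4))
    if zero != 0 or dtype_code != 0x08:
        raise ValueError('not an unsigned-byte IDX file')
    # One big-endian 32-bit unsigned integer per dimension.
    shape = struct.unpack('>' + 'I' * ndim, fileobj.read(4 * ndim))
    return np.frombuffer(fileobj.read(), dtype=np.uint8).reshape(shape)


# Synthetic stand-in for an image file: 2 images of 28x28 pixels, gzip-compressed in memory.
pixels = (np.arange(2 * 28 * 28) % 256).astype(np.uint8)
raw = struct.pack('>HBBIII', 0, 0x08, 3, 2, 28, 28) + pixels.tobytes()
compressed = gzip.compress(raw)

with gzip.open(io.BytesIO(compressed), 'rb') as f:
    images = read_idx(f)

print(images.shape)  # (2, 28, 28)
```

For an image file the header is 4 + 3*4 = 16 bytes, and for a label file it is 4 + 1*4 = 8 bytes, which matches the offsets used in the cell above.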

We can visualize some of the data using matplotlib:

In [None]:
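# Create a 3x8 grid of axes up front instead of layering subplots over a single axes
fig, axs = plt.subplots(nrows=3, ncols=8, figsize=(10,4))
for i, ax in enumerate(axs.flat):
    # Invert the pixel values so the clothing appears dark on a light background
    image = 255 - test_data[10+i].reshape(28,28)
    ax.imshow(image, cmap='Greys_r')
    ax.axis('off')
plt.suptitle('Examples of Fashion-MNIST Data')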

We need to tell Keras that our image data uses the 'channels_last' layout, so each image is an array of shape (28,28,1) with the channel axis last.

In [None]:
from keras import backend as K
K.set_image_data_format('channels_last')

We will reshape our data accordingly, and set a variable to hold the shape of each individual image, which will be used when we set up the input layer of our network.

In [None]:
img_rows, img_cols = 28, 28
train_data = train_data.reshape(train_data.shape[0], img_rows, img_cols, 1)
test_data = test_data.reshape(test_data.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
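To make the two layouts concrete, here is a small sketch on a hypothetical stand-in array (`batch` is not a variable from the notebook). It only shows where the single greyscale channel ends up in each case.

```python
import numpy as np

# Hypothetical stand-in for the image data: 5 greyscale images of 28x28 pixels.
batch = np.zeros((5, 28, 28), dtype=np.uint8)

# channels_last: (samples, rows, cols, channels), the layout used in this notebook
channels_last = batch.reshape(batch.shape[0], 28, 28, 1)

# channels_first: (samples, channels, rows, cols), shown only for comparison
channels_first = batch.reshape(batch.shape[0], 1, 28, 28)

print(channels_last.shape)   # (5, 28, 28, 1)
print(channels_first.shape)  # (5, 1, 28, 28)
```

Because there is only one channel, both reshapes keep the pixel order unchanged; only the position of the length-1 axis differs.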

We will normalize each pixel value to a float32 value between 0 and 1:

In [None]:
train_data = train_data.astype('float32')
test_data = test_data.astype('float32')
train_data = train_data / 255
test_data = test_data / 255
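As a quick sanity check of the scaling step, the sketch below applies the same conversion to a few hypothetical pixel values (the `pixels` array is made up for illustration) and confirms the result is float32 in the range 0 to 1.

```python
import numpy as np

# Hypothetical pixel values covering the full 8-bit range.
pixels = np.array([0, 51, 255], dtype=np.uint8)

# Convert to float32 first, then scale, mirroring the cell above.
scaled = pixels.astype('float32') / 255

print(scaled.dtype, scaled.min(), scaled.max())
```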

We will also convert the labels from digits 0-9 to 'one-hot' encodings.

In [None]:
num_classes = 10
train_labels = keras.utils.to_categorical(train_labels, num_classes)
test_labels = keras.utils.to_categorical(test_labels, num_classes)
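A one-hot encoding replaces each label with a vector of length num_classes that is 1 at the label's index and 0 elsewhere. The following NumPy-only sketch illustrates the idea on three hypothetical labels; it is an illustration of the encoding, not a description of how Keras implements `to_categorical`.

```python
import numpy as np

num_classes = 10
labels = np.array([0, 9, 2], dtype=np.uint8)  # hypothetical labels

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[1])     # 1.0 in the last position, for class 9

# argmax along the class axis recovers the original digits.
recovered = one_hot.argmax(axis=1)
```

The same `argmax` trick is what later cells use to turn one-hot labels and predicted probabilities back into class indices.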

Now that our data is preprocessed, we can build our model.

In [None]:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Dense, Dropout, Flatten
# multi_gpu_model is exposed directly from keras.utils in the Keras releases that ship it
from keras.utils import multi_gpu_model
model = Sequential()

When we set the model to use GPUs, the Jupyter server's logs will show that the TensorFlow backend has access to them.

#### Note: if your machine has a different number of GPUs, change the value of gpus= in the last line of the following cell.

In [None]:
model.add(Conv2D(32, kernel_size=4, activation='relu', input_shape=input_shape))
model.add(Conv2D(64, (4,4), activation='relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model = multi_gpu_model(model, gpus=2)
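To see how the layer sizes fit together, the arithmetic below traces the spatial size of a 28x28 input through the network and counts trainable parameters. It assumes the Keras defaults for the layers above (no padding and stride 1 for Conv2D, pooling stride equal to the pool size); the helper functions are hypothetical names used only in this sketch.

```python
def window_out(size, kernel, stride=1):
    """Output size of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


side = 28
side = window_out(side, 4)            # Conv2D(32, kernel_size=4) -> 25
side = window_out(side, 4)            # Conv2D(64, (4,4))         -> 22
side = window_out(side, 2, stride=2)  # MaxPooling2D((2,2))       -> 11
flat = side * side * 64               # Flatten                   -> 7744

total = (conv_params(4, 1, 32) + conv_params(4, 32, 64)
         + dense_params(flat, 512) + dense_params(512, 128)
         + dense_params(128, 10) + dense_params(10, num_classes := 10))

print(side, flat, total)  # 11 7744 4065880
```

Almost all of the roughly four million parameters sit in the first Dense layer, which connects the 7744 flattened features to 512 units.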

We also have to set our loss function and learning algorithm. Here we use mean squared error as the loss and Adagrad as the optimizer.

In [None]:
model.compile(loss=keras.losses.mean_squared_error,
              optimizer=keras.optimizers.Adagrad(),
              metrics=['accuracy'])
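For intuition about what the loss measures, the sketch below computes mean squared error by hand for one hypothetical one-hot label and two made-up softmax outputs. Categorical cross-entropy, a common alternative for classification that this notebook does not use, is included only for comparison. All arrays and function names here are illustrative assumptions.

```python
import numpy as np

target = np.array([0.0, 0.0, 1.0])        # one-hot label for class 2 (3 classes for brevity)
confident = np.array([0.05, 0.05, 0.90])  # hypothetical output, mostly correct
wrong = np.array([0.90, 0.05, 0.05])      # hypothetical output, mostly incorrect


def mse(y_true, y_pred):
    # Mean of the squared differences over the class axis.
    return np.mean((y_true - y_pred) ** 2)


def cross_entropy(y_true, y_pred, eps=1e-7):
    # Negative log-probability assigned to the true class; clipping avoids log(0).
    return -np.sum(y_true * np.log(np.clip(y_pred, eps, 1.0)))


for name, pred in [('confident', confident), ('wrong', wrong)]:
    print(name, mse(target, pred), cross_entropy(target, pred))
```

Both losses are smaller for the mostly correct prediction. Cross-entropy penalizes the confident mistake much more steeply, which is one reason it is often chosen for classifiers.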

In [None]:
batch_size = 500
epochs = 8
model.fit(train_data, train_labels,
          batch_size = batch_size,
          epochs = epochs,
          validation_data=(test_data[:5000], test_labels[:5000]))

Now let's test our model on the second half of the test data, which it has never seen before.

In [None]:
score = model.evaluate(test_data[5000:], test_labels[5000:], verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

The model is about 90% accurate. Let's see where the model goes wrong.

We can run the second half of the test data on the model, and save the inferred results.

In [None]:
predicted = model.predict_on_batch(test_data[5000:])

We can see what the model predicted on a few of the images:

In [None]:
fig, axs = plt.subplots(nrows=3, ncols=6, figsize=(9,6))
ax = axs.flat
for i in range(18):
    digit_image = np.ones((28,28)) - test_data[5000+i].reshape(28,28)
    ax[i].imshow(digit_image, cmap='Greys_r')
    ax[i].set_title(label_meanings[np.argmax(predicted[i])])
    ax[i].axis('off')
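Each row of the model output is a vector of ten class probabilities, and the plot titles come from taking the index of the largest one. A minimal sketch of that decoding step, using two hypothetical probability rows in place of real model output:

```python
import numpy as np

label_meanings = {0: "T-Shirt/Top", 1: "Trouser", 2: "Pullover", 3: "Dress", 4: "Coat",
                  5: "Sandal", 6: "Shirt", 7: "Sneaker", 8: "Bag", 9: "Ankle Boot"}

# Hypothetical softmax outputs for two images (each row sums to 1).
probs = np.full((2, 10), 0.02)
probs[0, 9] = 0.82
probs[1, 6] = 0.82

class_idx = probs.argmax(axis=1)   # most likely class per image
confidence = probs.max(axis=1)     # probability assigned to that class
names = [label_meanings[i] for i in class_idx]

print(names)  # ['Ankle Boot', 'Shirt']
```

Keeping `confidence` alongside the label makes it easy to spot predictions the model was unsure about.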

We will use scikit-learn to create a confusion matrix, and seaborn to visualize it.

In [None]:
from sklearn.metrics import confusion_matrix
mat = confusion_matrix(np.argmax(test_labels[5000:],  axis=1), np.argmax(predicted, axis=1))
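In the matrix returned by scikit-learn, entry (i, j) counts the samples whose true class is i and whose predicted class is j. The sketch below rebuilds such a matrix with plain NumPy on a handful of hypothetical labels, then derives the per-class recall and the most frequent confusion. The data and variable names are assumptions for illustration.

```python
import numpy as np

num_classes = 10
# Hypothetical true and predicted class indices.
true_idx = np.array([0, 0, 6, 6, 6, 6, 2])
pred_idx = np.array([0, 6, 6, 0, 0, 6, 2])

# Rows are true classes, columns are predicted classes.
counts = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(counts, (true_idx, pred_idx), 1)

# Per-class recall: correct predictions divided by the number of true samples in the class.
row_totals = counts.sum(axis=1)
recall = np.divide(counts.diagonal(), row_totals,
                   out=np.zeros(num_classes), where=row_totals > 0)

# Largest off-diagonal entry: the most common (true, predicted) mistake.
off_diagonal = counts.copy()
np.fill_diagonal(off_diagonal, 0)
worst_true, worst_pred = np.unravel_index(off_diagonal.argmax(), off_diagonal.shape)

print(recall[6], worst_true, worst_pred)  # 0.5 6 0
```

In this toy data, class 6 (Shirt) is most often mistaken for class 0 (T-Shirt/Top); the same two lookups can be applied to the real matrix to read off where the model goes wrong.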


In [None]:
import seaborn as sns
from matplotlib import colors
sns.set()
sns.set_context("notebook", font_scale=1.5)

In [None]:
fig, ax = plt.subplots(figsize=(8,8))
sns.heatmap(mat.T, annot=True, fmt='d', cbar=False, cmap="Blues", 
            xticklabels=label_meanings.values(), yticklabels=label_meanings.values(),
            norm=colors.SymLogNorm(vmin=mat.T.min(), vmax=mat.T.max(),  linthresh=5.0, linscale=10.0), ax=ax)
plt.xlabel('true label')
plt.ylabel('predicted')

Thanks to:

Keras examples: https://github.com/keras-team/keras/tree/master/examples

These scikit-learn examples, for the idea of using a seaborn heatmap to display a confusion matrix:
https://jakevdp.github.io/PythonDataScienceHandbook/05.11-k-means.html