## CNN Example

This example uses the MNIST dataset, a large database of handwritten digits (60,000 training and 10,000 test images, each 28x28 pixels in grayscale) that is widely used in ML tutorials.

Documentation: https://keras.io/api/datasets/mnist/

First, we load the libraries and dataset.

In [None]:
#import libraries
import matplotlib.pyplot as plt
#use the Keras bundled with TensorFlow throughout (avoids mixing two Keras installs)
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Flatten
from tensorflow.keras.optimizers import SGD

#load mnist dataset
#split into train/test
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Exercise 1: Print the raw data of the first image in the training dataset. Can you tell what it is? What do the numbers represent? Now use matplotlib to display the first 6 images from the training dataset.

In [None]:
import sys
import numpy
#print arrays in full instead of truncating them with "..."
numpy.set_printoptions(threshold=sys.maxsize)

Exercise 2: The Conv2D layer only accepts a 4D tensor. Replace the placeholders in the reshaping commands with the correct numbers. See the documentation for Conv2D at https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D
What do the dimensions of these new arrays represent?

In [None]:
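#Hint: reshaping (replace the quoted placeholders with numbers)
#X_train = X_train.reshape(('batch size','height','width','channels'))
#X_test = X_test.reshape(('batch size','height','width','channels'))
#X_train[1]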
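
As a rough illustration of the reshaping step, the sketch below adds a channel axis to a toy array. The array `toy_images` is a hypothetical stand-in (5 blank images), not the MNIST data, so the batch size here is not the one the exercise asks for.

```python
import numpy as np

# Hypothetical stand-in for a batch of grayscale images: 5 images of 28x28 pixels.
toy_images = np.zeros((5, 28, 28), dtype=np.uint8)

# Append a trailing axis of size 1: grayscale images have a single channel.
toy_4d = toy_images.reshape((toy_images.shape[0], 28, 28, 1))

print(toy_images.shape)  # (5, 28, 28)
print(toy_4d.shape)      # (5, 28, 28, 1)
```

The pixel values are untouched; only the way they are indexed changes, so `toy_4d[0]` is now a 28x28x1 block rather than a 28x28 matrix.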

Exercise 3: These images are in grayscale. Each pixel value is an intensity between 0 and 255. Normalise the images by dividing by the maximum pixel value (255) so that every value lies between 0 and 1. How does this change the images?

In [None]:
#scaling

print('X_train shape:', X_train.shape) #X_train shape: (60000, 28, 28, 1)
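
A minimal sketch of this kind of scaling, shown on a hypothetical 2x2 array (`toy_pixels`) rather than on MNIST. Casting to float first is an assumption made here so that the division keeps fractional values in a compact dtype.

```python
import numpy as np

# Hypothetical 2x2 "image" with 8-bit intensities.
toy_pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Convert to float, then divide by the maximum possible intensity.
toy_scaled = toy_pixels.astype("float32") / 255.0

print(toy_scaled.dtype)
print(toy_scaled.min(), toy_scaled.max())
```

The ordering of the values is preserved, so the brightest pixel before scaling is still the brightest afterwards.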

Next, we one-hot encode the outcome variable (the digit labels 0 to 9), so that each label becomes a vector with one entry per category.

In [None]:
#set number of categories
num_category = 10
# convert class vectors to binary class matrices
y_train = to_categorical(y_train, num_category)
y_test = to_categorical(y_test, num_category)
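
To make the encoding concrete, here is a NumPy-only sketch that produces the same kind of binary class matrix for three hypothetical labels. Indexing an identity matrix is a stand-in chosen for illustration, not how Keras implements `to_categorical`.

```python
import numpy as np

toy_labels = np.array([5, 0, 4])   # hypothetical digit labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[toy_labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # a single 1.0 at index 5, zeros elsewhere
```

Taking `np.argmax(one_hot, axis=1)` recovers the original integer labels.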

Specify the model structure.

In [None]:
#model 1
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(num_category, activation='softmax'))
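
The arithmetic below sketches how the tensor shape and parameter count evolve through model 1. It assumes the Keras defaults for the layers above ('valid' padding and stride 1 for Conv2D, pooling stride equal to the pool size); the variable names are illustrative only.

```python
# Input: 28x28 grayscale image, one channel.
height, width, channels = 28, 28, 1
filters, kernel = 32, 3

# Conv2D with 'valid' padding shrinks each side by kernel - 1.
conv_h, conv_w = height - kernel + 1, width - kernel + 1   # 26, 26
conv_params = (kernel * kernel * channels + 1) * filters    # weights + 1 bias per filter

# MaxPooling2D((2, 2)) halves each spatial dimension.
pool_h, pool_w = conv_h // 2, conv_w // 2                   # 13, 13

# Flatten turns the 13x13x32 volume into one vector.
flat = pool_h * pool_w * filters                            # 5408

# Dense layers: (inputs + 1 bias) * units.
dense1_params = (flat + 1) * 100
dense2_params = (100 + 1) * 10

total = conv_params + dense1_params + dense2_params
print(conv_params, dense1_params, dense2_params, total)  # 320 540900 1010 542230
```

These figures can be compared against the output of `model.summary()`. Almost all of the parameters sit in the first Dense layer, not in the convolution.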

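Next, compile and fit the model. Training can take anywhere from several minutes to days, depending on the size of the data and of the model. Because no optimizer is passed to `compile` below, Keras falls back on its default (`rmsprop`).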

In [None]:
#compile model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'])
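
For intuition about the loss being minimised, the following NumPy sketch computes categorical cross-entropy for one hypothetical 3-class example. The helper name and the clipping constant `eps` are assumptions for illustration, not the Keras internals.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])            # true class is index 1
confident = np.array([[0.05, 0.90, 0.05]])      # high probability on the true class
unsure = np.array([[0.40, 0.30, 0.30]])         # low probability on the true class

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident)  # about 0.105
print(loss_unsure)     # about 1.204
```

Only the probability assigned to the true class enters the sum, so the loss is small when that probability is close to 1 and grows as it falls.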

In [None]:
#fit the model
batch_size = 128
num_epoch = 10
model_log = model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=num_epoch,
          verbose=1,
          validation_data=(X_test, y_test))

Loss and accuracy can both be used to assess the performance of the model.

In [None]:
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0]) 
print('Test accuracy:', score[1])
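
Accuracy here is the fraction of images whose highest-probability class matches the true class. A minimal sketch on three hypothetical predictions (the arrays below are made up, not model output):

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples and 3 classes.
toy_probs = np.array([[0.1, 0.7, 0.2],
                      [0.8, 0.1, 0.1],
                      [0.3, 0.3, 0.4]])
# Matching one-hot targets.
toy_true = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 1]])

pred_classes = np.argmax(toy_probs, axis=1)   # [1, 0, 2]
true_classes = np.argmax(toy_true, axis=1)    # [1, 2, 2]

accuracy = np.mean(pred_classes == true_classes)
print(accuracy)  # 2 of 3 correct
```

Unlike the loss, accuracy ignores how confident each prediction was, which is one reason the two metrics can move differently during training.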

In [None]:
# plotting the metrics
fig = plt.figure()
plt.subplot(2,1,1)
plt.plot(model_log.history['accuracy'])
plt.plot(model_log.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='lower right')
plt.subplot(2,1,2)
plt.plot(model_log.history['loss'])
plt.plot(model_log.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.tight_layout()
plt.show()

Exercise 4: Run model 1 and model 2. For model 2, you will need to specify an additional convolutional layer after the first that is twice as large (twice as many filters). You will also need to specify a dropout layer after the pooling layer. Which of these two models performs better?
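
As background for the dropout layer, the sketch below applies dropout by hand to a hypothetical array of activations. It assumes the usual training-time behaviour (a random fraction `rate` of values is zeroed and the survivors are scaled by 1 / (1 - rate)); the function is illustrative and is not the Keras layer itself.

```python
import numpy as np

def dropout(x, rate, rng):
    """Zero a random fraction `rate` of entries and rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
activations = np.ones((4, 5))          # hypothetical layer output
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)
```

With `rate=0.5`, every entry of the result is either 0 or 2, so the expected value of each activation stays the same. At inference time the layer passes its input through unchanged.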

Exercise 5: After finding the better of the two models, specify an optimiser explicitly (for example, the `SGD` optimiser imported above) and report the results. Does changing the learning rate within the optimiser affect the outcome?
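
To build intuition for the learning rate before experimenting on the CNN, the toy sketch below runs plain gradient descent on f(w) = w^2, whose minimum is at w = 0. The function and the three learning rates are arbitrary choices for illustration and say nothing about which value suits the MNIST model.

```python
def gradient_descent(lr, steps=50, w=1.0):
    """Minimise f(w) = w**2 starting from w, using a fixed learning rate."""
    for _ in range(steps):
        grad = 2.0 * w        # derivative of w**2
        w = w - lr * grad     # gradient descent update
    return w

for lr in (0.01, 0.1, 1.1):
    print(lr, abs(gradient_descent(lr)))
```

On this toy problem the smallest rate moves towards the minimum slowly, the middle one gets very close within 50 steps, and the largest one overshoots on every step and diverges.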