# Practical on Convolutional Neural Networks

During this practical you will learn how to develop and train a convolutional neural network (CNN) for recognising hand-written digits from the MNIST dataset (you can read more here: https://en.wikipedia.org/wiki/MNIST_database), using the Keras library: https://keras.io/


In [None]:
#import libraries
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.utils import np_utils

#The dataset can be downloaded and loaded through the keras library as follows:
#(the first time might take longer because the dataset will be downloaded first)
(X_train, y_train), (X_test, y_test) = mnist.load_data()

### Visualise the data 

To view the first four examples of the dataset, we will use matplotlib. The command %matplotlib inline tells your Jupyter notebook to display the images inline; if you forget it, the images may not show in your browser.

The arguments (2,2,1), (2,2,2), (2,2,3), and (2,2,4) below give the number of rows, the number of columns and the position of each subplot in a 2x2 grid: top left, top right, bottom left and bottom right.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.subplot(2,2,1)
plt.imshow(X_train[0], cmap=plt.get_cmap('gray'))
plt.subplot(2,2,2)
plt.imshow(X_train[1], cmap=plt.get_cmap('gray'))
plt.subplot(2,2,3)
plt.imshow(X_train[2], cmap=plt.get_cmap('gray'))
plt.subplot(2,2,4)
plt.imshow(X_train[3], cmap=plt.get_cmap('gray'))

#show the plot
plt.show()

# 1. Data preprocessing

After we load the MNIST dataset, we will need to reshape it so that it is suitable for training a CNN. 
In Keras, the layers used for two-dimensional convolutions expect inputs with the following dimensions (using the default channels-last data format):

[samples,height,width,channels]

Note that because the MNIST data is in greyscale, we only have one channel, i.e. the channel dimension is 1. If we had RGB images, the channel dimension would be 3, and it would be like having 3 image inputs for every colour image. Because MNIST images are square (28x28), the order of height and width makes no difference here.

In [None]:
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1)).astype('float32')
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1)).astype('float32')
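
To make the effect of the reshape concrete, here is a minimal sketch that uses a small random stand-in array instead of the real MNIST data. The name `fake_images` and the batch size of 5 are assumptions made only for this illustration.

```python
import numpy as np

# Stand-in for X_train: 5 greyscale images of 28x28 pixels (hypothetical data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Add a trailing channel axis of size 1 and convert to float32.
reshaped = fake_images.reshape((fake_images.shape[0], 28, 28, 1)).astype('float32')

print(fake_images.shape)  # (5, 28, 28)
print(reshaped.shape)     # (5, 28, 28, 1)
```

The pixel values themselves are unchanged; only an extra axis is added so that each image carries an explicit channel dimension.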

### Scaling 

For 8-bit greyscale images such as MNIST, the pixel values are between 0 and 255. When using neural networks, it is common practice to scale the input values by dividing each value by the maximum of 255, so that they lie between 0 and 1. Keeping the inputs in a small, consistent range generally helps gradient-based training. This process is called normalisation.

In [None]:
X_train = X_train / 255
X_test = X_test / 255
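
As a quick check of what the division does, the sketch below applies it to a tiny hypothetical 2x2 "image" (the array `pixels` is made up for this example and is not part of MNIST).

```python
import numpy as np

# Hypothetical 2x2 image holding raw 8-bit intensities.
pixels = np.array([[0, 64], [128, 255]], dtype='float32')

# Divide by the maximum possible intensity so every value lies in [0, 1].
scaled = pixels / 255

print(float(scaled.min()), float(scaled.max()))  # 0.0 1.0
```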

### Encoding of outputs

Finally, the output variable is an integer from 0 to 9. This is a multi-class classification problem (as opposed to a binary one, where we only predict between two classes). Therefore, we will encode the outputs using one-hot encoding, where each digit becomes a vector of 10 values with a single 1 at the position of the digit:

Digit 0 ---> [1,0,0,0,0,0,0,0,0,0]<br>
Digit 1 ---> [0,1,0,0,0,0,0,0,0,0]<br>
Digit 2 ---> [0,0,1,0,0,0,0,0,0,0]<br>
...<br>
Digit 9 ---> [0,0,0,0,0,0,0,0,0,1]

In [None]:
# one-hot encoding for outputs
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)
num_classes = y_test.shape[1]
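
To see what one-hot encoding produces without loading the dataset, here is a minimal NumPy sketch. The helper `one_hot` is a hypothetical stand-in written for this illustration; it is not the Keras implementation, although it should give the same kind of output for integer labels.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]

encoded = one_hot([0, 1, 9])
print(encoded)

# np.argmax recovers the original integer labels from the encoding.
decoded = np.argmax(encoded, axis=1)
print(decoded)  # [0 1 9]
```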

# 2. Defining a CNN model

The function below defines a sequential model, i.e. a linear stack of layers, which is how we build our CNN in Keras. Our model has a convolutional layer with 32 filters of size 5x5 and ReLU as the activation function (you can define more arguments, see here: https://keras.io/layers/convolutional/).

The CNN also includes a max-pooling layer with strides 2x2, a dropout layer (which randomly sets a fraction of its input units, given by the rate argument, to 0 at each update during training, which helps prevent overfitting) and two fully-connected layers (which are denoted as Dense in Keras).
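
The pooling and dropout steps described above can be sketched in plain NumPy. This is a simplified illustration, not the Keras implementation: the helpers `max_pool_2x2` and `dropout` are hypothetical names, the pooling only handles a single 2D array with even side lengths, and the rescaling of the surviving units by 1/(1 - rate) follows the behaviour documented for the Keras Dropout layer.

```python
import numpy as np

def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2D array with even side lengths."""
    h, w = image.shape
    # Group pixels into non-overlapping 2x2 blocks and keep each block's maximum.
    return image.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def dropout(values, rate, rng):
    """Zero a random fraction `rate` of the values and rescale the rest."""
    keep_mask = rng.random(values.shape) >= rate
    return values * keep_mask / (1.0 - rate)

feature_map = np.arange(16, dtype='float32').reshape(4, 4)
pooled = max_pool_2x2(feature_map)
print(pooled)  # 2x2 result: 5 and 7 in the top row, 13 and 15 in the bottom row

dropped = dropout(np.ones(10000), rate=0.2, rng=np.random.default_rng(0))
print((dropped == 0).mean())  # roughly 0.2 of the units are zeroed
```

Pooling halves each spatial dimension, which is why a 24x24 feature map becomes 12x12 after this layer. Dropout is only active during training; at evaluation time all units are kept.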

The model will be trained using stochastic gradient descent ("sgd"), and it will use the mean squared error (MSE) as its loss function.
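
To get a feel for the loss function, the sketch below computes the MSE by hand for two hypothetical predictions and, for comparison, the categorical cross-entropy, which is a common alternative for multi-class problems. The helper functions and the three-class example values are assumptions made for this illustration; in practice Keras computes the loss for you.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between target and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_crossentropy(y_true, y_pred):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [0, 0, 1]               # one-hot target (3 classes for brevity)
confident = [0.05, 0.05, 0.90]   # good prediction
wrong = [0.70, 0.20, 0.10]       # poor prediction

mse_confident = mean_squared_error(y_true, confident)    # about 0.005
mse_wrong = mean_squared_error(y_true, wrong)            # about 0.447
ce_confident = categorical_crossentropy(y_true, confident)  # about 0.105
ce_wrong = categorical_crossentropy(y_true, wrong)          # about 2.303

print("MSE:", mse_confident, mse_wrong)
print("Cross-entropy:", ce_confident, ce_wrong)
```

Both losses are lower for the better prediction, but the cross-entropy penalises the poor prediction much more strongly, which is worth keeping in mind when you experiment with different loss functions.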

In [None]:
def cnn_model():
    model = Sequential()
    model.add(Conv2D(32, (5, 5), input_shape=(28, 28, 1), activation='relu'))
    model.add(MaxPooling2D(strides=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    #Compile the model (training happens later, when fit is called)
    model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy'])
    return model
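
The layer sizes of this model can be worked out by hand, which is a useful sanity check against the output of model.summary(). The sketch below assumes the Keras defaults used in cnn_model (no padding for the convolution, a 2x2 pooling window) and 10 output classes; the helper `conv_output_size` is a hypothetical name introduced for this example.

```python
def conv_output_size(size, kernel, stride=1):
    """Output side length of an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1

# Conv2D(32, (5, 5)) on a 28x28x1 input
conv_side = conv_output_size(28, kernel=5)                # 24
conv_params = (5 * 5 * 1 + 1) * 32                        # weights + bias per filter: 832

# MaxPooling2D with a 2x2 window and strides 2x2
pool_side = conv_output_size(conv_side, kernel=2, stride=2)  # 12

# Flatten, then Dense(128) and Dense(10)
flat = pool_side * pool_side * 32                         # 4608
dense1_params = (flat + 1) * 128                          # 589952
dense2_params = (128 + 1) * 10                            # 1290

total_params = conv_params + dense1_params + dense2_params
print(conv_side, pool_side, flat, total_params)  # 24 12 4608 592074
```

Dropout and Flatten have no trainable parameters, so almost all of the weights sit in the first Dense layer.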


In [None]:
#define the model
model = cnn_model()
model.summary()
#train the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=200)
#Evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("CNN Accuracy: %.2f%%" % (scores[1]*100))
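
The accuracy reported above is the fraction of test images whose most probable predicted class matches the true class. A minimal sketch of that calculation, using made-up softmax outputs for 4 samples and 3 classes (the arrays `probs` and `y_true` are hypothetical), looks like this:

```python
import numpy as np

# Hypothetical softmax outputs: one row per sample, one column per class.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])

# One-hot true labels for classes 0, 1, 2 and 1.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

predicted = np.argmax(probs, axis=1)   # most probable class per sample
actual = np.argmax(y_true, axis=1)     # undo the one-hot encoding
accuracy = float(np.mean(predicted == actual))

print("Accuracy: %.2f%%" % (accuracy * 100))  # Accuracy: 75.00%
```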

### Exercise(s): 

1. Define a new function "deep_cnn_model", that includes three convolutional layers, with maxpooling in between them. <br>
2. Evaluate your model as above. Which of the two models is more accurate? <br>
3. Finally, try changing the loss function, the optimizer, strides, epochs etc. (check keras documentation for examples). Does your model perform better? 

In [None]:
def deep_cnn_model(): 
    model = Sequential()
    model.add(Conv2D(30, (5, 5), input_shape=(28, 28, 1), activation='relu'))
    model.add(MaxPooling2D())
    model.add(Conv2D(30, (3, 3), activation='relu'))
    model.add(MaxPooling2D())
    model.add(Conv2D(30, (2, 2), activation='relu'))
    model.add(MaxPooling2D())
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model