<a href="https://colab.research.google.com/github/carlosfmorenog/CMM536/blob/master/CMM536_Topic_8/CMM536_T8_Lab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Topic 8 Lab

Now that you know the basics of a CNN, you will run one in just a few lines of code! To do so, we will use `Keras` with a `TensorFlow` backend, along with the very popular `mnist` dataset of handwritten digits.

First, install the necessary packages if you don't have them already:

In [None]:
# 0. Installing the necessary packages
!pip install keras
!pip install tensorflow

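Then we will ensure that `Keras` uses `TensorFlow` as its backend. This must be done before `Keras` is imported. Notice that if you prefer to use another backend such as `Theano`, you simply need to change the name in the second line of the code (which backends are supported depends on your `Keras` version).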

In [None]:
# 1. Ensure you are using the TensorFlow backend
import os
os.environ['KERAS_BACKEND'] = 'tensorflow'
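
As a small sanity check, the sketch below (standard library only) sets the variable and reads it back. It assumes that `Keras` looks at `KERAS_BACKEND` when it is first imported, so the assignment has to run before any `import keras`. The helper `requested_backend` is a made-up name for this example and is not part of `Keras`.

```python
import os

# Set the backend *before* Keras is imported; changing it afterwards is
# assumed to have no effect on a Keras module that is already loaded.
os.environ['KERAS_BACKEND'] = 'tensorflow'

def requested_backend(default='tensorflow'):
    """Return the backend name requested through the environment (hypothetical helper)."""
    return os.environ.get('KERAS_BACKEND', default)

print(requested_backend())
```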

Now you will import the necessary packages:

In [None]:
# 2. Import libraries and modules
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist
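# keras.utils provides to_categorical; the old keras.utils.np_utils module
# is not available in recent Keras releases, so we alias keras.utils instead
from keras import utils as np_utils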

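Then, we will set a **random seed** so that the results can be reproduced on every run. Note that this seeds NumPy only; the backend keeps its own random state, so the training results may still vary slightly between runs.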

In [None]:
# 3. Set random seed (for reproducibility)
np.random.seed(123)
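
To see what the seed does, here is a minimal NumPy-only sketch: resetting the seed to the same value makes the generator produce the same numbers again. The variable names are just for illustration.

```python
import numpy as np

np.random.seed(123)
first_draw = np.random.rand(3)

np.random.seed(123)              # reset the generator to the same state
second_draw = np.random.rand(3)

# Both draws are identical because they start from the same seed
print(np.array_equal(first_draw, second_draw))
```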

With the following cell you will download the data from the `mnist` dataset. Notice that the data comes already partitioned into training and test sets:

In [None]:
# 4. Load pre-shuffled MNIST data into train and test sets
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

Let's check the shapes of the arrays obtained:

In [None]:
print(X_train.shape, Y_train.shape, X_test.shape, Y_test.shape)

Notice that we have 60'000 samples for training and 10'000 for testing.

Now we will preprocess the data to be used by the classifier. To be able to use Keras, we need to do the following:
1. **Reshape** the data into **four** dimensions, i.e. the training set will be of shape (60000,28,28,1) and the test set of shape (10000,28,28,1). This is needed since the network expects an input of shape (28,28,1) **for each of the samples**.
2. Convert the input to `float32`, so that the division in the next step can store fractional values (the in-place division would fail on the original integer array).
3. **Normalise**, i.e. divide all values by 255 so that every pixel value lies between 0 and 1.

In [None]:
# 5. Preprocess input data
# Reshape into four dimensions.
X_train_reshape = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test_reshape = X_test.reshape(X_test.shape[0], 28, 28, 1)
# Convert to float 32
X_train_reshape = X_train_reshape.astype('float32')
X_test_reshape = X_test_reshape.astype('float32')
# normalise
X_train_reshape /= 255 
X_test_reshape /= 255
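
If you want to try these three steps without downloading the dataset, the sketch below applies them to a few randomly generated images. The array `fake_images` is made up for illustration and only mimics the shape and integer type of the `mnist` images.

```python
import numpy as np

# Four fake 28x28 grayscale images with integer pixel values in [0, 255]
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Add a trailing channel axis: (4, 28, 28) -> (4, 28, 28, 1)
fake_reshape = fake_images.reshape(fake_images.shape[0], 28, 28, 1)
# Convert to float32 so that the division can store fractional values
fake_reshape = fake_reshape.astype('float32')
# Scale the pixel values into the range [0, 1]
fake_reshape /= 255

print(fake_reshape.shape, fake_reshape.dtype)
print(fake_reshape.min(), fake_reshape.max())
```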

The loss function used later (categorical cross-entropy) expects the targets to be categorical (one-hot encoded), i.e. instead of the target being a value from 0 to 9, each target will be a vector of length 10 with a 1 in the position of the class and 0 everywhere else. Run the following cell to see what I mean...

In [None]:
# 6. Preprocess class labels
Y_train_categorical = np_utils.to_categorical(Y_train, 10)
Y_test_categorical = np_utils.to_categorical(Y_test, 10)
# Show a sample target entry. You will see that this sample corresponds to a 5 as
# there is a one in the 6th position (remember that Python starts counting at 0)
print(Y_train_categorical[0])
plt.imshow(X_train[0])
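
The same encoding can be sketched with plain NumPy, which may help to see what is happening. This is only an illustration with made-up labels, not necessarily how `to_categorical` is implemented internally.

```python
import numpy as np

labels = np.array([5, 0, 4, 1, 9])     # example digit labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype='float32')[labels]
print(one_hot[0])

# argmax goes back from one-hot vectors to integer labels
recovered = np.argmax(one_hot, axis=1)
print(recovered)
```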

Now it's time to build the model. We will define a **sequential** CNN with two convolutional layers, a **max pooling** of size $2 \times 2$ and a **dropout** of $0.25$. Then, we will add a **flatten** layer, add a **densely connected** layer with a **ReLU** activation, afterwards add another **dropout** of $0.5$, and finally add a densely connected layer to the output with a **softmax** activation function. This configuration is not strict and you can find many different examples, such as [this other one](https://towardsdatascience.com/image-classification-in-10-minutes-with-mnist-dataset-54c35b77a38d).
**NOTE:** If you get an error about the input shape, your `Keras` installation may expect the channel dimension first. In that case you may need to change the `input_shape` to (1,28,28), which means that you also need to change the shape of `X_train_reshape` and `X_test_reshape` to (X.shape[0],1,28,28).

In [None]:
# 7. Define model architecture
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D

model = Sequential()
 
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28,28,1)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
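
To get a feeling for the size of this network, the sketch below works out the output sizes and the number of trainable parameters by hand. It assumes the default layer settings (convolutions with stride 1 and no padding, pooling stride equal to the pool size); the helper functions are written for this example only. You can compare the result with the output of `model.summary()`.

```python
def conv_out(size, kernel):
    """Output width/height of a convolution with stride 1 and no padding."""
    return size - kernel + 1

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

size = conv_out(28, 3)       # first Conv2D:  28 -> 26
size = conv_out(size, 3)     # second Conv2D: 26 -> 24
size = size // 2             # MaxPooling2D with a 2x2 window: 24 -> 12
flat = size * size * 32      # Flatten: 12 * 12 * 32 values

# Dropout and Flatten layers have no trainable parameters
total = (conv_params(3, 1, 32) + conv_params(3, 32, 32)
         + dense_params(flat, 128) + dense_params(128, 10))
print(flat, total)
```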

After creating the model architecture, we will compile it. We will use the **adam** optimiser to minimise the **categorical cross-entropy** loss, and we will ask the model to report the **accuracy** as a metric.

In [None]:
# 8. Compile model
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
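
To make the loss less abstract, here is a minimal NumPy sketch of categorical cross-entropy on made-up predictions for three classes. The function below is written for illustration and is not the `Keras` implementation; the clipping value `eps` is an assumption used to avoid taking the logarithm of zero.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)     # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05],
                      [0.80, 0.10, 0.10]])
unsure = np.array([[0.40, 0.30, 0.30],
                   [0.30, 0.40, 0.30]])

# Confident, correct predictions give a lower loss than unsure ones
print(categorical_crossentropy(y_true, confident))
print(categorical_crossentropy(y_true, unsure))
```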

Once compiled, we will fit this model using our training data. If your computer is slow, I recommend that you do **NOT** use the entire training dataset. This can be done by setting the number `n` to something smaller than 60'000. Also, you can reduce the number of **epochs**.

In [None]:
# 9. Fit model on training data
n=100

model.fit(X_train_reshape[:n], Y_train_categorical[:n],
          batch_size=32, epochs=5, verbose=1) # verbose = 1 shows a progress bar for each epoch
                                              # verbose = 2 prints a single line per epoch instead
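
The values of `n`, `batch_size` and `epochs` determine how many weight updates the model performs. The short sketch below works this out for the values used above, assuming that the last, smaller batch of each epoch is kept rather than dropped.

```python
import math

n = 100            # number of training samples used
batch_size = 32
epochs = 5

steps_per_epoch = math.ceil(n / batch_size)               # the last batch may be smaller
last_batch_size = n - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```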

We can try increasing `n` and the number of epochs to see how much `training accuracy` we are able to achieve. Also, keep in mind that if the `loss` is not going down, then the model is not learning!

Now we can evaluate the model on the `test` data. We can also obtain the `loss` and the `accuracy` for this new data, which the model has not seen during training.

In [None]:
# 10. Evaluate model on test data
loss, accuracy = model.evaluate(X_test_reshape[:n], Y_test_categorical[:n], verbose=0)
print('Loss: ', loss,'\nAcc: ', accuracy)
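
The accuracy is simply the fraction of samples whose most probable class matches the ground truth. Here is a minimal NumPy sketch with made-up probabilities for four samples and three classes:

```python
import numpy as np

# Made-up predicted probabilities (one row per sample, one column per class)
probabilities = np.array([[0.7, 0.2, 0.1],
                          [0.1, 0.8, 0.1],
                          [0.3, 0.3, 0.4],
                          [0.6, 0.3, 0.1]])
true_labels = np.array([0, 1, 2, 1])

predicted = np.argmax(probabilities, axis=1)     # most probable class per row
accuracy = np.mean(predicted == true_labels)     # fraction of correct predictions
print(predicted, accuracy)
```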

With the following cell you can print the labels that were predicted by the model. Notice that the classes are not categorical anymore!

In [None]:
# 11. Check the label that has been predicted
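# predict() returns one probability per class; argmax picks the most probable class
# (predict_classes() is not available in recent Keras/TensorFlow versions)
predicted_labels = np.argmax(model.predict(X_test_reshape[:n]), axis=-1)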
print(predicted_labels)

The next cell contains a short snippet that will help you find the incorrectly labelled samples by comparing the labels predicted for the test samples with their ground truth.

In [None]:
# 12. Find the indices of the test samples that were predicted incorrectly
incorrect_labels = np.nonzero(np.argmax(model.predict(X_test_reshape[:n]), axis=-1) != Y_test[:n])
print(incorrect_labels)
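
To see how this comparison works on its own, here is a small sketch with made-up predictions and ground truth labels. Note that `np.nonzero` returns a tuple with one index array per dimension, which is why the output above is wrapped in a tuple.

```python
import numpy as np

predicted = np.array([7, 2, 1, 6, 4, 1, 9])
ground_truth = np.array([7, 2, 1, 0, 4, 1, 4])

mismatch = predicted != ground_truth        # boolean mask, True where the label is wrong
incorrect_idx = np.nonzero(mismatch)[0]     # take the index array for the first dimension
print(incorrect_idx)
```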

Finally, you can use the following cell to display training or test images from the dataset and see their actual (and, in the case of test images, predicted) labels.

In [None]:
# 13. Show a sample from the mnist dataset
image_to_show = 72
from_group = 'test' # put 'train' or 'test'
if from_group == 'train':
    plt.imshow(X_train[image_to_show])
    print('Ground truth label: ',Y_train[image_to_show])
else:
    plt.imshow(X_test[image_to_show])
    print('Ground truth label: ',Y_test[image_to_show])
    if len(predicted_labels)>image_to_show:
        print('Predicted label: ',predicted_labels[image_to_show])