# Exercise

The current setup reaches an accuracy of about 99%. That sounds good, but at an error rate of roughly 1% the number of misclassified images grows quickly with the number of images we have to classify:

| # images  | # wrong |
|-----------|---------|
| 1,000     | 10      |
| 10,000    | 100     |
| 1,000,000 | 10,000  |

At scale, our network still makes a lot of errors.
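
The numbers in the table follow directly from the error rate. A minimal sketch, assuming an accuracy of exactly 99% (the helper name `expected_errors` is made up for this illustration):

```python
def expected_errors(num_images, accuracy):
    """Return the expected number of misclassified images."""
    # Round to hide floating-point noise in (1 - accuracy).
    return round(num_images * (1 - accuracy))

for n in (1_000, 10_000, 1_000_000):
    print(f"{n:>9,} images -> {expected_errors(n, 0.99):>6,} wrong")
```

Raising the accuracy to 99.5% would halve each count, which is why small accuracy gains matter at this scale.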

Try to play with the network to see if you can improve the current setup:

* Play with the number of convolutional layers
* What happens if you remove the pooling layers?
* What happens if you train the network longer?
* Add strides, padding...
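
When experimenting with strides and padding, it helps to know how they change the size of a feature map. The sketch below applies the usual output-size rules for `padding='valid'` and `padding='same'` along one axis; `conv_output_size` is a hypothetical helper, not a Keras function.

```python
import math

def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution along one axis."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (size - kernel) // stride + 1
    if padding == "same":
        # The input is zero-padded so that only the stride shrinks the output.
        return math.ceil(size / stride)
    raise ValueError(f"unknown padding: {padding!r}")

print(conv_output_size(28, 3))                            # 26
print(conv_output_size(28, 3, padding="same"))            # 28
print(conv_output_size(28, 3, stride=2))                  # 13
print(conv_output_size(28, 3, stride=2, padding="same"))  # 14
```

A stride of 2 shrinks the feature map about as much as a 2x2 pooling layer, so it is one possible replacement when you try removing the pooling layers.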


# 0. Introduction

In [None]:
%matplotlib inline

Example based on: Deep Learning with Python by Francois Chollet:
https://www.manning.com/books/deep-learning-with-python

In [None]:
import keras
from keras import models
from keras import layers
from keras.datasets import mnist
from keras.utils import to_categorical
import matplotlib.pyplot as plt

# 1. Load data

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

In [None]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# 2. Inspect data

Try to get a feel for the data you are using to train and test your neural network. 

## Training data

- Training data will be used to train our neural network to recognize hand-written digits.
- MNIST provides 60,000 labeled training images, each 28x28 pixels.

In [None]:
train_images.shape

In [None]:
train_labels.shape

In [None]:
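def show_image(images, labels, index):
    """Plot the image at `index` with its label as the title."""
    img = images[index].reshape((28, 28))  # also drops a channel axis if present
    label = labels[index]
    plt.imshow(img)
    plt.title(label)
    plt.show()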

In [None]:
show_image(train_images, train_labels, 10)

## Test data

- Test data will be used to check how well our network performs on data it has never seen.
- MNIST provides 10,000 test images, each 28x28 pixels.
- It's important to note that these should never be used in the training cycle. A 'test set' should never contain images the network has already seen during training. (read more: [Model Selection and Train/Validation/Test Sets](https://www.coursera.org/lecture/machine-learning/model-selection-and-train-validation-test-sets-QGKbr) and [How (and why) to create a good validation set](https://www.fast.ai/2017/11/13/validation-sets/))
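
One common way to keep the test set untouched while experimenting is to hold out part of the training data as a validation set. A minimal sketch on index lists, assuming a 10% hold-out and a fixed seed (both arbitrary choices; `split_indices` is a hypothetical helper):

```python
import random

def split_indices(num_samples, val_fraction=0.1, seed=0):
    """Shuffle sample indices and split them into train and validation parts."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)   # seeded for reproducibility
    num_val = int(num_samples * val_fraction)
    return indices[num_val:], indices[:num_val]

train_idx, val_idx = split_indices(60000)
print(len(train_idx), len(val_idx))  # 54000 6000
```

With NumPy arrays, `train_images[train_idx]` and `train_images[val_idx]` would then select the two subsets.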

In [None]:
test_images.shape

In [None]:
test_labels.shape

In [None]:
show_image(test_images, test_labels, 168)

# 3. Network architecture

## 3.1. ConvNet (Feature extraction)

Define the network architecture that will be used for training:

- how many layers
- which type of layer
    - Convolution: number of filters (output channels), kernel size, activation
    - MaxPool: pool size

Important: ConvNets take as input tensors of shape (image_height, image_width, channels), not counting the batch dimension.

**Note:**

Convolutional layers learn local patterns (features that can appear anywhere in the image)

Dense layers learn global patterns.
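
A one-dimensional toy example illustrates why a convolutional filter does not care where a pattern appears: the same kernel slides over every position, so a shifted pattern gives the same peak response at a shifted location. This is a plain-Python sketch of the idea, not how Keras implements convolution.

```python
def correlate1d(signal, kernel):
    """Slide the kernel over the signal and return the response at each position."""
    width = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(width))
            for i in range(len(signal) - width + 1)]

kernel = [1, 2, 1]                       # responds most strongly to the pattern 1, 2, 1
signal_a = [0, 0, 1, 2, 1, 0, 0, 0]      # pattern near the start
signal_b = [0, 0, 0, 0, 0, 1, 2, 1]      # same pattern, shifted right

resp_a = correlate1d(signal_a, kernel)
resp_b = correlate1d(signal_b, kernel)
print(max(resp_a), resp_a.index(max(resp_a)))  # 6 2
print(max(resp_b), resp_b.index(max(resp_b)))  # 6 5
```

A dense layer, by contrast, has a separate weight for every input position, so it would have to learn the pattern again at each location.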

In [None]:
network = models.Sequential()
network.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)))
network.add(layers.MaxPool2D((2, 2)))
network.add(layers.Conv2D(64, (3, 3), activation='relu'))
network.add(layers.MaxPool2D((2, 2)))
network.add(layers.Conv2D(64, (3, 3), activation='relu'))

In [None]:
network.summary()
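
The output shapes and parameter counts reported by `network.summary()` can be checked by hand. The sketch below assumes the Keras defaults used above (stride 1 and no padding for `Conv2D`, non-overlapping windows for `MaxPool2D`); the helper and variable names are made up for this illustration.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters

size, channels, total = 28, 1, 0
# (layer type, kernel or pool size, number of filters)
layers_spec = [("conv", 3, 32), ("pool", 2, None),
               ("conv", 3, 64), ("pool", 2, None),
               ("conv", 3, 64)]

for kind, k, filters in layers_spec:
    if kind == "conv":
        total += conv_params(k, channels, filters)
        size, channels = size - k + 1, filters  # 'valid' padding, stride 1
    else:
        size //= k                              # pooling divides each side, rounding down
    print(kind, (size, size, channels))

print("conv parameters:", total)  # 55744
```

The spatial size shrinks from 28 to 3 while the number of channels grows from 1 to 64.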

## 3.2. Classifier layer

In [None]:
network.add(layers.Flatten())
network.add(layers.Dense(64, activation='relu'))
network.add(layers.Dense(10, activation='softmax'))

In [None]:
network.summary()
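
The size of the classifier follows from the last feature map. Assuming that map has shape (3, 3, 64), as it should for 28x28 inputs with the layers defined above, the parameter counts work out as follows (`dense_params` is a hypothetical helper):

```python
def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

flattened = 3 * 3 * 64                 # Flatten itself has no parameters
hidden = dense_params(flattened, 64)
output = dense_params(64, 10)
print(flattened, hidden, output, hidden + output)  # 576 36928 650 37578
```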

# 4. Compilation Step

In the compilation step we define:

- the loss function
- the optimizer
- the evaluation metric

In [None]:
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
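
To make the loss concrete, here is a plain-Python sketch of categorical cross-entropy for a single sample, assuming a one-hot target and a predicted probability vector. The Keras implementation differs in details such as batching and numerical safeguards; the function below is only an illustration.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Only the entry of the true class contributes, since all others are 0.
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

target = [0, 0, 1]
confident = [0.05, 0.05, 0.90]   # high probability on the correct class
unsure = [0.30, 0.40, 0.30]      # low probability on the correct class

print(categorical_crossentropy(target, confident))
print(categorical_crossentropy(target, unsure))
```

The loss is small when the network assigns high probability to the correct class and grows as that probability drops, which is what the optimizer tries to reduce.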

# 5. Data Preparation

Before feeding the data into the network for training, we make sure it is formatted properly.

## Prepare the images

In [None]:
# Add a trailing channel axis (1 = grayscale) to match the Conv2D input shape.
train_images_reshaped = train_images.reshape((60000, 28, 28, 1))
test_images_reshaped = test_images.reshape((10000, 28, 28, 1))

In [None]:
train_images_reshaped.shape

In [None]:
# Convert to float32 and scale pixel values from [0, 255] to [0, 1].
train_images_transformed = train_images_reshaped.astype('float32') / 255
test_images_transformed = test_images_reshaped.astype('float32') / 255
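
The two preparation steps can be seen on a tiny example. The sketch below uses nested lists and a made-up 2x2 "image" instead of NumPy arrays, purely to show what the reshape and the division do to each pixel.

```python
image = [[0, 255],
         [128, 64]]            # a 2x2 grayscale "image" with values in 0..255

# Wrap every pixel in a list: shape (2, 2) becomes (2, 2, 1).
with_channel = [[[pixel] for pixel in row] for row in image]

# Scale every value to a float in [0, 1].
scaled = [[[value / 255 for value in pixel] for pixel in row]
          for row in with_channel]

print(scaled[0])  # [[0.0], [1.0]]
```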

## Prepare the labels

In [None]:
train_labels_categorical = to_categorical(train_labels)
test_labels_categorical = to_categorical(test_labels)

In [None]:
train_labels_categorical[0]
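
`to_categorical` one-hot encodes the labels: each digit becomes a vector of length 10 with a single 1 at the index of that digit. A plain-Python sketch of the idea (`one_hot` is a hypothetical stand-in, not the Keras function):

```python
def one_hot(labels, num_classes):
    """Convert integer class labels to one-hot encoded rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0       # mark the position of the true class
        encoded.append(row)
    return encoded

print(one_hot([5, 0], num_classes=10)[0])
# [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```

This format matches the 10-way softmax output, which is what the `categorical_crossentropy` loss expects.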

# 6. Network summary

In [None]:
network.summary()

# 7. Train the network

Feed the training images and labels to the network.

Two additional parameters need to be supplied:

- epochs: how many times the network will see the entire training set.
- batch_size: how many images are passed through the network per weight update.

In [None]:
network.fit(train_images_transformed, train_labels_categorical, epochs=5, batch_size=128)
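
Together, the two parameters determine how many weight updates the network performs. A short calculation for the values used in the `fit` call above, assuming all 60,000 training images are used:

```python
import math

num_images, batch_size, epochs = 60000, 128, 5

steps_per_epoch = math.ceil(num_images / batch_size)   # the last batch is smaller
last_batch = num_images - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 469 96 2345
```

Training longer (more epochs) or using a smaller batch size both increase the number of updates, at the cost of more training time.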

# 8. Test the network

Evaluate the network on the test set to estimate how well it will perform on images it has not seen during training:

In [None]:
test_loss, test_acc = network.evaluate(test_images_transformed, test_labels_categorical)

In [None]:
print('test_acc: ', test_acc)
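
The accuracy metric is the fraction of images whose highest-scoring class matches the true label. A minimal sketch on made-up predictions for three classes (the numbers are illustrative, not model output):

```python
def accuracy(probabilities, labels):
    """Fraction of samples where the most probable class equals the label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax
        correct += predicted == label
    return correct / len(labels)

probabilities = [[0.1, 0.8, 0.1],
                 [0.6, 0.3, 0.1],
                 [0.2, 0.2, 0.6],
                 [0.5, 0.4, 0.1]]
labels = [1, 0, 2, 1]
print(accuracy(probabilities, labels))  # 0.75
```

On the 10,000 MNIST test images, an accuracy of 0.99 would correspond to about 100 misclassified digits.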