# The *Hello World* for Deep Learning
Deep learning algorithms have fundamentally changed how we approach computer vision and natural language processing. In this notebook we show the *hello world* of deep learning: classifying images of single handwritten digits from the MNIST dataset.

As usual, we start by importing all necessary libraries:

In [None]:
!pip install keras

In [None]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras import utils
from keras.datasets import mnist # the "hello world" data set for deep learning
from matplotlib import pyplot as plt

# To show plots directly in the notebook
%matplotlib inline  
np.random.seed(3) # set seed for reproducibility

# Loading the data
The MNIST dataset comes split into a training set and a test set. In total it contains 70,000 grayscale images of 28x28 pixels: 60,000 for training and 10,000 for testing.

In [None]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(X_train.shape)
print(X_test.shape)
print('\nExample:')
print(y_train[17])
plt.imshow(X_train[17])

# Preprocessing
Our model requires 4-dimensional inputs (nr_of_samples, height, width, channels). In the MNIST dataset, the channel is omitted since there is only one. We can make it explicit through reshaping.

In [None]:
print(X_train.shape)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
print(X_train.shape)
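
The reshape above only appends an axis of size 1; the pixel values stay untouched. As a small sketch on a made-up array of blank images (the names `images`, `with_reshape`, `with_expand` and `with_newaxis` are illustrative and not part of the notebook), the same result can be obtained in three equivalent ways:

```python
import numpy as np

# Stand-in for a batch of grayscale images (assumption: 5 blank 28x28 images)
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Three equivalent ways to append a channel axis of size 1
with_reshape = images.reshape(images.shape[0], 28, 28, 1)
with_expand = np.expand_dims(images, axis=-1)
with_newaxis = images[..., np.newaxis]

print(with_reshape.shape, with_expand.shape, with_newaxis.shape)
```

`np.expand_dims` and `np.newaxis` do not require the image size to be hard-coded, which can be convenient when the same code is reused for other datasets.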

The y arrays contain the digit shown in each image. We want our model to predict a probability for each possible digit. For this, we need to convert each label from a single number into a vector of length 10 (one-hot encoding), where each column corresponds to one digit.

In [None]:
Y_train = utils.to_categorical(y_train, 10)
Y_test = utils.to_categorical(y_test, 10)

In [None]:
for i in range(5):
    print('Digit: %s' % y_train[i])
    print('Encoding: %s' % Y_train[i])
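
To make the encoding less of a black box, here is a minimal NumPy sketch of one-hot encoding. The helper `one_hot` and the example labels are hypothetical and only meant for illustration; this is not the code Keras uses internally.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1.0 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

example_labels = [5, 0, 4]  # made-up labels for illustration
example_encoding = one_hot(example_labels, 10)
print(example_encoding)

# np.argmax along axis 1 recovers the original labels
recovered_labels = np.argmax(example_encoding, axis=1)
print(recovered_labels)
```

The last step is the same trick we use later to turn predicted probabilities back into digits.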

# Initializing the model
Now that the data is prepared for training, we are ready to set up our model.
We'll use two blocks, each consisting of a convolution layer followed by a max pooling layer, then a dense layer with dropout, and finally an output layer with one neuron for each class.

In [None]:
model = Sequential()

model.add(Convolution2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Convolution2D(filters=32, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

The model summary gives a good overview of the layers, their output shapes and the number of parameters in our model:

In [None]:
model.summary()
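
The numbers in the summary can be checked by hand. The following sketch assumes the Keras defaults for the layers above (convolutions with stride 1 and no padding, non-overlapping pooling windows); all helper names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Spatial size after a convolution with stride 1 and no padding."""
    return size - kernel + 1

def pool_output_size(size, pool):
    """Spatial size after non-overlapping max pooling (incomplete windows are dropped)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units

size = 28
size = conv_output_size(size, 3)   # 26
size = pool_output_size(size, 2)   # 13
size = conv_output_size(size, 3)   # 11
size = pool_output_size(size, 2)   # 5
flattened = size * size * 32       # 800 values enter the dense layer

total_params = (
    conv_params(3, 1, 32)            # 320
    + conv_params(3, 32, 32)         # 9248
    + dense_params(flattened, 128)   # 102528
    + dense_params(128, 10)          # 1290
)
print(size, flattened, total_params)
```

Most of the parameters sit in the first dense layer, not in the convolutions. Pooling, flattening and dropout layers have no trainable parameters.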

Keras models need to be compiled before training. For this, we define the loss function that will be optimized, the optimizer we want to use, and any additional metrics we want to keep track of during training.

In [None]:
model.compile(loss = 'categorical_crossentropy',
              optimizer = 'adam',
              metrics=['accuracy'])
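
To get a feeling for the loss, here is a rough NumPy sketch of a softmax output and the categorical cross-entropy computed on it. The function names, the clipping constant `eps` and the example scores are assumptions for illustration; the Keras implementation may differ in its details.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = scores - np.max(scores, axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

# Made-up example with 3 classes; the true class is index 2
y_true = np.array([[0.0, 0.0, 1.0]])
confident = softmax(np.array([[0.0, 0.0, 5.0]]))
unsure = softmax(np.array([[0.0, 0.0, 0.0]]))

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident)  # small, the model puts most probability on the true class
print(loss_unsure)     # log(3), the model spreads probability evenly
```

With one-hot labels, only the probability assigned to the true class contributes to the loss, so the loss shrinks as that probability approaches 1.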

# Training and Evaluation
Training state-of-the-art deep learning models is impractical without GPUs or other specialized hardware.
Even our toy example takes quite some time on a CPU. Hence, we reduce the number of training examples. This will lower the performance of our model, but gives us quicker results.

In [None]:
number_of_training_samples = 4000
# Slice from the start; beginning at index 1 would silently drop the first sample
X_train_small = X_train[:number_of_training_samples]
Y_train_small = Y_train[:number_of_training_samples]

Finally, we can start training our model. You can see how the loss decreases and the accuracy increases with each epoch.
The batch size defines how many images are used for one update step of the weights. An epoch is one full pass through our training dataset.

In [None]:
model.fit(X_train_small, Y_train_small, 
          batch_size=32, epochs=10, verbose=1)
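
The relationship between batch size, epochs and update steps is simple arithmetic. As a small sketch, assuming a training subset of 4000 images (the helper `updates_per_epoch` is hypothetical):

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of weight updates in one epoch; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)

num_samples = 4000
batch_size = 32
epochs = 10

steps = updates_per_epoch(num_samples, batch_size)
total_updates = steps * epochs
print(steps)          # 125
print(total_updates)  # 1250
```

This matches the step counter that the progress bar shows for each epoch. A smaller batch size means more, but noisier, updates per epoch.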

The *.fit* function shows the performance on the training set. It is essential to also measure how the model performs on new, unseen data. For this, we use the *_test* data.

In [None]:
score = model.evaluate(X_test, Y_test, verbose=1)
print(f"categorical_crossentropy: {score[0]}, accuracy: {score[1]}")

If everything went well, the accuracy should be approximately the same on the train and test set. Let's take a look at some examples:

In [None]:
index = 512
plt.imshow(np.vstack(X_test[index:index+5]).reshape([-1,28]))
np.argmax(model.predict(X_test[index:index+5]), axis=1)

Our model only got an accuracy of around *96%*. Let's take a look at some images where it predicted the wrong digit.

First, we create a mask for images where the prediction was wrong:

In [None]:
predictions_probabilities = model.predict(X_test)
predictions = np.argmax(predictions_probabilities, axis=1)
false_predictions_mask = predictions != y_test
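
The same pattern is easier to follow on a tiny example. The probabilities and labels below are made up for illustration and all names prefixed with `toy_` are hypothetical:

```python
import numpy as np

# Made-up class probabilities for 4 samples and 3 classes (illustration only)
toy_probabilities = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
toy_labels = np.array([0, 1, 2, 2])

# The predicted class is the column with the highest probability
toy_predictions = np.argmax(toy_probabilities, axis=1)
# Boolean mask that is True wherever prediction and label disagree
toy_wrong_mask = toy_predictions != toy_labels

toy_wrong_indices = np.flatnonzero(toy_wrong_mask)
toy_accuracy = 1.0 - toy_wrong_mask.mean()
print(toy_wrong_indices)  # positions of the misclassified samples
print(toy_accuracy)       # share of correct predictions
```

Since the mask is a boolean array, its mean is the error rate, so the accuracy reported by `model.evaluate` can be cross-checked this way.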

Then we extract the images, probabilities, predictions and labels of the wrongly classified images.

In [None]:
false_pics = X_test[false_predictions_mask]
false_predictions_probabilities = predictions_probabilities[false_predictions_mask]
false_predictions = predictions[false_predictions_mask]
false_labels = y_test[false_predictions_mask]

And finally, we can look at some examples:

In [None]:
samples = 5
plt.imshow(np.vstack(false_pics[:samples]).reshape([-1,28]))
print('Predictions')
print(false_predictions[:samples])
print('Labels')
print(false_labels[:samples])
print('Probabilities:')
print(np.round(false_predictions_probabilities[:samples], 2))

# Exercises
Preprocessing can have a huge impact on the performance of the model. Normalizing the input, e.g. to the range 0 to 1, usually helps with the convergence of the model. Since the default type of MNIST is uint8, we first need to convert it to float.

In [None]:
print(X_train.dtype)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

Since the values of X lie between 0 and 255, normalization can be done with a simple division.

In [None]:
print(X_train.max())
print(X_train.min())
X_train /= 255
X_test /= 255

# Alternative: normalize to the range -1 to 1 (use instead of the division by 255 above)
#X_train = X_train / 127.5 - 1
#X_test = X_test / 127.5 - 1
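
Both scalings can be tried on a small stand-in array before touching the real data. The array `pixels` below is made up and only covers the uint8 value range; the sketch also shows an operator precedence pitfall with augmented assignment.

```python
import numpy as np

# Stand-in for pixel data covering the full uint8 range (not real MNIST data)
pixels = np.array([0, 64, 128, 255], dtype=np.uint8).astype("float32")

unit_range = pixels / 255              # values in [0, 1]
symmetric_range = pixels / 127.5 - 1   # values in [-1, 1]

print(unit_range.min(), unit_range.max())
print(symmetric_range.min(), symmetric_range.max())

# Pitfall: an augmented assignment evaluates the right-hand side first,
# so this divides by 126.5 instead of scaling to [-1, 1]
wrong = pixels.copy()
wrong /= 127.5 - 1
print(wrong.min(), wrong.max())
```

Whichever scaling is used, it has to be applied to both the training and the test images, and the model has to be retrained on the rescaled data.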

**Exercises:**
- Test the effect of normalizing to the range 0 to 1 and to the range -1 to 1 on the training and the accuracy.
- Achieve an accuracy > *98%* on the test set.