<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc" style="margin-top: 1em;"><ul class="toc-item"><li><span><a href="#Purpose" data-toc-modified-id="Purpose-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Purpose</a></span><ul class="toc-item"><li><span><a href="#Data-handling" data-toc-modified-id="Data-handling-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Data handling</a></span></li><li><span><a href="#Hyperparameters-definition" data-toc-modified-id="Hyperparameters-definition-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Hyperparameters definition</a></span></li></ul></li><li><span><a href="#Model-construction:-towards-a-MLP" data-toc-modified-id="Model-construction:-towards-a-MLP-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Model construction: towards a MLP</a></span></li><li><span><a href="#Todo" data-toc-modified-id="Todo-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Todo</a></span></li></ul></div>

# Purpose
This notebook is intended for practising the basics of Keras by training a multi-layer perceptron on [MNIST](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py).

In [1]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
import numpy as np

Using TensorFlow backend.


## Data handling
Keras ships with built-in dataset loaders, so loading MNIST takes a single call.

In [2]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
print(str(X_train.shape) + " "+ str(y_train.shape))

(60000, 28, 28) (60000,)


`X_train` is a 3-D array (samples, height, width), which a dense layer cannot consume directly. Let's flatten each 28x28 image into a vector of 784 values, giving a 2-D array:

In [3]:
# Flatten each 28x28 image into a 784-vector; -1 lets NumPy infer that size
X_train = X_train.reshape(X_train.shape[0], -1)
X_test = X_test.reshape(X_test.shape[0], -1)
print(str(X_train.shape) + " "+ str(y_train.shape))

(60000, 784) (60000,)


To treat this as a classification problem, the target has to be made categorical. Since there are ten possible digits, each label is split into ten boolean fields (one-hot encoding).

In [4]:
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
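
To make the encoding concrete, here is a minimal NumPy sketch of what a one-hot conversion does. The helper `one_hot` and the sample labels are hypothetical and only for illustration; the notebook itself relies on `keras.utils.to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Put a 1 in the column that matches each label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

sample_labels = [5, 0, 4]  # illustrative labels, not read from the dataset
encoded = one_hot(sample_labels, 10)
print(encoded.shape)
print(encoded[0])
```

Each row contains exactly one `1`, at the index of the digit it represents, which is the shape the softmax output layer is compared against.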

## Hyperparameters definition
An MLP needs to know the dimensions of its input and output, since these determine the number of neurons in the first and last layers.
The remaining parameters are the number of epochs and the batch size used during training.

In [5]:
nb_input = X_train.shape[1]
nb_classes = y_train.shape[1]
print("nb_input : "+ str(nb_input) + " || nb_classes : " + str(nb_classes))

nb_input : 784 || nb_classes : 10


In [6]:
epochs = 20
batch_size = 128
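
As a rough guide to what these two values imply, the sketch below counts the weight updates they lead to. It assumes the final, smaller batch of each epoch is kept rather than dropped, and it hard-codes the 60000 training samples reported above.

```python
import math

n_train = 60000   # number of training samples, as printed by X_train.shape
batch_size = 128
epochs = 20

# Batches needed to see every sample once
steps_per_epoch = math.ceil(n_train / batch_size)
# Size of the final, incomplete batch of each epoch
last_batch = n_train - (steps_per_epoch - 1) * batch_size
# Total number of gradient updates over the whole run
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 469 96 9380
```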


# Model construction: towards a MLP
A Multi-Layer Perceptron (MLP) is an architecture often referred to as a shallow feed-forward network. It is a simple and efficient baseline for classification tasks.

`Sequential` builds the model as a succession of layers. In a _dense_ layer, every neuron of the previous layer is connected to every neuron of the next layer.

The _relu_ activation is cheap to compute and, in deeper networks, limits the loss of gradient information during backpropagation. The _softmax_ activation turns the outputs of the last layer into a probability distribution over the classes.
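
The two activations can be written in a few lines of NumPy. This is only a sketch of the underlying formulas, with made-up input scores; Keras applies its own implementations when `activation='relu'` or `activation='softmax'` is passed to a layer.

```python
import numpy as np

def relu(x):
    # Negative values become 0, positive values pass through unchanged
    return np.maximum(x, 0.0)

def softmax(logits):
    # Subtracting the row maximum avoids overflow in exp without changing the result
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.array([[2.0, -1.0, 0.5]])  # illustrative values
hidden = relu(scores)
probs = softmax(scores)
print(hidden)
print(probs, probs.sum())
```

The softmax output is strictly positive and sums to 1, so it can be read as class probabilities.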

A dropout rate of _0.2_ randomly switches off 20% of the units at each training step, which spreads the error over more neurons and helps reduce overfitting.
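
The sketch below illustrates the idea with so-called inverted dropout, where the surviving activations are rescaled so that their expected value is unchanged. The function `dropout` and the all-ones input are hypothetical; this shows the technique and does not reproduce the Keras `Dropout` layer's code.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero out roughly `rate` of the entries and rescale the rest (training time only)."""
    keep_prob = 1.0 - rate
    # True where a unit is kept, False where it is dropped
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 512))   # illustrative activations
dropped = dropout(activations, 0.2, rng)
zero_fraction = (dropped == 0).mean()
print(zero_fraction, dropped.mean())
```

About a fifth of the entries end up at zero, while the mean stays close to 1 thanks to the rescaling. At prediction time dropout is simply switched off.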

The input shape is defined by the number of variables in one sample, __784__, and the output size must equal the number of classes, __10__.

In [7]:
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(nb_input,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(nb_classes, activation='softmax'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
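
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units

layer_sizes = [784, 512, 512, 10]  # input, two hidden layers, output
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)
print(counts, total)  # [401920, 262656, 5130] 669706
```

Dropout layers add no parameters, which is why they show 0 in the table.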


In [8]:
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test))
score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Test loss: 14.6803610641
Test accuracy: 0.0892


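MLPs can be sensitive to initialisation, but an accuracy of 0.0892 after 20 epochs is below the 10% chance level, which points to a problem in the setup rather than bad luck. The most likely culprit is that the pixel values were left in their raw 0-255 range: the reference Keras example casts the inputs to `float32` and divides them by 255 before training.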
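
A quick sanity check on the reported numbers is possible. Assuming Keras clips predicted probabilities at `1e-7` before taking the logarithm (the `eps` value below is that assumption), a network that puts all its probability on a wrong class pays about 16.1 per misclassified sample. Multiplying by the error rate gives a value very close to the reported loss, which is consistent with saturated outputs rather than a merely unlucky start. The second part sketches the input scaling used in the reference example, applied here to a small stand-in array rather than to the real `X_train`.

```python
import math
import numpy as np

# (1) Does the reported loss match "confidently wrong on every error"?
eps = 1e-7                       # assumed clipping constant for probabilities
accuracy = 0.0892                # test accuracy reported above
worst_case_loss = -math.log(eps)                 # loss of one confidently wrong sample
expected_loss = (1.0 - accuracy) * worst_case_loss
print(expected_loss)             # compare with the reported 14.6803610641

# (2) Scaling pixel values from 0-255 to the 0-1 range
raw = np.array([[0, 128, 255]], dtype="uint8")   # stand-in for one row of X_train
scaled = raw.astype("float32") / 255.0
print(scaled.min(), scaled.max(), scaled.dtype)
```

In the notebook, the same scaling would be applied to `X_train` and `X_test` right after the reshape and before calling `model.fit`.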
# Todo
* Relaunch
* Tune hyperparameters more
* Get some illustration
* ?