## MLP and Dropout

Notes from Lukas Biewald's [Crowdflower Machine Learning class](https://github.com/lukas/ml-class)

### Multi-Layer Perceptron
Load and preprocess the MNIST digits data:

# mlp.py
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Flatten, Dropout
from keras.utils import to_categorical
from keras.callbacks import TensorBoard

# Log training metrics so they can be inspected in TensorBoard
tensorboard = TensorBoard(log_dir="logs")

# MNIST: 60,000 training and 10,000 test images of handwritten digits
(X_train, y_train), (X_test, y_test) = mnist.load_data()
img_width = X_train.shape[1]
img_height = X_train.shape[2]

# Scale pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_train /= 255.
X_test = X_test.astype('float32')
X_test /= 255.

# One-hot encode the digit labels
y_train = to_categorical(y_train)
num_classes = y_train.shape[1]

y_test = to_categorical(y_test)
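
To make the two preprocessing steps concrete, here is a minimal NumPy sketch that runs on a tiny made-up array instead of MNIST. The helper `one_hot` is a hypothetical stand-in that mimics the kind of output one-hot encoding produces; it is not Keras's implementation.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Hypothetical helper: turn integer labels into one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1  # infer the class count from the largest label
    return np.eye(num_classes, dtype='float32')[labels]

# Made-up 8-bit pixel values standing in for image data
pixels = np.array([[0, 128, 255]], dtype='uint8')
scaled = pixels.astype('float32') / 255.0  # values now lie in [0, 1]

encoded = one_hot([0, 2, 1])
print(scaled)
print(encoded)
```

Each row of `encoded` has a single 1 in the column of its label, which is the target format that `categorical_crossentropy` expects.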

Add a fully-connected [Dense](https://keras.io/layers/core/#dense) layer with 100 units and `relu` as the [activation function](https://towardsdatascience.com/exploring-activation-functions-for-neural-networks-73498da59b02):

# One hidden layer of 100 ReLU units, then a softmax over the digit classes
model = Sequential()
model.add(Flatten(input_shape=(img_width, img_height)))  # 2-D image -> 1-D vector
model.add(Dense(100, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# The test set doubles as the validation set here
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          callbacks=[tensorboard], epochs=10)

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10



Compared to the best `loss: 0.2428` and `acc: 0.9331` of the previous perceptron model, this model, which adds a single hidden layer of 100 units, achieved a loss of 0.0195 and an accuracy of 0.9941 after 10 epochs.
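
As a rough illustration of what this architecture computes, the sketch below runs a NumPy forward pass through the same layer shapes (flatten, 100 ReLU units, 10-way softmax) and counts the trainable parameters. The weights are random stand-ins rather than trained values, and the input batch is made up, so the printed probabilities carry no meaning beyond their shape.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden, n_classes = 28 * 28, 100, 10

# Randomly initialised weights stand in for trained ones (assumption)
W1 = rng.normal(scale=0.01, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.01, size=(n_hidden, n_classes))
b2 = np.zeros(n_classes)

def forward(images):
    """Flatten -> Dense(100, relu) -> Dense(10, softmax) for a batch of images."""
    x = images.reshape(len(images), -1)          # (batch, 28, 28) -> (batch, 784)
    hidden = np.maximum(0.0, x @ W1 + b1)        # ReLU
    logits = hidden @ W2 + b2
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the exponentials
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)  # softmax: each row sums to 1

batch = rng.random((4, 28, 28))  # made-up images with values in [0, 1)
probs = forward(batch)

# Trainable parameters: weights plus biases for each Dense layer
n_params = (n_inputs * n_hidden + n_hidden) + (n_hidden * n_classes + n_classes)
print(probs.shape, n_params)  # (4, 10) 79510
```

Almost all of the 79,510 parameters sit in the first `Dense` layer (78,500), which is where the extra capacity over a single-layer perceptron comes from.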

### Dropout

Randomly select nodes to be [dropped out](https://machinelearningmastery.com/dropout-regularization-deep-learning-models-keras/) with a given probability (here 0.2) on each update cycle during training:

model = Sequential()
model.add(Flatten(input_shape=(img_width, img_height)))
model.add(Dropout(0.2))  # drop 20% of the input pixels
model.add(Dense(100, activation='relu'))
model.add(Dropout(0.2))  # drop 20% of the hidden activations
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

model.fit(X_train, y_train, validation_data=(X_test, y_test),
          callbacks=[tensorboard], epochs=10)

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
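
The dropout step itself can be sketched in a few lines of NumPy. This is a simplified illustration of "inverted" dropout, where surviving values are scaled up by `1 / (1 - rate)` during training so that nothing needs to change at inference time; the function name `dropout` and the all-ones input are assumptions made for the example, not the Keras source.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each element with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return x  # at inference time the input passes through unchanged
    keep_mask = rng.random(x.shape) >= rate  # True with probability 1 - rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))  # made-up activations
dropped = dropout(activations, rate=0.2, rng=rng)

print((dropped == 0).mean())  # fraction of zeroed units, close to 0.2
print(dropped.mean())         # close to 1.0: the expected activation is preserved
```

Because a fresh mask is drawn on every call, each training update sees a different thinned version of the network.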



Applying dropout regularization resulted in lower accuracy after 10 epochs. Further changes, such as using a larger network, increasing the learning rate and momentum, and constraining the size of the network weights, may improve performance.
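
Constraining the size of the network weights is often done with a max-norm constraint, which rescales the incoming weight vector of any unit whose L2 norm exceeds a cap. The NumPy sketch below shows the idea; the function name `max_norm` and the cap of 3.0 are illustrative assumptions, not values taken from these notes.

```python
import numpy as np

def max_norm(weights, max_value=3.0, eps=1e-7):
    """Rescale each column (one unit's incoming weights) to an L2 norm of at most max_value."""
    norms = np.sqrt((weights ** 2).sum(axis=0, keepdims=True))
    clipped = np.clip(norms, 0.0, max_value)
    return weights * clipped / (norms + eps)

# Made-up weight matrix with column norms 5.0 and 0.5
W = np.array([[3.0, 0.3],
              [4.0, 0.4]])
W_constrained = max_norm(W)

print(np.linalg.norm(W_constrained, axis=0))  # approximately [3.0, 0.5]
```

Only the first column is shrunk, since the second is already under the cap. In Keras, a comparable constraint can be attached to a layer through its `kernel_constraint` argument.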