
## Keras 101: Using a Simple Neural Network to Classify MNIST

Train a simple deep neural network (NN) on the MNIST dataset. Please read through the code to familiarize yourself with the Keras API.

You should get **~98.40%** test accuracy after 20 epochs
(although there is *a lot* of room for parameter tuning).


Adapted from an example in the Keras GitHub repo: https://github.com/fchollet/keras/blob/master/examples/mnist_mlp.py

In [1]:
# some setup code
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils

Using TensorFlow backend.


#### Load MNIST Data

In [2]:
batch_size = 128
nb_classes = 10
nb_epoch = 20

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32') # cast to float32 (the default float type in Keras) before scaling
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.pkl.gz
60000 train samples
10000 test samples
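
To see what the reshaping and scaling steps above do without downloading MNIST, here is a minimal sketch on a made-up batch of two 28x28 images. The array `fake_images` and its random contents are assumptions for illustration only; they are not real MNIST data.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 2 grayscale images, 28x28, pixel values 0..255
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(2, 28, 28)).astype('uint8')

# Flatten each 28x28 image into one row of 784 values
flat = fake_images.reshape(fake_images.shape[0], 28 * 28)

# Convert to float32 first, then scale pixel values into [0, 1]
flat = flat.astype('float32')
flat /= 255

print(flat.shape)                            # (2, 784)
print(flat.dtype)                            # float32
print(flat.min() >= 0.0, flat.max() <= 1.0)  # True True
```

The cast has to come before the in-place division: with recent NumPy versions on Python 3, `/=` on an integer array is expected to raise an error because the float result cannot be stored back into an integer array.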


#### Build NN Model and Train It
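
The cell below relies on three ideas: one-hot labels, the softmax activation, and the categorical cross-entropy loss. The following is a rough NumPy sketch of each, written only for illustration. The helper names `one_hot`, `softmax`, and `categorical_crossentropy` are hypothetical and this is not how Keras implements them internally; the labels and scores are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) into a 0/1 matrix of shape (n, num_classes)."""
    encoded = np.zeros((len(labels), num_classes), dtype='float32')
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(targets, probs, eps=1e-7):
    """Mean over samples of -sum(target * log(prob))."""
    probs = np.clip(probs, eps, 1.0)
    return float(np.mean(-np.sum(targets * np.log(probs), axis=1)))

labels = np.array([3, 0])     # made-up labels for two samples
targets = one_hot(labels, 10)
logits = np.zeros((2, 10))    # made-up scores: every class equally likely
probs = softmax(logits)

print(targets[0])                                # 1.0 at index 3, 0.0 elsewhere
print(probs.sum(axis=1))                         # each row sums to 1
print(categorical_crossentropy(targets, probs))  # log(10), about 2.3026
```

A model that assigns equal probability to all 10 digits has a loss of about 2.30, so this is a useful baseline when reading the `loss` values in the training log.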

In [7]:
# convert class vectors (integer labels 0-9) to binary class matrices,
# because the categorical_crossentropy loss expects "one-hot" (https://en.wikipedia.org/wiki/One-hot) labels
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()
# Dense is a plain fully connected layer: output = dot(input, W) + b.
# Note that the first layer of the model must be given the "input_shape" argument
model.add(Dense(512, input_shape=(784,))) 
model.add(Activation('relu')) # remember ReLU? max(x, 0)
# Dropout is a simple but powerful regularization method to prevent overfitting:
# during training it randomly sets a fraction of its inputs (here 20%) to zero, which helps the network generalize.
model.add(Dropout(0.2)) 
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
# softmax outputs a vector v of shape (10,), where v[i] is the predicted probability that the input belongs to category i
model.add(Activation('softmax')) 


# Read this if you are not familiar with softmax or cross entropy:
# http://cs231n.github.io/linear-classify/#softmax
# Also, do not worry about RMSprop right now. It is just another (awesome) optimization method
model.compile(loss='categorical_crossentropy', optimizer=RMSprop()) 

# check Keras documentation for details: http://keras.io/models/#sequential
model.fit(X_train, Y_train,
          batch_size=batch_size, nb_epoch=nb_epoch,
          verbose=2,
          validation_data=(X_test, Y_test) )

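# no metrics were passed to compile(), so evaluate() returns only the loss
score = model.evaluate(X_test, Y_test,
                       verbose=0)
print( 'Test loss (cross entropy):', round(score, 4) )

# compute accuracy by comparing the most probable class with the true label
y_pred = np.argmax(model.predict(X_test), axis=1)
accuracy = np.mean(y_pred == y_test)
print( 'Validation error:', round(100 - accuracy*100, 2), "%" )
print( 'Validation accuracy:', round(accuracy*100, 2), "%" )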

Train on 60000 samples, validate on 10000 samples
Epoch 1/20
16s - loss: 0.2745 - val_loss: 0.1088
Epoch 2/20
14s - loss: 0.1123 - val_loss: 0.0826
Epoch 3/20
14s - loss: 0.0796 - val_loss: 0.0677
Epoch 4/20
15s - loss: 0.0623 - val_loss: 0.0646
Epoch 5/20
14s - loss: 0.0501 - val_loss: 0.0630
Epoch 6/20
15s - loss: 0.0429 - val_loss: 0.0592
Epoch 7/20
14s - loss: 0.0352 - val_loss: 0.0577
Epoch 8/20
13s - loss: 0.0292 - val_loss: 0.0590
Epoch 9/20
13s - loss: 0.0258 - val_loss: 0.0589
Epoch 10/20
14s - loss: 0.0216 - val_loss: 0.0643
Epoch 11/20
15s - loss: 0.0194 - val_loss: 0.0592
Epoch 12/20
14s - loss: 0.0182 - val_loss: 0.0630
Epoch 13/20
14s - loss: 0.0155 - val_loss: 0.0606
Epoch 14/20
14s - loss: 0.0141 - val_loss: 0.0649
Epoch 15/20
14s - loss: 0.0130 - val_loss: 0.0679
Epoch 16/20
14s - loss: 0.0140 - val_loss: 0.0579
Epoch 17/20
13s - loss: 0.0097 - val_loss: 0.0645
Epoch 18/20
14s - loss: 0.0107 - val_loss: 0.0711
Epoch 19/20
14s - loss: 0.0087 - val_loss: 0.0662
Epoch 20/