# Classifying MNIST digits with a Convolutional Neural Network
## Joseph Bentivegna

This project builds a convolutional neural network (CNN) to classify MNIST, a dataset of 28x28 grayscale images of handwritten digits. The data is loaded through keras and split into three subsets for training, validation, and testing. Hyperparameters were tuned on the validation set by comparing different L2 regularization strengths (lambdas) and dropout rates. The model consists of the following layers: convolution -> max pooling -> dropout -> flatten -> dense -> dropout -> dense (softmax). It trains in under a minute and reaches ~97% accuracy on the test set after only 2 epochs.
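The notebook does not show the tuning procedure itself, so the following is only a minimal sketch of how a search over lambdas and dropout rates might be organized. The names `grid_search`, `score_fn`, and `toy_score` are hypothetical, and the candidate values are illustrative rather than the ones actually tried.

```python
from itertools import product


def grid_search(score_fn, lambdas, dropout_rates):
    """Return (best_score, best_lambda, best_dropout) over the full grid.

    score_fn(lam, rate) is a hypothetical callable that trains a model with
    the given settings and returns its accuracy on the validation set.
    """
    best = None
    for lam, rate in product(lambdas, dropout_rates):
        score = score_fn(lam, rate)
        # keep the first combination that reaches the highest score
        if best is None or score > best[0]:
            best = (score, lam, rate)
    return best


def toy_score(lam, rate):
    # stand-in for "train, then evaluate on the validation set";
    # by construction it peaks at lam=0.01, rate=0.1
    return 1.0 - abs(lam - 0.01) - abs(rate - 0.1)


best_score, best_lambda, best_dropout = grid_search(
    toy_score, lambdas=[0.001, 0.01, 0.1], dropout_rates=[0.0, 0.1, 0.2, 0.5]
)
print(best_lambda, best_dropout)
```

In a real run, `score_fn` would presumably build the model with the given settings, fit it on the training subset, and return the accuracy on the validation subset, so that the test set stays untouched until the final evaluation.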

References: https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py

In [1]:
# imports
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from sklearn.model_selection import train_test_split

Using TensorFlow backend.


In [2]:
# define globals
batch_size = 32
num_classes = 10
epochs = 2

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# hold out 5,000 of the 60,000 training images as a validation set
x_train, x_ver, y_train, y_ver = train_test_split(x_train, y_train, train_size=55000)

# add a trailing channel dimension (grayscale images have 1 channel)
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_ver = x_ver.reshape(x_ver.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

# cast pixel values to float32
x_train = x_train.astype('float32')
x_ver = x_ver.astype('float32')
x_test = x_test.astype('float32')
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_ver.shape[0], 'ver samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_ver = keras.utils.to_categorical(y_ver, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# define the model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape, kernel_regularizer=keras.regularizers.l2(0.01)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))

# compile the model
model.compile(loss=keras.losses.categorical_crossentropy, optimizer=keras.optimizers.Adam(), metrics=['accuracy'])

# train the model, reporting loss and accuracy on the validation set after each epoch
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(x_ver, y_ver))

# score the model on the test set
score = model.evaluate(x_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])



x_train shape: (55000, 28, 28, 1)
55000 train samples
5000 ver samples
10000 test samples
Train on 55000 samples, validate on 5000 samples
Epoch 1/2
Epoch 2/2
Test loss: 0.12067932260930538
Test accuracy: 0.9707
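
To see where the model's capacity sits, the layer output shapes and trainable parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used in the cell above ('valid' padding and stride 1 for `Conv2D`, pooling stride equal to the pool size for `MaxPooling2D`). The helper names `conv2d_shape` and `maxpool_shape` are hypothetical.

```python
def conv2d_shape(h, w, kernel, filters):
    # 'valid' padding with stride 1: each spatial dimension shrinks by kernel - 1
    return h - kernel + 1, w - kernel + 1, filters


def maxpool_shape(h, w, c, pool):
    # non-overlapping pooling: each spatial dimension is divided by the pool size
    return h // pool, w // pool, c


# convolution: 32 filters of size 3x3 over a single input channel
h, w, c = conv2d_shape(28, 28, kernel=3, filters=32)
conv_params = (3 * 3 * 1 + 1) * 32       # weights plus one bias per filter

# max pooling halves the spatial dimensions; dropout and flatten add no parameters
h, w, c = maxpool_shape(h, w, c, pool=2)
flat = h * w * c

dense1_params = (flat + 1) * 32          # hidden dense layer with 32 units
dense2_params = (32 + 1) * 10            # softmax output layer with 10 classes

total_params = conv_params + dense1_params + dense2_params
print(flat, conv_params, dense1_params, dense2_params, total_params)
```

Under these assumptions, almost all of the parameters belong to the first dense layer, because the flattened pooling output is large compared with the convolution kernel.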
