# ConvNets Intro - MNIST

In my introduction to deep learning, I used MNIST as a "hello world" deep learning problem. A simple network consisting of two fully connected Dense layers achieved 97.86% accuracy. Let's try to solve the same problem with a convolutional neural network, the heart of computer vision, and see how big an improvement we get.

In [1]:
from keras.datasets import mnist

Using TensorFlow backend.


In [2]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images.shape

(60000, 28, 28)

In [3]:
from keras import models
from keras import layers

In [4]:
network = models.Sequential()
network.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1))) # 28x28 image size, 1 channel
network.add(layers.MaxPooling2D((2, 2)))
network.add(layers.Conv2D(64, (3, 3), activation='relu'))
network.add(layers.MaxPooling2D((2, 2)))
network.add(layers.Conv2D(64, (3, 3), activation='relu'))
network.add(layers.Flatten())
network.add(layers.Dense(64, activation='relu'))
network.add(layers.Dense(10, activation='softmax'))
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

We are using 3 convolutional layers with a 3x3 kernel size. The first conv layer uses 32 different 'filters', the other two use 64. This lets our neural network learn abstract features and characteristics from the image. Between consecutive conv layers, we use a max pooling layer to shrink the width and height of the 'image' by half. We then flatten the output and use it as the input for two dense layers that are responsible for the final digit classification based on the discovered image features.
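To make the shrinking concrete, here is a small sketch of the shape and parameter arithmetic for the network above. It assumes the Keras defaults the model relies on (`'valid'` padding and stride 1 for `Conv2D`, non-overlapping windows for `MaxPooling2D`); the helper names are mine, not part of Keras.

```python
def conv_out(size, kernel=3):
    # A 'valid' convolution with stride 1 trims (kernel - 1) pixels per side length
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping max pooling divides the side length and rounds down
    return size // pool

def conv_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight block plus one bias per filter
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_in, n_out):
    # Full weight matrix plus one bias per output unit
    return n_in * n_out + n_out

size = 28
size = conv_out(size)   # 26 after the first conv layer
size = pool_out(size)   # 13 after the first pooling layer
size = conv_out(size)   # 11 after the second conv layer
size = pool_out(size)   # 5 after the second pooling layer
size = conv_out(size)   # 3 after the third conv layer

flat = size * size * 64  # length of the vector produced by Flatten

total = (conv_params(3, 1, 32)
         + conv_params(3, 32, 64)
         + conv_params(3, 64, 64)
         + dense_params(flat, 64)
         + dense_params(64, 10))

print(size, flat, total)  # 3 576 93322
```

Calling `network.summary()` should report the same output shapes and parameter count, which is a quick way to check this arithmetic against the real model.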

In [5]:
X_train = train_images.reshape((60000, 28, 28, 1))
X_train = X_train.astype('float32') / 255

X_test = test_images.reshape((10000, 28, 28, 1))
X_test = X_test.astype('float32') / 255

Since we are using convolutional layers to learn the characteristics of a given image, instead of only dense layers, we don't need to squeeze our 28x28x1 image into a single 1D tensor. The reshape only adds the channel axis that the conv layers expect.
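The difference between the two input layouts can be sketched with a dummy batch. The array `fake_images` is a hypothetical stand-in for a few MNIST digits, and the 784-long vector is the flattened form a dense-only network would typically take.

```python
import numpy as np

# Hypothetical stand-in batch: 4 blank grayscale images shaped like MNIST digits
fake_images = np.zeros((4, 28, 28), dtype='uint8')

# Dense-only input: every image is flattened into one 784-element vector
as_vectors = fake_images.reshape((4, 28 * 28))

# Conv input: the 2D layout is kept and a single channel axis is appended
as_maps = fake_images.reshape((4, 28, 28, 1))

print(as_vectors.shape, as_maps.shape)  # (4, 784) (4, 28, 28, 1)
```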

In [6]:
from keras.utils import to_categorical

In [7]:
y_train = to_categorical(train_labels)
y_test = to_categorical(test_labels)
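For intuition, `to_categorical` turns each integer label into a one-hot row, which is the target format that `categorical_crossentropy` expects. A minimal NumPy sketch of the same idea follows; the `one_hot` helper is my own illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # Build an all-zero matrix, then set a single 1 per row at the label index
    labels = np.asarray(labels)
    encoded = np.zeros((labels.shape[0], num_classes), dtype='float32')
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

sample = one_hot([5, 0, 4])
print(sample.shape)  # (3, 10)
print(sample[0])     # the 1 sits at index 5
```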

In [8]:
X_train.shape, y_train.shape

((60000, 28, 28, 1), (60000, 10))

In [9]:
network.fit(X_train, y_train, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f5681fe1cf8>

In [10]:
test_loss, test_acc = network.evaluate(X_test, y_test)
test_acc



0.9887
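To put the improvement over the dense network into perspective, it helps to compare error rates rather than accuracies. The sketch below uses only the two accuracy figures from this notebook (97.86% for the dense network, 98.87% here); exact numbers will vary from run to run.

```python
dense_acc = 0.9786  # dense-only network from the earlier introduction
conv_acc = 0.9887   # convolutional network from this notebook

dense_err = 1 - dense_acc
conv_err = 1 - conv_acc

# Share of the dense network's mistakes that the convnet no longer makes
relative_reduction = (dense_err - conv_err) / dense_err

print(f"{dense_err:.2%} -> {conv_err:.2%} ({relative_reduction:.0%} fewer errors)")
# 2.14% -> 1.13% (47% fewer errors)
```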