A First Look at a Neural Network
================================

We aim here to classify grayscale images of handwritten digits, specifically those in the MNIST dataset. According to this book, solving this problem can be considered the "Hello World" of deep learning.

First we load our data into train and test sets.

In [1]:
from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Using TensorFlow backend.


Next we can check that our NumPy arrays have the expected size and shape: each image should be 28x28 pixels, and each label should be an integer between 0 and 9.

In [2]:
print(train_images.shape)
print(len(train_labels))
print(all(0 <= x <= 9 for x in train_labels))

(60000, 28, 28)
60000
True


In [3]:
print(test_images.shape)
print(len(test_labels))
print(all(0 <= x <= 9 for x in test_labels))

(10000, 28, 28)
10000
True


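Next is a basic example of how to set up a neural network with two fully connected (Dense) layers in Keras. The first layer has 512 units with ReLU activation; the second is a 10-way softmax layer that outputs one probability per digit class.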

In [4]:
from keras import models
from keras import layers

network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28*28, )))
network.add(layers.Dense(10, activation='softmax'))
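
As a rough sanity check on the size of this model, the number of trainable parameters can be worked out by hand: a fully connected layer holds one weight per input-unit pair plus one bias per unit. The helper name `dense_param_count` below is hypothetical and only for illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer sizes taken from the network defined above.
hidden = dense_param_count(28 * 28, 512)   # 784 inputs -> 512 units
output = dense_param_count(512, 10)        # 512 inputs -> 10 units
total = hidden + output

print(hidden, output, total)  # 401920 5130 407050
```

Calling `network.summary()` on the model above should report the same total.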

We then need to compile our network, which configures it for training. In this step we choose our optimizer, our loss function and the metrics to monitor during training and testing.

In [5]:
network.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])
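
To give a feel for what the softmax output and the `categorical_crossentropy` loss compute, here is a minimal pure-Python sketch for a single sample. It is an illustration of the underlying formulas, not Keras's actual implementation; the function names and the `eps` clipping value are assumptions made for this example.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    # eps guards against log(0); the value is an assumption for this sketch.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

logits = [2.0, 1.0, 0.1]      # made-up scores for a 3-class toy problem
probs = softmax(logits)
target = [1, 0, 0]            # the true class is class 0
loss = categorical_crossentropy(target, probs)

print(probs)
print(loss)
```

The loss shrinks towards 0 as the probability assigned to the correct class approaches 1, which is what the optimizer tries to achieve during training.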

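In order to pass our data into our network we need to reshape each 28x28 image into a flat array of 784 values, matching the input_shape of the first layer. We will also normalise the images by scaling the pixel values from the range [0, 255] to [0, 1].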

In [6]:
train_images = train_images.reshape((train_images.shape[0], train_images.shape[1]*train_images.shape[2]))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((test_images.shape[0], test_images.shape[1]*test_images.shape[2]))
test_images = test_images.astype('float32') / 255
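
The same reshape-and-scale step can be tried on a tiny array to see exactly what it does. This is a sketch on made-up data: two 2x2 "images" stand in for the MNIST arrays, and `-1` is used to let NumPy infer the flattened length.

```python
import numpy as np

# Two fake 2x2 "images" with uint8 pixel values, standing in for MNIST.
images = np.array([[[0, 255], [128, 64]],
                   [[255, 255], [0, 0]]], dtype=np.uint8)

# Keep the sample axis, flatten everything else: (2, 2, 2) -> (2, 4).
flat = images.reshape((images.shape[0], -1))

# Convert to float32 and scale pixel values from [0, 255] to [0, 1].
scaled = flat.astype('float32') / 255

print(flat.shape)
print(scaled.min(), scaled.max())
```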

We also need to convert our labels into categorical labels. This means each label becomes an array with a 1 at the position of the correct class and 0s everywhere else. This is also known as one-hot encoding.

In [7]:
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
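
To make the encoding concrete, here is a minimal NumPy sketch of one-hot encoding. The `one_hot` helper is hypothetical and written only for illustration; it is not how `to_categorical` is implemented.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an array with a 1 at each label's index and 0s elsewhere."""
    labels = np.asarray(labels)
    encoded = np.zeros((len(labels), num_classes), dtype='float32')
    # For row i, set column labels[i] to 1.
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

encoded = one_hot([3, 0, 9], num_classes=10)

print(encoded.shape)
print(encoded[0])
```

Taking the index of the largest value in each row (`encoded.argmax(axis=1)`) recovers the original integer labels.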

Now we can train our model.

In [8]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f64fffcbe80>
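
As a back-of-the-envelope illustration of what `epochs=5` and `batch_size=128` mean for the 60000 training images, the sketch below counts the mini-batches involved, assuming one weight update per batch and a smaller final batch when the data does not divide evenly.

```python
import math

n_samples = 60000   # training images, from the shape check earlier
batch_size = 128
epochs = 5

# Batches needed to see every sample once (the last one may be smaller).
steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch = n_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 469 96 2345
```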

Finally, we can check how well our model works by measuring the accuracy of its predictions on the test data, which it did not see during training.

In [9]:
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc', test_acc)

test_acc 0.9801
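
The accuracy figure can be read as the fraction of samples whose highest-probability class matches the true label. The sketch below shows that calculation on made-up predictions for a 3-class toy problem; the `accuracy` and `argmax` helpers are hypothetical and not part of Keras.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)

def accuracy(probabilities, one_hot_labels):
    """Fraction of rows where the predicted class equals the true class."""
    correct = sum(argmax(p) == argmax(t)
                  for p, t in zip(probabilities, one_hot_labels))
    return correct / len(one_hot_labels)

# Made-up softmax outputs and one-hot labels for four samples.
predictions = [[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1],
               [0.3, 0.4, 0.3],   # predicts class 1, but the label is class 2
               [0.2, 0.2, 0.6]]
labels = [[1, 0, 0],
          [0, 1, 0],
          [0, 0, 1],
          [0, 0, 1]]

toy_acc = accuracy(predictions, labels)
print(toy_acc)  # 0.75
```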
