## 2. The Mathematical Building Blocks of Neural Networks

### The MNIST Problem

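This is a concrete example of a neural network, built with Keras, applied to the MNIST problem: classifying a grayscale image of a handwritten digit ($28 \times 28$ pixels) into one of 10 categories (0 through 9). The dataset contains 60,000 training images and 10,000 test images.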

In [1]:
from keras.datasets import mnist

from keras import models
from keras import layers
from keras.utils import to_categorical

Using TensorFlow backend.


In [2]:
# Ingestion
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [3]:
# Optional sanity checks: uncomment to inspect the array shapes and first labels
# print(train_images.shape)
# print(train_labels.shape)
# print(train_labels[:10])
# print()
# print(test_images.shape)
# print(test_labels.shape)
# print(test_labels[:10])

The following steps build, train and evaluate the neural network.

The core building block of neural networks is the <b>layer</b>, a data-processing module that acts as a filter for data: some data goes in, and it comes out in a more useful form. Specifically, layers extract <b>representations</b> from the data fed into them, ideally representations that are more meaningful for the problem at hand. Most of deep learning consists of chaining together simple layers that implement a form of progressive <b>data distillation</b>.
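To make the idea of a layer as a data filter concrete, here is a minimal pure-Python sketch of what a densely connected layer with a ReLU activation computes: `relu(dot(W, x) + b)`. The helper names `relu` and `dense`, and the tiny weights, bias and input, are illustrative assumptions; a real Keras layer learns its weights and works on whole batches of tensors.

```python
# Minimal sketch of a dense layer followed by ReLU.
# All numbers below are made-up illustrative values, not learned weights.

def relu(values):
    """Replace negative entries with 0 and leave the rest unchanged."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, bias):
    """Return one output per row of `weights`: dot(row, inputs) + bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, bias)
    ]

x = [1.0, 2.0, 3.0]          # 3 input features
W = [[0.5, -1.0, 0.25],      # 2 output units, 3 weights each
     [1.0, 0.5, 0.0]]
b = [0.25, -0.5]

hidden = relu(dense(x, W, b))
print(hidden)  # [0.0, 1.5]
```

The first unit's pre-activation is negative, so ReLU clips it to zero; the second passes through unchanged.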

In [4]:
# Model definition - network architecture
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))

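This model contains two `Dense` layers, which are densely connected (also called fully connected) layers. The first has 512 units with a ReLU activation and takes flattened input vectors of length $28 \times 28 = 784$. The last layer is a 10-way <b>softmax layer</b>: it returns an array of 10 probability scores summing to $1$, where each score is the probability that the input image belongs to one of the 10 digit classes.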
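As a rough illustration of what the softmax output means, the sketch below turns 10 raw scores into probabilities. The function `softmax` and the `raw_scores` values are assumptions made up for this example; Keras applies the same idea internally on tensors.

```python
import math

def softmax(scores):
    """Turn raw scores into positive probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores, one per digit class 0..9.
raw_scores = [2.0, 1.0, 0.1, -1.0, 0.0, 0.5, -0.5, 1.5, 0.3, -2.0]
probs = softmax(raw_scores)

print(len(probs))                # 10
print(round(sum(probs), 6))      # 1.0
print(probs.index(max(probs)))   # 0, the class with the highest raw score
```

The predicted digit is the index of the largest probability.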

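The next step is to pick three things for the compilation step: a <b>loss function</b>, which measures how well the network performs on the training data; an <b>optimiser</b>, which updates the network's weights based on the loss; and a <b>performance metric</b> to monitor during training and testing (here accuracy, the fraction of images classified correctly).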
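The loss used in this notebook is categorical cross-entropy. As a hedged sketch of what it measures for a single sample, the pure-Python function below computes $-\sum_i y_i \log p_i$ for a one-hot label $y$ and predicted probabilities $p$. The function name, the small `eps` guard and the three-class example values are assumptions for illustration, not the Keras implementation.

```python
import math

def categorical_crossentropy(one_hot_label, predicted_probs, eps=1e-12):
    """Cross-entropy for one sample: -sum(y_i * log(p_i)).

    `eps` guards against log(0); it is an illustrative choice.
    """
    return -sum(
        y * math.log(p + eps)
        for y, p in zip(one_hot_label, predicted_probs)
    )

label = [0, 0, 1]                 # true class is index 2
confident = [0.05, 0.05, 0.90]    # puts most probability on the true class
unsure = [0.40, 0.30, 0.30]       # spreads probability across classes

loss_confident = categorical_crossentropy(label, confident)
loss_unsure = categorical_crossentropy(label, unsure)
print(round(loss_confident, 3))   # 0.105
print(round(loss_unsure, 3))      # 1.204
```

The loss shrinks as the probability assigned to the true class approaches 1, which is the signal the optimiser uses to adjust the weights.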

In [5]:
# Model compilation - optimizer, loss and metric
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

In [6]:
# Data preprocessing - flatten each 28x28 image into a 784-vector
# and scale pixel values from [0, 255] to [0, 1]
train_images_flat = train_images.reshape((60000, 28 * 28))
train_images_flat = train_images_flat.astype('float32') / 255
test_images_flat = test_images.reshape((10000, 28 * 28))
test_images_flat = test_images_flat.astype('float32') / 255

# Data preprocessing - one-hot encode the labels (10 classes)
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
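
The two preprocessing steps in the cell above can be sketched on tiny inputs in pure Python. The helpers `one_hot` and `scale_pixels` are hypothetical stand-ins written for this example; they mimic what `to_categorical` and the division by 255 do, but operate on plain lists rather than NumPy arrays.

```python
def one_hot(label, num_classes=10):
    """Encode an integer label as a vector with a single 1.0 entry."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def scale_pixels(pixels):
    """Map 8-bit pixel intensities from [0, 255] to [0, 1]."""
    return [p / 255 for p in pixels]

print(one_hot(3))
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

scaled = scale_pixels([0, 51, 255])
print(scaled)
```

One-hot labels have the same shape as the 10-way softmax output, which is what `categorical_crossentropy` expects.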

In [7]:
# Train model with data
network.fit(train_images_flat, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.callbacks.History at 0x138e808d0>

Two quantities are displayed during training: the loss of the network over the training data and its accuracy over the training data. As the model fits the training data better and better, the loss falls and the accuracy typically rises.
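For reference, accuracy is the fraction of samples whose highest-probability class matches the true label. The sketch below computes it in pure Python; the function `accuracy` and the four made-up three-class predictions are assumptions for illustration only, not output from the network above.

```python
def accuracy(predicted_probs, true_labels):
    """Fraction of samples where the most probable class equals the label."""
    correct = 0
    for probs, label in zip(predicted_probs, true_labels):
        predicted_class = probs.index(max(probs))  # argmax over classes
        if predicted_class == label:
            correct += 1
    return correct / len(true_labels)

# Made-up predictions for 4 samples over 3 classes.
predictions = [
    [0.7, 0.2, 0.1],   # predicts class 0
    [0.1, 0.8, 0.1],   # predicts class 1
    [0.3, 0.4, 0.3],   # predicts class 1 (true label is 2, so this is wrong)
    [0.2, 0.2, 0.6],   # predicts class 2
]
labels = [0, 1, 2, 2]

acc = accuracy(predictions, labels)
print(acc)  # 0.75
```

Because accuracy only looks at the argmax, the loss can keep decreasing while the accuracy stays the same.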

In [8]:
# Model evaluation with accuracy
test_loss, test_acc = network.evaluate(test_images_flat, test_labels)
print('test_acc:', test_acc)

test_acc: 0.9782999753952026
