In [33]:
from keras.datasets import mnist
from keras import models
from keras import layers
from keras.utils import to_categorical

In [27]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [28]:
train_images.shape, train_labels.shape

((60000, 28, 28), (60000,))

In [29]:
test_images.shape, test_labels.shape

((10000, 28, 28), (10000,))

1. Our network consists of two dense (fully connected) layers.
2. The last layer is a 10-way softmax layer, which means it returns an array of 10 probability scores that sum to 1.
3. Each score is the probability that the current digit image belongs to the corresponding one of our 10 digit classes.
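
To make the softmax description above concrete, here is a minimal plain-Python sketch. It is not the Keras implementation; the `softmax` helper and the raw scores are made up for illustration, and only the standard `math` module is used.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum improves numerical stability
    # without changing the result.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the 10 digit classes (made-up values).
logits = [1.0, 2.0, 0.5, 0.1, 3.0, 0.0, -1.0, 0.2, 0.3, 1.5]
probs = softmax(logits)

# The predicted digit is the index of the largest probability.
predicted_digit = probs.index(max(probs))
print(sum(probs), predicted_digit)
```

The class with the largest raw score also receives the largest probability, so taking the index of the maximum gives the predicted digit.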

In [30]:
network = models.Sequential()

network.add(layers.Dense(512, 
                         activation='relu',
                         input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
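
As a rough sanity check on the model size, the sketch below counts the trainable parameters by hand. It assumes the standard dense-layer layout of one weight per input-output pair plus one bias per output unit; `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(28 * 28, 512)   # first Dense layer
output = dense_params(512, 10)        # softmax output layer
total = hidden + output
print(hidden, output, total)
```

If the assumption holds, the total should match the parameter count reported by `network.summary()`.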

To prepare the network for training, three more components are needed as part of the compilation step:
1. a loss function - how the network measures its performance on the training data so that it can steer itself in the right direction
2. an optimizer - the mechanism through which the network updates itself based on the data it sees and its loss function
3. metrics to monitor during training and testing - in this case only accuracy (the fraction of the images that are correctly classified)
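
The loss and the metric from the list above can be sketched in plain Python. This only illustrates the formulas; the Keras versions differ in details such as clipping and batching. The helper names and the 3-class data are made up.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    losses = []
    for target, pred in zip(y_true, y_pred):
        # With a one-hot target, only the true class contributes to the sum.
        # max(p, eps) guards against log(0).
        losses.append(-sum(t * math.log(max(p, eps)) for t, p in zip(target, pred)))
    return sum(losses) / len(losses)

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the target."""
    correct = sum(
        t.index(max(t)) == p.index(max(p)) for t, p in zip(y_true, y_pred)
    )
    return correct / len(y_true)

# Made-up example: 3 samples, 3 classes; the last prediction is wrong.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.5, 0.2]]

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(loss, acc)
```

The loss falls as the probability assigned to the correct class rises, which gives the optimizer a smooth signal to follow. Accuracy only changes when a prediction flips from wrong to right.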

In [31]:
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

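Before training, we need to reshape the data into the shape the network expects and scale it so that all values are in the [0, 1] interval.

Until now, the training images were stored in an array of shape (60_000, 28, 28) of type uint8, with values in the [0, 255] interval. After preprocessing, they are a float32 array of shape (60_000, 28 * 28) with values between 0 and 1.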
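
As a toy illustration of this preprocessing, the sketch below flattens and scales a hypothetical 2x2 "image" using plain Python lists. The pixel values are made up; the notebook does the same thing to the 28x28 MNIST arrays with NumPy.

```python
# A hypothetical 2x2 "image" with uint8-style pixel values in [0, 255].
image = [[0, 255],
         [51, 102]]

# Flatten row by row (2 x 2 -> 4 values), as reshape does for 28 x 28 -> 784.
flat = [pixel for row in image for pixel in row]

# Scale every pixel into the [0, 1] interval.
scaled = [pixel / 255 for pixel in flat]
print(flat, scaled)
```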

In [32]:
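# Flatten each 28x28 image into a 784-element vector, then scale to [0, 1].
train_images = train_images.reshape((60_000, 28 * 28))
train_images = train_images.astype('float32') / 255

# Apply the same preprocessing to the test set.
test_images = test_images.reshape((10_000, 28 * 28))
test_images = test_images.astype('float32') / 255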

In [35]:
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
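
The labels are one-hot encoded so that they match the 10-way softmax output and the `categorical_crossentropy` loss. The sketch below shows the idea in plain Python; `one_hot` is a hypothetical helper rather than the Keras function, and the sample labels are made up.

```python
def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0   # mark the position of the true class
        encoded.append(row)
    return encoded

labels = [5, 0, 4]                      # made-up digit labels
encoded = one_hot(labels, num_classes=10)
for label, row in zip(labels, encoded):
    print(label, row)
```

Each row has a single 1.0 at the index of its digit, so a label array of shape (n,) becomes an array of shape (n, 10).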

In [39]:
network.fit(train_images, 
            train_labels, 
            epochs=5, 
            batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x14481f7c0>

In [40]:
test_loss, test_acc = network.evaluate(test_images, test_labels)
test_loss, test_acc



(0.07656168192625046, 0.9765999913215637)