In [1]:
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

**This notebook is just a basic example showing how things are done; details on each aspect will follow.**

- `train_images` and `train_labels` form the training set; `test_images` and `test_labels` form the test set
- The images are encoded as NumPy arrays, and the labels are an array of digits ranging from 0 to 9

Let us look at the training and test data

# Exploring the data

In [2]:
train_images.shape

(60000, 28, 28)

In [3]:
len(train_labels)

60000

In [4]:
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [5]:
test_images.shape

(10000, 28, 28)

In [6]:
len(test_labels)

10000

In [7]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)
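The checks in the cells above (shape, number of labels, label values) can be gathered into one small helper. This is only a sketch: `summarize` is a hypothetical name, and the arrays are small stand-ins with the same layout as MNIST, not the real dataset.

```python
import numpy as np

# Stand-in data with the same layout as MNIST (NOT the real dataset):
# 6 grayscale images of 28x28 pixels and one digit label per image.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)
labels = np.array([5, 0, 4, 1, 9, 2], dtype=np.uint8)


def summarize(images, labels):
    """Hypothetical helper: collect the basic facts checked in the cells above."""
    return {
        "n_images": images.shape[0],
        "image_shape": images.shape[1:],
        "dtype": str(images.dtype),
        "n_labels": len(labels),
        "label_range": (int(labels.min()), int(labels.max())),
    }


summary = summarize(images, labels)
print(summary)

# Every image needs exactly one label.
assert summary["n_images"] == summary["n_labels"]
```

Calling `summarize(train_images, train_labels)` on the real arrays should work the same way, since it only relies on `shape`, `dtype`, `min`, and `max`.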

First we feed the neural network the training data so that it learns to associate images with labels; then we test it on the test data

# The network architecture

In [8]:
from tensorflow.keras import models
from tensorflow.keras import layers

network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28, )))
network.add(layers.Dense(10, activation='softmax'))
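To make the two `Dense` layers less abstract, here is a rough NumPy sketch of the forward pass they describe: a matrix multiply plus bias followed by `relu`, then another followed by `softmax`. The weights are random stand-ins (an assumption for illustration, not what Keras initialises or learns), and the input is random rather than real images.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the row-wise max for numerical stability.
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)

# Random stand-in weights with the same shapes as the two Dense layers.
w1 = rng.normal(scale=0.05, size=(28 * 28, 512))
b1 = np.zeros(512)
w2 = rng.normal(scale=0.05, size=(512, 10))
b2 = np.zeros(10)

# A batch of 4 flattened stand-in "images" with values in [0, 1).
x = rng.random((4, 28 * 28))

hidden = relu(x @ w1 + b1)         # shape (4, 512)
probs = softmax(hidden @ w2 + b2)  # shape (4, 10), each row sums to 1

# Total number of trainable values in both layers.
n_params = w1.size + b1.size + w2.size + b2.size
print(probs.shape, n_params)
```

Each row of `probs` holds 10 scores that sum to 1, one per digit class. The parameter count works out to 784 \* 512 + 512 + 512 \* 10 + 10 = 407,050.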

In [9]:
# Add optimizer and loss function to the net
network.compile(
    optimizer='rmsprop',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
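As a rough illustration of what the `categorical_crossentropy` loss measures, the sketch below computes it by hand in NumPy on a toy example with three classes. The function and the `eps` clipping value are assumptions made for this sketch, not the Keras implementation.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    # Clip to avoid log(0); eps is an assumed value for this sketch.
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))


# Two samples, three classes (a toy stand-in for the 10 digit classes).
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
uniform = np.full((2, 3), 1.0 / 3.0)

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(loss_confident, loss_uniform)
```

The loss is low when the predicted probability of the correct class is high, and for a uniform guess over three classes it equals log(3).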

# Transforming the data
- The images need to be in a form that the neural net can accept, and they will be scaled so that all values lie in the range [0, 1]
- The training images were previously stored in a uint8 array of shape (60000, 28, 28) with values in [0, 255]; we reshape them into a float32 array of shape (60000, 28 \* 28) and divide by 255 so the values lie between 0 and 1

In [10]:
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32') / 255
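The reshape-and-scale step can be checked on a small stand-in array. This sketch uses random uint8 values rather than real images, and it infers the sizes from the array instead of hard-coding them, which is one possible variation on the cell above.

```python
import numpy as np

# Stand-in batch: 3 images of 28x28 uint8 pixels (not real MNIST data).
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# batch.shape[0] avoids hard-coding the image count;
# -1 lets NumPy infer the flattened length (28 * 28 = 784).
flat = batch.reshape((batch.shape[0], -1)).astype("float32") / 255

print(flat.shape, flat.dtype, flat.min(), flat.max())
```

The flattening only changes the layout: reshaping a row back to (28, 28) and multiplying by 255 recovers the original pixel values up to floating-point rounding.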

- The labels need to be categorically (one-hot) encoded too, since the `categorical_crossentropy` loss expects one vector of length 10 per label

In [11]:
from tensorflow.keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
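To show what categorical encoding produces, here is a minimal NumPy stand-in. `one_hot` is a hypothetical helper written for this sketch; it mimics the idea behind `to_categorical` and is not its implementation.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """NumPy stand-in for categorical encoding: one row per label, a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


encoded = one_hot([5, 0, 4])
print(encoded)

# argmax recovers the original integer labels.
decoded = encoded.argmax(axis=1)
```

Each label becomes a vector of length 10 with a 1 at the index of the digit and 0 elsewhere, so the label 5 has its 1 in position 5.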

# And the training starts

Training is usually done by fitting the network to the training data; this naming of the function (`fit`) is common across most libraries

In [12]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fbad6a1fe50>

- We quickly reach a training accuracy of around 99%
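For a sense of how much work the `fit` call does, the arithmetic below counts the mini-batches per epoch for the 60,000 training images with `batch_size=128` and `epochs=5`. It assumes the final, smaller batch is still processed rather than dropped.

```python
import math

n_samples = 60000   # training images, from train_images.shape
batch_size = 128    # as passed to network.fit
epochs = 5

# Assumption: the last partial batch is kept, hence the ceiling.
steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

Under that assumption each epoch consists of 469 weight updates, the last of which uses only 96 images, for 2,345 updates in total.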

Time to test our model 

In [13]:
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test accuracy: ', test_acc)

test accuracy:  0.98089998960495
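The accuracy figure can be understood as the fraction of images whose highest-scoring class matches the true label. The sketch below computes that by hand on a toy example; `accuracy` is a hypothetical helper for illustration, not the Keras metric itself.

```python
import numpy as np


def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples whose highest-probability class matches the true class."""
    true_classes = y_true_one_hot.argmax(axis=1)
    predicted_classes = y_pred_probs.argmax(axis=1)
    return float(np.mean(true_classes == predicted_classes))


# Toy stand-in: 4 samples, 3 classes; the last prediction is wrong.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.3, 0.5, 0.2]])

acc = accuracy(y_true, y_pred)
print(acc)
```

Three of the four toy predictions are correct, so the result is 0.75.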


- The test accuracy (about 98.1%) is slightly lower than the training accuracy; this gap can be attributed to overfitting
- Thus, it was possible to train a neural net to classify handwritten digits in fewer than 20 lines of Python code