MNIST is a set of images of handwritten digits

The problem is to classify each greyscale image into the correct category, namely '0', '1', ..., '9'
 
There are 60,000 **training** images and 10,000 **test** images

We begin with the 'hello world' of neural networks: MNIST - the classification of handwritten digits. The MNIST dataset consists of 70,000 small greyscale images. We met a few examples in Topic 1. The dataset is split into two parts - training and test images. Splitting datasets into two parts is fundamental to the deep learning methodology. The aim is to classify each image into its correct category.

In [1]:
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

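import matplotlib.pyplot as plt
def mnist_peek(rows, cols):
    # Show the first rows * cols training images in a grid.
    # squeeze=False keeps axs two-dimensional even for a single row or column.
    fig, axs = plt.subplots(rows, cols, squeeze=False)
    for i in range(rows):
        for j in range(cols):
            axs[i, j].imshow(train_images[i * cols + j], cmap=plt.cm.binary)
            axs[i, j].axis('off')  # hide the axis ticks around each digit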

In [None]:
mnist_peek(6, 6)

Here are some MNIST images. I don't think we would have too much difficulty recognising these digits, except perhaps a 1 - 7 ambiguity in the penultimate row. 

| everyday term   | jargon |
|----------------:|:-------|
| data point      | sample |
| category        | class  |
| class of sample | label  |

The jargon starts here. Each image, or in general, each data point is known as a 'sample'. Data points typically belong to one or more categories. Each MNIST image belongs to exactly one category or 'class' - '0', '1', '2' etc. The class of a sample is known as its 'label'. 
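
To make the vocabulary concrete, here is a minimal sketch using a small label vector. `toy_labels` is a hypothetical stand-in invented for this illustration, not the real MNIST labels. Each entry is the label of one sample, and the distinct values are the classes.

```python
import numpy as np

# Hypothetical stand-in for a label vector: one label per sample.
toy_labels = np.array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=np.uint8)

# The classes are the distinct label values;
# counts records how many samples carry each label.
classes, counts = np.unique(toy_labels, return_counts=True)

num_samples = toy_labels.shape[0]
print('samples:', num_samples)
print('classes present:', classes.tolist())
print('samples per class:', dict(zip(classes.tolist(), counts.tolist())))
```

Once the real data is loaded, the same call on `train_labels` should report the ten digit classes.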

Workflow:
1. **Load data**
2. Preprocess data
3. Build network
4. Train
5. Test

The workflow is split into five stages. First, load data into our system.

The MNIST dataset is one of several TensorFlow datasets

Loading MNIST is painless because TensorFlow's Keras API provides a ready-made loader for it. This convenience is unusual: normally data has to be retrieved from the internet or from a private source.

In [None]:
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

`mnist.load_data` downloads the dataset (samples and labels), already split into training and test parts.
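
When a dataset does not come with a ready-made loader, we have to read it ourselves. The following is only a sketch under stated assumptions: it writes a tiny random archive to a temporary directory and reads it back in the same nested-tuple layout. The file name `toy_digits.npz`, the helper `load_local` and the key names are all chosen for this illustration and are not part of the notebook.

```python
import os
import tempfile
import numpy as np

# Build a tiny stand-in archive so the sketch runs without a download.
# The key names are assumptions made for this illustration.
rng = np.random.default_rng(0)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'toy_digits.npz')
np.savez(
    path,
    x_train=rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8),
    y_train=rng.integers(0, 10, size=6, dtype=np.uint8),
    x_test=rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8),
    y_test=rng.integers(0, 10, size=2, dtype=np.uint8),
)

def load_local(npz_path):
    """Return ((train_x, train_y), (test_x, test_y)) from a local .npz file."""
    with np.load(npz_path) as data:
        train = (data['x_train'], data['y_train'])
        test = (data['x_test'], data['y_test'])
    return train, test

(toy_train_images, toy_train_labels), (toy_test_images, toy_test_labels) = load_local(path)
print(toy_train_images.shape, toy_train_labels.shape)
print(toy_test_images.shape, toy_test_labels.shape)
```

Returning the training and test parts as two separate pairs keeps the split explicit from the first line of the workflow.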

Data is stored in special multidimensional arrays - tensors

There are 60000 greyscale images in the training set

Each image is 28 x 28 pixels

- the training set is a data container with 60000 x 28 x 28 elements

- the training labels are stored in a 60,000 element vector 

Data is stored in special multidimensional arrays called tensors. `train_images` is a data container with 60000 x 28 x 28 elements: 60,000 greyscale images, each 28 by 28 pixels.
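
The element count can be checked with a short sketch. It uses an all-zero NumPy array as a stand-in with the same shape as `train_images`. The name `images` and the `uint8` element type are assumptions for this illustration, so the pixel values here are not real data.

```python
import numpy as np

# Stand-in with the shape described above (all zeros, not real images).
images = np.zeros((60000, 28, 28), dtype=np.uint8)

print(images.size)          # total number of elements in the container
print(60000 * 28 * 28)      # the same count, multiplied out by hand

first_image = images[0]     # indexing the first axis selects one image
print(first_image.shape)

pixel = images[0, 14, 14]   # one pixel: image 0, row 14, column 14
print(pixel)
```

Indexing along the first axis picks out a sample, and the remaining two indices pick out a pixel within it.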

In [None]:
print('tensor shape')
print('\ttraining images:', train_images.shape)
print('\ttraining labels:', train_labels.shape)
print('\ttest images:\t', test_images.shape)
print('\ttest labels:\t', test_labels.shape)

The shape of a tensor is the number of elements along each dimension. We will have much more to say about tensor shape later in this topic. For now, read (60000, 28, 28) as a container for 60,000 two-dimensional pixel maps. The label vectors have shapes such as (60000,): the trailing comma is standard Python notation for a one-element tuple.
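
A short sketch of how shapes behave as ordinary Python tuples. The arrays `images` and `labels` are all-zero NumPy stand-ins with the shapes discussed above, assumed here for illustration rather than loaded from MNIST.

```python
import numpy as np

# Stand-ins with the shapes of the training images and labels.
images = np.zeros((60000, 28, 28), dtype=np.uint8)
labels = np.zeros(60000, dtype=np.uint8)

# A shape is a tuple with one entry per dimension.
print(images.shape, images.ndim)
print(labels.shape, labels.ndim)

# The trailing comma is what makes a one-element tuple;
# without it the parentheses are just grouping around an int.
one_tuple = (60000,)
just_int = (60000)
print(type(one_tuple).__name__, type(just_int).__name__)
print(labels.shape == one_tuple)
```

The number of entries in the shape tuple equals the number of dimensions, which NumPy also reports as `ndim`.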