# Digit recognition with the MNIST dataset

This notebook investigates the classification of handwritten digits using a neural network.<br/>
The MNIST dataset is first used to train the network and then to test the network's performance in recognising a digit.<br/>
Once training has completed, a single image from the dataset is passed to the network and the result is displayed along with the expected digit.<br/>
![Mnist Image](https://corochann.com/wp-content/uploads/2017/02/mnist_plot-800x600.png)
<cite>Image source https://corochann.com/wp-content/uploads/2017/02/mnist_plot-800x600.png</cite>


## Packages needed for the program to run

The following packages will need to be imported for creating the network and importing the images to memory:
* The keras package, used for creating the network
* The gzip package, used for unzipping the dataset images and labels
* The numpy package, used for converting the dataset into numpy arrays
* The sklearn preprocessing package, used for encoding each digit label as a binary vector


In [1]:
# Importing the packages 
import keras as kr
import numpy as np
import sklearn.preprocessing as pre
import gzip

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


## Building the neural network
To begin, we need to initialise the network using the sequential model.<br/>
This allows us to add layers as we need them.<br/>
These layers can be tweaked to increase performance.<br/>
We will investigate this later in the notebook.


In [2]:
# Initialise the neural network
model = kr.models.Sequential()

## Adding the layers to the network
To add layers to the network, the layers module from keras is used.<br/>
The layers are densely connected, meaning that every neuron in the input layer is connected to every neuron in the middle layer, and every neuron in the middle layer is connected to every neuron in the output layer.

![Neural Network](https://cdn-images-1.medium.com/max/800/1*jYhgQ4I_oFdxgDD-AOgV1w.png)
<cite>Image source https://cdn-images-1.medium.com/max/800/1*jYhgQ4I_oFdxgDD-AOgV1w.png</cite>

* In the below code segment the units attribute sets the number of neurons in the middle layer. In this case we have 1000 neurons.
* The activation attribute sets the activation function. In this case we are using [relu activation](https://keras.io/activations/). The gradient of relu does not flatten out for positive inputs, which generally speeds up the training process without a loss of performance.
* The final attribute sets the number of inputs the network has. In the below example the number is set to 784, as this is equal to the number of pixels (28 x 28, one byte each) in every image within the MNIST dataset.
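
As a rough illustration of what a dense layer with relu computes, the following NumPy sketch pushes one flattened image through 1000 units. The random input, the random weights and the names `W`, `b` and `relu` are assumptions made for illustration only; Keras creates and manages its own weights internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# One flattened 28 x 28 image (784 values in [0, 1]).
# Random data stands in for a real digit here.
x = rng.random(784)

# Hypothetical weights and biases for a dense layer with 1000 units.
W = rng.normal(scale=0.01, size=(784, 1000))
b = np.zeros(1000)

def relu(z):
    # relu keeps positive values and replaces negative values with zero.
    return np.maximum(z, 0.0)

# Every input value contributes to every unit: this is the "dense" connection.
hidden = relu(x @ W + b)

print(hidden.shape)  # (1000,)
```

Each of the 1000 outputs is a weighted sum of all 784 inputs, which is why the weight matrix has 784 x 1000 entries.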


In [3]:
# Add a hidden (middle) layer with 1000 neurons that takes 784 inputs.
# There are 784 inputs because each 28 x 28 image has 784 pixels, one byte per pixel.
model.add(kr.layers.Dense(units=1000, activation='relu', input_dim=784))


## Output layer
The output layer has ten neurons, one for each digit label in the dataset. The values from the middle layer are passed to the output layer, and the resulting prediction is compared with the actual digit for the image that was sent in.<br/>
The closer the output for the correct digit is to one, the better the network is performing.<br/>
This process is repeated, and gradient descent moves the weights step by step towards the base of the loss slope.<br/>
The process ends when all of the epochs have completed, which is explained later in this notebook.
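
To make the role of the output layer more concrete, here is a minimal NumPy sketch of the softmax function, which turns ten raw scores into ten probabilities that sum to one. The scores are made up for illustration and the helper name `softmax` is an assumption, not part of the notebook.

```python
import numpy as np

def softmax(z):
    # Subtracting the maximum first keeps exp() numerically stable
    # and does not change the result.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical raw scores for the ten output neurons (digits 0 to 9).
scores = np.array([0.1, 0.2, 0.0, 0.3, 0.1, 4.0, 0.2, 0.1, 0.5, 0.3])
probs = softmax(scores)

# The neuron with the highest probability is the predicted digit.
predicted_digit = int(probs.argmax())
print(predicted_digit)  # 5
```

The largest score always receives the largest probability, so taking the index of the maximum gives the predicted digit.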

In [None]:
# Add ten neurons to the output layer
model.add(kr.layers.Dense(units=10, activation='softmax'))

## Building the model

The compile method is used to build the model from the layers and connections specified in the above cells.<br/>
* The first argument, [categorical_crossentropy](https://keras.io/losses/), is the loss function. It measures how far the predicted output is from the expected label, where each label is a binary vector with one entry per digit. These vectors are created with pre.LabelBinarizer(), to be discussed further in this notebook.
* The second argument sets the optimizer to the [stochastic gradient descent optimizer](https://keras.io/optimizers/). The optimizer controls the learning rate and the decay of this learning rate over time. Passing the string 'sgd' uses the default settings.
* The final argument, [metrics](https://keras.io/metrics/), is used to report the accuracy of the neural network as training progresses.
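
The loss in the first bullet can be sketched by hand. The following NumPy example computes categorical cross-entropy for a single image under the assumption that the label is a binary vector with a single one. The function name, the `eps` clipping value and the two example predictions are illustrative choices, not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0). Because y_true has a single one,
    # only the probability given to the true digit contributes.
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.sum(y_true * np.log(y_pred)))

# Expected label: the digit five.
y_true = np.zeros(10)
y_true[5] = 1.0

# A confident, correct prediction (0.91 on digit five, 0.01 elsewhere).
confident = np.full(10, 0.01)
confident[5] = 0.91

# A prediction that spreads probability evenly over all ten digits.
unsure = np.full(10, 0.1)

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

The loss shrinks towards zero as the probability assigned to the correct digit approaches one, which is the quantity gradient descent tries to minimise.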

In [4]:
# Build the graph.
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

## Opening the files in .gz format

As discussed in my previous [mnist notebook](https://github.com/kevgleeson78/Emerge-tech-assign/blob/master/Mnist%20Dataset.ipynb), the gzipped files are opened and read using the gzip package.
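
The unzipped files are in the IDX format, which starts with a short header of big-endian 32-bit integers: a magic number followed by one size per dimension. The sketch below parses such a header with the standard struct module. The helper name `parse_idx_header` is hypothetical, and the bytes are synthetic stand-ins built only for illustration, so the example runs without the data files.

```python
import struct

def parse_idx_header(raw):
    # The magic number is a big-endian 32-bit integer whose lowest byte
    # is the number of dimensions; one 32-bit size per dimension follows.
    magic = struct.unpack('>I', raw[:4])[0]
    n_dims = magic & 0xFF
    dims = struct.unpack('>' + 'I' * n_dims, raw[4:4 + 4 * n_dims])
    header_size = 4 + 4 * n_dims
    return magic, dims, header_size

# Synthetic headers shaped like the MNIST training files (no real pixel data).
fake_images = struct.pack('>IIII', 2051, 60000, 28, 28) + bytes(28 * 28)
fake_labels = struct.pack('>II', 2049, 60000) + bytes(10)

print(parse_idx_header(fake_images))  # (2051, (60000, 28, 28), 16)
print(parse_idx_header(fake_labels))  # (2049, (60000,), 8)
```

The header sizes of 16 and 8 bytes are the offsets that have to be skipped before the pixel and label bytes begin.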

In [5]:
# Open the gzipped files and read as bytes.
with gzip.open('data/train-images-idx3-ubyte.gz', 'rb') as f:
    train_img = f.read()

with gzip.open('data/train-labels-idx1-ubyte.gz', 'rb') as f:
    train_lbl = f.read()

## Reading in the data into memory

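Each of the 60000 images and labels is then stored in its respective variable. The first 16 bytes of the image file and the first 8 bytes of the label file are header information and are skipped.<br/>
The ~ operator inverts each grey scale byte, and dividing by 255 converts it to a value between zero and one.<br/>
These values are then fed to the neural network, whose output layer uses the softmax function.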
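
The effect of the ~ operator and the division in the next cell can be seen on a tiny array. This is a minimal sketch with three made-up pixel values rather than real image data.

```python
import numpy as np

# Three sample grey scale bytes: 0, 128 and 255.
pixels = np.array([0, 128, 255], dtype=np.uint8)

# On unsigned bytes, bitwise NOT is the same as 255 - value.
inverted = ~pixels
print(inverted)  # [255 127   0]

# Dividing by 255.0 produces floats between zero and one.
scaled = inverted / 255.0
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Because of the inversion, a raw byte of 0 becomes 1.0 after scaling, which is why the first row printed later in this notebook is filled with ones.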

In [6]:
# Read all images and labels into memory, skipping the 16 byte and 8 byte file headers
train_img = ~np.array(list(train_img[16:])).reshape(60000, 28, 28).astype(np.uint8) / 255.0
train_lbl =  np.array(list(train_lbl[8:])).astype(np.uint8)

## Flattening the data into a single array
The data is converted from a three dimensional array (60000, 28, 28) to a two dimensional array (60000, 784), so that the 784 (28 * 28) bytes of each image are stored sequentially one after another.<br/>
This technique is used so that each byte of the image has a one to one mapping to the neural network's input layer.
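
The same reshape can be tried on a much smaller array. In this sketch two tiny 2 x 3 "images" stand in for the 60000 images of 28 x 28 pixels; the sizes are assumptions chosen so the result is easy to read.

```python
import numpy as np

# Two 2 x 3 images holding the values 0 to 11.
images = np.arange(12).reshape(2, 2, 3)

# Flatten each image into a single row of 6 values.
flat = images.reshape(2, 6)
print(flat[0])  # [0 1 2 3 4 5]

# Passing -1 lets NumPy work out the row length from the data.
flat_inferred = images.reshape(len(images), -1)
print(flat_inferred.shape)  # (2, 6)
```

The rows of each image are simply placed one after another, so no pixel values are changed or lost.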

In [7]:
# Flatten the array so the inputs can be mapped to the input neurons
inputs = train_img.reshape(60000, 784)
inputs[0:1]

array([[1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.  

## Encoding the data
The label data is encoded into a 10 x 10 matrix that represents the digits in binary format.<br/>
First we need to set up the encoder using the LabelBinarizer class.<br/>
The fit function takes the training labels as an argument. As the labels range from zero to nine, the (encoder.fit) function learns these ten classes, so each digit is encoded as one row of a 10 x 10 matrix.

In [8]:
# encode the labels into binary format
encoder = pre.LabelBinarizer()
# get the size of the array needed for each category
encoder.fit(train_lbl)

LabelBinarizer(neg_label=0, pos_label=1, sparse_output=False)

## Transforming the labels
The labels are then transformed to a binary value based on the decimal value of the label.<br/>
With each number being transformed to the following:
* (0) 1000000000
* (1) 0100000000
* (2) 0010000000

And so on until we reach the number nine which is 0000000001.<br/>
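
The same encoding can be reproduced with NumPy alone, which may help to show what the transform is doing. This is an alternative sketch, not what the notebook uses: it assumes the labels are the integers zero to nine, and the sample labels are made up.

```python
import numpy as np

# A few sample labels.
labels = np.array([5, 0, 4, 1, 9])

# Row i of the 10 x 10 identity matrix is the encoding of digit i,
# so indexing by the labels picks one row per label.
one_hot = np.eye(10, dtype=int)[labels]
print(one_hot[0])  # [0 0 0 0 0 1 0 0 0 0]

# The position of the single one recovers the original digit.
decoded = one_hot.argmax(axis=1)
print(decoded)  # [5 0 4 1 9]
```

Each row contains exactly one 1, in the column matching the digit.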


In the below example the number five has the representation of '0 0 0 0 0 1 0 0 0 0'

In [9]:
# encode each label to be used as binary outputs
outputs = encoder.transform(train_lbl)
# print out the integer value and the new representation of the number
print(train_lbl[0], outputs[0])

5 [0 0 0 0 0 1 0 0 0 0]


### Full example
Below is a full view of the matrix.

In [10]:
# print out each array
for i in range(10):
    print(i, encoder.transform([i]))

0 [[1 0 0 0 0 0 0 0 0 0]]
1 [[0 1 0 0 0 0 0 0 0 0]]
2 [[0 0 1 0 0 0 0 0 0 0]]
3 [[0 0 0 1 0 0 0 0 0 0]]
4 [[0 0 0 0 1 0 0 0 0 0]]
5 [[0 0 0 0 0 1 0 0 0 0]]
6 [[0 0 0 0 0 0 1 0 0 0]]
7 [[0 0 0 0 0 0 0 1 0 0]]
8 [[0 0 0 0 0 0 0 0 1 0]]
9 [[0 0 0 0 0 0 0 0 0 1]]


## Training the model
We are now ready to begin training the network to recognise the images.<br/>
The training set of 60000 images is passed to the network's input layer of 784 neurons.<br/>
Model parameters:
1. The flattened training images are sent as input
2. The encoded training labels are attached as the expected output
3. epochs is the number of times the 60000 images will be processed
4. The batch size sets the number of images that will be sent to the network as one unit
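
To see how the epochs and batch size interact, the short calculation below works out how many batches are processed, using the values passed to the fit call in the next cell. The variable names are assumptions for illustration; the weights are updated once per batch.

```python
import math

n_images = 60000   # size of the training set
batch_size = 100   # images sent to the network as one unit
epochs = 20        # full passes over the training set

# One weight update happens per batch.
updates_per_epoch = math.ceil(n_images / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)  # 600 12000
```

A smaller batch size gives more weight updates per epoch, while a larger one gives fewer updates that are each based on more images.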


In [11]:
# Start the training
# Pass the flattened images as inputs and the encoded labels as expected outputs
# The epochs value is the number of passes over the full training set
# The batch_size value is the number of images sent to the network at one time
model.fit(inputs, outputs, epochs=20, batch_size=100)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1a47f204278>