# Digit recognition with the MNIST dataset

This notebook investigates the classification and identification of handwritten digits using a neural network.<br/>
The MNIST dataset is first used to train the network and then to test the network's performance in recognising a digit.<br/>
Once training has completed, twenty random test images from the dataset are passed to the network and each result is displayed on screen along with the actual digit expected.<br/>
![Mnist Image](https://corochann.com/wp-content/uploads/2017/02/mnist_plot-800x600.png)
<cite>Image source https://corochann.com/wp-content/uploads/2017/02/mnist_plot-800x600.png</cite>

## Saving the model
For convenience the trained model has been saved to a file. This can be loaded from the user input prompt, or overwritten if you want to rerun the training.<br/>

The saved model file is in the main folder of the repository. For it to work you will need to move this file into the data folder you have created for the gzip files.<br/>




## Packages needed for the program to run

The following packages need to be imported for creating the network and loading the images into memory:
* The keras package, used for creating the network
* The gzip package, used for unzipping the dataset images and labels
* The numpy package, used for converting the dataset into numpy arrays
* The sklearn pre-processing package, used for binary encoding the label of each digit
* The random package, used to generate a random index for selecting test images
* tkinter, for uploading your own image files
* pathlib, for checking the existence of the saved model file
* The image module from keras.preprocessing, used for loading and resizing uploaded images



In [1]:
# Import keras for building the neural network
import keras as kr

# Import gzip for unpacking the images and labels
import gzip

# Import numpy
import numpy as np

# Import sklearn for categorising each digit
import sklearn.preprocessing as pre

# Import image for loading and resizing images
from keras.preprocessing import image

# tkinter for selecting images
import tkinter as tk
# For opening file upload dialog
from tkinter import filedialog

# Used for checking if the model file exists.
from pathlib import Path

Using TensorFlow backend.


## Building the neural network
To begin we need to initialise the network using the sequential model.<br/>
This allows us to add layers as we need them. <br/>
These layers can be tweaked to increase performance.<br/>
We will investigate this later in the notebook.



In [2]:
# The code in this script was mainly Adapted from: https://raw.githubusercontent.com/ianmcloughlin/jupyter-teaching-notebooks/master/mnist.ipynb
# Initialise the neural network
model = kr.models.Sequential()

## Adding the layers to the network
To add layers to the network the layers module from keras will be used.<br/>
There will be a dense connection between neurons, meaning that every neuron in the input layer is connected to every neuron in the middle layer and every neuron in the middle layer is connected to every neuron in the output layer.

![Neural Network](https://cdn-images-1.medium.com/max/800/1*jYhgQ4I_oFdxgDD-AOgV1w.png)
<cite>Image source https://cdn-images-1.medium.com/max/800/1*jYhgQ4I_oFdxgDD-AOgV1w.png</cite>

* In the below code segment the units attribute sets the number of neurons in the middle layer. In this case we have 1000 neurons.<br/>
* The activation attribute sets the activation function. In this case we are using [relu activation](https://keras.io/activations/). Relu is cheap to compute and its gradient does not shrink for positive inputs, which tends to speed up the training process without a loss of performance.

* The final attribute sets the number of input neurons the network has. In the below example the number is set to 784, as this is equal to the number of pixels (28 x 28) in each image of the MNIST dataset. Each pixel is stored as one byte.
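
To make the idea of a dense layer with relu activation more concrete, the following is a minimal numpy sketch of a single forward pass. It is not how keras implements the layer internally; the tiny layer sizes, the random weights and the names `relu`, `weights`, `biases` and `hidden` are assumptions chosen for illustration.

```python
import numpy as np

def relu(x):
    # relu keeps positive values and replaces negative values with zero
    return np.maximum(0.0, x)

# Hypothetical tiny layer: 4 inputs feeding 3 neurons
# (the notebook uses 784 inputs and 1000 neurons)
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))   # one weight per input-to-neuron connection
biases = np.zeros(3)                # one bias per neuron

x = np.array([[0.0, 0.5, 1.0, 0.25]])   # a single flattened "image"
hidden = relu(x @ weights + biases)      # weighted sum followed by the activation

print(hidden.shape)

# Trainable parameters in the notebook's middle layer: weights plus biases
hidden_params = 784 * 1000 + 1000
print(hidden_params)
```

Because every input is connected to every neuron, the number of weights grows as inputs multiplied by neurons, which is why the 784 to 1000 layer already holds 785000 trainable values.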


In [3]:
# Add a hidden (middle) layer with 1000 neurons and an input layer with 784.
# There are 784 input neurons as this value is equal to the number of pixels (28 x 28) in each image.
model.add(kr.layers.Dense(units=1000, activation='relu', input_dim=784))


## Output layer
The output layer has ten neurons, one for each of the ten digit labels (zero to nine) in the dataset. The predicted results are sent from the middle layer to the output layer and compared to the actual number that was sent in as image data.<br/>
The closer the output for the correct digit is to one, the better the network is performing.<br/>
As this process repeats, gradient descent moves the loss down the slope towards its lowest point.<br/>
The process ends when all of the epochs have completed, which will be explained later in this notebook.
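
The softmax activation used on the output layer turns the ten raw scores into values between zero and one that sum to one. The sketch below shows the calculation in plain numpy; the `scores` values are made up for illustration and the function is a simplified stand-in, not the keras implementation.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum first for numerical stability
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    # Divide by the row total so each row sums to one
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical raw scores for the ten output neurons
scores = np.array([[1.0, 2.0, 0.5, 0.1, 0.0, 3.0, 0.2, 0.3, 0.1, 0.4]])
probs = softmax(scores)

print(probs.sum())            # approximately 1.0
print(probs.argmax(axis=1))   # index of the largest score
```

The neuron with the largest raw score keeps the largest probability, so the predicted digit is simply the index of the highest output.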


In [4]:
# Add ten neurons to the output layer
model.add(kr.layers.Dense(units=10, activation='softmax'))

## Building the model

The compile method is used to build the model from the layers and connections specified in the above cells.<br/>
* The first argument, [categorical_crossentropy](https://keras.io/losses/), is the loss function. It measures how far the predicted output is from the binary vector that represents the correct digit. These vectors are created with pre.LabelBinarizer(), discussed further in this notebook.
* The second argument, optimizer, is set to the [stochastic gradient descent optimizer](https://keras.io/optimizers/). The optimizer controls the learning rate and the decay of this learning rate over time; passing the string 'sgd' uses the default settings.
* The final argument, [metrics](https://keras.io/metrics/), sets which performance measures are reported during training. Here the accuracy of the network is output for each epoch.
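
As a rough illustration of what the categorical cross-entropy loss measures, the sketch below computes it by hand for one label. The function, the clipping constant `eps` and the two example predictions are assumptions for illustration and are not taken from keras.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid taking the log of zero
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Only the position of the correct digit contributes to the sum
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Binary vector for the digit five
y_true = np.array([[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]])

# Hypothetical confident prediction: 0.91 on the correct digit
confident = np.full((1, 10), 0.01)
confident[0, 5] = 0.91

# Hypothetical undecided prediction: 0.1 on every digit
unsure = np.full((1, 10), 0.1)

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The confident prediction gives a much smaller loss than the undecided one, which is the signal gradient descent uses to adjust the weights.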

In [5]:
# Build the graph.
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

## Opening the files in .gz format

As discussed in my previous [mnist notebook](https://github.com/kevgleeson78/Emerge-tech-assign/blob/master/Mnist%20Dataset.ipynb) the gzipped files are opened and read using the gzip package.


In [6]:
# Open the gzipped files and read as bytes.
with gzip.open('data/train-images-idx3-ubyte.gz', 'rb') as f:
    train_img = f.read()

with gzip.open('data/train-labels-idx1-ubyte.gz', 'rb') as f:
    train_lbl = f.read()

## Reading in the data into memory

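Each of the 60000 images and labels is then stored in its respective variable.<br/>
The first 16 bytes of the image file and the first 8 bytes of the label file hold header information, so they are skipped.<br/>
The ~ operator inverts each pixel value, which is why the background pixels end up with the value one.<br/>
We divide by 255 to convert each greyscale value to a value between zero and one.<br/>
These values are then used as the inputs to the neural network.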
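
The steps described above can be tried on a few made-up bytes. This sketch uses `np.frombuffer` as an alternative to the `np.array(list(...))` call in the notebook; the 16 byte header, the two 2 x 2 "images" and all variable names are hypothetical stand-ins for the real file contents.

```python
import numpy as np

# Hypothetical stand-in for the unzipped file: 16 header bytes then 2 tiny images
header = bytes(16)
pixels = bytes([0, 255, 128, 64] * 2)   # two images of 2 x 2 pixels
raw = header + pixels

# Skip the header and view the remaining bytes as unsigned 8-bit integers
data = np.frombuffer(raw, dtype=np.uint8, offset=16).reshape(2, 2, 2)

# ~ on uint8 maps each value v to 255 - v, which inverts the image
inverted = ~data

# Scale to the range zero to one
scaled = inverted / 255.0

print(inverted[0])
print(scaled.min(), scaled.max())
```

A pixel value of 0 becomes 255 after inversion and 1.0 after scaling, which matches the run of ones seen at the start of the first training image.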


In [7]:
# read in all images and labels into memory
train_img = ~np.array(list(train_img[16:])).reshape(60000, 28, 28).astype(np.uint8) / 255.0
train_lbl =  np.array(list(train_lbl[8:])).astype(np.uint8)

## Flattening the data into a single array
The data is converted from a three dimensional array (60000 x 28 x 28) to a two dimensional array (60000 x 784), so that the 784 (28 * 28) pixel values of each image are stored sequentially one after another.<br/>
This technique is used so that each pixel value of an image has a one to one mapping to a neuron in the network's input layer.
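
A small sketch of the same reshape on two made-up 3 x 3 "images" shows that flattening only changes the layout, not the values. The array names are illustrative.

```python
import numpy as np

# Two hypothetical 3 x 3 images holding the values 0 to 17
images = np.arange(2 * 3 * 3).reshape(2, 3, 3)

# Flatten each image into a single row of 9 values
flat = images.reshape(2, 9)
print(flat[1])

# Reshaping back recovers the original images exactly
restored = flat.reshape(2, 3, 3)
print((restored == images).all())
```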


In [8]:
# Flatten the array so the inputs can be mapped to the input neurons
inputs = train_img.reshape(60000, 784)
inputs[0:1]

array([[1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.  

## Encoding the data
The label data is encoded as binary vectors of length ten, with one position for each digit. Taken together, the ten possible encodings form a 10 x 10 matrix.<br/>
First we need to set up the encoder using the LabelBinarizer class.<br/>
The fit function takes the training labels as an argument. As the labels range from zero to nine, the (encoder.fit) function learns these ten classes, so each label is encoded as one row of a 10 x 10 matrix.
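
For comparison, the same kind of binary encoding can be sketched with numpy alone by indexing into a 10 x 10 identity matrix. This is an illustrative equivalent, not what LabelBinarizer does internally, and the example labels are made up.

```python
import numpy as np

# A few example labels
labels = np.array([5, 0, 4, 1, 9])

# Row i of the 10 x 10 identity matrix is the binary vector for digit i
one_hot = np.eye(10, dtype=int)[labels]
print(one_hot[0])

# argmax reverses the encoding and recovers the original labels
decoded = one_hot.argmax(axis=1)
print(decoded)
```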


In [9]:
# encode the labels into binary format
encoder = pre.LabelBinarizer()
# get the size of the array needed for each category
encoder.fit(train_lbl)

LabelBinarizer(neg_label=0, pos_label=1, sparse_output=False)

## Transforming the labels
The labels are then transformed to a binary value based on the decimal value of the label.<br/>
Each number is transformed as follows:
* (0) 1000000000
* (1) 0100000000
* (2) 0010000000

And so on until we reach the number nine, which is 0000000001.<br/>


In the below example the number five has the representation of '0 0 0 0 0 1 0 0 0 0'


In [10]:
# encode each label to be used as binary outputs
outputs = encoder.transform(train_lbl)
# print out the integer value and the new representation of the number
print(train_lbl[0], outputs[0])

5 [0 0 0 0 0 1 0 0 0 0]


### Full example
Below is a full view of the matrix.

In [11]:
# print out each array
for i in range(10):
    print(i, encoder.transform([i]))

0 [[1 0 0 0 0 0 0 0 0 0]]
1 [[0 1 0 0 0 0 0 0 0 0]]
2 [[0 0 1 0 0 0 0 0 0 0]]
3 [[0 0 0 1 0 0 0 0 0 0]]
4 [[0 0 0 0 1 0 0 0 0 0]]
5 [[0 0 0 0 0 1 0 0 0 0]]
6 [[0 0 0 0 0 0 1 0 0 0]]
7 [[0 0 0 0 0 0 0 1 0 0]]
8 [[0 0 0 0 0 0 0 0 1 0]]
9 [[0 0 0 0 0 0 0 0 0 1]]


## Training the model
We are now ready to begin training the network to recognise the images.<br/>
The training set of 60000 images is passed to the network's input layer of 784 neurons.<br/>
Model parameters:
1. The flattened training images are sent as input
2. The encoded training labels are attached as the expected output
3. Epochs is the number of times the 60000 images will be processed
4. The batch size sets the number of images that will be sent to the network as one unit
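
As a quick worked example of how epochs and batch size interact, the arithmetic below uses the values passed to model.fit in this notebook (50 epochs, batches of 100 images). The variable names are illustrative.

```python
import math

num_images = 60000   # size of the training set
batch_size = 100     # images sent to the network as one unit
epochs = 50          # full passes over the training set

# Number of batches needed to cover the training set once
batches_per_epoch = math.ceil(num_images / batch_size)

# Each batch results in one update of the weights
total_updates = batches_per_epoch * epochs

print(batches_per_epoch)   # 600
print(total_updates)       # 30000
```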

## train_model function
This function provides the option to start training. If the function is run, the final model is saved to a file in the data folder. This file can then be loaded for future use rather than training the model every time we run the program. The user will be prompted to choose whether to load the model or start training.


In [12]:
# Train model function for user input
def train_model():
    # Start the training
    # Fit the model to the training inputs and their expected outputs
    # The epochs value is the number of passes made over the training set
    # The batch_size value is the number of images sent at one time to the network
    model.fit(inputs, outputs, epochs=50, batch_size=100)
    # adapted from https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model
    # save the model to a file for future use
    model.save("data/my_model.h5")

## Results of the training

After 50 epochs the network is getting approximately 96.5% of the images in the training set correct.<br/>
With each epoch the performance increases slightly and the loss converges towards zero (the optimal value).

## Saving the model
The train_model function saves the trained model to a file for future access. This file is saved to the same data folder you have created for the gzip files.




## load_model function
This function is called if the user chooses to load the model from the input prompt.<br/>
If called, the model file is loaded and we can begin testing straight away.

In [13]:
# function for loading the model weights file
def load_model():
    model.load_weights("data/my_model.h5")

## User input for loading model or run training
The below input prompt asks the user to choose between loading the model file and training the model.<br/>
If the user chooses 'y' a further condition is used to check if the file exists.<br/>
If the file exists the load_model() function is called.<br/>
If the file does not exist the train_model() function is called and training begins.<br/>
Once training has completed the saved model file will be created for future use.
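
The decision described above can be separated from the keras calls and checked on its own. The sketch below is one possible way to express it; the helper `choose_action` and its return values are hypothetical and do not appear in the notebook's code.

```python
from pathlib import Path
import tempfile

def choose_action(option, model_path):
    """Return 'load' or 'train' for a valid option, or None otherwise."""
    if option == 'y':
        # Only load when the saved model file is actually present
        return 'load' if Path(model_path).is_file() else 'train'
    if option == 'n':
        return 'train'
    # Any other input is not recognised
    return None

with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "my_model.h5"
    # The file does not exist yet, so 'y' falls back to training
    result_missing = choose_action('y', model_file)
    # Create an empty placeholder file and ask again
    model_file.touch()
    result_present = choose_action('y', model_file)

print(result_missing, result_present)
```

Returning None for unrecognised input makes it easy to ask the user again instead of silently doing nothing.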

In [14]:
# Prompt user to load or re-run the training.
option = input("Please choose an option and press enter: \n"
               "Load model = y/ train model = n \n")
if option == 'y':
    # Adapted from: https://stackoverflow.com/questions/82831/how-do-i-check-whether-a-file-exists-without-exceptions
    # Check if the model file exists
    my_file = Path("data/my_model.h5")
    if my_file.is_file():
        load_model()
    else:
        print("No file exists running training.....")
        train_model()
elif option == 'n':
    train_model()

Please choose an option and press enter: 
Load model = y/ train model = n 
y
No file exists running training.....
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


## Testing the network with test images

The test images and labels are unzipped and stored in memory using the same methods as the training images and labels.<br/>

A single image can then be sent to the network to see if it is identifying the number correctly.


In [15]:
# open the gzipped test images and labels
# Adapted from : https://docs.python.org/2/library/gzip.html
with gzip.open('data/t10k-images-idx3-ubyte.gz', 'rb') as f:
    test_img = f.read()

with gzip.open('data/t10k-labels-idx1-ubyte.gz', 'rb') as f:
    test_lbl = f.read()

# Store each image and label into memory
# Adapted from: https://raw.githubusercontent.com/ianmcloughlin/jupyter-teaching-notebooks/master/mnist.ipynb
test_img = ~np.array(list(test_img[16:])).reshape(10000, 784).astype(np.uint8) / 255.0
test_lbl =  np.array(list(test_lbl[ 8:])).astype(np.uint8)

## Show the performance results

The result below shows how many of the 10000 test images have been identified correctly. In the run shown, 9610 images (about 96.1%) were correct, which is close to the final accuracy of the training output.
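
The same counting approach can be followed on a tiny made-up example. Here argmax stands in for encoder.inverse_transform, and the four predictions and labels are hypothetical.

```python
import numpy as np

# Hypothetical predicted values for 4 images and 3 classes
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])

# Pick the class with the highest value for each image
predicted = probs.argmax(axis=1)

# Count the matches and convert the count to an accuracy
correct = int((predicted == labels).sum())
accuracy = correct / len(labels)

print(correct, accuracy)   # 3 0.75
```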

In [16]:
# Show the total number of correct images identified out of 10000 test images
performance = (encoder.inverse_transform(model.predict(test_img)) == test_lbl).sum()
print("The correct number of predictions: ", performance)

The correct number of predictions:  9610


## Passing an image to the network
In the below example the image at index 128 of the test set is passed to the network for identification.<br/>
The result is then printed out as an array for us to examine.<br/>
The index with the highest value within this array represents the number that has been picked by the network.<br/>
In this case the number identified by the network is eight, as the ninth position in the array (index eight) holds the highest value.



In [17]:
test = model.predict(test_img[128:129])
# Print the prediction array
print(test)

[[5.42031330e-07 2.47588218e-07 4.38852876e-05 5.94604900e-03
  1.35175935e-06 1.73228516e-04 1.33408633e-07 3.06217498e-06
  9.93728638e-01 1.02859485e-04]]


## Printing out the results.

We can get the array index with the highest value by using the argmax function.<br/>
In this case the index returned is eight.<br/>
Additionally, the label for the image can be accessed using the same index as used for the test image.
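
Applying argmax to the result array printed earlier in this notebook shows the idea without needing the trained model. The runner-up lookup with argsort is an extra illustration and is not part of the notebook's code.

```python
import numpy as np

# The result array printed for the test image at index 128
prediction = np.array([[5.42031330e-07, 2.47588218e-07, 4.38852876e-05, 5.94604900e-03,
                        1.35175935e-06, 1.73228516e-04, 1.33408633e-07, 3.06217498e-06,
                        9.93728638e-01, 1.02859485e-04]])

# Index of the highest value in the row
digit = int(prediction.argmax(axis=1)[0])
print(digit)   # 8

# Indices sorted from highest to lowest value; the first two are the top candidates
top_two = np.argsort(prediction[0])[::-1][:2].tolist()
print(top_two)   # [8, 3]
```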

In [18]:
# Get the maximum value from the machine prediction
pred_result = test.argmax(axis=1)

print("The machine prediction is : =>> ",  pred_result)
print("The actual number is : =>> ", test_lbl[128:129])

The machine prediction is : =>>  [8]
The actual number is : =>>  [8]


## Testing some more images 

Below we will test twenty more images selected at random to see if the network is performing as expected.

In [19]:
# Get 20 random images from the test set and pass them to the trained model.
# Random int adapted from https://stackoverflow.com/questions/3996904/generate-random-integers-between-0-and-9
from random import randint
# Select 20 images
for i in range(20):
    # The test number
    print("Test Number : ", i+1,"\n")
    # Get a random index between 0 and 9999 (inclusive)
    x = randint(0, 9999)
    # Print the random value
    print("The random index: ", x, "\n")
    print("The result array: ")
    # Predicting the number passed in
    test = model.predict(test_img[x:x+1])
    # Print the result array
    print(test, "\n")
    # Get the maximum value from the machine predictions
    pred_result = test.argmax(axis=1)

    print("The machine prediction is : =>> ",  pred_result)
    print("The actual number is : =>> ", test_lbl[x:x+1])
    print("##############################################")


Test Number :  1 

The random index:  1535 

The result array: 
[[9.99100804e-01 1.00136832e-09 2.81895453e-04 8.22992661e-06
  2.56168367e-08 3.18874227e-04 5.92213200e-06 1.19278575e-05
  1.77004658e-05 2.54621526e-04]] 

The machine prediction is : =>>  [0]
The actual number is : =>>  [0]
##############################################
Test Number :  2 

The random index:  3195 

The result array: 
[[2.4263347e-06 8.6829554e-05 9.9850094e-01 1.2886698e-03 6.0070941e-11
  1.4896189e-06 9.2124601e-06 1.5644073e-11 1.1047465e-04 7.9177123e-11]] 

The machine prediction is : =>>  [2]
The actual number is : =>>  [2]
##############################################
Test Number :  3 

The random index:  2148 

The result array: 
[[3.8887080e-04 5.1244198e-05 4.0077833e-03 4.6792170e-03 6.2458205e-01
  3.1202249e-04 3.3932473e-04 4.0026479e-03 6.9639628e-04 3.6094037e-01]] 

The machine prediction is : =>>  [4]
The actual number is : =>>  [4]
##############################################
Test

## Result
The above output shows that the network identified all of the numbers correctly on this run.<br/>
However, at roughly 96% accuracy it can be expected to get, on average, three to four predictions wrong out of every 100 test images.
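
A rough back-of-the-envelope calculation shows why a run of twenty correct predictions is plausible but not guaranteed. It assumes the 96.5% figure quoted above and, as a simplification, that each prediction is right or wrong independently of the others.

```python
accuracy = 0.965   # accuracy quoted above
n = 20             # number of random test images

# Chance that all n predictions are correct, assuming independence
p_all_correct = accuracy ** n

# Average number of wrong predictions expected in n images
expected_errors = n * (1 - accuracy)

print(round(p_all_correct, 2))
print(round(expected_errors, 2))
```

Under these assumptions roughly half of all runs of twenty images would contain at least one wrong prediction.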


## Uploading your own images for testing
The code below allows the user to upload their own images of any size.<br/>
When prompted, if the user chooses 'y' tkinter's file dialog will open where the user can choose an image file to test with the network.<br/>
The image is then converted to greyscale and resized to 28 x 28 pixels.<br/>
Finally the image is converted to an array and flattened to 784 values.<br/>
This puts the image in the same format as the MNIST data used for training, so it can be passed to the network for identification. As the training images were inverted, an uploaded image should show a dark digit on a light background.



In [20]:
# Uploading your own images for testing
# input file adapted from https://stackoverflow.com/questions/9319317/quick-and-easy-file-dialog-in-python
def file_upload():
    # tkinter for uploading file
    root = tk.Tk()
    root.withdraw()
    
    # -topmost, True to ensure the upload screen appears on top of the current window.
    #Adapted from: https://stackoverflow.com/questions/31778176/how-do-i-get-tkinter-askopenfilename-to-open-on-top-of-other-windows
    root.attributes("-topmost", True)
    
    # get the file path from the chosen file in the dialog box
    file_path = filedialog.askopenfilename()
    
    # Adapted from https://towardsdatascience.com/basics-of-image-classification-with-keras-43779a299c8b
    # Load the chosen image as greyscale and resize it to 28 x 28 pixels
    img = image.load_img(path=file_path, color_mode="grayscale", target_size=(28, 28))

    # Convert the image to an array, flatten it to 784 values and scale to between zero and one
    image1 = image.img_to_array(img).reshape(1, 784).astype(np.uint8) / 255.0

    # Test the network with the new image
    test1 = model.predict(image1)

    # Print out the result of the prediction
    print("The number predicted is : ", test1.argmax(axis=1))

# Boolean for while loop
keep_running = True

while keep_running:
    upload_img = input("upload image? \n"
                   " press 'y' + enter (to upload)\n"
                   "press 'n'  + enter (to exit)")

    if upload_img == 'y':
        # Call the file_upload function.
        file_upload()
    elif upload_img == 'n':
        # exit the loop.
        keep_running = False
       

upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [9]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [8]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [2]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [1]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [7]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [7]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [7]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [7]
upload image? 
 press 'y' + enter (to upload)
press 'n'  + enter (to exit)y
The number predicted is :  [5]
upload image? 
 press 'y' + enter (to