# Digit Recognition Script
This script recognizes handwritten digit images after training a model on the MNIST dataset. Its main feature is building sequential neural networks with the Keras library.

## Imports
The first section of the script imports the required libraries and modules.

In [None]:
# Adapted from https://github.com/ianmcloughlin/jupyter-teaching-notebooks/blob/master/mnist.ipynb
import keras as kr
from keras.models import load_model
import sklearn.preprocessing as pre
import gzip
import numpy as np
from PIL import Image

## Global Variables
After the imports, the global variables are defined:
* encoder: converts multi-class labels to binary labels (belongs or does not belong to the class). Since the classes are known in advance (the decimal digits), the encoder can be fitted early.
* model: represents the neural network model to load, create, train, test or validate.

These variables are read and modified by several functions.

In [None]:
with gzip.open('data/t10k-labels-idx1-ubyte.gz', 'rb') as f:
	test_lbl = f.read()
# Skip the 8-byte header; the remaining bytes are the labels
test_lbl = np.array(list(test_lbl[8:])).astype(np.uint8)
encoder = pre.LabelBinarizer()
encoder.fit(test_lbl)

# Global model variable
model = None

## Functions
Next come the definitions of the script's functions.

#### Load Function
It prompts the user for an HDF5 file that contains a Keras neural network model. The loaded model is stored in the global variable *model*. As a last step, it displays a summary of the model so the user can see what was loaded.

In [None]:
def load():
	global model
	filename = input("Please enter a HDF5 file to load: ")
	model = load_model(filename)
	model.summary()

#### Construct Function
This function creates and configures a model. It overwrites the *model* global variable, so any model previously stored there is destroyed. If a model already exists, the user is asked for confirmation first.

In [None]:
def construct():
	global model
	if model:
		confirmation = input("\nDo you want to delete the saved model? (y/n) ")
		if confirmation == "y":
			del model
		elif confirmation == "n":
			return
	
	model = kr.models.Sequential()

The predefined network type is sequential, meaning the layers of neurons are stacked one after another.

In [None]:
	model = kr.models.Sequential()

The script creates four layers of neurons:
* input layer: 784 neurons
* second layer: 600 neurons, activation function: linear
* third layer: 400 neurons, activation function: relu
* output layer: 10 neurons, activation function: softmax

The [activation functions](https://keras.io/activations/) can be changed to the ones specified in the Keras documentation.

The compile options are:
* loss function: categorical_crossentropy
* optimizer: adam
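
To make the output activation and the loss function more concrete, here is a minimal NumPy sketch of softmax and categorical cross-entropy. The helper names `softmax` and `categorical_crossentropy` are illustrative only; Keras computes both internally and may differ in numerical details.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    probs = np.clip(probs, eps, 1.0)  # guards against log(0)
    return float(np.mean(-np.sum(one_hot * np.log(probs), axis=-1)))

logits = np.array([[2.0, 1.0, 0.1]])   # made-up scores for three classes
probs = softmax(logits)
target = np.array([[1.0, 0.0, 0.0]])   # true class is index 0
loss = categorical_crossentropy(target, probs)
print(probs, loss)
```

The loss shrinks as the probability assigned to the true class approaches 1, which is what training tries to achieve.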

In [None]:
	model.add(kr.layers.Dense(units=600, activation='linear', input_dim=784))
	model.add(kr.layers.Dense(units=400, activation='relu'))
	model.add(kr.layers.Dense(units=10, activation='softmax'))
	model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
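
As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes each dense layer has one weight per input-unit pair plus one bias per unit; `dense_params` is a hypothetical helper, not part of the script.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [784, 600, 400, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [471000, 240400, 4010] 715410
```

These figures should match the parameter counts reported by `model.summary()` for the three dense layers.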

Finally, a summary of the newly configured model is displayed.

In [None]:
	model.summary()

#### Train Function
As its name suggests, this function trains the neural network stored in the global variable *model*. It first checks that *model* is not empty.

In [None]:
def train():
	global model
	global encoder
	
	if not model:
		print("No model found. Please create or load your model first")
		return

Then it decompresses the training files containing the MNIST handwritten digit images and their respective labels.

In [None]:

	with gzip.open('data/train-images-idx3-ubyte.gz', 'rb') as f:
		train_img = f.read()

	with gzip.open('data/train-labels-idx1-ubyte.gz', 'rb') as f:
		train_lbl = f.read()

The local variables *train_img* and *train_lbl* are then redefined. For *train_img*, the script skips the first 16 header bytes of the images file and stores the images as 60000 matrices of 28 by 28 unsigned bytes. Each image is passed through the bitwise NOT operator: when the raw bytes are read as standard grayscale, MNIST digits appear white on a black background, so the script inverts them to obtain the more common black digit on a white background. Finally, each matrix is divided by the scalar *255.0* to normalize the values to the range 0 to 1.

In [None]:
	
	train_img = ~np.array(list(train_img[16:])).reshape(60000, 28, 28).astype(np.uint8) / 255.0
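
The effect of the inversion and normalization can be checked on a handful of bytes. This is a minimal sketch with made-up pixel values; it reads the bytes with `np.frombuffer`, which is an alternative to the `list(...)` conversion used in the script, not what the script itself does.

```python
import numpy as np

raw = bytes([0, 1, 128, 255])                 # four sample pixel bytes
pixels = np.frombuffer(raw, dtype=np.uint8)

inverted = ~pixels                # bitwise NOT on uint8 is the same as 255 - value
normalized = inverted / 255.0     # floats between 0.0 and 1.0

print(inverted.tolist())          # [255, 254, 127, 0]
print(normalized.min(), normalized.max())  # 0.0 1.0
```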

The digit labels are treated similarly. They are stored in the local variable *train_lbl*, skipping the first 8 header bytes of the labels file. The result is a one-dimensional array of 60000 unsigned bytes, one label per image.

In [None]:
	train_lbl =  np.array(list(train_lbl[ 8:])).astype(np.uint8)
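
The 16-byte and 8-byte offsets follow from the IDX layout of the MNIST files: a 4-byte big-endian magic number whose last byte gives the number of dimensions, followed by one 4-byte size per dimension. The sketch below parses that header from synthetic byte strings; `read_idx_header` is a hypothetical helper and the sample data is made up.

```python
import struct

def read_idx_header(data):
    """Return (magic, dimension sizes, header length) of an IDX byte string."""
    magic = struct.unpack(">I", data[:4])[0]
    n_dims = magic & 0xFF                     # low byte holds the number of dimensions
    dims = struct.unpack(">" + "I" * n_dims, data[4:4 + 4 * n_dims])
    return magic, dims, 4 + 4 * n_dims

# Synthetic label file: magic 2049, 3 labels, then the label bytes
labels = struct.pack(">II", 2049, 3) + bytes([5, 0, 4])
# Synthetic image file: magic 2051, 1 image of 2x2 pixels
images = struct.pack(">IIII", 2051, 1, 2, 2) + bytes([0, 255, 255, 0])

print(read_idx_header(labels))  # (2049, (3,), 8)
print(read_idx_header(images))  # (2051, (1, 2, 2), 16)
```

Reading the sizes from the header instead of hard-coding 60000 or 10000 would also let the same code handle files of other lengths.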

Now the images are flattened into a two-dimensional array of 60000 rows and 784 columns: each row is an image, and each column is one pixel of the 28x28 MNIST image (stored as a normalized float after the division above). This matches each pixel with a single input neuron of the model. Reshaping the whole array at once also avoids nested *for* loops, which helps performance.

In [None]:
	inputs = train_img.reshape(60000, 784)
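
To see where each pixel ends up after flattening, the following sketch reshapes two fake images and checks the mapping. It assumes NumPy's default row-major order, in which pixel (row, col) lands in column `row * 28 + col`.

```python
import numpy as np

images = np.arange(2 * 28 * 28).reshape(2, 28, 28)  # two fake 28x28 images
flat = images.reshape(2, 784)

row, col = 3, 5
# Row-major order: pixel (row, col) of image 1 is at column row * 28 + col
same_pixel = flat[1, row * 28 + col] == images[1, row, col]
print(flat.shape, same_pixel)  # (2, 784) True
```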

In [None]:
	outputs = encoder.transform(train_lbl)
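
The call to `encoder.transform` turns each digit label into a row of ten binary values, one per class. A rough NumPy equivalent, assuming the encoder was fitted on all ten digits 0 to 9, looks like this:

```python
import numpy as np

labels = np.array([5, 0, 4], dtype=np.uint8)
one_hot = np.eye(10, dtype=int)[labels]   # row i has a 1 in column labels[i]
print(one_hot[0])                          # [0 0 0 0 0 1 0 0 0 0]

decoded = one_hot.argmax(axis=1)           # back from rows to digit labels
print(decoded.tolist())                    # [5, 0, 4]
```

Each row then lines up with the 10 output neurons of the model, one per digit.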

A summary of the model to be trained is displayed to the user as a reminder and for verification.

In [None]:
	model.summary()

Training then starts, running for 2 epochs with a batch size of 100.

In [1]:
	model.fit(inputs, outputs, epochs=2, batch_size=100)
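
For a sense of how much work this call does, the number of weight updates can be estimated from the sample count, batch size and epochs. The sketch assumes one update per batch; `updates_per_run` is a hypothetical helper.

```python
import math

def updates_per_run(n_samples, batch_size, epochs):
    """Return (batches per epoch, total batches over all epochs)."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

print(updates_per_run(60000, 100, 2))  # (600, 1200)
```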


#### Test Function
This function tests a model against the MNIST test image set. As good practice, it first checks that the model is not empty.

In [None]:
def test():
	global model
	global encoder
	global test_lbl
	
	if not model:
		print("Empty model. Please create/load a model first")
		return

Then it decompresses the test images file and processes the images the same way the train function does: it skips the first 16 header bytes of the images file, applies the bitwise NOT operator to invert the images (when read as standard grayscale, MNIST digits appear white on a black background, and the script prefers the more common black digit on a white background), and divides the values by the scalar 255.0 to normalize the data. The only difference is that the 10000 images are reshaped directly into rows of 784 pixels, so no separate flattening step is needed.

In [None]:
	with gzip.open('data/t10k-images-idx3-ubyte.gz', 'rb') as f:
		test_img = f.read()
		
	test_img = ~np.array(list(test_img[16:])).reshape(10000, 784).astype(np.uint8) / 255.0

A model summary is displayed as a reminder of the model that is going to be tested.

In [None]:
	model.summary()

The model then predicts over the array of normalized, flattened test images. Since each prediction is an array of floating-point numbers between 0 and 1, the predictions are *inverse transformed* to category labels, that is, to the predicted digits (0 - 9):
```
prediction: [[9.7214666e-05 8.9237624e-01 8.1143016e-03 2.9746909e-03 7.8786700e-04 6.9424585e-02 3.3270712e-03 3.2408212e-04 2.0257998e-02 2.3161303e-03]]
```
The value closest to 1 is the second element of the array, which corresponds to the category label for digit *1*. That is the work `encoder.inverse_transform` does. Each prediction is compared with its respective label, and the number of matches is summed into `result_set`.

In [None]:
	result_set = (encoder.inverse_transform(model.predict(test_img)) == test_lbl).sum()
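
The decoding and counting steps can be reproduced with plain NumPy. The sketch assumes that, for multi-class labels, the inverse transform amounts to picking the index of the largest value; the `predicted` and `actual` arrays are made-up sample data.

```python
import numpy as np

# The example prediction shown above
prediction = np.array([[9.7214666e-05, 8.9237624e-01, 8.1143016e-03, 2.9746909e-03,
                        7.8786700e-04, 6.9424585e-02, 3.3270712e-03, 3.2408212e-04,
                        2.0257998e-02, 2.3161303e-03]])
digits = prediction.argmax(axis=1)        # index of the highest score per row
print(digits.tolist())                    # [1]

predicted = np.array([1, 7, 3, 4])        # made-up decoded predictions
actual = np.array([1, 7, 8, 4])           # made-up true labels
correct = int((predicted == actual).sum())
print(correct, correct / len(actual) * 100)  # 3 75.0
```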

Finally, the success rate is calculated and displayed to the user

In [None]:
	percent = (result_set/10000)*100
	print("\nModel has made", result_set, "successful predictions out of 10000 tests (", percent, "%)")

#### Save Function
This function checks for a model stored in the global variable *model*; if there is one, it saves it to an HDF5 file.

In [None]:
def save():
	global model
	# Save model
	if not model:
		print("There is no model!\nPlease create/load a model first")
		return

	filename = input("Please enter a filename: ")
	model.save(filename)

#### Read PNG Image Function
As before, the function returns to the main menu if the model is empty.

In [None]:
def png_read():
	if not model:
		print("There is no model!\nPlease create/load a model first")
		return

Then it asks the user for the PNG image filename and converts the image to grayscale with the `convert("L")` method.
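
According to the Pillow documentation, converting a color image to mode "L" applies the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000. The NumPy sketch below approximates that conversion on a tiny made-up image; `to_grayscale` is a hypothetical helper, and Pillow's own rounding may differ slightly.

```python
import numpy as np

def to_grayscale(rgb):
    """Approximate the mode "L" conversion for an (H, W, 3) uint8 array."""
    weights = np.array([299, 587, 114]) / 1000.0
    return np.rint(rgb @ weights).astype(np.uint8)

# 2x2 sample image: white, black, pure red, pure blue
rgb = np.array([[[255, 255, 255], [0, 0, 0]],
                [[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
gray = to_grayscale(rgb)
print(gray.tolist())  # [[255, 0], [76, 29]]

# With Pillow, the equivalent step would be roughly Image.open(path).convert("L")
```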

The *img_width* and *img_height* variables are used to scale the image. They are initialized to 28 pixels each.

In [None]:
	img_width = 28
	img_height = 28

The user is asked to input new dimensions that are compatible with the model input size or press 'Enter' to leave as it is and not scale.
            
The `img.thumbnail((img_width,img_height), Image.LANCZOS)` function call performs the scaling when the image's dimensions differ from the processing dimensions. `Image.LANCZOS` is the filter that older Pillow versions also exposed as `Image.ANTIALIAS`, a name that was removed in Pillow 10.

The processing dimensions are displayed to the user.

In [None]:
	if (img_width != img.size[0]) or (img_height != img.size[1]):
		img.thumbnail((img_width,img_height), Image.LANCZOS)

The image is then 'flattened' by multiplying the processing dimensions to obtain the number of inputs. For example, suppose the PNG image is 375 by 375 pixels and the model has 784 input neurons. The product 375x375 = 140625 is clearly not equal to the model's input size of 784. Since the image's aspect ratio is 1:1, it can be scaled down to 28 by 28 pixels, because 28x28 = 784.

In [None]:
	one_dim =  img_width*img_height
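
The arithmetic in this example can be checked with a small helper that recovers the side length of a square image from the model's input size. `square_side` is a hypothetical function, not part of the script, and it only covers the 1:1 aspect ratio case.

```python
import math

def square_side(input_dim):
    """Return the side length if input_dim is a perfect square, else None."""
    side = math.isqrt(input_dim)
    return side if side * side == input_dim else None

print(375 * 375)          # 140625, too many pixels for 784 inputs
print(square_side(784))   # 28
print(square_side(785))   # None
```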

The `im2arr` variable is defined with the pixel bytes of the (possibly scaled) image. It is then 'flattened' following a procedure similar to the one in the previous *train* and *test* functions: it is reshaped into an array of one row and 784 columns, each column holding the unsigned byte of one pixel. The values are divided by the scalar *255.0* to normalize the input.

In [None]:
	im2arr = np.array(img.getdata())
	im2arr = np.array(list(im2arr)).reshape(1, one_dim).astype(np.uint8) / 255.0

Finally, the model makes a prediction, which is inverse transformed to show the predicted class (the digit, in this case) more clearly.

In [None]:
	pred = model.predict(im2arr)
	result_set = encoder.inverse_transform(pred)
	print("The program predicts the image is a:", result_set)

### Menu
The menu displays the operations that the user can perform with this script. There are 7 items to choose from. The selected option is stored in the *choice* variable and then mapped to the function to execute. The functions are the ones described in the previous sections. If an invalid option number (e.g. *0*) or any other invalid character (e.g. *A*) is entered, the script prints an `Invalid choice, enter a valid number.` error and asks for input again. Option *7* exits the script.

In [None]:
while True:
	print("""
	1. Load HDF5 file
	2. Create a NN model
	3. Train with MNIST training images
	4. Test using MNIST testing images
	5. Save model
	6. Read and predict from a PNG file
	7. Exit
	""")
	choice = input("Menu: ")

	if choice == "1":
		load()
	elif choice == "2":
		construct()
	elif choice == "3":
		train()
	elif choice == "4":
		test()
	elif choice == "5":
		save()
	elif choice == "6":
		png_read()
	elif choice == "7":
		# Only option 7 leaves the loop; empty input is reported as invalid
		break
	else:
		print("Invalid choice, enter a valid number.")
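
As an alternative to the chain of `elif` branches, the mapping from option to function could be kept in a dictionary. This is only a sketch of that design, with placeholder actions standing in for the real functions; `make_dispatcher` is a hypothetical helper, not part of the script.

```python
def make_dispatcher(actions):
    """Return a function that runs the action for a choice or reports an invalid one."""
    def dispatch(choice):
        action = actions.get(choice.strip())
        if action is None:
            return "Invalid choice, enter a valid number."
        return action()
    return dispatch

# Placeholder actions; in the script these would be load, construct, train, ...
dispatch = make_dispatcher({"1": lambda: "load", "2": lambda: "construct"})
print(dispatch("1"))   # load
print(dispatch("A"))   # Invalid choice, enter a valid number.
```

Adding a new menu item then only requires a new dictionary entry.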