# Digit Recognition Script
This script recognizes images of handwritten digits after training against the MNIST dataset. Its main feature is that it creates sequential neural networks using the Keras library.


## Structure
This script is designed as an interactive, menu-driven script. It is structured in six different functions, each of which corresponds to one item displayed in the menu.

The script can be divided into three main sections:
* Library/module imports and global variables
* Function definitions
* Menu

### Library/Module Imports and Global Variables
#### Imports
The first section of the script imports the required libraries and modules.

In [None]:
import matplotlib.pyplot as plt # For image debugging purposes

import sklearn.preprocessing as pre # Input preprocessing
import numpy as np # Utility library 
import keras as kr # Tensorflow high-level neural networks API
import gzip # Uncompressing MNIST files

from keras.models import load_model # Loading neural networks from HDF5 files
from PIL import Image # Image processing

#### Global Variables
After the imports, the global variables are defined:
* encoder: converts multi-class labels to binary labels (belongs or does not belong to the class). Since the classes are known in advance (the decimal digits), the encoder can be defined early. It is fitted on the MNIST test labels, which are kept in the global *test_lbl*.
* model: represents the neural network model to load, create, train, test or validate

These variables are read and modified by several functions.

In [None]:
with gzip.open('data/t10k-labels-idx1-ubyte.gz', 'rb') as f: # Decompress the MNIST test labels file
	test_lbl = f.read()
test_lbl = np.array(list(test_lbl[ 8:])).astype(np.uint8) # Skip the 8 header bytes; the labels are the classes for the encoder
encoder = pre.LabelBinarizer() # Define and fit the encoder
encoder.fit(test_lbl)

# Global model variable
model = None

### Function Definitions
Next come the definitions of the script's functions.

#### Load Model Function
It prompts the user for an HDF5 file that contains a Keras neural network model. That model is going to be stored in the global variable *model*. As a last step it displays a summary of the model, so the user can have an idea about the loaded model.

In [None]:
def load():
	global model
	filename = input("Please enter a HDF5 file to load: ")
	model = load_model(filename)
	# DEBUG print(type(model))
	model.summary()

#### Configure Function
This function creates and configures a model. It overwrites the *model* global variable so any previous model stored in this variable is destroyed.

In [None]:
def configure():
	global model
	if model: # Checks if a model already exists
		confirmation = input("\nThe current model is about to be destroyed. Continue? (y/n) ")
		if confirmation != "y":
			return # Exits to main menu unless the user confirms with "y"
		del model # Wipes the previous model

The predefined neural network type is sequential, meaning one layer of neurons after another.

In [None]:
	model = kr.models.Sequential() # Defines the model as a sequential neural network

The script creates a minimum of three layers:
* input layer: defaults to 784 neurons
* hidden layer(s): defaults to 600 neurons
* output layer: always has 10 neurons (digits 0 to 9)

This script allows the user to input the number of neurons and the activation function for each layer. The [activation functions](https://keras.io/activations/) are the ones specified in the Keras documentation.

In [None]:
	print("Model options")
	
	# First layer setup
	input_str = input("\nFirst layer: how many input neurons? (default: 784) ")
	inital_neurons = 784
	if not len(input_str) == 0:
		try:
			inital_neurons = int(input_str)
			if inital_neurons <= 0:
				raise ValueError('InvalidInput')
		except ValueError:
			# Invalid input (non-integer or non-positive): fall back to the default
			inital_neurons = 784
			print('First layer neurons must be a positive integer; using the default (784)')

	# Second layer setup
	neurons = 600
	input_str = input("Second layer: how many neurons? (default: 600) ")
	if not len(input_str) == 0:
		try:
			neurons = int(input_str)
			if neurons <= 0:
				raise ValueError('InvalidInput')
		except ValueError:
			# Invalid input (non-integer or non-positive): fall back to the default
			neurons = 600
			print('Second layer neurons must be a positive integer; using the default (600)')

	# Second layer activation function input
	print("Second layer: which activation function to use? (e.g. linear, sigmoid, elu, selu, relu, softplus, softmax)  ")
	print("More activation functions at https://keras.io/activations/")
	activation_function = input()
	
	# First and second layers definition
	model.add(kr.layers.Dense(units=neurons, activation=activation_function, input_dim=inital_neurons))

A while loop adds further layers and stops when the user no longer wants to add more.

In [None]:
	# While loop for adding more layers
	answer = input("\nAdd another layer? (y/n) ")
	while answer == "y":
		neurons = 400 # After the second layer, neuron default value is 400
		input_str = input("New layer: how many neurons? (default: 400) ")
		if not len(input_str) == 0:
			try:
				neurons = int(input_str)
				if neurons <= 0:
					raise ValueError('InvalidInput')
			except ValueError:
				# Invalid input (non-integer or non-positive): fall back to the default
				neurons = 400
				print('New layer neurons must be a positive integer; using the default (400)')
		
		print("New layer: which activation function to use? (e.g. linear, sigmoid, elu, selu, relu, softplus, softmax)  ")
		print("More activation functions at https://keras.io/activations/")
		activation_function = input()
		
		model.add(kr.layers.Dense(units=neurons, activation=activation_function))
		
		answer = input("\nAdd another layer? (y/n) ")
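
The prompt-and-validate pattern above repeats for every numeric input. As a minimal sketch, it could be factored into a small helper; the name `parse_positive_int` is hypothetical and not part of the original script, and the helper takes the typed text as an argument so it can be tried without an interactive prompt.

```python
def parse_positive_int(text, default):
    """Return text as a positive int, or default when text is empty or invalid."""
    text = text.strip()
    if not text:
        return default
    try:
        value = int(text)
    except ValueError:
        return default
    return value if value > 0 else default


# Values a user might type at the "how many neurons?" prompt
print(parse_positive_int("", 600))     # empty input keeps the default
print(parse_positive_int("128", 600))  # valid input is used
print(parse_positive_int("-5", 600))   # non-positive input falls back to the default
print(parse_positive_int("abc", 600))  # non-numeric input falls back to the default
```

In the script, a call such as `parse_positive_int(input("..."), 600)` would replace each try/except block.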

The output layer is predefined to have 10 neurons. The user only inputs the activation function.

In [None]:
	# Output layer: always has 10 neurons
	print("Last layer: it is set by default to 10 output neurons strictly")
	print("Last layer: which activation function to use? (e.g. linear, sigmoid, elu, selu, relu, softplus, softmax) ")
	print("More activation functions at https://keras.io/activations/")
	activation_function = input()
	model.add(kr.layers.Dense(units=10, activation=activation_function))
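
To get a feel for the size of the network being configured, the number of trainable parameters of a stack of fully connected layers can be worked out by hand. The sketch below assumes the default configuration (784 inputs, one hidden layer of 600 neurons, 10 outputs); `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input size followed by the units of each Dense layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


# Default configuration: 784 inputs -> 600 hidden units -> 10 outputs
default_params = dense_param_count([784, 600, 10])
print(default_params)  # 477010
```

The total should match the parameter count reported by `model.summary()` for the same layer sizes.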

Once the layers are defined, the script asks for the compile options: the loss function and the optimizer. The [loss functions](https://github.com/keras-team/keras/blob/master/keras/losses.py) and the [optimizers](https://keras.io/optimizers/) are the ones specified by the Keras documentation. Then the script proceeds to compile the model.

In [None]:
	print("\nCompile options")
	print("Which loss function to use? (e.g. binary_crossentropy, categorical_crossentropy, mse, mae, mape, msle, kld, cosine) ")
	print("More loss functions at https://github.com/keras-team/keras/blob/master/keras/losses.py")
	loss_function = input()
	
	print("\nWhich optimizer? (e.g sgd, rmsprop, adam, adadelta, adagrad)")
	print("More optimizers at https://keras.io/optimizers/")
	optimizer_value = input()
    
	model.compile(loss=loss_function, optimizer=optimizer_value, metrics=['accuracy'])

Finally, a summary of the newly configured model is displayed.

In [None]:
	model.summary()

#### Train Function
As its name suggests, this function trains the neural network stored in the global variable *model*. It first checks that the global variable *model* is not empty.

In [None]:
def train():
	global model
	global encoder
	
	if not model:
		print("Empty model. Please create/load a model first")
		return

Then it decompresses the training files containing the images of the MNIST handwritten digits and their respective labels.

In [None]:
	with gzip.open('data/train-images-idx3-ubyte.gz', 'rb') as f:
		train_img = f.read()

	with gzip.open('data/train-labels-idx1-ubyte.gz', 'rb') as f:
		train_lbl = f.read()

The local variables *train_img* and *train_lbl* are defined. The variable *train_img* skips the first 16 header bytes of the images file and then stores the images as 60000 matrices of 28 by 28 unsigned bytes. Each image is inverted with the bitwise NOT operator: MNIST digits are white on a black background, so the script inverts them to obtain the more common black digit on a white background. Each matrix is then divided by the scalar *255.0* to normalize the data to the range 0 to 1.

In [None]:
	train_img = ~np.array(list(train_img[16:])).reshape(60000, 28, 28).astype(np.uint8) / 255.0
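
The inversion and normalization steps can be checked on a tiny array. This is a minimal sketch with three made-up pixel values; it relies only on the fact that bitwise NOT on an unsigned byte is the same as subtracting the value from 255.

```python
import numpy as np

# Three sample pixel intensities as unsigned bytes: black, dark gray, white
pixels = np.array([0, 100, 255], dtype=np.uint8)

inverted = ~pixels             # bitwise NOT on uint8 equals 255 - value
normalized = inverted / 255.0  # scale to the range [0.0, 1.0]

print(inverted.tolist())    # [255, 155, 0]
print(normalized.tolist())
```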

The digit labels receive a similar treatment. They are stored in the local variable *train_lbl*, skipping the first 8 header bytes of the labels file. Labels are stored in a one-dimensional array of 60000 unsigned bytes, one label per image.

In [None]:
	train_lbl =  np.array(list(train_lbl[ 8:])).astype(np.uint8)
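
The 16 and 8 bytes skipped above are the headers of the IDX files. As a sketch, assuming the standard IDX layout (a big-endian 32-bit magic number whose last byte is the number of dimensions, followed by one 32-bit size per dimension), the header can be parsed instead of skipped. `parse_idx_header` is a hypothetical helper and the byte string is synthetic rather than read from the MNIST files.

```python
import struct


def parse_idx_header(raw):
    """Return (magic, dimension sizes, payload offset) for an IDX byte string."""
    magic, = struct.unpack(">I", raw[:4])  # big-endian 32-bit integer
    ndim = magic & 0xFF                     # low byte holds the number of dimensions
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    return magic, dims, 4 + 4 * ndim


# Synthetic label file: magic 2049, 3 items, then the labels 7, 2, 1
fake_labels = struct.pack(">II", 2049, 3) + bytes([7, 2, 1])
magic, dims, offset = parse_idx_header(fake_labels)
labels = list(fake_labels[offset:])
print(magic, dims, offset, labels)
```

For a label file the payload offset comes out as 8, and for an image file (three dimensions) as 16, which matches the slices used in the script.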

Now the images are "flattened" into a two-dimensional array of 60000 rows, where each row is an image, and 784 columns, where each column represents a pixel of the 28x28 MNIST image. This is done to match each pixel with a single input neuron of the model. Reshaping the whole array at once also avoids nested *for* loops, which improves performance.

In [None]:
	inputs = train_img.reshape(60000, 784)
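
The effect of the reshape is easier to see at a small scale. In this sketch, two made-up 2x3 "images" stand in for the 60000 MNIST images of 28x28 pixels.

```python
import numpy as np

# Two tiny 2x3 "images" stand in for the 60000 images of 28x28 pixels
images = np.arange(12).reshape(2, 2, 3)
flat = images.reshape(2, 6)  # one row per image, one column per pixel

print(flat.shape)
print(flat[1].tolist())  # pixels of the second image, read row by row
```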

As for the labels, they are encoded in a multi-class binary (one-hot) categorization of the following format:
* 0 is represented as the array \[1, 0, 0, 0, 0, 0, 0, 0, 0, 0\]
* 1 is represented as the array \[0, 1, 0, 0, 0, 0, 0, 0, 0, 0\]
* 2 is represented as the array \[0, 0, 1, 0, 0, 0, 0, 0, 0, 0\]
* 3 is represented as the array \[0, 0, 0, 1, 0, 0, 0, 0, 0, 0\]
* 4 is represented as the array \[0, 0, 0, 0, 1, 0, 0, 0, 0, 0\]
* 5 is represented as the array \[0, 0, 0, 0, 0, 1, 0, 0, 0, 0\]
* 6 is represented as the array \[0, 0, 0, 0, 0, 0, 1, 0, 0, 0\]
* 7 is represented as the array \[0, 0, 0, 0, 0, 0, 0, 1, 0, 0\]
* 8 is represented as the array \[0, 0, 0, 0, 0, 0, 0, 0, 1, 0\]
* 9 is represented as the array \[0, 0, 0, 0, 0, 0, 0, 0, 0, 1\]

There are 60000 labels for the 60000 training images. Each label is encoded in the above fashion and passed to the output layer of 10 neurons when training on an image; in this way the neural network is told whether or not it has achieved the desired classification.

In [None]:
	outputs = encoder.transform(train_lbl)

	# DEBUG print("outputs", outputs, outputs.shape)
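
The encoding listed above can be reproduced with plain NumPy. The sketch below is a stand-in for illustration, not scikit-learn's implementation of `LabelBinarizer`; the helper name `one_hot` is hypothetical and it assumes the labels are integers from 0 to 9.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Encode integer labels as rows with a single 1 at the label's index."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype=np.int64)
    encoded[np.arange(labels.size), labels] = 1
    return encoded


sample = one_hot([5, 0, 9])
print(sample)
```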

A summary of the model to be trained is displayed to the user, for verification and as a reminder.

In [None]:
	model.summary()

Now the script asks the user for the training parameters:
* epoch: an arbitrary cutoff, generally defined as "one pass over the entire dataset", used to separate training into distinct phases, which is useful for logging and periodic evaluation. Defaults to 2
* batch size: the number of samples processed before each weight update. A batch generally approximates the distribution of the input data better than a single input. Defaults to 100

In [None]:
	epoch_input = 2
	input_str = input("Train model: how many epochs? (default: 2) ")
	if not len(input_str) == 0:
		try:
			epoch_input = int(input_str)
			if epoch_input <= 0:
				raise ValueError('InvalidInput')
		except ValueError:
			# Invalid input (non-integer or non-positive): fall back to the default
			epoch_input = 2
			print('Epochs must be a positive integer; using the default (2)')

	batch_input = 100
	input_str = input("Train model: batch size? (default: 100) ")
	if not len(input_str) == 0:
		try:
			batch_input = int(input_str)
			if batch_input <= 0:
				raise ValueError('InvalidInput')
		except ValueError:
			# Invalid input (non-integer or non-positive): fall back to the default
			batch_input = 100
			print('Batch size must be a positive integer; using the default (100)')

The model training starts

In [None]:
	model.fit(inputs, outputs, epochs=epoch_input, batch_size=batch_input)
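
To make the two training parameters concrete, the number of weight updates follows from simple arithmetic: one update per batch, and enough batches per epoch to cover the whole dataset. The sketch below uses the script's defaults; `updates_per_epoch` is a hypothetical helper.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of batches (weight updates) needed to cover the dataset once."""
    return math.ceil(num_samples / batch_size)


steps = updates_per_epoch(60000, 100)  # defaults used by the script
total_updates = steps * 2              # two epochs
print(steps, total_updates)  # 600 1200
```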

After the model is trained, the script asks the user if the model should be saved into an HDF5 file.

In [None]:
	print("\nSave this model into a HDF5 file? (y/n) ")
	save_file = input()
	
	if save_file == "y":
		save()

#### Test Function
This function tests a model against the MNIST test image set. As good practice, it first checks that the model is not empty.

In [None]:
def test():
	global model
	global encoder
	global test_lbl
	
	if not model:
		print("Empty model. Please create/load a model first")
		return

Then it decompresses the test image file and processes the images much as the train function does: it skips the first 16 header bytes of the images file and stores the images directly as 10000 rows of 784 unsigned bytes, so each 28 by 28 image is already flattened. Each image is inverted with the bitwise NOT operator: MNIST digits are white on a black background, so the script inverts them to obtain the more common black digit on a white background. The pixel values are then divided by the scalar 255.0 to normalize the data to the range 0 to 1.

In [None]:
	with gzip.open('data/t10k-images-idx3-ubyte.gz', 'rb') as f:
		test_img = f.read()
		
	test_img = ~np.array(list(test_img[16:])).reshape(10000, 784).astype(np.uint8) / 255.0

A model summary is displayed as a reminder of the model that is going to be tested.

In [None]:
	model.summary()

The model predicts over the array of normalized, flattened test images. Since the predictions are arrays of floating-point numbers between 0 and 1, they are *inverse transformed* to the category label, that is, to the respective predicted digit (0 - 9):
```
prediction: [[9.7214666e-05 8.9237624e-01 8.1143016e-03 2.9746909e-03 7.8786700e-04 6.9424585e-02 3.3270712e-03 3.2408212e-04 2.0257998e-02 2.3161303e-03]]
```
The value closest to 1 is the second element of the array, which corresponds to the label category for digit *1*. That is the work that `encoder.inverse_transform` does. Each prediction is compared to its respective label, and `rs` holds the number of successful predictions.

In [None]:
	rs = (encoder.inverse_transform(model.predict(test_img)) == test_lbl).sum()
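
The decoding and counting step can be illustrated without a trained model. The sketch below assumes the classes are the digits 0 to 9 in order, so that picking the class with the greatest score is the same as taking the index of the largest value in each row; the prediction rows and labels are made up for the example.

```python
import numpy as np

# Hypothetical network outputs for three images (one row per image)
predictions = np.array([
    [0.05, 0.89, 0.01, 0.00, 0.00, 0.03, 0.00, 0.00, 0.02, 0.00],
    [0.10, 0.05, 0.05, 0.60, 0.05, 0.05, 0.02, 0.03, 0.03, 0.02],
    [0.70, 0.02, 0.03, 0.05, 0.05, 0.05, 0.02, 0.03, 0.03, 0.02],
])
true_labels = np.array([1, 3, 8])

predicted_digits = predictions.argmax(axis=1)  # index of the largest score per row
successes = int((predicted_digits == true_labels).sum())
accuracy_pct = successes / len(true_labels) * 100

print(predicted_digits.tolist(), successes, accuracy_pct)
```

Here the third image is misclassified as 0 instead of 8, so two of the three predictions count as successes.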

Finally, the success rate is calculated and displayed to the user

In [None]:
	pct = (rs/10000)*100
	print("\nModel has made", rs, "successful predictions out of 10000 tests (", pct, "%)")

#### Save Function
This function checks for a model stored in the global variable *model*; if there is one, it saves it to an HDF5 file.

In [None]:
def save():
	global model
	# Save model
	if not model:
		print("There is no model!\nPlease create/load a model first")
		return

	filename = input("Please enter a filename: ")
	model.save(filename)

#### Read PNG Image function
As before, the function returns to the main menu if the model is empty.

In [None]:
def png_read():
	if not model:
		print("There is no model!\nPlease create/load a model first")
		return

It then asks the user for the PNG image filename and converts the image to grayscale with the `convert("L")` method. After that it prints the image dimensions (width and height) and warns that the image size must be compatible with the model.

In [None]:
	filename = input("Please enter a PNG image file: ")
	img = Image.open(filename).convert("L")
	
	print("Image width (pixels): ", img.size[0], " Image height (pixels): ", img.size[1])
	print("\n!Notice! Processing width times processing height must equal the amount of input neurons of a model!\n")

The *proc_width* and *proc_height* variables are used to scale the image. They are initialized with the image's dimensions.

In [None]:
	proc_width = img.size[0]
	proc_height = img.size[1]

The user is asked to enter new dimensions that are compatible with the model's input size, or to press Enter to keep the original dimensions and skip scaling.

The `img.thumbnail((proc_width,proc_height), Image.LANCZOS)` call performs the scaling if the image's original dimensions were changed. `Image.LANCZOS` is the current name of the filter formerly exposed as `Image.ANTIALIAS`, which was removed in Pillow 10. Note that `thumbnail` preserves the aspect ratio and never enlarges an image, so the requested dimensions should keep the original ratio.

The processing dimensions are displayed to the user.

In [None]:
	input_str = input("Please enter new image processing width: (Press enter to keep original dimension) ")
	if input_str:
		try:
			proc_width = int(input_str)
		except ValueError:
			# Invalid input: keep the original width
			print('Invalid input')

	input_str = input("Please enter new image processing height: (Press enter to keep original dimension) ")
	if input_str:
		try:
			proc_height = int(input_str)
		except ValueError:
			# Invalid input: keep the original height
			print('Invalid input')

	if (proc_width != img.size[0]) or (proc_height != img.size[1]):
		img.thumbnail((proc_width,proc_height), Image.LANCZOS)

	print("\nProcessing width:", proc_width, "Processing height:", proc_height)

Then the processing dimensions are multiplied to obtain the length of the 'flattened' image. For example, let's say the PNG image dimensions are 375 by 375 pixels and we have a model with 784 input neurons. The product 375x375 = 140625 is clearly not equal to the model's input size of 784. To solve this, since the image's aspect ratio is 1:1, we can scale the image down to 28 by 28 pixels, since 28x28 = 784.

In [None]:
	one_dim =  proc_width*proc_height
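
The compatibility rule from the example above can be written as a one-line check. This is a sketch only; `fits_model` is a hypothetical helper, and 784 is the default number of input neurons rather than a value read from the loaded model.

```python
def fits_model(width, height, input_neurons=784):
    """True when a width x height image flattens to exactly input_neurons pixels."""
    return width * height == input_neurons


print(375 * 375)             # 140625 pixels in the original image
print(fits_model(375, 375))  # False
print(fits_model(28, 28))    # True
```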

The `im2arr` variable is defined with the pixel bytes of the scaled (or unscaled) image. It is then 'flattened' following a procedure similar to the one in the previous *train* and *test* functions: it is reshaped into a two-dimensional array with one row and `one_dim` columns (784 for the default model), each column holding the unsigned byte of one pixel. The values are divided by the scalar *255.0* to normalize the input.

In [None]:
	im2arr = np.array(img.getdata())
	im2arr = np.array(list(im2arr)).reshape(1, one_dim).astype(np.uint8) / 255.0

Finally, the model makes a prediction, which is inverse transformed to display the predicted class (the digit, in this case) more clearly.

In [None]:
	pred = model.predict(im2arr)
	rs = encoder.inverse_transform(pred)
	print("The program predicts that the image is a:", rs)

### Menu
The menu displays the operations that the user can perform with this script. There are 7 items to choose from. The selected option is stored in the *choice* variable and then matched against the menu items to run the corresponding function. The functions are the ones described in the previous sections. If the user enters an invalid option number (e.g. *0*) or any other invalid character (e.g. *A*), the script prints an `Invalid option` message and asks for input again. Option number *7* exits the script.

In [None]:
choice = True
while choice:
	print("""
	1. Load model from HDF5 file
	2. Create, configure and compile model
	3. Train with MNIST training images
	4. Test against the MNIST testing images
	5. Save model
	6. Read and predict from a PNG file
	7. Exit
	""")
	choice = input("Option: ")
	
	if choice == "1":
		load()
	elif choice == "2":
		configure()
	elif choice == "3":
		train()
	elif choice == "4":
		test()
	elif choice == "5":
		save()
	elif choice == "6":
		png_read()
	elif choice == "7":
		choice = None
	else:
		print("Invalid option")
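
As an alternative to the `if`/`elif` chain, the mapping from option to function can be held in a dictionary. The sketch below is one possible design, not the script's own; `dispatch` and the stub handlers are hypothetical, and in the real script the dictionary values would be `load`, `configure`, `train`, `test`, `save` and `png_read`.

```python
def dispatch(choice, handlers):
    """Run the handler registered for choice; return False for unknown options."""
    handler = handlers.get(choice)
    if handler is None:
        print("Invalid option")
        return False
    handler()
    return True


# Stub handlers record which option ran instead of touching a real model
calls = []
handlers = {
    "1": lambda: calls.append("load"),
    "2": lambda: calls.append("configure"),
}
dispatch("1", handlers)
dispatch("A", handlers)  # prints "Invalid option"
print(calls)             # ['load']
```

Adding a menu item then only requires a new dictionary entry.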