# Simple Handwritten Digit Recognition Neural Network

## Brief introduction

#### A neural network consists of an input layer, one or more hidden (middle) layers and an output layer. The input layer takes in the input (images, files, audio, video, etc.) and passes it to the hidden layer, where some processing/learning is done, and the result is then passed to the output layer.

##### This notebook aims to explain simply how a neural network performs its task. It's important to note that the network decides on its output based on the weights applied to the signals as they pass from the input layer to the hidden layer and then to the output layer. In other words, if the weight on an input is very low, that input contributes little to the output.

##### Take a moment to think about this: assume you are in a group of 3 friends and you want to show how much you love your 3rd friend. Now let's say you are the first friend, and your second friend, Jay, is the messenger or channel of communication between you and your 3rd friend, Lola. You really want to show Lola your appreciation, but you have to say it through Jay.

##### It means you are the input node(s), Jay is the hidden/middle node and Lola is the output node. Let's say you casually whisper to Jay to tell Lola that you love her. Jay is reluctant but goes on to say it to Lola. It's easy for Lola to smile and dismiss it, meaning the output was not strong enough. Now let's assume you call Jay to a corner and tell him in all seriousness that you love Lola, and that he should tell Lola in the same way you told him. Jay does exactly what you asked. Lola would likely take it more seriously and give it some consideration. She might even reply that she loves you too. The words you said are the input. The whispered words carry a small weight, while the seriousness you added in the corner talk carried more weight in shaping Lola's response.

##### This explains the basics of how a neural network works. At each layer, it multiplies every input by its weight and adds up the products to give the signal for the next layer.
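
##### A minimal sketch of that weighted sum, assuming three made-up input values and a made-up weight matrix for two hidden nodes (the numbers are illustrative only and are not taken from the data set used later):

```python
import numpy as np

# hypothetical inputs: one value per input node
inputs = np.array([0.9, 0.1, 0.8])

# hypothetical weights: one row per hidden node, one column per input node
weights = np.array([[0.9, 0.3, 0.4],
                    [0.2, 0.8, 0.2]])

# each hidden node receives the sum of (input * weight) over all input nodes
hidden_inputs = np.dot(weights, inputs)
print(hidden_inputs)
```

##### The first hidden node has larger weights on the two strong inputs, so it receives the larger signal.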

##### Tweaking it a little: let's assume Jay told Lola that you "like" her instead of "love". That's an error. You had in mind that he would tell Lola in all seriousness that you love her, but he said "like". What you had in mind was the intended output (your target), while what Jay said was the actual output. Mathematically we can calculate this as output error = intended output - actual output.

##### Furthermore, we can moderate how strongly we react to this output error by including a learning rate. Let's assume this learning rate is a figure that we multiply with the output error to scale it down, so that the next time we tell Jay to speak with Lola the correction is gentle and the error shrinks step by step. We use this learning rate each time we communicate, as a moderator.
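
##### As a rough sketch of the two ideas above for a single output node, assuming made-up values for the target, the actual output and the learning rate (the train function later in this notebook also multiplies in the slope of the activation function and the incoming signal):

```python
# hypothetical values for one output node
target = 0.99         # intended output
actual = 0.60         # what the network produced
learning_rate = 0.3   # moderating factor

# output error = intended output - actual output
output_error = target - actual

# the learning rate scales the error down before it is used as a correction
correction = learning_rate * output_error
print(output_error, correction)
```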

##### Finally, let's assume that Lola has a level or threshold that must be met before she takes people's words into consideration. I mean, there's a level of seriousness that must be displayed for her to believe the speaker. Mathematically, this is the job of the activation function; the one used in this notebook is the sigmoid (also called logistic) function. So Jay must meet that level for her to believe him. Trust me, we all have it: even for you & Jay, there's a level of seriousness or trust that must be reached before you believe people. Putting this into the neural network, a layer passes the signal it receives through the activation function to produce its output. In the code that follows, the hidden and output layers do this.
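
##### A small sketch of the sigmoid function using scipy.special.expit, the same call this notebook uses later. The three signal values are made up. Note that the sigmoid is a smooth threshold rather than a hard cut-off: weak signals are squashed towards 0 and strong ones towards 1.

```python
import numpy as np
import scipy.special

# hypothetical incoming signals: weak, neutral and strong
signals = np.array([-5.0, 0.0, 5.0])

# expit(x) computes 1 / (1 + exp(-x)) element by element
activated = scipy.special.expit(signals)
print(activated)
```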

##### Before we wrap up, did you know we can improve the output by increasing the number of times we give Jay our message for Lola? If we see that Lola's output was way below the intended result, we can call Jay to the corner again and give him the same message, in the hope that it improves Lola's output and minimizes Jay's error. Each full round of relaying all our messages (inputs) is called an epoch. We can run, say, 5 epochs to improve our output. The process of relaying the messages is called training - let's just say we are training our network to improve the output.
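
##### To illustrate epochs, here is a toy sketch with a single sigmoid node, one weight and one training example. All values are made up, and the "training set" is just that one example, so one epoch is one update. It is only meant to show the error shrinking as the same message is repeated.

```python
import scipy.special

# hypothetical single-node setup
weight = 0.1
x, target = 1.0, 0.99
lrate = 0.3
epochs = 5

errors = []
for epoch in range(epochs):
    # forward pass: weighted input through the sigmoid
    output = scipy.special.expit(weight * x)
    error = target - output
    errors.append(abs(error))
    # nudge the weight: learning rate * error * sigmoid slope * input
    weight += lrate * error * output * (1.0 - output) * x

print(errors)
```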

#### Let's recap:
##### Input layer: this is the entrance of the neural network
##### Hidden layer: this is where the communication (learning) happens
##### Output layer: this is where the results come out
##### Output error: intended output - actual output
##### Learning rate: moderating factor that scales how much each error changes the weights
##### Activation function: threshold-like function that a layer's incoming signal passes through to produce that layer's output
##### Epoch: one full pass over the training data; training usually runs for several epochs

##### Putting it all together, a neural network combines the inputs, weights, activation function, errors and learning rate to give us the output.

##### Now to the code!

In [0]:
# import libraries

import numpy as np
import scipy.special

##### We create a class called neuralNetwork with 3 functions: init, train, query
##### init is to initialize the class and define our variables: inputnodes, outputnodes, hiddennodes & learning rate

In [0]:
class neuralNetwork():
    def __init__(self, inodes, onodes, hnodes, lrate):
        # creating the input, output, hidden nodes & learning rate
        self.inodes = inodes
        self.onodes = onodes
        self.hnodes = hnodes
        self.lrate = lrate

        # random weights from the input layer to the hidden layer
        # shape is (hnodes, inodes) so that np.dot(self.whi, inputs) works on a column of inputs
        self.whi = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.hnodes, self.inodes))
        # random weights from the hidden layer to the output layer, shape (onodes, hnodes)
        self.who = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
        # activation function (sigmoid)
        self.activ_funct = lambda x: scipy.special.expit(x)

##### The train function takes in the input (inputs_lists) and the intended output (targets_lists) and then uses the activation function to calculate the actual output. The difference between the two is the error, which is multiplied with the learning rate to adjust the weights.

In [0]:
    def train(self, inputs_lists, targets_lists):
        # converts inputs and targets to array
        inputs = np.array(inputs_lists, ndmin=2).T
        targets = np.array(targets_lists, ndmin=2).T

        # calculate the input & output for the hidden layer - np.dot is used for multiplying matrices
        hidden_inputs = np.dot(self.whi, inputs)
        hidden_outputs = self.activ_funct(hidden_inputs)

        # calculate the input & output for the final layer
        final_inputs = np.dot(self.who, hidden_outputs)
        final_outputs = self.activ_funct(final_inputs)

        # calculating the output errors
        output_errors = targets - final_outputs

        # calculating errors at the hidden layers from the outputs
        hidden_errors = np.dot(self.who.T, output_errors)

        # update the weights between the hidden and output layers:
        # learning rate * (error * sigmoid slope) multiplied by the transposed hidden outputs
        self.who += self.lrate * np.dot(final_outputs * output_errors * (1.0 - final_outputs), np.transpose(hidden_outputs))
        # update the weights between the input and hidden layers in the same way
        self.whi += self.lrate * np.dot(hidden_outputs * hidden_errors * (1.0 - hidden_outputs), np.transpose(inputs))
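
##### The two weight updates in train follow the same pattern, which can be written compactly. In this sketch the symbols are assumptions chosen for illustration: α is lrate, E is the error at a layer, O is that layer's output, I is the signal entering the layer, and ⊙ is element-wise multiplication. The factor O ⊙ (1 − O) is the slope of the sigmoid.

```latex
\Delta W = \alpha \, \bigl( E \odot O \odot (1 - O) \bigr) \, I^{\mathsf{T}}
```

##### For who, E is output_errors, O is final_outputs and I is hidden_outputs. For whi, E is hidden_errors, O is hidden_outputs and I is inputs.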

##### The query function simply runs an input through the network and returns the result.

In [0]:
    def query(self, inputs_lists):
        # converts list to 2D array
        inputs = np.array(inputs_lists, ndmin=2).T

        # finding the input & output to the hidden layers
        hidden_inputs = np.dot(self.whi, inputs)
        hidden_outputs = self.activ_funct(hidden_inputs)

        # finding the input & output of the final layer
        final_inputs = np.dot(self.who, hidden_outputs)
        final_outputs = self.activ_funct(final_inputs)

        return final_outputs
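
##### A standalone sketch of the forward pass that query performs, useful for checking array shapes. It assumes weight matrices with one row per node in the receiving layer and one column per node in the sending layer, random weights and a made-up input record, so the output values themselves are meaningless.

```python
import numpy as np
import scipy.special

inodes, hnodes, onodes = 784, 100, 10

# hypothetical random weights: rows = receiving nodes, columns = sending nodes
whi = np.random.normal(0.0, pow(hnodes, -0.5), (hnodes, inodes))
who = np.random.normal(0.0, pow(hnodes, -0.5), (onodes, hnodes))

# a made-up input record turned into a column vector, as query does
inputs = np.array(np.full(inodes, 0.5), ndmin=2).T

# input layer -> hidden layer -> output layer
hidden_outputs = scipy.special.expit(np.dot(whi, inputs))
final_outputs = scipy.special.expit(np.dot(who, hidden_outputs))

print(inputs.shape, hidden_outputs.shape, final_outputs.shape)
```

##### The network returns one column of 10 values, one per digit.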

##### Now let's train the network with a real data set that contains images of handwritten digits.

In [0]:
# ---------- Training the network ----------
# provide values for the nodes & learning rate

inodes = 784
hnodes = 100
onodes = 10
lrate = 0.3

# create instance for the neural network
n = neuralNetwork(inodes, onodes, hnodes, lrate)

# function that reads a file
def convert_file(csvfile):
    # reads the csv file and returns its lines as a list of strings
    # (conversion to numbers and scaling happen later, record by record)
    with open(csvfile, "r") as f:
        inputfile = f.readlines()
    return inputfile


trainingFile = convert_file("mnist_train_100.csv") 

for item in trainingFile:
    split_training = item.split(',')
    
    # scaling the data i.e. adjusting all pixel values from 0-255 to between 0.01 & 1.00
    # (np.asfarray was removed in NumPy 2.0, so np.asarray with dtype=float is used instead)
    inputs = (np.asarray(split_training[1:], dtype=float)/255 * 0.99) + 0.01
    
    # np.zeros creates an array of zeros, one per output node. 0.01 is added, and the correct
    # digit is set to 0.99, because the sigmoid can never output exactly 0 or 1
    targets = np.zeros(onodes) + 0.01
    targets[int(split_training[0])] = 0.99
    
    # calling the train function we defined earlier
    n.train(inputs, targets)
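
##### To make the scaling and target steps concrete, here is a sketch on one made-up record. It assumes a label of 3 followed by only four pixel values, whereas each real record has 784 pixel values.

```python
import numpy as np

# hypothetical CSV record: label first, then pixel values in the range 0-255
record = "3,0,128,255,64"
values = record.split(',')

# scale the pixels: 0 maps to 0.01 and 255 maps to 1.00
inputs = (np.asarray(values[1:], dtype=float) / 255 * 0.99) + 0.01

# target vector: 0.01 everywhere except 0.99 at the index of the correct digit
onodes = 10
targets = np.zeros(onodes) + 0.01
targets[int(values[0])] = 0.99

print(inputs)
print(targets)
```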

In [0]:
# ------ Testing the network -------

# load the test data
testingFile = convert_file("mnist_test_10.csv")

# creates a scorecard to measure results (1 = correct, 0 = wrong)
# it is created once, before the loop, so that it keeps every result
scorecard = []

for item in testingFile:
    split_test = item.split(',')
    
    # scaling the data i.e. adjusting all pixel values from 0-255 to between 0.01 & 1.00
    inputs = (np.asarray(split_test[1:], dtype=float)/255 * 0.99) + 0.01
    
    # the first value of each record is the correct label
    correct_label = int(split_test[0])
    print(correct_label, "correct label")
    
    # calling the query function we defined earlier
    outputs = n.query(inputs)
    
    # the index of the highest output is the network's answer
    label = np.argmax(outputs)
    print(label, "network's result")
    
    # record whether the network was right
    if label == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

print(scorecard)
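
##### One way to turn a scorecard into a single number is to take the fraction of correct answers. A sketch, assuming a made-up scorecard of ten results (these are not results produced by this notebook):

```python
import numpy as np

# hypothetical scorecard: 1 = correct, 0 = wrong
scorecard = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

# the mean of the 1s and 0s is the fraction the network got right
accuracy = np.mean(scorecard)
print("accuracy =", accuracy)
```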