# Simple Handwritten Digit Recognition Neural Network

## Brief introduction

#### A neural network consists of an input layer, a hidden (middle) layer and an output layer. The input layer takes in the input (images, files, audio, video, etc.) and passes it to the hidden layer, where some processing/learning is done; the result is then passed to the output layer.
##### Take a moment to think about this: let's assume you are in a group of 3 friends and you want to tell your 3rd friend that you love her. You are the first friend; your second friend, Jay, is the middleman or the channel of communication between you and your 3rd friend, Lola.
##### It means you are the input node, Jay is the hidden/middle node and Lola is the output node. Let's say you casually whisper to Jay to let Lola know you love her. Jay is reluctant but goes on to say it to Lola.
##### It's easy for Lola to smile and discard it, meaning the output was not strong enough. Now let's assume you call Jay to a corner and tell him in all seriousness that you love Lola, and that he should tell Lola the same way you told him.
##### Jay does exactly what you told him. Lola would likely take it more seriously and put it into consideration. She might even reply that she loves you too. The words you said are the input. The whispered, joking words can be said to have a small weight: they are not really serious.
##### The seriousness you added in the corner talk, on the other hand, carried more weight in shaping Lola's response. This explains the basics of how a neural network works: each layer multiplies the inputs it receives by their weights and combines the results to give an output.
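##### As a minimal sketch of "inputs multiplied by weights", the snippet below combines three signals arriving at a single node. The numbers and the names `inputs`, `weights` and `weighted_sum` are made up for illustration only.

```python
# Hypothetical values, chosen only to illustrate the idea
inputs = [0.9, 0.1, 0.8]    # signals arriving at one node
weights = [0.2, 0.5, 0.3]   # how seriously each signal is taken

# Each input is multiplied by its weight, then the products are added up
weighted_sum = sum(w * x for w, x in zip(weights, inputs))
print(weighted_sum)
```

##### A larger weight lets its input contribute more to the sum, just as the serious corner talk counted for more than the whisper.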
##### Now, let's assume Jay told Lola that you "like" her instead of "love". That's an error. You had it in mind that he would tell Lola in all seriousness that you love her, but he didn't. What you had in mind was the intended output (target), while what Jay said was the actual output.
##### Mathematically we can calculate this as:
##### output error = intended output - actual output.
##### We can control how strongly we react to this output error by including a learning rate. The learning rate is a number that we multiply with the output error to scale the correction we make, so that the next time we ask Jay to speak with Lola, the error is smaller. The learning rate is usually a small number.
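##### Here is a small numeric sketch of the two ideas above. The values of `target`, `actual` and `lrate` are hypothetical, and the real `train` step later in this notebook also multiplies in other terms before changing the weights.

```python
# Hypothetical numbers for a single output node
target = 0.99   # intended output
actual = 0.60   # what the network actually produced
lrate = 0.3     # learning rate

# output error = intended output - actual output
output_error = target - actual

# the learning rate scales how large a correction we make
correction = lrate * output_error
print(output_error, correction)
```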
##### Before we wrap up, let's assume that Lola has a level or threshold that must be met before she takes people's words into consideration. I mean, there's a level of 'trust' that must be met for her to "believe" the speaker.
##### Mathematically, this threshold is modelled by the activation function. This notebook uses the sigmoid (also called logistic) function as its activation function.
##### So it means Jay must meet that level for her to believe him.
##### Trust me, we all have it. So even for you & Jay, there's a level of seriousness or trust that must be overcome before you can pass your message across.
##### Applying this analogy to the neural network, each layer has an activation function that a signal must pass through before it moves on. In the code below it is applied at the hidden and output layers; the input layer simply passes its values on.
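##### To make the activation function concrete, here is a minimal sketch of the sigmoid written with the standard `math` module. The helper name `sigmoid` and the sample values are assumptions for illustration; `scipy.special.expit`, which the notebook imports, computes the same function.

```python
import math

def sigmoid(x):
    # squashes any real number into the range 0 to 1
    return 1.0 / (1.0 + math.exp(-x))

# a weak, a neutral and a strong signal (hypothetical values)
weak = sigmoid(-5.0)
neutral = sigmoid(0.0)
strong = sigmoid(5.0)
print(weak, neutral, strong)
```

##### Note that the sigmoid is a smooth curve rather than a hard cut-off: weak signals come out close to 0 and strong signals close to 1.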
##### Finally, you can improve the output by repeating your message to Jay, who in turn repeats it to Lola. If you notice that Lola's response is way below the intended result, you can call Jay to the corner again and give him the same message to pass on,
##### hoping that it will reduce Jay's error and improve Lola's output.
##### Each full round of relaying our message (one pass through all the training inputs) is called an epoch. We can run, say, 5 epochs to improve our output. The process of relaying the message and correcting the errors is called training.
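##### The idea of repeating the training over several epochs can be sketched with a toy network that has a single weight. Everything here (`training_data`, `weight`, the starting values) is hypothetical and only meant to show the error shrinking from one epoch to the next.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical toy data: (input, target) pairs for a single weight
training_data = [(1.0, 0.8), (0.5, 0.65)]
weight = 0.0
lrate = 0.5
epochs = 5

errors_per_epoch = []
for epoch in range(epochs):
    total_error = 0.0
    # one epoch = one pass over every training record
    for x, target in training_data:
        output = sigmoid(weight * x)
        error = target - output
        # nudge the weight in the direction that reduces the error
        weight += lrate * error * output * (1.0 - output) * x
        total_error += abs(error)
    errors_per_epoch.append(total_error)

print(errors_per_epoch)
```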
##### Now let's recap:
##### Input layer: the entrance of the neural network
##### Hidden layer: where communication (learning) happens
##### Output layer: where the results come out
##### Output error: intended output - actual output
##### Learning rate: moderating factor that scales each correction made to reduce the error
##### Activation function: threshold-like function that a layer's combined input passes through before moving on to the next layer.
##### Epoch: one full pass through the training data; training usually runs for several epochs.
##### Putting it all together, a neural network combines the inputs, weights, activation function, errors and learning rate to give us the output.
##### Now to the code!

In [0]:
# import libraries

import numpy as np
import scipy.special

##### We create a class called `neuralNetwork` with 3 methods: `__init__`, `train` and `query`.
##### `__init__` initializes the class and defines our variables: input nodes, output nodes, hidden nodes & learning rate, together with the starting weights and the activation function.

In [0]:
class neuralNetwork():
    def __init__(self, inodes, onodes, hnodes, lrate):
        # creating the input, output, hidden nodes & learning rate
        self.inodes = inodes
        self.onodes = onodes
        self.hnodes = hnodes
        self.lrate = lrate

        # random starting weights from the input layer to the hidden layer;
        # shape (hnodes, inodes) so that np.dot(self.whi, inputs) works on a column of inputs
        self.whi = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.hnodes, self.inodes))
        # random starting weights from the hidden layer to the output layer, shape (onodes, hnodes)
        self.who = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
        # activation function: the sigmoid, provided by scipy as expit
        self.activ_funct = lambda x: scipy.special.expit(x)
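
##### The `train` and `query` methods compute `np.dot(weights, inputs)` with the inputs arranged as a column, so each weight matrix needs one row per node in the receiving layer and one column per node in the sending layer. The standalone sketch below checks that these shapes line up; the variable names mirror the class attributes, but this is an illustration, not part of the class.

```python
import numpy as np

inodes, hnodes, onodes = 784, 100, 10

# rows = nodes in the receiving layer, columns = nodes in the sending layer
whi = np.random.normal(0.0, pow(hnodes, -0.5), (hnodes, inodes))
who = np.random.normal(0.0, pow(hnodes, -0.5), (onodes, hnodes))

# a dummy input column with one value per input node
inputs = np.zeros((inodes, 1))

hidden = np.dot(whi, inputs)   # one value per hidden node
final = np.dot(who, hidden)    # one value per output node
print(hidden.shape, final.shape)
```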

##### Next, the `train` method takes in the input (`inputs_lists`) and the intended output (`targets_lists`), then uses the activation function to calculate the network's output. The error between the target and this output is then multiplied by the learning rate to update the weights.

In [0]:
    def train(self, inputs_lists, targets_lists):
        # converts inputs and targets to array
        inputs = np.array(inputs_lists, ndmin=2).T
        targets = np.array(targets_lists, ndmin=2).T

        # calculate the input & output for the hidden layer - np.dot is used for multiplying matrices
        hidden_inputs = np.dot(self.whi, inputs)
        hidden_outputs = self.activ_funct(hidden_inputs)

        # calculate the input & output for the final layer
        final_inputs = np.dot(self.who, hidden_outputs)
        final_outputs = self.activ_funct(final_inputs)

        # calculating the output errors
        output_errors = targets - final_outputs

        # calculating errors at the hidden layers from the outputs
        hidden_errors = np.dot(self.who.T, output_errors)

        # updating the weights between the hidden and output layers;
        # output * (1.0 - output) is the slope of the sigmoid at that output
        self.who += self.lrate * np.dot(final_outputs * output_errors * (1.0 - final_outputs), np.transpose(hidden_outputs))
        # updating the weights between the input and hidden layers
        self.whi += self.lrate * np.dot(hidden_outputs * hidden_errors * (1.0 - hidden_outputs), np.transpose(inputs))
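
##### The weight updates in `train` contain the term `final_outputs * (1.0 - final_outputs)`. This is the slope (derivative) of the sigmoid expressed through its own output. The sketch below checks that identity numerically for one hypothetical input value; the helper `sigmoid` and the variable names are assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x = 0.7                 # hypothetical input to a node
out = sigmoid(x)

# slope computed from the output, as used in the weight update
slope_formula = out * (1.0 - out)

# slope estimated numerically with a small step either side of x
h = 1e-6
slope_numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(slope_formula, slope_numeric)
```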

##### The `query` method simply passes an input through the network and returns the output.

In [0]:
    def query(self, inputs_lists):
        # converts list to 2D array
        inputs = np.array(inputs_lists, ndmin=2).T

        # finding the input & output to the hidden layers
        hidden_inputs = np.dot(self.whi, inputs)
        hidden_outputs = self.activ_funct(hidden_inputs)

        # finding the input & output of the final layer
        final_inputs = np.dot(self.who, hidden_outputs)
        final_outputs = self.activ_funct(final_inputs)

        return final_outputs

##### Now let's train the network with a real data set that contains images of handwritten digits.

In [0]:
# ---------- Training the network ----------
# provide values for the nodes & learning rate

inodes = 784
hnodes = 100
onodes = 10
lrate = 0.3

# create instance for the neural network
n = neuralNetwork(inodes, onodes, hnodes, lrate)

# function that reads a csv file and returns its lines
def convert_file(csvfile):
    with open(csvfile, "r") as f:
        return f.readlines()


trainingFile = convert_file("mnist_train_100.csv")

for item in trainingFile:
    split_training = item.split(',')
    
    # scaling the pixel values from 0-255 into the range 0.01 to 1.0
    # (np.asfarray was removed in NumPy 2.0, so np.asarray with a float dtype is used)
    inputs = (np.asarray(split_training[1:], dtype=float) / 255 * 0.99) + 0.01

    # np.zeros creates an array of zeros, one per output node; 0.01 is added
    # because the sigmoid can never output exactly 0 (or exactly 1)
    targets = np.zeros(onodes) + 0.01
    # the first value in each record is the label; its target is set to 0.99
    targets[int(split_training[0])] = 0.99

    # calling the train function we defined earlier
    n.train(inputs, targets)
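
##### To see what the scaling and the target vector look like, here is a sketch on a made-up, shortened record (a label followed by only three pixel values; real records have 784). The `record` string is hypothetical.

```python
import numpy as np

# Hypothetical shortened record: label 5 followed by three pixel values
record = "5,0,128,255"
parts = record.split(',')

# pixel values 0..255 are rescaled so that none of them is exactly zero
inputs = (np.asarray(parts[1:], dtype=float) / 255 * 0.99) + 0.01

# one target per digit: 0.01 everywhere except 0.99 at the correct label
targets = np.zeros(10) + 0.01
targets[int(parts[0])] = 0.99

print(inputs)
print(targets)
```

##### The position of the 0.99 in `targets` tells the network which digit the image shows.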

##### Next, we test our results.

In [0]:
# ------ Testing the network -------

# load the test data
testingFile = convert_file("mnist_test_10.csv")

# scorecard to measure results: 1 for a correct answer, 0 for a wrong one
# (created once, before the loop, so it keeps every result)
scorecard = []

for item in testingFile:
    split_test = item.split(',')

    # scaling the pixel values from 0-255 into the range 0.01 to 1.0
    inputs = (np.asarray(split_test[1:], dtype=float) / 255 * 0.99) + 0.01

    # the first value in each record is the correct label
    correct_label = int(split_test[0])
    print(correct_label, "correct label")

    # calling the query function we defined earlier
    outputs = n.query(inputs)

    # the index of the highest output is the network's answer
    label = np.argmax(outputs)
    print(label, "network's result")

    if label == correct_label:
        scorecard.append(1)
    else:
        scorecard.append(0)

print(scorecard)
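
##### A scorecard of 1s and 0s can be turned into a single accuracy figure by dividing the number of correct answers by the number of tests. The scorecard below is made up for illustration; it is not the output of the network above.

```python
# Hypothetical scorecard: 1 = correct answer, 0 = wrong answer
scorecard = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# fraction of test records the network labelled correctly
accuracy = sum(scorecard) / len(scorecard)
print("accuracy =", accuracy)
```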

This notebook aims to build an understanding of how a neural network works. It doesn't answer every question about neural networks, but it gives a beginner the nudge to learn more and dive deeper.

Thanks.