*Adapted from Make Your Own Neural Network by Tariq Rashid: https://github.com/makeyourownneuralnetwork/makeyourownneuralnetwork/blob/master/part2_neural_network.ipynb*

# A neural network from scratch

In this notebook we will code a basic neural network from scratch, run forward passes with it, and train it with backpropagation.

## Import required libraries

For this example you will need `keras`. You can use either a standalone installation of Keras in your Anaconda environment or `tf.keras` if you have TensorFlow installed instead. If you have neither, run `pip install tensorflow` in your Anaconda environment, then change `import keras` to `from tensorflow import keras` in the cell below before running.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import math
import random
import keras

## Neural Network class

Here we have a 3-layer neural network: the input layer, one hidden layer, and the output layer. Each layer can have a defined number of neurons/nodes. The learning rate is constant and is set when the object is created.

In [None]:
class NeuralNetwork:
    def __init__(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = input_nodes 
        self.hnodes = hidden_nodes 
        self.onodes = output_nodes
        
        # link weight matrices, wih and who
        # weights inside the arrays are w_i_j, where link is from node i to node j in the next layer
        # w11 w21
        # w12 w22 etc 
        self.wih = np.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
        
        # learning rate
        self.lr = learning_rate
        
        # activation function is the sigmoid function
        self.activation_function = lambda x: 1 / (1 + np.exp(-x))
    
    def train(self, inputs_list, targets_list):
        # convert inputs list to 2d array
        inputs = np.array(inputs_list, ndmin=2).T
        targets = np.array(targets_list, ndmin=2).T
        
        # calculate signals into hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        
        # calculate signals into final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        
        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = np.dot(self.who.T, output_errors) 
        
        # update the weights for the links between the hidden and output layers
        self.who += self.lr * np.dot((output_errors * final_outputs * (1.0 - final_outputs)), np.transpose(hidden_outputs))
        
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * np.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), np.transpose(inputs))
    
    def query(self, inputs_list):
        # convert inputs list to 2d array
        inputs = np.array(inputs_list, ndmin=2).T
        
        # calculate signals into hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        
        # calculate signals into final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        
        return final_outputs
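
To make the shapes in `query` and the `output * (1 - output)` factor in `train` more concrete, here is a minimal sketch on a toy network. The sizes (3 inputs, 2 hidden nodes, 1 output) and every name starting with `toy_` are assumptions made for illustration and are not part of the original notebook. The second half numerically checks that the derivative of the sigmoid equals `sigmoid(x) * (1 - sigmoid(x))`, which is the term the weight updates multiply the errors by.

```python
import numpy as np

def sigmoid(x):
    # Same activation as the lambda used in the NeuralNetwork class
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy sizes: 3 inputs, 2 hidden nodes, 1 output
rng = np.random.default_rng(0)
toy_wih = rng.normal(0.0, pow(3, -0.5), (2, 3))  # hidden x input
toy_who = rng.normal(0.0, pow(2, -0.5), (1, 2))  # output x hidden

# A list becomes a column vector, exactly as in query()
toy_inputs = np.array([0.1, 0.5, 0.9], ndmin=2).T   # shape (3, 1)
toy_hidden = sigmoid(np.dot(toy_wih, toy_inputs))   # shape (2, 1)
toy_output = sigmoid(np.dot(toy_who, toy_hidden))   # shape (1, 1)
print(toy_inputs.shape, toy_hidden.shape, toy_output.shape)

# Compare a central-difference estimate of the sigmoid's slope
# with the closed form sigmoid(x) * (1 - sigmoid(x))
x = np.linspace(-3, 3, 7)
h = 1e-5
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic = sigmoid(x) * (1.0 - sigmoid(x))
print(np.allclose(numeric, analytic, atol=1e-8))
```

Storing each weight matrix as (next layer, previous layer) is what lets a single `np.dot` move a column vector from one layer to the next.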

## Define the neural network's shape

Here we create a neural network with 784 inputs (28x28), 200 hidden nodes in the hidden layer, and 10 output nodes.

In [None]:
# number of input, hidden and output nodes
input_nodes = 784
hidden_nodes = 200
output_nodes = 10

# learning rate is 0.3
learning_rate = 0.3

# create instance of neural network
n = NeuralNetwork(input_nodes,hidden_nodes,output_nodes, learning_rate)
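
As a rough check of what the constructor builds for these sizes, the sketch below recreates the input-to-hidden weight matrix on its own. The `demo_` names are hypothetical and only used here. The initial weights are drawn from a normal distribution whose standard deviation is the number of incoming nodes raised to the power of -0.5, so with 784 inputs that is 1/28.

```python
import numpy as np

# Hypothetical stand-ins for the sizes passed to NeuralNetwork above
demo_inodes, demo_hnodes = 784, 200

# Standard deviation used for the initial weights: 1 / sqrt(784) = 1 / 28
demo_std = pow(demo_inodes, -0.5)

# Same call the class makes for self.wih
demo_wih = np.random.normal(0.0, demo_std, (demo_hnodes, demo_inodes))

print(demo_wih.shape)        # (200, 784): one row per hidden node
print(round(demo_std, 4))    # 0.0357
print(demo_wih.std())        # sample spread, close to demo_std
```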

## Load data

The data we are going to be using is the "Hello, World!" of machine learning: the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). It is a simple dataset of handwritten digits that is just the right size for simple neural networks like ours. Each entry is a 28x28 grayscale image.

We are going to load the dataset directly from Keras, as it is one of the default datasets included with Keras and is already in NumPy format, so we only need to do some minimal processing on it. The data is split into two sets: the training set contains 60,000 images and the test set contains 10,000 images. Each image has a corresponding label indicating which digit from 0-9 it is.

In [None]:
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

print(x_train.shape)
print(y_train.shape)

print(x_test.shape)
print(y_test.shape)

Here we have a look at a sample of the MNIST data.

In [None]:
plt.clf()
for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(x_train[i], cmap='gray')
    plt.title(y_train[i])
    plt.axis("off")

plt.show()

Each image loaded in has a shape of `(28, 28)`, so we need to flatten it before we can pass it into our neural network, which only takes 1D inputs. In addition, the pixel values are integers in the range 0-255, and we need to convert them to floats in the range 0.0-1.0. This is important because the data type of the inputs should match the floating point type of the weights, and floating point numbers give us the precision we need. Inputs in the 0.0-1.0 range also mean that our weights can stay small; paired with a low learning rate, the network can (ironically) learn faster.

As for the labels, we convert them into binary class matrices: each label becomes a vector in which every position gives the probability that the image is the digit at that index. This means each value in the vector is either 0 or 1, with the 1 at the position that represents the image's class in this classification problem. Try printing the values of `y_train_flat` to see what it looks like. Here we use the Keras function `keras.utils.to_categorical()` to do this conversion.

In [None]:
# Reshape the data to (60000, 784) and convert values to the range of 0-1
x_train_flat = x_train.reshape(60000, 784).astype("float32") / 255
x_test_flat = x_test.reshape(10000, 784).astype("float32") / 255
print("Input shape:", x_train_flat.shape)

# Convert class vectors to binary class matrices
num_classes = 10
y_train_flat = keras.utils.to_categorical(y_train, num_classes)
y_test_flat = keras.utils.to_categorical(y_test, num_classes)

print("Output shape:", y_train_flat.shape)
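
To see what the two preprocessing steps above do without loading MNIST, here is a small NumPy-only sketch. The pixel values and labels are made up for illustration, and indexing an identity matrix is a stand-in for `keras.utils.to_categorical()`, not how Keras implements it.

```python
import numpy as np

# Hypothetical pixel values in the original 0-255 integer range
demo_pixels = np.array([0, 51, 255], dtype="uint8")
demo_scaled = demo_pixels.astype("float32") / 255
print(demo_scaled)  # floats between 0.0 and 1.0

# Hypothetical digit labels
demo_labels = np.array([5, 0, 4])

# Row k of the 10x10 identity matrix is the binary class vector for digit k
demo_one_hot = np.eye(10, dtype="float32")[demo_labels]
print(demo_one_hot.shape)  # (3, 10)
print(demo_one_hot[0])     # the only 1.0 sits at index 5

# argmax recovers the original labels
print(np.argmax(demo_one_hot, axis=1))
```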

## Training the neural network

Now we can train the network. Neural networks are usually trained over epochs; one epoch means one pass over the whole training dataset. You can increase the epoch count below, but note that training will take longer with more epochs.

In [None]:
# Epochs is the number of times the training data set is used for training
epochs = 1

for e in range(epochs):
    print("Starting epoch", e)
    
    # Passing each input data entry and its corresponding expected output to train the nn
    for i, input_data in enumerate(x_train_flat):
        n.train(input_data, y_train_flat[i])
        # Show progress against the actual size of the training set
        print("Training %d/%d" % (i + 1, len(x_train_flat)), end="\r", flush=True)

## Assessing the result

After training, we can assess how well the neural network is doing. Here we use the test dataset instead of the training dataset to make sure that the neural network has actually learnt what we need it to learn and has not just picked up some obscure quirk of the training dataset.

In [None]:
# Pick a random index from the test data set
# (randrange excludes the upper bound, so the index is always valid)
random_i = random.randrange(len(x_test))

# Plot the image that we just picked
plt.clf()
ax = plt.subplot(3, 3, 1)
plt.imshow(x_test[random_i], cmap='gray')
plt.title(y_test[random_i])
plt.axis("off")
plt.show()

# Make a prediction
print("Prediction:", np.argmax(n.query(x_test_flat[random_i]).reshape(10)))

We can also pass the whole test dataset through the neural network and get an accuracy figure. You can rerun the cell with the training step and come back to this cell to see how much the accuracy increases with each epoch of training.

In [None]:
correct_count = 0

for i, test_input in enumerate(x_test_flat):
    prediction = np.argmax(n.query(test_input).reshape(10))
    actual = y_test[i]
    
    if prediction == actual:
        correct_count += 1
        
accuracy = correct_count / len(x_test)
print("Accuracy:", format(accuracy, ".2%"))
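
The loop above counts correct predictions one image at a time. The same calculation can be sketched in vectorised form when all outputs are stacked into one array. The outputs and labels below are made up (4 images, 3 classes) and the `demo_` names are hypothetical; this does not use the trained network.

```python
import numpy as np

# Hypothetical network outputs: one row per image, one column per class
demo_outputs = np.array([[0.1, 0.8, 0.1],
                         [0.7, 0.2, 0.1],
                         [0.2, 0.3, 0.5],
                         [0.6, 0.3, 0.1]])
demo_actual = np.array([1, 0, 2, 1])

# The predicted class is the index of the largest output in each row
demo_predictions = np.argmax(demo_outputs, axis=1)  # [1, 0, 2, 0]

# The mean of the boolean matches is the fraction predicted correctly
demo_accuracy = np.mean(demo_predictions == demo_actual)
print("Accuracy:", format(demo_accuracy, ".2%"))  # Accuracy: 75.00%
```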