# <center>Let's Make A Neural Network From Scratch!</center>
<div style="text-align: right"> - by <b><a href="https://github.com/akashadhikari/">Akash</a></b> (Not the one in the picture below, obviously! :p )</div>

<img src="images/feynman.png" alt="Drawing" style="width: 1000px; height: 400px"/>
<b>Richard Feynman</b><br>Picture credit: [Commonlounge](https://www.commonlounge.com/discussion/cd6812822aef4dd7a718b420c4fa4036/history)

### Note: No Machine Learning Framework or Libraries were 'harmed' while making this tutorial.
And before you ask, Yes, Frameworks make things a lot easier. But trust me, if you really want to GET IT from the core, you MUST be able to grasp the theoretical base of Neural Networks and implement a toy Neural Net of your own. Yes, without ANY libraries. That is how you will learn. The Hard Freaking Way!

### The Problem:
We begin with a very simple problem with very limited data.<br>
Below, we have a set of random data points plotted on a 2-dimensional graph with axis X(B) and axis Y(A).
Some data points are coloured/labeled red and some are labeled blue.
(Seeing the graph, your brain has already recognized the pattern and classified these data points in a certain way. You have to admit that our brain is very good at pattern recognition.)<br>
Our job is to create a neural network, feed this data to the network and train it so that it can correctly classify new (test) data points with reasonable accuracy. We are making this classifier to understand what's going on inside each layer, and not just see the network as a "black box".
For testing, a new data point will be given (say, (2, -3)). Our neural network classifier will identify the label of the given data point (either Red or Blue).

![Classification](images/data_plot.png)

# Architecture of our network

### Data format
Our data points can simply be represented on a 2-dimensional graph with coordinates (X1 and X2) and a label Y. In the table below, Y is either 0 or 1. 0 indicates a blue point (mostly in the left part of the graph) and 1 indicates a red point (mostly in the right part).

| X1 | X2 | Y |
|--- |:--:|--:|
| 0  | 0  | 0 |
| 0  | 1  | 0 |
| 1  | 0  | 1 |
| -3 | 2  | 0 |
| ...|... |...|
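
As a minimal sketch, the rows of this table could be held in two NumPy arrays. The names `X` and `Y` are only illustrative here; the full code later in this tutorial calls them `input_array` and `output_array`.

```python
import numpy as np

# The four rows shown in the table above (the full data set has more).
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [-3, 2]])

# One label per row of X, stored as a column vector (0 = blue, 1 = red).
Y = np.array([[0, 0, 1, 0]]).T

print(X.shape)  # (4, 2): four points, two coordinates each
print(Y.shape)  # (4, 1): four labels
```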

### Constructing our neural network with this data
Constructing the network is easy. We have a mandatory <b>Input</b> layer and an <b>Output</b> layer.
There should be 2 input neurons to accept the input data, as each data point is defined by X1 and X2. Similarly, we define 1 output neuron as the classification is binary.
We will use 1 hidden layer with 4 neurons. You can play with the model later by increasing the number of hidden layer neurons or even the number of hidden layers.
![Architecture](images/architecture.png)

### Weights / Synapses
First set of connections: the weight matrix has 2 X 4 dimensions (Input neurons X Hidden layer neurons).<br>
Second set of connections: the weight matrix has 4 X 1 dimensions (Hidden layer neurons X Output layer neurons).
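
To make those dimensions concrete, here is a small sketch that only builds placeholder matrices of the right shape. The variable names are made up for this illustration, and the zeros stand in for the random weights used later.

```python
import numpy as np

input_neurons, hidden_neurons, output_neurons = 2, 4, 1

# Placeholder values; only the shapes matter here.
weights_input_to_hidden = np.zeros((input_neurons, hidden_neurons))
weights_hidden_to_output = np.zeros((hidden_neurons, output_neurons))

print(weights_input_to_hidden.shape)   # (2, 4)
print(weights_hidden_to_output.shape)  # (4, 1)

# Every entry is one connection (synapse): 2*4 + 4*1 = 12 in total.
print(weights_input_to_hidden.size + weights_hidden_to_output.size)  # 12
```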

### Activation function and sigmoid
An activation function squashes its input into a fixed range such as 0 to 1 or -1 to 1 (depending upon the function used). Here, we are using the sigmoid function, which looks like an 'S' curve: it accepts any input from -infinity to +infinity and returns a value between 0 and 1.

s(x) = 1/(1+e^(-x))
![Sigmoid](images/sigmoid.gif)

### What does that mean?
It means that if we pass any number to the sigmoid function above (call it "nonlin"), say 5 and -5 as in the figure, the output follows the curve.
```
nonlin(5) # Check the figure, the value (red curve) is closer to 1 and so is our output.
0.9933071490757153

nonlin(-5) # Check the figure, the value (red curve) is closer to 0 and so is our output.
0.006692850924284856
```
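The `nonlin` function itself is not shown in this tutorial. A minimal sketch, assuming it is nothing more than the sigmoid formula s(x) = 1/(1+e^(-x)) written with NumPy:

```python
import numpy as np

def nonlin(x):
    # Sigmoid: squashes any real number into the range (0, 1).
    return 1 / (1 + np.exp(-x))

print(nonlin(5))   # close to 1
print(nonlin(-5))  # close to 0
print(nonlin(0))   # exactly 0.5, the middle of the 'S' curve
```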
### The SUMMARIZED process.
- We feed the input training set to the neural network.
- We initialize random weights and perform **forward propagation**.
- With that process, the network gives some output. We compare this output with the actual output label (Y).
- The difference we get is the error for that particular iteration.
- We **backpropagate** to reduce the error by changing the weights by a small margin.
- We keep iterating until the error is really low.
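
The steps above can be sketched on a deliberately tiny case: a single layer with no hidden neurons, trained on four points taken from the data set. This is only an illustration of the loop, not the network we build below; the names `inputs`, `labels` and `weights` and the choice of 10000 iterations are assumptions made for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)

# Step 1: a few training points and their labels (0 = blue, 1 = red).
inputs = np.array([[0, 1], [1, 0], [-3, 2], [1, -1]])
labels = np.array([[0, 1, 0, 1]]).T

# Step 2: random starting weights in [-1, 1), one per input.
weights = 2 * np.random.random((2, 1)) - 1
initial_error = np.abs(labels - sigmoid(np.dot(inputs, weights))).mean()

for iteration in range(10000):
    output = sigmoid(np.dot(inputs, weights))   # forward propagation
    error = labels - output                     # steps 3-4: compare with Y
    delta = error * output * (1 - output)       # error scaled by sigmoid slope
    weights += np.dot(inputs.T, delta)          # step 5: nudge the weights

final_error = np.abs(labels - sigmoid(np.dot(inputs, weights))).mean()
print("mean error before:", initial_error)
print("mean error after: ", final_error)
```

The error after training should be much smaller than before, which is all the loop is trying to achieve.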

# Just give me the damn code.
### (It won't be a bad idea to blindly (with your eyes wide open) copy-paste this code snippet and try tweaking things on your own. If not, skip this code and stick with me for the rest of the part).
**PS: Don't forget to ``` pip3 install numpy``` if you haven't already.**

```
import numpy as np


class Layer():
    def __init__(self, number_of_inputs_per_neuron, number_of_neurons):
        # Random weights in [-1, 1), shape: inputs per neuron x neurons.
        self.synaptic_weights = 2 * np.random.random((number_of_inputs_per_neuron, number_of_neurons)) - 1


class NeuralNetwork():
    def __init__(self, layer1, layer2):
        self.layer1 = layer1
        self.layer2 = layer2

    # The Sigmoid function, which describes an S shaped curve.
    # We pass the weighted sum of the inputs through this function to
    # normalise them between 0 and 1.
    def sigmoid_activation(self, x):
        return 1 / (1 + np.exp(-x))

    # The derivative of the Sigmoid function.
    # This is the gradient of the Sigmoid curve.
    # Note: x here is the OUTPUT of the sigmoid, not its raw input.
    def sigmoid_activation_derivative(self, x):
        return x * (1 - x)

    # The neural network forward_propagates.
    def forward_propagate(self, inputs):
        output_from_layer1 = self.sigmoid_activation(np.dot(inputs, self.layer1.synaptic_weights))
        output_from_layer2 = self.sigmoid_activation(np.dot(output_from_layer1, self.layer2.synaptic_weights))
        return output_from_layer1, output_from_layer2

    # We train the neural network through a process of trial and error.
    # Adjusting the synaptic weights each time.
    def train(self, training_set_inputs, training_set_outputs, number_of_training_iterations):
        for iteration in range(number_of_training_iterations):
            # Pass the training set through our neural network
            output_from_layer_1, output_from_layer_2 = self.forward_propagate(training_set_inputs)

            # Calculate the error for layer 2 (The difference between the desired output
            # and the predicted output).
            layer2_error = training_set_outputs - output_from_layer_2
            layer2_delta = layer2_error * self.sigmoid_activation_derivative(output_from_layer_2)

            # Calculate the error for layer 1 (By looking at the weights in layer 2,
            # we can determine by how much layer 1 contributed to the error in layer 2).
            layer1_error = np.dot(layer2_delta, self.layer2.synaptic_weights.T)
            layer1_delta = layer1_error * self.sigmoid_activation_derivative(output_from_layer_1)

            # Calculate how much to adjust the weights by
            layer1_adjustment = np.dot(training_set_inputs.T, layer1_delta)
            layer2_adjustment = np.dot(output_from_layer_1.T, layer2_delta)

            # Adjust the weights.
            self.layer1.synaptic_weights += layer1_adjustment
            self.layer2.synaptic_weights += layer2_adjustment

            # Report progress as a percentage of the requested iterations.
            if iteration % 10000 == 0:
                print("Training:" , int(100 * iteration / number_of_training_iterations), "%")
            if iteration == number_of_training_iterations - 1:
                print("Training complete - 100%!")

    # The neural network prints its weights
    def print_weights(self):
        print ("    Layer 1 (4 neurons, each with 2 inputs): ")
        print (self.layer1.synaptic_weights)
        print ("    Layer 2 (1 neuron, with 4 inputs):")
        print (self.layer2.synaptic_weights)

if __name__ == "__main__":

    # Seed the random number generator
    np.random.seed(1)

    # Create layer 1 (4 neurons, each with 2 inputs)
    layer1 = Layer(2, 4)

    # Create layer 2 (a single neuron with 4 inputs)
    layer2 = Layer(4, 1)

    # Combine the layers to create a neural network
    neural_network = NeuralNetwork(layer1, layer2)

    print ("Stage 1) Random starting synaptic weights: ")
    neural_network.print_weights()

    input_array = np.array([
        [0, 0],
        [0, 1],
        [0, 2],
        [-1, 0],
        [-2, -2],
        [-2, -1],
        [-1, -1],
        [1, -1],
        [1, 0],
        [1, 1],
        [2, 0],
        [3, 0],
        [0, -3],
        [0, -2],
        [2, -2]])
    output_array = np.array([[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]]).T

    # The training set. We have 15 examples, each consisting of 2 input values
    # and 1 output value.
    training_set_inputs = input_array

    training_set_outputs = output_array
    # Train the neural network using the training set.
    # Do it 100,000 times and make small adjustments each time.
    neural_network.train(training_set_inputs, training_set_outputs, 100000)

    print ("Stage 2) New synaptic weights after training: ")
    neural_network.print_weights()

    # Test the neural network with a new situation.
    new_test = np.array([2, -3])
    hidden_state, output = neural_network.forward_propagate(new_test)
    print ("Stage 3) Considering a new situation", new_test)
    print (output)
```

### Be patient and bear with me! That might look complicated but it's not.

## HERE WE GO!

We start off by creating a simple work-flow. That is, we will make a skeleton of the program before defining any working classes or methods.

To begin with, let's seed the random number generator. This is not mandatory, but it makes the "random" values the same on every run, so the results are reproducible. Read more [here](https://stackoverflow.com/questions/21494489/what-does-numpy-random-seed0-do/21494630#21494630).
### Seed it!

```
np.random.seed(1)
```
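A quick way to see what seeding buys us, as a small sketch: re-seeding with the same number restarts the same sequence of "random" values. The variable names are only for this illustration.

```python
import numpy as np

np.random.seed(1)
first_run = np.random.random(3)

np.random.seed(1)                 # same seed, so the sequence starts over
second_run = np.random.random(3)

print(first_run)
print(second_run)
print(np.array_equal(first_run, second_run))  # True
```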
### We will now create Neural Network layers (In this case, layer1 (hidden layer) and layer2 (output layer)).
```
# Create layer 1 (4 neurons, each with 2 inputs)

layer1 = Layer(2, 4)

# Create layer 2 (a single neuron with 4 inputs)

layer2 = Layer(4, 1)
```
#### Don't worry about calling the ``` Layer() ``` class. We haven't even built one. We will define it shortly.
### Now, take the two layers and pass them to the non-existent ```NeuralNetwork()``` class and finally print the initial weights using ```print_weights()``` for analysis.

```
# Combine the layers to create a neural network

neural_network = NeuralNetwork(layer1, layer2)

print ("Stage 1) Random starting synaptic weights: ")

neural_network.print_weights()
```
### Now, take the input data and train the network on it using our non-existent ```train()``` method.
```
input_array = np.array([[0, 0],[0, 1],[0, 2],[-1, 0],[-2, -2],[-2, -1],[-1, -1],
                         [1, -1],[1, 0],[1, 1],[2, 0],[3, 0],[0, -3],[0, -2],[2, -2]])
output_array = np.array([[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]]).T

training_set_inputs = input_array

training_set_outputs = output_array

neural_network.train(training_set_inputs, training_set_outputs, 100000)
```
#### We iterate 100000 times so that the weights get updated gradually toward the most appropriate final values.
### Print the new synaptic weights after training. Notice the change in weights after this many iterations.
```
print ("Stage 2) New synaptic weights after training: ")

neural_network.print_weights()
```
### Test the network with a new test situation and forward propagate with the newly updated weights.
```
# Test the neural network with a new situation.

new_test = np.array([2, -3])

hidden_state, output = neural_network.forward_propagate(new_test)

print ("Stage 3) Considering a new situation", new_test)

print (output)
```

### So far, our Neural Network looks something like this

```
import numpy as np

np.random.seed(1)

layer1 = Layer(2, 4) # Layer() class is not defined yet.

layer2 = Layer(4, 1)

neural_network = NeuralNetwork(layer1, layer2) # NeuralNetwork() class is not defined yet.

print ("Stage 1) Random starting synaptic weights: ")
neural_network.print_weights() # print_weights() method is not defined yet.

input_array = np.array([[0, 0],[0, 1],[0, 2],[-1, 0],[-2, -2],[-2, -1],[-1, -1],
                        [1, -1],[1, 0],[1, 1],[2, 0],[3, 0],[0, -3],[0, -2],[2, -2]])
output_array = np.array([[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]]).T

training_set_inputs = input_array

training_set_outputs = output_array

neural_network.train(training_set_inputs, training_set_outputs, 100000) # train() method is not defined yet.

print ("Stage 2) New synaptic weights after training: ")
neural_network.print_weights()

new_test = np.array([2, -3])
hidden_state, output = neural_network.forward_propagate(new_test) # forward_propagate() method isn't defined yet.
print ("Stage 3) Considering a new situation", new_test)
print (output)
```

### Time to define all the Classes and methods
#### ```class Layer()```
We first define a ``` class Layer() ``` which has an init method.
```
class Layer():
    def __init__(self, number_of_inputs_per_neuron, number_of_neurons):
        self.synaptic_weights = 2 * np.random.random((number_of_inputs_per_neuron, number_of_neurons)) - 1
```
Here, we are simply constructing the weight matrix, filled with randomly initialized numbers, with the dimensions discussed earlier: **previous layer neurons X current layer neurons**. This is equivalent to saying ``` number_of_inputs_per_neuron ``` X ``` number_of_neurons ```. ``` np.random.random ``` returns values in the range [0, 1), so multiplying by 2 and subtracting 1 spreads the starting synaptic weights over [-1, 1) with a mean of about 0.
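
To see the effect of the "multiply by 2, subtract 1" step, here is a small sketch that builds the 2 X 4 matrix by hand. With the seed set to 1 it should reproduce the Layer 1 starting weights printed in the final run of this tutorial; the names `raw` and `weights` are only for this illustration.

```python
import numpy as np

np.random.seed(1)

raw = np.random.random((2, 4))   # uniform values in [0, 1)
weights = 2 * raw - 1            # stretched and shifted into [-1, 1)

print(weights)
print(weights.min() >= -1 and weights.max() < 1)  # True
```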



#### ``` class NeuralNetwork() ```
We define a ``` class NeuralNetwork() ``` which has **six** methods.

```
def __init__(self, layer1, layer2):
    self.layer1 = layer1
    self.layer2 = layer2
```
We initialize the NeuralNetwork() class by creating the layer variables.

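```
def sigmoid_activation(self, x):
    return 1 / (1 + np.exp(-x))

def sigmoid_activation_derivative(self, x):
    return x * (1 - x)
```
While you must be familiar with the ``` sigmoid_activation ``` method, the derivative of the sigmoid (``` sigmoid_activation_derivative ```) gives the slope at any point on the sigmoid curve. Note that it expects the *output* of the sigmoid, not the raw input: if s is the sigmoid's output, the slope there is s * (1 - s), which is why ``` train ``` passes it the layer outputs.
In other words, this is the gradient of the Sigmoid curve. The slope is largest when the output is near 0.5 and shrinks toward 0 as the output approaches 0 or 1, so outputs the network is already confident about lead to smaller weight changes.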
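
If you want to convince yourself that `x * (1 - x)` really is the slope, a rough numerical check is to compare it against a finite-difference estimate. This is only a sketch; the function names and the step size `h` are choices made for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_derivative_from_output(s):
    # s must already be a sigmoid output, i.e. s = sigmoid(z).
    return s * (1 - s)

z = np.array([-5.0, 0.0, 5.0])
analytic = sigmoid_derivative_from_output(sigmoid(z))

# Central finite difference: (f(z + h) - f(z - h)) / (2h)
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

print(analytic)                               # steepest (0.25) at z = 0
print(np.allclose(analytic, numeric, atol=1e-6))  # True
```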

```
def forward_propagate(self, inputs):
    output_from_layer1 = self.sigmoid_activation(np.dot(inputs, self.layer1.synaptic_weights))
    output_from_layer2 = self.sigmoid_activation(np.dot(output_from_layer1, self.layer2.synaptic_weights))
    return output_from_layer1, output_from_layer2
```
Forward propagation is the process of "moving forward" through the network. In other words, we multiply each layer's inputs by its weights and then apply the sigmoid activation to squash the result. We do this layer by layer until we obtain the final output.<br>
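
Here is the same forward pass as a standalone sketch, outside the class, so you can watch the shapes change from layer to layer. The untrained random weights and the three sample points are assumptions for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
layer1_weights = 2 * np.random.random((2, 4)) - 1   # input -> hidden
layer2_weights = 2 * np.random.random((4, 1)) - 1   # hidden -> output

batch = np.array([[0, 0], [1, 0], [2, -3]])         # three points at once

hidden = sigmoid(np.dot(batch, layer1_weights))     # (3, 2) . (2, 4) -> (3, 4)
output = sigmoid(np.dot(hidden, layer2_weights))    # (3, 4) . (4, 1) -> (3, 1)

print(hidden.shape)  # (3, 4)
print(output.shape)  # (3, 1)
print(output)        # one value between 0 and 1 per point
```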

And, finally, we define the ``` train ``` method to run forward propagation, compare the output with the labels and apply backpropagation to update the weights so that the network performs better in the next iterations.<br>
Below, we have:<br>
``` layer2_error ``` - which is just the difference between the actual label and the output the network produced.<br>
``` layer2_delta ``` - which is the product of the error and the gradient at the output. This keeps the weight change gradual.<br>
The same goes for layer 1, whose error is ``` layer2_delta ``` passed back through the layer 2 weights.<br>
Ultimately, the weight adjustment can be done by:
```
# Calculate how much to adjust the weights by
layer1_adjustment = np.dot(training_set_inputs.T, layer1_delta)
layer2_adjustment = np.dot(output_from_layer_1.T, layer2_delta)

# Adjust the weights.
self.layer1.synaptic_weights += layer1_adjustment
self.layer2.synaptic_weights += layer2_adjustment
```
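
The sketch below runs one backpropagation step outside the class to show that each adjustment has exactly the same shape as the weight matrix it updates. It also scales the update by a `learning_rate`, which is a common variation and NOT part of this tutorial's code (the tutorial adds the raw adjustment, effectively a rate of 1); the value 0.5 and the four sample points are assumptions for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
w1 = 2 * np.random.random((2, 4)) - 1
w2 = 2 * np.random.random((4, 1)) - 1

inputs = np.array([[0, 1], [1, 0], [-3, 2], [1, -1]])
targets = np.array([[0, 1, 0, 1]]).T

# Forward pass.
hidden = sigmoid(np.dot(inputs, w1))
output = sigmoid(np.dot(hidden, w2))

# Backward pass: error times sigmoid slope, layer by layer.
layer2_delta = (targets - output) * output * (1 - output)
layer1_delta = np.dot(layer2_delta, w2.T) * hidden * (1 - hidden)

layer1_adjustment = np.dot(inputs.T, layer1_delta)
layer2_adjustment = np.dot(hidden.T, layer2_delta)

print(layer1_adjustment.shape == w1.shape)  # True
print(layer2_adjustment.shape == w2.shape)  # True

learning_rate = 0.5  # hypothetical; not used in the tutorial's code
w1 += learning_rate * layer1_adjustment
w2 += learning_rate * layer2_adjustment
```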

# Where are you now? Not Atlantis! :p

### We can now put things together and see the neural network operate.

```
import numpy as np


class Layer():
    def __init__(self, number_of_inputs_per_neuron, number_of_neurons):
        # Random weights in [-1, 1), shape: inputs per neuron x neurons.
        self.synaptic_weights = 2 * np.random.random((number_of_inputs_per_neuron, number_of_neurons)) - 1


class NeuralNetwork():
    def __init__(self, layer1, layer2):
        self.layer1 = layer1
        self.layer2 = layer2

    # The Sigmoid function, which describes an S shaped curve.
    # We pass the weighted sum of the inputs through this function to
    # normalise them between 0 and 1.
    def sigmoid_activation(self, x):
        return 1 / (1 + np.exp(-x))

    # The derivative of the Sigmoid function.
    # This is the gradient of the Sigmoid curve.
    # Note: x here is the OUTPUT of the sigmoid, not its raw input.
    def sigmoid_activation_derivative(self, x):
        return x * (1 - x)

    # The neural network forward_propagates.
    def forward_propagate(self, inputs):
        output_from_layer1 = self.sigmoid_activation(np.dot(inputs, self.layer1.synaptic_weights))
        output_from_layer2 = self.sigmoid_activation(np.dot(output_from_layer1, self.layer2.synaptic_weights))
        return output_from_layer1, output_from_layer2

    # We train the neural network through a process of trial and error.
    # Adjusting the synaptic weights each time.
    def train(self, training_set_inputs, training_set_outputs, number_of_training_iterations):
        for iteration in range(number_of_training_iterations):
            # Pass the training set through our neural network
            output_from_layer_1, output_from_layer_2 = self.forward_propagate(training_set_inputs)

            # Calculate the error for layer 2 (The difference between the desired output
            # and the predicted output).
            layer2_error = training_set_outputs - output_from_layer_2
            layer2_delta = layer2_error * self.sigmoid_activation_derivative(output_from_layer_2)

            # Calculate the error for layer 1 (By looking at the weights in layer 2,
            # we can determine by how much layer 1 contributed to the error in layer 2).
            layer1_error = np.dot(layer2_delta, self.layer2.synaptic_weights.T)
            layer1_delta = layer1_error * self.sigmoid_activation_derivative(output_from_layer_1)

            # Calculate how much to adjust the weights by
            layer1_adjustment = np.dot(training_set_inputs.T, layer1_delta)
            layer2_adjustment = np.dot(output_from_layer_1.T, layer2_delta)

            # Adjust the weights.
            self.layer1.synaptic_weights += layer1_adjustment
            self.layer2.synaptic_weights += layer2_adjustment

            # Report progress as a percentage of the requested iterations.
            if iteration % 10000 == 0:
                print("Training:" , int(100 * iteration / number_of_training_iterations), "%")
            if iteration == number_of_training_iterations - 1:
                print("Training complete - 100%!")

    # The neural network prints its weights
    def print_weights(self):
        print ("    Layer 1 (4 neurons, each with 2 inputs): ")
        print (self.layer1.synaptic_weights)
        print ("    Layer 2 (1 neuron, with 4 inputs):")
        print (self.layer2.synaptic_weights)

if __name__ == "__main__":

    # Seed the random number generator
    np.random.seed(1)

    # Create layer 1 (4 neurons, each with 2 inputs)
    layer1 = Layer(2, 4)

    # Create layer 2 (a single neuron with 4 inputs)
    layer2 = Layer(4, 1)

    # Combine the layers to create a neural network
    neural_network = NeuralNetwork(layer1, layer2)

    print ("Stage 1) Random starting synaptic weights: ")
    neural_network.print_weights()

    input_array = np.array([
        [0, 0],
        [0, 1],
        [0, 2],
        [-1, 0],
        [-2, -2],
        [-2, -1],
        [-1, -1],
        [1, -1],
        [1, 0],
        [1, 1],
        [2, 0],
        [3, 0],
        [0, -3],
        [0, -2],
        [2, -2]])
    output_array = np.array([[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1]]).T

    # The training set. We have 15 examples, each consisting of 2 input values
    # and 1 output value.
    training_set_inputs = input_array

    training_set_outputs = output_array
    # Train the neural network using the training set.
    # Do it 100,000 times and make small adjustments each time.
    neural_network.train(training_set_inputs, training_set_outputs, 100000)

    print ("Stage 2) New synaptic weights after training: ")
    neural_network.print_weights()

    # Test the neural network with a new situation.
    new_test = np.array([2, -3])
    hidden_state, output = neural_network.forward_propagate(new_test)
    print ("Stage 3) Considering a new situation", new_test)
    print (output)
```


```
Stage 1) Random starting synaptic weights: 
    Layer 1 (4 neurons, each with 2 inputs): 
[[-0.16595599  0.44064899 -0.99977125 -0.39533485]
 [-0.70648822 -0.81532281 -0.62747958 -0.30887855]]
    Layer 2 (1 neuron, with 4 inputs):
[[-0.20646505]
 [ 0.07763347]
 [-0.16161097]
 [ 0.370439  ]]
Training: 0 %
Training: 10 %
Training: 20 %
Training: 30 %
Training: 40 %
Training: 50 %
Training: 60 %
Training: 70 %
Training: 80 %
Training: 90 %
Training complete - 100%!
Stage 2) New synaptic weights after training: 
    Layer 1 (4 neurons, each with 2 inputs): 
[[-3.97185157  4.66297249 -5.25188655  1.39753303]
 [ 1.35299999 -1.57439556  1.76665557 -0.71676559]]
    Layer 2 (1 neuron, with 4 inputs):
[[ -7.20493687]
 [  6.57701463]
 [-11.30181144]
 [  0.89289527]]
Stage 3) Considering a new situation [ 2 -3]
[0.99942671]
```


# BOOM!
The new situation (2,-3) is predicted as 0.999 which is about 1.<br>
This means the point (2,-3) on the graph should be labeled red (Y label 1 is for red as mentioned above), which is pretty much what we expected.<br>
Still unconvinced? Try with different points and analyze what works and what doesn't. The more you explore, the more you will be able to understand the inner mechanisms of a neural network.
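
If you'd rather not retrain every time you try a point, one option is to copy the trained weights from the "Stage 2" printout above and run only the forward pass. This is a sketch: the helper name `predict_colour` and the 0.5 cut-off between blue and red are assumptions made for this illustration, not part of the tutorial's code.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Trained weights copied from the "Stage 2" printout above.
layer1_weights = np.array([
    [-3.97185157, 4.66297249, -5.25188655, 1.39753303],
    [1.35299999, -1.57439556, 1.76665557, -0.71676559],
])
layer2_weights = np.array([[-7.20493687], [6.57701463], [-11.30181144], [0.89289527]])

def predict_colour(point, threshold=0.5):
    # Forward pass only: input -> hidden -> output.
    hidden = sigmoid(np.dot(point, layer1_weights))
    score = sigmoid(np.dot(hidden, layer2_weights))[0]
    # Assumed rule: scores at or above the threshold count as red (label 1).
    return ("red" if score >= threshold else "blue"), score

for point in ([2, -3], [-2, -2], [3, 0]):
    print(point, predict_colour(np.array(point)))
```

The first point is the same (2,-3) used above, so its score should agree with the 0.999 from the training run.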

Congratulations! We have just built our very own tiny neural network ALL from scratch.<br>
It might not be the best network out there but it WORKS! And hopefully you now know all the nuts and bolts of a simple multi-layered neural network.

# Next Steps?
It's completely up to you. How far do you want to go?

## Let's get in touch!
The name is **Akash Adhikari**.<br> 
**Address**- 221B Baker Street<br>
No, wait!<br>
**Address**: To and forth Kathmandu, Biratnagar or just about everywhere.<br>
Contact me:<br>
    **Facebook**: fb.com/akashbrt<br>
    **Github**  : github.com/akashadhikari<br>
    **website** : akashadhikari.com.np<br>
    **email**   : akashsky1313@gmail.com