# Neural Network From Scratch

Based on the tutorial by **Trask**, available [here](https://iamtrask.github.io/2015/07/12/basic-python-network/)

* First, an implementation of a simple two-layer (input and output) **Artificial Neural Network**.

![nn1](nn1.png)

Importing dependencies

In [348]:
import numpy as np

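Creating the **Sigmoid** function

A convenient property of the sigmoid function is that its derivative can be computed from its output. If the sigmoid's output is a variable **out**, then the derivative is simply **out * (1-out)**. For this reason, calling the function below with `deriv=True` expects a value that has already been passed through the sigmoid, not the raw input.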
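
As a quick sanity check of this identity, the sketch below compares out * (1 - out) against a central finite-difference estimate of the slope. The helper name `sigmoid_fn`, the sample points and the step size `eps` are illustrative choices, not part of the original notebook.

```python
import numpy as np

def sigmoid_fn(x):
    # plain logistic function, without the derivative switch
    return 1 / (1 + np.exp(-x))

xs = np.linspace(-4, 4, 9)   # illustrative sample points
out = sigmoid_fn(xs)

# derivative computed from the sigmoid's output only
analytic = out * (1 - out)

# central finite-difference estimate of the same slope
eps = 1e-6                   # illustrative step size
numeric = (sigmoid_fn(xs + eps) - sigmoid_fn(xs - eps)) / (2 * eps)

max_gap = np.max(np.abs(analytic - numeric))
print(max_gap)
```

The two estimates should agree to within floating-point noise, which is why the network never needs to keep the raw input of the sigmoid around.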

In [349]:
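def sigmoid(x, deriv=False):
    # With deriv=True, x must already be a sigmoid output,
    # so the derivative is simply x * (1 - x).
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))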

Input Data

In [350]:
X = np.array([[0,0,1],
              [0,1,1],
              [1,0,1],
              [1,1,1]])

Output Data

After the transpose, `y` has 4 rows and 1 column: one target value for each row of `X`.

In [351]:
y = np.array([[0,0,1,1]]).T
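
A small check of the shapes involved. The names `row` and `col` are illustrative and chosen so that `y` is left untouched.

```python
import numpy as np

row = np.array([[0, 0, 1, 1]])   # one row, four columns
col = row.T                      # four rows, one column

print(row.shape)   # (1, 4)
print(col.shape)   # (4, 1)
```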

We seed the random number generator so that the results are reproducible from run to run.

In [352]:
np.random.seed(0)

Initializing the weights randomly.

**w** is the weight matrix connecting layer 1 of the network to layer 2.  
Since there are only two layers, a single weight matrix is enough.

The weights are drawn uniformly from [-1, 1), so their mean is **zero**, a common choice when initializing weights.

In [353]:
w = 2*np.random.random((3,1)) - 1
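
To see why this expression centres the weights on zero: `np.random.random` draws from [0, 1), so multiplying by 2 and subtracting 1 maps the values to [-1, 1). The sketch below checks this on a much larger sample. It uses a separate `RandomState` generator so the notebook's global seed is not disturbed; the names `rng` and `sample` and the sample size are arbitrary choices for illustration.

```python
import numpy as np

# separate generator, so the global seed used by the notebook is untouched
rng = np.random.RandomState(0)

# same formula as for w, applied to a larger sample
sample = 2 * rng.random_sample((100000, 1)) - 1

in_range = bool(sample.min() >= -1 and sample.max() < 1)
sample_mean = float(sample.mean())

print(in_range)
print(sample_mean)
```

With only three weights, as in `w`, the sample mean will of course wander further from zero than it does here.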

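Considering,
* l1 as the first layer (the inputs)
* l2 as the second layer (the network's output)
* `np.dot(l1, w)` as a matrix product: the (4, 3) input matrix times the (3, 1) weight matrix gives a (4, 1) result, one value per training example.

We use **Full Batch Training**, which means feeding all of our training examples (here four) to the network at once in every iteration.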
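
As an illustration of what "all at once" means, the sketch below computes the forward pass for the whole batch with one matrix product and compares it with a loop over single examples. The names `sigmoid_fn`, `X_demo`, `w_demo`, `batch_out` and `single_out` are illustrative, not part of the original notebook.

```python
import numpy as np

def sigmoid_fn(x):
    return 1 / (1 + np.exp(-x))

X_demo = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

rng = np.random.RandomState(0)
w_demo = 2 * rng.random_sample((3, 1)) - 1

# full batch: one matrix product handles all four examples
batch_out = sigmoid_fn(np.dot(X_demo, w_demo))

# one example at a time, stacked back into a (4, 1) array
single_out = np.array([sigmoid_fn(np.dot(row, w_demo)) for row in X_demo])

print(batch_out.shape, single_out.shape)
print(np.allclose(batch_out, single_out))
```

Both routes give the same numbers; the batched form just lets NumPy do the looping.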

In [354]:
for j in range(10000):
    # forward propagation
    l1 = X
    l2 = sigmoid(np.dot(l1, w))  # inputs times weights, then the activation

    # calculating the error
    l2_err = y - l2

    # calculating the delta: multiplying the error by the sigmoid's slope
    # shrinks the update for predictions that are already confident
    l2_delta = l2_err * sigmoid(l2, True)

    # updating the weights
    w = w + np.dot(l1.T, l2_delta)

In [355]:
l2

array([[0.00966779],
       [0.00786453],
       [0.99358992],
       [0.99211751]])

We can observe these are almost equal to y = [ [0, 0, 1, 1] ]
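
One way to make "almost equal" concrete is to threshold the outputs and compare them with the targets. The sketch below retrains the same two-layer network under its own names (`X_demo`, `y_demo`, `w_demo`, `sigmoid_fn`, all illustrative). The 0.5 cut-off is an assumption for this example, not something the original notebook specifies.

```python
import numpy as np

def sigmoid_fn(x):
    return 1 / (1 + np.exp(-x))

X_demo = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])
y_demo = np.array([[0, 0, 1, 1]]).T

rng = np.random.RandomState(0)
w_demo = 2 * rng.random_sample((3, 1)) - 1

# same training steps as the loop above
for _ in range(10000):
    out = sigmoid_fn(np.dot(X_demo, w_demo))
    delta = (y_demo - out) * out * (1 - out)
    w_demo = w_demo + np.dot(X_demo.T, delta)

# final forward pass, then a hard 0/1 decision (assumed 0.5 threshold)
out = sigmoid_fn(np.dot(X_demo, w_demo))
predictions = (out > 0.5).astype(int)

print(predictions.ravel())
print(np.array_equal(predictions, y_demo))
```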

---

* Second, an implementation of a three-layer (input, hidden and output) **Artificial Neural Network**.

![nn2](nn2.png)

We will use the sigmoid function as already defined above.

In [356]:
X = np.array([[0,0,1],
              [0,1,1],
              [1,0,1],
              [1,1,1]])

In [357]:
y = np.array([[0],
              [1],
              [1],
              [0]])

In [358]:
np.random.seed(1)

In [359]:
w1 = 2*np.random.random((3,4)) - 1 #weights between input and hidden
w2 = 2*np.random.random((4,1)) - 1 #weights between hidden and output

In [361]:
for j in range(60000):
    
    #forward propagating
    l1 = X
    l2 = sigmoid(np.dot(l1, w1))
    l3 = sigmoid(np.dot(l2, w2))
    
    #Error
    l3_err = y - l3
    
    #To output error at different intervals
    if(j % 10000 == 0):
        print("Error: "+str(np.mean(np.abs(l3_err))))
    
    l3_delta = l3_err * sigmoid(l3, True)
    
    # how much did each l2 value contribute to the l3 error 
    l2_err = l3_delta.dot(w2.T)
    
    l2_delta = l2_err * sigmoid(l2, True)

    #updating the weights
    w2 = w2 + np.dot(l2.T, l3_delta)
    w1 = w1 + np.dot(l1.T, l2_delta)

Error: 0.49641003190272537
Error: 0.008584525653247159
Error: 0.0057894598625078085
Error: 0.004629176776769985
Error: 0.003958765280273649
Error: 0.0035101225678616766


**l2_err = l3_delta.dot(w2.T)**

This line uses the **confidence weighted error** from l3 to establish an error for l2. To do this, it sends the l3 error back across the weights w2, from l3 to l2. The result is what you could call a **contribution weighted error**, because it tells us how much each node value in l2 "contributed" to the error in l3. This step is called **backpropagation**, and it gives the algorithm its name. We then update w1 using the same steps as in the two-layer implementation.

The printed error decreases steadily over the course of training.
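
The backpropagation step can be checked numerically. The sketch below assumes the loss being minimised is half the sum of squared errors (the notebook never names a loss, so this is an assumption), and compares the amount the loop adds to w1 with a finite-difference estimate of the negative gradient of that loss. All names (`sigmoid_fn`, `loss`, `X_demo`, `hidden`, `output`, `step_w1` and so on) are illustrative.

```python
import numpy as np

def sigmoid_fn(x):
    return 1 / (1 + np.exp(-x))

def loss(w1, w2, X, y):
    # assumed loss: half the sum of squared errors
    hidden = sigmoid_fn(np.dot(X, w1))
    output = sigmoid_fn(np.dot(hidden, w2))
    return 0.5 * np.sum((y - output) ** 2)

X_demo = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]], dtype=float)
y_demo = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.RandomState(1)
w1_demo = 2 * rng.random_sample((3, 4)) - 1
w2_demo = 2 * rng.random_sample((4, 1)) - 1

# backpropagation, following the same steps as the training loop
hidden = sigmoid_fn(np.dot(X_demo, w1_demo))
output = sigmoid_fn(np.dot(hidden, w2_demo))
out_delta = (y_demo - output) * output * (1 - output)
hid_delta = out_delta.dot(w2_demo.T) * hidden * (1 - hidden)
step_w1 = np.dot(X_demo.T, hid_delta)   # what the loop adds to w1

# finite-difference estimate of the negative gradient, one weight at a time
eps = 1e-6
numeric = np.zeros_like(w1_demo)
for i in range(w1_demo.shape[0]):
    for k in range(w1_demo.shape[1]):
        bump = np.zeros_like(w1_demo)
        bump[i, k] = eps
        up = loss(w1_demo + bump, w2_demo, X_demo, y_demo)
        down = loss(w1_demo - bump, w2_demo, X_demo, y_demo)
        numeric[i, k] = -(up - down) / (2 * eps)

max_gap = np.max(np.abs(step_w1 - numeric))
print(max_gap)
```

Under that assumed loss, the weight update amounts to a gradient-descent step with a step size of 1, so the two arrays should match up to floating-point noise.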

In [363]:
l3

array([[0.00260572],
       [0.99672209],
       [0.99701711],
       [0.00386759]])

We can observe these are almost equal to y = [ [0, 1, 1, 0] ]

### Done!