# Simple neural network for beginners

#### © Jubeen Shah 2018

This Jupyter notebook is meant to get anyone started with the basics of neural networks and, eventually, deep learning.

### 1) Importing the numpy library for math operations

Click [here](http://www.numpy.org/) for more details on NumPy.

In [1]:
import numpy as np

### 2) Defining the sigmoid function
`Pseudocode`<code>
    
    def sigmoid(x, derivative = False):
        check if derivative is passed as True:
        if so (x is assumed to already be a sigmoid output):
            return x * (1 - x)
        otherwise, for all other cases:
            return 1/(1 + exp(-x))
</code>
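The formula for the sigmoid is:
$$Sigmoid(x)=\frac{1}{1 + e^{-x}} $$

The derivative of the sigmoid can be written in terms of the sigmoid itself:
$$\frac{d}{dx}\frac{1}{1 + e^{-x}} = \frac{e^{-x}}{(1 + e^{-x})^2} = Sigmoid(x)\,(1 - Sigmoid(x)) $$

So when `x` already holds a sigmoid output, the derivative is simply `x * (1 - x)`, which is the form used in the code.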

In [2]:
def sigmoid(x, deriv=False):
    # With deriv=True, x must already be a sigmoid output
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
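
The `deriv=True` branch only gives the right slope when it is handed a sigmoid *output* rather than the raw input. The sketch below redefines `sigmoid` so it runs on its own and compares that branch with a central finite-difference estimate. The test point `z = 0.5` and the step `h` are arbitrary values chosen for illustration, not part of the original notebook.

```python
import numpy as np

def sigmoid(x, deriv=False):
    # With deriv=True, x must already be a sigmoid output
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

z = 0.5                            # arbitrary pre-activation value (assumption)
s = sigmoid(z)                     # sigmoid output at z
analytic = sigmoid(s, deriv=True)  # pass the OUTPUT s, not z

h = 1e-6                           # small step for the numerical estimate
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

print(analytic, numeric)           # the two values should agree closely
```

Passing `z` itself to the `deriv=True` branch would compute `z * (1 - z)`, which is not the slope of the sigmoid at `z`.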

### 3) Defining the input and the output dataset for the simple neural network

|A|B|C|O/P|
|---|---|---|---|
|0|0|1|  0|
|0|1|1|  0|
|1|0|1|  1|
|1|1|1|  1|


In [3]:
X = np.array([[0,0,1],
              [0,1,1],
              [1,0,1],
              [1,1,1]])
y = np.array([[0,0,1,1]]).T
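
As a quick sanity check on the table above, the sketch below rebuilds the same arrays and prints their shapes. It also confirms that the output column is simply a copy of input column A, which is the pattern the network has to discover.

```python
import numpy as np

X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([[0, 0, 1, 1]]).T   # .T turns the row into a column

print(X.shape)                        # (4, 3): 4 samples, 3 inputs each
print(y.shape)                        # (4, 1): 4 samples, 1 output each
print(np.array_equal(X[:, [0]], y))   # True: the output equals column A
```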

### 4) Defining the layers
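These are the weight matrices for this neural network. They are called "syn0" and "syn1", short for "synapse zero" and "synapse one". Since we have 3 layers (input l0, hidden l1, and output l2), we need two matrices of weights to connect them. syn0 has dimension (3,4) because it connects the 3 inputs to 4 hidden nodes, and syn1 has dimension (4,1) because it connects the 4 hidden nodes to the 1 output. Another way of looking at it is that l0 is of size 3, l1 is of size 4, and l2 is of size 1, and each matrix connects every node in one layer to every node in the next. The weights start as random values in the range [-1, 1), and the seed is fixed so that every run produces the same numbers. :)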

In [7]:
np.random.seed(1)
syn0 = 2*np.random.random((3,4)) - 1
syn1 = 2*np.random.random((4,1)) - 1
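
To make the dimensions concrete, here is a small sketch that builds the same two matrices and follows the shapes through one pair of matrix products. The all-zeros stand-in input `X_demo` is a hypothetical placeholder used only to show shapes; it is not the dataset from step 3.

```python
import numpy as np

np.random.seed(1)
syn0 = 2 * np.random.random((3, 4)) - 1   # input (3) -> hidden (4)
syn1 = 2 * np.random.random((4, 1)) - 1   # hidden (4) -> output (1)

print(syn0.shape, syn1.shape)             # (3, 4) (4, 1)
print(syn0.min() >= -1, syn0.max() < 1)   # True True: values lie in [-1, 1)

# Hypothetical stand-in input: 4 samples with 3 features each
X_demo = np.zeros((4, 3))
hidden = X_demo.dot(syn0)                 # (4, 3) x (3, 4) -> (4, 4)
out = hidden.dot(syn1)                    # (4, 4) x (4, 1) -> (4, 1)
print(hidden.shape, out.shape)            # (4, 4) (4, 1)
```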

### 5) Forward and backpropagation

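Each iteration starts with a forward pass: l0 is the input, l1 is the hidden layer, and l2 is the network's prediction. The l2 error (y - l2) is multiplied by the slope of the sigmoid at l2, which gives the "confidence weighted error". We then use this "confidence weighted error" from l2 to establish an error for l1. To do this, we simply send the error back across the weights from l2 to l1. This gives what you could call a "contribution weighted error", because we learn how much each node value in l1 "contributed" to the error in l2. This step is called "backpropagating" and is the namesake of the algorithm. Finally, both weight matrices are updated using these deltas.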

In [8]:
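for iteration in range(60000):
    # Forward pass: input -> hidden -> output
    l0 = X
    l1 = sigmoid(np.dot(l0,syn0))
    l2 = sigmoid(np.dot(l1,syn1))
    
    # How far off is the prediction?
    l2_error = y - l2
    
    if iteration % 5000 == 0:
        print("Error rate is : " + str(np.mean(np.abs(l2_error))))
    
    # Scale the error by the slope of the sigmoid at l2
    l2_delta = l2_error * sigmoid(l2, True)
    
    # Backpropagate: how much did each l1 node contribute to the l2 error?
    l1_error = l2_delta.dot(syn1.T)
    
    l1_delta = l1_error * sigmoid(l1, True)
    
    # Update each weight matrix using the layer that feeds into it
    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)  # l0 is the input to syn0 (not l2)
    

The output below was recorded with the earlier update `syn0 += l2.T.dot(l1_delta)`, which is why the error levels off near 0.16. With the `l0.T` update above, the error should fall much further.
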
Error rate is : 0.468534325458
Error rate is : 0.167929000433
Error rate is : 0.165670789246
Error rate is : 0.164686021751
Error rate is : 0.164093299619
Error rate is : 0.16368345298
Error rate is : 0.163376950548
Error rate is : 0.163135815501
Error rate is : 0.162939238195
Error rate is : 0.162774703074
Error rate is : 0.162634156709
Error rate is : 0.162512143433
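
To check what the network has learned, it helps to look at the predictions themselves rather than only the mean error. The following is a minimal, self-contained sketch of the same 3-layer setup. It makes a few assumptions: the `train` helper and its parameters are hypothetical names introduced here, `syn0` is updated from the layer's input `l0.T`, and the run uses 10,000 iterations (fewer than the loop above) just to keep it quick.

```python
import numpy as np

def sigmoid(x, deriv=False):
    # With deriv=True, x must already be a sigmoid output
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

def train(X, y, iterations=10000, seed=1):
    """Hypothetical helper: train a 3-4-1 network, return its last predictions."""
    rng = np.random.RandomState(seed)
    syn0 = 2 * rng.random_sample((3, 4)) - 1   # input -> hidden weights
    syn1 = 2 * rng.random_sample((4, 1)) - 1   # hidden -> output weights
    for _ in range(iterations):
        # Forward pass
        l0 = X
        l1 = sigmoid(l0.dot(syn0))
        l2 = sigmoid(l1.dot(syn1))
        # Backward pass
        l2_delta = (y - l2) * sigmoid(l2, deriv=True)
        l1_delta = l2_delta.dot(syn1.T) * sigmoid(l1, deriv=True)
        # Each matrix is updated from the layer that feeds into it
        syn1 += l1.T.dot(l2_delta)
        syn0 += l0.T.dot(l1_delta)
    return l2   # predictions from the final forward pass

X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([[0, 0, 1, 1]]).T

predictions = train(X, y)
print(np.round(predictions, 3))                   # values near 0 or 1
print(np.array_equal(np.round(predictions), y))   # do rounded outputs match y?
```

If the rounded predictions match `y`, the network has picked up that the output follows column A of the input table.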
