# Neural Network Blueprint

The basic workflow for a neural network is: **Build** -> **Train** -> **Test**

This will be demonstrated below.

## Neural Network (NN)
Typically a NN has an **Input** layer, one or more **Hidden** layers, and an **Output** layer.
Building and training one involves data pre-processing, weight initialization (random or non-random), a squashing function (to map every value into the range 0 to 1 or -1 to 1), and some form of backpropagation with gradient descent.
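
To make the squashing step concrete, here is a minimal sketch. It assumes the sigmoid for the 0 to 1 range and `np.tanh` for the -1 to 1 range; the helper name `sigmoid_slope_from_output` is hypothetical and only used for illustration. It also checks that the sigmoid's slope can be written in terms of its own output, `s * (1 - s)`, which is the form backpropagation relies on.

```python
import numpy as np

def sigmoid(x):
    """Squash any real value into the open interval (0, 1)."""
    return 1 / (1 + np.exp(-x))

def sigmoid_slope_from_output(s):
    """Slope of the sigmoid, expressed through its output s = sigmoid(x)."""
    return s * (1 - s)

values = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
squashed = sigmoid(values)        # every result lies strictly between 0 and 1
squashed_tanh = np.tanh(values)   # tanh squashes into (-1, 1) instead

# Compare the closed-form slope with a central finite-difference estimate.
eps = 1e-6
numeric_slope = (sigmoid(values + eps) - sigmoid(values - eps)) / (2 * eps)
analytic_slope = sigmoid_slope_from_output(squashed)

print(squashed)
print(np.allclose(numeric_slope, analytic_slope, atol=1e-8))
```

Note that the slope helper takes the already-squashed value, not the raw input, so it should be called on a layer's activations.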

```python
# Setup
# use numpy since it's a good library for scientific computing
import numpy as np

# create a function to map any value to a value between 0 and 1 (we use the sigmoid function here)
# with deriv=True, x must already be a sigmoid output, and the function returns the slope at that point
def sigmoid(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# initialize input data set as a matrix
# (each row is a different training example, each column is a different input neuron)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 1]])

# output data set (one expected value per training example)
y = np.array([[0], [1], [1], [0]])

# seed the random number generator so the "random" weights are the same
# on every run (useful for debugging)
np.random.seed(1)

# create connections between layers (neuron in one layer to neuron in the next layer)
# since this NN has 3 layers, we need 2 synapse matrices with random weights in [-1, 1)
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1

# training step
for j in range(50000):
    # first layer is just the input data
    l0 = X
    # multiply the layer by its synapse weights (dot product),
    # then apply the sigmoid to every value to get the next layer
    l1 = sigmoid(np.dot(l0, syn0))
    # repeat the same steps on the hidden layer to get the
    # network's prediction of the output data
    l2 = sigmoid(np.dot(l1, syn1))

    # compare predicted output value (l2 above) to expected output value
    l2_error = y - l2

    # print the mean absolute error at set intervals so we can check that it goes down during training
    if j % 10000 == 0:
        print("Error:" + str(np.mean(np.abs(l2_error))))

    # multiply the error by the slope of the sigmoid at l2
    # this gives us a delta which we use to update the synapses on every
    # training iteration to reduce the error
    l2_delta = l2_error * sigmoid(l2, deriv=True)

    # check how much l1 contributed to the error in l2 (a.k.a. backpropagation)
    # get this by multiplying l2 delta by synapse 1's transpose
    l1_error = l2_delta.dot(syn1.T)
    # get l1 delta by multiplying l1 error by the slope of the sigmoid at l1
    l1_delta = l1_error * sigmoid(l1, deriv=True)

    # now update the synapse weights with each layer's delta to keep
    # reducing the error on every iteration (a.k.a. gradient descent)
    syn1 += l1.T.dot(l2_delta)
    syn0 += l0.T.dot(l1_delta)

# print the output
print("Output after training:")
print(l2)
```

```
Error:0.5028737600123812
Error:0.00911646534335037
Error:0.006295398146737811
Error:0.005085035346132255
Error:0.0043746563305548155
Output after training:
[[0.00215897]
 [0.99640868]
 [0.99552582]
 [0.00535417]]
```
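
For readers who prefer equations, the updates in the training loop above can be read as follows. This is one interpretation rather than something the code states: it assumes a squared-error loss $E$ and an implicit learning rate of 1, with $\odot$ denoting element-wise multiplication.

```latex
\begin{aligned}
E &= \tfrac{1}{2}\sum \left(y - l_2\right)^2 \\
\delta_2 &= (y - l_2) \odot l_2 \odot (1 - l_2) \\
\delta_1 &= \left(\delta_2\,\mathrm{syn}_1^{\top}\right) \odot l_1 \odot (1 - l_1) \\
\frac{\partial E}{\partial \mathrm{syn}_1} &= -\,l_1^{\top}\delta_2, \qquad
\frac{\partial E}{\partial \mathrm{syn}_0} = -\,l_0^{\top}\delta_1 \\
\mathrm{syn}_1 &\leftarrow \mathrm{syn}_1 + l_1^{\top}\delta_2, \qquad
\mathrm{syn}_0 \leftarrow \mathrm{syn}_0 + l_0^{\top}\delta_1
\end{aligned}
```

Here $\delta_2$ and $\delta_1$ correspond to `l2_delta` and `l1_delta`. Adding $l^{\top}\delta$ to a synapse matrix is the same as subtracting the gradient of $E$, which is why the updates lower the error.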


### Results

The mean absolute error falls at every reported checkpoint, from about 0.50 at the start to about 0.004 after 40,000 iterations, and each predicted output is within 0.01 of its expected value (0, 1, 1, 0).
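
The **Test** step is not shown separately above, so here is a minimal sketch of one way to do it. The `train` and `predict` helpers are hypothetical names introduced for illustration; `train` repeats the same update rule as the training loop, and `predict` runs only the forward pass with the learned weights. The unseen input `[1, 0, 1]` is an arbitrary choice, and no particular prediction for it is claimed.

```python
import numpy as np

def sigmoid(x):
    """Squash values into the range (0, 1)."""
    return 1 / (1 + np.exp(-x))

def train(X, y, hidden_units=4, iterations=50000, seed=1):
    """Train a 3-layer network using the same updates as the loop above."""
    rng = np.random.RandomState(seed)
    syn0 = 2 * rng.random_sample((X.shape[1], hidden_units)) - 1
    syn1 = 2 * rng.random_sample((hidden_units, y.shape[1])) - 1
    for _ in range(iterations):
        # forward pass
        l1 = sigmoid(X.dot(syn0))
        l2 = sigmoid(l1.dot(syn1))
        # backpropagate the error through both layers
        l2_delta = (y - l2) * (l2 * (1 - l2))
        l1_delta = l2_delta.dot(syn1.T) * (l1 * (1 - l1))
        # update the weights
        syn1 += l1.T.dot(l2_delta)
        syn0 += X.T.dot(l1_delta)
    return syn0, syn1

def predict(X, syn0, syn1):
    """Forward pass only: no weights are changed."""
    return sigmoid(sigmoid(X.dot(syn0)).dot(syn1))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 1]])
y = np.array([[0], [1], [1], [0]])

syn0, syn1 = train(X, y)

# Check the trained network against the examples it was trained on.
predictions = predict(X, syn0, syn1)
print("Mean absolute error:", np.mean(np.abs(y - predictions)))
print("Rounded predictions match y:", np.array_equal(np.round(predictions), y))

# Try an input the network never saw during training.
unseen_prediction = predict(np.array([[1, 0, 1]]), syn0, syn1)
print("Prediction for [1, 0, 1]:", unseen_prediction)
```

With only four training examples this is a sanity check rather than a real evaluation; a proper test would use held-out data that follows the same pattern as the training set.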