### Neural Network
A neural network is a machine learning model that can be used to classify data.

Example: Suppose we want to convert a handwritten digit into its digital form. We take the image of the handwritten digit and feed its pixels to the neural network as input. The processing happens in the layers between the input and output layers, called hidden layers. Finally, the output layer, consisting of 10 nodes (one for each digit from 0 to 9), classifies the input image as the digit whose node is activated the most (i.e., if node 3 has the highest activation, the digit is 3).
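
As a minimal sketch of that last step, the predicted digit is the index of the most strongly activated output node. The activation values below are made up for illustration and do not come from a trained network.

```python
import numpy as np

# Hypothetical activations of the 10 output nodes for one image (made-up values).
output_activations = np.array([0.01, 0.02, 0.05, 0.91, 0.03,
                               0.01, 0.02, 0.04, 0.01, 0.02])

# The predicted digit is the index of the node with the highest activation.
predicted_digit = int(np.argmax(output_activations))
print(predicted_digit)  # 3
```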

In [114]:
import numpy as np

#### Sigmoid Function
The sigmoid function is a mathematical function that takes any real number as input and outputs a value between 0 and 1.

In [115]:
def sigmoid_function(x):
    return 1/(1 + np.exp(-x))
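
A quick check of that behaviour, using a few sample inputs chosen only for illustration: large negative inputs give values near 0, large positive inputs give values near 1, and an input of 0 gives exactly 0.5.

```python
import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

# Sample inputs picked for illustration.
samples = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
activations = sigmoid_function(samples)

print(activations)
print(activations[2])  # 0.5
```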

#### Error Gradient
This function is used to minimize the error while backpropagating through the network. We do that by calculating the derivative of the sigmoid function at each activation in each layer.
When x is already the output of the sigmoid function, this derivative (the error gradient/slope) is given by x * (1 - x).

In [116]:
def error_gradient(x):
    return x * (1-x)
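
The sketch below compares `error_gradient` against a numerical estimate of the sigmoid's slope. It illustrates that the function expects sigmoid outputs, not raw inputs. The test points `z` and the step size `h` are arbitrary choices for this check.

```python
import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

def error_gradient(x):
    return x * (1 - x)

# Arbitrary raw inputs and a small step for the central difference.
z = np.linspace(-3.0, 3.0, 7)
h = 1e-5

# Numerical slope of the sigmoid at z.
numerical = (sigmoid_function(z + h) - sigmoid_function(z - h)) / (2 * h)

# Analytic slope: error_gradient applied to the sigmoid output, not to z itself.
analytic = error_gradient(sigmoid_function(z))

max_difference = np.max(np.abs(numerical - analytic))
print(max_difference)  # expected to be very small
```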

#### Prepare Data
We will design a 3-layer neural network with an input layer, 1 hidden layer, and an output layer.
- Input layer: 3 training samples with 3 features each.
- Output layer: the desired output (0 or 1) for each training sample.
- The 3 layers are connected by weights. One set of weights connects the input layer to the hidden layer, and another set connects the hidden layer to the output layer. The weights are called synapses and are randomly initialized.

In [117]:
input_layer = np.array([[0.4, 0.3, 0.1],
                        [1.0, 0.4, 0.4],
                        [0.0, 1.0, 0.8]])

output_layer = np.array([[0],
                        [0],
                        [1]])

np.random.seed(1)
weights_1 = 2 * np.random.random((3,4)) - 1
weights_2 = 2 * np.random.random((4,1)) - 1
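
To make the shapes concrete, here is a small sketch of a single forward pass through the untrained network. It repeats the setup above so that it runs on its own. The expression `2 * np.random.random(...) - 1` maps values from [0, 1) to [-1, 1).

```python
import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

input_layer = np.array([[0.4, 0.3, 0.1],
                        [1.0, 0.4, 0.4],
                        [0.0, 1.0, 0.8]])

np.random.seed(1)
weights_1 = 2 * np.random.random((3, 4)) - 1  # input (3 features) -> hidden (4 nodes)
weights_2 = 2 * np.random.random((4, 1)) - 1  # hidden (4 nodes) -> output (1 node)

# One forward pass: 3 samples in, 3 outputs out.
hidden_layer = sigmoid_function(np.dot(input_layer, weights_1))
calculated_output_layer = sigmoid_function(np.dot(hidden_layer, weights_2))

print(hidden_layer.shape)             # (3, 4)
print(calculated_output_layer.shape)  # (3, 1)
```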

#### Training
The goal of building a neural network is to find weights that, when plugged into the network, classify the input correctly.

Before training, we randomly initialize the weights that connect all the layers.

With the randomly initialized weights, we calculate the dot product of the input layer and the weights connecting the input layer to the hidden layer, and pass it to the sigmoid function to get the activations of the hidden layer. The output layer is calculated similarly from the hidden layer. Next, the error is calculated as the difference between the desired output and the calculated output.

Next, we do backpropagation. We calculate the derivative at the calculated output layer and multiply it by the output layer error to get the output layer delta, which is used later to adjust the weights connecting the hidden layer to the output layer. Then the hidden layer error is calculated by taking the dot product of the output layer delta and the weights connecting the hidden layer to the output layer, and it is used in the same way to calculate the hidden layer delta.

Then the weights are adjusted. The whole process is repeated many times to find weights that produce an output close to the desired output layer supplied above.
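
One iteration of this procedure can be summarized in equations. The symbols are notation assumed here for readability: $X$ is the input layer, $Y$ the desired output layer, $H$ the hidden layer, $\hat{Y}$ the calculated output layer, $W_1$ and $W_2$ the two weight matrices, $\sigma$ the sigmoid function, and $\odot$ element-wise multiplication. The update uses the deltas directly, which corresponds to a learning rate of 1.

$$
\begin{aligned}
H &= \sigma(X W_1) \\
\hat{Y} &= \sigma(H W_2) \\
E &= Y - \hat{Y} \\
\delta_2 &= E \odot \hat{Y} \odot (1 - \hat{Y}) \\
\delta_1 &= (\delta_2 W_2^{\top}) \odot H \odot (1 - H) \\
W_2 &\leftarrow W_2 + H^{\top} \delta_2 \\
W_1 &\leftarrow W_1 + X^{\top} \delta_1
\end{aligned}
$$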

In [118]:
# Hyperparameters
noIterations = 100000

for j in range(noIterations):
    # Forward pass: activations of the hidden and output layers
    hidden_layer = sigmoid_function(np.dot(input_layer, weights_1))
    calculated_output_layer = sigmoid_function(np.dot(hidden_layer, weights_2))

    # Error: desired output minus calculated output
    output_layer_error = output_layer - calculated_output_layer

    # Backpropagation: scale each error by the sigmoid slope at that activation
    output_layer_delta = output_layer_error * error_gradient(calculated_output_layer)
    hidden_layer_error = output_layer_delta.dot(weights_2.T)
    hidden_layer_delta = hidden_layer_error * error_gradient(hidden_layer)

    # Adjust weights (the deltas are applied directly, i.e. a learning rate of 1)
    weights_2 += hidden_layer.T.dot(output_layer_delta)
    weights_1 += input_layer.T.dot(hidden_layer_delta)

    # Print the mean absolute error every 25000 steps
    if j % 25000 == 0:
        print("Error:", str(np.mean(np.abs(output_layer_error))))
    
print("\nOUTPUT LAYER:")
print(calculated_output_layer)

print("\nWEIGHTS:")
print(weights_1)
print(weights_2)

Error: 0.5103155049500878
Error: 0.003311791991183617
Error: 0.0022739887619340404
Error: 0.0018290290995867549

OUTPUT LAYER:
[[2.07709076e-03]
 [5.76043072e-05]
 [9.97429249e-01]]

WEIGHTS:
[[ 4.40196274  0.8957048   2.540444   -3.80914267]
 [-1.6526529  -1.49514764 -1.59725686  1.1607846 ]
 [-1.87813772 -0.63999129 -1.6505048   2.38679322]]
[[-6.39283219]
 [-1.55583673]
 [-4.46493535]
 [ 6.94364049]]


#### Observation
After training the neural network on our dataset, we can see that the calculated output layer is very close to the actual output layer we supplied, with only a small error.

The neural network is now trained to predict the correct output for the training data. This means we can plug the final weights into a neural network to classify input data similar to the training data.
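
As a rough sketch of what plugging in the final weights could look like, the block below repeats the training with the same data and seed and then runs only the forward pass. The `predict` helper and the `new_input` sample are hypothetical and not part of the original notebook. The sample was made up to resemble the third training row.

```python
import numpy as np

def sigmoid_function(x):
    return 1 / (1 + np.exp(-x))

def error_gradient(x):
    return x * (1 - x)

def predict(inputs, weights_1, weights_2):
    """Hypothetical helper: forward pass only, with fixed weights."""
    hidden_layer = sigmoid_function(np.dot(inputs, weights_1))
    return sigmoid_function(np.dot(hidden_layer, weights_2))

# Same data, seed, and initialization as in the training cell.
input_layer = np.array([[0.4, 0.3, 0.1],
                        [1.0, 0.4, 0.4],
                        [0.0, 1.0, 0.8]])
output_layer = np.array([[0], [0], [1]])

np.random.seed(1)
weights_1 = 2 * np.random.random((3, 4)) - 1
weights_2 = 2 * np.random.random((4, 1)) - 1

# Same training loop as above, without the progress printing.
for _ in range(100000):
    hidden_layer = sigmoid_function(np.dot(input_layer, weights_1))
    calculated_output_layer = sigmoid_function(np.dot(hidden_layer, weights_2))
    output_layer_error = output_layer - calculated_output_layer
    output_layer_delta = output_layer_error * error_gradient(calculated_output_layer)
    hidden_layer_error = output_layer_delta.dot(weights_2.T)
    hidden_layer_delta = hidden_layer_error * error_gradient(hidden_layer)
    weights_2 += hidden_layer.T.dot(output_layer_delta)
    weights_1 += input_layer.T.dot(hidden_layer_delta)

# Predictions for the training data, rounded to 0 or 1.
training_predictions = predict(input_layer, weights_1, weights_2)
print(np.round(training_predictions))

# Made-up sample resembling the third training row.
new_input = np.array([[0.1, 0.9, 0.7]])
new_prediction = predict(new_input, weights_1, weights_2)
print(new_prediction)
```

With only three training samples, the prediction for an unseen input is an illustration of the mechanics and not a reliable classification.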