## Table of Contents:
1. Simple intuition behind Neural Networks.

2. Multi-Layer Perceptron and its basics.

3. Steps involved in Neural Network methodology.

4. Visualizing steps for Neural Network working methodology.

5. Implementing NN using NumPy (Python).

6. Implementing NN using R.

7. \[Optional\] Mathematical Perspective of Back Propagation Algorithm.

## Simple intuition behind neural networks

A neural network takes several inputs, processes them through multiple neurons in one or more hidden layers, and returns the result through an output layer. This result estimation process is technically known as "Forward Propagation".

Next, we compare the result with the actual output. The task is to make the output of the neural network as close as possible to the actual (desired) output. Each of these neurons contributes some error to the final output. How do you reduce the error?

We try to reduce the value/weight of the neurons that contribute more to the error. This happens while traveling back through the neurons of the network and finding where the error lies. This process is known as "Backward Propagation".

## Multi-Layer Perceptron and its basics
About Gradient Descent: 
https://www.analyticsvidhya.com/blog/2017/03/introduction-to-gradient-descent-algorithm-along-its-variants/
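
As a minimal sketch of the idea behind gradient descent (not taken from the linked article), the snippet below minimizes a simple one-variable function f(w) = (w - 3)^2 by repeatedly stepping against its gradient. The function, the starting point, and the helper name `gradient_descent` are assumptions chosen purely for illustration.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Illustrative function f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)  # approaches the minimizer w = 3
```

The same rule, applied to every weight and bias of the network at once, is what the weight updates in the steps below carry out.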

## Steps involved in Neural Network methodology.

0). We take the input X and the actual output y.

1). We initialize the weights and biases with random values (this is a one-time initialization; in later iterations we use the updated weights and biases).

2). We take the matrix dot product of the input and the weights assigned to the edges between the input and hidden layer, then add the biases of the hidden layer neurons to the respective inputs. This is known as a linear transformation:

hidden_layer_input = matrix_dot_product(X, wh) + bh

3). Perform a non-linear transformation using an activation function (sigmoid). Sigmoid returns the output as 1/(1+exp(-x)).

hiddenlayer_activations = sigmoid(hidden_layer_input)

4). Perform a linear transformation on the hidden layer activations (take the matrix dot product with the weights and add the bias of the output layer neuron), then apply an activation function (sigmoid is used again, but you can use any other activation function depending on your task) to predict the output.

output_layer_input = matrix_dot_product(hiddenlayer_activations, wout) + bout

output = sigmoid(output_layer_input)

### All above steps are known as "Forward Propagation"
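
The forward pass in steps 2 to 4 can be sketched in NumPy as follows. This is an illustrative sketch only: the helper name `forward`, the fixed seed, and the layer sizes (4 inputs, 3 hidden neurons, 1 output) are assumptions made so the array shapes can be traced.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward(X, wh, bh, wout, bout):
    """Steps 2-4: linear transform, sigmoid, linear transform, sigmoid."""
    hidden_layer_input = np.dot(X, wh) + bh                 # (n_samples, n_hidden)
    hiddenlayer_activations = sigmoid(hidden_layer_input)
    output_layer_input = np.dot(hiddenlayer_activations, wout) + bout  # (n_samples, 1)
    output = sigmoid(output_layer_input)
    return hiddenlayer_activations, output

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility
X = np.array([[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]])
wh = rng.uniform(size=(4, 3))    # input -> hidden weights
bh = rng.uniform(size=(1, 3))    # hidden biases, broadcast over the rows
wout = rng.uniform(size=(3, 1))  # hidden -> output weights
bout = rng.uniform(size=(1, 1))  # output bias

hidden, output = forward(X, wh, bh, wout, bout)
print(hidden.shape, output.shape)  # (3, 3) (3, 1)
```

Each row of `X` is one training example, so every intermediate array keeps 3 rows; the biases have a single row and are broadcast across the examples.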

5). Compare the prediction with the actual output and calculate the gradient of the error (Actual - Predicted). The error is the squared-error loss ((y - output)^2)/2, whose gradient with respect to the output is, up to sign:

E = y - output

6.) Compute the slope/gradient of the hidden and output layer neurons (to compute the slope, we calculate the derivative of the non-linear activation at each layer for each neuron). If x is the output of the sigmoid, its gradient can be written as x * (1 - x).

slope_output_layer = derivatives_sigmoid(output) 

slope_hidden_layer = derivatives_sigmoid(hiddenlayer_activations)

7.) Compute the change factor (delta) at the output layer, which is the gradient of the error multiplied by the slope of the output layer activation:

d_output = E * slope_output_layer

8.) At this step, the error propagates back into the network, which gives the error at the hidden layer. For this, we take the dot product of the output layer delta with the weights of the edges between the hidden and output layer (wout.T).

Error_at_hidden_layer = matrix_dot_product(d_output, wout.Transpose) 

9.) Compute the change factor (delta) at the hidden layer by multiplying the error at the hidden layer by the slope of the hidden layer activation:

d_hiddenlayer = Error_at_hidden_layer * slope_hidden_layer

10.) Update weights at the output and hidden layer: The weights in the network can be updated from the errors calculated for training example(s).

wout = wout + matrix_dot_product(hiddenlayer_activations.Transpose, d_output) * learning_rate

wh = wh + matrix_dot_product(X.Transpose, d_hiddenlayer) * learning_rate

learning_rate: The amount by which the weights are updated is controlled by a configuration parameter called the learning rate.

11.) Update biases at the output and hidden layer: The biases in the network can be updated from the aggregated errors at that neuron. 

• bias at output_layer = bias at output_layer + sum of the deltas of output_layer over the rows (training examples) * learning_rate

• bias at hidden_layer = bias at hidden_layer + sum of the deltas of hidden_layer over the rows (training examples) * learning_rate

bh = bh + sum(d_hiddenlayer, axis=0) * learning_rate 

bout = bout + sum(d_output, axis=0)*learning_rate
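
To see why the updates in steps 10 and 11 add the delta terms instead of subtracting them, here is a short chain-rule sketch for the output-layer weights. It assumes the squared-error loss from step 5; the symbols $L$ (loss), $h$ (hidden layer activations), $z$ (output layer input), $\odot$ (element-wise product) and $\eta$ (learning rate) are introduced here only for this sketch.

```latex
\begin{aligned}
L &= \tfrac{1}{2}\sum (y - \text{output})^2, \qquad \text{output} = \sigma(z), \qquad z = h\, w_{out} + b_{out} \\
\frac{\partial L}{\partial z} &= -(y - \text{output}) \odot \text{output} \odot (1 - \text{output}) = -\,d_{output} \\
\frac{\partial L}{\partial w_{out}} &= h^{\top}\, \frac{\partial L}{\partial z} = -\,h^{\top} d_{output} \\
w_{out} &\leftarrow w_{out} - \eta\, \frac{\partial L}{\partial w_{out}} = w_{out} + \eta\, h^{\top} d_{output}
\end{aligned}
```

Because E is defined as (y - output), the minus sign of the gradient is already absorbed into the delta, so a gradient descent step shows up as an addition. The same argument applies to the hidden layer weights and to both biases.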

### Steps from 5 to 11 are known as "Backward Propagation"
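
One way to gain confidence in steps 5 to 9 is to compare the backpropagated quantity with a numerical gradient of the loss. The sketch below does this for the hidden layer weights `wh`. The helper `loss`, the fixed seed, the step size `eps`, and the tolerance are assumptions made for this check; they are not part of the steps above.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss(X, y, wh, bh, wout, bout):
    """Squared-error loss summed over all training examples."""
    hidden = sigmoid(X @ wh + bh)
    output = sigmoid(hidden @ wout + bout)
    return 0.5 * np.sum((y - output) ** 2)

rng = np.random.default_rng(0)  # assumed seed, only for reproducibility
X = np.array([[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]], dtype=float)
y = np.array([[1.0], [1.0], [0.0]])
wh, bh = rng.uniform(size=(4, 3)), rng.uniform(size=(1, 3))
wout, bout = rng.uniform(size=(3, 1)), rng.uniform(size=(1, 1))

# Steps 5-9: backpropagated deltas.
hidden = sigmoid(X @ wh + bh)
output = sigmoid(hidden @ wout + bout)
E = y - output
d_output = E * output * (1 - output)
d_hiddenlayer = d_output.dot(wout.T) * hidden * (1 - hidden)
backprop_step_wh = X.T.dot(d_hiddenlayer)  # should equal -dLoss/dwh

# Central finite differences, one weight at a time.
eps = 1e-6
numeric_grad_wh = np.zeros_like(wh)
for i in range(wh.shape[0]):
    for j in range(wh.shape[1]):
        w_plus, w_minus = wh.copy(), wh.copy()
        w_plus[i, j] += eps
        w_minus[i, j] -= eps
        numeric_grad_wh[i, j] = (
            loss(X, y, w_plus, bh, wout, bout) - loss(X, y, w_minus, bh, wout, bout)
        ) / (2 * eps)

# The backprop step is the negative gradient, so the two should cancel.
max_diff = np.max(np.abs(backprop_step_wh + numeric_grad_wh))
print(max_diff < 1e-6)
```

If the two disagree, the usual suspects are a wrong transpose in step 8 or applying the sigmoid derivative to the layer input instead of the layer activation.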

## Implementing NN using NumPy (Python).

import numpy as np

# Input array
X = np.array([[1,0,1,0],[1,0,1,1],[0,1,0,1]])

# Output
y = np.array([[1],[1],[0]])

# Sigmoid Function
def sigmoid(x):
    return 1/(1+np.exp(-x))

def derivatives_sigmoid(x):
    return x * (1-x)


# Variable initialization
epoch=5000 #Setting training iterations
learning_rate = 0.1
inputlayer_neurons = X.shape[1]
hiddenlayer_neurons = 3
output_neurons = 1

# weight and bias initialization
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh=np.random.uniform(size=(1,hiddenlayer_neurons)) 
wout=np.random.uniform(size=(hiddenlayer_neurons,output_neurons)) 
bout=np.random.uniform(size=(1,output_neurons))

for i in range(epoch):
    # Forward Propagation
    hidden_layer_input = np.dot(X, wh) + bh
    hiddenlayer_activations = sigmoid(hidden_layer_input)
    output_layer_input=np.dot(hiddenlayer_activations,wout) + bout 
    output = sigmoid(output_layer_input)
    
    # Backpropagation
    E = y-output
    slope_output_layer = derivatives_sigmoid(output)
    slope_hidden_layer = derivatives_sigmoid(hiddenlayer_activations)
    d_output = E * slope_output_layer
    Error_at_hidden_layer = d_output.dot(wout.T)
    d_hiddenlayer = Error_at_hidden_layer * slope_hidden_layer
    wout += hiddenlayer_activations.T.dot(d_output) *learning_rate
    bout += np.sum(d_output, axis=0,keepdims=True) *learning_rate
    wh += X.T.dot(d_hiddenlayer) *learning_rate
    bh += np.sum(d_hiddenlayer, axis=0,keepdims=True) *learning_rate
    
print(output)
    

[[0.98143605]
 [0.96569667]
 [0.04737666]]
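
The printed values differ from run to run, because the weights are drawn randomly without a seed. As a hedged sketch, the same training loop can be wrapped in a function that fixes the seed and records the loss per epoch, which makes it easier to confirm that training is actually reducing the error. The function name `train`, the `seed` parameter, and the `losses` list are assumptions added for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train(X, y, hidden_neurons=3, epochs=5000, learning_rate=0.1, seed=0):
    """Train the one-hidden-layer network and record the loss at each epoch."""
    rng = np.random.default_rng(seed)
    wh = rng.uniform(size=(X.shape[1], hidden_neurons))
    bh = rng.uniform(size=(1, hidden_neurons))
    wout = rng.uniform(size=(hidden_neurons, y.shape[1]))
    bout = rng.uniform(size=(1, y.shape[1]))
    losses = []
    for _ in range(epochs):
        # Forward propagation
        hidden = sigmoid(X @ wh + bh)
        output = sigmoid(hidden @ wout + bout)
        # Backpropagation
        E = y - output
        losses.append(0.5 * np.sum(E ** 2))
        d_output = E * output * (1 - output)
        d_hidden = d_output.dot(wout.T) * hidden * (1 - hidden)
        wout += hidden.T.dot(d_output) * learning_rate
        bout += np.sum(d_output, axis=0, keepdims=True) * learning_rate
        wh += X.T.dot(d_hidden) * learning_rate
        bh += np.sum(d_hidden, axis=0, keepdims=True) * learning_rate
    # Note: output comes from the last forward pass, before the final update.
    return output, losses

X = np.array([[1, 0, 1, 0], [1, 0, 1, 1], [0, 1, 0, 1]])
y = np.array([[1], [1], [0]])
output, losses = train(X, y)
print(losses[0], losses[-1])  # the loss should be lower at the end
```

Plotting `losses` against the epoch number is a quick way to judge whether `learning_rate` and `epochs` are reasonable for a given dataset.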
