# Neural-Network Implementation from Scratch

## PLAN

This notebook implements a neural network that learns the XOR gate. The network has one hidden layer with two units. We start by choosing random weights and biases and computing the predicted output for those random values.

We then use gradient descent to reduce the cost function: on each iteration we backpropagate the error and update the weights and biases, repeating until the output is good enough.
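
The plan can be written compactly as equations. This is a sketch, not something the notebook states: it assumes a half squared-error cost (the notebook never names its cost function, but the `final_output - Y` term used in the training loop matches this choice). The symbols $W_h, b_h, W_o, b_o$ stand for `weights_hidden`, `bias_hidden`, `weights_output`, `bias_output`, and $\eta$ for `learning_rate`.

```latex
\begin{aligned}
Z_h &= X W_h + b_h, & A_h &= \sigma(Z_h), \\
Z_o &= A_h W_o + b_o, & \hat{Y} &= \sigma(Z_o), \\
J &= \tfrac{1}{2} \sum_i (\hat{y}_i - y_i)^2, \\
\delta_o &= (\hat{Y} - Y) \odot \sigma'(Z_o), & \delta_h &= (\delta_o W_o^{\top}) \odot \sigma'(Z_h), \\
W_o &\leftarrow W_o - \eta\, A_h^{\top} \delta_o, & b_o &\leftarrow b_o - \eta \textstyle\sum_i \delta_{o,i}, \\
W_h &\leftarrow W_h - \eta\, X^{\top} \delta_h, & b_h &\leftarrow b_h - \eta \textstyle\sum_i \delta_{h,i}.
\end{aligned}
```

Here $\odot$ is the element-wise product and the sums over $i$ run over the four training examples.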

In [1]:
import numpy as np

In [2]:
# sigmoid function

def sigmoid(z):
    return 1/(1 + np.exp(-z))

In [3]:
# derivative of sigmoid function

def derivative_sigmoid(z):
    return (sigmoid(z)*(1 - sigmoid(z))) 
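
As a quick sanity check, the analytic derivative can be compared with a central finite difference. This is a minimal sketch: the test points in `z` and the step size `eps` are arbitrary choices made for illustration, not values from the notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def derivative_sigmoid(z):
    return sigmoid(z) * (1 - sigmoid(z))

# a handful of test points (arbitrary choice)
z = np.linspace(-5.0, 5.0, 11)

# central-difference approximation of the derivative
eps = 1e-6  # assumed step size
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

# largest disagreement between the numeric and analytic derivative
max_abs_diff = np.max(np.abs(numeric - derivative_sigmoid(z)))
print(max_abs_diff)
```

A very small printed value suggests the formula `sigmoid(z) * (1 - sigmoid(z))` is implemented correctly.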

## Preparing the data

In [4]:
# input values
X = np.array([[0,0] , [0,1] , [1,0] , [1,1]])
# output values
Y = np.array([[0] , [1] , [1] , [0]])
X.shape , Y.shape

((4, 2), (4, 1))
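
To confirm that `Y` really is the XOR of the two input columns, the labels can be recomputed with NumPy's `logical_xor`. This is only an illustrative check; the names `xor_column` and `labels_match` are not part of the notebook.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
Y = np.array([[0], [1], [1], [0]])

# XOR of the two input columns, reshaped to a column vector like Y
xor_column = np.logical_xor(X[:, 0], X[:, 1]).astype(int).reshape(-1, 1)

# True when the hand-written labels agree with the computed XOR
labels_match = np.array_equal(xor_column, Y)
print(labels_match)
```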

The functions below generate random weights and biases for us.

In [5]:
# returns an n x m NumPy array of random numbers drawn uniformly from [0, 1)

def generate_random(n,m):
    return np.random.random((n,m))

# rescales values from [0, 1) to [-1, 1)
def bring_in_range(x):
    return 2*x - 1
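
The effect of the two helpers can be checked on a larger sample. This is a sketch: the seed and the sample size are assumptions added here so the check is repeatable, and the notebook itself does not seed the generator.

```python
import numpy as np

np.random.seed(0)  # hypothetical seed, only to make this sketch repeatable

def generate_random(n, m):
    return np.random.random((n, m))

def bring_in_range(x):
    return 2 * x - 1

# draw many values and confirm they all land in [-1, 1)
sample = bring_in_range(generate_random(1000, 2))
print(sample.shape, sample.min() >= -1, sample.max() < 1)
```

Starting with weights on both sides of zero, rather than only positive values, is the apparent reason for the rescaling step.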

## Generating random weights and biases for every layer

In [6]:
# Generating random weights and biases

weights_hidden = generate_random(2,2)
weights_hidden = bring_in_range(weights_hidden)
bias_hidden = generate_random(1,2)
bias_hidden = bring_in_range(bias_hidden)
weights_output = generate_random(2,1)
weights_output = bring_in_range(weights_output)
bias_output = generate_random(1,1)
bias_output = bring_in_range(bias_output)

# Defining a learning rate for gradient descent, which is used to minimize the cost function
learning_rate = 0.1

### FORWARD PROPAGATION (without back propagation)

In [7]:
# note: despite its name, expected_output holds the network input X (the targets are in Y)
expected_output = X

# preparing the input for the hidden layer by multiplying weights and adding biases
hidden_layer_input = np.dot(X , weights_hidden) + bias_hidden

# applying sigmoid function on the input
hidden_layer_output = sigmoid(hidden_layer_input)

# preparing the input for the output layer by multiplying weights and adding biases
output_layer_input = np.dot(hidden_layer_output , weights_output) + bias_output

# applying sigmoid function on the input
final_output = sigmoid(output_layer_input)

final_output

array([[0.65698497],
       [0.64178181],
       [0.64981579],
       [0.63743885]])

### We have an output, but it is not a good one. The expected output was [0,1,1,0], and every value here is close to 0.65. This is because we have not yet applied back propagation to minimize the error; this output comes from the random weights alone.
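
To put a number on how far off this is, the mean squared error between the printed output and `Y` can be computed. This is a sketch: the array below is copied from the output shown above, and mean squared error is an assumed metric that the notebook itself does not use.

```python
import numpy as np

Y = np.array([[0], [1], [1], [0]])

# values copied from the output printed above
initial_output = np.array([[0.65698497],
                           [0.64178181],
                           [0.64981579],
                           [0.63743885]])

# mean squared error between prediction and target
mse = np.mean((initial_output - Y) ** 2)
print(mse)
```

For comparison, a network that always predicted 0.5 would score exactly 0.25 on this data, so the random weights are doing no better than a constant guess.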

## Forward and Backward propagation

In [8]:
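# training loop: forward propagation followed by back propagation

# We will perform 10,000 iterations to reduce error
for iterate in range(10000):
    # note: despite its name, expected_output holds the network input X (the targets are in Y)
    expected_output = X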
    
    # FORWARD PROPAGATION STARTS HERE
    
    # preparing the input for the hidden layer by multiplying weights and adding biases
    hidden_layer_input = np.dot(expected_output , weights_hidden) + bias_hidden
    
    # applying sigmoid function on the input
    hidden_layer_output = sigmoid(hidden_layer_input)
    
    # preparing the input for the output layer by multiplying weights and adding biases
    output_layer_input = np.dot(hidden_layer_output , weights_output) + bias_output
    
    # applying sigmoid function on the input
    final_output = sigmoid(output_layer_input)
    
    # FORWARD PROPAGATION ENDS HERE
    
    # BACKWARD PROPAGATION STARTS HERE

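    # output_layer_first_term = Y predicted - Y actual
    # (this is the gradient of a half squared-error cost with respect to the prediction)
    output_layer_first_term = final_output - Y

    # output_layer_second_term = derivative of the sigmoid at the output layer's input
    output_layer_second_term = derivative_sigmoid(output_layer_input)

    # error signal (delta) of the output layer
    output_layer_first_two = output_layer_second_term * output_layer_first_term

    # propagate the output delta back through weights_output
    hidden_layer_first_term = np.dot(output_layer_first_two , weights_output.T)
    # derivative of the sigmoid at the hidden layer's input
    hidden_layer_second_term = derivative_sigmoid(hidden_layer_input)
    # error signal (delta) of the hidden layer
    hidden_layer_first_two = hidden_layer_second_term * hidden_layer_first_term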
    
    # Calculating the changes in weights required

    changes_output_weight = np.dot(hidden_layer_output.T , output_layer_first_two)
    changes_output_bias = np.sum(output_layer_first_two , axis = 0 , keepdims = True )
    # keepdims=True keeps the sum two-dimensional, shape (1, 1), matching bias_output;
    # with keepdims=False the result would be a 1-D array of shape (1,)

    changes_hidden_weight = np.dot(expected_output.T , hidden_layer_first_two)
    changes_hidden_bias = np.sum(hidden_layer_first_two , axis = 0 , keepdims = True )
    
    # Updating weights and biases

    weights_output = weights_output - learning_rate * changes_output_weight
    bias_output = bias_output - learning_rate * changes_output_bias
    weights_hidden = weights_hidden - learning_rate * changes_hidden_weight
    bias_hidden = bias_hidden - learning_rate * changes_hidden_bias
    
    # BACKWARD PROPAGATION ENDS HERE
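
The backward-pass formulas used in the loop can be verified with a numerical gradient check. The sketch below makes several assumptions that are not in the notebook: the cost is taken to be half the sum of squared errors, the seed is arbitrary, and `cost` and `numeric_grad` are hypothetical helpers introduced only for this check.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(X, Y, W_h, b_h, W_o, b_o):
    # assumed cost: half the sum of squared errors
    A_h = sigmoid(X @ W_h + b_h)
    Y_hat = sigmoid(A_h @ W_o + b_o)
    return 0.5 * np.sum((Y_hat - Y) ** 2)

def numeric_grad(f, W, eps=1e-6):
    # central-difference gradient of f, one entry of W at a time
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        step = np.zeros_like(W)
        step[idx] = eps
        grad[idx] = (f(W + step) - f(W - step)) / (2 * eps)
    return grad

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

np.random.seed(1)  # hypothetical seed for repeatability
W_h = 2 * np.random.random((2, 2)) - 1
b_h = 2 * np.random.random((1, 2)) - 1
W_o = 2 * np.random.random((2, 1)) - 1
b_o = 2 * np.random.random((1, 1)) - 1

# analytic gradients, same formulas as the training loop
# (sigmoid'(z) is written as a * (1 - a), where a = sigmoid(z))
A_h = sigmoid(X @ W_h + b_h)
Y_hat = sigmoid(A_h @ W_o + b_o)
delta_o = (Y_hat - Y) * Y_hat * (1 - Y_hat)
delta_h = (delta_o @ W_o.T) * A_h * (1 - A_h)
grad_W_o = A_h.T @ delta_o   # corresponds to changes_output_weight
grad_W_h = X.T @ delta_h     # corresponds to changes_hidden_weight

# numerical gradients of the same cost
num_W_o = numeric_grad(lambda W: cost(X, Y, W_h, b_h, W, b_o), W_o)
num_W_h = numeric_grad(lambda W: cost(X, Y, W, b_h, W_o, b_o), W_h)

max_diff_output = np.max(np.abs(grad_W_o - num_W_o))
max_diff_hidden = np.max(np.abs(grad_W_h - num_W_h))
print(max_diff_output, max_diff_hidden)
```

If both printed differences are tiny, the update terms in the loop agree with the gradient of the assumed cost.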

## With the updated weights and biases, we calculate the final output

In [9]:
# Forward pass with the trained weights
# note: despite its name, expected_output holds the network input X (the targets are in Y)
expected_output = X
hidden_layer_input = np.dot(X , weights_hidden) + bias_hidden
hidden_layer_output = sigmoid(hidden_layer_input)
output_layer_input = np.dot(hidden_layer_output , weights_output) + bias_output
final_output = sigmoid(output_layer_input)

In [10]:
final_output

# We can clearly see that the final_output is very close to [0,1,1,0]

array([[0.05514049],
       [0.93250801],
       [0.93231645],
       [0.06878953]])
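
To turn these continuous outputs into binary XOR predictions, they can be thresholded. This is a sketch: the array is copied from the output printed above, and the 0.5 cut-off is an assumption, since the notebook does not specify one.

```python
import numpy as np

Y = np.array([[0], [1], [1], [0]])

# values copied from the output printed above
final_output = np.array([[0.05514049],
                         [0.93250801],
                         [0.93231645],
                         [0.06878953]])

# 0.5 cut-off is an assumption; outputs above it count as 1
predictions = (final_output > 0.5).astype(int)
print(predictions.ravel())  # prints [0 1 1 0]
```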

## After 10,000 iterations of forward and back propagation, the final output is close to the required output [0,1,1,0]: each value is within about 0.07 of its target.