# Neural Networks
## What are neural networks?
A neural network is a machine learning model that is loosely based on the workings of the human brain. Feature values are passed to its input nodes, flow through one or more layers, and produce a result at the output nodes.
<img src="neural_nets.png" height=600 width=600>

## Activation Function
In neural networks, an activation function is the function a node applies to its weighted input before passing the result forward. It determines what information is carried to the next layer and, when it is non-linear, lets the network model non-linear relationships.
<img src="activation-functions.png" height=600 width=600>
<img src="activation-function.png" height=600 width=600>
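As a rough illustration of the idea above, here is a minimal sketch of three commonly used activation functions. The choice of `sigmoid`, `tanh`, and `relu`, and the sample values in `z`, are assumptions made for this example only.

```python
import numpy as np

def sigmoid(z):
    """Squash values into the range (0, 1)."""
    return 1 / (1 + np.exp(-z))

def tanh(z):
    """Squash values into the range (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Keep positive values and replace negative values with 0."""
    return np.maximum(0, z)

z = np.array([-2.0, 0.0, 2.0])  # illustrative pre-activation values
print("sigmoid:", sigmoid(z))
print("tanh:   ", tanh(z))
print("relu:   ", relu(z))
```

All three take the same input and differ only in how they reshape it, which is why the activation function can be swapped without changing the rest of the network.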

## Weights
Weights tell us **how important a particular feature is for the model**. A neural network learns by adjusting these weights. Each weight is assigned a random value initially, and the weights of a layer are usually stored as a matrix or a vector. How they change is covered in the Backpropagation section.
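To make the matrix form concrete, here is a small sketch of a randomly initialized weight matrix and the weighted sums it produces. The sizes, the seed, and the names `W`, `x`, and `weighted_sum` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only so the example is reproducible

n_features, n_hidden = 2, 3  # illustrative layer sizes
# One row per input feature, one column per node in the next layer
W = rng.uniform(size=(n_features, n_hidden))

x = np.array([[0.5, 0.1]])  # a single sample with two features
weighted_sum = x @ W        # shape (1, 3): one weighted sum per node

print(W.shape, weighted_sum.shape)
```

A larger weight in row `i` means feature `i` contributes more to that node's weighted sum.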

## Input Layer
This is the layer that receives the signals or features. In even simpler terms, the input layer is **X**. We pass values into this layer, and the network passes them forward.

## Hidden Layers
These are the layers where the magic happens. Each node in a hidden layer applies an activation function to its weighted input, and the resulting outputs are passed on to the nodes of the next layer.
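The computation inside one hidden layer can be sketched as follows, assuming a sigmoid activation. The names `W_hidden`, `b_hidden`, and `hidden_out`, and the layer sizes, are hypothetical and chosen for this example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)  # seed chosen only for reproducibility

X = rng.random((3, 2))                 # 3 samples, 2 features
W_hidden = rng.uniform(size=(2, 3))    # 2 inputs feeding 3 hidden nodes
b_hidden = rng.uniform(size=(1, 3))    # one bias per hidden node

# Weighted sum plus bias, then the activation function, for every node at once
hidden_out = sigmoid(X @ W_hidden + b_hidden)

print(hidden_out.shape)  # one row per sample, one column per hidden node
```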

## Output Layer
This is the final layer in the neural network model. The output is calculated or shown in this layer. We get the output in the form of a continuous value or a class depending on the task at hand.
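The two kinds of output mentioned above can be sketched like this. The use of softmax followed by `argmax` for the classification case is an assumption for illustration, and the raw scores are made-up values.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum before exponentiating for numerical stability
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Regression: the raw value of the output node is used as the prediction
raw_regression = np.array([[0.7], [1.9]])
prediction = raw_regression

# Classification: raw scores become probabilities, and the largest one picks the class
raw_scores = np.array([[2.0, 1.0, 0.1],
                       [0.2, 0.3, 3.0]])
probs = softmax(raw_scores)
classes = probs.argmax(axis=1)

print(prediction.ravel())
print(classes)
```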

## Forward Propagation
Also known as the forward pass, this is the process of passing the data forward through the neural network, layer by layer, to get an output. Running a forward pass on a trained network to make predictions is also referred to as inference.
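A forward pass through a network with one hidden layer might look like the sketch below. It assumes sigmoid activations in both layers; the function name `forward` and the random values are illustrative, while `wh`, `bh`, `wo`, and `bo` follow the naming used in the implementation later in this notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X, wh, bh, wo, bo):
    """Pass X through the hidden layer, then through the output layer."""
    hidden = sigmoid(X @ wh + bh)       # hidden layer activations
    output = sigmoid(hidden @ wo + bo)  # network output
    return hidden, output

rng = np.random.default_rng(2)  # seed chosen only for reproducibility
X = rng.random((3, 2))          # 3 samples, 2 features
wh, bh = rng.uniform(size=(2, 3)), rng.uniform(size=(1, 3))
wo, bo = rng.uniform(size=(3, 1)), rng.uniform(size=(1, 1))

hidden, output = forward(X, wh, bh, wo, bo)
print(hidden.shape, output.shape)
```

Returning the hidden activations as well as the output is a convenience: backpropagation needs both.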

## Backpropagation
Possibly the most important topic in this section, backpropagation is the process of measuring the error, that is, the distance between the output the network produced and the output expected of it, and propagating that error backwards to adjust the weights. It does this by finding the partial derivative of the error with respect to the weights. This is done for every set of weights. The forward pass then occurs again with the newly updated weights, which should produce an output closer to the expected output. This process is repeated for a certain number of iterations.
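For a network with one hidden layer, the update rules can be written out as below. This is a sketch under two assumptions not stated in the paragraph above: every node uses the sigmoid activation, and the error is the squared error. Here $x_i$ is an input, $a_h$ a hidden-node output, $o_k$ a network output, $y_k$ the expected output, and $\eta$ the learning rate; all of these symbols are introduced for this example.

```latex
\begin{aligned}
E &= \tfrac{1}{2} \sum_k (y_k - o_k)^2 \\
\delta_k &= (y_k - o_k)\, o_k (1 - o_k) \\
\delta_h &= a_h (1 - a_h) \sum_k w_{hk}\, \delta_k \\
w_{hk} &\leftarrow w_{hk} + \eta\, \delta_k\, a_h \\
w_{ih} &\leftarrow w_{ih} + \eta\, \delta_h\, x_i
\end{aligned}
```

The factor $o_k (1 - o_k)$ is the derivative of the sigmoid expressed through its own output. Since $\partial E / \partial w_{hk} = -\delta_k\, a_h$, adding $\eta\, \delta_k\, a_h$ moves each weight against the gradient of the error.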

# Neural Networks from Scratch
The following section is a simple implementation to understand how neural networks work. The code is based on the VTU machine learning lab. I'd like to thank [Praahas Amin](https://github.com/praahas) for [his work](https://github.com/praahas/machine-learning-vtu) on it and would like to thank him for letting me use his code for the purpose of this demonstration.

In [26]:
# Import modules
import numpy as np

In [37]:
# Initialize values
X = np.random.rand(3, 2)
X

array([[0.16315786, 0.34687718],
       [0.52440762, 0.29383145],
       [0.81306845, 0.06814426]])

In [39]:
y = np.random.rand(3, 1)
y

array([[0.94500291],
       [0.69429435],
       [0.11578679]])

In [29]:
# # Divide each element of X by an array with max of each column
# X = X / np.max(X, axis=0) 
# # Divide each element of y by 100
# y = y / 100

In [30]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))  # np.exp(-x) computes e^(-x), where e is Euler's number

def sigmoid_derive(x):
    # Derivative of the sigmoid, where x is already a sigmoid output
    return x * (1 - x)
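Note that `sigmoid_derive` expects a value that has already been passed through `sigmoid`, not the raw input. The following sketch checks this numerically with a central difference; the test points in `z` and the step size `eps` are arbitrary choices for the example.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derive(s):
    # s is assumed to be a sigmoid output, i.e. s = sigmoid(z)
    return s * (1 - s)

z = np.linspace(-3, 3, 7)  # arbitrary raw inputs
eps = 1e-6                 # step size for the numerical derivative

# Central-difference approximation of d(sigmoid)/dz
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
# Analytic derivative, computed from the sigmoid output
analytic = sigmoid_derive(sigmoid(z))

print(np.allclose(numeric, analytic, atol=1e-6))
```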

In [31]:
# Initialize values
epoch = 7000
learning_rate = 0.2
input_nodes = 2
hidden_nodes = 3 
output_nodes = 1

In [32]:
# Weights of hidden layer
wh = np.random.uniform(size=(input_nodes, hidden_nodes))
# Weights of output layer
wo = np.random.uniform(size=(hidden_nodes, output_nodes))

# Bias of hidden layer
bh = np.random.uniform(size=(1, hidden_nodes))
# Bias of output layer
bo = np.random.uniform(size=(1, output_nodes))

In [33]:
# Training starts
for i in range(epoch):
    # Forward propagation
    net_h = np.dot(X, wh) + bh  # net input for hidden layer

    sigma_h = sigmoid(net_h)  # output of the hidden layer (sigmoid of net_h)

    net_o = np.dot(sigma_h, wo) + bo  # net input for output layer

    output = sigmoid(net_o)  # output of the output layer (sigmoid of net_o)

    # Backpropagation

    deltaK = (y - output) * sigmoid_derive(output)  # error term of the output layer

    deltaH = deltaK.dot(wo.T) * sigmoid_derive(sigma_h)  # error term of the hidden layer

    wo = wo + sigma_h.T.dot(deltaK) * learning_rate  # update output layer weights

    wh = wh + X.T.dot(deltaH) * learning_rate  # update hidden layer weights
    # The biases bh and bo are not updated in this loop

    # Square of the summed output deltas, divided by the number of samples
    error = sum(deltaK) ** 2 / len(deltaK)

    if i % 1000 == 0:
        print(f"Epoch -> {i}, learning_rate -> {learning_rate}, error_rate -> {error}")

Epoch -> 0, learning_rate -> 0.2, error_rate -> [8.7311097e-08]
Epoch -> 1000, learning_rate -> 0.2, error_rate -> [1.43336664e-06]
Epoch -> 2000, learning_rate -> 0.2, error_rate -> [6.85124874e-09]
Epoch -> 3000, learning_rate -> 0.2, error_rate -> [3.68031818e-08]
Epoch -> 4000, learning_rate -> 0.2, error_rate -> [3.83261085e-08]
Epoch -> 5000, learning_rate -> 0.2, error_rate -> [3.01310567e-08]
Epoch -> 6000, learning_rate -> 0.2, error_rate -> [2.18222444e-08]
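The loop above keeps `bh` and `bo` at their initial values and reports a quantity derived from `deltaK` rather than from the prediction error itself. As a possible variation, not part of the original lab code, the sketch below also updates the biases and tracks the mean squared error between `y` and `output`. The seed, the `history` list, and the bias update rule are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derive(s):
    # s is assumed to be a sigmoid output
    return s * (1 - s)

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
X = rng.random((3, 2))
y = rng.random((3, 1))

wh, bh = rng.uniform(size=(2, 3)), rng.uniform(size=(1, 3))
wo, bo = rng.uniform(size=(3, 1)), rng.uniform(size=(1, 1))
learning_rate = 0.2

history = []  # mean squared error recorded at every epoch
for i in range(7000):
    # Forward propagation
    sigma_h = sigmoid(X @ wh + bh)
    output = sigmoid(sigma_h @ wo + bo)
    history.append(np.mean((y - output) ** 2))

    # Backpropagation
    deltaK = (y - output) * sigmoid_derive(output)
    deltaH = (deltaK @ wo.T) * sigmoid_derive(sigma_h)

    # Update weights and, unlike the loop above, the biases as well
    wo += learning_rate * (sigma_h.T @ deltaK)
    bo += learning_rate * deltaK.sum(axis=0, keepdims=True)
    wh += learning_rate * (X.T @ deltaH)
    bh += learning_rate * deltaH.sum(axis=0, keepdims=True)

print(f"MSE at start: {history[0]:.6f}, at end: {history[-1]:.6f}")
```

Tracking the mean squared error makes it easier to see whether training is converging, since it measures the gap between predictions and targets directly.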


In [25]:
print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

Input: 
[[0.08894989 1.        ]
 [1.         0.82233848]
 [0.30251951 0.84390054]]
Actual Output: 
[[0.00053973]
 [0.00211195]
 [0.00311097]]
Predicted Output: 
 [[0.00865435]
 [0.0064224 ]
 [0.00883296]]
