## Simple Neural Network

Building a two-layer neural network using the sigmoid activation function.

### Neural Network Training

The output of a simple two-layer network is:

$\hat{y} = \sigma(W_2 \sigma(W_1x + b_1) + b_2)$

Training the model adjusts the weights $W_1, W_2$ and biases $b_1, b_2$ so that the predicted output $\hat{y}$ moves closer to the target $y$.
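
As a rough illustration of the formula above, the sketch below evaluates $\hat{y}$ for a single input vector. The layer sizes (3 inputs, 4 hidden units, 1 output), the random seed, and the names `forward`, `W1`, `b1`, `W2`, `b2` are assumptions chosen for this example only.

```python
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def forward(x, W1, b1, W2, b2):
    """Evaluate y_hat = sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)."""
    hidden = sigmoid(W1 @ x + b1)      # hidden-layer activations
    return sigmoid(W2 @ hidden + b2)   # network output

# Assumed sizes for illustration: 3 inputs, 4 hidden units, 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.random((4, 3)), np.zeros(4)
W2, b2 = rng.random((1, 4)), np.zeros(1)

x = np.array([0.0, 1.0, 1.0])
y_hat = forward(x, W1, b1, W2, b2)
print(y_hat.shape)  # (1,)
```

Because the last step is a sigmoid, the output always lies strictly between 0 and 1.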

### Training Process

**Feedforward**: Calculating the predicted output $\hat{y}$  
**Backpropagation**: Computing the gradient of the loss and using it to update the weights and biases

### Loss Function

Using the **sum-of-squares error** as the loss function:

*Sum-of-Squares Error* $= \sum_{i=1}^n(y_i - \hat{y}_i)^2$

Each difference is squared so that positive and negative errors cannot cancel and larger errors are penalized more heavily. The objective of training is to minimize the loss function.
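
A minimal sketch of this loss in NumPy follows. The function name `sum_of_squares_error` and the prediction values are made up for illustration.

```python
import numpy as np

def sum_of_squares_error(y, y_hat):
    """Sum over all samples of the squared difference between target and prediction."""
    return np.sum((y - y_hat) ** 2)

y = np.array([0.0, 1.0, 1.0, 0.0])
y_hat = np.array([0.5, 0.5, 0.5, 0.5])   # made-up predictions
loss = sum_of_squares_error(y, y_hat)
print(loss)  # 1.0
```

Each of the four samples contributes $0.5^2 = 0.25$, and a perfect prediction would give a loss of 0.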

### Backpropagation

**Gradient Descent**: Update the weights and biases using the derivative of the loss function

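The **chain rule** is needed to calculate the derivative of the loss function with respect to the weights and biases:

$Loss(y, \hat{y}) = \sum_{i=1}^n(y_i - \hat{y}_i)^2$

$\frac{\partial Loss(y, \hat{y})}{\partial W} = \frac{\partial Loss(y, \hat{y})}{\partial \hat{y}} * \frac{\partial \hat{y}}{\partial z} * \frac{\partial z}{\partial W}$, where $z = Wx + b$ and $\hat{y} = \sigma(z)$

$= -2(y - \hat{y}) * \frac{\partial \hat{y}}{\partial z} * x = -2(y - \hat{y}) * \hat{y}(1 - \hat{y}) * x$

Gradient descent steps against this gradient, so each update adds $2(y - \hat{y}) * \hat{y}(1 - \hat{y}) * x$ to $W$ (the code in this notebook uses an implicit learning rate of 1).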
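
One way to sanity-check a chain-rule derivation like this is to compare it with a finite-difference estimate. The sketch below does that for a single sigmoid neuron with scalar weight, input, and bias, using the gradient $-2(y - \hat{y}) \hat{y}(1 - \hat{y}) x$. The names `loss` and `analytic_grad` and the numeric values are arbitrary choices for the example.

```python
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def loss(w, x, b, y):
    """Squared error of a single sigmoid neuron on one sample."""
    return (y - sigmoid(w * x + b)) ** 2

def analytic_grad(w, x, b, y):
    """Chain rule: dLoss/dy_hat * dy_hat/dz * dz/dw."""
    y_hat = sigmoid(w * x + b)
    return -2 * (y - y_hat) * y_hat * (1 - y_hat) * x

w, x, b, y = 0.3, 1.5, -0.2, 1.0   # arbitrary values for the check
eps = 1e-6
# Central finite difference approximates dLoss/dw numerically
numeric_grad = (loss(w + eps, x, b, y) - loss(w - eps, x, b, y)) / (2 * eps)
print(np.isclose(analytic_grad(w, x, b, y), numeric_grad))  # True
```

The gradient is negative here because increasing `w` pushes $\hat{y}$ toward the target of 1 and so lowers the loss.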

In [7]:
# Imports 
import numpy as np

In [8]:
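# Activation function
def sigmoid(t):
    """Logistic sigmoid: maps any real input into (0, 1)."""
    return 1 / (1 + np.exp(-t))

def sigmoid_derivative(p):
    """Derivative of the sigmoid, written in terms of its output p = sigmoid(t)."""
    return p * (1 - p)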
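
Note that `sigmoid_derivative` expects the sigmoid's output, not its raw input. The short check below, a sketch with arbitrarily chosen test points, compares it with a finite-difference slope of `sigmoid`.

```python
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def sigmoid_derivative(p):
    # p is sigmoid(t), not t itself
    return p * (1 - p)

t = np.array([-2.0, 0.0, 3.0])   # arbitrary test points
eps = 1e-6
numeric = (sigmoid(t + eps) - sigmoid(t - eps)) / (2 * eps)   # finite-difference slope
from_activation = sigmoid_derivative(sigmoid(t))              # pass the activation
print(np.allclose(from_activation, numeric))                  # True
print(sigmoid_derivative(sigmoid(0.0)))                       # 0.25
```

The slope is largest at $t = 0$, where the sigmoid equals 0.5 and its derivative is 0.25.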

In [17]:
# Create Neural Network class
class NeuralNetwork:
    def __init__(self, x, y):
        self.input = x
        # Random initial weights: inputs -> 4 hidden units -> 1 output.
        # This implementation omits the bias terms b_1 and b_2.
        self.weights1 = np.random.rand(self.input.shape[1], 4)
        self.weights2 = np.random.rand(4, 1)
        self.y = y
        self.output = np.zeros(y.shape)

    def feedforward(self):
        # Hidden-layer activations, then the predicted output
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
        return self.output

    def backprop(self):
        # Chain rule: error signal at the output layer, 2(y - y_hat) * sigmoid'(z2).
        # This is the negative gradient of the loss with respect to z2.
        delta2 = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        # Propagate the error signal back through weights2 to the hidden layer
        delta1 = np.dot(delta2, self.weights2.T) * sigmoid_derivative(self.layer1)

        d_weights2 = np.dot(self.layer1.T, delta2)
        d_weights1 = np.dot(self.input.T, delta1)

        # d_weights1 and d_weights2 are negative gradients, so adding them
        # is a gradient-descent step with an implicit learning rate of 1
        self.weights1 += d_weights1
        self.weights2 += d_weights2

    def train(self, X, y):
        # Use the data passed in, then run one forward and one backward pass
        self.input = X
        self.y = y
        self.feedforward()
        self.backprop()

In [18]:
# Data: the target is the XOR of the first two inputs; the third column is a constant 1
X = np.array(([0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]), dtype=float)
y = np.array(([0], [1], [1], [0]), dtype=float)

# Training the model
NN = NeuralNetwork(X, y)
for i in range(1500):
    if i % 100 == 0:
        prediction = NN.feedforward()  # compute once and reuse for both printouts
        print("Iteration # " + str(i) + "\n")
        print("Input: \n" + str(X))
        print("Actual Output: \n" + str(y))
        print("Predicted Output: \n" + str(prediction))
        print("Loss: \n" + str(np.mean(np.square(y - prediction))))  # mean squared error
        print()

    NN.train(X, y)

Iteration # 0

Input: 
[[0. 0. 1.]
 [0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 1.]]
Actual Output: 
[[0.]
 [1.]
 [1.]
 [0.]]
Predicted Output: 
[[0.78765408]
 [0.82681409]
 [0.83778702]
 [0.86140889]]
Loss: 
0.35468265680615974

Iteration # 100

Input: 
[[0. 0. 1.]
 [0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 1.]]
Actual Output: 
[[0.]
 [1.]
 [1.]
 [0.]]
Predicted Output: 
[[0.30357566]
 [0.60103473]
 [0.57256564]
 [0.55883323]]
Loss: 
0.186581541889406

Iteration # 200

Input: 
[[0. 0. 1.]
 [0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 1.]]
Actual Output: 
[[0.]
 [1.]
 [1.]
 [0.]]
Predicted Output: 
[[0.09535826]
 [0.81491198]
 [0.809652  ]
 [0.23016213]]
Loss: 
0.03313943496456518

Iteration # 300

Input: 
[[0. 0. 1.]
 [0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 1.]]
Actual Output: 
[[0.]
 [1.]
 [1.]
 [0.]]
Predicted Output: 
[[0.04733724]
 [0.89754125]
 [0.89924595]
 [0.12107718]]
Loss: 
0.009387417859040908

Iteration # 400

Input: 
[[0. 0. 1.]
 [0. 1. 1.]
 [1. 0. 1.]
 [1. 1. 1.]]
Actual Output: 
[[0.]
 [1.]
 [1.]
 [0.]]
Predict