# Backpropagation Algorithm

The algorithm consists of two phases:
    1. Forward pass (propagation phase): the inputs are passed through the network to obtain output predictions.
    2. Backward pass (weight update phase): the gradient of the loss function is computed at the final (prediction)
          layer of the network, and the chain rule is then applied recursively, layer by layer, to update the
          weights in the network.
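
The two phases can be sketched on a single tiny network as follows. This is only an illustration under stated assumptions, not the class implementation: the XOR data, the random seed, the learning rate of 0.5, the sum-of-squares loss, and the names `W1`, `W2`, `A1`, `A2`, `D1`, `D2` are all chosen for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)  # fixed seed (assumption, for repeatability)

# Toy XOR data; the trailing column of ones implements the bias trick.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Assumed 2-2-1 network: (2 inputs + bias) -> (2 hidden + 1 extra) -> 1 output.
W1 = rng.standard_normal((3, 3))
W2 = rng.standard_normal((3, 1))
alpha = 0.5  # learning rate (assumed value)

losses = []
for epoch in range(1000):
    # Phase 1: forward pass, layer by layer, up to the predictions A2.
    A1 = sigmoid(X @ W1)
    A2 = sigmoid(A1 @ W2)
    losses.append(0.5 * np.sum((A2 - y) ** 2))

    # Phase 2: backward pass. Start from the error at the output layer,
    # then apply the chain rule to push it back through the hidden layer.
    D2 = (A2 - y) * A2 * (1 - A2)
    D1 = (D2 @ W2.T) * A1 * (1 - A1)

    # Gradient descent step on each weight matrix.
    W2 -= alpha * (A1.T @ D2)
    W1 -= alpha * (X.T @ D1)

print("first loss:", losses[0])
print("last loss: ", losses[-1])
```

Both deltas are computed before either weight matrix is changed, so the hidden-layer gradient uses the same `W2` that produced the predictions.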

In [2]:
import numpy as np

In [3]:
# W is the list of weight matrices, one per pair of adjacent layers
# layers is a list of integers defining the network architecture, e.g. [2, 2, 1] for a 2-2-1 architecture (the first entry is the number of inputs)
# alpha is the learning rate, which controls the size of each gradient descent step toward a local/global minimum
class NeuralNetwork:
    # Constructor initialization
    def __init__(self, layers, alpha=0.1):
        self.layers = layers
        self.W = []
        self.alpha = alpha
        
        for i in np.arange(0, len(layers) - 2):
            # randomly initialize the weight matrix connecting the nodes of layer i to the nodes of layer i + 1
            # add an extra node to each of the two layers for the bias trick
            # scale the weights by the square root of the number of nodes in the current layer, then append them to W
            w = np.random.randn(layers[i] + 1, layers[i + 1] + 1)
            self.W.append(w / np.sqrt(layers[i]))
            
        # the last two layers are a special case
        # layers[-2] + 1: the input side of this connection needs the bias trick, hence the added 1
        # layers[-1]: the output layer does not need the bias trick
        w = np.random.randn(layers[-2] + 1, layers[-1])
        self.W.append(w / np.sqrt(layers[-2]))
    
    # Python magic function for debugging
    def __repr__(self):
        # construct and return a string that represents the network architecture
        return "Neural Network: {}".format("-".join(str(layer) for layer in self.layers))
    
    # Define sigmoid activation function
    def sigmoid(self, x):
        return 1.0 / (1 + np.exp(-x)) 
    
    # Define derivative of the sigmoid to be used during the backward pass
    def sigmoid_deriv(self, x):
        # compute the derivative of the sigmoid function; x is assumed to be a sigmoid output already, not the raw pre-activation
        return x * (1 - x)
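
The derivative of the sigmoid is sigmoid(z) * (1 - sigmoid(z)), which is why `sigmoid_deriv` expects an already-activated value. A quick way to convince yourself is to compare against a finite-difference estimate. The standalone functions, the sample points, and the step size `h` below are assumptions made for this check; they only mirror the two methods of the class.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_deriv(a):
    # `a` must already be a sigmoid output, i.e. a = sigmoid(z)
    return a * (1 - a)

z = np.linspace(-3.0, 3.0, 7)  # raw pre-activations (assumed sample points)
a = sigmoid(z)                 # activations

# Central finite-difference estimate of d sigmoid / dz
h = 1e-5
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

analytic = sigmoid_deriv(a)    # correct usage: pass the activation
wrong = sigmoid_deriv(z)       # common mistake: pass the raw input

print(np.allclose(analytic, numeric))  # True
print(np.allclose(wrong, numeric))     # False
```

During the backward pass this means the stored layer activations can be reused directly, with no need to recompute the sigmoid.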

In [4]:
nn = NeuralNetwork([2,2,1])
print(nn)

Neural Network: 2-2-1
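
To see the effect of the bias trick on the 2-2-1 network, it can help to look at the shapes of the weight matrices. The sketch below repeats the constructor's initialization logic in a standalone helper; `init_weights` and its `seed` argument are hypothetical names introduced only for this illustration.

```python
import numpy as np

def init_weights(layers, seed=0):
    """Hypothetical helper that mirrors the constructor's weight initialization."""
    rng = np.random.default_rng(seed)
    W = []
    # All connections except the last: both sides get an extra bias node.
    for i in range(len(layers) - 2):
        w = rng.standard_normal((layers[i] + 1, layers[i + 1] + 1))
        W.append(w / np.sqrt(layers[i]))
    # Last connection: bias node on the input side only.
    w = rng.standard_normal((layers[-2] + 1, layers[-1]))
    W.append(w / np.sqrt(layers[-2]))
    return W

shapes = [w.shape for w in init_weights([2, 2, 1])]
print(shapes)  # [(3, 3), (3, 1)]
```

The first matrix is 3x3 rather than 2x2 because one extra node is added on each side, while the final matrix is 3x1 because the output layer gets no extra node.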
