# Problem Statement:
Implement an artificial neural network (ANN) trained with backpropagation and use it to learn the XOR function.

# Algorithm:
Step 1: Randomly initialize the weights and biases for the connections between neurons in the network. 
These initial values will be adjusted during the training process.

Step 2: Pass input through the network to calculate the predicted output.

Step 3: Apply Activation functions at each neuron to introduce non-linearity

Step 4: Compare the predicted output with the actual target values to compute the loss (error) of the 
network.

Step 5: Calculate the gradient of the loss with respect to the weights and biases using the chain rule of 
calculus. The error is propagated backward from the output layer toward the input layer, which is where 
the term “backpropagation” comes from.

Step 6: Update the weights and biases in the opposite direction of the gradient to minimize the loss.

Step 7: Use an optimization algorithm to update the weights and biases efficiently

Step 8: Adjust the learning rate to control the size of weight updates.

Step 9: Repeat steps 2-7 for a specified number of iterations or until the loss converges to a satisfactory 
level.
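
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the Code section below: the function name `train_xor_sketch`, the bias vectors `b1` and `b2`, the fixed `learning_rate` of 0.5, the seed, and the epoch count are all assumptions chosen for the example. Plain gradient descent with a constant learning rate stands in for the optimization algorithm of steps 7 and 8.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation, applied element-wise
    return 1.0 / (1.0 + np.exp(-z))

def train_xor_sketch(epochs=2000, learning_rate=0.5, seed=0):
    # Hypothetical helper: trains a 2-3-1 network on XOR and returns the loss history
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # Step 1: random weights, zero biases (biases are an addition of this sketch)
    W1 = rng.standard_normal((2, 3))
    b1 = np.zeros((1, 3))
    W2 = rng.standard_normal((3, 1))
    b2 = np.zeros((1, 1))

    losses = []
    for _ in range(epochs):
        # Steps 2-3: forward pass with a sigmoid activation at each layer
        a1 = sigmoid(X @ W1 + b1)
        a2 = sigmoid(a1 @ W2 + b2)

        # Step 4: mean squared error between prediction and target
        losses.append(np.mean((y - a2) ** 2))

        # Step 5: chain rule, propagating the error from the output layer backward
        # (the constant factor 2/n of the MSE gradient is absorbed into learning_rate)
        delta2 = (a2 - y) * a2 * (1 - a2)
        delta1 = (delta2 @ W2.T) * a1 * (1 - a1)

        # Steps 6-8: move against the gradient, scaled by the learning rate
        W2 -= learning_rate * (a1.T @ delta2)
        b2 -= learning_rate * delta2.sum(axis=0, keepdims=True)
        W1 -= learning_rate * (X.T @ delta1)
        b1 -= learning_rate * delta1.sum(axis=0, keepdims=True)

    # Step 9 is the loop itself: repeat for a fixed number of epochs
    return losses

losses = train_xor_sketch()
print("first loss:", losses[0], "last loss:", losses[-1])
```

Both deltas are computed before any weight is changed, so the hidden-layer gradient is evaluated with the same `W2` that produced the forward pass.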

# Code

In [1]:
#importing the required libraries
import numpy as np

In [2]:
# Define the input data 'X' as a NumPy array.
# Each row represents an input example, and each column represents a feature.
# In this case, we have 4 examples with 2 features each (binary inputs).
X = np.array(([0, 0], [0, 1], [1, 0], [1, 1]), dtype=float)

# Define the corresponding target output 'y' as a NumPy array.
# Each row represents the expected output corresponding to the input example.
# In this case, we have 4 expected outputs (binary values).
y = np.array(([0], [1], [1], [0]), dtype=float)

In [3]:
#The X array represents the input features, which are binary values (0 or 1)
X


array([[0., 0.],
       [0., 1.],
       [1., 0.],
       [1., 1.]])

In [4]:
#y array represents the corresponding expected output for each input example
y

array([[0.],
       [1.],
       [1.],
       [0.]])

In [5]:
# Define a class named NeuralNetwork
class NeuralNetwork(object):
    def __init__(self):
        # Initialize the network's architecture parameters
        self.input = 2   # Number of input neurons
        self.output = 1  # Number of output neurons
        self.hidden = 3  # Number of neurons in the hidden layer
        
        # Initialize the weights for the neural network (this implementation uses weights only, no bias terms)
        self.W1 = np.random.randn(self.input, self.hidden)  # Weight matrix from input to hidden layer (random initialization)
        self.W2 = np.random.randn(self.hidden, self.output)  # Weight matrix from hidden to output layer (random initialization)
        
    def feedForward(self, X):
        # Perform forward propagation through the network
        self.z = np.dot(X, self.W1)  # Calculate the dot product of inputs (X) and weights (W1)
        self.z2 = self.sigmoid(self.z)  # Apply the sigmoid activation function to the result (z)
        self.z3 = np.dot(self.z2, self.W2)  # Calculate the dot product of hidden layer outputs (z2) and weights (W2)
        output = self.sigmoid(self.z3)  # Apply the sigmoid activation function to the result (z3)
        return output
        
    def sigmoid(self, s, deriv=False):
        # Sigmoid activation function, or its derivative when deriv=True
        if deriv:
            # s must already be a sigmoid output here, because sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
            return s * (1 - s)
        return 1 / (1 + np.exp(-s))  # Sigmoid activation function
    
    def backward(self, X, y, output):
        # Perform backward propagation through the network to update weights
        self.output_error = y - output  # Calculate the error in the output
        self.output_delta = self.output_error * self.sigmoid(output, deriv=True)  # Calculate the delta for the output layer
        
        self.z2_error = self.output_delta.dot(self.W2.T)  # Calculate the error for the hidden layer (z2)
        self.z2_delta = self.z2_error * self.sigmoid(self.z2, deriv=True)  # Calculate the delta for the hidden layer
        
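        # The deltas carry the sign of (y - output), so they point along the negative gradient of the
        # squared error; adding them performs gradient descent with an implicit learning rate of 1
        self.W1 += X.T.dot(self.z2_delta)  # Update the weights of the first layer (input -> hidden)
        self.W2 += self.z2.T.dot(self.output_delta)  # Update the weights of the second layer (hidden -> output)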
        
    def train(self, X, y):
        output = self.feedForward(X)  # Perform forward propagation to get the output
        self.backward(X, y, output)  # Perform backward propagation to update the weights


In [6]:
# Create an instance of the NeuralNetwork class
NN = NeuralNetwork()

# Loop through the training process for a specified number of epochs (500 times in this case)
for i in range(500):
    # Print the loss every 100 epochs
    if i % 100 == 0:
        # Calculate the loss by finding the mean squared error between the predicted output and the actual output
        loss = np.mean(np.square(y - NN.feedForward(X)))
        print("Loss", i, ":", loss)
        
    # Train the neural network using the training data (X) and target outputs (y)
    NN.train(X, y)

Loss 0 : 0.43695284496380216
Loss 100 : 0.18318881483555077
Loss 200 : 0.07784239118644393
Loss 300 : 0.03414898274093979
Loss 400 : 0.020544989462296208


In [7]:
# Print the input data
print("Input: \n" + str(X))

# Print a newline for separation
print("\n")

# Print the actual target output
print("Actual Output: \n", y)

# Print a newline for separation
print("\n")

# Calculate and print the loss, which is the mean squared error between the actual target output and the predicted output
loss = np.mean(np.square(y - NN.feedForward(X)))
print("Loss: \n" + str(loss))

# Print a newline for separation
print("\n")

# Calculate and print the predicted output from the neural network
predicted_output = NN.feedForward(X)
print("Predicted Output: " + str(predicted_output))

Input: 
[[0. 0.]
 [0. 1.]
 [1. 0.]
 [1. 1.]]


Actual Output: 
 [[0.]
 [1.]
 [1.]
 [0.]]


Loss: 
0.014501925368449122


Predicted Output: [[0.16008955]
 [0.88672642]
 [0.88642069]
 [0.08153448]]
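
The predicted values are continuous numbers between 0 and 1, while the targets are binary. A small sketch of how they could be compared as class labels follows. The 0.5 threshold is an assumed, conventional choice that the notebook itself does not apply, and the prediction array is copied from the output printed above.

```python
import numpy as np

# Predicted outputs copied from the run shown above
predicted_output = np.array([[0.16008955], [0.88672642], [0.88642069], [0.08153448]])
y = np.array([[0], [1], [1], [0]], dtype=float)

# Assumed decision rule: outputs of at least 0.5 count as class 1
predicted_labels = (predicted_output >= 0.5).astype(float)

# Fraction of examples whose thresholded prediction matches the target
accuracy = np.mean(predicted_labels == y)
print(predicted_labels.ravel(), accuracy)
```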


# Remarks
The backpropagation algorithm is a gradient computation method that applies the Leibniz chain rule layer 
by layer to train neural network models. The resulting gradient is used by the optimization algorithm to 
compute the network parameter updates. For a fully connected network, the time complexity of one pass of 
backpropagation over the training set is O(nm), where n is the number of training examples and m is the 
number of weights in the neural network.
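
One way to see the chain rule at work is a gradient check: compare the gradient from backpropagation with a finite-difference approximation. The sketch below is an illustration under stated assumptions, not part of the original notebook. It uses the same 2-3-1, bias-free architecture as the `NeuralNetwork` class, takes half the summed squared error as the loss, and introduces the hypothetical helper names `loss_fn`, `backprop_grads`, and `numerical_grad`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(W1, W2, X, y):
    # Assumed loss for this check: half the summed squared error
    out = sigmoid(sigmoid(X @ W1) @ W2)
    return 0.5 * np.sum((y - out) ** 2)

def backprop_grads(W1, W2, X, y):
    # Chain rule applied from the output layer back to the input layer
    a1 = sigmoid(X @ W1)
    out = sigmoid(a1 @ W2)
    delta2 = (out - y) * out * (1 - out)
    delta1 = (delta2 @ W2.T) * a1 * (1 - a1)
    return X.T @ delta1, a1.T @ delta2

def numerical_grad(f, W, eps=1e-6):
    # Central differences: two loss evaluations per weight
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old  # restore the original weight
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = rng.standard_normal((2, 3))
W2 = rng.standard_normal((3, 1))

g1, g2 = backprop_grads(W1, W2, X, y)
n1 = numerical_grad(lambda: loss_fn(W1, W2, X, y), W1)
n2 = numerical_grad(lambda: loss_fn(W1, W2, X, y), W2)

max_diff = max(np.abs(g1 - n1).max(), np.abs(g2 - n2).max())
num_weights = W1.size + W2.size  # m in the O(nm) estimate
print("weights:", num_weights, "max difference:", max_diff)
```

The finite-difference version needs two forward passes for every weight, while backpropagation obtains all partial derivatives from a single forward and backward pass, which is why it is the practical choice for training. With this loss, the gradients returned by `backprop_grads` are the negatives of the quantities that `backward` adds to `W1` and `W2`.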