Program 4: Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

A basic neural network is loosely modeled on the structure of the human brain and is used for tasks such as classification, regression, and pattern recognition. It consists of layers of interconnected nodes (neurons), where each connection has an associated weight. Here's a step-by-step explanation of how a basic neural network works:

Components of a Neural Network

Neurons (Nodes):

The fundamental units of the network.
Each neuron receives input, processes it, and passes the output to the next layer.

Layers:

Input Layer: The first layer that receives input data.

Hidden Layer(s): Intermediate layers that process the outputs of the previous layer (the input layer or an earlier hidden layer).

Output Layer: The final layer that produces the network's output.

Weights:

Each connection between neurons has a weight that determines the strength and direction of the signal passed.

Biases:

Each neuron has an associated bias value added to the weighted sum of inputs before applying the activation function.

Activation Function:

A non-linear function applied to the weighted sum of inputs plus the bias.
Common activation functions include sigmoid, ReLU (Rectified Linear Unit), and tanh.
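
The components above can be tied together in a small sketch of a single neuron. The input values, weights, and bias below are hypothetical numbers chosen only for illustration; the sketch computes the weighted sum plus bias and then applies each of the three activation functions named above.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise
    return np.maximum(0.0, z)

# Hypothetical values for one neuron with three incoming connections
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, -0.2])
bias = 0.1

# Weighted sum of inputs plus bias (about -0.4 for these values)
z = np.dot(inputs, weights) + bias

# The same pre-activation value passed through three activation functions
outputs = {"sigmoid": sigmoid(z), "relu": relu(z), "tanh": np.tanh(z)}
for name, value in outputs.items():
    print(f"{name}: {value:.4f}")
```

Because the weighted sum is negative here, ReLU returns 0, sigmoid returns a value below 0.5, and tanh returns a negative value.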

How a Basic Neural Network Works

Initialization:

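Weights and biases are given starting values before training. Weights are usually initialized randomly so that neurons in the same layer do not all compute the same thing.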
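
As a minimal sketch of this step, the snippet below creates parameters for a network with 4 inputs, 5 hidden neurons, and 3 outputs. The small zero-centered normal weights and zero biases are one common choice and are an assumption of this sketch; the program later in this document instead draws all parameters from np.random.rand, which is uniform on [0, 1).

```python
import numpy as np

# Seed chosen arbitrarily so the sketch is reproducible
rng = np.random.default_rng(seed=0)

input_size, hidden_size, output_size = 4, 5, 3

# Assumption: small zero-centered random weights (standard deviation 0.1)
weights_input_hidden = rng.normal(0.0, 0.1, size=(input_size, hidden_size))
weights_hidden_output = rng.normal(0.0, 0.1, size=(hidden_size, output_size))

# Assumption: biases start at zero
bias_hidden = np.zeros(hidden_size)
bias_output = np.zeros(output_size)

print(weights_input_hidden.shape, weights_hidden_output.shape)
```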

Forward Propagation:

Input Layer: Takes input data and passes it to the first hidden layer.

Hidden Layers: Each neuron in a hidden layer computes a weighted sum of its inputs, adds the bias, and applies an activation function. The output is passed to the next layer.

Output Layer: Produces the final output of the network by processing the inputs from the last hidden layer.
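
The forward pass described above can be sketched on a tiny network with two inputs, two hidden neurons, and one output. All weights and the input sample are hypothetical values picked by hand, and sigmoid is assumed as the activation function in both layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One sample with two features (hypothetical values)
x = np.array([[1.0, 0.0]])

# Hand-picked parameters: W1 is (inputs x hidden), W2 is (hidden x outputs)
W1 = np.array([[0.5, -0.5],
               [0.25, 0.75]])
b1 = np.zeros(2)
W2 = np.array([[1.0],
               [-1.0]])
b2 = np.zeros(1)

# Hidden layer: weighted sum, add bias, apply activation
hidden = sigmoid(x @ W1 + b1)        # shape (1, 2)

# Output layer: same computation on the hidden layer's output
output = sigmoid(hidden @ W2 + b2)   # shape (1, 1)

print("hidden:", hidden)
print("output:", output)
```

Each layer repeats the same pattern, so adding more hidden layers only means repeating the middle step with another weight matrix and bias vector.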

In [1]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        np.random.seed(42)
        self.weights_input_hidden = np.random.rand(input_size, hidden_size)
        self.weights_hidden_output = np.random.rand(hidden_size, output_size)
        self.bias_hidden = np.random.rand(hidden_size)
        self.bias_output = np.random.rand(output_size)

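    def sigmoid(self, x):
        # Logistic function: squashes any real value into the range (0, 1)
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # x must already be a sigmoid output s, so this returns s * (1 - s)
        return x * (1 - x)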

    def forward_propagation(self, inputs):
        self.hidden_input = np.dot(inputs, self.weights_input_hidden) + self.bias_hidden
        self.hidden_output = self.sigmoid(self.hidden_input)
        
        self.final_input = np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output
        self.final_output = self.sigmoid(self.final_input)
        
        return self.hidden_output, self.final_output

    def backpropagation(self, inputs, hidden_output, final_output, expected_output, learning_rate):
        # Error at the output layer and its delta (error times sigmoid slope)
        output_error = expected_output - final_output
        output_delta = output_error * self.sigmoid_derivative(final_output)

        # Propagate the output delta back through the hidden-to-output weights
        hidden_error = output_delta.dot(self.weights_hidden_output.T)
        hidden_delta = hidden_error * self.sigmoid_derivative(hidden_output)

        # Update hidden-to-output weights and output biases
        self.weights_hidden_output += hidden_output.T.dot(output_delta) * learning_rate
        self.bias_output += np.sum(output_delta, axis=0) * learning_rate

        # Update input-to-hidden weights and hidden biases
        self.weights_input_hidden += inputs.T.dot(hidden_delta) * learning_rate
        self.bias_hidden += np.sum(hidden_delta, axis=0) * learning_rate

    def train(self, inputs, expected_output, learning_rate, epochs):
        # Full-batch gradient descent: every epoch uses all training samples at once
        for _ in range(epochs):
            hidden_output, final_output = self.forward_propagation(inputs)
            self.backpropagation(inputs, hidden_output, final_output, expected_output, learning_rate)

    def predict(self, inputs):
        _, final_output = self.forward_propagation(inputs)
        return final_output

    def accuracy(self, predictions, labels):
        pred_labels = np.argmax(predictions, axis=1)
        true_labels = np.argmax(labels, axis=1)
        return np.mean(pred_labels == true_labels)

# Load and preprocess the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target.reshape(-1, 1)

# scikit-learn 1.2 renamed this argument; older versions use sparse=False
encoder = OneHotEncoder(sparse_output=False)
y = encoder.fit_transform(y)

scaler = StandardScaler()
X = scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the network
nn = NeuralNetwork(input_size=4, hidden_size=5, output_size=3)
nn.train(X_train, y_train, learning_rate=0.1, epochs=10000)

# Test the network
test_results = nn.predict(X_test)
print("Test Accuracy:", nn.accuracy(test_results, y_test))


Test Accuracy: 0.9666666666666667
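
The program above fits StandardScaler on all samples before splitting, so the test samples influence the scaling statistics. A possible alternative ordering, sketched below, splits first and fits the scaler on the training split only. The one-hot encoding with np.eye and the variable names X_train_scaled and X_test_scaled are choices made for this sketch, not part of the original program, and the accuracy reported above was obtained with the original ordering.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X, labels = iris.data, iris.target

# One-hot encode the three classes with plain NumPy (a choice made for this sketch)
y = np.eye(3)[labels]

# Split first, so the test samples play no part in fitting the scaler
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training split only, then reuse the same statistics for the test split
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

The scaled arrays can then be passed to nn.train and nn.predict in place of X_train and X_test.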


Backpropagation Algorithm

The backpropagation function updates the weights and biases of the neural network based on the error between the predicted output (final_output) and the expected output (expected_output).

output_error is the difference between the expected output and the actual output.

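output_delta is output_error multiplied element-wise by the derivative of the sigmoid activation at the output layer. For a squared-error loss, this is the negative gradient of the loss with respect to the output layer's weighted input (final_input), so it indicates how the output layer's weights and biases should change. Note that sigmoid_derivative expects the sigmoid output itself, which is why final_output is passed to it.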

hidden_error is the propagated error from the output layer back to the hidden layer. It is calculated by taking the dot product of output_delta and the transpose of the weights connecting the hidden and output layers.

hidden_delta is hidden_error multiplied element-wise by the derivative of the sigmoid activation at the hidden layer (computed from hidden_output). For a squared-error loss, it is the negative gradient of the loss with respect to the hidden layer's weighted input (hidden_input), so it indicates how the input-to-hidden weights and hidden biases should change.
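
To make the four quantities concrete, here is a small worked sketch for a single sample with two hidden neurons and one output neuron. All numbers are hypothetical and chosen so the arithmetic is easy to follow; the variable names mirror those in the backpropagation function.

```python
import numpy as np

# Hypothetical activations and weights for one sample
hidden_output = np.array([[0.6, 0.4]])             # shape (1, 2)
final_output = np.array([[0.8]])                   # shape (1, 1)
expected_output = np.array([[1.0]])
weights_hidden_output = np.array([[0.5], [-0.3]])  # shape (2, 1)

# Difference between target and prediction: about 0.2
output_error = expected_output - final_output

# Error times sigmoid slope at the output: 0.2 * 0.8 * 0.2, about 0.032
output_delta = output_error * final_output * (1 - final_output)

# Send the delta back through the weights: about [[0.016, -0.0096]]
hidden_error = output_delta.dot(weights_hidden_output.T)

# Error times sigmoid slope at each hidden neuron (the slope is 0.24 for both)
hidden_delta = hidden_error * hidden_output * (1 - hidden_output)

print("output_delta:", output_delta)
print("hidden_delta:", hidden_delta)
```

The hidden neuron connected through the negative weight receives a negative delta, so its incoming weights are pushed in the opposite direction.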

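self.weights_hidden_output is updated by adding the product of the transposed hidden layer output and output_delta, scaled by the learning rate. Because output_error is computed as expected minus actual, output_delta already points in the direction that reduces the error, which is why the update is an addition rather than a subtraction.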

self.bias_output is updated by adding the sum of output_delta across all samples, scaled by the learning rate. This adjusts the biases in the output layer.

self.weights_input_hidden is updated by adding the product of the transposed input data and hidden_delta, scaled by the learning rate. This adjusts the weights in the direction that reduces the error propagated back to the hidden layer.

self.bias_hidden is updated by adding the sum of hidden_delta across all samples, scaled by the learning rate. This adjusts the biases in the hidden layer.
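
One way to check the update rules described above is a numerical gradient check. The sketch below assumes a squared-error loss of 0.5 * sum((expected - actual)^2), which the program does not state explicitly but which is consistent with its update rules. It compares the weight steps computed by backpropagation with finite-difference estimates of the negative gradient. The random data, parameter values, and helper names (loss, numerical_gradient) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W1, b1, W2, b2, X, Y):
    # Assumed loss: half the sum of squared errors over all samples and outputs
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)
    return 0.5 * np.sum((Y - out) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))                    # 6 random samples, 4 features
Y = np.eye(3)[rng.integers(0, 3, size=6)]      # random one-hot targets
W1 = rng.normal(scale=0.5, size=(4, 5))
b1 = np.zeros(5)
W2 = rng.normal(scale=0.5, size=(5, 3))
b2 = np.zeros(3)

# Backpropagation quantities, named as in the program
hidden_output = sigmoid(X @ W1 + b1)
final_output = sigmoid(hidden_output @ W2 + b2)
output_delta = (Y - final_output) * final_output * (1 - final_output)
hidden_delta = (output_delta @ W2.T) * hidden_output * (1 - hidden_output)
step_W2 = hidden_output.T @ output_delta       # added to weights_hidden_output
step_W1 = X.T @ hidden_delta                   # added to weights_input_hidden

def numerical_gradient(f, W, eps=1e-6):
    # Central differences: perturb one entry of W at a time, in place
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        original = W[idx]
        W[idx] = original + eps
        plus = f()
        W[idx] = original - eps
        minus = f()
        W[idx] = original
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

grad_W2 = numerical_gradient(lambda: loss(W1, b1, W2, b2, X, Y), W2)
grad_W1 = numerical_gradient(lambda: loss(W1, b1, W2, b2, X, Y), W1)

# The backpropagation steps should match the negative gradients
print(np.allclose(step_W2, -grad_W2, atol=1e-6))
print(np.allclose(step_W1, -grad_W1, atol=1e-6))
```

If both comparisons hold, adding the steps scaled by the learning rate is ordinary gradient descent on the assumed loss.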