Experiment 2: Multi-Layer Perceptron (MLP) for XOR

Objective

To implement a multi-layer perceptron (MLP) using NumPy and train it to correctly classify the XOR function.

Description of the Model

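A single-layer perceptron cannot learn XOR because XOR is not linearly separable: no single straight line separates the inputs (0,1) and (1,0) from (0,0) and (1,1).
The MLP adds a hidden layer between the input and the output, so the network can combine two linear decision boundaries into a non-linear one.
The implementation below uses step activations and a perceptron-style, error-driven weight update in both layers. It does not use sigmoid activations or gradient-based backpropagation, since the step function has no usable derivative.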
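The single-layer limitation can be checked directly. The following is a minimal sketch, not part of the experiment's code: `train_perceptron` and `accuracy` are hypothetical helpers, and the zero initialization and learning rate of 1.0 are assumptions chosen to keep the run deterministic. The perceptron converges on NAND, which is linearly separable, but cannot reach full accuracy on XOR.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=100):
    # Single-layer perceptron with a step activation.
    # Zero initialization is an assumption made here for determinism.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, target in zip(X, y):
            pred = 1 if np.dot(w, x_i) + b >= 0 else 0
            error = target - pred
            w += lr * error * x_i
            b += lr * error
    return w, b

def accuracy(w, b, X, y):
    preds = (X @ w + b >= 0).astype(int)
    return float(np.mean(preds == y))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
nand_y = np.array([1, 1, 1, 0])
xor_y = np.array([0, 1, 1, 0])

nand_acc = accuracy(*train_perceptron(X, nand_y), X, nand_y)
xor_acc = accuracy(*train_perceptron(X, xor_y), X, xor_y)

print(f"NAND accuracy: {nand_acc:.2f}")
print(f"XOR accuracy:  {xor_acc:.2f}")
```

No choice of weights gives a single-layer perceptron 100% accuracy on XOR, so training longer does not help.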

In [105]:
import numpy as np

In [106]:
class MLP_XOR:
    def __init__(self, input_size=2, hidden_size=2, lr=0.1, epochs=10):
        self.lr = lr  # Learning rate
        self.epochs = epochs  # Number of training iterations

        # Initialize weights and biases
        self.W1 = np.random.uniform(-1, 1, (hidden_size, input_size))
        self.b1 = np.random.uniform(-1, 1, hidden_size)
        self.W2 = np.random.uniform(-1, 1, (1, hidden_size))
        self.b2 = np.random.uniform(-1, 1, 1)

    def step_function(self, x):
        return np.where(x >= 0, 1, 0)  # Step function: 1 if x >= 0, else 0

    def forward(self, x):
        # Hidden Layer
        hidden_input = np.dot(self.W1, x) + self.b1
        hidden_output = self.step_function(hidden_input)  # Apply step function
        
        # Output Layer
        output_input = np.dot(self.W2, hidden_output) + self.b2
        output = self.step_function(output_input)  # Apply step function
        
        return hidden_output, output

    def train(self, X, y):
        for epoch in range(self.epochs):
            total_error = 0
            for i in range(len(X)):
                x_sample = X[i]
                target = y[i]

                # Forward pass
                hidden_output, output = self.forward(x_sample)
                
                # Compute error at output layer
                output_error = target - output
                total_error += abs(output_error)

                # Update weights for output layer
                if output_error != 0:
                    self.W2 += self.lr * output_error * hidden_output.reshape(1, -1)
                    self.b2 += self.lr * output_error

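                    # Heuristic hidden-layer error: scale the output error by the
                    # (already updated) output weights. The step function has no
                    # derivative, so this is not a true gradient.
                    hidden_errors = self.W2.flatten() * output_error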

                    # Update hidden layer weights
                    for j in range(len(hidden_output)):
                        if hidden_errors[j] != 0:
                            self.W1[j] += self.lr * hidden_errors[j] * x_sample
                            self.b1[j] += self.lr * hidden_errors[j]

            # Stop early if XOR is learned
            if total_error == 0:
                print(f"Training complete at epoch {epoch}")
                break

    def predict(self, x):
        _, output = self.forward(x)
        return int(output[0])  # output has shape (1,); index it to get a scalar

    def evaluate(self, X, y):
        correct = sum(self.predict(X[i]) == y[i] for i in range(len(X)))
        accuracy = correct / len(y)
        print(f"Accuracy: {accuracy * 100:.2f}%")


Overview
This model is a simple Multi-Layer Perceptron (MLP) implemented using NumPy to classify the XOR logic gate. A single-layer perceptron fails on XOR because XOR is not linearly separable; this MLP adds a hidden layer so that the network can represent the required decision boundary.

Architecture
The MLP consists of:
Input Layer (2 neurons) – Represents the two binary inputs of the XOR gate.
Hidden Layer (2 neurons) – Applies a step activation to two weighted sums of the inputs, giving two intermediate linear decision boundaries.
Output Layer (1 neuron) – Produces the final binary output (0 or 1).
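To see that this 2-2-1 architecture is expressive enough for XOR, the weights can be set by hand instead of learned. The sketch below is illustrative only: the specific weight values are an assumption (one of many valid choices), with the first hidden neuron acting as OR, the second as NAND, and the output neuron as AND of the two.

```python
import numpy as np

def step(z):
    # Same convention as the model: 1 if z >= 0, else 0
    return np.where(z >= 0, 1, 0)

# Hand-picked weights (assumed for illustration, not learned)
W1 = np.array([[1.0, 1.0],     # hidden neuron 1: OR
               [-1.0, -1.0]])  # hidden neuron 2: NAND
b1 = np.array([-0.5, 1.5])
W2 = np.array([[1.0, 1.0]])    # output neuron: AND of the hidden outputs
b2 = np.array([-1.5])

def xor_net(x):
    hidden = step(W1 @ x + b1)
    return int(step(W2 @ hidden + b2)[0])

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
outputs = [xor_net(x) for x in inputs]
print(outputs)  # [0, 1, 1, 0]
```

Training only has to find some weight setting equivalent to this one; whether it does depends on the initialization and the update rule.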

In [107]:
# XOR truth table
XOR_X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
XOR_y = np.array([0, 1, 1, 0])

In [108]:
# Train and evaluate the MLP
print("Training MLP for XOR Gate:")
mlp = MLP_XOR()
mlp.train(XOR_X, XOR_y)
mlp.evaluate(XOR_X, XOR_y)

Training MLP for XOR Gate:
Training complete at epoch 2
Accuracy: 100.00%
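The weights are drawn at random, so the epoch at which training stops, and whether it reaches 100% at all, can differ between runs. One way to make a run repeatable is to seed NumPy's global generator before the weights are created. The sketch below assumes the same `np.random.uniform` initialization as the class; `init_weights` is a hypothetical helper used only for this illustration.

```python
import numpy as np

def init_weights(seed, input_size=2, hidden_size=2):
    # Seed the global generator, then draw weights the same way the class does
    np.random.seed(seed)
    W1 = np.random.uniform(-1, 1, (hidden_size, input_size))
    b1 = np.random.uniform(-1, 1, hidden_size)
    return W1, b1

W1_a, b1_a = init_weights(seed=0)
W1_b, b1_b = init_weights(seed=0)

# Same seed, same initial weights
print(np.array_equal(W1_a, W1_b) and np.array_equal(b1_a, b1_b))  # True
```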




Description of Code

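Class MLP_XOR → Implements a two-layer neural network (one hidden layer, one output neuron).
__init__() → Initializes weights and biases for the input-hidden and hidden-output layers with uniform random values in [-1, 1).
step_function() → Activation function; returns 1 where the input is >= 0, else 0.
forward(x) → Computes the hidden and output activations for a single sample.
train(X, y) → Runs a forward pass on each sample and, on a misclassification, adjusts both layers using the output error; stops early once an epoch produces no errors.
predict(x) → Returns the output for one sample as an integer (0 or 1).
evaluate(X, y) → Prints the classification accuracy over the dataset.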
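For comparison, a sigmoid-based network trained with gradient backpropagation could look like the sketch below. This is not the code used in the experiment: the function names, the mean squared error loss, the hidden size of 3, the learning rate, and the full-batch updates are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, X):
    W1, b1, W2, b2 = params
    H = sigmoid(X @ W1 + b1)      # hidden activations, shape (n, hidden)
    out = sigmoid(H @ W2 + b2)    # network output, shape (n, 1)
    return H, out

def loss(params, X, y):
    _, out = forward(params, X)
    return 0.5 * np.mean((out - y) ** 2)

def backward(params, X, y):
    # Gradients of the loss w.r.t. W1, b1, W2, b2 via the chain rule
    W1, b1, W2, b2 = params
    H, out = forward(params, X)
    n = X.shape[0]
    d_out = (out - y) * out * (1 - out) / n   # sigmoid'(z) = s * (1 - s)
    d_hid = (d_out @ W2.T) * H * (1 - H)
    return [X.T @ d_hid, d_hid.sum(axis=0), H.T @ d_out, d_out.sum(axis=0)]

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

rng = np.random.default_rng(0)
hidden_size = 3  # assumed; the experiment's model uses 2 hidden neurons
params = [rng.uniform(-1, 1, (2, hidden_size)), np.zeros(hidden_size),
          rng.uniform(-1, 1, (hidden_size, 1)), np.zeros(1)]

lr = 0.5
initial_loss = loss(params, X, y)
for _ in range(5000):
    grads = backward(params, X, y)
    params = [p - lr * g for p, g in zip(params, grads)]
final_loss = loss(params, X, y)

print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Because the sigmoid is differentiable, the hidden-layer update here is an actual gradient step rather than a heuristic. Convergence to the correct XOR outputs still depends on the initialization and is not guaranteed.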

Conclusion

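Perceptron (Single Layer) → Works for linearly separable gates such as NAND, fails for XOR.
MLP (Multi-Layer) → Works for XOR because the hidden layer lets the network represent a decision boundary that is not a single straight line; in the run above it reached 100% accuracy.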
