In [1]:
# Exp -2

Multi-Layer Perceptron (MLP) for XOR Function

Objective:
The aim of this experiment is to implement a Multi-Layer Perceptron (MLP) with one hidden layer using NumPy and demonstrate that it can learn the XOR Boolean function, which is not linearly separable.

Background:
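A single-layer perceptron cannot learn the XOR function because XOR is not linearly separable: no single straight line separates the inputs (0,1) and (1,0), which map to 1, from (0,0) and (1,1), which map to 0. An MLP with at least one hidden layer can solve the problem because the hidden layer lets it learn a non-linear decision boundary.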
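
The separability claim can be illustrated with a small brute-force search. The sketch below is not part of the experiment: the helper `separable_on_grid` and the weight grid are assumptions made for illustration. It tries every threshold unit with weights and bias taken from a coarse grid and reports whether any of them reproduces a given truth table.

```python
import itertools
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])  # linearly separable, for comparison
y_xor = np.array([0, 1, 1, 0])  # not linearly separable

def separable_on_grid(X, y, grid):
    """Return True if some (w1, w2, b) from the grid classifies every row of X correctly."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(pred, y):
            return True
    return False

grid = np.linspace(-2, 2, 9)  # hypothetical search grid: -2.0, -1.5, ..., 2.0

print("AND separable:", separable_on_grid(X, y_and, grid))  # True
print("XOR separable:", separable_on_grid(X, y_xor, grid))  # False
```

A grid search cannot prove impossibility on its own, but here the negative result agrees with the algebra: the four XOR constraints on a single threshold unit contradict each other for any real weights.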

Implementation Details:
Activation Function:

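The sigmoid function, sigmoid(x) = 1 / (1 + exp(-x)), is used as the activation in both the hidden and output layers. It maps any real input into the (0,1) range and supplies the non-linearity the network needs.
The derivative of the sigmoid, which can be written in terms of the sigmoid's own output s as s * (1 - s), is used during backpropagation to compute the weight updates.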
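
As a quick sanity check on the derivative formula, the following sketch compares s * (1 - s) with a central finite-difference estimate of the slope. The sample points and the step size `h` are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-4, 4, 9)  # a few sample pre-activations
s = sigmoid(x)

# Analytic derivative, expressed through the sigmoid output s
analytic = s * (1 - s)

# Central finite-difference estimate of the same derivative
h = 1e-5
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(np.allclose(analytic, numeric, atol=1e-8))
```

Note that the formula takes the already-activated value s, not the raw input x, which is why the training code can pass layer outputs directly to its derivative helper.
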
Network Architecture:

Input Layer: 2 neurons (for the two inputs of XOR).
Hidden Layer: 2 neurons (configurable through the hidden_neurons argument of mlp_train).
Output Layer: 1 neuron (to classify the XOR output).
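
The 2-2-1 architecture translates into the following array shapes. This is a minimal sketch that only traces shapes through one forward pass; the short names `W1`, `b1`, `W2`, `b2` and the seed are illustrative and differ from the variable names used in the implementation.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # 4 samples, 2 input features

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
W1 = rng.random((2, 2))  # input layer (2) -> hidden layer (2)
b1 = rng.random(2)       # one bias per hidden neuron
W2 = rng.random((2, 1))  # hidden layer (2) -> output layer (1)
b2 = rng.random(1)       # one bias for the output neuron

hidden = sigmoid(X @ W1 + b1)       # shape (4, 2): one row per sample
output = sigmoid(hidden @ W2 + b2)  # shape (4, 1): one prediction per sample

n_params = W1.size + b1.size + W2.size + b2.size
print(hidden.shape, output.shape, n_params)  # (4, 2) (4, 1) 9
```
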
Training the MLP:

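The weights and biases are initialized randomly, drawn uniformly from [0, 1) with np.random.rand.
Forward propagation computes the hidden-layer and output-layer activations for all four input patterns at once.
The error is computed as the difference between the actual and predicted outputs.
Backpropagation turns this error into gradients, and the weights and biases are updated by the gradient descent rule with a learning rate of 0.1.
Forward propagation, error computation, and the update are repeated for a specified number of epochs (10,000).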
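
One way to gain confidence in the backpropagation step is a numerical gradient check. The sketch below assumes the loss being minimized is half the sum of squared errors (the loss for which this update rule is exact gradient descent) and verifies that the quantity added to the input-to-hidden weights equals the negative gradient of that loss. The `loss` helper, the short variable names, and the seed are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss(W1, b1, W2, b2, X, y):
    # Assumed loss: half the sum of squared errors over all samples
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)
    return 0.5 * np.sum((y.reshape(-1, 1) - out) ** 2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
W1, b1 = rng.random((2, 2)), rng.random(2)
W2, b2 = rng.random((2, 1)), rng.random(1)

# Backpropagated terms, mirroring one iteration of the training loop
hidden = sigmoid(X @ W1 + b1)
out = sigmoid(hidden @ W2 + b2)
error = y.reshape(-1, 1) - out
d_output = error * out * (1 - out)
d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)
step_W1 = X.T @ d_hidden  # what the loop adds to W1 (before scaling by lr)

# Central-difference estimate of d(loss)/d(W1), one entry at a time
eps = 1e-6
num_grad = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        W_plus, W_minus = W1.copy(), W1.copy()
        W_plus[i, j] += eps
        W_minus[i, j] -= eps
        num_grad[i, j] = (loss(W_plus, b1, W2, b2, X, y)
                          - loss(W_minus, b1, W2, b2, X, y)) / (2 * eps)

# The update direction should be the negative gradient of the loss
print(np.allclose(step_W1, -num_grad, atol=1e-6))
```

Because the error is defined as target minus prediction, the update adds the backpropagated term instead of subtracting a gradient; the check confirms that the two conventions agree.
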
Prediction Process:

Once trained, the model passes the input values through the hidden layer and the output layer to produce an output in the (0,1) range.
If the output is greater than 0.5, it is classified as 1; otherwise, it is classified as 0.
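
The thresholding rule can be shown in isolation. The values in `probs` below are made up for illustration and are not outputs of the trained network.

```python
import numpy as np

# Hypothetical sigmoid outputs for four inputs, shaped (4, 1) like the network's output
probs = np.array([[0.03], [0.97], [0.50], [0.51]])

# Strictly greater than 0.5 maps to class 1, everything else to class 0
labels = (probs > 0.5).astype(int).flatten()
print(labels)  # [0 1 0 1]
```

An output of exactly 0.5 falls into class 0 because the comparison is strict.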

In [2]:
import numpy as np  # Importing numpy for numerical operations

# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid, written in terms of the sigmoid's output:
# x must already be sigmoid(z), not the raw pre-activation z
def sigmoid_derivative(x):
    return x * (1 - x)

# Function to train a Multi-Layer Perceptron (MLP) with one hidden layer
def mlp_train(X, y, hidden_neurons=2, lr=0.1, epochs=10000):
    np.random.seed(42)  # Setting seed for reproducibility
    input_neurons = X.shape[1]  # Number of input neurons
    output_neurons = 1  # Single output neuron
    
    # Initialize weights and biases
    weights_input_hidden = np.random.rand(input_neurons, hidden_neurons)
    bias_hidden = np.random.rand(hidden_neurons)
    weights_hidden_output = np.random.rand(hidden_neurons, output_neurons)
    bias_output = np.random.rand(output_neurons)
    
    # Training process
    for _ in range(epochs):
        # Forward propagation
        hidden_layer_input = np.dot(X, weights_input_hidden) + bias_hidden
        hidden_layer_output = sigmoid(hidden_layer_input)
        output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + bias_output
        output = sigmoid(output_layer_input)
        
        # Compute error (target minus prediction), shape (n_samples, 1)
        error = y.reshape(-1, 1) - output
        
        # Backpropagation: scale the error by the sigmoid slope at each layer
        d_output = error * sigmoid_derivative(output)
        d_hidden = d_output.dot(weights_hidden_output.T) * sigmoid_derivative(hidden_layer_output)
        
        # Update weights and biases; because error is target minus prediction,
        # adding lr * delta moves the parameters down the squared-error gradient
        weights_hidden_output += hidden_layer_output.T.dot(d_output) * lr
        bias_output += np.sum(d_output, axis=0) * lr
        weights_input_hidden += X.T.dot(d_hidden) * lr
        bias_hidden += np.sum(d_hidden, axis=0) * lr
    
    return weights_input_hidden, bias_hidden, weights_hidden_output, bias_output

# Function to make predictions using trained MLP
def mlp_predict(X, weights_input_hidden, bias_hidden, weights_hidden_output, bias_output):
    hidden_layer_input = np.dot(X, weights_input_hidden) + bias_hidden
    hidden_layer_output = sigmoid(hidden_layer_input)
    output_layer_input = np.dot(hidden_layer_output, weights_hidden_output) + bias_output
    return (sigmoid(output_layer_input) > 0.5).astype(int).flatten()

# XOR Truth Table (Inputs and expected outputs)
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # Inputs for XOR gate
y_xor = np.array([0, 1, 1, 0])  # Expected outputs for XOR gate

# Train MLP for XOR gate
weights_input_hidden, bias_hidden, weights_hidden_output, bias_output = mlp_train(X_xor, y_xor)
y_pred_xor = mlp_predict(X_xor, weights_input_hidden, bias_hidden, weights_hidden_output, bias_output)

# Printing the predictions
print("XOR Predictions:", y_pred_xor)  # Expected: [0, 1, 1, 0]


XOR Predictions: [0 1 1 0]
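
For comparison with the learned solution, a 2-2-1 sigmoid network can also be wired by hand to compute XOR. The weights below are chosen manually for illustration and are not the values produced by training: the first hidden unit approximates OR, the second approximates AND, and the output unit fires when OR is on and AND is off.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked weights (illustrative, not learned); columns are hidden units
W1 = np.array([[20.0, 20.0],
               [20.0, 20.0]])
b1 = np.array([-10.0, -30.0])      # unit 0 ~ OR, unit 1 ~ AND
W2 = np.array([[20.0], [-20.0]])   # output ~ OR and not AND
b2 = np.array([-10.0])

hidden = sigmoid(X @ W1 + b1)
output = sigmoid(hidden @ W2 + b2)
preds = (output > 0.5).astype(int).flatten()
print(preds)  # [0 1 1 0]
```

This shows that two hidden neurons are enough to represent XOR; whether gradient descent finds such a solution depends on the initialization, learning rate, and number of epochs.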
