Objective: Write a program to implement a multi-layer perceptron (MLP) network with one hidden layer using NumPy in Python. Demonstrate that it can learn the XOR Boolean function.

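Below is an implementation of a simple multi-layer perceptron (MLP) network with one hidden layer, trained on the XOR Boolean function. We use the step function (which outputs 0 or 1) as the activation function and a perceptron-style error-correction rule to update the weights. Standard backpropagation needs a differentiable activation, and the derivative of the step function is zero everywhere except at the threshold, so the run recorded below stops at 75% accuracy rather than fully learning XOR.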

The network will have:

2 input neurons (to represent the two inputs for XOR).
4 neurons in the hidden layer.
1 output neuron to predict the XOR result.
The step function as the activation function.


In [1]:
import numpy as np

Step Activation Function:

We use the step function for both the hidden and output layers. It outputs 1 if the input is >= 0, otherwise, it outputs 0.

In [12]:
# Step function (activation function)
def step_function(x):
    return np.where(x >= 0, 1, 0)
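Before training anything, it helps to check that a step-activated network with one hidden layer can represent XOR at all. The sketch below uses hand-picked weights and biases (W_hidden, b_hidden, W_out, b_out are illustrative names and values, not learned and not part of the notebook). It uses only 2 hidden neurons instead of 4, and it relies on bias terms to set the thresholds.

```python
import numpy as np

def step_function(x):
    # Binary step: 1 where x >= 0, otherwise 0
    return np.where(x >= 0, 1, 0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked (hypothetical) parameters, chosen by inspection:
# hidden unit 0 behaves like OR, hidden unit 1 behaves like AND
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])     # shape (2 inputs, 2 hidden units)
b_hidden = np.array([[-0.5, -1.5]])   # thresholds for OR and AND
W_out = np.array([[1.0], [-1.0]])     # output fires for OR but not AND
b_out = np.array([[-0.5]])

hidden = step_function(X @ W_hidden + b_hidden)
xor_out = step_function(hidden @ W_out + b_out)
print(xor_out.ravel())  # [0 1 1 0]
```

So the architecture is capable of XOR. Whether a given training rule finds such weights is a separate question.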



In [13]:
# XOR inputs and outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # 4 XOR input pairs
y = np.array([[0], [1], [1], [0]])  # XOR outputs

# Initialize the weights randomly
input_layer_size = 2  # Number of input neurons
hidden_layer_size = 4  # Number of hidden layer neurons
output_layer_size = 1  # Output neuron size

# Random weight initialization for both layers
np.random.seed(1)  # For reproducibility
weights_input_hidden = np.random.randn(input_layer_size, hidden_layer_size)  # 2x4
weights_hidden_output = np.random.randn(hidden_layer_size, output_layer_size)  # 4x1

# Training parameters
epochs = 10000  # Number of epochs
learning_rate = 0.1  # Learning rate



input_layer_size: The number of inputs (2 in the XOR problem).

hidden_layer_size: The number of neurons in the hidden layer (4 neurons).

output_layer_size: The number of outputs (1 output for XOR).

weights_input_hidden: Randomly initialized weights connecting the input layer to the hidden layer. The shape is (2, 4) because there are 2 inputs and 4 neurons in the hidden layer.

weights_hidden_output: Randomly initialized weights connecting the hidden layer to the output layer. The shape is (4, 1) because there are 4 neurons in the hidden layer and 1 output neuron.

Biases: The code above does not define bias terms for either layer. If added, the hidden-layer bias would have shape (1, 4) and the output-layer bias shape (1, 1), both typically initialized to zeros.

learning_rate: The learning rate controls how much the weights are adjusted during each training iteration. Values between 0.01 and 0.1 are common starting points.

epochs: The number of times the entire dataset is passed through the network during training. Here it is set to 10,000.


In [14]:
# Training the MLP
for epoch in range(epochs):
    # Forward propagation
    hidden_layer_input = np.dot(X, weights_input_hidden)  # Input to hidden layer
    hidden_layer_output = step_function(hidden_layer_input)  # Output from hidden layer (activation)

    output_layer_input = np.dot(hidden_layer_output, weights_hidden_output)  # Input to output layer
    predicted_output = step_function(output_layer_input)  # Output from network (activation)

    # Compute the error
    error = y - predicted_output

    # Weight updates using a perceptron-style error-correction rule
    # (not true backpropagation: the step function has no usable gradient)
    # Update the weights from the hidden layer to the output layer
    weights_hidden_output += learning_rate * np.dot(hidden_layer_output.T, error)

    # Update the weights from the input layer to the hidden layer
    hidden_layer_error = np.dot(error, weights_hidden_output.T)
    # h * (1 - h) is the sigmoid derivative, not the step derivative; for
    # binary h (0 or 1) it is always 0, so this update never changes the weights
    weights_input_hidden += learning_rate * np.dot(X.T, hidden_layer_error * hidden_layer_output * (1 - hidden_layer_output))

    # Print error every 1000 epochs to monitor the learning process
    if epoch % 1000 == 0:
        total_error = np.sum(np.abs(error))  # Sum of absolute error
        print(f"Epoch {epoch}, Total Error: {total_error}")

Epoch 0, Total Error: 1
Epoch 1000, Total Error: 1
Epoch 2000, Total Error: 1
Epoch 3000, Total Error: 1
Epoch 4000, Total Error: 1
Epoch 5000, Total Error: 1
Epoch 6000, Total Error: 1
Epoch 7000, Total Error: 1
Epoch 8000, Total Error: 1
Epoch 9000, Total Error: 1
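The total error stays at 1 throughout the log above. One likely reason can be checked directly: the input-to-hidden update is multiplied by hidden_layer_output * (1 - hidden_layer_output), and that product is 0 whenever the activation is exactly 0 or 1. The sketch below rebuilds the same initial hidden weights (same seed and call order as the setup cell) and inspects that factor. The dummy_hidden_error array is a placeholder introduced only for illustration.

```python
import numpy as np

def step_function(x):
    # Binary step: 1 where x >= 0, otherwise 0
    return np.where(x >= 0, 1, 0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

np.random.seed(1)  # same seed as the setup cell
weights_input_hidden = np.random.randn(2, 4)

hidden_layer_output = step_function(X @ weights_input_hidden)

# For any h in {0, 1}, h * (1 - h) == 0
factor = hidden_layer_output * (1 - hidden_layer_output)
print(factor)

# Whatever error is propagated back, multiplying by an all-zero factor
# yields an all-zero update for the input-to-hidden weights.
dummy_hidden_error = np.ones_like(factor, dtype=float)  # placeholder values
update = X.T @ (dummy_hidden_error * factor)
print(update)

# Without a bias, the input [0, 0] gives a pre-activation of 0 for every
# hidden unit, so step_function returns 1 for all of them.
print(hidden_layer_output[0])  # [1 1 1 1]
```

Because the update is always zero, only the hidden-to-output weights are trained, and the hidden layer keeps whatever representation the random initialization happened to produce.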


Forward Propagation:

hidden_layer_input: The weighted sum of the inputs for the hidden layer. This is calculated by matrix multiplication, np.dot(X, weights_input_hidden). No bias is added in this code.

The shape of hidden_layer_input is (4, 4) since there are 4 data points and 4 hidden neurons.

hidden_layer_output: The output of the hidden layer, obtained by applying step_function (the activation function) to hidden_layer_input.

output_layer_input: The weighted sum of the inputs to the output layer, calculated as np.dot(hidden_layer_output, weights_hidden_output), again without a bias.

The shape of output_layer_input is (4, 1) since there are 4 data points and 1 output neuron.

predicted_output: The final output of the network, obtained by applying step_function to output_layer_input.
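A forward pass that does include bias terms might look like the sketch below. The names w1, b1, w2, b2 and the seed are assumptions for illustration and do not appear in the training cell. The point is the shapes: a (1, 4) or (1, 1) bias row is broadcast across all 4 data points.

```python
import numpy as np

def step_function(x):
    # Binary step: 1 where x >= 0, otherwise 0
    return np.where(x >= 0, 1, 0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # shape (4, 2)

rng = np.random.default_rng(0)    # assumed seed, for illustration only
w1 = rng.standard_normal((2, 4))  # input -> hidden weights
b1 = np.zeros((1, 4))             # hidden-layer bias
w2 = rng.standard_normal((4, 1))  # hidden -> output weights
b2 = np.zeros((1, 1))             # output-layer bias

# (4, 2) @ (2, 4) -> (4, 4); the (1, 4) bias is broadcast over the rows
hidden_layer_input = X @ w1 + b1
hidden_layer_output = step_function(hidden_layer_input)

# (4, 4) @ (4, 1) -> (4, 1); the (1, 1) bias is broadcast over the rows
output_layer_input = hidden_layer_output @ w2 + b2
predicted_output = step_function(output_layer_input)

print(hidden_layer_input.shape, output_layer_input.shape)  # (4, 4) (4, 1)
```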


Backpropagation (Weight Updates):

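The training cell computes the output error (y - predicted_output) and uses it directly to update weights_hidden_output.

It then propagates that error back to the hidden layer (hidden_layer_error) to update weights_input_hidden. Because this update is multiplied by hidden_layer_output * (1 - hidden_layer_output), which is 0 for binary activations, weights_input_hidden never actually changes. No biases are updated because the code defines none.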
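With a differentiable activation, gradients such as d_output and d_hidden can be computed and used to update W1, W2, b1, and b2 by gradient descent. The sketch below swaps the step function for a sigmoid and minimizes squared error; this is a different technique from the perceptron-style rule in the training cell. The seed, the learning rate of 0.5, and the 0.5 decision threshold are assumptions, and convergence to the XOR targets is not guaranteed for every initialization.

```python
import numpy as np

def sigmoid(z):
    # Smooth, differentiable replacement for the step function
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)   # assumed seed
W1 = rng.standard_normal((2, 4))  # input -> hidden weights
b1 = np.zeros((1, 4))             # hidden-layer bias
W2 = rng.standard_normal((4, 1))  # hidden -> output weights
b2 = np.zeros((1, 1))             # output-layer bias
learning_rate = 0.5               # assumed value
epochs = 10000

losses = []
for epoch in range(epochs):
    # Forward pass
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    losses.append(float(np.mean((y - output) ** 2)))

    # Backward pass: gradients of 0.5 * sum((output - y) ** 2)
    d_output = (output - y) * output * (1 - output)        # shape (4, 1)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)   # shape (4, 4)

    # Gradient descent step on all four parameter arrays
    W2 -= learning_rate * (hidden.T @ d_output)
    b2 -= learning_rate * d_output.sum(axis=0, keepdims=True)
    W1 -= learning_rate * (X.T @ d_hidden)
    b1 -= learning_rate * d_hidden.sum(axis=0, keepdims=True)

# Threshold the sigmoid output at 0.5 to get binary predictions
final_output = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
predictions = (final_output >= 0.5).astype(int)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
print(predictions.ravel())
```

Comparing the first and last loss values, and the predictions against [0, 1, 1, 0], shows whether a particular run has learned XOR.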

In [15]:
# Final predictions and accuracy
predictions = step_function(np.dot(step_function(np.dot(X, weights_input_hidden)), weights_hidden_output))
print(predictions)
accuracy = np.mean(predictions == y) * 100  # Accuracy as percentage
print(f"Accuracy: {accuracy}%")

[[0]
 [1]
 [1]
 [1]]
Accuracy: 75.0%


My Comments:

The target XOR outputs for the inputs [0, 0], [0, 1], [1, 0], [1, 1] are [0, 1, 1, 0] respectively. The network above predicts [0, 1, 1, 1], so it misclassifies the input [1, 1] and reaches 75% accuracy.
An accuracy of 100% would mean the network had successfully learned the XOR function.

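You can also use gradient descent with a differentiable activation (such as the sigmoid) instead of the perceptron learning rule to update the weights.

Adjusting the number of epochs or the learning rate alone is unlikely to help here: the total error stays at 1 from epoch 0 through epoch 9000, and the input-to-hidden weights are never updated. Adding bias terms and a differentiable activation is the more promising change.

Accuracy: The percentage of correct predictions after training.

As written, this implementation reaches 75% accuracy and does not yet demonstrate that the MLP learns the XOR function. It shows that the perceptron learning rule with a step activation trains only the output layer.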