# **Implementation of Simple Perceptron and Analysis of its Limitation on XOR Problem**
This notebook is an Open Educational Resource (OER) developed for teaching and learning purposes. It is released under the Creative Commons Attribution–ShareAlike (CC BY-SA 4.0) International License.

This license allows anyone to use, copy, adapt, modify, translate, remix, and redistribute the material in any medium or format, provided proper credit is given to the original author and any modified versions are shared under the same license.


---


*Citation Format:*
 *Suneel Kumar Duvvuri, Implementation of Simple Perceptron and Analysis of its Limitation on XOR Problem. Open Educational Resource (OER). Licensed under CC BY-SA 4.0*

### 1. Perceptron Implementation

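First, let's implement a basic Perceptron model. A Perceptron is a single-layer neural network used for binary classification. It computes a weighted sum of its inputs (binary in the examples here, though they can be any real numbers), adds a bias, and passes the result through a step function to produce a single binary output.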
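
In symbols, the prediction and the learning rule used by the code in the next cell can be written as follows. The notation is introduced here for illustration only: $x$ is an input vector, $w$ and $b$ correspond to `weights` and `bias`, $\eta$ to `learning_rate`, $y$ is the true label, and $\hat{y}$ is the predicted label.

$$
\hat{y} =
\begin{cases}
1 & \text{if } w \cdot x + b \ge 0 \\
0 & \text{otherwise}
\end{cases}
\qquad
w \leftarrow w + \eta\,(y - \hat{y})\,x,
\qquad
b \leftarrow b + \eta\,(y - \hat{y})
$$

Because $y - \hat{y}$ is $0$ for a correct prediction, the parameters change only when the model makes a mistake.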

In [1]:
import numpy as np

class Perceptron:
    def __init__(self, learning_rate=0.01, n_iterations=100):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape

        # Initialize weights and bias
        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iterations):
            for idx, x_i in enumerate(X):
                # Calculate weighted sum and apply activation function (step function)
                linear_output = np.dot(x_i, self.weights) + self.bias
                y_predicted = 1 if linear_output >= 0 else 0

                # Perceptron learning rule: update is 0 for a correct prediction, so weights and bias change only on mistakes
                update = self.learning_rate * (y[idx] - y_predicted)
                self.weights += update * x_i
                self.bias += update

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        return np.where(linear_output >= 0, 1, 0)


### 2. Perceptron Experiment: AND Gate

Let's train our Perceptron to model a simple **AND** gate. The AND gate is a linearly separable problem: a single straight line in the input plane separates the one input that maps to 1 from the three inputs that map to 0.

In [2]:
# Input data for AND gate
X_and = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Output data for AND gate
y_and = np.array([0, 0, 0, 1])

# Create and train the Perceptron
perceptron_and = Perceptron(learning_rate=0.1, n_iterations=100)
perceptron_and.fit(X_and, y_and)

# Test the trained Perceptron
predictions_and = perceptron_and.predict(X_and)

print("AND Gate Predictions:")
for i in range(len(X_and)):
    print(f"Input: {X_and[i]}, Expected: {y_and[i]}, Predicted: {predictions_and[i]}")

accuracy_and = np.mean(predictions_and == y_and)
print(f"\nAccuracy for AND Gate: {accuracy_and * 100:.2f}%")


AND Gate Predictions:
Input: [0 0], Expected: 0, Predicted: 0
Input: [0 1], Expected: 0, Predicted: 0
Input: [1 0], Expected: 0, Predicted: 0
Input: [1 1], Expected: 1, Predicted: 1

Accuracy for AND Gate: 100.00%
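
To make "linearly separable" concrete, here is a minimal sketch that checks one hand-picked line, `x1 + x2 = 1.5`, against the AND truth table. The values of `w` and `b` are an assumption chosen for illustration; they are not necessarily the weights that `fit` finds, since many different lines separate the AND data.

```python
import numpy as np

# Hand-picked (not learned) parameters for the line x1 + x2 = 1.5.
# These values are an illustrative assumption, not the weights found by fit().
w = np.array([1.0, 1.0])
b = -1.5

X_and = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])

# Same decision rule as Perceptron.predict: output 1 when w.x + b >= 0.
hand_predictions = np.where(X_and @ w + b >= 0, 1, 0)

print("Predictions:", hand_predictions)  # [0 0 0 1]
print("Matches AND:", np.array_equal(hand_predictions, y_and))  # True
```

Only `(1, 1)` lies on the positive side of this line, which is exactly the AND gate's behaviour.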


### 3. Perceptron Drawback: XOR Gate

Now, let's try to train the Perceptron on an **XOR** gate. The XOR (exclusive OR) problem is a classic example of a problem that is not linearly separable: no single straight line can separate the inputs that map to 1 from the inputs that map to 0.

In [3]:
# Input data for XOR gate
X_xor = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

# Output data for XOR gate
y_xor = np.array([0, 1, 1, 0])

# Create and train the Perceptron
perceptron_xor = Perceptron(learning_rate=0.1, n_iterations=100)
perceptron_xor.fit(X_xor, y_xor)

# Test the trained Perceptron
predictions_xor = perceptron_xor.predict(X_xor)

print("XOR Gate Predictions:")
for i in range(len(X_xor)):
    print(f"Input: {X_xor[i]}, Expected: {y_xor[i]}, Predicted: {predictions_xor[i]}")

accuracy_xor = np.mean(predictions_xor == y_xor)
print(f"\nAccuracy for XOR Gate: {accuracy_xor * 100:.2f}%")


XOR Gate Predictions:
Input: [0 0], Expected: 0, Predicted: 1
Input: [0 1], Expected: 1, Predicted: 1
Input: [1 0], Expected: 1, Predicted: 0
Input: [1 1], Expected: 0, Predicted: 0

Accuracy for XOR Gate: 50.00%
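
One way to see what happens during training is to count the mistakes made in each pass over the data. The sketch below is a standalone re-implementation of the same learning rule, not part of the `Perceptron` class above. The helper name `mistakes_per_epoch` is hypothetical, and `learning_rate=1.0` with 20 epochs is an assumption chosen so that all arithmetic stays in exact whole numbers.

```python
import numpy as np

def mistakes_per_epoch(X, y, learning_rate=1.0, n_epochs=20):
    """Run the perceptron rule and return the number of wrong predictions in each epoch."""
    weights = np.zeros(X.shape[1])
    bias = 0.0
    history = []
    for _ in range(n_epochs):
        mistakes = 0
        for x_i, target in zip(X, y):
            predicted = 1 if np.dot(x_i, weights) + bias >= 0 else 0
            error = target - predicted
            if error != 0:
                # Update only on a mistake, as in the learning rule above.
                mistakes += 1
                weights += learning_rate * error * x_i
                bias += learning_rate * error
        history.append(mistakes)
    return history

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
and_history = mistakes_per_epoch(X, np.array([0, 0, 0, 1]))
xor_history = mistakes_per_epoch(X, np.array([0, 1, 1, 0]))

print("AND mistakes per epoch:", and_history)
print("XOR mistakes per epoch:", xor_history)
```

Under these assumptions the AND run reaches epochs with zero mistakes and then stops changing, while every XOR epoch contains at least one mistake: an epoch without mistakes would require a single set of weights that classifies all four XOR points correctly, and no such weights exist.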


### 4. Explanation of the Perceptron's Failure on XOR

As the XOR results show, the simple Perceptron does not reach 100% accuracy. In this run it classifies only 2 of the 4 inputs correctly (50%), and no choice of weights can get more than 3 of 4 right (75%). This is because:

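*   **Linear Separability:** A Perceptron can only learn a linear decision boundary. If a single straight line (or a hyperplane in higher dimensions) separates the classes, the Perceptron learning rule is guaranteed to find such a boundary after a finite number of updates, given enough iterations. If no such boundary exists, the weights never settle.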

*   **XOR's Non-Linear Nature:** The XOR function is not linearly separable. If you plot the inputs `(0,0)`, `(0,1)`, `(1,0)`, `(1,1)` and their corresponding outputs, you'll find that `(0,0)` and `(1,1)` are one class (0), and `(0,1)` and `(1,0)` are another class (1). No single straight line can separate these two groups of points.

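    *   To see why, suppose weights `w1`, `w2` and bias `b` did work with the rule "predict 1 when `w1*x1 + w2*x2 + b >= 0`". Then `(0,0)` requires `b < 0`, `(0,1)` requires `w2 + b >= 0`, `(1,0)` requires `w1 + b >= 0`, and `(1,1)` requires `w1 + w2 + b < 0`. Adding the two middle conditions gives `w1 + w2 + 2b >= 0`, so `w1 + w2 + b >= -b > 0`, which contradicts the last condition.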

*   **Solution:** A single-layer Perceptron cannot solve problems such as XOR that are not linearly separable. You need a more expressive architecture, such as a **multi-layer Perceptron (MLP)**: a feedforward neural network with at least one hidden layer. With non-linear activations, the hidden layer transforms the inputs into a new feature space in which the classes become linearly separable, which lets the network as a whole represent a non-linear decision boundary.
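
The idea of a hidden layer re-mapping the inputs can be illustrated with a minimal sketch of a two-layer network whose weights are set by hand. All parameter values below (`W_hidden`, `b_hidden`, `w_out`, `b_out`) are assumptions picked for illustration, not the result of training: one hidden unit behaves like OR, the other like NAND, and the output unit combines them with AND.

```python
import numpy as np

def step(z):
    """Step activation, matching the perceptron's rule: 1 when z >= 0, else 0."""
    return np.where(z >= 0, 1, 0)

# Hand-picked weights (an illustrative assumption, not learned by training).
# Column 0 is a hidden unit computing OR, column 1 a hidden unit computing NAND.
W_hidden = np.array([[1.0, -1.0],
                     [1.0, -1.0]])  # shape: (n_inputs, n_hidden)
b_hidden = np.array([-0.5, 1.5])

# The output unit computes AND of the two hidden units.
w_out = np.array([1.0, 1.0])
b_out = -1.5

X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

hidden = step(X_xor @ W_hidden + b_hidden)  # new feature space
output = step(hidden @ w_out + b_out)       # linear classifier on hidden features

print("Hidden features:\n", hidden)
print("Network output:", output)  # [0 1 1 0]
print("Matches XOR:", np.array_equal(output, y_xor))  # True
```

In the hidden space the four inputs map to `(0,1)`, `(1,1)`, `(1,1)`, and `(1,0)`. Only the XOR-true inputs land on `(1,1)`, so a single line now separates the classes. In practice these weights would be learned, typically with backpropagation and a differentiable activation instead of a hard step.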