## Perceptron Overview and Training Explanation

The Perceptron is one of the simplest types of artificial neural networks and serves as a fundamental building block for more complex models. It is designed to perform binary classification by learning a linear decision boundary that separates two classes.

### What the Code Does

1. **Initialization**  
   We start by initializing the perceptron’s weights and bias to random values, which `np.random.rand` draws uniformly from [0, 1). Each weight corresponds to one input feature, and the bias acts as an offset that lets the decision boundary shift away from the origin to better separate the data.

2. **Prediction**  
   For each input instance, the perceptron calculates a weighted sum of the input features plus the bias. This sum is passed through a step function that outputs 1 when the sum is greater than or equal to 0 and 0 otherwise, giving one of the two class labels.

3. **Training (Learning)**  
   During training, the perceptron iterates over the dataset multiple times (epochs). For each example, it:
   - Predicts the class using the current weights and bias.
   - Computes the error as the true label minus the predicted label, which is -1, 0, or 1.
   - Updates the weights and bias to reduce the error, nudging the decision boundary toward classifying that example correctly.

   Each weight update is the product of the learning rate (which controls the size of the adjustment), the error, and the corresponding input feature value, so a correctly classified example (error 0) changes nothing. The bias update is the product of the learning rate and the error, which is equivalent to treating the bias as a weight connected to a fixed input of 1.

4. **Convergence Check**  
   After each epoch, the code checks if there were any misclassifications. If the perceptron correctly classifies all examples (total error is zero), training stops early since the model has converged on a solution.
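
To make the prediction and update steps concrete, here is a minimal sketch of a single training step worked by hand. The starting weights, bias, and training example are illustrative assumptions chosen for easy arithmetic; they are not values from any run in this notebook.

```python
import numpy as np

def step(z):
    """Step activation: 1 when z >= 0, otherwise 0."""
    return 1 if z >= 0 else 0

# Illustrative starting values (assumptions, not taken from a real run)
weights = np.array([0.5, 0.5])
bias = 0.0
learning_rate = 0.1

x = np.array([0.0, 1.0])  # one training example
label = 0                 # its true class, AND(0, 1) = 0

# Prediction: weighted sum plus bias, then the step function
z = np.dot(x, weights) + bias
prediction = step(z)
error = label - prediction

# Update: the weight on the zero-valued input does not move
weights = weights + learning_rate * error * x
bias = bias + learning_rate * error

print("prediction:", prediction, "error:", error)
print("weights:", weights, "bias:", bias)
```

Here the weighted sum is 0.5, so the perceptron predicts 1 for an example labeled 0. The error of -1 lowers the second weight and the bias by the learning rate, while the first weight is unchanged because its input is 0.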

### Important Points

- **Linearity**: The perceptron can only solve problems where the two classes are linearly separable, meaning a straight line (or hyperplane) can separate the classes perfectly.
  
- **Failing on Non-linear Problems**: For problems like XOR, where no single line separates the two classes, the perceptron never reaches zero errors. Its weights keep changing until the epoch limit is hit.

- **Learning Rate & Epochs**: The learning rate governs how big each weight update step is, while epochs determine the maximum number of passes over the training data.
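
As a rough illustration of linear separability, the sketch below searches a coarse grid of weights and biases for a setting that classifies every row of a truth table under the same `>= 0` step rule. The helper `find_separator` and the grid range are assumptions made for this example, and a failed grid search on its own is only suggestive, not a proof.

```python
import itertools
import numpy as np

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
gates = {
    "AND": np.array([0, 0, 0, 1]),
    "OR": np.array([0, 1, 1, 1]),
    "XOR": np.array([0, 1, 1, 0]),
}

def find_separator(labels, grid):
    """Return the first (w1, w2, b) on the grid that classifies every row, else None."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        predictions = (inputs @ np.array([w1, w2]) + b >= 0).astype(int)
        if np.array_equal(predictions, labels):
            return (w1, w2, b)
    return None

# Hypothetical search range: -2.0 to 2.0 in steps of 0.25
grid = np.linspace(-2.0, 2.0, 17)
results = {name: find_separator(labels, grid) for name, labels in gates.items()}

for name, found in results.items():
    status = "separator found" if found is not None else "no separator on this grid"
    print(f"{name}: {status}")
```

For XOR the missing separator is not an artifact of the grid. The four rows require b < 0, w2 + b >= 0, w1 + b >= 0, and w1 + w2 + b < 0. Adding the two middle inequalities gives w1 + w2 + 2b >= 0, so w1 + w2 + b >= -b > 0, which contradicts the last requirement.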

This simple yet powerful algorithm forms the basis for understanding neural networks and demonstrates fundamental concepts of training via error correction and weight adjustment.

In [1]:
import numpy as np

In [6]:
import numpy as np

class Perceptron:
    """
    A simple implementation of a single-layer perceptron for binary classification.

    The perceptron learns a linear decision boundary by iteratively adjusting weights
    and bias based on the training data. It uses a step activation function to output
    either 0 or 1.

    Attributes:
        weights (np.ndarray): The weight vector corresponding to each input feature.
        bias (float): The bias term added to the weighted sum.
        learning_rate (float): Step size for weight updates.
        epochs (int): Maximum number of iterations over the training dataset.
    """

    def __init__(self, num_inputs, learning_rate=0.1, epochs=100):
        """
        Initialize the perceptron with random weights and bias.

        Args:
            num_inputs (int): Number of input features.
            learning_rate (float): Learning rate for the weight updates.
            epochs (int): Maximum number of training iterations.
        """
        self.weights = np.random.rand(num_inputs)  # One random weight per feature, uniform in [0, 1)
        self.bias = np.random.rand()               # Scalar float bias, matching the documented type
        self.learning_rate = learning_rate
        self.epochs = epochs

    def _step_function(self, x):
        """
        Step activation function that maps input to binary output.

        Args:
            x (float): The input value, typically the weighted sum of inputs plus bias.

        Returns:
            int: 1 if x is greater than or equal to 0, else 0.
        """
        return 1 if x >= 0 else 0

    def predict(self, inputs):
        """
        Predict the binary class label for given input features.

        Calculates the weighted sum of inputs and bias, then applies the step function.

        Args:
            inputs (np.ndarray): Input features as a 1D array.

        Returns:
            int: Predicted class label (0 or 1).
        """
        # Calculate weighted sum of inputs and add bias
        linear_output = np.dot(inputs, self.weights) + self.bias
        # Apply step function to get binary prediction
        return self._step_function(linear_output)

    def train(self, training_inputs, labels):
        """
        Train the perceptron on the provided dataset.

        Iteratively updates weights and bias based on prediction errors to reduce misclassifications.
        Stops early if the perceptron converges (no prediction errors).

        Args:
            training_inputs (np.ndarray): 2D array where each row is an input sample.
            labels (np.ndarray): 1D array of true binary class labels corresponding to inputs.
        """
        print(f"Weight at step 0 : {self.weights}")
        for epoch in range(self.epochs):
            total_error = 0
            for inputs, label in zip(training_inputs, labels):
                prediction = self.predict(inputs)
                error = label - prediction

                # Update weights: learning rate * error * input value (no change when error is 0)
                self.weights += self.learning_rate * error * inputs

                print(f"Weight at step {epoch+1} : {self.weights}")

                # Update bias: proportional to error and learning rate (bias input assumed 1)
                self.bias += self.learning_rate * error

                # Accumulate total error magnitude for this epoch
                total_error += abs(error)

            # If no errors during this epoch, training has converged
            if total_error == 0:
                print(f"Converged after {epoch + 1} epochs.")
                break
        else:
            # Reached maximum epochs without perfect convergence
            print(f"Training finished after {self.epochs} epochs.")


In [7]:
# --- AND Gate ---
print("--- AND Gate ---")
and_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
and_labels = np.array([0, 0, 0, 1])

perceptron_and = Perceptron(num_inputs=2)
perceptron_and.train(and_inputs, and_labels)

for inputs, label in zip(and_inputs, and_labels):
    prediction = perceptron_and.predict(inputs)
    print(f"AND({inputs[0]}, {inputs[1]}) -> Predicted: {prediction}, Actual: {label}")


--- AND Gate ---
Weight at step 0 : [0.52842781 0.80654865]
Weight at step 1 : [0.52842781 0.80654865]
Weight at step 1 : [0.52842781 0.70654865]
Weight at step 1 : [0.42842781 0.70654865]
Weight at step 1 : [0.42842781 0.70654865]
Weight at step 2 : [0.42842781 0.70654865]
Weight at step 2 : [0.42842781 0.60654865]
Weight at step 2 : [0.32842781 0.60654865]
Weight at step 2 : [0.32842781 0.60654865]
Weight at step 3 : [0.32842781 0.60654865]
Weight at step 3 : [0.32842781 0.50654865]
Weight at step 3 : [0.22842781 0.50654865]
Weight at step 3 : [0.22842781 0.50654865]
Weight at step 4 : [0.22842781 0.50654865]
Weight at step 4 : [0.22842781 0.40654865]
Weight at step 4 : [0.22842781 0.40654865]
Weight at step 4 : [0.22842781 0.40654865]
Weight at step 5 : [0.22842781 0.40654865]
Weight at step 5 : [0.22842781 0.30654865]
Weight at step 5 : [0.22842781 0.30654865]
Weight at step 5 : [0.22842781 0.30654865]
Weight at step 6 : [0.22842781 0.30654865]
Weight at step 6 : [0.22842781 0.3065

In [8]:
# --- OR Gate ---
print("\n--- OR Gate ---")
or_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
or_labels = np.array([0, 1, 1, 1])

perceptron_or = Perceptron(num_inputs=2)
perceptron_or.train(or_inputs, or_labels)

for inputs, label in zip(or_inputs, or_labels):
    prediction = perceptron_or.predict(inputs)
    print(f"OR({inputs[0]}, {inputs[1]}) -> Predicted: {prediction}, Actual: {label}")



--- OR Gate ---
Weight at step 0 : [0.58507941 0.2360992 ]
Weight at step 1 : [0.58507941 0.2360992 ]
Weight at step 1 : [0.58507941 0.2360992 ]
Weight at step 1 : [0.58507941 0.2360992 ]
Weight at step 1 : [0.58507941 0.2360992 ]
Weight at step 2 : [0.58507941 0.2360992 ]
Weight at step 2 : [0.58507941 0.2360992 ]
Weight at step 2 : [0.58507941 0.2360992 ]
Weight at step 2 : [0.58507941 0.2360992 ]
Weight at step 3 : [0.58507941 0.2360992 ]
Weight at step 3 : [0.58507941 0.2360992 ]
Weight at step 3 : [0.58507941 0.2360992 ]
Weight at step 3 : [0.58507941 0.2360992 ]
Converged after 3 epochs.
OR(0, 0) -> Predicted: 0, Actual: 0
OR(0, 1) -> Predicted: 1, Actual: 1
OR(1, 0) -> Predicted: 1, Actual: 1
OR(1, 1) -> Predicted: 1, Actual: 1


In [10]:
# --- XOR Gate (Expected to Fail) ---
print("\n--- XOR Gate (Expected to Fail) ---")
xor_inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
xor_labels = np.array([0, 1, 1, 0])

perceptron_xor = Perceptron(num_inputs=2, epochs=50) # Cap at 50 epochs (default is 100); XOR does not converge however many are allowed
perceptron_xor.train(xor_inputs, xor_labels)

for inputs, label in zip(xor_inputs, xor_labels):
    prediction = perceptron_xor.predict(inputs)
    print(f"XOR({inputs[0]}, {inputs[1]}) -> Predicted: {prediction}, Actual: {label}")


--- XOR Gate (Expected to Fail) ---
Weight at step 0 : [0.1188553  0.46465484]
Weight at step 1 : [0.1188553  0.46465484]
Weight at step 1 : [0.1188553  0.46465484]
Weight at step 1 : [0.1188553  0.46465484]
Weight at step 1 : [0.0188553  0.36465484]
Weight at step 2 : [0.0188553  0.36465484]
Weight at step 2 : [0.0188553  0.36465484]
Weight at step 2 : [0.0188553  0.36465484]
Weight at step 2 : [-0.0811447   0.26465484]
Weight at step 3 : [-0.0811447   0.26465484]
Weight at step 3 : [-0.0811447   0.26465484]
Weight at step 3 : [0.0188553  0.26465484]
Weight at step 3 : [-0.0811447   0.16465484]
Weight at step 4 : [-0.0811447   0.16465484]
Weight at step 4 : [-0.0811447   0.16465484]
Weight at step 4 : [0.0188553  0.16465484]
Weight at step 4 : [-0.0811447   0.06465484]
Weight at step 5 : [-0.0811447   0.06465484]
Weight at step 5 : [-0.0811447   0.16465484]
Weight at step 5 : [0.0188553  0.16465484]
Weight at step 5 : [-0.0811447   0.06465484]
Weight at step 6 : [-0.0811447   0.06465