# 🧠 Perceptron: Summary

## What is a Perceptron?

A **perceptron** is the simplest type of neural network: a single artificial neuron used for **binary classification** (two-class problems). It tries to find a straight line (or, in higher dimensions, a hyperplane) that separates the two classes.

Invented in **1957 by Frank Rosenblatt**, it is a **linear classifier**.

---

## How It Works

* **Inputs (x):** Features like x₁, x₂, ..., xₙ.
* **Weights (w):** Each input has a weight showing importance.
* **Net Input:**

  * Formula: **z = w · x + b**
* **Activation Function (Step function):**

  * If **z ≥ 0 → output = 1**
  * If **z < 0 → output = -1**

The **decision boundary** is where **z = 0**.
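The net input and step activation can be sketched in a few lines of plain Python. This is an illustrative sketch, not the implementation used elsewhere in these notes: the helper names `net_input` and `step` are hypothetical, lists stand in for NumPy arrays to keep it dependency-free, and the threshold follows the `>= 0` convention of the `Perceptron` class's `_activation_function`. The weights and bias are the values reported by the OR-gate run later in this document.

```python
def net_input(w, x, b):
    """Net input z = w . x + b for one sample (plain Python lists)."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


def step(z):
    """Threshold activation: 1 when z >= 0, otherwise -1."""
    return 1 if z >= 0 else -1


# Weights and bias taken from the OR-gate run in this document.
w, b = [0.1, 0.1], -0.1
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [step(net_input(w, x, b)) for x in points]
print(predictions)
```

Note that (0, 1) and (1, 0) land exactly on the decision boundary (z = 0) with these weights, so which side of the threshold counts as positive matters for the result.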

---

## Learning: The Perceptron Algorithm

The perceptron learns by correcting its mistakes.

### Steps:

1. Initialize weights and bias (0 or small random values).
2. For each training example:

   * **Predict** output.
   * **Compare** prediction with true label.
   * **Update** only if wrong.

### Update Rule:

* **w_new = w_old + η * y * x**
* **b_new = b_old + η * y**

Where:

* **η** = learning rate
* **y** = true label (1 or -1)
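A single application of the update rule might look like the sketch below. The function name `perceptron_update` is hypothetical, the learning rate of 0.5 is chosen only so the arithmetic stays exact, and the prediction uses the `>= 0` threshold from the `Perceptron` class in this document.

```python
def perceptron_update(w, b, x, y, eta):
    """Return (w, b) after seeing one sample; change them only on a mistake."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    y_hat = 1 if z >= 0 else -1
    if y_hat == y:
        return w, b  # correct prediction: no update
    w_new = [w_i + eta * y * x_i for w_i, x_i in zip(w, x)]
    b_new = b + eta * y
    return w_new, b_new


# Start from zeros and feed two OR-gate samples.
w, b = [0.0, 0.0], 0.0
w, b = perceptron_update(w, b, [0, 0], -1, eta=0.5)  # z = 0 -> predicts 1, wrong
state_after_first = (list(w), b)
w, b = perceptron_update(w, b, [0, 1], 1, eta=0.5)   # z = -0.5 -> predicts -1, wrong
state_after_second = (list(w), b)
print(state_after_first, state_after_second)
```

The first mistake moves only the bias, because the input (0, 0) contributes nothing to the weight update; the second moves both the weight on the active input and the bias.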

---

## Why Updates Work

* If **false negative** (y = 1 but predicted -1):

  * Add ηx → makes output more positive.
* If **false positive** (y = -1 but predicted 1):

  * Subtract ηx → makes output more negative.
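The two cases above can be checked numerically. Substituting the update rule into the net input gives z_new = z_old + η · y · (‖x‖² + 1), so the net input for the misclassified sample always moves toward the true label. The sketch below verifies this for one false negative and one false positive; the helper names and the sample values are made up for illustration.

```python
def net_input(w, x, b):
    """Net input z = w . x + b."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


def apply_update(w, b, x, y, eta):
    """Apply the perceptron update rule unconditionally."""
    return [w_i + eta * y * x_i for w_i, x_i in zip(w, x)], b + eta * y


eta = 0.5
x = [1.0, 2.0]
expected_shift = eta * (sum(x_i * x_i for x_i in x) + 1)  # eta * (||x||^2 + 1)

# False negative: y = 1 but z < 0. The update should push z upward.
w_fn, b_fn = [-1.0, 0.0], 0.0
z_before_fn = net_input(w_fn, x, b_fn)
w_fn2, b_fn2 = apply_update(w_fn, b_fn, x, 1, eta)
z_after_fn = net_input(w_fn2, x, b_fn2)

# False positive: y = -1 but z >= 0. The update should push z downward.
w_fp, b_fp = [1.0, 0.0], 0.0
z_before_fp = net_input(w_fp, x, b_fp)
w_fp2, b_fp2 = apply_update(w_fp, b_fp, x, -1, eta)
z_after_fp = net_input(w_fp2, x, b_fp2)

print(z_before_fn, z_after_fn, z_before_fp, z_after_fp, expected_shift)
```

A single update is not guaranteed to fix the sample in general; it only moves z in the right direction by a fixed amount, which is why training loops over the data repeatedly.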

---

## Example Problem: OR Gate

The OR gate is linearly separable, so the perceptron can solve it.

* (0, 0) → -1
* (0, 1) → 1
* (1, 0) → 1
* (1, 1) → 1

The Python code below trains the perceptron to correctly classify these points.


```python
import numpy as np


class Perceptron:
    """
    A simple Perceptron classifier.

    Parameters
    ----------
    learning_rate : float
        The step size for weight updates (default 0.01).
    n_iters : int
        Number of passes over the training dataset (epochs) (default 100).

    Attributes
    ----------
    weights : 1d-array
        Weights after fitting.
    bias : scalar
        Bias unit after fitting.
    """

    def __init__(self, learning_rate=0.01, n_iters=100):
        self.learning_rate = learning_rate
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def _activation_function(self, x):
        """Bipolar step function: 1 if x >= 0, otherwise -1."""
        # Note: z == 0 is treated as the positive class
        return np.where(x >= 0, 1, -1)

    def fit(self, X, y):
        """
        Fit training data.

        Parameters
        ----------
        X : {array-like}, shape = [n_samples, n_features]
            Training vectors.
        y : array-like, shape = [n_samples]
            Target values (must be 1 or -1).
        """
        # Accept lists as well as arrays
        X = np.asarray(X)
        n_samples, n_features = X.shape

        # 1. Initialize weights and bias
        # We initialize to zeros for simplicity
        self.weights = np.zeros(n_features)
        self.bias = 0

        # We use 1 and -1 for our class labels
        y_ = np.array([1 if i > 0 else -1 for i in y])

        # 2. Loop for n_iters (epochs)
        for _ in range(self.n_iters):
            # 3. Loop over each training sample
            for idx, x_i in enumerate(X):
                # Calculate the net input (w . x + b)
                linear_output = np.dot(x_i, self.weights) + self.bias

                # Make a prediction
                y_predicted = self._activation_function(linear_output)

                # 4. Compare and update (only if wrong)
                if y_[idx] != y_predicted:
                    # Apply the Perceptron update rule
                    update = self.learning_rate * y_[idx]
                    self.weights += update * x_i
                    self.bias += update

    def predict(self, X):
        """
        Predict class labels for new data.

        Parameters
        ----------
        X : {array-like}, shape = [n_samples, n_features]
            Input vectors.

        Returns
        -------
        y_pred : array, shape = [n_samples]
            Predicted class labels (1 or -1).
        """
        linear_output = np.dot(X, self.weights) + self.bias
        return self._activation_function(linear_output)


# --- Main execution ---
if __name__ == "__main__":

    # We will test the Perceptron on the OR gate
    # We use 1 and -1 as our labels for True and False

    # Input data (OR gate)
    X = np.array([
        [0, 0],
        [0, 1],
        [1, 0],
        [1, 1]
    ])

    # Target labels (OR gate results)
    # 0 or 0 = 0 (False -> -1)
    # 0 or 1 = 1 (True -> 1)
    # 1 or 0 = 1 (True -> 1)
    # 1 or 1 = 1 (True -> 1)
    y = np.array([-1, 1, 1, 1])

    # Create and train the perceptron
    p = Perceptron(learning_rate=0.1, n_iters=10)
    p.fit(X, y)

    print("Perceptron training complete.")
    print(f"Learned Weights: {p.weights}")
    print(f"Learned Bias: {p.bias}")

    # Test the predictions
    print("\n--- Predictions ---")
    test_00 = p.predict([0, 0])
    test_01 = p.predict([0, 1])
    test_10 = p.predict([1, 0])
    test_11 = p.predict([1, 1])

    print(f"Input [0, 0] -> Output: {test_00}")
    print(f"Input [0, 1] -> Output: {test_01}")
    print(f"Input [1, 0] -> Output: {test_10}")
    print(f"Input [1, 1] -> Output: {test_11}")
```

```
Perceptron training complete.
Learned Weights: [0.1 0.1]
Learned Bias: -0.1

--- Predictions ---
Input [0, 0] -> Output: -1
Input [0, 1] -> Output: 1
Input [1, 0] -> Output: 1
Input [1, 1] -> Output: 1
```




## ⚠️ Key Concepts and Limitations

**Convergence:** The Perceptron Convergence Theorem guarantees that the algorithm finds a separating hyperplane after a finite number of updates *if* the data is linearly separable. If the data is not separable, the updates never stop, which is why the code caps the number of epochs (`n_iters`).
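One way to observe this behaviour is to stop training as soon as a full epoch produces no mistakes and report whether that ever happened. The sketch below is a dependency-free variant written for illustration, not the `Perceptron` class from this document: the function name `train_perceptron`, its return values, and the early-stopping check are assumptions. It is run on the OR labels and on XOR labels encoded as 1 and -1, a standard non-separable example.

```python
def train_perceptron(X, y, eta=0.1, max_epochs=100):
    """Train with the perceptron rule; stop early after a mistake-free epoch.

    Returns (weights, bias, epochs_run, converged).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            z = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
            y_hat = 1 if z >= 0 else -1
            if y_hat != y_i:
                w = [w_j + eta * y_i * x_j for w_j, x_j in zip(w, x_i)]
                b += eta * y_i
                mistakes += 1
        if mistakes == 0:
            return w, b, epoch, True  # every sample classified correctly
    return w, b, max_epochs, False    # hit the epoch cap without separating


X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_or = [-1, 1, 1, 1]
y_xor = [-1, 1, 1, -1]

_, _, or_epochs, or_converged = train_perceptron(X, y_or)
_, _, xor_epochs, xor_converged = train_perceptron(X, y_xor)
print("OR converged:", or_converged, "after", or_epochs, "epochs")
print("XOR converged:", xor_converged, "after", xor_epochs, "epochs")
```

On XOR the loop always runs to `max_epochs`, because a mistake-free epoch would require a separating line that does not exist.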

**Linear Separability (XOR Problem):** The perceptron's major limitation is that it can only solve linearly separable tasks.

**XOR Example (not linearly separable):**

* (0, 0) → 0
* (0, 1) → 1
* (1, 0) → 1
* (1, 1) → 0

No single straight line can separate the two output classes. This limitation of single-layer perceptrons, highlighted in Minsky and Papert's 1969 book *Perceptrons*, is often cited as one contributor to the first "AI winter." The solution became **Multi-Layer Perceptrons**, which handle nonlinear problems by stacking multiple layers.
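To make the multi-layer idea concrete, here is a minimal sketch of a two-layer network of step units that computes XOR. The weights are hand-picked for illustration, not learned, and the decomposition XOR = AND(OR, NAND) is one common choice among several. All names are hypothetical. Units output 1 or -1, so -1 corresponds to the 0 entries in the XOR table above.

```python
def unit(w, b, x):
    """One perceptron unit: step activation over the net input w . x + b."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1 if z >= 0 else -1


def xor_network(x):
    """Two hidden units (OR, NAND) feeding one output unit (AND)."""
    h_or = unit([1.0, 1.0], -0.5, x)      # -1 only for input (0, 0)
    h_nand = unit([-1.0, -1.0], 1.5, x)   # -1 only for input (1, 1)
    # The hidden outputs are in {-1, 1}; AND fires only when both are 1.
    return unit([1.0, 1.0], -1.5, [h_or, h_nand])


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_outputs = [xor_network(x) for x in inputs]
print(xor_outputs)
```

Each hidden unit draws its own line through the input space, and the output unit combines the two half-planes into a region that no single line could carve out.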
