# Perceptron

### 1. Decision Rule

$$
\hat{y} =
\begin{cases}
+1 & \text{if } \mathbf{w} \cdot \mathbf{x} + b > 0 \\
-1 & \text{otherwise}
\end{cases}
$$

- $\mathbf{x} \in \mathbb{R}^n$: Input feature vector, where $n$ is the number of features.
- $\mathbf{w} \in \mathbb{R}^n$: Weight vector, holding the model's learned weight for each feature.
- $b$: Bias term, allowing the decision boundary to shift.
- $\mathbf{w} \cdot \mathbf{x}$: Dot product of the weight vector and input vector, representing the weighted sum of the inputs.
- $\hat{y}$: Predicted label, either +1 or -1.
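
The decision rule can be tried on a few points. The snippet below is a minimal sketch: the helper name `decide` and the values of `w` and `b` are illustrative assumptions, not taken from the text above.

```python
import numpy as np

def decide(w, x, b):
    """Return +1 if w . x + b > 0, otherwise -1 (the decision rule above)."""
    return 1 if np.dot(w, x) + b > 0 else -1

# Illustrative values only.
w = np.array([2.0, -1.0])
b = -0.5

print(decide(w, np.array([1.0, 1.0]), b))  # score = 2 - 1 - 0.5 = 0.5, positive side
print(decide(w, np.array([0.0, 1.0]), b))  # score = -1 - 0.5 = -1.5, negative side
print(decide(w, np.array([0.5, 0.5]), b))  # score = 0, exactly on the boundary
```

A point lying exactly on the boundary falls into the "otherwise" branch, so it is labeled -1.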

---

### 2. Learning Rule
If the Perceptron misclassifies a sample $(\mathbf{x}_i, y_i)$, it updates the weights and bias to correct the mistake.

#### Update Rules:
1. **Weights Update**:
$\mathbf{w} \gets \mathbf{w} + \eta \cdot y_i \cdot \mathbf{x}_i$

2. **Bias Update**:
$b \gets b + \eta \cdot y_i$

- $y_i$: True label of the $i$-th sample ($y_i \in \{-1, +1\}$).
- $\mathbf{x}_i$: Feature vector of the $i$-th sample.
- $\eta$: Learning rate, controlling the step size of updates ($\eta > 0$).
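
To see why the update corrects a mistake, note that after one update the quantity $y_i(\mathbf{w} \cdot \mathbf{x}_i + b)$ grows by $\eta(\|\mathbf{x}_i\|^2 + 1)$, since $y_i^2 = 1$. A single update moves the sample toward the correct side, though it may take several updates to cross the boundary. The sketch below applies the rule once; all numeric values are illustrative assumptions.

```python
import numpy as np

eta = 0.1                        # learning rate (illustrative)
w = np.array([0.0, 0.0])
b = 0.0

x_i = np.array([1.0, 1.0])       # a sample whose true label is +1
y_i = 1

# Score before the update: 0.0, which the decision rule maps to -1 (a mistake).
score_before = np.dot(w, x_i) + b

# Apply the weight and bias updates once.
w = w + eta * y_i * x_i
b = b + eta * y_i

# Score after the update: it has moved toward the correct (positive) side.
score_after = np.dot(w, x_i) + b
print(score_before, score_after)
```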

---

### 3. Objective
The goal of the Perceptron is to find a hyperplane that correctly separates the two classes. The hyperplane is defined as:
$\mathbf{w} \cdot \mathbf{x} + b = 0$

- The vector $\mathbf{w}$ is **perpendicular** to the decision boundary.
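- The bias $b$ sets the offset of the hyperplane from the origin: the distance from the origin to the hyperplane is $|b| / \|\mathbf{w}\|$.
- The term $\mathbf{w} \cdot \mathbf{x} + b$ is proportional to the signed distance of a point $\mathbf{x}$ from the hyperplane; dividing it by $\|\mathbf{w}\|$ gives the signed distance itself.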
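
The geometry can be checked numerically. This is a minimal sketch assuming the Euclidean norm; the helper name `signed_distance` and the values of `w`, `b`, and `x` are illustrative. Dividing the raw score by $\|\mathbf{w}\|$ expresses the distance in the same units as $\mathbf{x}$.

```python
import numpy as np

def signed_distance(w, b, x):
    """Signed Euclidean distance from point x to the hyperplane w . x + b = 0."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

# Illustrative values: ||w|| = 5.
w = np.array([3.0, 4.0])
b = -5.0
x = np.array([3.0, 4.0])

print(np.dot(w, x) + b)                    # raw score: 9 + 16 - 5 = 20
print(signed_distance(w, b, x))            # 20 / 5 = 4, on the positive side
print(signed_distance(w, b, np.zeros(2)))  # -5 / 5 = -1, origin is on the negative side
```

The last line shows that the hyperplane lies $|b| / \|\mathbf{w}\| = 1$ unit from the origin.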

---

### 4. Algorithm Steps
1. Initialize $\mathbf{w} = \mathbf{0}$ (or small random values) and $b = 0$.
2. For each training sample $(\mathbf{x}_i, y_i)$:
   - Compute the prediction:
     $\hat{y}_i = \text{sign}(\mathbf{w} \cdot \mathbf{x}_i + b)$
   - If $\hat{y}_i \neq y_i$ (misclassification), update:
     $\mathbf{w} \gets \mathbf{w} + \eta \cdot y_i \cdot \mathbf{x}_i$
     $b \gets b + \eta \cdot y_i$
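3. Repeat for a fixed number of iterations or until convergence, that is, until a full pass over the training set produces no misclassifications. Convergence is guaranteed only if the classes are linearly separable.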


In [1]:
import numpy as np

In [2]:
class Perceptron:
    def __init__(self, learning_rate=0.01, n_iterations=1000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        """Train on X of shape (n_samples, n_features) with labels y of shape (n_samples,)."""
        n_samples, n_features = X.shape
        # Initialize weights and bias to zeros
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Convert labels to {-1, 1} if they are {0, 1}
        y = np.where(y <= 0, -1, 1)

        for _ in range(self.n_iterations):
            for idx, x_i in enumerate(X):
                # Compute the linear combination
                linear_output = np.dot(x_i, self.weights) + self.bias
                # Apply the sign function; np.sign returns 0 when the output is exactly 0
                y_predicted = np.sign(linear_output)

                # Update on a mistake; a zero output (point on the boundary) also counts as one
                if y[idx] * y_predicted <= 0:
                    self.weights += self.learning_rate * y[idx] * x_i
                    self.bias += self.learning_rate * y[idx]

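    def predict(self, X):
        """Predict labels in {-1, +1} for X of shape (n_samples, n_features)."""
        linear_output = np.dot(X, self.weights) + self.bias
        # Follow the decision rule: +1 if the output is > 0, otherwise -1
        # (np.sign would return 0 for a point exactly on the boundary)
        return np.where(linear_output > 0, 1.0, -1.0)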


In [3]:
if __name__ == "__main__":
    # Example dataset: Logical AND gate
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 0, 0, 1])  # AND gate output

    perceptron = Perceptron(learning_rate=0.1, n_iterations=10)
    perceptron.fit(X, y)

    predictions = perceptron.predict(X)
    print("Predictions:", predictions)

Predictions: [-1. -1. -1.  1.]
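
The class above always runs a fixed number of passes. Step 3 of the algorithm also allows stopping at convergence, which can be sketched as follows. The function name `train_until_converged` and the `max_epochs` parameter are hypothetical, introduced only for this illustration. The sketch assumes labels are already in {-1, +1} and uses a learning rate of 1.0 so that the arithmetic on this dataset stays exact.

```python
import numpy as np

def train_until_converged(X, y, learning_rate=1.0, max_epochs=100):
    """Hypothetical helper: apply perceptron updates until a full pass makes no mistakes.

    Expects labels in {-1, +1}. Returns (weights, bias, epochs_used, converged).
    """
    weights = np.zeros(X.shape[1])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # Same mistake test as the class above: a zero score counts as wrong.
            if y_i * (np.dot(x_i, weights) + bias) <= 0:
                weights += learning_rate * y_i * x_i
                bias += learning_rate * y_i
                mistakes += 1
        if mistakes == 0:
            # A clean pass means every training sample is on its correct side.
            return weights, bias, epoch, True
    return weights, bias, max_epochs, False

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])  # AND gate with labels in {-1, +1}

weights, bias, epochs, converged = train_until_converged(X, y)
predictions = np.where(X @ weights + bias > 0, 1, -1)
print(converged, epochs, predictions)
```

The `max_epochs` cap matters: on data that is not linearly separable the loop would otherwise never see a clean pass, so the function reports `converged=False` instead of running forever.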
