A **Perceptron** is the most basic unit of a neural network: a simplified model of how a single biological neuron works, used in artificial intelligence and machine learning. It was introduced by **Frank Rosenblatt in 1958**.

---

## 🧠 What is a Perceptron?

A **Perceptron**:

* Takes **multiple input values** (features),
* Multiplies each by a **weight**,
* Adds a **bias**,
* Passes the result through an **activation function** (usually a step function or sign function),
* And produces a **single output** (usually 0 or 1).

---

## 🧮 Mathematical Formula:

For inputs: `x1, x2, ..., xn`
Weights: `w1, w2, ..., wn`
Bias: `b`
Then the **output** is:

```
output = activation(w1*x1 + w2*x2 + ... + wn*xn + b)
```

Or in vector form:

```
output = activation(w · x + b)
```
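
As a rough sketch, the vector form maps directly onto a NumPy dot product. The function name `perceptron_output` and the example values below are made up for illustration; they are not taken from the text above.

```python
import numpy as np

def step_function(z):
    # Classic threshold activation: 1 when the weighted sum is non-negative.
    return 1 if z >= 0 else 0

def perceptron_output(x, w, b):
    # Weighted sum of the inputs plus the bias, then the activation.
    z = np.dot(w, x) + b
    return step_function(z)

# Illustrative values only (assumed, not from the original text).
x = np.array([1.0, 0.0, 1.0])
w = np.array([0.5, -0.2, 0.1])
b = -0.3

# Weighted sum: 0.5*1 + (-0.2)*0 + 0.1*1 - 0.3 = 0.3, which is >= 0.
print(perceptron_output(x, w, b))  # prints 1
```

Setting all inputs to zero leaves only the bias (-0.3), so the same function returns 0 in that case.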

---

## 🔧 Activation Function (Classic Perceptron):

Typically, the **step function**:

```python
def step_function(x):
    return 1 if x >= 0 else 0
```

So the perceptron is essentially a **linear binary classifier**.

---

## ✅ Example:

Suppose we want to model a simple logic gate like **AND**:

```python
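inputs = [1, 1]
weights = [1, 1]
bias = -1.5

# Weighted sum: 1*1 + 1*1 - 1.5 = 0.5
weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
output = step_function(weighted_sum)  # step_function(0.5) -> 1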
```

This correctly predicts `1 AND 1 = 1`.
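
A single input pair does not show that the weights model AND in full. The sketch below runs the same weights and bias over all four input pairs; the helper name `and_gate` is an assumption added for illustration.

```python
def step_function(x):
    return 1 if x >= 0 else 0

weights = [1, 1]
bias = -1.5

def and_gate(inputs):
    # Weighted sum of both inputs plus the bias, thresholded at zero.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step_function(total)

# Sums are -1.5, -0.5, -0.5 and 0.5, so only (1, 1) crosses the threshold.
for inputs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(inputs, "=>", and_gate(inputs))
```

Only `(1, 1)` yields 1, which matches the AND truth table.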

---

## 🔁 Learning:

A perceptron learns using an update rule:

```
w = w + α * (target - output) * x
b = b + α * (target - output)
```

Where:

* `α` is the learning rate,
* `target` is the expected output,
* `output` is the predicted output,
* `x` is the input vector.

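The update changes the parameters only when a prediction is wrong (`target - output` is nonzero). If the data is linearly separable, repeating it over the training set is guaranteed to converge to a boundary that classifies every example correctly.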
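
To make the rule concrete, here is a minimal sketch of one update step on a misclassified example. The function name `update`, the zero starting weights, and the learning rate of 0.1 are assumptions chosen for illustration. The bias is updated like a weight on a constant input of 1, which is the usual convention.

```python
import numpy as np

def step_function(z):
    return 1 if z >= 0 else 0

def update(w, b, x, target, lr):
    # Predict with the current parameters.
    output = step_function(np.dot(w, x) + b)
    error = target - output  # -1, 0, or +1
    # Shift the weights along the input; nothing changes when error is 0.
    w = w + lr * error * x
    # The bias acts as a weight on a constant input of 1.
    b = b + lr * error
    return w, b, output

# Assumed starting point: all-zero parameters.
w = np.zeros(2)
b = 0.0
x = np.array([0.0, 1.0])

# Target is 0, but the zero weighted sum gives output 1, so error = -1.
w, b, output = update(w, b, x, target=0, lr=0.1)
print(output, w, b)
```

Here the prediction is 1 while the target is 0, so the weight on the second input and the bias each drop by 0.1. The weight on the first input stays at 0 because that input was 0.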

---

## ⚠️ Limitations:

* Can only classify **linearly separable data**.
* Cannot solve problems like **XOR**, which are **not linearly separable**.

> This limitation, highlighted by Minsky and Papert in 1969, fed early skepticism about neural networks. Interest revived once **multi-layer perceptrons (MLPs)** trained with **backpropagation** showed that networks could learn **non-linear functions**.
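
The XOR limitation can be checked empirically. The sketch below trains a single perceptron on the XOR truth table with the update rule from the Learning section. The function names, the learning rate, and the epoch count are assumptions for illustration.

```python
import numpy as np

def predict(w, b, x):
    # Step activation on the weighted sum.
    return 1 if np.dot(w, x) + b >= 0 else 0

def train(X, y, epochs=100, lr=0.1):
    # Assumed setup: zero initial weights and a fixed learning rate.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            error = target - predict(w, b, xi)
            w = w + lr * error * xi
            b = b + lr * error
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

w, b = train(X, y_xor)
predictions = np.array([predict(w, b, xi) for xi in X])
accuracy = np.mean(predictions == y_xor)
print(predictions, accuracy)
```

However long training runs, at least one of the four points stays misclassified, because no single straight line separates the XOR classes.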

---

## 🧱 TL;DR:

* **Perceptron** = a single neuron model.
* **Inputs + weights + bias → activation → output**
* Good for **linear** classification tasks.
* Foundation of deeper models such as **multi-layer perceptrons**, **CNNs**, and **transformers**.


```python
import numpy as np
from model import Perceptron

# Truth table inputs and labels for the OR gate
X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1],
])

y = np.array([0, 1, 1, 1])

model = Perceptron(input_size=2)
model.train(X, y, epochs=10)

for xi in X:
    print(f"{xi} => {model.predict(xi)}")
```

Output:

```
[0 0] => 0
[0 1] => 1
[1 0] => 1
[1 1] => 1
```
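
The code above imports `Perceptron` from a local `model` module that is not shown. The class below is a hypothetical stand-in that matches the calls used there (`Perceptron(input_size=...)`, `train(X, y, epochs=...)`, `predict(x)`). The zero initialisation, the `learning_rate` parameter, and its default of 0.1 are assumptions, and the real module may differ.

```python
import numpy as np

class Perceptron:
    """Hypothetical sketch of the class imported from `model`."""

    def __init__(self, input_size, learning_rate=0.1):
        # Assumed: weights and bias start at zero.
        self.weights = np.zeros(input_size)
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        # Step activation on the weighted sum plus bias.
        return 1 if np.dot(self.weights, x) + self.bias >= 0 else 0

    def train(self, X, y, epochs=10):
        for _ in range(epochs):
            for xi, target in zip(X, y):
                # Perceptron rule: parameters change only on a wrong prediction.
                error = target - self.predict(xi)
                self.weights = self.weights + self.learning_rate * error * xi
                self.bias = self.bias + self.learning_rate * error

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])  # OR gate labels

model = Perceptron(input_size=2)
model.train(X, y, epochs=10)

for xi in X:
    print(f"{xi} => {model.predict(xi)}")
```

With these assumed settings the sketch learns the OR labels within 10 epochs and predicts 0, 1, 1, 1 for the four inputs.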
