# Lab 3 – Module 3: Building a Perceptron

**Time:** ~5 minutes

---

> **KEY IDEA**  
> A perceptron is the basic building block of neural networks.
> It is tiny and simple on its own, but millions of them, connected in layers, power systems like ChatGPT and image recognition.  
>
> Before answering, look closely at the two-step diagram below.
> Make sure you can trace what happens to a number as it moves through the perceptron from input to output.

## 1. The Two-Step Diagram

A perceptron does exactly two things, one after the other:

```
            STEP 1                        STEP 2
         Weighted Sum                   Activation
  ┌────────────────────────┐    ┌─────────────────────────┐
  │                        │    │                         │
  │  z = w₁·x₁ + w₂·x₂ + b │ ─▶ │  output = activation(z) │ ─▶ answer
  │                        │    │                         │
  └────────────────────────┘    └─────────────────────────┘

  x₁, x₂  =  the two inputs (coordinates of a data point)
  w₁, w₂  =  weights   (how much each input matters)
  b       =  bias      (shifts the dividing line left or right)
  z       =  a single number that says "which side of the line is this point on?"
```

**Step 1** is just multiplication and addition — the same kind of line equation you used in Lab 1.  
**Step 2** is the activation function from Modules 1–2. It introduces the *bend*: the output is no longer a straight-line function of the inputs, which is what lets networks of perceptrons handle more than just straight lines.
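
To trace a single point through both steps, here is a minimal sketch. The weights, bias, and input point are made-up values chosen for illustration (they are not taken from the lab datasets), and sigmoid is assumed as the Step 2 activation.

```python
import math

# Illustrative values (assumptions, not taken from the lab datasets)
w1, w2, b = 1.0, 1.0, 0.0    # weights and bias
x1, x2 = 1.5, 0.5            # one input point

# Step 1: weighted sum
z = w1 * x1 + w2 * x2 + b

# Step 2: activation (sigmoid squashes z into the range 0 to 1)
output = 1 / (1 + math.exp(-z))

print(f"z = {z:.2f}")
print(f"output = {output:.3f}")
print("predicted class:", 1 if output > 0.5 else 0)
```

Here z is positive, so the sigmoid output lands above 0.5 and the point is assigned to class 1.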

## 2. Setup

Run this cell to define the activation functions and create the two datasets used by the interactive perceptron tool.

```python
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
from ipywidgets import FloatSlider, Dropdown, interact

# Activation functions
def sigmoid(z):
    return 1 / (1 + np.exp(-np.clip(z, -500, 500)))

def relu(z):
    return np.maximum(0, z)

def step_fn(z):
    return (z > 0).astype(float)

# Two datasets: one diagonal, one vertical
np.random.seed(42)
n = 50

# Dataset A – Diagonal separation
Xa = np.vstack([
    np.random.randn(n, 2) * 0.5 + [-1.5, -1.5],
    np.random.randn(n, 2) * 0.5 + [ 1.5,  1.5]
])
ya = np.hstack([np.zeros(n), np.ones(n)])

# Dataset B – Vertical separation (left vs right)
Xb = np.vstack([
    np.random.randn(n, 2) * 0.6 + [-1.8, 0],
    np.random.randn(n, 2) * 0.6 + [ 1.8, 0]
])
yb = np.hstack([np.zeros(n), np.ones(n)])

datasets = {
    'Diagonal': (Xa, ya),
    'Left vs Right': (Xb, yb),
}

print('Setup complete! Two datasets ready.')
print('  Diagonal      \u2014 blue lower-left, red upper-right')
print('  Left vs Right \u2014 blue on the left, red on the right')
```

## 3. Interactive Perceptron

Adjust the **weights** (w₁, w₂) and **bias** (b) to move the green decision boundary so it separates blue from red.  
The background color shows which region the perceptron predicts as each class.

**Experiments to try:**
1. Start with the **Diagonal** dataset. Try to reach 100% accuracy.
2. Switch to **Left vs Right**. Which weight matters more now?
3. Change **only the bias** (leave weights fixed) and watch the boundary slide.
4. Change **only a weight** (leave bias fixed) and watch the boundary tilt.
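
Experiments 3 and 4 can also be checked with a little arithmetic. The boundary is where z = 0, which rearranges to `x2 = -(w1/w2)*x1 - b/w2` whenever w₂ is not zero. The helper below (`boundary_line`, a hypothetical name that is not part of the lab code) is a sketch that reports the slope and intercept of that line for illustrative parameter values.

```python
def boundary_line(w1, w2, b):
    """Return (slope, intercept) of the line w1*x1 + w2*x2 + b = 0.

    Hypothetical helper for illustration; assumes w2 != 0
    (when w2 == 0 the boundary is a vertical line).
    """
    if w2 == 0:
        raise ValueError("w2 is 0: the boundary is vertical, so it has no slope")
    return -w1 / w2, -b / w2

# Baseline
print(boundary_line(1.0, 1.0, 1.0))

# Changing only the bias: same slope, different intercept (a slide)
print(boundary_line(1.0, 1.0, 2.0))

# Changing only w1: the slope changes (a tilt)
print(boundary_line(2.0, 1.0, 1.0))
```

Compare the three printed pairs with what you see when you move the matching sliders.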

```python
def perceptron_viz(dataset_name, w1, w2, b, activation_name):
    X, y_true = datasets[dataset_name]
    act = {'Sigmoid': sigmoid, 'ReLU': relu, 'Step': step_fn}[activation_name]
    # Cutoff on the activation output that places the boundary at z = 0:
    # sigmoid(0) = 0.5 and Step outputs 0 or 1, while ReLU is positive only when z > 0.
    threshold = 0.0 if activation_name == 'ReLU' else 0.5

    # Decision surface on a grid
    g = np.linspace(-4, 4, 200)
    G1, G2 = np.meshgrid(g, g)
    Z_grid = act(w1 * G1 + w2 * G2 + b)
    pred_grid = (Z_grid > threshold).astype(int)

    # Classify data points
    z_data = act(w1 * X[:, 0] + w2 * X[:, 1] + b)
    pred_data = (z_data > threshold).astype(int)
    correct = pred_data == y_true
    acc = correct.mean() * 100

    fig, ax = plt.subplots(figsize=(8, 8), dpi=100)
    ax.contourf(G1, G2, pred_grid, levels=[-0.5, 0.5, 1.5],
                colors=['#cce0ff', '#ffcccc'], alpha=0.35)
    ax.contour(G1, G2, pred_grid, levels=[0.5], colors=['green'], linewidths=3)

    # Correct points
    ax.scatter(X[correct & (y_true == 0), 0], X[correct & (y_true == 0), 1],
              c='blue', s=80, alpha=0.7, edgecolors='k', lw=1.2, label='Class 0')
    ax.scatter(X[correct & (y_true == 1), 0], X[correct & (y_true == 1), 1],
              c='red',  s=80, alpha=0.7, edgecolors='k', lw=1.2, label='Class 1')
    # Misclassified
    if not correct.all():
        ax.scatter(X[~correct, 0], X[~correct, 1],
                  c='yellow', s=130, marker='X', edgecolors='red', lw=3, label='Wrong')

    ax.set_title(f'{dataset_name}  |  {activation_name}  |  Accuracy: {acc:.0f}%',
                fontsize=13, fontweight='bold')
    ax.set_xlabel('x\u2081'); ax.set_ylabel('x\u2082')
    ax.legend(fontsize=10, loc='upper left')
    ax.set_xlim(-4, 4); ax.set_ylim(-4, 4)
    ax.set_aspect('equal'); ax.grid(True, alpha=0.2)
    plt.tight_layout(); plt.show()

    print(f'Perceptron:  z = {w1:.1f} x\u2081 + {w2:.1f} x\u2082 + {b:.1f}  \u2192  {activation_name}(z)')
    if acc == 100:
        print('Perfect! Try the other dataset or a different activation.')

interact(
    perceptron_viz,
    dataset_name=Dropdown(options=list(datasets.keys()), value='Diagonal',
                          description='Dataset:'),
    w1=FloatSlider(min=-3, max=3, step=0.1, value=1.0, description='Weight w\u2081:',
                   continuous_update=False),
    w2=FloatSlider(min=-3, max=3, step=0.1, value=1.0, description='Weight w\u2082:',
                   continuous_update=False),
    b=FloatSlider(min=-5, max=5, step=0.1, value=0.0,  description='Bias b:',
                  continuous_update=False),
    activation_name=Dropdown(options=['Sigmoid', 'ReLU', 'Step'], value='Sigmoid',
                             description='Activation:')
);
```

## 4. What the Parameters Control

| Parameter | What it does |
|-----------|-------------|
| **Weights** (w₁, w₂) | Tilt / rotate the decision boundary. Making w₁ large and w₂ small creates a mostly vertical boundary (x₁ matters more). Equal weights make a diagonal boundary. |
| **Bias** (b) | Slide the boundary parallel to itself without changing its angle. Increasing b enlarges the region predicted as class 1; decreasing b shrinks it. |
| **Activation** | For these simple datasets, all activations can reach the same accuracy, because the data is already separable by a straight line. The activation matters more when things get complicated (Module 4). |
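
One way to see why the activation choice does not change what is achievable here: for a given z, the tests sigmoid(z) > 0.5, step(z) > 0.5, and relu(z) > 0 are all true exactly when z > 0, so each one puts the boundary on the same straight line. The sketch below checks this on a few made-up values of z. The ReLU cutoff of 0 is an assumption of this sketch; a different cutoff would shift the boundary.

```python
import math

zs = [-2.0, -0.5, 0.5, 2.0]   # sample weighted sums (illustrative values)

# Apply each activation, then the cutoff that turns its output into a class
sigmoid_pred = [1 / (1 + math.exp(-z)) > 0.5 for z in zs]
step_pred = [(1.0 if z > 0 else 0.0) > 0.5 for z in zs]
relu_pred = [max(0.0, z) > 0 for z in zs]   # cutoff of 0 is an assumption

print("sigmoid:", sigmoid_pred)
print("step:   ", step_pred)
print("relu:   ", relu_pred)
```

All three lists agree because each test reduces to asking whether z is positive.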

## Answer‑Sheet Questions (Q10 – Q12)

**Q10.** Look at the two-step diagram in Section 1. Without using any math, describe each step in plain language — what goes in, what happens, and what comes out?

**Q11.** Try adjusting **only the weights** while keeping the bias fixed. What changes about the decision boundary? Now try adjusting **only the bias**. What changes? Describe the difference between what each one controls.

**Q12.** You’ve now seen activation functions bend space (Module 1) and a perceptron combine weights, bias, and an activation function (this module). Where exactly in the perceptron does the “bending” happen — Step 1 or Step 2? Why does that matter for what kinds of patterns the perceptron can separate?

---

**Next:** Continue to **Module 4** to test the perceptron on the *impossible* patterns from Module 0 and discover its limits.