# Lab 2B: Perceptron Tutorial

## Objective
This notebook is written as a **guided tutorial**.

For each concept:
1. We first **solve a problem together**.
2. Then you are asked to **solve a similar problem yourself**.

By the end, you will understand how perceptrons work and how they implement logical gates.

## Section 1: Perceptron Prediction (Worked Example)

**Goal:** Understand how weights and bias produce an output.

We start with a perceptron **without learning**: the weights and bias are chosen by hand and stay fixed.

In [None]:
import numpy as np

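def step(z):
    # Step activation: 1 when z >= 0 (so z == 0 maps to 1), else 0
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    # Weighted sum of the inputs plus the bias, passed through the step function
    return step(np.dot(x, w) + b)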

# Example inputs
X = np.array([[2, 3], [1, 1], [3, 1]])

# Chosen weights and bias
w = np.array([1, -1])
b = 0

print("Worked example results:")
for x in X:
    print(x, perceptron(x, w, b))

Worked example results:
[2 3] 0
[1 1] 1
[3 1] 1


### Explanation
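- We compute the **dot product** of the inputs and the weights, $\mathbf{w} \cdot \mathbf{x}$
- We add the bias $b$
- We apply the step function $f$, which returns 1 when its argument is ≥ 0 and 0 otherwise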

This is exactly the equation:  
$y = f(\mathbf{w} \cdot \mathbf{x} + b)$
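
To make the three steps concrete, the sketch below computes the weighted sum for all three example inputs at once and then applies the step function. The variable name `z` and the vectorized `X @ w` form are illustrative choices, not part of the original cell.

```python
import numpy as np

X = np.array([[2, 3], [1, 1], [3, 1]])
w = np.array([1, -1])
b = 0

# Weighted sum for every row of X: z_i = w . x_i + b
z = X @ w + b
print(z)  # [-1  0  2]

# Step function applied element-wise: 1 where z >= 0, else 0
y = (z >= 0).astype(int)
print(y)  # [0 1 1]
```

The second input lands exactly on `z = 0`, and because the step function uses `>=`, it is classified as 1.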

### ✏️ Student Exercise 1
Change `w` and `b` so that:
- First input → output 1
- Second input → output 0
- Third input → output 1

In [1]:
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

# Example inputs
X = np.array([[2, 3], [1, 1], [3, 1]])

# TODO: try different w and b
w = np.array([1, 1])
b = -3

print("Student Exercise 1 results:")
for x in X:
    print(x, perceptron(x, w, b))

Student Exercise 1 results:
[2 3] 1
[1 1] 0
[3 1] 1


## Section 2: Training a Perceptron (Worked Example)

**Goal:** See how learning updates weights.

In [None]:
X = np.array([[2,3], [1,1], [2,1], [3,2]])
y = np.array([1, 0, 0, 1])

w = np.zeros(2)
b = 0
lr = 0.1

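for epoch in range(5):
    for i in range(len(X)):
        y_hat = perceptron(X[i], w, b)
        error = y[i] - y_hat        # 0 if correct, +1 or -1 if wrong
        w += lr * error * X[i]      # perceptron rule: shift w toward or away from X[i]
        b += lr * error             # the bias is updated the same way
    print(f"Epoch {epoch}: w={w}, b={b}")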

Epoch 0: w=[0.2 0.1], b=0.0
Epoch 1: w=[0.2 0.1], b=-0.1
Epoch 2: w=[0.2 0.1], b=-0.20000000000000004
Epoch 3: w=[0.1 0. ], b=-0.30000000000000004
Epoch 4: w=[0.3 0.3], b=-0.30000000000000004


### Explanation
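- `error = y[i] - y_hat` is 0 when the prediction is correct and +1 or -1 when it is wrong, so the weights change only on mistakes
- On a mistake, `w` moves by `lr * error * X[i]` and `b` by `lr * error`, which pushes the weighted sum for that sample toward the correct side of the threshold
- On linearly separable data such as this set, the rule eventually stops making mistakes, but five epochs are not enough here: the final `w=[0.3 0.3]`, `b≈-0.3` still outputs 1 for `[1, 1]` and `[2, 1]`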

**This is learning.**
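
One way to quantify "improvement" is to count the mistakes made in each pass over the data and keep training until a full pass produces none. The sketch below does this for the dataset above. The function name `train_until_converged`, the `max_epochs` cap, and the choice `lr=1.0` (which keeps the arithmetic in whole numbers) are assumptions made for illustration, not part of the original lab.

```python
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

def train_until_converged(X, y, lr=1.0, max_epochs=100):
    """Apply the perceptron rule until one full pass makes no mistakes."""
    w = np.zeros(X.shape[1])
    b = 0.0
    mistakes_per_epoch = []
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            error = y_i - perceptron(x_i, w, b)
            if error != 0:
                w += lr * error * x_i
                b += lr * error
                mistakes += 1
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:  # every sample was classified correctly
            break
    return w, b, mistakes_per_epoch

X = np.array([[2, 3], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 0, 0, 1])

w, b, history = train_until_converged(X, y)
print("mistakes per epoch:", history)
print("predictions:", [perceptron(x, w, b) for x in X])
```

The mistake count is not guaranteed to fall at every epoch. The rule only guarantees that it reaches zero eventually when the data are linearly separable.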

### ✏️ Student Exercise 2
Change the learning rate to `0.01` and `1.0`.
Observe how convergence changes.

In [2]:
# TODO: experiment with learning rate

def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

X_train = np.array([[2,3], [1,1], [2,1], [3,2]])
y_train = np.array([1, 0, 0, 1])

def train(lr, epochs=5):
    """Run the perceptron rule from zero weights and print w, b after each epoch."""
    w = np.zeros(2)
    b = 0
    print(f"\n--- Training with learning rate = {lr} ---")
    for epoch in range(epochs):
        for x_i, y_i in zip(X_train, y_train):
            error = y_i - perceptron(x_i, w, b)
            w += lr * error * x_i
            b += lr * error
        print(f"Epoch {epoch}: w={w}, b={b}")

# Experiment with learning rate = 0.01
train(0.01)

# Experiment with learning rate = 1.0
train(1.0)



--- Training with learning rate = 0.01 ---
Epoch 0: w=[0.02 0.01], b=0.0
Epoch 1: w=[0.02 0.01], b=-0.01
Epoch 2: w=[0.01 0.  ], b=-0.02
Epoch 3: w=[0.03 0.03], b=-0.019999999999999997
Epoch 4: w=[0.03 0.03], b=-0.03

--- Training with learning rate = 1.0 ---
Epoch 0: w=[2. 1.], b=0.0
Epoch 1: w=[2. 1.], b=-1.0
Epoch 2: w=[2. 1.], b=-2.0
Epoch 3: w=[1. 0.], b=-3.0
Epoch 4: w=[3. 3.], b=-3.0
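
One way to read the `0.1` and `1.0` runs is that, starting from zero weights, the learning rate mostly rescales `w` and `b`: the `1.0` values are ten times the `0.1` values at every epoch shown. A perceptron's output depends only on the sign of the weighted sum, so multiplying `w` and `b` by the same positive constant leaves every prediction unchanged. The snippet below checks this for the final weights of those two runs; `predict_all` is a hypothetical helper name introduced here. Small floating-point differences near a weighted sum of 0 can still change individual updates, which may be why the `0.01` run follows a slightly different path.

```python
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

def predict_all(X, w, b):
    """Return the perceptron output for every row of X."""
    return [step(np.dot(x, w) + b) for x in X]

X_train = np.array([[2, 3], [1, 1], [2, 1], [3, 2]])

# Final (w, b) from the lr = 1.0 run, and the same values scaled by 0.1
w_big, b_big = np.array([3.0, 3.0]), -3.0
w_small, b_small = 0.1 * w_big, 0.1 * b_big

same = predict_all(X_train, w_big, b_big) == predict_all(X_train, w_small, b_small)
print(same)  # True
```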


## Section 3: Logical Gates with Perceptrons

**Goal:** Understand perceptrons as logical decision units.

### Worked Example: AND Gate

Truth table:

| x₁ | x₂ | AND |
|----|----|-----|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

In [None]:
X = np.array([[0,0],[0,1],[1,0],[1,1]])

# AND gate parameters
w = np.array([1, 1])
b = -1.5

print("AND gate results:")
for x in X:
    print(x, perceptron(x, w, b))

### Explanation
- The weighted sum $x_1 + x_2 - 1.5$ is ≥ 0 only when both inputs are 1, because $1 + 1 = 2$ is the only input sum above the threshold of 1.5
- AND is **linearly separable**, so one perceptron is enough
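
A quick way to test candidate parameters for any two-input gate is to compare the perceptron's outputs with the gate's truth table. The helper below, `check_gate`, is a hypothetical name introduced for this sketch; it is shown with the AND parameters from the worked example and with a deliberately wrong bias.

```python
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

def check_gate(w, b, expected):
    """Return True if (w, b) reproduces `expected` on the four binary inputs."""
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    outputs = [perceptron(x, w, b) for x in X]
    return outputs == list(expected)

AND = [0, 0, 0, 1]
print(check_gate(np.array([1, 1]), -1.5, AND))  # True
print(check_gate(np.array([1, 1]), 0.5, AND))   # False: the bias is too high
```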

### ✏️ Student Exercise 3: OR Gate
Implement the OR gate using a perceptron.

Truth table:

| x₁ | x₂ | OR |
|----|----|----|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |

In [3]:
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

X = np.array([[0,0],[0,1],[1,0],[1,1]])

# TODO: choose w and b for OR gate
w = np.array([1, 1])
b = -0.5

print("OR gate results:")
for x in X:
    print(x, perceptron(x, w, b))

OR gate results:
[0 0] 0
[0 1] 1
[1 0] 1
[1 1] 1


## Section 4: XOR Gate – Why It Fails

**Goal:** Discover the limitation of a single perceptron.

In [None]:
y_xor = np.array([0,1,1,0])

print("Try to solve XOR with one perceptron:")
w = np.array([1, 1])
b = -1

for x in X:
    print(x, perceptron(x, w, b))

Try to solve XOR with one perceptron:
[0 0] 0
[0 1] 1
[1 0] 1
[1 1] 1


### Why XOR Fails with a Single Perceptron

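The output from the previous cell shows that this choice of weights and bias does not reproduce the XOR truth table: for the input `[1, 1]`, the perceptron outputs `1`, while the XOR gate should output `0`. (With `w = [1, 1]` and `b = -1`, the perceptron actually computes OR.) One failed attempt does not prove much on its own, but here the problem is not the particular values: no choice of weights and bias can work.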

**Linear Separability:**
The fundamental reason for this failure lies in the concept of **linear separability**. A single perceptron works by drawing a single straight line (or a hyperplane in higher dimensions) to separate the different classes of data points. Data points on one side of the line are classified as one output (e.g., 0), and points on the other side are classified as the other output (e.g., 1).

Consider the XOR inputs and desired outputs:
- `[0, 0]` -> `0`
- `[0, 1]` -> `1`
- `[1, 0]` -> `1`
- `[1, 1]` -> `0`

Plotted on a 2D graph, these four points are the corners of a unit square, with the two `0` outputs on one diagonal and the two `1` outputs on the other. No single straight line can separate the `0` points (`[0, 0]` and `[1, 1]`) from the `1` points (`[0, 1]` and `[1, 0]`): any line that puts both `1` points on one side also puts at least one `0` point on that side.
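
The same conclusion can be reached algebraically. As a short derivation, assume a perceptron with weights $w_1, w_2$ and bias $b$ that outputs 1 exactly when $w_1 x_1 + w_2 x_2 + b \ge 0$ (the step convention used in this notebook). Reproducing XOR would require all four conditions below to hold at once:

$$
\begin{aligned}
(0,0) \to 0 &: \quad b < 0 \\
(0,1) \to 1 &: \quad w_2 + b \ge 0 \\
(1,0) \to 1 &: \quad w_1 + b \ge 0 \\
(1,1) \to 0 &: \quad w_1 + w_2 + b < 0
\end{aligned}
$$

Adding the second and third conditions gives $w_1 + w_2 + 2b \ge 0$, so $w_1 + w_2 + b \ge -b > 0$ by the first condition. This contradicts the fourth condition, so no such $w_1, w_2, b$ exist.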

**Key Takeaway:**
The XOR problem demonstrates a critical limitation of a single-layer perceptron: it can only learn **linearly separable** patterns. For problems that are not linearly separable, such as XOR, a single perceptron is insufficient. To solve non-linearly separable problems, a more complex architecture, such as a multi-layer perceptron (with hidden layers), is required. These multi-layer networks can learn non-linear decision boundaries by combining multiple linear boundaries.
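
As a minimal sketch of that idea, XOR can be built from three perceptrons with hand-picked weights: a hidden layer that computes OR and NAND, and an output unit that computes AND of the two. The NAND parameters and the function name `xor_network` are choices made for this illustration and are not from the original lab; other weight choices work as well.

```python
import numpy as np

def step(z):
    return 1 if z >= 0 else 0

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

def xor_network(x):
    """Two-layer network: XOR(x) = AND(OR(x), NAND(x))."""
    h_or = perceptron(x, np.array([1, 1]), -0.5)       # hidden unit 1: OR
    h_nand = perceptron(x, np.array([-1, -1]), 1.5)    # hidden unit 2: NAND
    hidden = np.array([h_or, h_nand])
    return perceptron(hidden, np.array([1, 1]), -1.5)  # output unit: AND

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
for x in X:
    print(x, xor_network(x))
```

For the four inputs in order, this prints the outputs 0, 1, 1, 0, which matches the XOR truth table. Each hidden unit draws one line, and the output unit keeps only the region between them.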

## Summary

### Q&A
*   **Why can a single perceptron not solve the XOR problem?** Because the XOR function is not linearly separable: its outputs (0 for `[0,0]` and `[1,1]`, 1 for `[0,1]` and `[1,0]`) cannot be separated by a single straight line in a 2D plot.
*   **What is the key takeaway from the XOR gate section?** A single-layer perceptron can solve only linearly separable problems. For patterns that are not linearly separable, such as XOR, a more complex architecture like a multi-layer perceptron (with hidden layers) is required.

### Data Analysis Key Findings
*   When attempting to solve the XOR problem with a single perceptron using weights `w = [1, 1]` and bias `b = -1`, the perceptron produced the following outputs for the given inputs:
    *   `[0, 0]` -> `0`
    *   `[0, 1]` -> `1`
    *   `[1, 0]` -> `1`
    *   `[1, 1]` -> `1`
*   Comparing these results to the actual XOR truth table outputs (`[0, 1, 1, 0]`), the perceptron incorrectly classified the input `[1, 1]`, outputting `1` instead of the correct XOR output of `0`.

### Insights or Next Steps
*   The failure of a single perceptron on the XOR inputs shows that it can learn only linearly separable patterns.
*   Problems that are not linearly separable, like XOR, call for multi-layer perceptrons, which form non-linear decision boundaries by combining multiple linear boundaries.
