# Lab 2: Machine Learning with PyTorch

## From Classical ML to Deep Learning Foundations

**Duration:** 90-120 minutes | **Difficulty:** Intermediate | **Prerequisites:** Lab 1

---

## Overview

This lab bridges classical machine learning and deep learning by teaching PyTorch fundamentals through hands-on implementation of ML algorithms. You'll learn to build, train, and evaluate models following the same patterns found in production deep learning systems.

### Lab Structure

| Part | Topic | Key Concepts |
|------|-------|--------------|
| **Part 1** | PyTorch Tensors | Creating tensors, tensor operations, automatic differentiation (autograd) |
| **Part 2** | Linear Regression | Training loop from scratch, MSE loss, nn.Module, gradient descent |
| **Part 3** | Logistic Regression | Sigmoid function, BCE loss, binary classification, decision boundaries |
| **Part 4** | Support Vector Machines | Kernel trick (linear, RBF, polynomial), margins, support vectors |
| **Part 5** | Model Evaluation | Confusion matrix, precision, recall, F1-score, classification report |

### Key Pattern You'll Learn

The core PyTorch training loop, which you'll reuse in nearly every deep learning project:

```python
for epoch in range(n_epochs):
    y_pred = model(X)           # Forward pass
    loss = criterion(y_pred, y) # Compute loss
    optimizer.zero_grad()       # Clear gradients
    loss.backward()             # Backward pass
    optimizer.step()            # Update weights
```

### Libraries Used

- **PyTorch** (torch, nn, optim) - Deep learning framework
- **scikit-learn** - SVM, train_test_split, metrics
- **NumPy, Pandas, Matplotlib** - Data manipulation and visualization

---

### Learning Objectives

By the end of this lab, you will be able to:

1. **Create and manipulate PyTorch tensors** - the building blocks of deep learning
2. **Use automatic differentiation** - PyTorch's killer feature for training models
3. **Implement Linear Regression** - from scratch and with nn.Module
4. **Build Logistic Regression** - for binary classification
5. **Train Support Vector Machines** - with different kernels
6. **Evaluate models properly** - confusion matrix, precision, recall, F1

### How This Lab Works

This lab follows the **Learn → Practice** pattern:

1. **Demonstration cells** show working examples with explanations
2. **Exercise cells** ask you to apply what you learned (`# YOUR CODE HERE`)
3. **Expected outputs** help you verify your solutions

---

**Let's begin!** Run the setup cell below first.

In [None]:
# ============================================
# SETUP - Run this cell first!
# ============================================
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.svm import SVC
from sklearn.datasets import make_classification, make_moons

# Configure matplotlib
%matplotlib inline
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['font.size'] = 12

# Set random seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

print("="*50)
print("  Setup Complete!")
print("="*50)
print(f"  PyTorch version: {torch.__version__}")
print(f"  NumPy version: {np.__version__}")
print("="*50)
print("\n  You're ready to start the lab!")

---

# Part 1: PyTorch Tensors

## What is a Tensor?

**Tensors** are the fundamental data structure in PyTorch - like NumPy arrays but with:
- **GPU support** - Run computations on graphics cards for massive speedups
- **Automatic differentiation** - Automatically compute gradients for training

| NumPy | PyTorch | Description |
|-------|---------|-------------|
| `np.array()` | `torch.tensor()` | Create from list |
| `np.zeros()` | `torch.zeros()` | Array of zeros |
| `np.ones()` | `torch.ones()` | Array of ones |
| `np.random.rand()` | `torch.rand()` | Random [0, 1) |
| `arr.shape` | `tensor.shape` | Get dimensions |

---
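The GPU support mentioned above can be sketched as follows. This is a minimal example that assumes a CUDA-capable GPU may or may not be present, so it falls back to the CPU; the names `device`, `t_cpu`, and `t_dev` are illustrative and not used elsewhere in the lab.

```python
import torch

# Pick a GPU if PyTorch can see one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor on the CPU, then move it to the chosen device.
t_cpu = torch.ones(2, 3)
t_dev = t_cpu.to(device)

# Operations run on whichever device holds the operands.
total = (t_dev * 2).sum()

print(f"device: {device}")
print(f"total: {total.item()}")  # 12.0 on either device
```

The rest of this lab runs comfortably on the CPU, so moving tensors to a device is optional here.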

## 1.1 Creating Tensors

### Demonstration: Creating Tensors

In [None]:
# ============================================
# DEMONSTRATION: Creating Tensors
# ============================================

# Method 1: From a Python list
t1 = torch.tensor([1, 2, 3, 4, 5])
print("1. From list:")
print(f"   torch.tensor([1, 2, 3, 4, 5]) = {t1}")
print(f"   Shape: {t1.shape}, Dtype: {t1.dtype}")
print()

# Method 2: From NumPy array
np_arr = np.array([[1, 2, 3], [4, 5, 6]])
t2 = torch.from_numpy(np_arr)
print("2. From NumPy:")
print(f"   torch.from_numpy(np_arr) =\n{t2}")
print()

# Method 3: Zeros and Ones
t_zeros = torch.zeros(2, 3)  # 2 rows, 3 columns
t_ones = torch.ones(2, 3)
print("3. Zeros (2x3):")
print(t_zeros)
print()

# Method 4: Random tensors
t_rand = torch.rand(2, 3)    # Uniform [0, 1)
t_randn = torch.randn(2, 3)  # Normal distribution (mean=0, std=1)
print("4. Random (uniform):")
print(t_rand)
print()

# Method 5: Specifying dtype
t_float = torch.tensor([1, 2, 3], dtype=torch.float32)
print(f"5. With dtype: {t_float} (dtype: {t_float.dtype})")
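
One detail worth knowing about Method 2: `torch.from_numpy` shares memory with the source array, while `torch.tensor` makes a copy. The short sketch below uses illustrative variable names to show the difference.

```python
import numpy as np
import torch

source = np.array([1, 2, 3])

shared = torch.from_numpy(source)  # shares memory with the NumPy array
copied = torch.tensor(source)      # independent copy of the data

source[0] = 99  # modify the NumPy array in place

print(f"shared: {shared}")  # first element is now 99
print(f"copied: {copied}")  # first element is still 1
```

If you need a tensor that will not change when the NumPy array does, prefer `torch.tensor` or call `.clone()` on the shared tensor.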

### Exercise 1.1: Create Tensors

**Your Task:** Create the tensors specified below.

**Expected Output:**
```
a) tensor([10, 20, 30, 40, 50])
b) Shape: torch.Size([3, 4])
c) Random tensor with shape torch.Size([2, 5])
```

In [None]:
# ============================================
# EXERCISE 1.1: Create Tensors
# ============================================

# a) Create a tensor containing [10, 20, 30, 40, 50]
tensor_a = None  # YOUR CODE HERE

# b) Create a 3x4 tensor filled with zeros
tensor_b = None  # YOUR CODE HERE

# c) Create a 2x5 tensor with random values (uniform)
tensor_c = None  # YOUR CODE HERE

# ---- Test your answers ----
print(f"a) {tensor_a}")
print(f"b) Shape: {tensor_b.shape if tensor_b is not None else 'None'}")
print(f"c) Random tensor with shape {tensor_c.shape if tensor_c is not None else 'None'}")

---

## 1.2 Tensor Operations

### Demonstration: Math Operations

In [None]:
# ============================================
# DEMONSTRATION: Tensor Operations
# ============================================

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

print("Tensor a:")
print(a)
print("\nTensor b:")
print(b)
print()

# Element-wise operations
print("Element-wise addition (a + b):")
print(a + b)
print()

print("Element-wise multiplication (a * b):")
print(a * b)
print()

# Matrix multiplication (VERY important for ML!)
print("Matrix multiplication (a @ b):")
print(a @ b)  # Same as torch.matmul(a, b)
print()

# Aggregations
print(f"Sum of all elements: a.sum() = {a.sum()}")
print(f"Mean of all elements: a.mean() = {a.mean()}")
print(f"Sum per column: a.sum(dim=0) = {a.sum(dim=0)}")
print(f"Sum per row: a.sum(dim=1) = {a.sum(dim=1)}")

### Key Operations Summary

| Operation | Code | Description |
|-----------|------|-------------|
| Add | `a + b` | Element-wise addition |
| Multiply | `a * b` | Element-wise multiplication |
| Matrix multiply | `a @ b` | Matrix multiplication |
| Sum | `a.sum()` | Sum all elements |
| Mean | `a.mean()` | Average of all elements |
| Reshape | `a.reshape(r, c)` | Change shape |
| Transpose | `a.T` | Swap rows/columns |

---
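Reshape and transpose both appear in the summary table but not in the demonstration cell. A minimal sketch, using an illustrative tensor `m`, shows that they can produce the same shape with different element orderings:

```python
import torch

m = torch.arange(6.0).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

reshaped = m.reshape(3, 2)  # refills row by row: [[0, 1], [2, 3], [4, 5]]
transposed = m.T            # swaps axes:         [[0, 3], [1, 4], [2, 5]]

print(f"reshape(3, 2):\n{reshaped}")
print(f"transpose:\n{transposed}")
print(f"Same shape: {reshaped.shape == transposed.shape}")
print(f"Same values: {torch.equal(reshaped, transposed)}")
```

Both results are 3x2, but only the transpose keeps each original column together as a row.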

### Exercise 1.2: Tensor Operations

**Your Task:** Perform the specified operations.

In [None]:
# ============================================
# EXERCISE 1.2: Tensor Operations
# ============================================

x = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
y = torch.tensor([[7., 8., 9.], [10., 11., 12.]])

print("x =")
print(x)
print("\ny =")
print(y)
print()

# a) Add x and y element-wise
result_add = None  # YOUR CODE HERE

# b) Calculate the mean of x
result_mean = None  # YOUR CODE HERE

# c) Calculate the sum of each row of x (dim=1)
result_row_sum = None  # YOUR CODE HERE

# d) Reshape x to be 3 rows x 2 columns
result_reshape = None  # YOUR CODE HERE

# ---- Test your answers ----
print(f"a) x + y =\n{result_add}")
print(f"\nb) Mean of x = {result_mean}")
print(f"\nc) Sum of each row = {result_row_sum}")
print(f"\nd) x reshaped to 3x2:\n{result_reshape}")

---

## 1.3 Automatic Differentiation (Autograd)

**This is PyTorch's killer feature!** It automatically computes gradients, which we need for training models.

### Why Do We Need Gradients?

Machine learning training works by:
1. Making a prediction
2. Calculating how wrong it is (loss)
3. **Computing gradients** to know which direction to adjust weights
4. Updating weights to reduce the error

PyTorch does step 3 automatically!

### Demonstration: Autograd

In [None]:
# ============================================
# DEMONSTRATION: Automatic Differentiation
# ============================================

# Create a tensor and tell PyTorch to track gradients
x = torch.tensor([3.0], requires_grad=True)
print(f"x = {x.item()}")
print(f"requires_grad = {x.requires_grad}")
print()

# Define a function: y = x^2 + 2x + 1
y = x**2 + 2*x + 1
print(f"y = x² + 2x + 1 = {y.item()}")
print()

# Compute the gradient (derivative): dy/dx = 2x + 2
y.backward()

# At x=3: dy/dx = 2(3) + 2 = 8
print(f"dy/dx = 2x + 2 = {x.grad.item()}")
print()
print("PyTorch computed the derivative automatically!")
print("This is how neural networks learn.")

### Demonstration: Gradient with Multiple Variables

In [None]:
# ============================================
# DEMONSTRATION: Multiple Variables
# ============================================

# Two parameters (like weights in ML)
w = torch.tensor([2.0], requires_grad=True)
b = torch.tensor([1.0], requires_grad=True)

# Input
x = torch.tensor([3.0])

# Forward pass: y = w*x + b (linear equation!)
y = w * x + b
print(f"w = {w.item()}, b = {b.item()}, x = {x.item()}")
print(f"y = w*x + b = {y.item()}")
print()

# Compute gradients
y.backward()

print("Gradients:")
print(f"  dy/dw = x = {w.grad.item()}")
print(f"  dy/db = 1 = {b.grad.item()}")
print()
print("These gradients tell us how to adjust w and b!")

### Exercise 1.3: Practice with Autograd

**Your Task:** Compute gradients for the given function.

**Expected Output:**
```
y = 3x² - 4x + 5 at x=2: y = 9.0
dy/dx = 6x - 4 at x=2: dy/dx = 8.0
```

In [None]:
# ============================================
# EXERCISE 1.3: Practice with Autograd
# ============================================

# Create x = 2.0 with gradient tracking
x = None  # YOUR CODE HERE: torch.tensor([2.0], requires_grad=True)

# Compute y = 3x² - 4x + 5
y = None  # YOUR CODE HERE

# Compute the gradient
# YOUR CODE HERE: call y.backward()

# ---- Test your answers ----
if x is not None and y is not None:
    print(f"y = 3x² - 4x + 5 at x=2: y = {y.item()}")
    print(f"dy/dx = 6x - 4 at x=2: dy/dx = {x.grad.item()}")
else:
    print("Complete the code above!")

---

# Part 2: Linear Regression

## What is Linear Regression?

Linear regression fits a line to data by minimizing the squared prediction error: **y = wx + b**

- **w** (weight): The slope of the line
- **b** (bias): The y-intercept

We'll train a model by:
1. Starting with random w and b
2. Making predictions
3. Measuring error (MSE loss)
4. Using gradients to improve w and b
5. Repeat!

---
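Steps 3 and 4 of the recipe above rest on the gradient of the MSE loss. For L = mean((wx + b - y)²), the derivatives are dL/dw = mean(2(wx + b - y)x) and dL/db = mean(2(wx + b - y)). The sketch below checks these hand-derived formulas against autograd on a small synthetic dataset; the data and starting values are illustrative assumptions, not the lab's dataset.

```python
import torch

torch.manual_seed(0)

# Small synthetic dataset (illustrative, noise-free)
x_demo = torch.rand(20, 1) * 10
y_demo = 3 * x_demo + 2

w_demo = torch.tensor([0.5], requires_grad=True)
b_demo = torch.tensor([0.0], requires_grad=True)

# MSE loss, with gradients computed by autograd
loss = ((w_demo * x_demo + b_demo - y_demo) ** 2).mean()
loss.backward()

# The same gradients from the hand-derived formulas
with torch.no_grad():
    residual = w_demo * x_demo + b_demo - y_demo
    grad_w = (2 * residual * x_demo).mean()
    grad_b = (2 * residual).mean()

print(f"autograd: dL/dw = {w_demo.grad.item():.4f}, dL/db = {b_demo.grad.item():.4f}")
print(f"by hand:  dL/dw = {grad_w.item():.4f}, dL/db = {grad_b.item():.4f}")
```

The two rows should agree up to floating-point rounding, which is a useful sanity check whenever you write a loss by hand.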

## 2.1 Generate Training Data

In [None]:
# ============================================
# SETUP: Generate Linear Regression Data
# ============================================

# True relationship: y = 3x + 2 (plus some noise)
np.random.seed(42)
n_samples = 100

X_np = np.random.rand(n_samples, 1) * 10  # X values from 0 to 10
y_np = 3 * X_np + 2 + np.random.randn(n_samples, 1) * 1.5  # y = 3x + 2 + noise

# Convert to PyTorch tensors
X = torch.tensor(X_np, dtype=torch.float32)
y = torch.tensor(y_np, dtype=torch.float32)

# Visualize the data
plt.figure(figsize=(10, 6))
plt.scatter(X.numpy(), y.numpy(), alpha=0.7, label='Data points')
plt.plot([0, 10], [2, 32], 'r--', linewidth=2, label='True line: y = 3x + 2')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression Training Data')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

print(f"Data shape: X={X.shape}, y={y.shape}")
print(f"Our goal: Learn w≈3 and b≈2 from the data!")

---

## 2.2 Linear Regression from Scratch

### Demonstration: The Training Loop

This loop is the core pattern behind nearly all gradient-based deep learning training:

In [None]:
# ============================================
# DEMONSTRATION: Training Loop from Scratch
# ============================================

# Step 1: Initialize parameters (random weight, zero bias)
w = torch.randn(1, requires_grad=True)  # Random weight
b = torch.zeros(1, requires_grad=True)  # Bias starts at 0

print(f"Initial parameters: w = {w.item():.4f}, b = {b.item():.4f}")
print()

# Hyperparameters
learning_rate = 0.01
n_epochs = 100

# Store loss history for plotting
losses = []

# Training loop
for epoch in range(n_epochs):
    # STEP 2: Forward pass - make predictions
    y_pred = w * X + b
    
    # STEP 3: Compute loss (Mean Squared Error)
    loss = ((y_pred - y) ** 2).mean()
    
    # STEP 4: Backward pass - compute gradients
    loss.backward()
    
    # STEP 5: Update parameters (gradient descent)
    with torch.no_grad():  # Don't track these operations
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad
        
        # Clear gradients for next iteration
        w.grad.zero_()
        b.grad.zero_()
    
    losses.append(loss.item())
    
    # Print progress every 20 epochs
    if (epoch + 1) % 20 == 0:
        print(f'Epoch {epoch+1:3d}/{n_epochs} | Loss: {loss.item():.4f} | w: {w.item():.4f}, b: {b.item():.4f}')

print()
print(f"Final parameters: w = {w.item():.4f}, b = {b.item():.4f}")
print(f"True parameters:  w = 3.0000, b = 2.0000")

In [None]:
# Visualize training results
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Loss curve
axes[0].plot(losses, 'b-', linewidth=2)
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('MSE Loss')
axes[0].set_title('Training Loss Over Time')
axes[0].grid(alpha=0.3)

# Plot 2: Learned line vs data
axes[1].scatter(X.numpy(), y.numpy(), alpha=0.6, label='Data')
X_line = torch.linspace(0, 10, 100).reshape(-1, 1)
y_line = w.detach() * X_line + b.detach()
axes[1].plot(X_line.numpy(), y_line.numpy(), 'r-', linewidth=2,
             label=f'Learned: y = {w.item():.2f}x + {b.item():.2f}')
axes[1].plot([0, 10], [2, 32], 'g--', linewidth=2, alpha=0.5, label='True: y = 3x + 2')
axes[1].set_xlabel('X')
axes[1].set_ylabel('y')
axes[1].set_title('Linear Regression Fit')
axes[1].legend()
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

### The Training Loop Pattern

```python
for epoch in range(n_epochs):
    # 1. Forward pass: make prediction
    y_pred = model(X)
    
    # 2. Compute loss
    loss = loss_function(y_pred, y)
    
    # 3. Backward pass: compute gradients
    loss.backward()
    
    # 4. Update parameters
    optimizer.step()
    
    # 5. Clear gradients
    optimizer.zero_grad()
```
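
Gradients have to be cleared once per iteration because `.backward()` adds to `.grad` instead of overwriting it. Clearing can happen before `backward()` or after `step()`, as long as it happens exactly once per pass. A minimal sketch with an illustrative single parameter `p`:

```python
import torch

p = torch.tensor([2.0], requires_grad=True)

# First backward pass: d(3p)/dp = 3
(3 * p).sum().backward()
first = p.grad.clone()

# Second backward pass without clearing: gradients add up (3 + 3)
(3 * p).sum().backward()
accumulated = p.grad.clone()

# Clearing resets the stored gradient before the next pass
p.grad.zero_()
(3 * p).sum().backward()
after_reset = p.grad.clone()

print(f"first pass:       {first.item()}")        # 3.0
print(f"without clearing: {accumulated.item()}")  # 6.0
print(f"after zeroing:    {after_reset.item()}")  # 3.0
```

Forgetting to clear gradients is a common cause of training that diverges for no obvious reason.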

---

## 2.3 Linear Regression with nn.Module

PyTorch provides `nn.Module` to make building models easier.

### Demonstration: Using nn.Module

In [None]:
# ============================================
# DEMONSTRATION: nn.Module
# ============================================

# Define a model class
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.Linear(input_features, output_features)
        # This automatically creates w and b for us!
        self.linear = nn.Linear(1, 1)
    
    def forward(self, x):
        # Define how data flows through the model
        return self.linear(x)

# Create model instance
model = LinearRegressionModel()

# Loss function
criterion = nn.MSELoss()

# Optimizer (handles the gradient descent for us)
optimizer = optim.SGD(model.parameters(), lr=0.01)

print("Model structure:")
print(model)
print()
print("Initial parameters:")
for name, param in model.named_parameters():
    print(f"  {name}: {param.data}")

In [None]:
# Training with nn.Module (much cleaner!)
n_epochs = 100
losses_nn = []

for epoch in range(n_epochs):
    # Forward pass
    y_pred = model(X)
    loss = criterion(y_pred, y)
    
    # Backward pass and optimization
    optimizer.zero_grad()  # Clear gradients
    loss.backward()        # Compute gradients
    optimizer.step()       # Update parameters
    
    losses_nn.append(loss.item())
    
    if (epoch + 1) % 20 == 0:
        print(f'Epoch {epoch+1:3d}/{n_epochs} | Loss: {loss.item():.4f}')

# Get learned parameters
w_learned = model.linear.weight.item()
b_learned = model.linear.bias.item()
print(f"\nLearned: w = {w_learned:.4f}, b = {b_learned:.4f}")
print(f"True:    w = 3.0000, b = 2.0000")

### Exercise 2.1: Build Your Own Linear Regression Model

**Your Task:** Complete the model class and training loop.

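**Expected:** `w` should land close to 3. With unscaled inputs and only 100 epochs at `lr=0.01`, the bias converges more slowly, so `b` may still be noticeably below 2; training for more epochs brings it closer.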

In [None]:
# ============================================
# EXERCISE 2.1: Build Linear Regression
# ============================================

# Reset random seed
torch.manual_seed(123)

# Define your model
class MyLinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        # YOUR CODE HERE: Create a linear layer (1 input, 1 output)
        self.linear = None
    
    def forward(self, x):
        # YOUR CODE HERE: Return the output of the linear layer
        return None

# Create model, loss function, and optimizer
my_model = MyLinearRegression()
my_criterion = None  # YOUR CODE HERE: nn.MSELoss()
my_optimizer = None  # YOUR CODE HERE: optim.SGD(my_model.parameters(), lr=0.01)

# Training loop
my_losses = []
for epoch in range(100):
    # YOUR CODE HERE: Complete the training loop
    # 1. Forward pass: y_pred = my_model(X)
    # 2. Compute loss: loss = my_criterion(y_pred, y)
    # 3. Zero gradients: my_optimizer.zero_grad()
    # 4. Backward pass: loss.backward()
    # 5. Update weights: my_optimizer.step()
    pass

# Print results
if my_model.linear is not None:
    print(f"Learned: w = {my_model.linear.weight.item():.4f}, b = {my_model.linear.bias.item():.4f}")
    print(f"True:    w = 3.0000, b = 2.0000")

---

# Part 3: Logistic Regression (Classification)

## What is Logistic Regression?

For **binary classification** (two classes), we need to output a probability.

**Key idea:** Use the **sigmoid function** to map any real-valued output into the open interval (0, 1)

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

- If σ(z) ≥ 0.5 → Predict class 1
- If σ(z) < 0.5 → Predict class 0

---
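The sigmoid formula and the 0.5 threshold can be checked by hand, together with the binary cross-entropy (BCE) loss used later in this part. The sketch below assumes the standard BCE definition, -mean(y·log(p) + (1 - y)·log(1 - p)); the logits and labels are illustrative values.

```python
import torch
import torch.nn as nn

# Illustrative raw model outputs (logits) and binary labels
z_demo = torch.tensor([[2.0], [-1.0], [0.5]])
y_demo = torch.tensor([[1.0], [0.0], [0.0]])

# Sigmoid from the formula, compared with the built-in
p_manual = 1 / (1 + torch.exp(-z_demo))
p_builtin = torch.sigmoid(z_demo)

# Binary cross-entropy by hand, compared with nn.BCELoss
bce_manual = -(y_demo * torch.log(p_manual)
               + (1 - y_demo) * torch.log(1 - p_manual)).mean()
bce_builtin = nn.BCELoss()(p_builtin, y_demo)

# Threshold at 0.5 to turn probabilities into class predictions
pred = (p_builtin >= 0.5).float()

print(f"probabilities: {p_builtin.flatten()}")
print(f"BCE by hand: {bce_manual.item():.4f}, nn.BCELoss: {bce_builtin.item():.4f}")
print(f"predicted classes: {pred.flatten()}")  # tensor([1., 0., 1.])
```

The third sample is misclassified (predicted 1, labeled 0), which is why its term contributes most to the loss.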

## 3.1 Generate Classification Data

In [None]:
# ============================================
# SETUP: Generate Classification Data
# ============================================

# Create a classification dataset
X_class, y_class = make_classification(
    n_samples=300,
    n_features=2,
    n_redundant=0,
    n_informative=2,
    n_clusters_per_class=1,
    random_state=42
)

# Split into train and test
X_train, X_test, y_train, y_test = train_test_split(
    X_class, y_class, test_size=0.2, random_state=42
)

# Standardize features (important for gradient descent!)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Convert to tensors
X_train_t = torch.tensor(X_train_scaled, dtype=torch.float32)
X_test_t = torch.tensor(X_test_scaled, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.float32).reshape(-1, 1)
y_test_t = torch.tensor(y_test, dtype=torch.float32).reshape(-1, 1)

# Visualize
plt.figure(figsize=(10, 6))
plt.scatter(X_train_scaled[y_train == 0, 0], X_train_scaled[y_train == 0, 1],
            c='blue', label='Class 0', alpha=0.7, edgecolors='black')
plt.scatter(X_train_scaled[y_train == 1, 0], X_train_scaled[y_train == 1, 1],
            c='red', label='Class 1', alpha=0.7, edgecolors='black')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Binary Classification Dataset')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

print(f"Training samples: {len(X_train)}")
print(f"Test samples: {len(X_test)}")

### Demonstration: The Sigmoid Function

In [None]:
# ============================================
# DEMONSTRATION: Sigmoid Function
# ============================================

z = np.linspace(-10, 10, 100)
sigmoid = 1 / (1 + np.exp(-z))

plt.figure(figsize=(10, 5))
plt.plot(z, sigmoid, 'b-', linewidth=2)
plt.axhline(y=0.5, color='red', linestyle='--', label='Threshold = 0.5')
plt.axvline(x=0, color='gray', linestyle='--', alpha=0.5)
plt.fill_between(z, sigmoid, 0.5, where=(sigmoid > 0.5), alpha=0.3, color='green', label='Predict Class 1')
plt.fill_between(z, sigmoid, 0.5, where=(sigmoid < 0.5), alpha=0.3, color='red', label='Predict Class 0')
plt.xlabel('z (linear output)')
plt.ylabel('σ(z) = Probability')
plt.title('Sigmoid Function: σ(z) = 1 / (1 + e^(-z))')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

print("The sigmoid function:")
print("- Converts any number to a probability (0 to 1)")
print("- When z ≥ 0, probability ≥ 0.5 → Predict class 1")
print("- When z < 0, probability < 0.5 → Predict class 0")

---

## 3.2 Building a Logistic Regression Model

### Demonstration: Logistic Regression

In [None]:
# ============================================
# DEMONSTRATION: Logistic Regression Model
# ============================================

class LogisticRegressionModel(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        self.linear = nn.Linear(input_dim, 1)
        self.sigmoid = nn.Sigmoid()
    
    def forward(self, x):
        # Linear transformation, then sigmoid
        z = self.linear(x)
        return self.sigmoid(z)

# Create model
log_model = LogisticRegressionModel(input_dim=2)

# Binary Cross-Entropy Loss (for classification)
criterion_bce = nn.BCELoss()

# Optimizer (Adam adapts the step size per parameter and often converges faster than plain SGD)
optimizer_log = optim.Adam(log_model.parameters(), lr=0.1)

print("Model structure:")
print(log_model)

In [None]:
# Training
n_epochs = 100
losses_log = []
accuracies = []

for epoch in range(n_epochs):
    # Forward pass
    y_pred_prob = log_model(X_train_t)
    loss = criterion_bce(y_pred_prob, y_train_t)
    
    # Calculate accuracy
    y_pred_class = (y_pred_prob >= 0.5).float()
    accuracy = (y_pred_class == y_train_t).float().mean()
    
    # Backward pass
    optimizer_log.zero_grad()
    loss.backward()
    optimizer_log.step()
    
    losses_log.append(loss.item())
    accuracies.append(accuracy.item())
    
    if (epoch + 1) % 20 == 0:
        print(f'Epoch {epoch+1:3d}/{n_epochs} | Loss: {loss.item():.4f} | Accuracy: {accuracy.item():.4f}')

# Test accuracy
log_model.eval()
with torch.no_grad():
    y_test_pred = log_model(X_test_t)
    y_test_class = (y_test_pred >= 0.5).float()
    test_acc = (y_test_class == y_test_t).float().mean()

print(f"\nTest Accuracy: {test_acc.item():.4f}")

In [None]:
# Visualize decision boundary
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

# Loss curve
axes[0].plot(losses_log, 'b-', linewidth=2)
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('BCE Loss')
axes[0].set_title('Training Loss')
axes[0].grid(alpha=0.3)

# Accuracy curve
axes[1].plot(accuracies, 'g-', linewidth=2)
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')
axes[1].set_title('Training Accuracy')
axes[1].set_ylim([0, 1])
axes[1].grid(alpha=0.3)

# Decision boundary
xx, yy = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
grid = torch.tensor(np.c_[xx.ravel(), yy.ravel()], dtype=torch.float32)

log_model.eval()
with torch.no_grad():
    Z = log_model(grid).numpy().reshape(xx.shape)

axes[2].contourf(xx, yy, Z, levels=50, cmap='RdYlBu', alpha=0.8)
axes[2].contour(xx, yy, Z, levels=[0.5], colors='black', linewidths=2)
axes[2].scatter(X_train_scaled[y_train == 0, 0], X_train_scaled[y_train == 0, 1],
                c='blue', label='Class 0', edgecolors='black')
axes[2].scatter(X_train_scaled[y_train == 1, 0], X_train_scaled[y_train == 1, 1],
                c='red', label='Class 1', edgecolors='black')
axes[2].set_xlabel('Feature 1')
axes[2].set_ylabel('Feature 2')
axes[2].set_title(f'Decision Boundary (Test Acc: {test_acc.item():.2f})')
axes[2].legend()

plt.tight_layout()
plt.show()

### Exercise 3.1: Build Your Own Classifier

**Your Task:** Complete the logistic regression model and train it.

In [None]:
# ============================================
# EXERCISE 3.1: Build a Classifier
# ============================================

torch.manual_seed(42)

class MyClassifier(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        # YOUR CODE HERE: Create linear layer and sigmoid
        self.linear = None
        self.sigmoid = None
    
    def forward(self, x):
        # YOUR CODE HERE: Apply linear then sigmoid
        return None

# Create model, loss, and optimizer
my_classifier = MyClassifier(input_dim=2)
my_bce_loss = None    # YOUR CODE HERE: nn.BCELoss()
my_opt = None         # YOUR CODE HERE: optim.Adam(my_classifier.parameters(), lr=0.1)

# Training loop
for epoch in range(100):
    # YOUR CODE HERE: Complete the training loop
    pass

# Test accuracy
if my_classifier.linear is not None:
    my_classifier.eval()
    with torch.no_grad():
        y_pred = my_classifier(X_test_t)
        y_class = (y_pred >= 0.5).float()
        acc = (y_class == y_test_t).float().mean()
    print(f"Test Accuracy: {acc.item():.4f}")

---

# Part 4: Support Vector Machines

## What is an SVM?

**Support Vector Machines** find the optimal hyperplane that maximizes the margin between classes.

Key concepts:
- **Margin:** Distance between the hyperplane and nearest points
- **Support Vectors:** The points closest to the decision boundary
- **Kernel Trick:** Implicitly map the data into a higher-dimensional space where a linear separator may exist, without computing the mapping explicitly

---
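To make the kernel idea concrete, here is a minimal sketch of the RBF kernel, K(x, x') = exp(-γ‖x - x'‖²), computed by hand and compared with scikit-learn's `rbf_kernel`. The points and the `gamma` value are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Three illustrative 2-D points
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
gamma = 0.5

# Pairwise squared Euclidean distances via broadcasting
sq_dists = ((X_demo[:, None, :] - X_demo[None, :, :]) ** 2).sum(axis=2)

# RBF kernel matrix: similarity decays with squared distance
K_manual = np.exp(-gamma * sq_dists)
K_sklearn = rbf_kernel(X_demo, gamma=gamma)

print(np.round(K_manual, 4))
print(f"Matches sklearn: {np.allclose(K_manual, K_sklearn)}")
```

Each entry is a similarity score between two points: 1 on the diagonal and close to 0 for points that are far apart. The SVM works with these scores directly and never builds the higher-dimensional features.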

## 4.1 Non-Linearly Separable Data

In [None]:
# ============================================
# SETUP: Non-Linear Data (Moons)
# ============================================

# Generate "moons" data - two interleaving half circles
X_moons, y_moons = make_moons(n_samples=300, noise=0.2, random_state=42)

X_train_m, X_test_m, y_train_m, y_test_m = train_test_split(
    X_moons, y_moons, test_size=0.2, random_state=42
)

# Visualize
plt.figure(figsize=(10, 6))
plt.scatter(X_moons[y_moons == 0, 0], X_moons[y_moons == 0, 1],
            c='blue', label='Class 0', alpha=0.7, edgecolors='black')
plt.scatter(X_moons[y_moons == 1, 0], X_moons[y_moons == 1, 1],
            c='red', label='Class 1', alpha=0.7, edgecolors='black')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Non-Linearly Separable Data (Moons)')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

print("This data CANNOT be separated by a straight line!")
print("We need the kernel trick...")

### Demonstration: Comparing SVM Kernels

In [None]:
# ============================================
# DEMONSTRATION: SVM Kernels
# ============================================

# Train three SVMs with different kernels
svm_linear = SVC(kernel='linear', C=1.0)
svm_rbf = SVC(kernel='rbf', C=1.0, gamma='scale')
svm_poly = SVC(kernel='poly', degree=3, C=1.0)

# Fit all three
svm_linear.fit(X_train_m, y_train_m)
svm_rbf.fit(X_train_m, y_train_m)
svm_poly.fit(X_train_m, y_train_m)

# Compare accuracies
print("Test Accuracies:")
print(f"  Linear kernel: {svm_linear.score(X_test_m, y_test_m):.4f}")
print(f"  RBF kernel:    {svm_rbf.score(X_test_m, y_test_m):.4f}")
print(f"  Polynomial:    {svm_poly.score(X_test_m, y_test_m):.4f}")

In [None]:
# Visualize decision boundaries
def plot_svm(ax, model, X, y, title):
    xx, yy = np.meshgrid(np.linspace(X[:, 0].min()-0.5, X[:, 0].max()+0.5, 100),
                          np.linspace(X[:, 1].min()-0.5, X[:, 1].max()+0.5, 100))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    
    ax.contourf(xx, yy, Z, levels=[-0.5, 0.5, 1.5], colors=['blue', 'red'], alpha=0.3)  # bands centered on class labels 0 and 1
    ax.contour(xx, yy, Z, levels=[0.5], colors='black', linewidths=2)
    ax.scatter(X[y == 0, 0], X[y == 0, 1], c='blue', edgecolors='black')
    ax.scatter(X[y == 1, 0], X[y == 1, 1], c='red', edgecolors='black')
    
    # Highlight support vectors
    sv = model.support_vectors_
    ax.scatter(sv[:, 0], sv[:, 1], s=100, facecolors='none', edgecolors='green',
               linewidths=2, label=f'Support Vectors ({len(sv)})')
    
    acc = model.score(X_test_m, y_test_m)  # note: reads the global test split, not the X, y arguments
    ax.set_title(f'{title}\nTest Acc: {acc:.2f}')
    ax.set_xlabel('Feature 1')
    ax.set_ylabel('Feature 2')

fig, axes = plt.subplots(1, 3, figsize=(16, 5))
plot_svm(axes[0], svm_linear, X_train_m, y_train_m, 'Linear Kernel')
plot_svm(axes[1], svm_rbf, X_train_m, y_train_m, 'RBF Kernel')
plot_svm(axes[2], svm_poly, X_train_m, y_train_m, 'Polynomial Kernel')
axes[0].legend(loc='upper right')
plt.tight_layout()
plt.show()

print("\nNotice:")
print("- Linear kernel can't handle curved boundaries")
print("- RBF kernel creates a flexible, curved boundary")
print("- Support vectors (green circles) define the boundary")

### Exercise 4.1: Train an SVM

**Your Task:** Train an RBF SVM and experiment with the C parameter.
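
One way to experiment with C is a simple sweep. The sketch below is only a starting point: it regenerates its own moons data with a different seed and uses illustrative names (`X_demo`, `results`), so it does not touch the lab's variables.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Separate demo data (a different seed from the lab's dataset)
X_demo, y_demo = make_moons(n_samples=300, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Train one RBF SVM per C value and record accuracy and support-vector count
results = {}
for C in [0.1, 1, 10, 100]:
    clf = SVC(kernel='rbf', C=C).fit(X_tr, y_tr)
    results[C] = (clf.score(X_te, y_te), len(clf.support_vectors_))
    print(f"C={C:>5}: test acc = {results[C][0]:.4f}, support vectors = {results[C][1]}")
```

A larger C penalizes misclassified training points more heavily, which typically narrows the margin and reduces the number of support vectors, at the risk of overfitting.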

In [None]:
# ============================================
# EXERCISE 4.1: Train an SVM
# ============================================

# a) Create an RBF SVM with C=10
my_svm = None  # YOUR CODE HERE: SVC(kernel='rbf', C=10)

# b) Fit it on the training data
# YOUR CODE HERE: my_svm.fit(X_train_m, y_train_m)

# c) Calculate test accuracy
if my_svm is not None:
    accuracy = my_svm.score(X_test_m, y_test_m)
    n_support = len(my_svm.support_vectors_)
    print(f"Test Accuracy: {accuracy:.4f}")
    print(f"Number of Support Vectors: {n_support}")

---

# Part 5: Model Evaluation

## Why Accuracy Isn't Enough

Imagine a dataset with 95% class 0 and 5% class 1. A model that always predicts class 0 has 95% accuracy but is useless!

We need better metrics:
- **Precision:** Of all predicted positives, how many are correct?
- **Recall:** Of all actual positives, how many did we find?
- **F1 Score:** Harmonic mean of precision and recall

---
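The 95%/5% scenario described above can be reproduced in a few lines. This sketch builds illustrative labels and an "always predict class 0" model, then scores it with scikit-learn; `zero_division=0` tells the metric functions to return 0 when there are no predicted positives.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# 95 negatives and 5 positives; the "model" always predicts class 0
y_true_demo = np.array([0] * 95 + [1] * 5)
y_pred_demo = np.zeros_like(y_true_demo)

acc = accuracy_score(y_true_demo, y_pred_demo)
prec = precision_score(y_true_demo, y_pred_demo, zero_division=0)
rec = recall_score(y_true_demo, y_pred_demo, zero_division=0)
f1 = f1_score(y_true_demo, y_pred_demo, zero_division=0)

print(f"Accuracy:  {acc:.2f}")   # 0.95
print(f"Precision: {prec:.2f}")  # 0.00
print(f"Recall:    {rec:.2f}")   # 0.00
print(f"F1 Score:  {f1:.2f}")    # 0.00
```

High accuracy alongside zero recall is the signature of a model that ignores the minority class.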

## 5.1 Confusion Matrix

### Demonstration: Understanding the Confusion Matrix

In [None]:
# ============================================
# DEMONSTRATION: Confusion Matrix
# ============================================

# Get predictions from our best SVM
y_pred = svm_rbf.predict(X_test_m)

# Create confusion matrix
cm = confusion_matrix(y_test_m, y_pred)

print("Confusion Matrix:")
print(cm)
print()

# Visualize
fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(cm, cmap='Blues')

# Labels
ax.set_xticks([0, 1])
ax.set_yticks([0, 1])
ax.set_xticklabels(['Predicted 0', 'Predicted 1'])
ax.set_yticklabels(['Actual 0', 'Actual 1'])
ax.set_xlabel('Predicted Label', fontsize=12)
ax.set_ylabel('True Label', fontsize=12)
ax.set_title('Confusion Matrix', fontsize=14)

# Add numbers
labels = [['TN', 'FP'], ['FN', 'TP']]
for i in range(2):
    for j in range(2):
        color = 'white' if cm[i, j] > cm.max()/2 else 'black'
        ax.text(j, i, f'{cm[i, j]}\n({labels[i][j]})', 
                ha='center', va='center', fontsize=16, color=color)

plt.colorbar(im)
plt.tight_layout()
plt.show()

print("Legend:")
print("  TN (True Negative): Correctly predicted negative")
print("  FP (False Positive): Incorrectly predicted positive (Type I error)")
print("  FN (False Negative): Incorrectly predicted negative (Type II error)")
print("  TP (True Positive): Correctly predicted positive")

### Demonstration: Computing Metrics

In [None]:
# ============================================
# DEMONSTRATION: Calculating Metrics
# ============================================

# Extract values from confusion matrix
TN, FP, FN, TP = cm.ravel()

print("From the confusion matrix:")
print(f"  TN = {TN}, FP = {FP}")
print(f"  FN = {FN}, TP = {TP}")
print()

# Calculate metrics manually
accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP) if (TP + FP) > 0 else 0
recall = TP / (TP + FN) if (TP + FN) > 0 else 0
f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0

print("Metrics:")
print(f"  Accuracy:  {accuracy:.4f}  = (TP + TN) / Total")
print(f"  Precision: {precision:.4f}  = TP / (TP + FP)  'Of predicted positive, how many correct?'")
print(f"  Recall:    {recall:.4f}  = TP / (TP + FN)  'Of actual positive, how many found?'")
print(f"  F1 Score:  {f1:.4f}  = 2 * P * R / (P + R)")
print()

# Using sklearn
print("Classification Report (sklearn):")
print(classification_report(y_test_m, y_pred, target_names=['Class 0', 'Class 1']))

### Exercise 5.1: Evaluate Your Model

**Your Task:** Calculate the confusion matrix and metrics for the logistic regression model.

In [None]:
# ============================================
# EXERCISE 5.1: Evaluate Your Model
# ============================================

# Get predictions from the logistic regression model
log_model.eval()
with torch.no_grad():
    y_pred_prob = log_model(X_test_t)
    y_pred_log = (y_pred_prob >= 0.5).numpy().astype(int).flatten()

# a) Create confusion matrix
cm_log = None  # YOUR CODE HERE: confusion_matrix(y_test, y_pred_log)

# b) Print classification report
# YOUR CODE HERE: print(classification_report(y_test, y_pred_log))

if cm_log is not None:
    print("Confusion Matrix:")
    print(cm_log)

---

# Lab Complete!

## Summary

| Topic | Key Concepts |
|-------|-------------|
| **PyTorch Tensors** | `torch.tensor()`, `requires_grad=True`, operations |
| **Autograd** | `.backward()`, automatic gradient computation |
| **Linear Regression** | MSE loss, gradient descent, `nn.Linear` |
| **Logistic Regression** | Sigmoid, BCE loss, binary classification |
| **SVM** | Kernels (linear, RBF, poly), margin, support vectors |
| **Evaluation** | Confusion matrix, precision, recall, F1-score |

## The PyTorch Training Loop

```python
for epoch in range(n_epochs):
    y_pred = model(X)           # Forward pass
    loss = criterion(y_pred, y) # Compute loss
    optimizer.zero_grad()       # Clear gradients
    loss.backward()             # Backward pass
    optimizer.step()            # Update weights
```

## Next Steps

- **Practice:** Try different learning rates and see how training changes
- **Experiment:** Use different kernels and C values for SVM
- **Next Lab:** Neural Networks and Deep Learning!

---

*Great work! Save your notebook (Ctrl+S) before closing.*