# Week 3: Building Real-World Housing Price Predictor - Homework

**ML2: Advanced Machine Learning**

**Estimated Time**: 1 hour

---

This homework combines programming exercises and knowledge-based questions to reinforce this week's concepts.

## Setup

Run this cell to import the necessary libraries:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

# Set random seed for reproducibility
np.random.seed(42)
torch.manual_seed(42)

print('✓ Libraries imported successfully')

---
## Part 1: Programming Exercises (60%)

Complete the following programming tasks. Read each description carefully and implement the requested functionality.

### Exercise 1: Experiment: Comparing Evaluation Metrics

**Time**: 10 min

Train a simple regression model and observe how different metrics (MSE, MAE, R²) tell different stories about model performance.

In [None]:
import torch
import torch.nn as nn
import numpy as np

# Generate synthetic regression data with some outliers
torch.manual_seed(42)
X = torch.randn(100, 1) * 10
y_true = 2 * X + 5 + torch.randn(100, 1) * 2

# Add 5 outliers at distinct indices (randint could pick the same index twice)
outlier_idx = torch.randperm(100)[:5]
y_true[outlier_idx] += torch.randn(5, 1) * 20

# Simple model
model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

# Train
for epoch in range(200):
    optimizer.zero_grad()
    pred = model(X)
    loss = nn.MSELoss()(pred, y_true)
    loss.backward()
    optimizer.step()

# Evaluate with different metrics
with torch.no_grad():
    final_pred = model(X)
    mse = nn.MSELoss()(final_pred, y_true).item()
    mae = nn.L1Loss()(final_pred, y_true).item()
    
    # R-squared
    ss_res = torch.sum((y_true - final_pred) ** 2).item()
    ss_tot = torch.sum((y_true - y_true.mean()) ** 2).item()
    r2 = 1 - (ss_res / ss_tot)
    
    print(f"MSE: {mse:.4f}")
    print(f"MAE: {mae:.4f}")
    print(f"R²: {r2:.4f}")

# TODO: Answer reflection questions about what each metric tells you

### Exercise 2: Experiment: Detecting Overfitting

**Time**: 12 min

Observe what overfitting looks like by comparing train vs validation performance.

In [None]:
import torch
import torch.nn as nn

# Generate data
torch.manual_seed(42)
X_train = torch.randn(50, 1) * 5
y_train = 3 * X_train + 2 + torch.randn(50, 1) * 1

X_val = torch.randn(20, 1) * 5
y_val = 3 * X_val + 2 + torch.randn(20, 1) * 1

# Model with too much capacity (overly complex for the task)
overfit_model = nn.Sequential(
    nn.Linear(1, 100),
    nn.ReLU(),
    nn.Linear(100, 100),
    nn.ReLU(),
    nn.Linear(100, 1)
)

optimizer = torch.optim.Adam(overfit_model.parameters(), lr=0.01)
criterion = nn.MSELoss()

train_losses = []
val_losses = []

for epoch in range(500):
    # Train
    optimizer.zero_grad()
    train_pred = overfit_model(X_train)
    train_loss = criterion(train_pred, y_train)
    train_loss.backward()
    optimizer.step()
    
    # Validate
    with torch.no_grad():
        val_pred = overfit_model(X_val)
        val_loss = criterion(val_pred, y_val)
    
    train_losses.append(train_loss.item())
    val_losses.append(val_loss.item())
    
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: Train Loss = {train_loss.item():.4f}, Val Loss = {val_loss.item():.4f}")

# TODO: Observe the pattern. When does overfitting start?
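
One rough way to locate the turning point is to find the epoch with the lowest validation loss, since training past that epoch mainly improves the training loss. The sketch below assumes a list shaped like `val_losses` above. The helper name `best_val_epoch` and the sample numbers are hypothetical and chosen only for illustration.

```python
def best_val_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss in the list."""
    if not val_losses:
        raise ValueError("val_losses is empty")
    epoch = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return epoch, val_losses[epoch]


# Made-up validation losses (not output from the experiment above):
# they fall at first, then rise as the model starts to overfit.
example_val_losses = [9.0, 4.0, 2.5, 2.1, 2.3, 2.8, 3.5]
epoch, loss = best_val_epoch(example_val_losses)
print(f"Lowest validation loss {loss:.2f} at epoch {epoch}")
```

On the real curves, calling `best_val_epoch(val_losses)` after the training loop gives a starting point. Per-epoch losses are noisy, so treat the minimum as a rough guide, not an exact boundary.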

---
## Part 2: Knowledge Questions (40%)

Answer the following questions to test your conceptual understanding.

### Question 1 (Short Answer)

**Question 1 - Understanding MSE vs MAE**

MSE = mean((y_true - y_pred)²)
MAE = mean(|y_true - y_pred|)
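
As a quick check of the two formulas, here is a minimal NumPy sketch that evaluates them by hand. The target and prediction values are made up for illustration and are not taken from the experiment.

```python
import numpy as np

# Made-up targets and predictions, used only to exercise the formulas.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 13.0])

errors = y_true - y_pred          # [1, 0, -1, -4]
mse = np.mean(errors ** 2)        # (1 + 0 + 1 + 16) / 4
mae = np.mean(np.abs(errors))     # (1 + 0 + 1 + 4) / 4

print(f"MSE: {mse:.2f}")
print(f"MAE: {mae:.2f}")
```

Try changing the last prediction and watch how each value responds.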

Explain:
1. Why does MSE use squaring? What effect does this have?
2. When would you prefer MAE over MSE?
3. If you have outliers in your data, which metric would be more affected? Why?

**Hint**: Think about what happens when you square large errors vs small errors.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 2 (Short Answer)

**Question 2 - What R² Really Means**

R² = 1 - (SS_residual / SS_total)

R² is at most 1. A value of 0 matches a baseline that always predicts the mean of the targets, and R² becomes negative when the model does worse than that baseline.
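
The range of R² can be checked directly from the formula. The following is a minimal NumPy sketch; the helper name `r_squared` and the sample targets are assumptions made for illustration.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """R² = 1 - SS_residual / SS_total, as in the formula above."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# Made-up targets, used only for illustration.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

perfect = r_squared(y_true, y_true)                           # no residual error
mean_baseline = r_squared(y_true, np.full(5, y_true.mean()))  # always predict the mean
reversed_pred = r_squared(y_true, y_true[::-1])               # predictions in reverse order

print(perfect, mean_baseline, reversed_pred)
```

The three cases cover the upper bound, the mean baseline, and a model that is worse than the baseline.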

Explain in plain language:
1. What does R² = 0.8 mean about your model?
2. What does R² = 0 mean?
3. Why is R² more interpretable than MSE for comparing models across different datasets?

**Hint**: R² represents the proportion of variance explained by the model.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 3 (Multiple Choice)

**Question 3 - Metric Selection**

You're building a house price predictor. Buyers care most about absolute dollar errors (not squared errors), and there are some mansions that could skew your metrics.

Which metric should you prioritize?

A) MSE - emphasizes large errors
B) MAE - treats all errors equally
C) R² - explains variance
D) RMSE - root of MSE

**Hint**: Which metric is in dollars and doesn't over-penalize mansion outliers?

**Your Answer**: [Write your answer here - e.g., 'B']

**Explanation**: [Explain why this is correct]

### Question 4 (Short Answer)

**Question 4 - Experiment Reflection: Metrics**

After running the 'Comparing Evaluation Metrics' experiment:

The data had 5 outliers added. Compare the MSE vs MAE values you observed.

Explain: Which metric was more affected by the outliers, and why does this happen mathematically?

**Hint**: Remember that MSE squares all errors.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 5 (Short Answer)

**Question 5 - Detecting Overfitting**

Based on the 'Detecting Overfitting' experiment:

The expected pattern is that training loss keeps decreasing while validation loss bottoms out and then starts to rise.

Explain:
1. At what point does overfitting start?
2. What is the model doing when train loss drops but val loss rises?
3. How would you prevent this in practice?

**Hint**: The model starts memorizing training data instead of learning patterns.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 6 (Short Answer)

**Question 6 - Train/Val Split Purpose**

Explain the PURPOSE of splitting data into train and validation sets.

What specific problem does this solve? What would happen if you only evaluated on training data?

**Hint**: You need separate data to detect when the model stops generalizing.
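
For reference, a split can be made by shuffling row indices once and cutting them into two disjoint groups. This is a minimal NumPy sketch on made-up data; the 80/20 ratio and all variable names are assumptions, not a prescribed recipe.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up dataset: 100 examples with one feature each.
X = rng.normal(size=(100, 1))
y = 3 * X + 2 + rng.normal(size=(100, 1))

# Shuffle the row indices once, then cut them into two disjoint groups.
indices = rng.permutation(len(X))
n_val = len(X) // 5                 # hold out 20% for validation (an assumed ratio)
val_idx, train_idx = indices[:n_val], indices[n_val:]

X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

print(X_train.shape, X_val.shape)
```

Because the two index groups never overlap, the validation rows are never seen during training.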

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 7 (Multiple Choice)

**Question 7 - Model Complexity and Overfitting**

You have 100 training examples. Which model is MORE likely to overfit?

A) Linear model: y = wx + b (2 parameters)
B) Deep network with 10,000 parameters
C) Both equally likely
D) Neither will overfit

**Hint**: More parameters = more capacity to memorize training data.
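
To make parameter counts concrete, the sketch below counts weights and biases for a fully connected network given its layer widths. The helper name `count_mlp_params` is hypothetical; the layer widths in the second call are those of the Exercise 2 network.

```python
def count_mlp_params(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of each layer from input to output,
    e.g. [1, 100, 100, 1] for 1 -> 100 -> 100 -> 1.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out  # weight matrix
        total += n_out         # bias vector
    return total


linear_params = count_mlp_params([1, 1])            # linear model y = wx + b
mlp_params = count_mlp_params([1, 100, 100, 1])     # layer widths from Exercise 2

print(linear_params, mlp_params)
```

This prints `2 10401`. For an actual PyTorch model, `sum(p.numel() for p in model.parameters())` should give the same kind of count.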

**Your Answer**: [Write your answer here - e.g., 'B']

**Explanation**: [Explain why this is correct]

### Question 8 (Short Answer)

**Question 8 - Systematic Improvement**

You train a model and get: Train R² = 0.60, Val R² = 0.58

Then you try a deeper network and get: Train R² = 0.95, Val R² = 0.50

What does this tell you? Should you use the deeper model? Why or why not?

**Hint**: The gap between train and val performance reveals overfitting.

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 9 (Short Answer)

**Question 9 - Learning Rate Impact on Metrics**

You train two models on the same data:
- Model A (LR=0.001): Final MSE = 10.5 after 1000 epochs
- Model B (LR=0.1): Final MSE = 45.2 after 1000 epochs

What does this suggest about Model B's learning rate? How would you diagnose this?

**Hint**: A learning rate that's too high causes instability.
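
The instability in the hint can be reproduced on a toy problem. The sketch below runs plain gradient descent on a one-dimensional quadratic with the two learning rates from the question. The function, its curvature of 25, and the starting point are assumptions chosen so that the larger rate overshoots; this is not the model from the question.

```python
def gradient_descent(lr, curvature=25.0, w0=1.0, steps=20):
    """Minimize f(w) = 0.5 * curvature * w**2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = curvature * w   # derivative of 0.5 * curvature * w**2
        w = w - lr * grad
    return w


w_small = gradient_descent(lr=0.001)  # each step multiplies w by 1 - 0.025 = 0.975
w_large = gradient_descent(lr=0.1)    # each step multiplies w by 1 - 2.5 = -1.5

print(f"lr=0.001: |w| = {abs(w_small):.4f}")
print(f"lr=0.1:   |w| = {abs(w_large):.1f}")
```

With the small rate, `w` shrinks steadily toward the minimum at 0. With the large rate, every step overshoots and the distance from the minimum grows. Plotting the loss per epoch is a common way to spot the same behavior in a real training run.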

**Your Answer**:

[Write your answer here in 2-4 sentences]

### Question 10 (Short Answer)

**Question 10 - Real-World Trade-offs**

In production, you must choose between:
- Model A: Avg error $50,000 (but only $10,000 on cheap houses, $200,000 on mansions)
- Model B: Avg error $60,000 (but consistent across all price ranges)

Which do you deploy? Explain your reasoning using metric selection concepts.

**Hint**: This mirrors the MSE vs. MAE trade-off. Do you care more about the lowest average error or about consistent errors across price ranges?

**Your Answer**:

[Write your answer here in 2-4 sentences]

---
## Submission

Before submitting:
1. Run all cells to ensure code executes without errors
2. Check that all questions are answered
3. Review your explanations for clarity

**To Submit**:
- File → Download → Download .ipynb
- Submit the notebook file to your course LMS

**Note**: Make sure your name is in the filename (e.g., homework_01_yourname.ipynb)