# Chapter 2.3: Evaluation Metrics

Goal: Calculate and interpret MSE, MAE, RMSE, and R² to understand model performance.

### Topics:
- Computing metrics manually and with sklearn
- Understanding when to use MSE vs MAE
- Interpreting R² as variance explained
- How outliers affect different metrics

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

## Quick Recap

- **MSE**: Average of squared errors - penalizes large errors heavily
- **RMSE**: Square root of MSE - same units as target variable
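- **MAE**: Average of absolute errors - less sensitive to outliers than MSE
- **R²**: Proportion of variance explained (at most 1, higher is better; it can be negative on test data when the model does worse than always predicting the mean)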
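The definitions above can be sketched on a handful of made-up numbers before applying them to real data. The arrays `y_true_toy` and `y_pred_toy` are illustrative assumptions, not values from the Diamonds dataset.

```python
import numpy as np

# Toy values, made up for illustration (not from the Diamonds data)
y_true_toy = np.array([3.0, 5.0, 2.0, 8.0])
y_pred_toy = np.array([2.0, 5.0, 4.0, 7.0])

# Errors (residuals): actual minus predicted
toy_errors = y_true_toy - y_pred_toy

# MSE: mean of squared errors
toy_mse = np.mean(toy_errors ** 2)

# RMSE: square root of MSE, back in the units of the target
toy_rmse = np.sqrt(toy_mse)

# MAE: mean of absolute errors
toy_mae = np.mean(np.abs(toy_errors))

# R²: 1 - (residual sum of squares / total sum of squares around the mean)
ss_res = np.sum(toy_errors ** 2)
ss_tot = np.sum((y_true_toy - y_true_toy.mean()) ** 2)
toy_r2 = 1 - ss_res / ss_tot

print(f"MSE={toy_mse:.3f}  RMSE={toy_rmse:.3f}  MAE={toy_mae:.3f}  R2={toy_r2:.3f}")
```

Working through a tiny case like this by hand is a quick way to check that each formula does what the recap says.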

In [None]:
# Load the Diamonds dataset
diamonds = sns.load_dataset('diamonds')
diamonds.head()

In [None]:
# X = carat column (keep it 2D, e.g. a one-column DataFrame, since sklearn expects 2D features), y = price column
X = ...
y = ...

# Train-test split the data (80/20)


# Fit the model


# Get predictions



## Practice

### 1. Fit regression: carat → price, calculate predictions

(Already done above - just verify the predictions look reasonable)

In [None]:
# Look at first 5 actual vs predicted values


### 2. By hand - Calculate MSE: `((y_true - y_pred)**2).mean()`

In [None]:
# Step 1: Calculate the errors (residuals)
errors = y_test - y_pred

# Step 2: Square the errors


# Step 3: Take the mean


# Print it out

### 3. By hand - Calculate MAE: `(abs(y_true - y_pred)).mean()`

In [None]:
# Step 1: Take absolute value of errors


# Step 2: Take the mean


# Print it out

### 4. Use sklearn functions to verify your calculations

In [None]:
# Calculate MSE, MAE, and R² using sklearn (RMSE is the square root of MSE)


**Question:** Which metric is most interpretable for this problem? Why?

(Write your answer here - hint: RMSE is in the same units as price)
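As a small illustration of the units point in the hint, the sketch below converts an MSE into an RMSE. The value `mse_dollars_sq` is a hypothetical number chosen for easy arithmetic, not a result from this notebook.

```python
import math

# Hypothetical MSE for a price model, in squared dollars (not a real result)
mse_dollars_sq = 2_250_000.0

# The square root brings the metric back to plain dollars
rmse_dollars = math.sqrt(mse_dollars_sq)

print(f"MSE:  {mse_dollars_sq:,.0f} dollars^2")
print(f"RMSE: {rmse_dollars:,.0f} dollars")
```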

### 5. Add 5 extreme outliers to the predictions, recalculate MSE vs MAE - which changed more?

In [None]:
# Simulate 5 extreme outliers: shift the first 5 predictions by $50,000
# (y_test stays unchanged, so only these 5 errors become extreme)
y_pred_outliers = y_pred.copy()
y_pred_outliers[:5] = y_pred_outliers[:5] + 50000  # Predictions are way off

# Calculate new metrics with outliers


# Print results (write summary in markdown cell below)


**Your interpretation:** Which metric changed more? Why does this happen?

(Write your answer here)
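One way to sanity-check your interpretation is to repeat the experiment on toy numbers where the arithmetic is easy to follow. The error arrays below are made up for illustration; the helper names `toy_mse_of` and `toy_mae_of` are hypothetical and not part of sklearn.

```python
import numpy as np

def toy_mse_of(errors: np.ndarray) -> float:
    """Mean of squared errors."""
    return float(np.mean(errors ** 2))

def toy_mae_of(errors: np.ndarray) -> float:
    """Mean of absolute errors."""
    return float(np.mean(np.abs(errors)))

# Ten errors of size 1 (made-up values)
errors_clean = np.ones(10)

# Same errors, but one of them is replaced by an extreme error of 101
errors_with_outlier = errors_clean.copy()
errors_with_outlier[0] = 101.0

mse_ratio = toy_mse_of(errors_with_outlier) / toy_mse_of(errors_clean)
mae_ratio = toy_mae_of(errors_with_outlier) / toy_mae_of(errors_clean)

print(f"MSE grew by a factor of {mse_ratio:.0f}")
print(f"MAE grew by a factor of {mae_ratio:.0f}")
```

Comparing the two growth factors, rather than the raw values, keeps the comparison fair because MSE and MAE are in different units.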

### 6. Calculate Adjusted R²

Adjusted R² penalizes adding features that don't help:

$$\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$

Where:
- n = number of samples
- p = number of features
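The formula can be written as a small helper. This is a minimal sketch: the function name `adjusted_r2` and the example values (R² = 0.85, n = 100) are assumptions for illustration, not results from the Diamonds model.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R² for n samples and p features (needs n > p + 1)."""
    if n <= p + 1:
        raise ValueError("adjusted R² is undefined unless n > p + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical inputs: the same R² with 1 feature versus 10 features
r2_example = 0.85
adj_one_feature = adjusted_r2(r2_example, n=100, p=1)
adj_ten_features = adjusted_r2(r2_example, n=100, p=10)

print(f"R2={r2_example:.4f}  adjusted (p=1)={adj_one_feature:.4f}  adjusted (p=10)={adj_ten_features:.4f}")
```

Holding R² fixed while raising p shows the penalty in isolation: the more features it took to reach the same R², the lower the adjusted value.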

In [None]:
# Calculate Adjusted R²


# Compare R² to adjusted R²

**Question:** How are they different? What does this tell us?

(Write your answer here)

## Summary

| Metric | Value | Interpretation |
|--------|-------|----------------|
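| MSE | | In squared dollars; sensitive to large errors |
| RMSE | | Typical error size in dollars; weights large errors more heavily |
| MAE | | Average absolute error in dollars; less sensitive to outliers |
| R² | | Proportion of price variance explained by carat |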