# Notebook 12 — Model Evaluation & Validation (Exercises)

This notebook covers how to evaluate and validate machine learning models with scikit-learn.

Topics:
- Train/Test Split
- Cross-Validation (k-fold, stratified)
- Classification & Regression Metrics
- Bias-Variance Tradeoff
- Learning Curves & Validation Curves

Work through the exercises before checking the solutions notebook.

## Exercise 1 — Train/Test Split
1. Load the **Iris dataset**.
2. Perform a 70/30 train/test split (fix `random_state` so the result is reproducible).
3. Fit a Logistic Regression model and report the test accuracy.
4. Explain why relying on a single train/test split can be misleading.
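
As a hint for the last question, here is a minimal sketch of how the test accuracy can move when only the split changes. It uses synthetic data from `make_classification` as a stand-in (an assumption of this sketch, not the exercise dataset), and the sample sizes and seeds are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: not the dataset used in the exercise).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

accuracies = []
for seed in range(5):
    # Same data, same model; only the random 70/30 partition changes.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    accuracies.append(accuracy_score(y_test, model.predict(X_test)))

print("test accuracy per split:", [round(a, 3) for a in accuracies])
print("spread (max - min):", round(max(accuracies) - min(accuracies), 3))
```

If the spread is not negligible, a single split reports whichever value the chosen seed happens to produce.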

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# TODO: Train/test split and evaluate Logistic Regression


## Exercise 2 — Cross-Validation
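1. Use **k-fold cross-validation** (k=5) on the Iris dataset with Logistic Regression. Iris rows are ordered by class, so create the splitter with `shuffle=True`.
2. Compare the mean CV accuracy (and its spread across folds) with the accuracy from the single train/test split in Exercise 1.
3. Repeat using **StratifiedKFold**. Iris is balanced, so also explain why stratification matters for imbalanced data.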
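
To make the stratification question concrete, the following sketch compares the two splitters on a deliberately imbalanced synthetic dataset (roughly 90/10, an assumption chosen for illustration). It prints the minority-class share in each test fold alongside the accuracy scores.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

# Imbalanced synthetic data (assumption: stand-in for a real imbalanced problem).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)

model = LogisticRegression(max_iter=1000)
splitters = {
    "KFold": KFold(n_splits=5, shuffle=True, random_state=0),
    "StratifiedKFold": StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
}

results = {}
for name, cv in splitters.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    # Fraction of class-1 (minority) samples in each test fold.
    minority_share = np.array([y[test_idx].mean() for _, test_idx in cv.split(X, y)])
    results[name] = (scores, minority_share)
    print(f"{name}: mean accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
    print(f"  minority share per fold: {np.round(minority_share, 3)}")
```

With stratification, every fold has nearly the same class mix, so each fold's score is measured on a comparable test set.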

In [2]:
from sklearn.model_selection import cross_val_score, KFold, StratifiedKFold

# TODO: Implement k-fold and stratified cross-validation


## Exercise 3 — Classification Metrics
1. Train a RandomForestClassifier on the **Breast Cancer dataset**.
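2. Report **Accuracy, Precision, Recall, F1-score, ROC-AUC** on a held-out test set. Compute ROC-AUC from predicted probabilities (`predict_proba`), not from hard class labels.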
3. Which metric would you prioritize if false negatives are more costly than false positives?
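
The sketch below shows one way to compute these metrics, again on synthetic stand-in data (an assumption of the sketch). Note that the label-based metrics take `predict` output, while `roc_auc_score` takes the positive-class probability.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score, f1_score, precision_score, recall_score, roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption: not the exercise dataset).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)               # hard labels for threshold-based metrics
y_score = clf.predict_proba(X_test)[:, 1]  # probability of class 1 for ROC-AUC

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name:>9}: {value:.3f}")
```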

In [3]:
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

# TODO: Train and evaluate classification metrics


## Exercise 4 — Regression Metrics
1. Use the **California Housing dataset** (`fetch_california_housing` downloads it on first use).
2. Train a RandomForestRegressor.
3. Report **MSE, RMSE, MAE, R²**.
4. Discuss which metric is most interpretable for business stakeholders, and why (consider the units of each metric).
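
A minimal sketch of the four metrics follows. It uses `make_regression` so that it runs without a download (an assumption of the sketch; the exercise itself uses California Housing) and computes RMSE as the square root of MSE.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data (assumption: stand-in for the exercise dataset).
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

mse = mean_squared_error(y_test, y_pred)   # squared target units
rmse = float(np.sqrt(mse))                 # same units as the target
mae = mean_absolute_error(y_test, y_pred)  # same units as the target
r2 = r2_score(y_test, y_pred)              # unitless, at most 1

print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"MAE:  {mae:.2f}")
print(f"R^2:  {r2:.3f}")
```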

In [4]:
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import numpy as np

# TODO: Train and evaluate regression metrics


## Exercise 5 — Bias-Variance Tradeoff
1. Fit Decision Trees with `max_depth` ranging over several values (e.g. 1 to 10) on the Iris dataset.
2. Record training and test accuracy for each depth.
3. Plot accuracy vs tree depth.
4. Identify underfitting and overfitting regions.
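
The loop behind such a plot can be sketched as follows. The data is synthetic with some label noise (`flip_y`, an assumption chosen so that overfitting is visible), and the plotting step is left out.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic data (assumption: not the exercise dataset).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4, flip_y=0.1, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

depths = list(range(1, 11))
train_acc, test_acc = [], []
for depth in depths:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # .score() returns mean accuracy for classifiers.
    train_acc.append(tree.score(X_train, y_train))
    test_acc.append(tree.score(X_test, y_test))

for d, tr, te in zip(depths, train_acc, test_acc):
    print(f"max_depth={d:2d}  train={tr:.3f}  test={te:.3f}")
```

Passing `depths` with `train_acc` and `test_acc` to `plt.plot` gives the two curves. A region where both are low suggests underfitting, and a widening gap at larger depths suggests overfitting.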

In [5]:
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier

# TODO: Fit multiple trees with increasing depth and plot train vs test accuracy


## Exercise 6 — Learning Curves
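1. Generate learning curves for Logistic Regression on the Breast Cancer dataset. Scale the features first (e.g. `StandardScaler` in a pipeline); otherwise the default solver tends to emit convergence warnings on this data.
2. Plot the mean training score and mean validation score against the number of training samples.
3. What does the gap between the two curves indicate about bias and variance?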
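
For orientation, this is a minimal sketch of calling `learning_curve` and reducing its output to one mean score per training size. The data is synthetic (an assumption of the sketch), and the pipeline and the five training sizes are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not the exercise dataset).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)

# Scaling lives inside the pipeline so it is re-fit on each training fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

train_sizes_abs, train_scores, val_scores = learning_curve(
    model, X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="accuracy",
)

# Both score arrays have shape (n_train_sizes, n_folds); average over folds.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

for n, tr, va in zip(train_sizes_abs, train_mean, val_mean):
    print(f"n_train={n:3d}  train={tr:.3f}  validation={va:.3f}")
```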

In [6]:
from sklearn.model_selection import learning_curve
import numpy as np

# TODO: Implement learning curve plots


## Exercise 7 — Validation Curves
1. Use the Breast Cancer dataset.
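2. Generate a validation curve for an SVM (`SVC`), varying the regularization parameter `C` over a logarithmic range (e.g. `np.logspace(-3, 3, 7)`).
3. Plot the mean training score and mean validation score against `C` (use a log-scaled x-axis).
4. Identify the `C` value with the highest mean validation score.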
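
The call pattern for `validation_curve` can be sketched as below, on synthetic data (an assumption of the sketch). Because the SVM sits inside a pipeline built with `make_pipeline`, the parameter is addressed as `svc__C`. The grid of `C` values is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: not the exercise dataset).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_range = np.logspace(-3, 3, 7)  # C from 0.001 to 1000

train_scores, val_scores = validation_curve(
    model, X, y,
    param_name="svc__C",  # "<step name>__<parameter>" for pipeline steps
    param_range=param_range,
    cv=5,
    scoring="accuracy",
)

# Shape is (n_param_values, n_folds); average over folds.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
best_C = float(param_range[np.argmax(val_mean)])

for C, tr, va in zip(param_range, train_mean, val_mean):
    print(f"C={C:8.3f}  train={tr:.3f}  validation={va:.3f}")
print("C with highest mean validation score:", best_C)
```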

In [7]:
from sklearn.svm import SVC
from sklearn.model_selection import validation_curve

# TODO: Implement validation curve for SVM with different C values
