# Day 8 — Student Notebook

*Auto-generated notebook based on provided lecture slides.*

## Day 8 — Model Evaluation & Exam-style practice
**Goals:** learn common metrics (precision, recall, F1, ROC AUC), cross-validation, and the intuition behind overfitting and underfitting.

In [None]:
# Setup: imports. If a package is missing (for example in a fresh environment),
# uncomment the !pip line below and run this cell again.
# !pip install pandas numpy seaborn plotly scikit-learn matplotlib

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, roc_auc_score, roc_curve
sns.set_theme(style='whitegrid')

# Load dataset (seaborn's titanic dataset) - we'll use this across all notebooks
df = sns.load_dataset('titanic')
df_original = df.copy()  # keep a pristine copy
print('Loaded titanic dataset with shape:', df.shape)
df.head()


In [None]:
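# Prepare data: impute missing numeric values and one-hot encode categoricals
df_ml = df.copy()
df_ml['age'] = df_ml['age'].fillna(df_ml['age'].median())
df_ml['fare'] = df_ml['fare'].fillna(df_ml['fare'].median())
# drop_first=True drops one level per column to avoid redundant dummy columns.
# Rows with a missing 'embarked' value get all-zero embarked_* dummies.
df_ml = pd.get_dummies(df_ml, columns=['sex', 'embarked', 'class'], drop_first=True)
features = ['age', 'fare'] + [c for c in df_ml.columns if c.startswith(('sex_', 'embarked_', 'class_'))]
X = df_ml[features]
y = df_ml['survived']
# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)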

# Fit a model
clf = LogisticRegression(max_iter=300)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)                 # hard 0/1 predictions (threshold 0.5)
y_proba = clf.predict_proba(X_test)[:, 1]    # predicted probability of class 1 (survived)


### 1) Compute evaluation metrics (student)
- Accuracy, precision, recall, f1
- Confusion matrix
- ROC AUC and plot ROC curve

**Task:** compute and interpret these metrics.
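
To make the definitions concrete, the sketch below derives accuracy, precision, recall, and F1 from confusion-matrix counts and compares them with scikit-learn's functions. The arrays `y_true_toy` and `y_pred_toy` are made-up labels chosen for illustration; they are not Titanic results.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# Hypothetical toy labels (1 = positive class), for illustration only
y_true_toy = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred_toy = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])

# For binary labels {0, 1}, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_toy, y_pred_toy).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # of predicted positives, how many are real
recall = tp / (tp + fn)                      # of real positives, how many were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print('TN, FP, FN, TP:', tn, fp, fn, tp)
print('By hand :', accuracy, precision, recall, f1)
print('sklearn :', precision_score(y_true_toy, y_pred_toy),
      recall_score(y_true_toy, y_pred_toy),
      f1_score(y_true_toy, y_pred_toy))
```

When interpreting the Titanic results, the same reading applies: precision answers "how trustworthy is a predicted survivor?", while recall answers "how many actual survivors did the model catch?".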

In [None]:
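# Student: metrics
print('Accuracy:', accuracy_score(y_test, y_pred))
print('Precision:', precision_score(y_test, y_pred))
print('Recall:', recall_score(y_test, y_pred))
print('F1:', f1_score(y_test, y_pred))
# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
print('Confusion matrix:\n', confusion_matrix(y_test, y_pred))
# ROC AUC is computed from predicted probabilities, not hard 0/1 labels
print('ROC AUC:', roc_auc_score(y_test, y_proba))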

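# ROC curve: true positive rate vs false positive rate across all thresholds
fpr, tpr, thr = roc_curve(y_test, y_proba)
plt.plot(fpr, tpr, label='model')
plt.plot([0, 1], [0, 1], linestyle='--', label='chance')  # random-guess baseline
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.grid(True)
plt.show()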


### 2) Cross-validation (student)
- Use `cross_val_score` to estimate average accuracy with 5-fold CV

**Task:** compute 5-fold cross-validated accuracy.
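
As a rough sketch of what `cross_val_score` does for a classifier with `cv=5`: the data is split into five stratified folds, a fresh copy of the model is fitted on four of them, and accuracy is measured on the remaining one. Synthetic data from `make_classification` stands in for the Titanic features here; the data and the names `X_demo`, `y_demo`, and `manual_scores` are assumptions for illustration.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data (not the Titanic dataset)
X_demo, y_demo = make_classification(n_samples=300, n_features=5, random_state=0)
model = LogisticRegression(max_iter=1000)

# For classifiers, an integer cv is treated as unshuffled stratified K-fold
skf = StratifiedKFold(n_splits=5)
manual_scores = []
for train_idx, val_idx in skf.split(X_demo, y_demo):
    fold_model = clone(model)                         # fresh, unfitted copy per fold
    fold_model.fit(X_demo[train_idx], y_demo[train_idx])
    manual_scores.append(fold_model.score(X_demo[val_idx], y_demo[val_idx]))
manual_scores = np.array(manual_scores)

library_scores = cross_val_score(model, X_demo, y_demo, cv=5, scoring='accuracy')

print('Manual loop     :', manual_scores)
print('cross_val_score :', library_scores)
print('Mean +/- std    :', manual_scores.mean(), manual_scores.std())
```

Reporting the spread alongside the mean helps judge whether a difference between two models is larger than the fold-to-fold noise.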

In [None]:
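from sklearn.model_selection import cross_val_score

# cross_val_score clones clf and refits it on each fold, so the fit above is not reused
scores = cross_val_score(clf, X, y, cv=5, scoring='accuracy')
print('5-fold accuracy scores:', scores)
print('Mean accuracy:', np.mean(scores), '+/-', np.std(scores))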


### 3) Over/underfitting demo (student)
- Train KNN with k from 1 to 20 and plot train vs test accuracy to see the bias-variance tradeoff.
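
The overfitting end of this tradeoff can be seen in isolation at k=1, where every training point is its own nearest neighbor and the model simply memorizes the training set. The sketch below uses synthetic data with deliberately noisy labels (`flip_y=0.2`); the dataset and variable names are assumptions for illustration, not part of the Titanic exercise.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data with about 20% of labels randomly reassigned, to simulate noise
X_demo, y_demo = make_classification(
    n_samples=400, n_features=6, n_informative=3, flip_y=0.2, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

# k=1: each training point's nearest neighbor is itself (distance 0)
knn_1 = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
train_acc_k1 = knn_1.score(X_tr, y_tr)
test_acc_k1 = knn_1.score(X_te, y_te)

print('k=1 train accuracy:', train_acc_k1)
print('k=1 test accuracy :', test_acc_k1)
```

On real data such as Titanic, rows with identical feature values but different labels can pull the k=1 training accuracy slightly below 1.0, but the gap between train and test accuracy is read the same way.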

In [None]:
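# Note: features are unscaled here, so 'fare' and 'age' dominate the distance
train_scores = []
test_scores = []
ks = list(range(1, 21))
for k in ks:
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    train_scores.append(model.score(X_train, y_train))  # accuracy on seen data
    test_scores.append(model.score(X_test, y_test))     # accuracy on held-out data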

plt.plot(ks, train_scores, label='train')
plt.plot(ks, test_scores, label='test')
plt.xlabel('k (neighbors)')
plt.ylabel('Accuracy')
plt.legend()
plt.title('KNN: train vs test accuracy')
plt.show()


### Reflection
- Where is overfitting visible? How would you choose k? What other steps reduce overfitting?
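
One possible answer to "how would you choose k?" is to pick it by cross-validation on the training set only, keeping the test set for a final check. The sketch below is one such approach, not the only one: it wraps scaling and KNN in a `Pipeline` so the scaler is refitted inside each fold, and searches k from 1 to 20 with `GridSearchCV`. The synthetic data and the step names `'scale'` and `'knn'` are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the Titanic dataset)
X_demo, y_demo = make_classification(
    n_samples=400, n_features=6, n_informative=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

# Scaling inside the pipeline avoids leaking validation-fold statistics
pipe = Pipeline([
    ('scale', StandardScaler()),
    ('knn', KNeighborsClassifier()),
])
param_grid = {'knn__n_neighbors': list(range(1, 21))}

search = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')
search.fit(X_tr, y_tr)  # the test set is never seen during the search

best_k = search.best_params_['knn__n_neighbors']
print('Best k by 5-fold CV       :', best_k)
print('Mean CV accuracy at best k:', search.best_score_)
print('Held-out test accuracy    :', search.score(X_te, y_te))
```

Choosing k from the test-accuracy curve directly would make the reported test accuracy optimistic, which is why the search here only touches the training split.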