# **AI TECH INSTITUTE** · *Intermediate AI & Data Science*
### Week 7 - Lab 01: ML Fundamentals & Evaluation Practice
**Instructor:** Amir Charkhi | **Type:** Hands-On Practice

> Practice what you learned in Notebooks 01 & 02

## 🎯 Lab Objectives

In this lab, you'll practice:
- Loading data and performing train/test split
- Training your first models
- Calculating and interpreting evaluation metrics
- Understanding confusion matrices
- Choosing the right metric for the problem

**Time**: 30-40 minutes  
**Difficulty**: ⭐⭐☆☆☆ (Beginner)

---

## 📚 Quick Reference

**Classification Metrics:**
```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

**Regression Metrics:**
```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_true, y_pred)
```

---

In [None]:
# Setup - Run this cell first!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer, load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    confusion_matrix, mean_absolute_error, mean_squared_error, r2_score
)
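import warnings
# Silence library warnings to keep the output readable.
# Note: this also hides useful ones such as sklearn's ConvergenceWarning.
warnings.filterwarnings('ignore')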

print("✅ Setup complete! Let's start practicing!")

---

## 📊 Exercise 1: Train/Test Split Basics

Let's practice the most fundamental concept in ML!

### Task 1.1: Load the Dataset

In [None]:
# Load the breast cancer dataset
cancer = load_breast_cancer()
X = cancer.data
y = cancer.target

print(f"Dataset loaded!")
print(f"Number of samples: {len(X)}")
print(f"Number of features: {X.shape[1]}")
print(f"Classes: {np.unique(y)}")

# TODO 1.1: Print the class distribution (how many 0s and 1s)
# Hint: Use np.bincount(y) or pd.Series(y).value_counts()

# Your code here:


### Task 1.2: Perform Train/Test Split

In [None]:
# TODO 1.2: Split the data into train and test sets
# Requirements:
#   - Use 80% for training, 20% for testing
#   - Set random_state=42 for reproducibility
#   - Use stratify=y to maintain class balance

# Your code here:
X_train, X_test, y_train, y_test = # Complete this line

# Validation (Don't modify)
print(f"✅ Training set: {len(X_train)} samples")
print(f"✅ Test set: {len(X_test)} samples")
print(f"✅ Train/test split: {len(X_train)/len(X)*100:.0f}% / {len(X_test)/len(X)*100:.0f}%")

assert len(X_train) + len(X_test) == len(X), "❌ Split doesn't add up!"
assert len(X_train) > len(X_test), "❌ Training set should be larger!"
print("\n🎉 Task 1.2 Complete!")

### Task 1.3: Verify Stratification

In [None]:
# TODO 1.3: Calculate and compare class proportions
# Calculate the proportion of class 1 as a fraction between 0 and 1
# (class 1 is "benign" in this dataset and is sklearn's default positive label) in:
#   - Original dataset (y)
#   - Training set (y_train)
#   - Test set (y_test)
# Note: the validation code below formats these values with :.1%,
# so keep them as fractions and do not multiply by 100.

# Your code here:
original_class1_pct = # Fraction of 1s in y
train_class1_pct = # Fraction of 1s in y_train
test_class1_pct = # Fraction of 1s in y_test

# Validation (Don't modify)
print(f"Class 1 percentage:")
print(f"  Original: {original_class1_pct:.1%}")
print(f"  Training: {train_class1_pct:.1%}")
print(f"  Test:     {test_class1_pct:.1%}")

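# Both subsets should be within 2 percentage points of the original
if (abs(train_class1_pct - original_class1_pct) < 0.02
        and abs(test_class1_pct - original_class1_pct) < 0.02):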
    print("\n✅ Stratification worked! Proportions are similar.")
    print("🎉 Task 1.3 Complete!")
else:
    print("\n⚠️ Proportions differ - did you use stratify=y?")

---

## 🤖 Exercise 2: Building Your First Model

Now let's train a model and make predictions!

### Task 2.1: Train a Logistic Regression Model

In [None]:
# TODO 2.1: Create and train a Logistic Regression model
# Steps:
#   1. Create a LogisticRegression model (set max_iter=1000)
#   2. Fit it on the training data
#   3. Make predictions on the test set

# Your code here:
model = # Create the model
# Fit the model
y_pred = # Make predictions on X_test

# Validation (Don't modify)
print(f"✅ Model trained!")
print(f"✅ Predictions made for {len(y_pred)} test samples")
print(f"\nFirst 10 predictions: {y_pred[:10]}")
print(f"First 10 actual:      {y_test[:10]}")
print("\n🎉 Task 2.1 Complete!")
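
The create / fit / predict pattern is the same for every scikit-learn estimator. As a rough sketch, here it is with `DecisionTreeClassifier` (imported in the setup cell) on synthetic data. The data from `make_classification`, the `max_depth=3` setting, and all variable names are assumptions chosen for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (hypothetical, not the lab dataset)
demo_X, demo_y = make_classification(n_samples=200, n_features=5, random_state=0)
dX_train, dX_test, dy_train, dy_test = train_test_split(
    demo_X, demo_y, test_size=0.2, random_state=0, stratify=demo_y
)

tree = DecisionTreeClassifier(max_depth=3, random_state=0)  # 1. create
tree.fit(dX_train, dy_train)                                # 2. fit on training data only
demo_pred = tree.predict(dX_test)                           # 3. predict on unseen data

print(f"Predictions made for {len(demo_pred)} test samples")
print(f"Test accuracy: {tree.score(dX_test, dy_test):.3f}")
```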

### Task 2.2: Calculate Basic Metrics

In [None]:
# TODO 2.2: Calculate accuracy, precision, recall, and F1-score
# Use the functions from sklearn.metrics

# Your code here:
accuracy = # Calculate accuracy
precision = # Calculate precision
recall = # Calculate recall
f1 = # Calculate F1-score

# Validation (Don't modify)
print("Model Performance:")
print(f"  Accuracy:  {accuracy:.4f} ({accuracy:.1%})")
print(f"  Precision: {precision:.4f} ({precision:.1%})")
print(f"  Recall:    {recall:.4f} ({recall:.1%})")
print(f"  F1-Score:  {f1:.4f} ({f1:.1%})")

if accuracy > 0.9:
    print("\n✅ Great performance! Above 90% accuracy!")
    print("🎉 Task 2.2 Complete!")
else:
    print(f"\n⚠️ Accuracy is {accuracy:.1%} - check your code")

---

## 🎭 Exercise 3: Understanding Confusion Matrix

The confusion matrix is the foundation of accuracy, precision, recall, and F1: each one is computed from its four counts!

### Task 3.1: Create Confusion Matrix

In [None]:
# TODO 3.1: Calculate the confusion matrix
# Then extract True Negatives, False Positives, False Negatives, True Positives

# Your code here:
cm = # Calculate confusion matrix using y_test and y_pred
tn, fp, fn, tp = # Extract the four values (use cm.ravel())

# Validation (Don't modify)
print("Confusion Matrix:")
print(cm)
print(f"\nBreakdown:")
print(f"  True Negatives (TN):  {tn}")
print(f"  False Positives (FP): {fp}")
print(f"  False Negatives (FN): {fn}")
print(f"  True Positives (TP):  {tp}")

assert tn + fp + fn + tp == len(y_test), "❌ Values don't add up!"
print("\n🎉 Task 3.1 Complete!")
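
The order returned by `cm.ravel()` follows scikit-learn's layout, where rows are actual classes and columns are predicted classes. A tiny hand-checkable sketch may help. The `toy_true` and `toy_pred` labels are invented for this example.

```python
from sklearn.metrics import confusion_matrix

# Tiny hand-checkable example (hypothetical labels)
toy_true = [0, 0, 0, 1, 1, 1, 1, 0]
toy_pred = [0, 1, 1, 1, 1, 0, 1, 0]

toy_cm = confusion_matrix(toy_true, toy_pred)
# Rows are actual classes, columns are predicted classes:
#                 predicted 0   predicted 1
#   actual 0          TN            FP
#   actual 1          FN            TP
t_tn, t_fp, t_fn, t_tp = toy_cm.ravel()

print(toy_cm)
print(f"TN={t_tn}, FP={t_fp}, FN={t_fn}, TP={t_tp}")  # TN=2, FP=2, FN=1, TP=3
```

Counting the pairs by hand gives the same four numbers, which is a quick way to confirm you are reading the matrix in the right orientation.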

### Task 3.2: Calculate Metrics from Confusion Matrix

In [None]:
# TODO 3.2: Calculate metrics manually from TP, TN, FP, FN
# This helps you understand what each metric really means!

# Formulas:
# Accuracy = (TP + TN) / (TP + TN + FP + FN)
# Precision = TP / (TP + FP)
# Recall = TP / (TP + FN)

# Your code here:
manual_accuracy = # Calculate manually
manual_precision = # Calculate manually
manual_recall = # Calculate manually

# Validation (Don't modify)
print("Manual Calculations:")
print(f"  Accuracy:  {manual_accuracy:.4f}")
print(f"  Precision: {manual_precision:.4f}")
print(f"  Recall:    {manual_recall:.4f}")

print("\nCompare with sklearn:")
print(f"  Accuracy:  {accuracy:.4f}")
print(f"  Precision: {precision:.4f}")
print(f"  Recall:    {recall:.4f}")

if (abs(manual_accuracy - accuracy) < 0.001
        and abs(manual_precision - precision) < 0.001
        and abs(manual_recall - recall) < 0.001):
    print("\n✅ Perfect match! You understand the formulas!")
    print("🎉 Task 3.2 Complete!")
else:
    print("\n⚠️ Numbers don't match - check your formulas")
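
The formulas above cover accuracy, precision, and recall. F1 can be derived the same way, as the harmonic mean of precision and recall: F1 = 2 × Precision × Recall / (Precision + Recall). The sketch below uses invented counts (`c_tp`, `c_fp`, `c_fn`, `c_tn`) and rebuilds matching label arrays only to cross-check against `f1_score`.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical confusion-matrix counts
c_tp, c_fp, c_fn, c_tn = 6, 2, 4, 8

c_precision = c_tp / (c_tp + c_fp)   # 6 / 8  = 0.75
c_recall = c_tp / (c_tp + c_fn)      # 6 / 10 = 0.60
c_f1 = 2 * c_precision * c_recall / (c_precision + c_recall)

# Rebuild label arrays with the same counts to cross-check against sklearn
c_true = np.array([1] * c_tp + [0] * c_fp + [1] * c_fn + [0] * c_tn)
c_pred = np.array([1] * c_tp + [1] * c_fp + [0] * c_fn + [0] * c_tn)

print(f"Manual F1:  {c_f1:.4f}")
print(f"sklearn F1: {f1_score(c_true, c_pred):.4f}")
```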

---

## 📈 Exercise 4: Regression Metrics

Now let's practice with regression (predicting numbers)!

### Task 4.1: Train a Regression Model

In [None]:
# TODO 4.1: Load diabetes dataset, split, and train a model
# This is a regression problem (predicting disease progression)

# Load data
diabetes = load_diabetes()
X_reg = diabetes.data
y_reg = diabetes.target

print(f"Regression dataset loaded: {len(X_reg)} samples")

# Your code here:
# 1. Split into train/test (80/20, random_state=42)
X_train_reg, X_test_reg, y_train_reg, y_test_reg = # Split here

# 2. Create and train a LinearRegression model
reg_model = # Create model
# Fit model

# 3. Make predictions on test set
y_pred_reg = # Predict

# Validation (Don't modify)
print(f"\n✅ Model trained and predictions made!")
print(f"Sample predictions: {y_pred_reg[:5]}")
print(f"Sample actual:      {y_test_reg[:5]}")
print("\n🎉 Task 4.1 Complete!")

### Task 4.2: Calculate Regression Metrics

In [None]:
# TODO 4.2: Calculate MAE, MSE, RMSE, and R²

# Your code here:
mae = # Mean Absolute Error
mse = # Mean Squared Error
rmse = # Root Mean Squared Error (use np.sqrt on MSE)
r2 = # R² Score

# Validation (Don't modify)
print("Regression Metrics:")
print(f"  MAE:  {mae:.2f}")
print(f"  MSE:  {mse:.2f}")
print(f"  RMSE: {rmse:.2f}")
print(f"  R²:   {r2:.4f}")

print(f"\n💡 Interpretation:")
print(f"  On average, predictions are off by {mae:.1f} units (MAE)")
print(f"  Model explains {r2*100:.1f}% of variance in disease progression")

if r2 > 0.3:
    print(f"\n✅ R² > 0.3 is reasonable for this dataset!")
    print("🎉 Task 4.2 Complete!")
else:
    print(f"\n⚠️ R² is low - check your code")
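
To see how these metrics differ, it can help to compare two sets of predictions with the same total absolute error. In this sketch the targets and predictions (`t_actual`, `pred_even`, `pred_spike`) are invented. One set spreads the error evenly and the other concentrates it in a single large miss. R² is also recomputed from its definition, 1 - SS_res / SS_tot, as a cross-check.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Hypothetical targets and two sets of predictions with the same total absolute error
t_actual = np.array([10.0, 20.0, 30.0, 40.0])
pred_even = t_actual + np.array([1.0, -1.0, 1.0, -1.0])   # four small errors
pred_spike = t_actual + np.array([0.0, 0.0, 0.0, 4.0])    # one large error

for name, pred in [("even", pred_even), ("spike", pred_spike)]:
    mae_demo = mean_absolute_error(t_actual, pred)
    rmse_demo = np.sqrt(mean_squared_error(t_actual, pred))
    print(f"{name:>5}: MAE={mae_demo:.2f}, RMSE={rmse_demo:.2f}")

# R² from its definition: 1 - (residual sum of squares / total sum of squares)
ss_res = np.sum((t_actual - pred_spike) ** 2)
ss_tot = np.sum((t_actual - t_actual.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot
print(f"Manual R²: {r2_manual:.4f}, sklearn R²: {r2_score(t_actual, pred_spike):.4f}")
```

Both prediction sets have an MAE of 1.0, but the RMSE doubles from 1.0 to 2.0 when the error is concentrated in one sample, because squaring weights large errors more heavily.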

---

## 🎯 Exercise 5: Choosing the Right Metric

Understanding WHEN to use which metric is critical!

### Task 5.1: Metric Selection Quiz

In [None]:
# TODO 5.1: For each scenario, choose the best metric
# Options: 'accuracy', 'precision', 'recall', 'f1', 'mae', 'rmse', 'r2'

# Scenario 1: Email spam detection - we DON'T want important emails in spam
scenario_1_metric = ""  # Your answer

# Scenario 2: Cancer detection - we CAN'T miss cancer cases (false negatives are deadly)
scenario_2_metric = ""  # Your answer

# Scenario 3: Predicting house prices - large errors are worse than small errors
scenario_3_metric = ""  # Your answer

# Scenario 4: Customer churn prediction - need balance, classes imbalanced
scenario_4_metric = ""  # Your answer

# Validation (Don't modify)
answers = {
    1: ('precision', "We don't want false positives (important email marked as spam)"),
    2: ('recall', "We must catch all cancer cases (minimize false negatives)"),
    3: ('rmse', "RMSE penalizes large errors more than MAE"),
    4: ('f1', "F1 balances precision and recall for imbalanced classes")
}

score = 0
your_answers = [scenario_1_metric, scenario_2_metric, scenario_3_metric, scenario_4_metric]

for i, (correct, explanation) in answers.items():
    if your_answers[i-1].strip().lower() == correct:
        print(f"✅ Scenario {i}: Correct! {explanation}")
        score += 1
    else:
        print(f"❌ Scenario {i}: Expected '{correct}'. {explanation}")

print(f"\nScore: {score}/4")
if score == 4:
    print("🎉 Perfect! You understand metric selection!")
    print("🎉 Task 5.1 Complete!")
elif score >= 2:
    print("👍 Good! Review the wrong ones.")
else:
    print("📚 Review Notebook 02 on when to use each metric.")
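
Scenario 4 is a good reminder of why accuracy alone can mislead on imbalanced classes. As a minimal sketch, assume a made-up label set with 95 negatives and 5 positives and a "model" that always predicts the majority class. The `imb_*` names are illustrative only.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical imbalanced labels: 95 negatives, 5 positives
imb_true = np.array([0] * 95 + [1] * 5)
# A "model" that always predicts the majority class
imb_pred = np.zeros_like(imb_true)

imb_acc = accuracy_score(imb_true, imb_pred)
# zero_division=0 returns 0 instead of warning when there are no positive predictions
imb_prec = precision_score(imb_true, imb_pred, zero_division=0)
imb_rec = recall_score(imb_true, imb_pred)
imb_f1 = f1_score(imb_true, imb_pred, zero_division=0)

print(f"Accuracy:  {imb_acc:.2f}")   # 0.95
print(f"Precision: {imb_prec:.2f}")  # 0.00
print(f"Recall:    {imb_rec:.2f}")   # 0.00
print(f"F1-Score:  {imb_f1:.2f}")    # 0.00
```

The 95% accuracy looks strong, yet the model never finds a single positive case, which recall and F1 expose immediately.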

---

## 🏆 Lab Complete!

### What You Practiced:

✅ **Exercise 1**: Train/test split with stratification  
✅ **Exercise 2**: Training models and making predictions  
✅ **Exercise 3**: Understanding confusion matrices  
✅ **Exercise 4**: Regression metrics (MAE, MSE, RMSE, R²)  
✅ **Exercise 5**: Choosing the right metric  

### Key Takeaways:

1. **Stratify your splits** for classification problems, especially when classes are imbalanced
2. **Test set is sacred** - never touch during training!
3. **Different problems need different metrics**
4. **Confusion matrix** is the foundation of accuracy, precision, recall, and F1
5. **R² shows explanatory power**, RMSE shows prediction error

### Next Steps:

- Move to **Lab 02** for cross-validation practice
- Review concepts you struggled with
- Try modifying the code to deepen understanding

**Great job! 🎉**