# 📅 Day 9: Evaluation Metrics & Confusion Matrix

## 🎯 Objective
Understand how to evaluate classification models using accuracy, precision, recall, F1-score, and the confusion matrix.

## 📘 Dataset: Breast Cancer Dataset
We'll continue using the breast cancer dataset for consistency.

In [None]:
from sklearn.datasets import load_breast_cancer
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Load and prepare data
data = load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

X = df.drop('target', axis=1)
y = df['target']

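# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then reuse it on the test split
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train logistic regression and predict labels for the held-out rows
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
y_pred = model.predict(X_test_scaled)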


## 📊 Confusion Matrix

In [None]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

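# Rows are true labels and columns are predicted labels, in sorted label order
# (0 = malignant, 1 = benign, matching data.target_names)
cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=data.target_names)
disp.plot(cmap='Blues')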
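
To read the four counts out of a binary confusion matrix, one option is to flatten it. The snippet below is a minimal sketch on a small set of hypothetical labels (`y_true_demo` and `y_pred_demo` are made up for illustration and are not taken from the breast cancer data).

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration only (0 = negative, 1 = positive)
y_true_demo = [0, 0, 1, 1, 1, 0, 1, 1]
y_pred_demo = [0, 1, 1, 1, 0, 0, 1, 1]

# Rows are true classes and columns are predicted classes, in sorted label
# order, so a binary matrix is laid out as [[TN, FP], [FN, TP]].
cm_demo = confusion_matrix(y_true_demo, y_pred_demo)

# ravel() flattens the matrix row by row into TN, FP, FN, TP
tn, fp, fn, tp = (int(v) for v in cm_demo.ravel())

print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")
```

In the breast cancer dataset, label 0 is malignant and label 1 is benign, so with the default ordering the "positive" counts in this layout refer to benign cases.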


## 📐 Evaluation Metrics

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report

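# precision, recall, and F1 treat label 1 as the positive class by default
# (pos_label=1); in this dataset label 1 is benign and label 0 is malignant
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
print("F1-Score:", f1_score(y_test, y_pred))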
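
Because scikit-learn scores label 1 as the positive class by default, the numbers above describe the benign class. If the class of interest is label 0 (malignant here), the `pos_label` parameter can switch the perspective. The sketch below uses small hypothetical labels (`y_true_demo`, `y_pred_demo`) rather than the notebook's data.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels for illustration only
y_true_demo = [0, 0, 1, 1, 1, 0, 1, 1]
y_pred_demo = [0, 1, 1, 1, 0, 0, 1, 1]

# Default behaviour: label 1 is treated as the positive class
recall_pos1 = recall_score(y_true_demo, y_pred_demo)

# Treat label 0 as the positive class instead
recall_pos0 = recall_score(y_true_demo, y_pred_demo, pos_label=0)
precision_pos0 = precision_score(y_true_demo, y_pred_demo, pos_label=0)

print("Recall (positive = 1):", recall_pos1)
print("Recall (positive = 0):", recall_pos0)
print("Precision (positive = 0):", precision_pos0)
```

With the notebook's variables, the equivalent call would be `recall_score(y_test, y_pred, pos_label=0)`.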


## 📋 Full Classification Report

In [None]:
print(classification_report(y_test, y_pred, target_names=data.target_names))

## 📘 Metric Breakdown


| Metric        | Formula                                         | Description |
|---------------|-------------------------------------------------|-------------|
| Accuracy      | (TP + TN) / (TP + FP + TN + FN)                 | Fraction of all predictions that are correct |
| Precision     | TP / (TP + FP)                                  | Fraction of predicted positives that are actually positive |
| Recall        | TP / (TP + FN)                                  | Fraction of actual positives that the model identifies |
| F1-Score      | 2 * (Precision * Recall) / (Precision + Recall) | Harmonic mean of precision and recall |
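
The formulas in the table can be checked by hand from the four confusion-matrix counts. The helper below, `metrics_from_counts`, is a hypothetical function written for this illustration (it is not part of scikit-learn), and returning 0.0 when a denominator is zero is an assumption of this sketch. The counts passed in are made up.

```python
def metrics_from_counts(tp, fp, tn, fn):
    """Compute the four table metrics from raw confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Assumption for this sketch: a zero denominator yields 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only
scores = metrics_from_counts(tp=4, fp=1, tn=2, fn=1)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For these counts, accuracy is 6/8 and precision and recall are both 4/5, so the F1-score is also 4/5.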


## ✅ Summary
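- The confusion matrix shows how predictions split into true positives, true negatives, false positives, and false negatives.
- Report precision, recall, and F1-score alongside accuracy. On imbalanced datasets, accuracy alone can look high even when the minority class is rarely detected.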
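
A small sketch of why accuracy alone can mislead on imbalanced data: the labels below are hypothetical (5 positives, 95 negatives), and the "model" simply predicts the majority class every time.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 5 positives and 95 negatives
y_true_imb = [1] * 5 + [0] * 95

# A trivial predictor that always outputs the majority class (0)
y_pred_imb = [0] * 100

acc = accuracy_score(y_true_imb, y_pred_imb)
rec = recall_score(y_true_imb, y_pred_imb)

print("Accuracy:", acc)  # 95 of 100 predictions are correct
print("Recall:", rec)    # none of the 5 positives are found
```

Accuracy comes out at 0.95 while recall for the positive class is 0.0, which is the gap the second summary point warns about.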