# Performance Metrics in Machine Learning

In this notebook, we will learn about important evaluation metrics used to measure the performance of classification models:
- Accuracy
- Precision
- Recall
- F1 Score

## 📊 Performance Metrics: Beyond Simple Accuracy

- 🎯 **Accuracy:** Overall correctness percentage
- 🔍 **Precision:** What fraction of positive predictions were correct?
- 🎣 **Recall:** What fraction of actual positives did we catch?
- ⚖️ **F1 Score:** Balanced combination of precision & recall

## 🎯 Accuracy: The Starting Point

**Formula:** Accuracy = (Correct Predictions) / (Total Predictions)

- 📈 Easy to understand and calculate
- ⚠️ **Problem:** Misleading with imbalanced data
- 🏥 **Example:** 95% accuracy in rare disease detection sounds good, but if only 5% of patients have the disease...
- 🚨 **Reality:** A model that predicts "no disease" for everyone reaches the same 95% while detecting no cases at all!
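
The rare disease example above can be checked with a few lines of plain Python. This is a minimal sketch on made-up data: the 1000 patients, the 5% disease rate, and the always-negative "model" are assumptions chosen to match the 95% figure, not results from a real study.

```python
# Hypothetical screening data: 1000 patients, 50 of whom have the disease (label 1).
y_true = [1] * 50 + [0] * 950

# A "model" that ignores its input and always predicts "no disease" (label 0).
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the disease class: how many of the 50 sick patients were found?
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(f"Accuracy: {accuracy:.3f}")  # 0.950
print(f"Recall:   {recall:.3f}")    # 0.000
```

The accuracy looks strong only because the negative class dominates; the model never finds a single sick patient.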

## 🔍 Precision: Quality Over Quantity

**Formula:** Precision = True Positives / (True Positives + False Positives)

- 🎯 **Question:** "When I predict positive, how often am I right?"
- 📧 **Email Example:** Of emails marked as spam, how many are actually spam?
- 💡 **High Precision:** Few false alarms, but the model might miss some actual positives
- ⚖️ **Trade-off:** Being careful vs being thorough
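
As a small illustration of the formula, the sketch below computes precision from raw counts. The helper name `precision` and the spam counts are hypothetical, and returning 0.0 when nothing is predicted positive is one common convention rather than the only option.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of positive predictions that were actually positive."""
    predicted_positive = true_positives + false_positives
    # With no positive predictions, precision is undefined; 0.0 is used here by convention.
    if predicted_positive == 0:
        return 0.0
    return true_positives / predicted_positive


# Hypothetical spam filter: 40 emails flagged, 36 really spam, 4 legitimate.
spam_precision = precision(true_positives=36, false_positives=4)
print(f"Precision: {spam_precision:.2f}")  # 0.90
```

In this made-up case, one in ten flagged emails is a false alarm, regardless of how much spam slipped through unflagged.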

## 🎣 Recall: Catching Everything

**Formula:** Recall = True Positives / (True Positives + False Negatives)

- 🔍 **Question:** "Of all actual positives, how many did I find?"
- 🏥 **Medical Example:** Of all patients with disease, how many did we detect?
- 📈 **High Recall:** Catches most actual positives, often at the cost of more false alarms
- ⚖️ **Trade-off:** Being thorough vs being precise
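
One way to see the trade-off is to vary the cut-off at which a model's score counts as a positive prediction. The sketch below assumes a model that outputs a score between 0 and 1; the scores, labels, thresholds, and the helper `precision_recall_at` are all made up for illustration.

```python
def precision_recall_at(threshold, scores, labels):
    """Return (precision, recall) when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores and true labels (1 = positive) for eight cases.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

results = {}
for threshold in (0.85, 0.50, 0.25):
    results[threshold] = precision_recall_at(threshold, scores, labels)
    p, r = results[threshold]
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")

# threshold=0.85  precision=1.00  recall=0.50
# threshold=0.50  precision=0.60  recall=0.75
# threshold=0.25  precision=0.57  recall=1.00
```

On this toy data, lowering the threshold finds more of the actual positives (recall rises) while more of the flagged cases turn out to be false alarms (precision falls).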

## ⚖️ F1 Score: The Balanced Approach

**Formula:** F1 = 2 × (Precision × Recall) / (Precision + Recall)

- 🤝 **Purpose:** Harmonic mean of precision and recall
- 📊 **Range:** 0 to 1 (higher is better)
- ⚖️ **Best for:** Imbalanced datasets where both false positives and false negatives matter
- 🎯 **Goal:** Balance between precision and recall
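
To show why the harmonic mean is used, the sketch below compares F1 with a plain arithmetic mean for a lopsided model. The helper name `f1_score_from` and the precision and recall values are hypothetical.

```python
def f1_score_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical model: every positive prediction is right, but it finds only 10% of positives.
precision, recall = 1.0, 0.1

arithmetic_mean = (precision + recall) / 2
f1 = f1_score_from(precision, recall)

print(f"Arithmetic mean: {arithmetic_mean:.3f}")  # 0.550
print(f"F1 score:        {f1:.3f}")               # 0.182
```

The harmonic mean stays close to the weaker of the two values, so a model cannot earn a high F1 by excelling at only one of them.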

## 💻 Code Example: Performance Metrics

In [None]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_breast_cancer

# Load dataset
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Train a k-nearest neighbors classifier (default settings) and predict on the test set
model = KNeighborsClassifier()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

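# Calculate metrics
# Note: in this dataset, label 0 is malignant and label 1 is benign.
# precision_score, recall_score, and f1_score treat label 1 as the positive class
# by default, so the values below describe how well benign cases are identified.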
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print(f"Accuracy: {accuracy:.3f}")
print(f"Precision: {precision:.3f}")
print(f"Recall: {recall:.3f}")
print(f"F1 Score: {f1:.3f}")
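
Which class counts as "positive" changes precision, recall, and F1, while accuracy stays the same. The pure-Python sketch below makes this visible on a tiny hand-made example. The function `classification_metrics` and the toy labels are hypothetical and are not taken from the dataset used above; in scikit-learn, the same choice is made through the `pos_label` argument of `precision_score`, `recall_score`, and `f1_score`.

```python
def classification_metrics(y_true, y_pred, pos_label=1):
    """Compute accuracy, precision, recall, and F1 for a chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for ten cases: 0 = malignant, 1 = benign.
y_true_toy = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
y_pred_toy = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]

benign_as_positive = classification_metrics(y_true_toy, y_pred_toy, pos_label=1)
malignant_as_positive = classification_metrics(y_true_toy, y_pred_toy, pos_label=0)

for name, scores in [("benign", benign_as_positive), ("malignant", malignant_as_positive)]:
    print(f"positive class = {name}: "
          f"accuracy={scores['accuracy']:.3f}  precision={scores['precision']:.3f}  "
          f"recall={scores['recall']:.3f}  f1={scores['f1']:.3f}")

# positive class = benign: accuracy=0.800  precision=0.857  recall=0.857  f1=0.857
# positive class = malignant: accuracy=0.800  precision=0.667  recall=0.667  f1=0.667
```

When the costly mistake is missing a malignant case, it may make more sense to report the metrics with malignant as the positive class.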

## 📊 Key Takeaway

> **"Different metrics tell different stories - choose wisely!"**

### Reflection Question:
- For a fraud detection system, would you prioritize precision or recall? Why?