# 📘 Part 2: Classification Metrics

This section walks through the most widely used **classification metrics** in Machine Learning and Deep Learning.

These metrics are essential for evaluating models on **binary classification** and **multi-class classification** problems.

We will cover:
1. Accuracy
2. Precision
3. Recall
4. F1 Score
5. ROC Curve & AUC


## 1. Accuracy

**Definition:**
Accuracy measures the proportion of correctly classified instances (both positive and negative) out of the total instances.

**Why it is used in ML:**
It provides a simple measure of performance when classes are balanced.

**Pros:** Easy to interpret, widely used.
**Cons:** Misleading for imbalanced datasets (e.g., with 95% negatives, a model that always predicts negative still reaches 95% accuracy).

$$ Accuracy = \frac{TP + TN}{TP + TN + FP + FN} $$


In [None]:
# Manual Accuracy Calculation
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

# Manual
accuracy_manual = np.sum(y_true == y_pred) / len(y_true)
print('Manual Accuracy:', accuracy_manual)

# Sklearn
print('Sklearn Accuracy:', accuracy_score(y_true, y_pred))
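
The imbalance caveat above can be made concrete with a small sketch. The labels below are hypothetical (95 negatives, 5 positives), and the "model" is assumed to predict the negative class for every sample:

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 95 negatives, 5 positives
y_true_imb = np.array([0] * 95 + [1] * 5)
# A trivial baseline that always predicts the negative class
y_pred_imb = np.zeros_like(y_true_imb)

acc_imb = accuracy_score(y_true_imb, y_pred_imb)
rec_imb = recall_score(y_true_imb, y_pred_imb)

print('Accuracy:', acc_imb)  # high, although no positive is ever found
print('Recall:', rec_imb)    # share of actual positives that were detected
```

Accuracy looks strong here while the baseline never detects a single positive, which is why the metrics in the following sections are usually reported alongside it.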

## 2. Precision

**Definition:**
Precision measures the proportion of correctly predicted positive observations out of all predicted positives.

**Why it is used in ML:**
Useful when the cost of **false positives** is high.

**Pros:** Important in information retrieval (search engines, spam filters).
**Cons:** Ignores false negatives.

$$ Precision = \frac{TP}{TP + FP} $$


In [None]:
from sklearn.metrics import precision_score

precision_manual = np.sum((y_true == 1) & (y_pred == 1)) / np.sum(y_pred == 1)
print('Manual Precision:', precision_manual)

print('Sklearn Precision:', precision_score(y_true, y_pred))
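
The TP, FP, FN, and TN counts used in these formulas can also be read directly from a confusion matrix. The following is a minimal sketch using the same example labels; for binary labels 0 and 1, `confusion_matrix` returns the rows as true classes and the columns as predicted classes, so flattening it yields the counts in the order TN, FP, FN, TP:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 0, 1])

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print('TN:', tn, 'FP:', fp, 'FN:', fn, 'TP:', tp)

# Precision computed from the raw counts
precision_from_counts = tp / (tp + fp)
print('Precision from counts:', precision_from_counts)
```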

## 3. Recall (Sensitivity)

**Definition:**
Recall measures the proportion of correctly predicted positive observations out of all actual positives.

**Why it is used in ML:**
Important when missing a positive case is very costly (e.g., medical diagnosis).

**Pros:** Prioritizes catching all positive cases.
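**Cons:** Ignores false positives, so it can be high while precision is low (a model that predicts positive for every sample reaches a recall of 1.0).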

$$ Recall = \frac{TP}{TP + FN} $$


In [None]:
from sklearn.metrics import recall_score

recall_manual = np.sum((y_true == 1) & (y_pred == 1)) / np.sum(y_true == 1)
print('Manual Recall:', recall_manual)

print('Sklearn Recall:', recall_score(y_true, y_pred))
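
Precision and recall usually pull in opposite directions as the decision threshold moves. The sketch below illustrates this under an assumption: the `y_scores` values are illustrative predicted probabilities, not the output of a trained model, and the three thresholds are arbitrary.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
# Illustrative scores (assumed): one predicted probability per sample
y_scores = np.array([0.9, 0.2, 0.8, 0.4, 0.3, 0.7, 0.1, 0.6])

results = {}
for threshold in (0.3, 0.5, 0.7):
    # Predict positive when the score reaches the threshold
    y_pred_t = (y_scores >= threshold).astype(int)
    p = precision_score(y_true, y_pred_t)
    r = recall_score(y_true, y_pred_t)
    results[threshold] = (p, r)
    print(f'threshold={threshold}: precision={p:.2f}, recall={r:.2f}')
```

With these scores, the lowest threshold catches every positive at the cost of more false positives, while the highest threshold removes the false positives but misses one positive.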

## 4. F1 Score

**Definition:**
The F1 score is the harmonic mean of Precision and Recall.

**Why it is used in ML:**
Balances the trade-off between Precision and Recall.

**Pros:** Good when you need a balance between precision and recall.
**Cons:** Less intuitive to interpret than accuracy, and it does not take true negatives into account.

$$ F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall} $$


In [None]:
from sklearn.metrics import f1_score

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1_manual = 2 * (precision * recall) / (precision + recall)
print('Manual F1 Score:', f1_manual)

print('Sklearn F1 Score:', f1_score(y_true, y_pred))
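
To see why the harmonic mean is used rather than a plain average, consider a lopsided case. The precision and recall values below are hypothetical, and `f1_from` is a helper name introduced only for this sketch:

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical model: perfect precision, but it finds only 10% of positives
precision_hyp, recall_hyp = 1.0, 0.1

arithmetic_mean = (precision_hyp + recall_hyp) / 2
f1_hyp = f1_from(precision_hyp, recall_hyp)

print(f'Arithmetic mean: {arithmetic_mean:.3f}')
print(f'F1 (harmonic mean): {f1_hyp:.3f}')
```

The harmonic mean stays close to the weaker of the two values, so a model cannot earn a high F1 score by excelling at only one of them.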

## 5. ROC Curve & AUC

**Definition:**
The ROC curve plots True Positive Rate (Recall) against False Positive Rate at different thresholds. The AUC measures the overall area under this curve.

**Why it is used in ML:**
Useful for comparing how well classifiers rank positives above negatives, without committing to a single decision threshold.

**Pros:** Threshold-independent evaluation.
**Cons:** Can look overly optimistic on highly imbalanced datasets, where a precision-recall curve is often more informative.

$$ TPR = \frac{TP}{TP + FN} $$
$$ FPR = \frac{FP}{FP + TN} $$

$$ AUC = \int_0^1 TPR(FPR) \, dFPR $$


In [None]:
from sklearn.metrics import roc_curve, auc
import matplotlib.pyplot as plt

y_scores = np.array([0.9, 0.2, 0.8, 0.4, 0.3, 0.7, 0.1, 0.6])  # Predicted probabilities of the positive class
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
roc_auc = auc(fpr, tpr)  # Trapezoidal area under the ROC curve

plt.plot(fpr, tpr, label=f'ROC Curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], 'k--')  # Diagonal: a random classifier (AUC = 0.5)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend(loc='lower right')
plt.show()
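
AUC also has a ranking interpretation: it equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half. The sketch below checks this against `roc_auc_score` on the same example data; the pairwise computation is an illustration of that interpretation, not how scikit-learn computes the value internally.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_scores = np.array([0.9, 0.2, 0.8, 0.4, 0.3, 0.7, 0.1, 0.6])

pos_scores = y_scores[y_true == 1]
neg_scores = y_scores[y_true == 0]

# Compare every positive score with every negative score
diff = pos_scores[:, None] - neg_scores[None, :]
# Count correctly ordered pairs; a tie contributes half a pair
auc_pairwise = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

auc_sklearn = roc_auc_score(y_true, y_scores)
print('Pairwise AUC:', auc_pairwise)
print('Sklearn AUC:', auc_sklearn)
```

The pairwise loop grows with the product of the class sizes, so on real datasets `roc_auc_score` is the practical choice.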