# Day 18 – Classification Metrics (Part 2)

## Recap: Classification Metrics
Classification problems deal with predicting **categorical outcomes**.  
To evaluate classification models, we use performance metrics derived from the **confusion matrix**.

### Common Metrics
- Accuracy  
- Precision  
- Recall (Sensitivity)  
- Specificity  
- F1 Score  
- ROC Curve & AUC  

---

## Positive vs Negative Classes (Convention)

- **Positive Class**  
  - Rare event  
  - Needs attention  
  - Example: Disease present, Spam mail, Fraud

- **Negative Class**  
  - Frequent / Normal  
  - Example: No disease, Ham mail, Legit transaction

> Positive ≠ Good  
> Negative ≠ Bad  

---

## True vs False

- **True** → Correct prediction  
- **False** → Wrong prediction  

---

## Confusion Matrix

| Actual \\ Predicted | Positive | Negative |
|--------------------|----------|----------|
| **Positive**       | TP       | FN       |
| **Negative**       | FP       | TN       |

### Definitions
- **TP (True Positive):** Positive correctly predicted as positive  
- **TN (True Negative):** Negative correctly predicted as negative  
- **FP (False Positive):** Negative wrongly predicted as positive  
- **FN (False Negative):** Positive wrongly predicted as negative  
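
To make the four cells concrete, here is a minimal sketch that tallies them from two label lists. The helper name `confusion_counts`, the sample labels, and the convention that `1` marks the positive class are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, TN, FP, FN for a binary problem (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # positive correctly predicted as positive
            else:
                fp += 1  # negative wrongly predicted as positive
        else:
            if actual == positive:
                fn += 1  # positive wrongly predicted as negative
            else:
                tn += 1  # negative correctly predicted as negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Made-up labels for this sketch: 1 = positive, 0 = negative.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 0, 1, 0, 1, 0, 0, 0]
counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 4, 'FP': 1, 'FN': 1}
```

The four counts always add up to the number of samples, which is a quick sanity check on any confusion matrix.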

---

## Classification Metrics & Formulae

### Accuracy
Overall correctness of the model.
- Accuracy = (TP + TN) / (TP + TN + FP + FN)
- Can be misleading for imbalanced datasets: a model that always predicts the majority class scores high accuracy while missing every positive.
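
A quick way to see the problem is to score a model that never predicts the positive class. The sketch below uses a made-up dataset of 95 negatives and 5 positives; the numbers are illustrative only.

```python
# Made-up imbalanced dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)

pairs = list(zip(y_true, y_pred))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)
fn = sum(1 for a, p in pairs if a == 1 and p == 0)

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.95
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```

The model looks 95% correct, yet it finds none of the positives.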

---

### 🎯 Precision (Positive Predictive Value)
- Out of all predicted positives, how many are actually positive?
- Precision = TP / (TP + FP)


- Focuses on **False Positives (FP)**.

---

### 🔍 Recall (Sensitivity / True Positive Rate)
- Out of all actual positives, how many did the model correctly identify?
- Recall = TP / (TP + FN)

- Focuses on **False Negatives (FN)**.
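
Precision and recall share the numerator TP and differ only in the denominator. The sketch below computes both from raw counts. The function names and the counts are hypothetical, and returning `0.0` when the denominator is zero is an assumed convention, not something these notes prescribe.

```python
def precision(tp, fp):
    """TP / (TP + FP): share of predicted positives that are truly positive."""
    predicted_positive = tp + fp
    # Assumed convention: report 0.0 when nothing was predicted positive.
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp, fn):
    """TP / (TP + FN): share of actual positives that the model found."""
    actual_positive = tp + fn
    # Assumed convention: report 0.0 when there are no actual positives.
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical counts for a cautious model: few false alarms, many misses.
tp, fp, fn = 8, 2, 12
p = precision(tp, fp)  # 8 / 10
r = recall(tp, fn)     # 8 / 20
print(f"precision = {p:.2f}, recall = {r:.2f}")  # precision = 0.80, recall = 0.40
```

The same model can look strong on one metric and weak on the other, which is why the two are reported together.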

---

### 🛡️ Specificity (True Negative Rate)
- Out of all actual negatives, how many are correctly predicted?
- Specificity = TN / (TN + FP)


---

### False Rates
- False Positive Rate (FPR) = FP / (FP + TN)
- False Negative Rate (FNR) = FN / (TP + FN)
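
The four rates come in complementary pairs: FPR = 1 - Specificity and FNR = 1 - Recall, because each pair shares a denominator. A minimal sketch with hypothetical counts (the helper name `rates` is also an assumption):

```python
import math


def rates(tp, tn, fp, fn):
    """Return the four per-class rates computed from a confusion matrix."""
    return {
        "recall": tp / (tp + fn),       # True Positive Rate
        "specificity": tn / (tn + fp),  # True Negative Rate
        "fpr": fp / (fp + tn),          # False Positive Rate
        "fnr": fn / (tp + fn),          # False Negative Rate
    }


# Hypothetical counts: 50 actual positives and 50 actual negatives.
r = rates(tp=30, tn=45, fp=5, fn=20)
for name, value in r.items():
    print(f"{name:12s} {value:.2f}")

# Each false rate is the complement of the matching true rate.
assert math.isclose(r["fpr"], 1 - r["specificity"])
assert math.isclose(r["fnr"], 1 - r["recall"])
```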


---

## F1 Score – Balanced Metric

- F1 Score is the **harmonic mean** of Precision and Recall.
- F1 Score = (2 × Precision × Recall) / (Precision + Recall)


### Behavior
- High Precision + Low Recall → Low F1  
- Low Precision + High Recall → Low F1  
- High Precision + High Recall → High F1  
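
The behavior listed above follows from the harmonic mean being pulled toward the smaller of its two inputs. The sketch below compares it with the plain arithmetic mean; the precision/recall pairs are made up, and the helper `f1` is a hypothetical name.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up (precision, recall) pairs matching the three cases in the list.
cases = [(0.9, 0.1), (0.1, 0.9), (0.9, 0.9)]
for p, r in cases:
    arithmetic_mean = (p + r) / 2
    print(f"P={p:.1f} R={r:.1f}  mean={arithmetic_mean:.2f}  F1={f1(p, r):.2f}")
```

For the lopsided pairs the arithmetic mean is 0.50 while F1 is only 0.18, so a model cannot earn a high F1 by excelling at just one of the two.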

---

## Numerical Example

Given:
- TP = 2  
- TN = 3  
- FP = 3  
- FN = 3  

Calculations:
- Accuracy = (2 + 3) / (2 + 3 + 3 + 3) = 5/11 ≈ 0.455
- Precision = 2 / (2 + 3) = 2/5 = 0.4
- Recall = 2 / (2 + 3) = 2/5 = 0.4
- F1 Score = 2 × (2/5 × 2/5) / (2/5 + 2/5) = 2/5 = 0.4
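
The same arithmetic can be checked exactly with Python's `fractions` module, which avoids floating-point rounding. This is only a verification sketch of the calculations above; the specificity line is an extra value derived from the same counts.

```python
from fractions import Fraction

# Counts from the numerical example.
tp, tn, fp, fn = 2, 3, 3, 3

accuracy = Fraction(tp + tn, tp + tn + fp + fn)
precision = Fraction(tp, tp + fp)
recall = Fraction(tp, tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)  # 5/11 2/5 2/5 2/5

# Not computed in the example, but it follows from the same counts.
specificity = Fraction(tn, tn + fp)
print(specificity)  # 1/2
```

Because precision and recall are equal here, their harmonic mean equals the same value, which is why F1 also comes out to 2/5.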


---

## 📌 Choosing the Right Metric (Very Important)

### 📧 Spam–Ham Classification
- **Spam** → Positive (rare, needs attention)
- **Ham** → Negative (frequent)

#### Misclassification Costs:
1. Spam → Ham (**FN**) → Less costly: a spam message reaches the inbox  
2. Ham → Spam (**FP**) → Very costly: a legitimate message is hidden from the user  

✅ **Precision is the metric to prioritize**  
Goal: Minimize **False Positives**

---

### 🏥 Healthcare / Covid Example

- Predicting "disease absent" for a patient who has the disease (**FN**) is very dangerous  
- FN is **costly**: a missed case goes untreated

✅ Use **Recall**  
Goal: Catch as many positive cases as possible (minimize **False Negatives**)

---

## 🧠 Metric Selection Guide

| Scenario | Metric to Focus |
|--------|----------------|
| Spam detection | Precision |
| Disease screening | Recall |
| Fraud detection | Recall |
| Balanced importance | F1 Score |
| Imbalanced dataset | Precision / Recall / F1 |
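
To see how the choice of metric changes which model "wins", the sketch below ranks two hypothetical spam filters by each metric. The model names and confusion counts are invented for illustration and do not come from any real evaluation.

```python
def precision(tp, fp):
    return tp / (tp + fp)


def recall(tp, fn):
    return tp / (tp + fn)


def f1(tp, fp, fn):
    # Equivalent to the harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn)


# Invented counts for two filters evaluated on the same 100 spam messages.
models = {
    "strict_filter": {"tp": 60, "fp": 2, "fn": 40},   # rarely flags ham
    "eager_filter": {"tp": 95, "fp": 30, "fn": 5},    # rarely misses spam
}

scores = {
    name: {
        "precision": precision(c["tp"], c["fp"]),
        "recall": recall(c["tp"], c["fn"]),
        "f1": f1(c["tp"], c["fp"], c["fn"]),
    }
    for name, c in models.items()
}

# Pick the best model separately under each metric.
best_by = {
    metric: max(scores, key=lambda name: scores[name][metric])
    for metric in ("precision", "recall", "f1")
}
print(best_by)
# {'precision': 'strict_filter', 'recall': 'eager_filter', 'f1': 'eager_filter'}
```

If false positives are the costly error, the strict filter is preferable; if false negatives are, the eager one is. The ranking depends on the cost of each mistake, not on the models alone.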




The numerical example above, verified with scikit-learn:

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Actual and predicted labels (1 = positive, 0 = negative).
# They reproduce TP = 2, TN = 3, FP = 3, FN = 3 from the numerical example.
y_act = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0])
y_pred = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# scikit-learn sorts labels in ascending order, so the layout is
# [[TN, FP], [FN, TP]], not the positive-first table shown earlier.
print(confusion_matrix(y_act, y_pred))
# [[3 3]
#  [3 2]]

print(accuracy_score(y_act, y_pred))   # 0.45454545454545453
print(precision_score(y_act, y_pred))  # 0.4
print(recall_score(y_act, y_pred))     # 0.4
print(f1_score(y_act, y_pred))         # 0.4
```

---

## 🧾 Final Summary

- Confusion Matrix is the foundation of all metrics  
- Metric selection depends on **cost of misclassification**  
- FP costly → Precision  
- FN costly → Recall  
- Both costly → F1 Score  

✔️ Day 18 completed: **Deep understanding of classification metrics and real-world decision making**