## MACHINE LEARNING DAY 16: Accuracy Measures in Classification

### What is a Confusion Matrix?

A **confusion matrix** is a **table** that compares a classification model's predicted labels with the actual labels. Each cell counts the examples that fall into one actual/predicted combination, so it shows not only how often the model is wrong but also which kinds of errors it makes.

---

### Structure of a Confusion Matrix (for Binary Classification)

|                      | **Predicted: Positive** | **Predicted: Negative** |
| -------------------- | ----------------------- | ----------------------- |
| **Actual: Positive** | True Positive (TP)      | False Negative (FN)     |
| **Actual: Negative** | False Positive (FP)     | True Negative (TN)      |
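
To make the four cells concrete, here is a minimal Python sketch that tallies them from two label lists. The function name `confusion_counts`, the `positive` parameter, and the sample labels are illustrative assumptions, not part of any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally (TP, FN, FP, TN) for a binary problem.

    `positive` is the label treated as the positive class (an assumption
    of this sketch; every other label counts as negative).
    """
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1  # actual positive, predicted positive
        elif actual == positive:
            fn += 1  # actual positive, predicted negative
        elif predicted == positive:
            fp += 1  # actual negative, predicted positive
        else:
            tn += 1  # actual negative, predicted negative
    return tp, fn, fp, tn


# Made-up labels purely for illustration (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=3 FN=1 FP=1 TN=3
```

The return order follows the table, row by row: the actual-positive row (TP, FN) first, then the actual-negative row (FP, TN).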

---

### Key Accuracy Measures (Metrics)

1. **Accuracy**:

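   * The fraction of all predictions that are correct. It can be misleading when classes are imbalanced, because a model that always predicts the majority class already scores high.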
   * **Formula**:

     $$
     \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
     $$

2. **Precision** (also called Positive Predictive Value):

   * Out of all predicted positives, how many were actually positive?
   * **Formula**:

     $$
     \text{Precision} = \frac{TP}{TP + FP}
     $$

3. **Recall** (also called Sensitivity or True Positive Rate):

   * Out of all actual positives, how many did we correctly predict?
   * **Formula**:

     $$
     \text{Recall} = \frac{TP}{TP + FN}
     $$

4. **F1 Score**:

   * Harmonic mean of Precision and Recall. Useful when you need a balance between Precision and Recall.
   * **Formula**:

     $$
     \text{F1 Score} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
     $$
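
The four formulas above translate directly into code. The sketch below is one possible implementation working from raw counts; the helper `safe_div` and its choice to return `0.0` when a denominator is zero are assumptions of this example (other conventions exist, such as raising an error or returning NaN).

```python
def safe_div(numerator, denominator):
    """Divide, returning 0.0 for a zero denominator (a convention assumed here)."""
    return numerator / denominator if denominator else 0.0


def accuracy(tp, tn, fp, fn):
    # (TP + TN) / (TP + TN + FP + FN)
    return safe_div(tp + tn, tp + tn + fp + fn)


def precision(tp, fp):
    # TP / (TP + FP)
    return safe_div(tp, tp + fp)


def recall(tp, fn):
    # TP / (TP + FN)
    return safe_div(tp, tp + fn)


def f1_score(tp, fp, fn):
    # Harmonic mean of precision and recall.
    p = precision(tp, fp)
    r = recall(tp, fn)
    return safe_div(2 * p * r, p + r)


# Illustrative counts only: 3 TP, 3 TN, 1 FP, 1 FN.
print(accuracy(tp=3, tn=3, fp=1, fn=1))  # 0.75
print(precision(tp=3, fp=1))             # 0.75
print(recall(tp=3, fn=1))                # 0.75
print(f1_score(tp=3, fp=1, fn=1))        # 0.75
```

Note that F1 does not use TN at all, which is one reason it is often preferred over accuracy when negatives heavily outnumber positives.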

---

### Example:

Imagine a model that predicts whether an email is spam or not:

* **TP (Spam correctly predicted as spam)** = 80
* **FP (Not spam incorrectly predicted as spam)** = 10
* **FN (Spam incorrectly predicted as not spam)** = 5
* **TN (Not spam correctly predicted as not spam)** = 105

Then:

* Accuracy = (80 + 105) / (80 + 105 + 10 + 5) = **92.5%**
* Precision = 80 / (80 + 10) = **88.9%**
* Recall = 80 / (80 + 5) = **94.1%**
* F1 Score = 2 × (0.889 × 0.941) / (0.889 + 0.941) ≈ **91.4%**
* Specificity (True Negative Rate, TN / (TN + FP)) = 105 / (105 + 10) = **91.3%**
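
The arithmetic for the spam example can be checked with a few lines of Python. This is a small verification sketch using the counts given above; the variable names are chosen for this example only.

```python
# Counts from the spam example.
tp, fp, fn, tn = 80, 10, 5, 105

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp)  # true negative rate

results = {
    "Accuracy": accuracy,
    "Precision": precision,
    "Recall": recall,
    "F1 Score": f1,
    "Specificity": specificity,
}

# Print each metric as a percentage with one decimal place.
for name, value in results.items():
    print(f"{name}: {value:.1%}")
```

Running it prints 92.5%, 88.9%, 94.1%, 91.4%, and 91.3%, matching the values listed above.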