# Performance Metrics: Confusion Matrix, Accuracy, Precision, Recall, and F-Beta Score

## Commands

* No specific technical commands were used in this lesson.

## Summary

* **Performance metrics** are essential for evaluating classification models.
* A **Confusion Matrix** maps actual values against predicted values to categorize correct predictions and errors.
* **Accuracy** can be misleading when working with **imbalanced datasets**.
* **Precision** focuses on reducing **False Positives (FP)**.
* **Recall** focuses on reducing **False Negatives (FN)**.
* The **F-Beta Score** combines Precision and Recall as a weighted harmonic mean; the **beta parameter** sets how much weight Recall gets relative to Precision.

---

## The Confusion Matrix

A **Confusion Matrix** is the foundation for evaluating classification performance.

For binary classification, it is a **2×2 matrix**:

|                | Predicted 1 | Predicted 0 |
|---------------|------------|------------|
| **Actual 1**  | TP         | FN         |
| **Actual 0**  | FP         | TN         |

### Definitions

* **True Positive (TP)**: Actual = 1, Predicted = 1  
* **True Negative (TN)**: Actual = 0, Predicted = 0  
* **False Positive (FP)**: Actual = 0, Predicted = 1  
* **False Negative (FN)**: Actual = 1, Predicted = 0  

Correct predictions: **TP + TN**  
Incorrect predictions: **FP + FN**
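
The four definitions above translate directly into a counting loop. The following is a minimal sketch, assuming labels are encoded as 1 (positive) and 0 (negative); the function name `confusion_counts` and the sample labels are hypothetical and only serve as illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, FP, TN for binary labels (1 = positive, 0 = negative)."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # True Positive
        elif actual == 1 and predicted == 0:
            fn += 1  # False Negative
        elif actual == 0 and predicted == 1:
            fp += 1  # False Positive
        elif actual == 0 and predicted == 0:
            tn += 1  # True Negative
        else:
            raise ValueError("labels must be 0 or 1")
    return tp, fn, fp, tn


# Hypothetical labels, made up for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=3 FN=1 FP=1 TN=3
```

For these sample labels, 6 predictions are correct (TP + TN) and 2 are incorrect (FP + FN).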

---

## Accuracy

Accuracy measures overall correctness:

$$
Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
$$

### Problem with Imbalanced Datasets

Suppose:
* 1000 records
* 900 positive (1)
* 100 negative (0)

If a model predicts **1 for every record**, then:

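* TP = 900, FN = 0 (every actual positive is predicted as 1)
* TN = 0, FP = 100 (every actual negative is also predicted as 1)
* Accuracy = (900 + 0) / 1000 = 90%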

The model achieves **90% accuracy** even though it misclassifies every one of the 100 negative records.

Therefore, **Accuracy alone is unreliable** for imbalanced datasets.
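
The arithmetic in this example can be checked with a short sketch. The `accuracy` helper is a hypothetical name. The FP and FN counts follow from the setup: all 100 negatives are predicted as 1, and no positive is missed. The second quantity, the share of actual negatives predicted as 0, is added here only to make the failure on the negative class visible.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# 1000 records: 900 positive, 100 negative; the model always predicts 1.
tp, fn = 900, 0   # every actual positive is predicted as 1
fp, tn = 100, 0   # every actual negative is also predicted as 1

acc = accuracy(tp, tn, fp, fn)
negatives_caught = tn / (tn + fp)  # share of actual negatives predicted as 0

print(acc)               # 0.9
print(negatives_caught)  # 0.0
```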

---

## Precision: Reducing False Positives

Precision is the fraction of predicted positives that are actually positive.

$$
Precision = \frac{TP}{TP + FP}
$$

### When to Use Precision?

Use Precision when **False Positives are costly**.

### Example: Spam Classification

* Marking an important email as spam is a False Positive, and a serious mistake.
* Therefore, the priority is to **reduce False Positives**.
* Precision should be optimized.
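
As a rough illustration of the Precision formula in the spam setting, here is a minimal sketch. The counts are hypothetical, and returning 0.0 when the model predicts no positives at all is an assumed convention, not something the lesson specifies.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP).

    Returns 0.0 when nothing was predicted positive (assumed convention).
    """
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical spam filter: 100 emails flagged as spam,
# 80 of them really spam (TP), 20 of them legitimate (FP).
print(precision(tp=80, fp=20))  # 0.8

# Cutting False Positives from 20 to 5 raises Precision.
print(precision(tp=80, fp=5) > precision(tp=80, fp=20))  # True
```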

---

## Recall: Reducing False Negatives

Recall is the fraction of actual positives that the model correctly identifies.

$$
Recall = \frac{TP}{TP + FN}
$$

### When to Use Recall?

Use Recall when **False Negatives are costly**.

### Example: Disease Prediction

* If a patient actually has a disease (1) but the model predicts healthy (0), that is a **False Negative**.
* This is extremely dangerous because the patient goes untreated.
* Therefore, the priority is to **reduce False Negatives**.
* Recall should be optimized.
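
The Recall formula can be sketched the same way for the disease example. The patient counts below are hypothetical, and returning 0.0 when there are no actual positives is an assumed convention.

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN).

    Returns 0.0 when there are no actual positives (assumed convention).
    """
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical screening: 50 patients actually have the disease,
# the model detects 45 (TP) and misses 5 (FN).
print(recall(tp=45, fn=5))  # 0.9

# Every missed patient (FN) lowers Recall.
print(recall(tp=40, fn=10) < recall(tp=45, fn=5))  # True
```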

---

## Stock Market Crash Example (Business Logic)

Consider a model that predicts whether the stock market will crash, where the positive class (1) means "crash":

* **False Positive** → Predict crash, but no crash happens → Premature selling
* **False Negative** → Predict no crash, but crash happens → Massive financial loss

Since missing a crash is more catastrophic, the priority should be to **reduce False Negatives**, meaning **Recall** is more important.

---

## The F-Beta Score

Sometimes both Precision and Recall matter.

The **F-Beta Score** combines them using a weighted harmonic mean:

$$
F_{\beta} = (1 + \beta^2) \cdot \frac{Precision \times Recall}{(\beta^2 \times Precision) + Recall}
$$

### Understanding Beta ($\beta$)

The $\beta$ parameter controls the emphasis: Recall is treated as $\beta$ times as important as Precision.

* **$\beta = 1$ → F1 Score**  
  Equal importance to Precision and Recall.

* **$\beta < 1$ (e.g., 0.5)**  
  More weight on Precision.

* **$\beta > 1$ (e.g., 2)**  
  More weight on Recall.
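
The effect of $\beta$ is easiest to see by holding Precision and Recall fixed and varying only $\beta$. The following is a minimal sketch of the formula above; the function name `f_beta`, the sample values, and the choice to return 0.0 when the denominator is zero are assumptions made for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # assumed convention when both P and R are 0
    return (1 + b2) * precision * recall / denominator


# Hypothetical model with low Precision and high Recall.
p, r = 0.5, 1.0

for beta in (0.5, 1.0, 2.0):
    print(beta, round(f_beta(p, r, beta), 4))
# 0.5 0.5556
# 1.0 0.6667
# 2.0 0.8333
```

Because Recall is the stronger of the two values here, the score rises as $\beta$ grows and the formula shifts weight toward Recall. With $\beta = 0.5$ the score stays close to the weaker Precision value.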

---

## Metric Comparison Summary

| Metric     | Focus | Best Used When |
|------------|-------|----------------|
| Accuracy   | Overall correctness | Balanced datasets |
| Precision  | Reduce FP | Spam detection, fraud alerts |
| Recall     | Reduce FN | Medical diagnosis, crash prediction |
| F1 Score   | Balance Precision and Recall | When both errors matter equally |
| F-Beta     | Custom balance | When one error type matters more |

---

## Key Takeaway

* Use **Accuracy** only for balanced datasets.
* Use **Precision** when False Positives are costly.
* Use **Recall** when False Negatives are costly.
* Use **F-Beta** when you need a customizable balance between Precision and Recall.