```{contents}
```

# Performance Metrics

## **1. Classification Metrics**

When KNN is used to classify data points into categories, we evaluate how well it predicts the **correct class**. Common metrics:

### **A. Accuracy**

$$
\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total predictions}}
$$

* Simple and intuitive.
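* Works well when classes are balanced; can be misleading when they are imbalanced, because always predicting the majority class already scores highly.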
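As a minimal sketch of the formula above, the following computes accuracy in plain Python. The function name `accuracy` and the toy labels are assumptions made for illustration; the data is constructed so that a model that only ever predicts the majority class still scores 0.9.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical imbalanced labels: nine points of class 0, one of class 1.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
# A "model" that always predicts the majority class.
y_majority = [0] * 10

acc = accuracy(y_true, y_majority)
print(acc)  # 0.9, although the single class-1 point is never detected
```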

### **B. Confusion Matrix**

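A table of counts with actual labels as rows and predicted labels as columns. With more than two classes, the labels TP, FP, FN, and TN depend on which class is treated as "positive". The cells below are labeled from the point of view of Class 1:

| Actual \ Predicted | Class 1 | Class 2 | Class 3 |
| ------------------ | ------- | ------- | ------- |
| Class 1            | TP      | FN      | FN      |
| Class 2            | FP      | TN      | TN      |
| Class 3            | FP      | TN      | TN      |

* **TP** = True Positive, **FP** = False Positive, **FN** = False Negative, **TN** = True Negative.
* For any class X: TP is the diagonal cell, FN is the sum of the other cells in X's row, and FP is the sum of the other cells in X's column.
* The diagonal holds all correct predictions; every off-diagonal cell is a misclassification.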
* Helps compute other metrics like precision and recall.
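The sketch below builds such a table from label lists and then derives one-vs-rest counts for a single class. The helper names `confusion_matrix` and `per_class_counts` and the toy labels are assumptions for illustration, not part of any particular library.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count matrix: rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def per_class_counts(matrix, k):
    """One-vs-rest (TP, FP, FN, TN) for the class at row/column index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # rest of row k
    fp = sum(row[k] for row in matrix) - tp   # rest of column k
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


# Hypothetical three-class labels.
labels = ["a", "b", "c"]
y_true = ["a", "a", "a", "b", "b", "c", "c", "c"]
y_pred = ["a", "a", "b", "b", "c", "c", "c", "a"]

cm = confusion_matrix(y_true, y_pred, labels)
print(cm)                       # [[2, 1, 0], [0, 1, 1], [1, 0, 2]]
print(per_class_counts(cm, 0))  # (2, 1, 1, 4) for class "a"
```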

### **C. Precision**

$$
\text{Precision} = \frac{TP}{TP + FP}
$$

* Of all points predicted as class X, how many are correct?

### **D. Recall (Sensitivity)**

$$
\text{Recall} = \frac{TP}{TP + FN}
$$

* Of all points actually in class X, how many did we predict correctly?

### **E. F1 Score**

$$
F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision + Recall}}
$$

* Harmonic mean of precision and recall.
* Useful when classes are imbalanced.
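The three formulas above can be sketched directly from the counts. The function name `precision_recall_f1` and the example counts are hypothetical, and returning 0.0 when a denominator is zero is one common convention assumed here, not the only possible choice.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one class, from its TP/FP/FN counts."""
    # Assumption: report 0.0 instead of raising when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(precision)     # 0.8
print(recall)        # 0.5
print(round(f1, 4))  # 0.6154
```

Because F1 is a harmonic mean, it sits closer to the lower of the two values (0.5) than the arithmetic mean (0.65) would.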

---

## **2. Regression Metrics**

When KNN predicts continuous values:

### **A. Mean Squared Error (MSE)**

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2
$$

* Measures average squared difference between true and predicted values.

### **B. Root Mean Squared Error (RMSE)**

$$
\text{RMSE} = \sqrt{\text{MSE}}
$$

* Same units as the target variable, easier to interpret.

### **C. Mean Absolute Error (MAE)**

$$
\text{MAE} = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{y}_i|
$$

* Average absolute difference. Less sensitive to outliers than MSE.
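To make the outlier remark concrete, here is a small sketch of the three error metrics in plain Python. The function names and the toy values are assumptions for illustration; the second prediction list differs from the first only in one large error.

```python
import math


def mse(y_true, y_pred):
    """Mean of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and two sets of predictions.
y_true = [10.0, 12.0, 14.0, 16.0]
y_close = [11.0, 13.0, 15.0, 17.0]    # every prediction is off by 1
y_outlier = [11.0, 13.0, 15.0, 26.0]  # the last prediction is off by 10

print(mae(y_true, y_close), mse(y_true, y_close))      # 1.0 1.0
print(mae(y_true, y_outlier), mse(y_true, y_outlier))  # 3.25 25.75
```

One large error raises MAE from 1.0 to 3.25 but raises MSE from 1.0 to 25.75, since the error is squared before averaging.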

### **D. R² Score (Coefficient of Determination)**

$$
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$

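* Measures the fraction of variance in the target that is explained by the model.
* Range: at most 1 (higher is better). A value of 0 matches always predicting the mean of the targets; negative values mean the model does worse than that baseline.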
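A short sketch of the R² formula, with hypothetical data chosen so the arithmetic is easy to check by hand. The function name `r_squared` is an assumption, and the sketch assumes the targets are not all identical (otherwise the denominator is zero). Note that predictions worse than the mean baseline give a negative score.

```python
def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; assumes y_true is not constant."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical targets with mean 6 and SS_tot = 20.
y_true = [3.0, 5.0, 7.0, 9.0]

y_fit = [2.0, 5.0, 8.0, 12.0]      # SS_res = 11
y_baseline = [6.0, 6.0, 6.0, 6.0]  # always predicts the mean, SS_res = 20
y_reversed = [9.0, 7.0, 5.0, 3.0]  # SS_res = 80

print(round(r_squared(y_true, y_fit), 2))  # 0.45
print(r_squared(y_true, y_baseline))       # 0.0
print(r_squared(y_true, y_reversed))       # -3.0
```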

---

## **3. KNN-Specific Considerations**

* **Choice of k** strongly affects performance:

  * Small k → may overfit → high variance
  * Large k → may underfit → high bias
* **Distance metric** affects how neighbors are chosen, impacting metrics.
* **Scaling features** is crucial, otherwise one feature may dominate distances → poor performance.
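The points above can be illustrated with a deliberately tiny KNN classifier written in plain Python. Everything here is an assumption made for the sketch: the helper names `knn_predict` and `min_max_scale`, the Euclidean distance, min-max scaling as the scaling method, and the toy datasets (an "age, income" example for the scaling case). It is not meant as a production implementation.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points nearest to `query` (Euclidean)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def min_max_scale(rows, reference):
    """Rescale each feature using the min and max observed in `reference`."""
    columns = list(zip(*reference))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # avoid dividing by zero
    return [tuple((v - lo) / s for v, lo, s in zip(row, lows, spans)) for row in rows]


# Small k follows noise: the point at 3.0 carries a stray "B" label.
noisy_X = [(1.0,), (2.0,), (3.0,), (4.0,), (5.0,)]
noisy_y = ["A", "A", "B", "A", "A"]
pred_k1 = knn_predict(noisy_X, noisy_y, (3.1,), k=1)  # "B": copies the noisy neighbor
pred_k3 = knn_predict(noisy_X, noisy_y, (3.1,), k=3)  # "A": the vote smooths it out

# Large k ignores locality: with k equal to the training-set size,
# the overall majority class wins for every query.
two_X = [(1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
two_y = ["A", "A", "B", "B", "B"]
pred_local = knn_predict(two_X, two_y, (1.5,), k=1)   # "A"
pred_global = knn_predict(two_X, two_y, (1.5,), k=5)  # "B"

# Scaling: features are (age in years, income in dollars); class follows age.
people_X = [(25, 50_000), (30, 52_000), (60, 50_500), (65, 51_500)]
people_y = ["A", "A", "B", "B"]
query = (28, 50_600)

pred_raw = knn_predict(people_X, people_y, query, k=1)  # "B": income dominates
scaled_X = min_max_scale(people_X, people_X)
scaled_query = min_max_scale([query], people_X)[0]
pred_scaled = knn_predict(scaled_X, people_y, scaled_query, k=1)  # "A"
```

In the unscaled case the income differences (hundreds of dollars) swamp the age differences (tens of years), so the nearest neighbor is a 60-year-old with a similar income. After both features are mapped to the same 0 to 1 range, the 25-year-old becomes the nearest neighbor. Note that the query is scaled with the training data's minima and maxima, not its own.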

---

## **4. Quick Summary Table**

| Task           | Metric           | What it Measures                                        |
| -------------- | ---------------- | ------------------------------------------------------- |
| Classification | Accuracy         | Fraction of all predictions that are correct            |
|                | Precision        | Fraction of predicted positives that are correct        |
|                | Recall           | Fraction of actual positives that are found             |
|                | F1 Score         | Harmonic mean of precision and recall                   |
|                | Confusion Matrix | Counts of correct and incorrect predictions per class   |
| Regression     | MSE / RMSE       | Average squared error / its square root in target units |
|                | MAE              | Average absolute error                                  |
|                | R²               | Fraction of variance explained                          |

