# ✅ Model Evaluation in Machine Learning

## 🎯 Why Evaluate a Model?

Model evaluation measures how well your trained model generalizes to **unseen data**. It lets you:
- Detect whether the model is overfitting or underfitting
- Judge how reliable the predictions are
- Compare multiple models on the same data and metrics

---

## 🧠 Theoretical Overview

Evaluation depends on the **type of problem**:
- **Classification** (predicting categories)
- **Regression** (predicting continuous values)

Each type uses different metrics to assess performance.

---

## 🧪 Evaluation Metrics for Classification

### 1. Accuracy

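- Fraction of all predictions that are correct
- Good for balanced datasets
- Misleading for imbalanced datasets, where always predicting the majority class already scores high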
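
To see why accuracy can mislead on imbalanced data, here is a minimal sketch in plain Python. The 95/5 class split and the always-negative "model" are made up for illustration and do not come from any real dataset.

```python
# Hypothetical imbalanced dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * 100

# Accuracy: share of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall: share of actual positives that were predicted positive.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
actual_positives = sum(t == 1 for t in y_true)
recall = true_positives / actual_positives

print(f"Accuracy: {accuracy:.2f}")  # 0.95
print(f"Recall:   {recall:.2f}")    # 0.00
```

The model scores 95% accuracy while finding none of the positive cases, which is why accuracy alone is not enough here.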

### 2. Precision

- How many of the predicted positives were actually correct?

### 3. Recall (Sensitivity)

- How many of the actual positives were captured?

### 4. F1 Score

- Harmonic mean of Precision and Recall

### 5. Confusion Matrix

|               | Predicted Positive | Predicted Negative |
|---------------|--------------------|--------------------|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |

---

### 🐍 Python Example (Classification)

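```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# Assumes `model` is a fitted binary classifier and X_test, y_test are the held-out test split
y_pred = model.predict(X_test)

# Evaluate (precision, recall, and F1 default to binary labels;
# pass average="macro" or similar for multiclass targets)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
print("F1 Score:", f1_score(y_test, y_pred))
# For 0/1 labels, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]]
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
```

---

## 📈 Evaluation Metrics for Regression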

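### 1. Mean Absolute Error (MAE)

MAE = (1/n) * Σ |yᵢ - ŷᵢ|

- Measures the average magnitude of the errors, ignoring their direction

### 2. Mean Squared Error (MSE)

MSE = (1/n) * Σ (yᵢ - ŷᵢ)²

- Squares each error, so large errors are penalized more heavily than in MAE

### 3. Root Mean Squared Error (RMSE)

RMSE = √MSE

- Easier to interpret than MSE because it is in the same units as the output variable

### 4. R² Score (Coefficient of Determination)

R² = 1 - (SS_res / SS_tot)

- SS_res = Σ (yᵢ - ŷᵢ)² is the residual sum of squares; SS_tot = Σ (yᵢ - ȳ)² is the total sum of squares around the mean ȳ
- Indicates the proportion of variance in the target explained by the model: 1 is a perfect fit, 0 is no better than always predicting the mean, and negative values are worse than that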
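
The four formulas above can be checked by hand on a tiny example. This is a minimal sketch in plain Python; the target and prediction values are invented purely for illustration.

```python
import math

# Hypothetical targets and predictions, chosen only for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]  # 1, 0, -1, -2

mae = sum(abs(e) for e in errors) / n   # (1 + 0 + 1 + 2) / 4
mse = sum(e ** 2 for e in errors) / n   # (1 + 0 + 1 + 4) / 4
rmse = math.sqrt(mse)

mean_y = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)              # residual sum of squares
ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
r2 = 1 - ss_res / ss_tot

print(f"MAE:  {mae:.3f}")   # 1.000
print(f"MSE:  {mse:.3f}")   # 1.500
print(f"RMSE: {rmse:.3f}")  # 1.225
print(f"R2:   {r2:.3f}")    # 0.700
```

The single error of 2 contributes 4 of the 6 squared-error units, which is the "penalizes larger errors" effect that separates MSE from MAE.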

### 🐍 Python Example (Regression)

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import numpy as np

# Assumes `model` is a fitted regressor and X_test, y_test are the held-out test split
y_pred = model.predict(X_test)

print("MAE:", mean_absolute_error(y_test, y_pred))
print("MSE:", mean_squared_error(y_test, y_pred))
print("RMSE:", np.sqrt(mean_squared_error(y_test, y_pred)))
print("R2 Score:", r2_score(y_test, y_pred))
```

---

## ⚖️ Overfitting vs. Underfitting

Comparing performance on the training data with performance on the test data shows which of these cases a model falls into:

| Concept      | Train Accuracy | Test Accuracy | Reason                        |
| ------------ | -------------- | ------------- | ----------------------------- |
| Overfitting  | High           | Low           | Model memorized training data |
| Underfitting | Low            | Low           | Model too simple              |
| Good Fit     | High           | High          | Model generalized well        |
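
The train/test pattern in the table can be turned into a rough check. The sketch below is only an illustration: `diagnose_fit` is a hypothetical helper, and the `good` and `max_gap` cut-offs are arbitrary assumptions that would need tuning for any real task.

```python
def diagnose_fit(train_acc, test_acc, good=0.85, max_gap=0.10):
    """Label a model using its train and test accuracy.

    `good` and `max_gap` are hypothetical cut-offs for illustration only;
    sensible values depend on the task and the data.
    """
    if train_acc < good:
        return "underfitting"   # low even on the data it was trained on
    if train_acc - test_acc > max_gap:
        return "overfitting"    # high on train, clearly lower on test
    return "good fit"           # high on both


print(diagnose_fit(0.99, 0.70))  # overfitting
print(diagnose_fit(0.62, 0.60))  # underfitting
print(diagnose_fit(0.93, 0.91))  # good fit
```

In practice the two scores would come from evaluating the same fitted model on the training split and on the held-out split.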


# 📊 Classification Metrics in Machine Learning

When evaluating classification models, especially binary classifiers, these are the most commonly used metrics:

---

## ✅ 1. Accuracy

**Definition:**
The ratio of correctly predicted observations to the total observations.

**Formula:**

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where:

- TP = True Positives
- TN = True Negatives
- FP = False Positives
- FN = False Negatives

**Use case:** Good when classes are roughly balanced; misleading when one class dominates.
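
As a worked instance of the accuracy formula, here is a small sketch. The function name `accuracy` and the four counts are assumptions made up for the example.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts from 100 test samples.
print(accuracy(tp=40, tn=45, fp=5, fn=10))  # 0.85
```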

---

## 🎯 2. Precision

**Definition:**
The ratio of correctly predicted positive observations to the total predicted positive observations.

**Formula:**

Precision = TP / (TP + FP)


**Use case:** Important when **false positives** are costly (e.g., a spam filter that sends legitimate email to the spam folder).
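
The precision formula applied to a spam-filter scenario might look like this. The counts are invented, and returning 0.0 when nothing is predicted positive is a convention chosen for this sketch rather than part of the definition.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP)."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return 0.0  # assumption: treat "no positive predictions" as 0.0
    return tp / predicted_positive


# Hypothetical filter: 50 emails flagged as spam, 45 of them really spam.
print(precision(tp=45, fp=5))  # 0.9
```

Here 5 legitimate emails were flagged, so 10% of what the filter calls spam is wrong.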

---

## 📢 3. Recall (Sensitivity or True Positive Rate)

**Definition:**
The ratio of correctly predicted positive observations to all actual positives.

**Formula:**

Recall = TP / (TP + FN)


**Use case:** Important when **false negatives** are costly (e.g., a disease screening test that misses a sick patient).
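
The recall formula applied to a screening scenario, again as a sketch with made-up counts. Returning 0.0 when there are no actual positives is an assumption made here for convenience.

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN)."""
    actual_positive = tp + fn
    if actual_positive == 0:
        return 0.0  # assumption: treat "no actual positives" as 0.0
    return tp / actual_positive


# Hypothetical screening test: 50 sick patients, 40 detected, 10 missed.
print(recall(tp=40, fn=10))  # 0.8
```

The 10 missed patients are the false negatives, so the test captures 80% of the actual positives.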

---

## 🔄 4. F1 Score

**Definition:**
The harmonic mean of Precision and Recall. It balances the two metrics.

**Formula:**

F1 Score = 2 * (Precision * Recall) / (Precision + Recall)


**Use case:** Useful when you need a balance between Precision and Recall, especially with imbalanced classes.
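
To illustrate why the harmonic mean is used, the sketch below compares F1 with a plain average for a lopsided model. The helper name `f1_from` and the precision/recall values are hypothetical.

```python
def f1_from(precision, recall):
    """F1 = 2 * (Precision * Recall) / (Precision + Recall)."""
    if precision + recall == 0:
        return 0.0  # assumption: define F1 as 0.0 when both inputs are 0
    return 2 * (precision * recall) / (precision + recall)


# Hypothetical model: very precise, but it finds few of the positives.
p, r = 0.9, 0.1
arithmetic_mean = (p + r) / 2
f1 = f1_from(p, r)

print(f"Arithmetic mean: {arithmetic_mean:.2f}")  # 0.50
print(f"F1 score:        {f1:.2f}")               # 0.18
```

The harmonic mean stays close to the weaker of the two values, so a model cannot reach a high F1 by excelling at only one of them.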

---

## 📘 5. Confusion Matrix

**Definition:**
A table that counts predictions by actual class and predicted class, showing which kinds of errors a classification model makes.

### Structure:

|                      | Predicted Positive | Predicted Negative |
|----------------------|--------------------|--------------------|
| **Actual Positive**  | True Positive (TP) | False Negative (FN)|
| **Actual Negative**  | False Positive (FP)| True Negative (TN) |

**Explanation:**
- **TP**: Model correctly predicted the positive class
- **TN**: Model correctly predicted the negative class
- **FP**: Model incorrectly predicted positive (Type I Error)
- **FN**: Model incorrectly predicted negative (Type II Error)
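
The four cells can be counted directly from label lists. This is a minimal sketch; `confusion_counts` is a hypothetical helper and the labels are invented for the example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for binary labels."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1   # correctly predicted positive
        elif actual == positive:
            fn += 1   # missed positive (Type II error)
        elif predicted == positive:
            fp += 1   # false alarm (Type I error)
        else:
            tn += 1   # correctly predicted negative
    return tp, fn, fp, tn


# Hypothetical labels for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FN={fn}")  # TP=3 FN=1
print(f"FP={fp} TN={tn}")  # FP=1 TN=3
```

The two printed rows follow the table layout above. Note that scikit-learn's `confusion_matrix` orders rows and columns by sorted label, so for 0/1 labels it returns `[[TN, FP], [FN, TP]]` instead.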

---

