
# Machine Learning Evaluation Metrics

In machine learning, model evaluation is crucial for understanding how well a model performs on unseen data. Different tasks require different evaluation metrics. Below is a list of commonly used evaluation metrics for **classification** and **regression** problems.

## Table of Contents
1. [Classification Metrics](#classification-metrics)
   - [Accuracy](#accuracy)
   - [Precision](#precision)
   - [Recall](#recall-sensitivity-or-true-positive-rate)
   - [F1 Score](#f1-score)
   - [AUC-ROC](#auc-roc-area-under-the-receiver-operating-characteristic-curve)
   - [Confusion Matrix](#confusion-matrix)
2. [Regression Metrics](#regression-metrics)
   - [Mean Absolute Error (MAE)](#mean-absolute-error-mae)
   - [Mean Squared Error (MSE)](#mean-squared-error-mse)
   - [Root Mean Squared Error (RMSE)](#root-mean-squared-error-rmse)
   - [R-squared (R²)](#r-squared-r²)
   - [Adjusted R-squared](#adjusted-r-squared)

---

## Classification Metrics

### Accuracy
**Definition**: Accuracy is the ratio of correctly predicted instances to the total number of instances.
\[
Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
\]
- **TP (True Positives)**: Correct positive predictions
- **TN (True Negatives)**: Correct negative predictions
- **FP (False Positives)**: Incorrect positive predictions
- **FN (False Negatives)**: Incorrect negative predictions

**Use case**: Suitable for balanced datasets where classes are evenly distributed.
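
As a minimal sketch in plain Python (the function name `accuracy` and the labels are illustrative assumptions, not part of any library), the example below computes accuracy and shows why balance matters: a model that always predicts the majority class still scores high on imbalanced data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical imbalanced data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the negative class.
y_pred = [0] * 100

imbalanced_accuracy = accuracy(y_true, y_pred)
print(imbalanced_accuracy)  # 0.95, although no positive case was found
```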

---

### Precision
**Definition**: Precision measures the proportion of true positives among all positive predictions.
\[
Precision = \frac{TP}{TP + FP}
\]
**Use case**: Important in cases where **false positives** are costly (e.g., spam detection).
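
The formula can be sketched directly from label lists. The helper `precision`, its `positive` parameter, and the spam labels below are hypothetical; returning `0.0` when nothing is predicted positive is a convention assumed here, not a rule.

```python
def precision(y_true, y_pred, positive=1):
    """TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical spam filter output: 1 = spam, 0 = not spam.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

print(precision(y_true, y_pred))  # 3 TP and 1 FP -> 0.75
```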

---

### Recall (Sensitivity or True Positive Rate)
**Definition**: Recall is the proportion of actual positives that the model correctly identified.
\[
Recall = \frac{TP}{TP + FN}
\]
**Use case**: Used when **false negatives** are critical (e.g., disease detection).

For more information, see https://developers.google.com/machine-learning/crash-course/classification/roc-and-auc
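
When a model outputs scores rather than hard labels, recall depends on the decision threshold. The sketch below assumes made-up scores and a hypothetical helper `recall_at_threshold`; it illustrates that lowering the threshold finds more of the actual positives, usually at the cost of more false positives.

```python
def recall_at_threshold(y_true, scores, threshold):
    """Recall, TP / (TP + FN), after turning scores into 0/1 predictions."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for t, s in pairs if t == 1 and s >= threshold)
    fn = sum(1 for t, s in pairs if t == 1 and s < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical screening data: 1 = disease present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.3, 0.1, 0.05]

print(recall_at_threshold(y_true, scores, 0.5))  # 2 of 4 positives found -> 0.5
print(recall_at_threshold(y_true, scores, 0.3))  # 3 of 4 positives found -> 0.75
```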

---

### F1 Score
**Definition**: The F1 score is the harmonic mean of precision and recall, offering a balance between the two.
\[
F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}
\]
**Use case**: Useful when there’s an uneven class distribution and a balance between precision and recall is needed.
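
To see why the harmonic mean is used, the sketch below compares it with the arithmetic mean for a model with high precision but poor recall. The function name `f1_score` and the input values are illustrative assumptions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical model: very precise, but it misses most positives.
p, r = 0.9, 0.1

print(f1_score(p, r))  # approximately 0.18, dominated by the weak recall
print((p + r) / 2)     # arithmetic mean is 0.5 and hides the weakness
```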

---

### AUC-ROC (Area Under the Receiver Operating Characteristic Curve)
**Definition**: The ROC curve plots the true positive rate (recall) against the false positive rate, \( FP / (FP + TN) \), as the classification threshold varies. The AUC (Area Under the Curve) measures the entire two-dimensional area underneath the curve.
- **AUC = 1**: Perfect classifier
- **AUC = 0.5**: No skill classifier (random guessing)

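**Use case**: Useful for understanding a model's ability to distinguish between classes independently of any single threshold. On heavily imbalanced datasets it can look optimistic, so it is best read alongside precision and recall.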
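
AUC can also be read as the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as half. The sketch below uses that pairwise interpretation instead of integrating the curve; the function name `roc_auc` and the scores are assumptions, and the quadratic loop is meant for illustration rather than large datasets.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores: one positive (0.4) is ranked below one negative (0.5).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

print(roc_auc(y_true, scores))  # 8 of 9 pairs ranked correctly, about 0.889
```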

---

### Confusion Matrix
**Definition**: A confusion matrix is a table used to evaluate the performance of a classification algorithm. It provides insights into how well the model's predictions match the actual labels.
```
                 | Predicted Positive | Predicted Negative
-----------------|--------------------|-------------------
Actual Positive  |         TP         |         FN
Actual Negative  |         FP         |         TN
```
**Use case**: Useful for understanding the number of correct/incorrect classifications across each class.
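
The four cells can be tallied in a single pass over the labels. This is a sketch for the binary case only; `confusion_counts` and the labels are hypothetical, and the printed layout follows the table above (rows are actual classes, columns are predicted classes).

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP and TN for a binary classification problem."""
    counts = Counter()
    for t, p in zip(y_true, y_pred):
        if t == positive:
            counts["TP" if p == positive else "FN"] += 1
        else:
            counts["FP" if p == positive else "TN"] += 1
    return counts


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

c = confusion_counts(y_true, y_pred)
matrix = [[c["TP"], c["FN"]],
          [c["FP"], c["TN"]]]
print(matrix)  # [[3, 1], [1, 3]]
```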

---

## Regression Metrics

### Mean Absolute Error (MAE)
**Definition**: MAE is the average of the absolute differences between predicted values and actual values.
\[
MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y_i}|
\]
- \( y_i \): Actual value
- \( \hat{y_i} \): Predicted value
- \( n \): Number of observations

**Use case**: MAE is easy to interpret and gives an idea of the average error in the predictions.
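
A worked example of the formula in plain Python, assuming made-up values and a hypothetical function name:

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# Absolute errors are 0.5, 0.0, 1.5 and 1.0, so MAE = 3.0 / 4.
print(mean_absolute_error(y_true, y_pred))  # 0.75
```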

---

### Mean Squared Error (MSE)
**Definition**: MSE is the average of the squared differences between predicted values and actual values.
\[
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2
\]
**Use case**: Commonly used in regression. It penalizes larger errors more than smaller ones, making it sensitive to outliers.
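
The outlier sensitivity is easiest to see with two sets of predictions that have the same MAE. In this sketch (hypothetical data and function name), both prediction sets have an MAE of 1.0, yet the one with a single large error has four times the MSE.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [10.0, 10.0, 10.0, 10.0]
small_errors = [11.0, 9.0, 11.0, 9.0]   # every prediction is off by 1
one_outlier = [10.0, 10.0, 10.0, 14.0]  # a single prediction is off by 4

print(mean_squared_error(y_true, small_errors))  # 1.0
print(mean_squared_error(y_true, one_outlier))   # 4.0
```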

---

### Root Mean Squared Error (RMSE)
**Definition**: RMSE is the square root of the MSE, giving an error estimate in the same units as the target variable.
\[
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y_i})^2}
\]
**Use case**: Like MSE but easier to interpret since it’s in the same units as the actual data.
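
The unit argument can be checked with a small example. The prices below are invented and assumed to be in thousands of dollars: the MSE comes out in squared units, while the RMSE reads directly as a typical error in thousands of dollars.

```python
import math


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical house prices, in thousands of dollars.
y_true = [200.0, 250.0, 300.0, 350.0]
y_pred = [210.0, 240.0, 320.0, 350.0]

# Squared errors are 100, 100, 400 and 0, so MSE = 150 (squared units).
rmse = root_mean_squared_error(y_true, y_pred)
print(round(rmse, 2))  # 12.25, i.e. roughly 12 thousand dollars
```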

---

### R-squared (R²)
**Definition**: R² measures the proportion of variance in the dependent variable that is predictable from the independent variables.
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y_i})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\]
- \( \bar{y} \): Mean of the actual values

**Use case**: Gives an idea of how well the model explains the variance in the data. A value closer to 1 indicates a better fit.
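
The sketch below evaluates the formula on three sets of invented predictions. It also illustrates two reference points: always predicting the mean gives exactly 0, and a model that does worse than the mean gives a negative value. The function name `r_squared` is an assumption.

```python
def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, following the formula above."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
close_fit = [1.1, 1.9, 3.2, 3.8, 5.0]
mean_only = [3.0] * 5                  # always predicts the mean of y_true
reversed_fit = [5.0, 4.0, 3.0, 2.0, 1.0]

print(r_squared(y_true, close_fit))     # approximately 0.99
print(r_squared(y_true, mean_only))     # 0.0
print(r_squared(y_true, reversed_fit))  # -3.0
```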

---

### Adjusted R-squared
**Definition**: Adjusted R² adjusts the R² value by accounting for the number of predictors in the model.
\[
\text{Adjusted } R^2 = 1 - \left(1 - R^2\right) \frac{n - 1}{n - p - 1}
\]
- \( n \): Number of data points
- \( p \): Number of predictors

**Use case**: More reliable than R² when comparing models with different numbers of predictors, as it penalizes adding irrelevant features.
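
To illustrate the penalty, the sketch below applies the formula to two hypothetical models that reach the same R² of 0.80 on 50 data points, one with 2 predictors and one with 20. The function name and the numbers are assumptions chosen for illustration.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 for n data points and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more data points than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


few_predictors = adjusted_r_squared(0.80, n=50, p=2)
many_predictors = adjusted_r_squared(0.80, n=50, p=20)

print(round(few_predictors, 3))   # 0.791
print(round(many_predictors, 3))  # 0.662
```

With the same raw R², the model with more predictors receives the lower adjusted score.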

