## MACHINE LEARNING DAY 9: Accuracy Measures Basics

### Accuracy Measures in Regression Models

In regression, we measure **how close the predicted values are to the actual (true) values**. Unlike classification (where accuracy is a percentage of correct predictions), regression accuracy is measured using **error-based metrics**.

The most commonly used metrics are:

1. **MAE (Mean Absolute Error)**
2. **MSE (Mean Squared Error)**
3. **RMSE (Root Mean Squared Error)**
4. **R² Score (Coefficient of Determination)**

---

### 1. **MAE – Mean Absolute Error**

**Definition**: The average of the absolute differences between predicted and actual values.

**Formula**:

$$
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
$$

* $y_i$ = actual value
* $\hat{y}_i$ = predicted value
* $n$ = number of data points

**Interpretation**:

* MAE weights each error in proportion to its size, so a large error gets no extra penalty beyond its magnitude.
* It tells you the **average magnitude of the errors**, in the same units as the target variable.
* Less sensitive to outliers than MSE.

**Example**: If MAE = 5, it means your predictions are off by 5 units on average.
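The MAE formula can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation: the helper name `mean_absolute_error` and the sample values are assumptions made up for this example.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y_i - y_hat_i| over all data points."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred))
    return total / len(y_true)

# Illustrative values only (not taken from a real dataset)
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# Absolute errors are 0.5, 0.0, 1.5, 1.0, so the mean is 3.0 / 4
mae = mean_absolute_error(y_true, y_pred)
print(mae)  # 0.75
```

Here the predictions are off by 0.75 units on average, in the same units as the target.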

---

### 2. **MSE – Mean Squared Error**

**Definition**: The average of the **squared** differences between predicted and actual values.

**Formula**:

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

**Interpretation**:

* Squaring penalizes **larger errors more heavily**.
* Good if large errors are especially bad in your application.
* More sensitive to outliers than MAE.

**Units**: The square of the target variable’s units (e.g., if the target is in dollars, MSE is in dollars²).
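A minimal sketch of the MSE formula in plain Python follows. The helper name `mean_squared_error` and the sample values are assumptions chosen for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of (y_i - y_hat_i)^2 over all data points."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    return total / len(y_true)

# Illustrative values only (not taken from a real dataset)
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# Squared errors are 0.25, 0.0, 2.25, 1.0, so the mean is 3.5 / 4
mse = mean_squared_error(y_true, y_pred)
print(mse)  # 0.875
```

Note how the single error of 1.5 contributes 2.25 of the 3.5 total, which is the "larger errors are penalized more heavily" effect in action.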

---

### 3. **RMSE – Root Mean Squared Error** (Optional, but commonly used)

**Definition**: The square root of MSE.

$$
\text{RMSE} = \sqrt{\text{MSE}}
$$

**Interpretation**: It brings the error back to the same units as the target. Useful for interpreting error size in the context of the original data.
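As a small sketch, RMSE can be computed by taking the square root of the mean squared error. The helper name `root_mean_squared_error` and the sample values are assumptions for illustration.

```python
import math

def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean of squared errors."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Illustrative values only (not taken from a real dataset)
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# MSE is 0.875 here, so RMSE is sqrt(0.875), roughly 0.935
rmse = root_mean_squared_error(y_true, y_pred)
print(round(rmse, 3))  # 0.935
```

If the target were in dollars, this RMSE would read as "about 0.94 dollars", whereas the MSE of 0.875 would be in dollars².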

---

### 4. **R² Score – Coefficient of Determination**

**Definition**: Measures how well the regression predictions approximate the actual data. It tells you the **proportion of variance** in the dependent variable that is predictable from the independent variables.

**Formula**:

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$

* $\bar{y}$ is the mean of the actual values.

**Interpretation**:

* R² = 1: Perfect fit (every prediction equals the actual value).
* R² = 0: Model does no better than always predicting the mean of the actual values.
* R² < 0: Model does worse than always predicting the mean.

**Important**:

* A high R² doesn't always mean a good model. Also check for overfitting (for example, by computing R² on held-out data) and inspect the residuals for patterns.
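The R² formula and its three interpretation cases can be sketched in plain Python. The helper name `r_squared` and all sample values are assumptions made up for this illustration. The choice to raise an error when the actual values have zero variance is also an assumption, since the formula is undefined in that case.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, following the formula above."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    if ss_tot == 0:
        # All actual values are identical, so R^2 is undefined
        raise ValueError("R^2 is undefined when y_true has zero variance")
    return 1 - ss_res / ss_tot

# Illustrative values only (not taken from a real dataset)
y_true = [3.0, 5.0, 2.5, 7.0]

r2_model = r_squared(y_true, [2.5, 5.0, 4.0, 8.0])   # reasonable predictions
r2_mean = r_squared(y_true, [4.375] * 4)             # always predict the mean
r2_bad = r_squared(y_true, [7.0, 2.5, 5.0, 3.0])     # badly wrong predictions

print(round(r2_model, 3))  # 0.724
print(r2_mean)             # 0.0
print(r2_bad < 0)          # True
```

The second and third calls correspond to the R² = 0 and R² < 0 cases: predicting the mean everywhere gives exactly zero, and predictions worse than the mean push the score negative.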

---

### When to Use What

| Metric   | Use Case                                                              |
| -------- | --------------------------------------------------------------------- |
| MAE      | When you want a **robust** metric less affected by outliers           |
| MSE      | When you want to **penalize large errors** more                       |
| RMSE     | When you want error in **original units**, penalizing large errors    |
| R² Score | When you want to know **how much variance** is explained by the model |
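To make the outlier behaviour in the table concrete, the sketch below compares MAE and RMSE on two made-up prediction sets that differ in a single point. The helper names `mae` and `rmse` and all values are assumptions for illustration only.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))

# Illustrative values only: every prediction is off by 1 ...
y_true = [10.0, 10.0, 10.0, 10.0, 10.0]
pred_clean = [11.0, 11.0, 11.0, 11.0, 11.0]
# ... except that the last prediction here is off by 11 (one outlier error)
pred_outlier = [11.0, 11.0, 11.0, 11.0, 21.0]

print(mae(y_true, pred_clean), rmse(y_true, pred_clean))      # 1.0 1.0
print(mae(y_true, pred_outlier), rmse(y_true, pred_outlier))  # 3.0 5.0
```

One large error triples the MAE but multiplies the RMSE by five, which is why MAE is the more robust choice when outliers should not dominate and MSE/RMSE fit cases where large errors are especially costly.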