```{contents}
```

## Evaluation Metrics

SVR is used for predicting **continuous values**, so we rely on **regression metrics** (not classification ones like accuracy or precision). These metrics tell us how close the predicted values $\hat{y}_i$ are to the actual values $y_i$.

---

### Mean Absolute Error (MAE)

$$
MAE = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{y}_i|
$$

* Average of absolute errors.
* Easy to interpret since it’s in the **same units as the target**.
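* Penalizes each error in proportion to its size (linear penalty), so one large error counts no more than several small errors with the same total.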
* **Good when you care about robustness** and don’t want outliers to dominate.
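As a quick check of the formula, the following sketch computes MAE by hand with NumPy. The `y_true` and `y_pred` arrays are made-up values for illustration only, not results from any model.

```python
import numpy as np

# Hypothetical targets and predictions, chosen only for illustration.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# MAE: mean of the absolute residuals, expressed in the same units as y.
abs_errors = np.abs(y_true - y_pred)
mae = abs_errors.mean()

print(f"Absolute errors: {abs_errors}")
print(f"MAE: {mae:.3f}")
```

The absolute errors here are 0.5, 0, 1.5 and 1, so the MAE is 0.75 in the units of the target.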

---

### Mean Squared Error (MSE)

$$
MSE = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2
$$

* Penalizes **larger errors more heavily** (quadratic penalty).
* Useful when you want to **heavily discourage large mistakes**.
* But less interpretable because it’s in squared units.
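To make the quadratic penalty concrete, here is a small sketch comparing two hypothetical residual vectors with the same total absolute error. The helper names `mae` and `mse` are defined here for illustration and are not library functions.

```python
import numpy as np

# Two hypothetical residual vectors with the same total absolute error (4.0).
spread_out = np.array([1.0, 1.0, 1.0, 1.0])    # four small misses
concentrated = np.array([4.0, 0.0, 0.0, 0.0])  # one large miss

def mae(residuals):
    """Mean of absolute residuals (linear penalty)."""
    return np.mean(np.abs(residuals))

def mse(residuals):
    """Mean of squared residuals (quadratic penalty)."""
    return np.mean(residuals ** 2)

print(f"MAE: spread={mae(spread_out):.1f}, concentrated={mae(concentrated):.1f}")
print(f"MSE: spread={mse(spread_out):.1f}, concentrated={mse(concentrated):.1f}")
```

Both vectors have the same MAE, but the MSE of the concentrated case is four times larger, because the single miss of 4 contributes 16 to the sum of squares.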

---

### Root Mean Squared Error (RMSE)

$$
RMSE = \sqrt{MSE}
$$

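* The square root of MSE, which **brings the units back** to the original scale of the target.
* Sensitive to outliers, like MSE, because errors are squared before averaging.
* One of the most widely reported regression metrics in practice.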
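The sensitivity to outliers can be seen in a short sketch. The data below are invented for illustration, and `rmse` and `mae` are small helpers defined here rather than library calls.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the mean squared residual."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error, shown for comparison."""
    return np.mean(np.abs(y_true - y_pred))

# Hypothetical data: every prediction is off by exactly 1.
y_true = np.array([10.0, 12.0, 11.0, 13.0, 12.0])
y_pred = np.array([11.0, 11.0, 12.0, 12.0, 13.0])

# Same predictions, except the last one is now off by 10.
y_pred_outlier = y_pred.copy()
y_pred_outlier[-1] = 22.0

print(f"No outlier:   MAE={mae(y_true, y_pred):.2f}, RMSE={rmse(y_true, y_pred):.2f}")
print(f"With outlier: MAE={mae(y_true, y_pred_outlier):.2f}, "
      f"RMSE={rmse(y_true, y_pred_outlier):.2f}")
```

Without the outlier, both metrics equal 1. With it, MAE rises to 2.8 while RMSE rises to about 4.56, so a single bad prediction moves RMSE much further.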

---

### R-squared ($R^2$)

$$
R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}
$$

* Proportion of variance in $y$ explained by the model.
* $R^2 = 1$: perfect prediction.
* $R^2 = 0$: model no better than predicting the mean.
* $R^2 < 0$: model worse than mean prediction.

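⚠️ Limitation: $R^2$ does not account for model complexity. For a least-squares linear model scored on its training data, it never decreases as you add predictors, even if they don’t improve the model.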
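The three reference cases above ($R^2 = 1$, $R^2 = 0$, $R^2 < 0$) can be reproduced with a minimal sketch of the formula. The helper `r2` and the toy arrays are illustrative assumptions, not part of any library.

```python
import numpy as np

def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, following the formula above."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

perfect = y_true.copy()                         # exact predictions
mean_only = np.full_like(y_true, y_true.mean()) # always predict the mean
flipped = y_true[::-1]                          # trend in the wrong direction

print(f"Perfect:   {r2(y_true, perfect):.1f}")
print(f"Mean only: {r2(y_true, mean_only):.1f}")
print(f"Flipped:   {r2(y_true, flipped):.1f}")
```

For this toy data the three scores are 1, 0 and -3: the flipped predictions have a residual sum of squares of 40 against a total sum of squares of 10.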

---

### Adjusted R-squared ($R^2_{adj}$)

$$
R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}
$$

Where:

* $n$ = number of observations
* $p$ = number of predictors (features)

Properties:

* Penalizes useless features.
* If a new predictor improves the fit by more than chance alone would, $R^2_{adj}$ increases.
* If a predictor doesn’t help, $R^2_{adj}$ decreases.
* **More reliable than $R^2$** when you have multiple predictors.
* Requires $n > p + 1$; otherwise the denominator is zero or negative.
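A minimal sketch of the adjustment, assuming $R^2$ has already been computed. The function name `adjusted_r2` and the example numbers are hypothetical and serve only to show how the penalty grows with $p$.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("adjusted R^2 requires n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Same raw R^2 and sample size, different numbers of predictors.
few = adjusted_r2(0.90, n=30, p=3)
many = adjusted_r2(0.90, n=30, p=10)

print(f"p=3:  {few:.4f}")
print(f"p=10: {many:.4f}")
```

With $R^2 = 0.90$ and $n = 30$, the adjusted value is about 0.8885 for 3 predictors and about 0.8474 for 10, so the same raw fit is credited less when it uses more features.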

---

### Mean Absolute Percentage Error (MAPE)

$$
MAPE = \frac{100}{n} \sum_{i=1}^n \left|\frac{y_i - \hat{y}_i}{y_i}\right|
$$

* Error expressed as a **percentage** of actual values.
* Easy to communicate (e.g., “on average, predictions are off by 6%”).
* ⚠️ Breaks down when $y_i = 0$ (division by zero) and becomes unstable when $y_i$ is close to zero.
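The formula and its zero-value problem can be sketched as follows. The helper `mape` is a hand-written illustration that chooses to raise an error on zero targets; that behavior is an assumption of this sketch, not how every library handles it. Note also that scikit-learn's `mean_absolute_percentage_error` returns a fraction (for example 0.05) rather than a percentage.

```python
import numpy as np

def mape(y_true, y_pred):
    """MAPE in percent; refuses to run when an actual value is zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.any(y_true == 0):
        raise ValueError("MAPE is undefined when any actual value is zero")
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical values: relative errors of 10%, 5% and 0%.
score = mape([100, 200, 400], [110, 190, 400])
print(f"MAPE: {score:.2f}%")
```

The three relative errors average to 5%, which is the kind of figure that is easy to report to non-technical readers.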

---

### Explained Variance Score (EVS)

$$
EVS = 1 - \frac{Var(y - \hat{y})}{Var(y)}
$$

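* Similar to $R^2$, but it compares the variance of the residuals to the variance of $y$, so a constant bias in the predictions is not penalized.
* Equals $R^2$ when the mean residual is zero.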
* Higher is better.
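The difference between EVS and $R^2$ shows up when predictions carry a constant offset. The sketch below uses hand-written helpers (`explained_variance`, `r2`) and made-up data, all of which are illustrative assumptions.

```python
import numpy as np

def explained_variance(y_true, y_pred):
    """EVS = 1 - Var(residuals) / Var(y), following the formula above."""
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

def r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_biased = y_true + 2.0  # correct shape, but every prediction is 2 too high

evs = explained_variance(y_true, y_biased)
r2_biased = r2(y_true, y_biased)

print(f"EVS: {evs:.1f}")
print(f"R^2: {r2_biased:.1f}")
```

Because the residuals are all equal, their variance is zero and EVS is 1, while $R^2$ drops to -1. A large gap between the two scores is therefore a hint that the model is systematically biased.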

---

### Which Metrics to Use

* **General evaluation:** RMSE, MAE, $R^2$.
* **When large errors are especially costly:** RMSE (heavier penalty).
* **When robustness to outliers matters:** MAE.
* **When explaining variance:** $R^2$ and Adjusted $R^2$.
* **When business users need % error:** MAPE.

---

### Example in Python


```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Sample dataset (unseeded, so the exact metric values vary between runs)
X = np.random.rand(100, 3)  # 3 features
y = 3*X[:,0] - 2*X[:,1] + X[:,2] + np.random.randn(100) * 0.1

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train SVR
svr = SVR(kernel='rbf', C=100, gamma=0.1, epsilon=0.1)
svr.fit(X_train, y_train)
y_pred = svr.predict(X_test)

# Metrics
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)

# Adjusted R^2 (n = test observations, p = number of features)
n, p = X_test.shape
adj_r2 = 1 - ((1 - r2) * (n - 1)) / (n - p - 1)

print(f"MAE: {mae:.3f}")
print(f"MSE: {mse:.3f}")
print(f"RMSE: {rmse:.3f}")
print(f"R^2: {r2:.3f}")
print(f"Adjusted R^2: {adj_r2:.3f}")
```

Output from one run:

```
MAE: 0.084
MSE: 0.009
RMSE: 0.095
R^2: 0.994
Adjusted R^2: 0.993
```




---

**Summary Table**

| Metric         | Formula                                                    | Strength                   | Weakness                                |
| -------------- | ---------------------------------------------------------- | -------------------------- | --------------------------------------- |
| MAE            | $\frac{1}{n}\sum \lvert y-\hat{y} \rvert$                  | Easy to interpret, robust  | Ignores error direction                 |
| MSE            | $\frac{1}{n}\sum (y-\hat{y})^2$                            | Penalizes large errors     | Units are squared                       |
| RMSE           | $\sqrt{MSE}$                                               | Same scale as target       | Sensitive to outliers                   |
| $R^2$          | $1 - \frac{SS_{res}}{SS_{tot}}$                            | Variance explained         | Does not penalize added features        |
| Adjusted $R^2$ | $1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$                   | Detects useless predictors | Only meaningful for multiple predictors |
| MAPE           | $\frac{100}{n}\sum \left\lvert \frac{y-\hat{y}}{y} \right\rvert$ | Easy to explain      | Undefined if $y=0$                      |
| EVS            | $1 - \frac{Var(y - \hat{y})}{Var(y)}$                      | Intuitive                  | Similar to $R^2$, less common           |

