## Regression

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split

np.random.seed(42)
X = np.random.rand(100, 1) * 10   # 100 data points between 0 and 10
y = 2 * X + 5 + np.random.randn(100, 1) * 2  # Linear relationship with noise

# Splitting data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [None]:
from sklearn.linear_model import LinearRegression

# Create a linear regression model
model = LinearRegression()

# Fit the model to the training data
model.fit(X_train, y_train)

# Make predictions on the test data
y_pred = model.predict(X_test)

In [None]:


import pandas as pd

# y_test and y_pred both have shape (n, 1); flatten them into 1-D columns
df_reg = pd.DataFrame({'y_test': y_test.ravel(), 'y_pred': y_pred.ravel()})

df_reg

| | y_test | y_pred |
|---|---|---|
| 0 | 7.584274 | 6.505975 |
| 1 | 21.889596 | 22.464091 |
| 2 | 17.970994 | 20.110837 |
| 3 | 16.431671 | 18.004462 |
| 4 | 11.757664 | 10.253702 |
| 5 | 14.406145 | 13.735564 |
| 6 | 10.85312 | 10.894214 |
| 7 | 25.993618 | 21.855077 |
| 8 | 7.242494 | 5.680993 |
| 9 | 12.664897 | 12.475984 |


**1. Mean Absolute Error (MAE)**

$$
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
$$

---
---

**When to Use:**

**Robust to Outliers:** MAE is less sensitive to outliers than MSE. Each error contributes in proportion to its magnitude, so MAE is a good choice when errors should be penalized linearly rather than quadratically.

**Interpretability:** MAE is expressed in the same units as the target variable, which often makes it easier to interpret in real-world contexts.
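To see the difference in outlier sensitivity concretely, the sketch below compares MAE with a squared-error average on two sets of residuals. The residual values and the helper names `mae_from_errors` and `mse_from_errors` are made up for illustration and are not part of this notebook's data.

```python
import numpy as np

# Hypothetical residuals (y_true - y_pred), invented for illustration only
errors_clean = np.array([1.0, -1.0, 2.0, -2.0])
errors_outlier = np.array([1.0, -1.0, 2.0, -10.0])  # last residual is an outlier

def mae_from_errors(errors):
    # Mean of absolute residuals: each residual counts in proportion to its size
    return np.mean(np.abs(errors))

def mse_from_errors(errors):
    # Mean of squared residuals: large residuals dominate the average
    return np.mean(errors ** 2)

mae_clean, mae_out = mae_from_errors(errors_clean), mae_from_errors(errors_outlier)
mse_clean, mse_out = mse_from_errors(errors_clean), mse_from_errors(errors_outlier)

print(f"MAE: {mae_clean:.2f} -> {mae_out:.2f}")
print(f"MSE: {mse_clean:.2f} -> {mse_out:.2f}")
```

With these made-up numbers, the single outlier raises MAE from 1.5 to 3.5, while the squared-error average jumps from 2.5 to 26.5.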


In [None]:
# Evaluation Metrics

def mean_absolute_error(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

In [None]:
mae = mean_absolute_error(df_reg.y_test, df_reg.y_pred)
print(f"MAE: {mae:.4f}")

MAE: 1.1827


**2. Mean Squared Error (MSE)**


$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

---
---
**When to Use:**

**Sensitive to Outliers:** MSE squares the residuals, so it penalizes large errors far more than small ones and is more sensitive to outliers. This is useful when large errors are especially costly.

**Mathematical Simplicity:** MSE is differentiable everywhere, including at zero error, which makes it a natural choice for gradient-based optimization.
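As a short worked step, differentiating the MSE formula above with respect to a single prediction $\hat{y}_i$ gives a gradient that is linear in the residual:

$$
\frac{\partial\, \text{MSE}}{\partial \hat{y}_i} = -\frac{2}{n} (y_i - \hat{y}_i)
$$

For comparison, the same derivative for MAE is $-\frac{1}{n}\,\text{sign}(y_i - \hat{y}_i)$, which is not defined when $y_i = \hat{y}_i$.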

In [None]:
def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

In [None]:
mse = mean_squared_error(df_reg.y_test, df_reg.y_pred)
print(f"MSE: {mse:.4f}")

MSE: 2.6148


**3. Root Mean Squared Error (RMSE)**

$$
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
$$

---
---
**When to Use:**

**Same Units as the Target:** RMSE is expressed in the same units as the target variable, which makes it easier to interpret than MSE. It is useful when you want a typical error magnitude that still weights large errors more heavily.

**Sensitive to Large Errors:** Like MSE, RMSE gives more weight to larger errors because the residuals are squared before averaging.
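The sketch below illustrates how RMSE and MAE diverge when the same total error is concentrated in one point. The residual arrays and the helper names `demo_mae` and `demo_rmse` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def demo_mae(errors):
    return np.mean(np.abs(errors))

def demo_rmse(errors):
    return np.sqrt(np.mean(errors ** 2))

# Two hypothetical residual sets with the same total absolute error (8.0)
even_errors = np.array([2.0, 2.0, 2.0, 2.0])    # error spread evenly
uneven_errors = np.array([0.0, 0.0, 0.0, 8.0])  # error concentrated in one point

for name, errs in [("even", even_errors), ("uneven", uneven_errors)]:
    print(f"{name}: MAE={demo_mae(errs):.2f}, RMSE={demo_rmse(errs):.2f}")
```

Both sets have an MAE of 2.0, but RMSE is 2.0 for the even set and 4.0 for the uneven one. RMSE is never smaller than MAE, and the gap grows as the errors become more unequal.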



In [None]:
def root_mean_squared_error(y_true, y_pred):
    return np.sqrt(mean_squared_error(y_true, y_pred))

In [None]:
rmse = root_mean_squared_error(df_reg.y_test, df_reg.y_pred)
print(f"RMSE: {rmse:.4f}")

RMSE: 1.6170


**4. Mean Absolute Percentage Error (MAPE)**

$$
\text{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100
$$


---
---

**When to Use:**

**Percentage Error:** MAPE expresses errors as percentages, which can be easier to interpret and compare across different datasets or models.

**Scaling:** It’s useful when you need a scale-independent metric, but it can be misleading if the target values are close to zero, as it can result in very large percentage errors.
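The near-zero problem is easy to reproduce. In the sketch below, every prediction is off by exactly 1.0, yet replacing one target with a value close to zero inflates MAPE by several orders of magnitude. The arrays and the helper `mape_demo` are hypothetical and unrelated to this notebook's data.

```python
import numpy as np

def mape_demo(y_true, y_pred):
    # Plain MAPE without any zero guard, to expose the raw behavior
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Hypothetical targets; every prediction is off by the same absolute amount (1.0)
y_true_large = np.array([100.0, 200.0, 300.0])
y_pred_large = y_true_large + 1.0

y_true_small = np.array([0.01, 200.0, 300.0])  # first target is close to zero
y_pred_small = y_true_small + 1.0

mape_large = mape_demo(y_true_large, y_pred_large)
mape_small = mape_demo(y_true_small, y_pred_small)

print(f"MAPE, targets far from zero: {mape_large:.2f}%")
print(f"MAPE, one target near zero:  {mape_small:.2f}%")
```

The absolute errors are identical in both cases, so the difference comes entirely from dividing by a tiny target value.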

In [None]:
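def mean_absolute_percentage_error(y_true, y_pred):
    # Guard against division by zero by clipping the denominator's magnitude
    # to a small positive value (adding epsilon to y_true would still fail
    # for a target equal to -epsilon)
    epsilon = 1e-10
    denominator = np.maximum(np.abs(y_true), epsilon)
    error = np.abs(y_true - y_pred) / denominator
    return np.mean(error) * 100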


mape_value = mean_absolute_percentage_error(df_reg.y_test, df_reg.y_pred)
print(f"MAPE: {mape_value:.2f}%")

MAPE: 8.19%


**5. R-squared (R²)**


$$
R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}
$$

---
---

**When to Use:**

**Goodness of Fit:** R² measures the proportion of variance in the target variable explained by the model. A higher R² indicates a better fit of the model to the data.

**Model Comparison:** R² is often used to compare models, but it does not give a complete picture: for ordinary least squares, R² on the training data never decreases when predictors are added, even if those predictors are irrelevant.
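The following sketch shows the effect of an irrelevant predictor on training R². It fits ordinary least squares with `np.linalg.lstsq` on synthetic data; the data, the seed, and the helper `train_r2` are assumptions made for this illustration, not part of the model fitted earlier.

```python
import numpy as np

def train_r2(X, y):
    # Ordinary least squares with an intercept, scored on the same data it was fit on
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    return 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
x_signal = rng.uniform(0, 10, size=(50, 1))                  # genuinely related to y
y_demo = 2 * x_signal[:, 0] + 5 + rng.normal(0, 2, size=50)
x_noise = rng.normal(size=(50, 1))                           # unrelated to y

r2_one = train_r2(x_signal, y_demo)
r2_two = train_r2(np.hstack([x_signal, x_noise]), y_demo)

print(f"R2 with 1 predictor:          {r2_one:.6f}")
print(f"R2 with an extra noise column: {r2_two:.6f}")
```

The second value is at least as large as the first even though the added column carries no information about the target, which is why training R² alone is a poor guide for choosing between models of different size.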



In [None]:
def r_squared(y_true, y_pred):
    ss_total   = np.sum((y_true - np.mean(y_true)) ** 2)
    ss_residual = np.sum((y_true - y_pred) ** 2)
    return 1 - (ss_residual / ss_total)


r2 = r_squared(df_reg.y_test, df_reg.y_pred)
print(f"R2: {r2:.4f}")

R2: 0.9287


**6. Adjusted R-squared**

$$
\text{Adjusted } R^2 = 1 - \left(1 - R^2\right) \frac{n - 1}{n - p - 1}
$$

where $n$ is the number of data points and $p$ is the number of predictors.

**When to Use:**

**Multiple Regression:** Adjusted R² is particularly useful when the regression model has multiple predictors. Unlike R², it penalizes every additional predictor, so it rises only when a new predictor improves the fit enough to offset that penalty. This makes it a fairer measure when comparing models with different numbers of predictors.
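To make the penalty visible, the sketch below holds R² fixed and varies only the number of predictors. The values `r2_fixed = 0.90` and `n_obs = 20` and the helper `adjusted_r2_from_r2` are illustrative assumptions.

```python
def adjusted_r2_from_r2(r2, n, p):
    # Same formula as above, taking a precomputed R² as input
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2_fixed, n_obs = 0.90, 20  # hypothetical values for illustration

adjusted_values = {p: adjusted_r2_from_r2(r2_fixed, n_obs, p) for p in (1, 2, 5, 10)}
for p, value in adjusted_values.items():
    print(f"p={p:2d}: adjusted R2 = {value:.4f}")
```

With R² held at 0.90, the adjusted value falls from about 0.8944 at one predictor to about 0.7889 at ten, so extra predictors must earn their place by raising R².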



In [None]:
def adjusted_r2(y_true, y_pred, n, p):
    # n: number of observations, p: number of predictors in the model
    ss_total    = np.sum((y_true - np.mean(y_true)) ** 2)  # Total sum of squares
    ss_residual = np.sum((y_true - y_pred) ** 2)           # Residual sum of squares

    r2 = 1 - (ss_residual / ss_total)                      # R-squared
    # Shrink R-squared according to the degrees of freedom used by the p predictors
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

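# p is the number of predictors. The model above was fit on a single feature,
# so p=1 would match it; p=2 is the value that produced the output shown below.
adjusted_r2_value = adjusted_r2(df_reg.y_test, df_reg.y_pred, len(df_reg), 2)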
print(f"Adjusted R-squared: {adjusted_r2_value:.4f}")

Adjusted R-squared: 0.9203


**7. Huber Loss**

$$
L_\delta(y, \hat{y}) =
\begin{cases}
\frac{1}{2}(y - \hat{y})^2 & \text{for } |y - \hat{y}| \leq \delta \\
\delta |y - \hat{y}| - \frac{1}{2} \delta^2 & \text{otherwise}
\end{cases}
$$

where $\delta$ is the threshold at which the loss switches from quadratic to linear.

---
---


**When to Use:**

**Robust to Outliers:** Huber loss is a combination of MSE and MAE and is less sensitive to outliers compared to MSE. It behaves like MSE when the error is small and like MAE when the error is large.

**Ideal for Regression with Outliers:** If the dataset has significant outliers, Huber loss can be a good choice as it doesn’t penalize large errors as heavily as MSE.
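The piecewise behavior can be checked on a few individual errors. The sketch below evaluates the formula above point by point with $\delta = 1$ and compares it with the purely quadratic loss $\frac{1}{2}e^2$; the error values and the helper `huber_pointwise` are chosen for illustration.

```python
import numpy as np

def huber_pointwise(error, delta=1.0):
    # Quadratic inside the threshold, linear outside it
    abs_err = np.abs(error)
    return np.where(abs_err <= delta,
                    0.5 * abs_err ** 2,
                    delta * (abs_err - 0.5 * delta))

demo_errors = np.array([0.5, 1.0, 3.0, 10.0])  # hypothetical residuals
huber_values = huber_pointwise(demo_errors, delta=1.0)
half_squared = 0.5 * demo_errors ** 2

for e, h, s in zip(demo_errors, huber_values, half_squared):
    print(f"error={e:5.1f}  huber={h:6.3f}  half-squared={s:7.3f}")
```

For errors up to 1.0 the two losses agree (0.125 and 0.5). Beyond the threshold, Huber grows linearly (2.5 and 9.5) while the quadratic loss reaches 4.5 and 50.0.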

In [None]:
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):

    error = np.abs(y_true - y_pred)
    loss = np.where(error <= delta, 0.5 * error**2, delta * (error - 0.5 * delta))
    return np.mean(loss)


In [None]:
huber_loss_value = huber_loss(df_reg.y_test, df_reg.y_pred, delta=1.0)
print(f"Huber Loss: {huber_loss_value:.4f}")

Huber Loss: 0.7797


Other regression metrics include:

- Explained Variance Score
- Quantile Loss
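For reference, here is a minimal sketch of both metrics using their common definitions: explained variance as $1 - \text{Var}(y - \hat{y}) / \text{Var}(y)$, and quantile (pinball) loss as the mean of $\max(q\,e,\ (q - 1)\,e)$ with $e = y - \hat{y}$. The function names and sample arrays are assumptions made for this example.

```python
import numpy as np

def explained_variance(y_true, y_pred):
    # Fraction of the target's variance not left in the residuals.
    # Unlike R², a constant bias in the predictions does not lower it.
    return 1 - np.var(y_true - y_pred) / np.var(y_true)

def quantile_loss(y_true, y_pred, q=0.5):
    # Pinball loss: under-predictions are weighted by q, over-predictions by (1 - q)
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# Hypothetical values for illustration
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.0, 5.0, 8.0, 12.0])

ev = explained_variance(y_true_demo, y_pred_demo)
ql_median = quantile_loss(y_true_demo, y_pred_demo, q=0.5)
ql_high = quantile_loss(y_true_demo, y_pred_demo, q=0.9)

print(f"Explained variance:    {ev:.4f}")
print(f"Quantile loss (q=0.5): {ql_median:.4f}")
print(f"Quantile loss (q=0.9): {ql_high:.4f}")
```

At `q=0.5` the quantile loss equals half the MAE (0.625 here, against an MAE of 1.25). Other values of `q` penalize under- and over-prediction asymmetrically.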