# Regression Metrics (MAE, RMSE, R²)

This notebook demonstrates how to calculate and interpret common regression metrics: **Mean Absolute Error (MAE)**, **Root Mean Squared Error (RMSE)**, and **R² (Coefficient of Determination)**. We'll fit a linear regression to one of scikit-learn's built-in datasets and evaluate it on a held-out test set.

## 1. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

## 2. Load Example Dataset
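We'll use the diabetes dataset bundled with scikit-learn (442 samples, 10 numeric features; the target is a quantitative measure of disease progression one year after baseline) and hold out 20% of the rows as a test set.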

In [None]:
from sklearn.datasets import load_diabetes

data = load_diabetes()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target, name='target')
# Hold out 20% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print('Train shape:', X_train.shape)
print('Test shape:', X_test.shape)

## 3. Train a Linear Regression Model

In [None]:
reg = LinearRegression()
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

## 4. Mean Absolute Error (MAE)
**MAE** is the average absolute difference between actual and predicted values. It ignores the direction of each error and is expressed in the same units as the target.
\[
MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
\]


In [None]:
mae = mean_absolute_error(y_test, y_pred)
print(f'Mean Absolute Error (MAE): {mae:.2f}')
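
To make the formula concrete, here is a minimal sketch that computes MAE by hand with NumPy. The arrays `y_true_toy` and `y_pred_toy` are made-up values for illustration only, not taken from the diabetes dataset.

```python
import numpy as np

# Toy values chosen for illustration only (not from the diabetes dataset).
y_true_toy = np.array([3.0, 5.0, 2.0, 7.0])
y_pred_toy = np.array([2.5, 5.0, 4.0, 8.0])

# Absolute errors |y_i - y_hat_i|: 0.5, 0.0, 2.0, 1.0
abs_errors = np.abs(y_true_toy - y_pred_toy)

# MAE is the mean of the absolute errors: 3.5 / 4
mae_manual = abs_errors.mean()
print(f'Manual MAE: {mae_manual:.3f}')  # Manual MAE: 0.875
```

Passing the same two arrays to `mean_absolute_error` should give the same number.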

## 5. Root Mean Squared Error (RMSE)
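**RMSE** is the square root of the average of squared differences between actual and predicted values. Because errors are squared before averaging, large errors contribute disproportionately; like MAE, the result is in the same units as the target.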
\[
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
\]


In [None]:
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f'Root Mean Squared Error (RMSE): {rmse:.2f}')
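
The sketch below illustrates how RMSE reacts to a single large error while MAE does not. The helper names `manual_mae` and `manual_rmse` and the toy arrays are assumptions introduced for this example; both prediction sets have the same total absolute error.

```python
import numpy as np

def manual_mae(y_true, y_pred):
    """Mean of absolute errors."""
    return np.mean(np.abs(y_true - y_pred))

def manual_rmse(y_true, y_pred):
    """Square root of the mean of squared errors."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Toy values for illustration only.
y_true_toy = np.array([10.0, 10.0, 10.0, 10.0])
even_pred = np.array([8.0, 12.0, 8.0, 12.0])        # four errors of size 2
outlier_pred = np.array([10.0, 10.0, 10.0, 18.0])   # one error of size 8

for label, pred in [('even errors', even_pred), ('one outlier', outlier_pred)]:
    print(f'{label}: MAE={manual_mae(y_true_toy, pred):.2f}, '
          f'RMSE={manual_rmse(y_true_toy, pred):.2f}')
# even errors: MAE=2.00, RMSE=2.00
# one outlier: MAE=2.00, RMSE=4.00
```

A gap between RMSE and MAE on real data is therefore a hint that a few predictions are far off, rather than all predictions being slightly off.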

## 6. R² Score (Coefficient of Determination)
The **R² score** measures the proportion of the variance in the target that the model's predictions account for.
\[
R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}
\]
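Here \(\bar{y}\) is the mean of the observed values. R² is 1 for a perfect fit and 0 for a model that does no better than always predicting \(\bar{y}\). It has no lower bound: on held-out data it becomes negative when the model does worse than that constant baseline.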

In [None]:
r2 = r2_score(y_test, y_pred)
print(f'R² Score: {r2:.2f}')
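
As a small sketch of the formula, the code below computes R² by hand for three sets of predictions: a perfect fit, a constant prediction at the mean, and a deliberately bad prediction. The helper `manual_r2` and the toy arrays are assumptions made for this example.

```python
import numpy as np

def manual_r2(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, following the formula above."""
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot

# Toy values for illustration only.
y_true_toy = np.array([1.0, 2.0, 3.0, 4.0])
perfect_pred = y_true_toy.copy()
mean_pred = np.full_like(y_true_toy, y_true_toy.mean())  # always predict 2.5
reversed_pred = y_true_toy[::-1].copy()                  # predictions in reverse order

print(manual_r2(y_true_toy, perfect_pred))   # 1.0
print(manual_r2(y_true_toy, mean_pred))      # 0.0
print(manual_r2(y_true_toy, reversed_pred))  # -3.0
```

The last case shows that R² drops below 0 as soon as the squared error of the predictions exceeds that of the mean baseline.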

## 7. Visualizing Predictions vs. True Values

In [None]:
plt.figure(figsize=(6,4))
plt.scatter(y_test, y_pred, alpha=0.7)
plt.xlabel('True Values')
plt.ylabel('Predicted Values')
plt.title('Regression: True vs. Predicted')
# Dashed diagonal marks perfect predictions (predicted == true)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
plt.show()

## 8. Summary
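- **MAE**: Average magnitude of error in the target's units; easy to interpret and less affected by outliers than RMSE.
- **RMSE**: Also in the target's units, but squaring penalizes large errors more, so it is sensitive to outliers.
- **R²**: Proportion of variance explained by the model; 1 is a perfect fit, 0 matches a mean-only baseline, and negative values are possible on held-out data.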
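
These metrics are easier to judge next to a trivial baseline. The following sketch, which is an addition and not part of the original analysis, repeats the same split and compares the linear model with scikit-learn's `DummyRegressor`, which always predicts the training mean. The variable names (`X_tr`, `comparison`, and so on) are chosen for this example so they do not overwrite the notebook's own variables.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same data and split settings as in the notebook above.
X_all, y_all = load_diabetes(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.2, random_state=42
)

estimators = {
    'mean baseline': DummyRegressor(strategy='mean'),  # ignores the features
    'linear regression': LinearRegression(),
}

comparison = {}
for name, est in estimators.items():
    pred = est.fit(X_tr, y_tr).predict(X_te)
    comparison[name] = {
        'mae': mean_absolute_error(y_te, pred),
        'rmse': np.sqrt(mean_squared_error(y_te, pred)),
        'r2': r2_score(y_te, pred),
    }

for name, scores in comparison.items():
    print(f"{name:>18}: MAE={scores['mae']:.2f}  "
          f"RMSE={scores['rmse']:.2f}  R²={scores['r2']:.2f}")
```

The baseline's R² on the test set cannot exceed 0, since a constant prediction never beats the test mean, so any improvement in the model's row is attributable to the features.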