# Task 24: Evaluation Techniques for Regression Models
Submitted by: Awais Anwer

Evaluation metrics are essential for assessing how well a regression model performs. Mean Absolute Error (MAE) measures the average magnitude of the errors. Mean Squared Error (MSE) averages the squared errors, so large errors are penalized more heavily. Root Mean Squared Error (RMSE) is the square root of MSE, which puts the error back in the same unit as the target variable. R-squared (R²) indicates the proportion of variance in the target explained by the model, while Adjusted R-squared additionally penalizes the number of predictors. Mean Absolute Percentage Error (MAPE) expresses the average error as a percentage of the true value; it is undefined when a true value is zero. Median Absolute Error is the median of the absolute errors, which makes it less sensitive to outliers than MAE. Together, these metrics help in comparing and refining regression models.
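To make the definitions concrete, here is a minimal sketch that computes each metric by hand with NumPy. The four target/prediction pairs and the choice of `p = 1` predictor are made-up illustrative values, not taken from the housing data used in the cells that follow.

```python
import numpy as np

# Toy values chosen only for illustration (assumption, not real data).
y_true = np.array([3.0, 5.0, 2.0, 8.0])
y_pred = np.array([2.5, 5.0, 3.0, 6.0])

errors = y_true - y_pred
abs_errors = np.abs(errors)

mae = abs_errors.mean()                     # average absolute error
mse = (errors ** 2).mean()                  # average squared error
rmse = np.sqrt(mse)                         # back in the target's unit
median_ae = np.median(abs_errors)           # median absolute error
mape = (abs_errors / np.abs(y_true)).mean() * 100  # percent; needs nonzero targets

# R² = 1 - (residual sum of squares) / (total sum of squares)
ss_res = (errors ** 2).sum()
ss_tot = ((y_true - y_true.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot

# Adjusted R² for n samples and p predictors (p = 1 is assumed here).
n, p = len(y_true), 1
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(mae, mse, rmse, median_ae, mape, r2, adjusted_r2)
```

On these toy values the single large error (2.0) pulls MAE above the median absolute error and inflates RMSE further, which is the outlier sensitivity described above.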


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score, median_absolute_error

In [2]:
# Load the dataset
housing = fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
Y = pd.Series(housing.target)

In [3]:
X.head()

,MedInc,HouseAge,AveRooms,AveBedrms,Population,AveOccup,Latitude,Longitude
0,8.3252,41.0,6.984127,1.02381,322.0,2.555556,37.88,-122.23
1,8.3014,21.0,6.238137,0.97188,2401.0,2.109842,37.86,-122.22
2,7.2574,52.0,8.288136,1.073446,496.0,2.80226,37.85,-122.24
3,5.6431,52.0,5.817352,1.073059,558.0,2.547945,37.85,-122.25
4,3.8462,52.0,6.281853,1.081081,565.0,2.181467,37.85,-122.25


In [4]:
X.shape

(20640, 8)

In [5]:
X.describe()

,MedInc,HouseAge,AveRooms,AveBedrms,Population,AveOccup,Latitude,Longitude
count,20640.0,20640.0,20640.0,20640.0,20640.0,20640.0,20640.0,20640.0
mean,3.870671,28.639486,5.429,1.096675,1425.476744,3.070655,35.631861,-119.569704
std,1.899822,12.585558,2.474173,0.473911,1132.462122,10.38605,2.135952,2.003532
min,0.4999,1.0,0.846154,0.333333,3.0,0.692308,32.54,-124.35
25%,2.5634,18.0,4.440716,1.006079,787.0,2.429741,33.93,-121.8
50%,3.5348,29.0,5.229129,1.04878,1166.0,2.818116,34.26,-118.49
75%,4.74325,37.0,6.052381,1.099526,1725.0,3.282261,37.71,-118.01
max,15.0001,52.0,141.909091,34.066667,35682.0,1243.333333,41.95,-114.31


In [6]:
# Hold out 20% of the rows as a test set; random_state makes the split reproducible
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

In [7]:
# Fit a random forest with 100 trees on the training set
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)
rf_regressor.fit(X_train, Y_train)

In [8]:
# Predict on the held-out test set
Y_pred = rf_regressor.predict(X_test)

In [9]:
# Evaluation metrics on the held-out test set
mae = mean_absolute_error(Y_test, Y_pred)
mse = mean_squared_error(Y_test, Y_pred)
rmse = np.sqrt(mse)  # same unit as the target
r2 = r2_score(Y_test, Y_pred)
n = X_test.shape[0]  # number of test samples
p = X_test.shape[1]  # number of predictors
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
# MAPE in percent; assumes no true value in Y_test is zero
mape = np.mean(np.abs((Y_test - Y_pred) / Y_test)) * 100
median_ae = median_absolute_error(Y_test, Y_pred)

print(f'Mean Absolute Error (MAE): {mae}')
print(f'Mean Squared Error (MSE): {mse}')
print(f'Root Mean Squared Error (RMSE): {rmse}')
print(f'R-squared (R²): {r2}')
print(f'Adjusted R-squared: {adjusted_r2}')
print(f'Mean Absolute Percentage Error (MAPE): {mape}%')
print(f'Median Absolute Error: {median_ae}')

Mean Absolute Error (MAE): 0.32754256845930246
Mean Squared Error (MSE): 0.2553684927247781
Root Mean Squared Error (RMSE): 0.5053399773665033
R-squared (R²): 0.8051230593157366
Adjusted R-squared: 0.8047445656217638
Mean Absolute Percentage Error (MAPE): 18.91511073211086%
Median Absolute Error: 0.2010199999999998
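
The metric calculations in cell 9 could be wrapped in a helper so that several models can be compared with the same code. The sketch below is one possible way to do that: the function name `regression_report`, its dictionary keys, and the choice to return `nan` when a metric is undefined are all assumptions made for illustration, and the sample values are made up.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    median_absolute_error,
    r2_score,
)


def regression_report(y_true, y_pred, n_features):
    """Return the metrics from cell 9 as a dict (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.shape[0]

    mse = mean_squared_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)

    # Adjusted R² needs n - p - 1 > 0; otherwise report nan (assumed convention).
    dof = n - n_features - 1
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / dof if dof > 0 else float("nan")

    # MAPE is undefined if any true value is zero; report nan in that case.
    if np.any(y_true == 0):
        mape = float("nan")
    else:
        mape = float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "r2": r2,
        "adjusted_r2": adjusted_r2,
        "mape_percent": mape,
        "median_ae": median_absolute_error(y_true, y_pred),
    }


# Usage with made-up values; with the notebook's variables the call would be
# regression_report(Y_test, Y_pred, n_features=X_test.shape[1]).
report = regression_report([3.0, 5.0, 2.0, 8.0], [2.5, 5.0, 3.0, 6.0], n_features=1)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Returning a dictionary keeps the results easy to collect into a `pandas.DataFrame` when comparing several models side by side.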
