## Evaluation Metrics for Regression Models

In [11]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score, median_absolute_error

data = {
    'SquareFeet': [1500, 1800, 2000, 2200, 2500, 2700, 3000, 3200, 3500, 4000],
    'NumBedrooms': [3, 4, 3, 5, 4, 5, 4, 6, 5, 6],
    'Price': [300000, 350000, 400000, 450000, 500000, 550000, 600000, 650000, 700000, 750000]
}
df = pd.DataFrame(data)

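# Features (predictors) and target
X = df[['SquareFeet', 'NumBedrooms']]
Y = df['Price']

# Fit an ordinary least squares model
model = LinearRegression()
model.fit(X, Y)

# Predict on the training data, so every metric below is an in-sample score
Y_pred = model.predict(X)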



### Mean Absolute Error (MAE)

**Definition**: The average of the absolute errors between predicted and actual values.

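**Interpretation**: The typical size of a prediction error, in the same units as the target variable (dollars here). Every error contributes in proportion to its magnitude, so large misses receive no extra weight.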

In [4]:
mae = mean_absolute_error(Y, Y_pred)
mae

11612.613421974675
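
As a cross-check on the definition, MAE can be computed by hand with NumPy and compared with `mean_absolute_error`. The four prices below are hypothetical values chosen for easy arithmetic; they are not taken from the dataset above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical actual and predicted prices (illustrative only)
y_true = np.array([300000, 350000, 400000, 450000])
y_hat = np.array([310000, 340000, 420000, 450000])

# MAE = mean(|y_i - y_hat_i|)
abs_errors = np.abs(y_true - y_hat)   # [10000, 10000, 20000, 0]
mae_manual = abs_errors.mean()        # 40000 / 4 = 10000.0

# scikit-learn should agree with the manual calculation
mae_sklearn = mean_absolute_error(y_true, y_hat)
print(mae_manual, mae_sklearn)
```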

### Mean Squared Error (MSE)

**Definition**: The average of the squared differences between predicted and actual values.

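**Interpretation**: Penalizes larger errors more than smaller ones because each error is squared. The result is in squared units of the target (dollars squared here), which makes the raw value hard to read directly.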



In [6]:
mse = mean_squared_error(Y, Y_pred)
mse

214446141.40688127
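
To make the effect of squaring concrete, the sketch below compares two hypothetical error vectors that have the same MAE but very different MSE. The helper names `mae_of` and `mse_of` are invented for this illustration.

```python
import numpy as np

# Two hypothetical error vectors with the same total absolute error (4000)
spread_errors = np.array([1000.0, 1000.0, 1000.0, 1000.0])   # evenly spread
single_big_error = np.array([0.0, 0.0, 0.0, 4000.0])         # one large miss

def mae_of(errors):
    """Mean of the absolute errors."""
    return np.mean(np.abs(errors))

def mse_of(errors):
    """Mean of the squared errors."""
    return np.mean(errors ** 2)

# MAE is identical for both vectors
print(mae_of(spread_errors), mae_of(single_big_error))   # 1000.0 1000.0

# MSE is four times larger when the error sits in a single prediction
print(mse_of(spread_errors), mse_of(single_big_error))   # 1000000.0 4000000.0
```

Squaring is what makes MSE react so strongly to a handful of bad predictions.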

### Root Mean Squared Error (RMSE)

**Definition**: The square root of the average of the squared differences between predicted and actual values.

**Interpretation**: Provides error estimates in the same units as the target variable, which is often useful for interpretation.


In [7]:
rmse = np.sqrt(mse)
rmse

14643.979698390778
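
One way to see that RMSE shares the target's units is to rescale the target and watch how each metric responds. This is a minimal sketch on hypothetical prices; `mse_of` and `rmse_of` are illustrative helpers, not scikit-learn functions.

```python
import numpy as np

def mse_of(y_true, y_hat):
    """Mean of the squared differences."""
    return np.mean((y_true - y_hat) ** 2)

def rmse_of(y_true, y_hat):
    """Square root of the MSE."""
    return np.sqrt(mse_of(y_true, y_hat))

# Hypothetical prices in dollars
y_dollars = np.array([300000.0, 350000.0, 400000.0, 450000.0])
y_hat_dollars = np.array([310000.0, 340000.0, 420000.0, 450000.0])

# The same data expressed in thousands of dollars
y_k = y_dollars / 1000
y_hat_k = y_hat_dollars / 1000

# RMSE rescales by 1000, like the target; MSE rescales by 1000**2
rmse_ratio = rmse_of(y_dollars, y_hat_dollars) / rmse_of(y_k, y_hat_k)
mse_ratio = mse_of(y_dollars, y_hat_dollars) / mse_of(y_k, y_hat_k)
print(rmse_ratio, mse_ratio)
```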

### R-squared (R²)

**Definition**: The proportion of the variance in the dependent variable that is predictable from the independent variables.

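**Interpretation**: Indicates how well the model explains the variability of the target variable. For a least-squares model with an intercept, scored on its own training data, values lie between 0 and 1. On other data, R² can be negative when the model predicts worse than simply using the mean of the target.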


In [8]:
r2 = r2_score(Y, Y_pred)
r2

0.989602611325727
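
The definition can be written as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it by hand on hypothetical values and also shows a deliberately poor set of predictions producing a negative score. `r2_manual` is an illustrative helper.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual prices and two sets of predictions (illustrative only)
y_true = np.array([300000.0, 350000.0, 400000.0, 450000.0])
y_good = np.array([310000.0, 340000.0, 420000.0, 450000.0])
y_bad = np.full(4, 100000.0)   # constant guess far from every actual price

def r2_manual(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)        # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # variation around the mean
    return 1 - ss_res / ss_tot

r2_good = r2_manual(y_true, y_good)   # 1 - 6e8 / 1.25e10 = 0.952
r2_bad = r2_manual(y_true, y_bad)     # negative: worse than predicting the mean

print(r2_good, r2_score(y_true, y_good))
print(r2_bad)
```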

### Median Absolute Error

**Definition**: The median of the absolute errors between predicted and actual values.

**Interpretation**: Robust to outliers: a few very large errors barely move the median, so it reflects the typical error rather than the worst cases.

In [10]:
median_ae = median_absolute_error(Y, Y_pred)
median_ae


11135.567334471329
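
The robustness claim can be checked with a small experiment: add one wildly wrong prediction and compare how the mean and the median of the absolute errors respond. All values below are hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, median_absolute_error

# Hypothetical data; the last prediction is an outlier (off by 500000)
y_true = np.array([300000, 350000, 400000, 450000, 500000])
y_hat = np.array([310000, 340000, 420000, 450000, 1000000])

# Absolute errors: [10000, 10000, 20000, 0, 500000]
mae_with_outlier = mean_absolute_error(y_true, y_hat)         # 540000 / 5 = 108000.0
medae_with_outlier = median_absolute_error(y_true, y_hat)     # middle value = 10000.0

print(mae_with_outlier, medae_with_outlier)
```

The single outlier pulls the mean absolute error above every other individual error, while the median stays at the typical error.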

### Adjusted R-squared

**Definition**: Adjusts the R-squared value to account for the number of predictors in the model.


**Interpretation**: Useful for comparing models with different numbers of predictors.
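
scikit-learn does not appear to provide a dedicated adjusted R-squared function, so it is usually derived from R² with the standard formula `1 - (1 - R²) * (n - 1) / (n - p - 1)`, where `n` is the number of samples and `p` the number of predictors. The helper `adjusted_r2` below is a hypothetical name for a minimal sketch of that formula.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1   # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1 - (1 - r2) * (n_samples - 1) / dof

# Illustrative inputs: R^2 of 0.95 from 10 samples and 2 predictors
adj_example = adjusted_r2(0.95, n_samples=10, n_predictors=2)
print(adj_example)   # slightly below 0.95
```

With the variables defined earlier in this notebook, the call would be `adjusted_r2(r2, n_samples=len(Y), n_predictors=X.shape[1])`.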


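### Mean Absolute Percentage Error (MAPE)

**Definition**: The average of the absolute errors expressed as a fraction of the actual values, reported here as a percentage.

**Interpretation**: Scale-independent, so it can be compared across targets with different units. It becomes very large and unreliable when actual values are zero or close to zero.

In [13]:
from sklearn.metrics import mean_absolute_percentage_error

mape = mean_absolute_percentage_error(Y, Y_pred) * 100
mape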