# Regression Metrics

In this notebook, we will go through the most common regression metrics. This is a companion workbook for the 365 Data Science course on ML Process and focuses only on implementation. Check out the course or the documentation for in-depth explanations of each metric.

We will cover:

- R2 Score
- Adj R2 Score
- Mean Absolute Error
- Root Mean Squared Error



In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

## Load Data

Next, we'll load our housing dataset. Each row describes one house sale, with columns such as the number of bedrooms, living area, and year built. The `price` column is the one we care about: it is the target we want to predict.

https://www.kaggle.com/datasets/sasivirat18/machine-learning-datasets

In [3]:
df = pd.read_csv(r"C:\Users\sasi virat\Downloads\learn-machine-learning-process-a-z\Section 12\10_Classification metrics - Coding portion\Course notes\housing_data.csv")

df.head()

Unnamed: 0,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,condition,sqft_above,sqft_basement,yr_built,yr_renovated,street,city,statezip,country
0,2014-05-02 00:00:00,313000.0,3.0,1.5,1340,7912,1.5,0,0,3,1340,0,1955,2005,18810 Densmore Ave N,Shoreline,WA 98133,USA
1,2014-05-02 00:00:00,2384000.0,5.0,2.5,3650,9050,2.0,0,4,5,3370,280,1921,0,709 W Blaine St,Seattle,WA 98119,USA
2,2014-05-02 00:00:00,342000.0,3.0,2.0,1930,11947,1.0,0,0,4,1930,0,1966,0,26206-26214 143rd Ave SE,Kent,WA 98042,USA
3,2014-05-02 00:00:00,420000.0,3.0,2.25,2000,8030,1.0,0,0,4,1000,1000,1963,0,857 170th Pl NE,Bellevue,WA 98008,USA
4,2014-05-02 00:00:00,550000.0,4.0,2.5,1940,10500,1.0,0,0,4,1140,800,1976,1992,9105 170th Ave NE,Redmond,WA 98052,USA


## Feature Selection

Let's select the features we want to use in the model. To keep things simple, we've manually selected a list of features:

In [4]:
features = ['bedrooms', 'bathrooms', 'sqft_living', 'sqft_lot',
       'floors', 'waterfront', 'view', 'condition', 'sqft_above',
       'sqft_basement', 'yr_built', 'yr_renovated']

target = "price"

## Train/Test Split

Next, we'll split our data into training and testing sets. This is a single hold-out split: the model is fit on 67% of the rows and evaluated on the remaining 33%:

In [6]:
from sklearn.model_selection import train_test_split

y = df[target]
X = df[features]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
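
A single hold-out split gives one estimate of performance, and that estimate depends on which rows land in the test set. K-fold cross-validation repeats the split several times and averages the scores. The following is a minimal sketch, assuming synthetic data in place of the housing dataframe; `X_demo`, `y_demo`, and the choice of 5 folds are illustrative and not part of the original notebook:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in data (an assumption, not the housing dataset):
# 200 rows, 3 features, a linear signal plus Gaussian noise.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=200)

# 5-fold cross-validation: each fold serves as the test set exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LinearRegression(), X_demo, y_demo, cv=cv, scoring="r2")

print("R2 per fold:", scores)
print("Mean R2:", scores.mean())
```

The spread of the per-fold scores gives a rough sense of how much a single split could mislead.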

## Model

To keep things simple, we'll use linear regression:

In [7]:
from sklearn.linear_model import LinearRegression

model = LinearRegression()

model.fit(X_train, y_train)
y_preds = model.predict(X_test)


## Evaluation Metrics

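The most common evaluation metrics for regression are R2, RMSE, and MAE. sklearn provides `r2_score`, `mean_absolute_error`, and `mean_squared_error`; we get RMSE by taking the square root of the mean squared error: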

In [8]:
from sklearn.metrics import (
    r2_score,
    mean_absolute_error,
    mean_squared_error
)

r2 = r2_score(y_test, y_preds)
rmse = np.sqrt(mean_squared_error(y_test, y_preds))
mae = mean_absolute_error(y_test, y_preds)

print("r2_score: {0}".format(r2))
print("rmse: {0}".format(rmse))
print("mae: {0}".format(mae))

r2_score: 0.06623293120936191
rmse: 788277.1876925766
mae: 191902.15686060727


## R2 Score

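The intuition behind R-squared is that it tells us what proportion of the variance in the y variable is explained by the model, compared with a baseline that always predicts the mean of y. We use this to judge “goodness of fit.” A score of 1 means perfect predictions, 0 means the model does no better than the mean baseline, and negative values mean it does worse.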

Here's an implementation of R2 so you can see the inner workings of the metric:

In [9]:
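def r2_score(y_test, y_preds):
    # Residual sum of squares: error left over after the model's predictions
    SS_res = np.sum((y_test - y_preds)**2)
    # Total sum of squares: error of always predicting the mean of y_test
    SS_total = np.sum((y_test - np.mean(y_test))**2)
    r2 = 1 - SS_res/SS_total
    return r2

r2_score(y_test, y_preds)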

0.06623293120936191
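
To make the scale of R-squared concrete, here is a small hand-checkable sketch on four made-up values. The helper name `r2_manual` and the toy arrays are illustrative assumptions, not part of the original notebook:

```python
import numpy as np

def r2_manual(y_true, y_pred):
    # 1 - (residual sum of squares / total sum of squares)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot

y_true = np.array([3.0, 5.0, 7.0, 9.0])   # mean is 6, total sum of squares is 20

perfect = r2_manual(y_true, y_true)                         # every prediction exact
baseline = r2_manual(y_true, np.full(4, y_true.mean()))     # always predict the mean
worse = r2_manual(y_true, np.array([9.0, 7.0, 5.0, 3.0]))   # predictions reversed

print(perfect, baseline, worse)   # 1.0 0.0 -3.0
```

The reversed predictions have a residual sum of squares of 80, four times the total sum of squares, which is why the score drops to -3.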

## Adjusted R-Squared

The problem is that R-squared can be easily gamed. Adding more features to a model never decreases its R-squared on the training data, even when those features are pure noise. One solution is to use adjusted R-squared, which penalizes the R-squared value based on the number of features in the model:

In [10]:
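def adj_r2_score(X, y_test, y_preds):
    # Residual and total sums of squares, as in r2_score above
    SS_res = np.sum((y_test - y_preds)**2)
    SS_total = np.sum((y_test - np.mean(y_test))**2)
    r2 = 1 - SS_res/SS_total

    N = len(X)           # number of samples
    p = len(X.columns)   # number of features

    adj_r2 = 1 - ((1 - r2)*(N - 1))/(N - p - 1)
    return adj_r2

# Caution: N should be the number of rows r2 was computed on. The call below
# passes the full X, so N counts train + test rows; X_test would match y_test.
adj_r2_score(X, y_test, y_preds)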

0.06379011350160357
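
The effect of the penalty is easiest to see when noise features are added to a model. The sketch below is an illustration on synthetic data, not the housing dataset; the helper `r2_and_adjusted`, the sample size of 100, and the 20 noise columns are all assumptions chosen for the demonstration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def r2_and_adjusted(model, X, y):
    # R2 and adjusted R2 computed on the same rows the model is scored on.
    n, p = X.shape
    r2 = model.score(X, y)   # LinearRegression.score returns R2
    adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

rng = np.random.default_rng(0)
n = 100
x_signal = rng.normal(size=(n, 1))
y_demo = 3.0 * x_signal[:, 0] + rng.normal(size=n)

# Same target, but with 20 extra columns of pure noise appended.
X_small = x_signal
X_big = np.hstack([x_signal, rng.normal(size=(n, 20))])

model_small = LinearRegression().fit(X_small, y_demo)
model_big = LinearRegression().fit(X_big, y_demo)

r2_small, adj_small = r2_and_adjusted(model_small, X_small, y_demo)
r2_big, adj_big = r2_and_adjusted(model_big, X_big, y_demo)

print(f"1 feature:   R2={r2_small:.4f}, adjusted R2={adj_small:.4f}")
print(f"21 features: R2={r2_big:.4f}, adjusted R2={adj_big:.4f}")
```

On the training rows, the larger model's R2 cannot be lower than the smaller model's, while the gap between R2 and adjusted R2 widens as features are added.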

## Mean Absolute Error

In simple terms, we take the absolute value of the error for each data point, then average those values. This gives us the average magnitude of the errors, in the same units as the target (here, dollars):

In [11]:
def mean_absolute_error(y_test, y_preds):
    return np.sum(abs(y_preds - y_test))/len(y_preds)
    
mean_absolute_error(y_test, y_preds)

191902.15686060727
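
As a hand-checkable illustration, the arithmetic can be traced on four made-up values (the arrays below are assumptions for the example, not rows from the dataset):

```python
import numpy as np

y_true = np.array([100.0, 150.0, 200.0, 250.0])
y_pred = np.array([110.0, 140.0, 230.0, 250.0])

errors = y_pred - y_true          # [ 10., -10.,  30.,   0.]
mean_error = np.mean(errors)      # (10 - 10 + 30 + 0) / 4 = 7.5
mae = np.mean(np.abs(errors))     # (10 + 10 + 30 + 0) / 4 = 12.5

print(mean_error, mae)
```

Without the absolute value, the +10 and -10 errors cancel and the average understates how far off the predictions are.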

## Root Mean Squared Error

Instead of taking the absolute value of the errors, in this case we square the errors first. This forces all the errors to be positive and also penalizes large errors more heavily than small ones. We take the average of the squared errors, which gives us the mean squared error. Then we take the square root to get RMSE, which is back in the same units as the target.

In [12]:
def mean_squared_error(y_test, y_preds):
    return np.sum((y_preds - y_test)**2)/len(y_preds)
    
np.sqrt(mean_squared_error(y_test, y_preds))

788277.1876925766
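
The gap between RMSE and MAE in the results above (roughly 788,000 versus 192,000) suggests that a few very large errors dominate. A small sketch with made-up error patterns shows how the two metrics react differently; the helper names `mae` and `rmse` and the toy arrays are illustrative assumptions:

```python
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_pred - y_true))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_pred - y_true) ** 2))

y_true = np.zeros(4)
even_errors = np.array([10.0, 10.0, 10.0, 10.0])   # every prediction off by 10
one_big_miss = np.array([0.0, 0.0, 0.0, 40.0])     # same total error, one outlier

print(mae(y_true, even_errors), rmse(y_true, even_errors))     # 10.0 10.0
print(mae(y_true, one_big_miss), rmse(y_true, one_big_miss))   # 10.0 20.0
```

Both patterns have the same MAE, but concentrating the error in a single prediction doubles the RMSE.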

## Conclusion

In review, we went over four different regression metrics in this notebook:

- R2
- Adjusted R2
- Mean Absolute Error
- Root Mean Squared Error 


## Additional Resources
- [Model Evaluation Metrics in Machine Learning by Nagesh Singh Chauhan](https://www.kdnuggets.com/2020/05/model-evaluation-metrics-machine-learning.html)
- [11 Important Model Evaluation Metrics for Machine Learning Everyone should know](https://www.analyticsvidhya.com/blog/2019/08/11-important-model-evaluation-error-metrics/)
- [How To Interpret R-squared in Regression Analysis by Jim Frost](https://statisticsbyjim.com/regression/interpret-r-squared-regression/)
- [Know The Best Evaluation Metrics for Your Regression Model by Raghav Agrawal](https://www.analyticsvidhya.com/blog/2021/05/know-the-best-evaluation-metrics-for-your-regression-model/)
- [Recall, Precision, F1, ROC, AUC, and everything by Ofir Shalev](https://medium.com/swlh/recall-precision-f1-roc-auc-and-everything-542aedf322b9)
- [F1 Score vs ROC AUC vs Accuracy vs PR AUC: Which Evaluation Metric Should You Choose? by Jakub Czakon](https://neptune.ai/blog/f1-score-accuracy-roc-auc-pr-auc)
- [Intuition behind Log Loss Score by Gaurav Dembla](https://towardsdatascience.com/intuition-behind-log-loss-score-4e0c9979680a#:~:text=is%20dependent%20on.-,What%20does%20log%2Dloss%20conceptually%20mean%3F,is%20the%20log%2Dloss%20value.)
- [Why is ROC AUC equivalent to the probability that two randomly-selected samples are correctly ranked?](https://stats.stackexchange.com/questions/190216/why-is-roc-auc-equivalent-to-the-probability-that-two-randomly-selected-samples)
- [Mann-Whitney U Test](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test#Area-under-curve_(AUC)_statistic_for_ROC_curves)
- [Essential Things You Need to Know About F1-Score](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test#Area-under-curve_(AUC)_statistic_for_ROC_curves)
- [ROC, AUC, precision, and recall visually explained by Paul Vanderlaken](https://paulvanderlaken.com/2019/08/16/roc-auc-precision-and-recall-visually-explained/)
- [R-squared Is Not Valid for Nonlinear Regression by Jim Frost](https://statisticsbyjim.com/regression/r-squared-invalid-nonlinear-regression/#:~:text=Nonlinear%20regression%20is%20an%20extremely,just%20don%27t%20go%20together.)
- [3 Best metrics to evaluate Regression Model? by Songhao Wu](https://towardsdatascience.com/what-are-the-best-metrics-to-evaluate-your-regression-model-418ca481755b)