## Regression Metrics

### Mean Absolute Error

Mean Absolute Error (MAE) is the average of the absolute differences between the original values and the predicted values. It measures how far off the predictions are on average, without indicating whether they are too high or too low.

$$\Large MAE = \frac{1}{N} \sum\limits_{i = 1}^{N} \mid {y_i - \hat{y_i}} \mid$$

$$\Large MAE = \Large Average(\mid Actual\ target\ value - Predicted\ target\ value \mid)$$

In [1]:
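import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LinearRegression

# Housing data: 13 feature columns plus the target MEDV in the last column
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.data"
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
# sep=r"\s+" splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
dataframe = pd.read_csv(url, sep=r"\s+", names=names)
array = dataframe.values
X = array[:, 0:13]  # features
Y = array[:, 13]    # target (MEDV)
# 10 folds without shuffling; random_state only applies when shuffle=True,
# and recent scikit-learn versions raise an error if it is set otherwise
kfold = model_selection.KFold(n_splits=10)
model = LinearRegression()
# scikit-learn negates error metrics so that a higher score is always better,
# which is why the MAE printed below is negative
scoring = 'neg_mean_absolute_error'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("MAE: %.3f (%.3f)" % (results.mean(), results.std()))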

MAE: -4.005 (2.084)
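
To make the MAE formula concrete, here is a minimal sketch that computes it by hand with NumPy. The arrays `y_true` and `y_pred` are made-up illustrative values, not taken from the housing data.

```python
import numpy as np

# Illustrative values only (assumed for this sketch, not from the housing data)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

# Absolute errors are 0.5, 0.0, 1.5 and 1.0
abs_errors = np.abs(y_true - y_pred)

# MAE is their mean: (0.5 + 0.0 + 1.5 + 1.0) / 4
mae = abs_errors.mean()
print("MAE: %.3f" % mae)  # prints "MAE: 0.750"
```

`sklearn.metrics.mean_absolute_error(y_true, y_pred)` should return the same value, with a positive sign, since the negation is applied only by the `neg_mean_absolute_error` scorer.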


### Mean Squared Error

Mean Squared Error (MSE) is quite similar to Mean Absolute Error. The only difference is that MSE averages the squared differences between the original values and the predicted values, so large errors are penalized more heavily than small ones.

$$\Large MSE = \frac{1}{N} \sum\limits_{i = 1}^{N} ( {y_i - \hat{y_i}} )^2$$

$$\Large MSE = \Large Average((Actual\ target\ value - Predicted\ target\ value)^2)$$

In [2]:
scoring = 'neg_mean_squared_error'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("MSE: %.3f (%.3f)" % (results.mean(), results.std()))

MSE: -34.705 (45.574)
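
The effect of squaring can be seen in a small sketch that compares two sets of predictions with the same MAE but different MSE. All values below are invented for illustration, and the helper names `mae` and `mse` are local to this example.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return np.mean(np.abs(y_true - y_pred))

def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return np.mean((y_true - y_pred) ** 2)

# Illustrative values only (assumed for this sketch)
y_true = np.array([10.0, 10.0, 10.0, 10.0])
pred_even = np.array([11.0, 11.0, 11.0, 11.0])     # four errors of 1
pred_outlier = np.array([10.0, 10.0, 10.0, 14.0])  # one error of 4

mae_even, mae_outlier = mae(y_true, pred_even), mae(y_true, pred_outlier)
mse_even, mse_outlier = mse(y_true, pred_even), mse(y_true, pred_outlier)

print("MAE: %.1f vs %.1f" % (mae_even, mae_outlier))  # prints "MAE: 1.0 vs 1.0"
print("MSE: %.1f vs %.1f" % (mse_even, mse_outlier))  # prints "MSE: 1.0 vs 4.0"
```

Both prediction sets are off by 1 on average, but the single large miss quadruples the MSE. This is one reason to prefer MSE when large errors are especially costly.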


### R Squared

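The $R^2$ (or R Squared) metric indicates how much of the variance in the target the model explains. A value of 1 means the model captures all of the variance, and a value of 0 means it does no better than always predicting the mean target value. On held-out data $R^2$ can also be negative, which happens when the model performs worse than that mean baseline.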

$$(Residual\ Sum\ of\ Squares)\ \Large RSS = \Large \sum\limits_{i = 1}^{N} (Actual\ target\ value - Predicted\ target\ value)^2$$

$$(Total\ Sum\ of\ Squares)\ \Large TSS = \Large \sum\limits_{i = 1}^{N} (Actual\ target\ value - Mean\ target\ value)^2$$

$$\Large R^2 = \Large (1 - \frac{RSS}{TSS})$$

In [3]:
scoring = 'r2'
results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("R^2: %.3f (%.3f)" % (results.mean(), results.std()))

R^2: 0.203 (0.595)
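
As a rough check of the definition, the sketch below computes $R^2$ from RSS and TSS directly. The function `r_squared` and all arrays are hypothetical names and values chosen for illustration. It also shows that predicting the mean gives 0 and that a poor model can score below 0.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return 1 - RSS/TSS for two equal-length 1-D arrays."""
    rss = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - rss / tss

# Illustrative values only (assumed for this sketch)
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_good = np.array([2.5, 5.0, 4.0, 8.0])       # reasonably close predictions
y_mean = np.full_like(y_true, y_true.mean())  # always predict the mean
y_bad = np.array([7.0, 2.5, 5.0, 3.0])        # worse than the mean baseline

r2_good = r_squared(y_true, y_good)
r2_mean = r_squared(y_true, y_mean)
r2_bad = r_squared(y_true, y_bad)
print("good: %.3f  mean baseline: %.3f  bad: %.3f" % (r2_good, r2_mean, r2_bad))
```

A negative score in an individual fold is one plausible reason for a cross-validated mean $R^2$ with a large standard deviation. `sklearn.metrics.r2_score` follows the same definition.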
