# SLU10 - Metrics for Regression: Exercise Notebook

In this notebook, you will implement:

- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Coefficient of Determination (R²)
- Adjusted R²
- scikit-learn metrics
- Using metrics for k-fold cross-validation


Start by loading the data we will use to fit a linear regression (hopefully SLU07 is still fresh in your memory) and fitting the `LinearRegression` estimator from scikit-learn:

In [1]:
# Base imports
import math
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

data = load_diabetes()

x = pd.DataFrame(data['data'], columns=data['feature_names'])
y = pd.Series(data['target'])
x.head()

|   | age | sex | bmi | bp | s1 | s2 | s3 | s4 | s5 | s6 |
|---|-----|-----|-----|----|----|----|----|----|----|----|
| 0 | 0.038076 | 0.05068 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019908 | -0.017646 |
| 1 | -0.001882 | -0.044642 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.06833 | -0.092204 |
| 2 | 0.085299 | 0.05068 | 0.044451 | -0.005671 | -0.045599 | -0.034194 | -0.032356 | -0.002592 | 0.002864 | -0.02593 |
| 3 | -0.089063 | -0.044642 | -0.011595 | -0.036656 | 0.012191 | 0.024991 | -0.036038 | 0.034309 | 0.022692 | -0.009362 |
| 4 | 0.005383 | -0.044642 | -0.036385 | 0.021872 | 0.003935 | 0.015596 | 0.008142 | -0.002592 | -0.031991 | -0.046641 |


In [2]:
np.random.seed(42)

x_diabetes = x.values
y_diabetes = y.values

lr = LinearRegression()
lr.fit(x_diabetes, y_diabetes)

y_hat_diabetes = lr.predict(x_diabetes)
betas_diabetes = pd.Series([lr.intercept_] + list(lr.coef_))

## 1 Metrics

We will start by covering the metrics we learned in the unit, in particular a set of related metrics:

- Mean Squared Error

$$MSE = \frac{1}{N} \sum_{n=1}^N (y_n - \hat{y}_n)^2$$


- Root Mean Squared Error

$$RMSE = \sqrt{MSE}$$


- Mean Absolute Error

$$MAE = \frac{1}{N} \sum_{n=1}^N \left| y_n - \hat{y}_n \right|$$
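To make the three formulas above concrete, here is a minimal worked sketch on four made-up values. The arrays and the `toy_*` names are invented for illustration only and are not part of the exercise.

```python
import numpy as np

# Toy values, invented purely for illustration
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.0, 5.0, 4.0, 8.0])

errors = y_true - y_pred             # [ 1.,  0., -2., -1.]
toy_mse = np.mean(errors ** 2)       # (1 + 0 + 4 + 1) / 4 = 1.5
toy_rmse = np.sqrt(toy_mse)          # sqrt(1.5), roughly 1.2247
toy_mae = np.mean(np.abs(errors))    # (1 + 0 + 2 + 1) / 4 = 1.0

print(toy_mse, toy_rmse, toy_mae)
```

Note that RMSE and MAE are in the same units as the target, while MSE is in squared units. The single error of size 2 contributes 4 of the 6 units of squared error, which is why MSE and RMSE penalise large errors more heavily than MAE does.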

### 1.1 Mean Squared Error

Implement the mean squared error in the next function:

In [3]:
def mean_squared_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - labels 
        
    Returns: 
        mse : float with Mean Squared Error Value

    """
    # 1) Compute the error.
    # error = ...
    # YOUR CODE HERE
    error = (y-y_pred)
    #raise NotImplementedError()
    
    # 2) Compute the squared value of the errors for each sample
    # squared_error = ...
    # YOUR CODE HERE
    squared_error = error**2
    #raise NotImplementedError()
    
    # 3) Compute the mean squared value of the errors
    # mse = ...
    # YOUR CODE HERE
    mse = squared_error.mean()
    #raise NotImplementedError()
    
    return mse

Check the outputs of your function match the results below:

In [4]:
mse = mean_squared_error(y_diabetes, y_hat_diabetes)
print('Mean Squared Error Diabetes dataset: {}'.format(mse))
assert math.isclose(2859.6903987680657, mse)

Mean Squared Error Diabetes dataset: 2859.6903987680657


### 1.2 Root Mean Squared Error
Implement the root mean squared error in the function below:

In [5]:
def root_mean_squared_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - labels 
        
    Returns: 
        rmse : float with the Root Mean Squared Error Value
    """
    # 1) Compute the mean squared error. 
    # mse = ...
    # YOUR CODE HERE
    mse = ((y-y_pred)**2).mean()
    #raise NotImplementedError()
    
    # 2) Compute the square root.
    # rmse = ...
    # YOUR CODE HERE
    rmse = math.sqrt(mse)
    #raise NotImplementedError()
    
    return rmse

Check the outputs of your function match the results below:

In [6]:
rmse = root_mean_squared_error(y_diabetes, y_hat_diabetes)
print('Root Mean Squared Error Diabetes dataset: {}'.format(rmse))
assert math.isclose(53.47607314274362, rmse)

Root Mean Squared Error Diabetes dataset: 53.47607314274362


### 1.3 Mean Absolute Error

Finally, implement the Mean Absolute Error in the function below. 

In [7]:
def mean_absolute_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - labels 
        
    Returns: 
        mae : float with Mean Absolute Error
    """
    # 1) Compute the error.
    # error = ...
    # YOUR CODE HERE
    error = (y - y_pred)
    #raise NotImplementedError()
    
    # 2) Compute the absolute value of the errors for each sample
    # abs_error = ...
    # YOUR CODE HERE
    abs_error = abs(error)
    #raise NotImplementedError()
    
    # 3) Compute the mean of the absolute value of the errors
    # mae = ...
    # YOUR CODE HERE
    mae = abs_error.mean()
    #raise NotImplementedError()
    
    return mae

Check the outputs of your function match the results below:

In [8]:
mae = mean_absolute_error(y_diabetes, y_hat_diabetes)
print('Mean Absolute Error Diabetes dataset: {}'.format(mae))
assert math.isclose(43.277395083749866, mae)

Mean Absolute Error Diabetes dataset: 43.27739508374988


Next we will focus on the Coefficient of Determination - $R^2$ - and its adjusted form. See the equations below:

- $R^2$ score 

$$R^2 = 1 - \frac{MSE(y, \hat{y})}{MSE(y, \bar{y})} 
= 1 - \frac{\frac{1}{N} \sum_{n=1}^N (y_n - \hat{y}_n)^2}{\frac{1}{N} \sum_{n=1}^N (y_n - \bar{y})^2}
= 1 - \frac{\sum_{n=1}^N (y_n - \hat{y}_n)^2}{\sum_{n=1}^N (y_n - \bar{y})^2}$$

where $$\bar{y} = \frac{1}{N} \sum_{n=1}^N y_n$$

- Adjusted $R^2$ score 

$$\bar{R}^2 = 1 - \frac{N - 1}{N - K - 1} (1 - R^2)$$

where $N$ is the number of observations in the dataset used for training the model (i.e. the number of rows of the pandas DataFrame) and $K$ is the number of features used by your model (i.e. the number of columns of the pandas DataFrame).
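As a quick sanity check of the two formulas above, the sketch below evaluates them by hand on four made-up values. The arrays, the choice of $K = 1$, and the `toy_*` and `baseline_*` names are assumptions made for illustration.

```python
import numpy as np

# Toy values, invented purely for illustration
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.0, 5.0, 4.0, 8.0])

ss_res = np.sum((y_true - y_pred) ** 2)          # 1 + 0 + 4 + 1 = 6.0
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # 14.75 (the mean of y_true is 4.25)
toy_r2 = 1 - ss_res / ss_tot                     # 1 - 6 / 14.75, roughly 0.593

# Adjusted R², assuming the predictions came from a model with K = 1 feature
N, K = len(y_true), 1
toy_r2_adj = 1 - (N - 1) / (N - K - 1) * (1 - toy_r2)   # roughly 0.390

# A model that always predicts the mean of y gets R² = 0
baseline_pred = np.full_like(y_true, y_true.mean())
baseline_r2 = 1 - np.sum((y_true - baseline_pred) ** 2) / ss_tot

print(toy_r2, toy_r2_adj, baseline_r2)
```

The baseline case shows how to read $R^2$: it measures the improvement over always predicting $\bar{y}$. For $K \geq 1$ and $R^2 < 1$, the adjusted score is lower than $R^2$, and the gap grows as more features are added for the same number of observations.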


### 1.4 R² score

Start by implementing the $R^2$ score in the function below:

In [9]:
def r_squared(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - labels 
        
    Returns: 
        r2 : float with R squared value
    """

    # 1) Compute labels mean.
    # y_mean = ...
    # YOUR CODE HERE
    y_mean = y.mean()
    #raise NotImplementedError()

    # 2) Compute the mean squared error between the target and the predictions.
    # mse_pred = ...
    # YOUR CODE HERE
    mse_pred = mean_squared_error(y,y_pred)
    #raise NotImplementedError()
    
    # 3) Compute the mean squared error between the target and its mean.
    # YOUR CODE HERE
    mse_mean = mean_squared_error(y,y_mean)
    #raise NotImplementedError()
    
    # 4) Finally, compute R²
    # r2 = ...
    # YOUR CODE HERE
    r2 = 1-(mse_pred/mse_mean)
    #raise NotImplementedError()
    
    return r2

Check the outputs of your function match the results below:

In [10]:
r2 = r_squared(y_diabetes, y_hat_diabetes)
print('R² Diabetes dataset: {}'.format(r2))
assert math.isclose(0.5177494254132934, r2)

R² Diabetes dataset: 0.5177494254132934


### 1.5 Adjusted R² score

Then implement the adjusted $R^2$ score in the function below:

In [11]:
def adjusted_r_squared(y, y_pred, K):
    """
    Args: 
        y : numpy.array with shape (num_samples,) - labels 
        y_pred : numpy.array with shape (num_samples,) - predictions
        K : integer - Number of features used in the model that computed y_pred.

    Returns: 
        r2_adj : float with adjusted R squared value
    """
    
    # 1) Compute R².
    # r2 = ...
    # YOUR CODE HERE
    r2 = r_squared(y, y_pred)
    #raise NotImplementedError()
    
    # 2) Get number of samples 
    # N = ...
    # YOUR CODE HERE
    N = len(y)
    #raise NotImplementedError()

    # 3) Adjust R²
    # r2_adj = ...
    # YOUR CODE HERE
    r2_adj = 1-(N-1)*(1-r2)/(N-K-1)
    #raise NotImplementedError()
    
    return r2_adj

Check the outputs of your function match the results below:

In [12]:
r2 = adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])
print('Adjusted R² Diabetes dataset: {}'.format(r2))
assert math.isclose(0.506560316954205, r2)

Adjusted R² Diabetes dataset: 0.506560316954205


## 2 Scikit-Learn metrics

As you know, scikit-learn already provides implementations of these metrics:

- `sklearn.metrics.mean_absolute_error`
- `sklearn.metrics.mean_squared_error`
- `sklearn.metrics.r2_score`
- `sklearn.linear_model.LinearRegression.score` 

In [28]:
# Import sklearn metrics
from sklearn import metrics as sklearn_metrics
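The last entry in the list above is a method on the fitted estimator rather than a standalone function. For regressors it returns $R^2$, so it should agree with `r2_score` computed from the model's predictions. The sketch below checks this on the diabetes data. The `*_demo` and `r2_from_*` names are made up for this example.

```python
import math

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

X_demo, y_demo = load_diabetes(return_X_y=True)
model_demo = LinearRegression().fit(X_demo, y_demo)

# score(X, y) takes the features and predicts internally...
r2_from_score = model_demo.score(X_demo, y_demo)
# ...while r2_score(y_true, y_pred) takes the labels and ready-made predictions
r2_from_metric = r2_score(y_demo, model_demo.predict(X_demo))

print(r2_from_score, r2_from_metric)
assert math.isclose(r2_from_score, r2_from_metric)
```

Mind the argument order: the functions in `sklearn.metrics` expect `(y_true, y_pred)`, whereas `score` expects `(X, y)`.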

### 2.1 Root Mean Squared Error

Implement the root mean squared error function below using scikit-learn:

In [34]:
def sklearn_root_mean_squared_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - labels 
        
    Returns: 
        rmse : float with Root Mean Squared Error
    """
    # YOUR CODE HERE
    rmse = np.sqrt(sklearn_metrics.mean_squared_error(y, y_pred))
    return rmse
    #raise NotImplementedError()

Make sure your function passes the tests below:

In [19]:
rmse = sklearn_root_mean_squared_error(y_diabetes, y_hat_diabetes)
print('Sklearn RMSE Diabetes dataset: {}'.format(rmse))
assert math.isclose(53.47607314274362, rmse)

Sklearn RMSE Diabetes dataset: 53.47607314274362


### 2.2 Adjusted R² score

Implement the adjusted R² score below using scikit-learn:

In [32]:
def sklearn_adjusted_r_squared(y, y_pred, K): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - labels 
        K : integer - Number of features used in the model that computed y_pred.

    Returns: 
        r2_adj : float with adjusted R squared value
    """
    # YOUR CODE HERE
    N = len(y)
    r2_adj = 1-((N-1)/(N-K-1))*(1-sklearn_metrics.r2_score(y, y_pred))
    return r2_adj
    #raise NotImplementedError()

Make sure your function passes the tests below:

In [33]:
r2 = sklearn_adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])
print('Sklearn Adjusted R² Diabetes dataset: {}'.format(r2))
assert math.isclose(0.506560316954205, r2)

Sklearn Adjusted R² Diabetes dataset: 0.506560316954205


Finally, compare the sklearn-based metrics with your own for the diabetes dataset:

In [35]:
MAE = mean_absolute_error(y_diabetes, y_hat_diabetes)
MSE = mean_squared_error(y_diabetes, y_hat_diabetes)
RMSE = root_mean_squared_error(y_diabetes, y_hat_diabetes)
R2 = r_squared(y_diabetes, y_hat_diabetes)
R2_adj = adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])

print('Metric for diabetes dataset with base implementation:')
print('Mean Absolute Error diabetes dataset: {}'.format(MAE))
print('Mean Squared Error diabetes dataset: {}'.format(MSE))
print('Root Mean Squared Error diabetes dataset: {}'.format(RMSE))
print('R² diabetes dataset: {}'.format(R2))
print('Adjusted R² diabetes dataset: {}'.format(R2_adj))
print('\n')

SK_MAE = sklearn_metrics.mean_absolute_error(y_diabetes, y_hat_diabetes)
SK_MSE = sklearn_metrics.mean_squared_error(y_diabetes, y_hat_diabetes)
SK_RMSE = sklearn_root_mean_squared_error(y_diabetes, y_hat_diabetes)
SK_R2 = sklearn_metrics.r2_score(y_diabetes, y_hat_diabetes)
SK_R2_adj = sklearn_adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])

print('Metric for diabetes dataset with scikitlearn:')
print('Mean Absolute Error diabetes dataset: {}'.format(SK_MAE))
print('Mean Squared Error diabetes dataset: {}'.format(SK_MSE))
print('Root Mean Squared Error diabetes dataset: {}'.format(SK_RMSE))
print('R² diabetes dataset: {}'.format(SK_R2))
print('Adjusted R² diabetes dataset: {}'.format(SK_R2_adj))


Metric for diabetes dataset with base implementation:
Mean Absolute Error diabetes dataset: 43.27739508374988
Mean Squared Error diabetes dataset: 2859.6903987680657
Root Mean Squared Error diabetes dataset: 53.47607314274362
R² diabetes dataset: 0.5177494254132934
Adjusted R² diabetes dataset: 0.506560316954205


Metric for diabetes dataset with scikitlearn:
Mean Absolute Error diabetes dataset: 43.27739508374988
Mean Squared Error diabetes dataset: 2859.6903987680657
Root Mean Squared Error diabetes dataset: 53.47607314274362
R² diabetes dataset: 0.5177494254132934
Adjusted R² diabetes dataset: 0.506560316954205


## 3 Using the Metrics

Now you'll use the metrics to fit and check the performance of your LinearRegression and SGDRegressor, with the `cross_val_score` function of scikit-learn. Implement the missing steps below:


In [None]:
from sklearn.model_selection import cross_val_score
from sklearn import metrics
from sklearn import linear_model

def estimator_cross_fold(X, y, K, clf_choice='linear', scoring='neg_mean_squared_error'):
    """
    Args: 
        X : numpy.array with shape (num_samples, num_features) - sample data
        y : numpy.array with shape (num_samples,) - sample labels 
        K : integer - Number of folds for k-fold cross validation
        clf_choice: choice of estimator 
        scoring : scoring function as per sklearn notation

    Returns: 
        clf: estimator trained with full data
        scores : scores for each fold
    """
    
    if clf_choice == 'linear':
        clf = linear_model.LinearRegression()
    elif clf_choice == 'sgd':
        clf = linear_model.SGDRegressor(max_iter=10000)
    else:
        print('Invalid estimator')
        return None
     
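    # 1) Fit the chosen estimator on the full data
    # YOUR CODE HERE
    clf.fit(X, y)
    #raise NotImplementedError()

    # 2) Run k-fold cross validation
    # (cross_val_score clones the estimator, so the fit above is not reused)
    # YOUR CODE HERE
    scores = cross_val_score(clf, X, y, cv=K, scoring=scoring)
    #raise NotImplementedError()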
    
    return clf, scores

Let's run the k-fold cross validation for each estimator and scoring function and get the average score:

In [None]:
np.random.seed(42)

clf_lr, nmse_lr = estimator_cross_fold(x_diabetes, y_diabetes.ravel(), 5, clf_choice='linear', scoring='neg_mean_squared_error')
assert math.isclose(-2993.0729432998864, nmse_lr.mean())

clf_sgd, nmse_sgd = estimator_cross_fold(x_diabetes, y_diabetes.ravel(), 5, clf_choice='sgd', scoring='neg_mean_squared_error')
assert math.isclose(-3024.1665383692743, nmse_sgd.mean())

clf_lr, r2_lr = estimator_cross_fold(x_diabetes, y_diabetes.ravel(), 5, clf_choice='linear', scoring='r2')
assert math.isclose(0.48231812211149394, r2_lr.mean())

clf_sgd, r2_sgd = estimator_cross_fold(x_diabetes, y_diabetes.ravel(), 5, clf_choice='sgd', scoring='r2')
assert math.isclose(0.4754722767620816, r2_sgd.mean())


In [None]:
print('Cross val evaluation for diabetes dataset:')
print('NMSE with Linear Regression: {}'.format(nmse_lr.mean()))
print('NMSE with SGD: {}'.format(nmse_sgd.mean()))
print('R² Score with Linear Regression: {}'.format(r2_lr.mean()))
print('R² Score with SGD: {}'.format(r2_sgd.mean()))
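The `neg_mean_squared_error` scoring returns the MSE with its sign flipped, because scikit-learn scorers follow a "higher is better" convention. The sketch below shows one way to turn the per-fold scores back into MSE and RMSE values. The `*_demo` and `*_folds` names are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X_demo, y_demo = load_diabetes(return_X_y=True)

# One (negative) score per fold
neg_mse_folds = cross_val_score(
    LinearRegression(), X_demo, y_demo, cv=5, scoring='neg_mean_squared_error'
)

mse_folds = -neg_mse_folds        # flip the sign to recover the usual MSE
rmse_folds = np.sqrt(mse_folds)   # per-fold RMSE, in the units of the target

print('MSE per fold: {}'.format(mse_folds))
print('Mean RMSE across folds: {}'.format(rmse_folds.mean()))
```

Taking the square root per fold and then averaging is not the same as taking the square root of the average MSE, so it is worth stating which of the two you report.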

In this particular case, linear regression seems to generalise slightly better than the SGD regressor. Keep in mind that the SGD regressor is at a slight disadvantage here, because we did not check whether the features are appropriately scaled. SGD is sensitive to feature scaling, whereas ordinary least squares is not. Feel free to repeat these exercises with min-max scaling applied beforehand and compare the results.
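One possible way to try the scaling experiment is to wrap the scaler and the regressor in a pipeline, so that the scaler is fitted on the training part of each fold only. This is a sketch rather than the notebook's reference solution: the `random_state=42` argument and the `*_demo` and `scaled_sgd` names are assumptions added here.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X_demo, y_demo = load_diabetes(return_X_y=True)

# Scale each feature to [0, 1], then fit the SGD regressor.
# random_state is fixed only to make the run repeatable (an assumption of this sketch).
scaled_sgd = make_pipeline(
    MinMaxScaler(),
    SGDRegressor(max_iter=10000, random_state=42),
)

r2_scaled_sgd = cross_val_score(scaled_sgd, X_demo, y_demo, cv=5, scoring='r2')
print('R² Score with min-max scaling + SGD: {}'.format(r2_scaled_sgd.mean()))
```

Passing the whole pipeline to `cross_val_score` avoids leaking information from the held-out fold into the scaler, which would happen if the full dataset were scaled once up front.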