## SLU08 - Metrics for Regression: Exercise Notebook

This notebook covers:

- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Coefficient of Determination (R²)
- Adjusted R²
- The scikit-learn implementations of these metrics
- Using the metrics to compare two linear regression models

Start by importing the necessary packages and loading the data.

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import train_test_split

In [2]:
data = load_diabetes()

x = pd.DataFrame(data['data'], columns=data['feature_names'])
y = pd.Series(data['target'])
x.head()

| | age | sex | bmi | bp | s1 | s2 | s3 | s4 | s5 | s6 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.038076 | 0.05068 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019907 | -0.017646 |
| 1 | -0.001882 | -0.044642 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.068332 | -0.092204 |
| 2 | 0.085299 | 0.05068 | 0.044451 | -0.00567 | -0.045599 | -0.034194 | -0.032356 | -0.002592 | 0.002861 | -0.02593 |
| 3 | -0.089063 | -0.044642 | -0.011595 | -0.036656 | 0.012191 | 0.024991 | -0.036038 | 0.034309 | 0.022688 | -0.009362 |
| 4 | 0.005383 | -0.044642 | -0.036385 | 0.021872 | 0.003935 | 0.015596 | 0.008142 | -0.002592 | -0.031988 | -0.046641 |


Here we fit a scikit-learn `LinearRegression` model to the data and store its predictions and its parameters (intercept and coefficients).

In [3]:
np.random.seed(42)

x_diabetes = x.values
y_diabetes = y.values

lr = LinearRegression()
lr.fit(x_diabetes, y_diabetes)

y_hat_diabetes = lr.predict(x_diabetes)
betas_diabetes = pd.Series([lr.intercept_] + list(lr.coef_))

## 1 Metrics

In this exercise, you will implement functions that calculate the metrics by hand, without using scikit-learn, so that you remember them well. We will start with a set of three related metrics:

- Mean Absolute Error

$$MAE = \frac{1}{N} \sum_{n=1}^N \left| y_n - \hat{y}_n \right|$$

- Mean Squared Error

$$MSE = \frac{1}{N} \sum_{n=1}^N (y_n - \hat{y}_n)^2$$

- Root Mean Squared Error

$$RMSE = \sqrt{MSE}$$
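
To make the three formulas concrete, here is a small worked example on made-up numbers. The arrays `y_toy` and `y_hat_toy` are illustrative assumptions and are not part of the diabetes data; the sketch only mirrors the equations above step by step.

```python
import numpy as np

# Hypothetical toy data, chosen so the arithmetic is easy to check by hand.
y_toy = np.array([3.0, 5.0, 2.0, 7.0])      # true values
y_hat_toy = np.array([2.0, 5.0, 4.0, 8.0])  # predictions

errors_toy = y_toy - y_hat_toy        # [1, 0, -2, -1]
mae_toy = np.abs(errors_toy).mean()   # (1 + 0 + 2 + 1) / 4 = 1.0
mse_toy = (errors_toy ** 2).mean()    # (1 + 0 + 4 + 1) / 4 = 1.5
rmse_toy = np.sqrt(mse_toy)           # square root of the MSE

print(mae_toy, mse_toy, rmse_toy)
```

Note that the MSE is expressed in squared units of the target, while the MAE and the RMSE are in the same units as the target itself.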

### 1.1 Mean Absolute Error

First, implement the Mean Absolute Error in the function below. 

In [4]:
def mean_absolute_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - true values
        
    Returns: 
        mae : float with Mean Absolute Error
    """
    # 1) Compute the errors between predictions and true values
    # error = ...
    ### BEGIN SOLUTION
    error = y - y_pred
    ### END SOLUTION
    
    # 2) Compute the absolute value of the errors
    # abs_error = ...
    ### BEGIN SOLUTION
    abs_error = np.abs(error)
    ### END SOLUTION
    
    # 3) Compute the mean of the absolute value of the errors
    # mae = ...
    ### BEGIN SOLUTION
    mae = abs_error.mean()
    ### END SOLUTION
    
    return mae

Check if the output of your function matches the result below.

In [5]:
mae = mean_absolute_error(y_diabetes, y_hat_diabetes)
print('Mean Absolute Error Diabetes dataset: {}'.format(mae))
np.testing.assert_almost_equal(mae, 43.2773, 3)

Mean Absolute Error Diabetes dataset: 43.27745202531506


### 1.2 Mean Squared Error

Implement the mean squared error in the next function.

In [6]:
def mean_squared_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - true values
        
    Returns: 
        mse : float with Mean Squared Error Value

    """
    # 1) Compute the errors between the predictions and the true values
    # error = ...
    ### BEGIN SOLUTION
    error = y - y_pred
    ### END SOLUTION
    
    # 2) Compute the squared value of the errors
    # squared_error = ...
    ### BEGIN SOLUTION
    squared_error = error ** 2
    ### END SOLUTION
    
    # 3) Compute the mean squared value of the errors
    # mse = ...
    ### BEGIN SOLUTION
    mse = squared_error.mean()
    ### END SOLUTION
    
    return mse

Check if the output of your function matches the result below.

In [7]:
mse = mean_squared_error(y_diabetes, y_hat_diabetes)
print('Mean Squared Error Diabetes dataset: {}'.format(mse))
np.testing.assert_almost_equal(mse, 2859.6963, 3)

Mean Squared Error Diabetes dataset: 2859.6963475867506


### 1.3 Root Mean Squared Error
Implement the root mean squared error in the function below.

In [8]:
def root_mean_squared_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - true values
        
    Returns: 
        rmse : float with the Root Mean Squared Error Value
    """
    # 1) Compute the mean squared error. 
    # mse = ...
    ### BEGIN SOLUTION
    mse = mean_squared_error(y, y_pred)
    ### END SOLUTION
    
    # 2) Compute the square root of the mean squared error
    # rmse = ...
    ### BEGIN SOLUTION
    rmse = np.sqrt(mse)
    ### END SOLUTION
    
    return rmse

Check if the output of your function matches the result below.

In [9]:
rmse = root_mean_squared_error(y_diabetes, y_hat_diabetes)
print('Root Mean Squared Error Diabetes dataset: {}'.format(rmse))
np.testing.assert_almost_equal(rmse, 53.4761, 3)

Root Mean Squared Error Diabetes dataset: 53.476128764026576


Next, we will focus on the Coefficient of Determination - $R^2$ - and its adjusted form. See the equations below:

- $R^2$ score 

$$R^2 = 1 - \frac{MSE(y, \hat{y})}{MSE(y, \bar{y})} 
= 1 - \frac{\frac{1}{N} \sum_{n=1}^N (y_n - \hat{y}_n)^2}{\frac{1}{N} \sum_{n=1}^N (y_n - \bar{y})^2}
= 1 - \frac{\sum_{n=1}^N (y_n - \hat{y}_n)^2}{\sum_{n=1}^N (y_n - \bar{y})^2}$$

where $$\bar{y} = \frac{1}{N} \sum_{n=1}^N y_n$$

- Adjusted $R^2$ score 

$$\bar{R}^2 = 1 - \frac{N - 1}{N - K - 1} (1 - R^2)$$

where $N$ is the number of observations in the dataset used for training the model (i.e. number of rows of the pandas dataframe) and $K$ is the number of features used by your model (i.e. number of columns of the pandas dataframe).
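
As a quick sanity check of both equations, the sketch below evaluates them on made-up numbers. The arrays and the value `K_toy = 2` are assumptions chosen purely for illustration; they do not come from the diabetes data.

```python
import numpy as np

# Hypothetical toy data: only the last prediction is off, by 1.
y_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_hat_toy = np.array([1.0, 2.0, 3.0, 4.0, 4.0])

ss_res = ((y_toy - y_hat_toy) ** 2).sum()     # sum of squared errors of the model: 1.0
ss_tot = ((y_toy - y_toy.mean()) ** 2).sum()  # squared errors of the mean: 4 + 1 + 0 + 1 + 4 = 10.0
r2_toy = 1 - ss_res / ss_tot                  # 1 - 1/10

N_toy = y_toy.shape[0]  # number of observations
K_toy = 2               # assumed number of features, for illustration only
r2_adj_toy = 1 - (N_toy - 1) / (N_toy - K_toy - 1) * (1 - r2_toy)  # 1 - (4/2) * 0.1

print(r2_toy, r2_adj_toy)
```

Because $\frac{N - 1}{N - K - 1} > 1$ whenever $K \geq 1$, the adjusted score is lower than $R^2$ whenever $R^2 < 1$, and the gap widens as features are added without improving the fit.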


### 1.4 R² score

Now implement the $R^2$ score in the function below.

In [10]:
def r_squared(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - true values
        
    Returns: 
        r2 : float with R squared value
    """

    # 1) Compute the mean of the true values
    # y_mean = ...
    ### BEGIN SOLUTION
    y_mean = y.mean()
    ### END SOLUTION

    # 2) Compute the mean squared error between the true values and their mean.
    # mse_mean = ...
    ### BEGIN SOLUTION
    mse_mean = mean_squared_error(y, y_mean)
    ### END SOLUTION
    
    # 3) Compute the mean squared error between the true values and the predictions.
    # mse_pred = ...
    ### BEGIN SOLUTION
    mse_pred = mean_squared_error(y, y_pred)
    ### END SOLUTION
    
    # 4) Finally, compute R²
    # r2 = ...
    ### BEGIN SOLUTION
    r2 = 1 - (mse_pred/mse_mean)
    ### END SOLUTION
    
    return r2

Check if the output of your function matches the result below.

In [11]:
r2 = r_squared(y_diabetes, y_hat_diabetes)
print('R² Diabetes dataset: {}'.format(r2))
np.testing.assert_almost_equal(r2, 0.5177, 3)

R² Diabetes dataset: 0.5177484222203498


### 1.5 Adjusted R² score

Now implement the adjusted $R^2$ score in the function below.

In [12]:
def adjusted_r_squared(y, y_pred, K):
    """
    Args: 
        y : numpy.array with shape (num_samples,) - true values
        y_pred : numpy.array with shape (num_samples,) - predictions
        K : integer - Number of features used in the model that computed y_pred.

    Returns: 
        r2_adj : float with adjusted R squared value
    """
    
    # 1) Compute R².
    # r2 = ...
    ### BEGIN SOLUTION
    r2 = r_squared(y, y_pred)
    ### END SOLUTION
    
    # 2) Get number of samples 
    # N = ...
    ### BEGIN SOLUTION
    N = y.shape[0]
    ### END SOLUTION

    # 3) Compute adjusted R²
    # r2_adj = ...
    ### BEGIN SOLUTION
    r2_adj = 1 - ((N - 1) / (N - K - 1)) * (1 - r2)
    ### END SOLUTION
    
    return r2_adj

Check if the output of your function matches the results below.

In [13]:
r2 = adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])
print('Adjusted R² Diabetes dataset: {}'.format(r2))
np.testing.assert_almost_equal(r2, 0.5065, 3)

Adjusted R² Diabetes dataset: 0.5065592904853231


## 2 Scikit-Learn metrics

Now that you know how to calculate the metrics manually, let's use the scikit-learn implementations:

- `sklearn.metrics.mean_absolute_error`
- `sklearn.metrics.mean_squared_error`
- `sklearn.metrics.r2_score`
- `sklearn.linear_model.LinearRegression.score` 
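
The last item in the list is a method rather than a standalone function: for regressors, `score` returns the $R^2$ of the model's predictions on the data it is given. A minimal sketch on synthetic data illustrates this; the names `X_demo`, `y_demo` and `demo_model`, and the coefficients used to generate the data, are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data (an assumption for illustration): a noisy linear relationship.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

demo_model = LinearRegression().fit(X_demo, y_demo)

# score(X, y) predicts on X internally and compares against y ...
score_r2 = demo_model.score(X_demo, y_demo)
# ... which matches calling r2_score on explicit predictions.
manual_r2 = r2_score(y_demo, demo_model.predict(X_demo))

print(score_r2, manual_r2)
```

Note the argument order: `score` takes the features and the true values, while `r2_score` takes the true values and the predictions.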

In [14]:
# Import sklearn metrics
from sklearn import metrics as sklearn_metrics

### 2.1 Root Mean Squared Error

Implement the root mean squared error function using scikit-learn.

In [15]:
def sklearn_root_mean_squared_error(y, y_pred): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - true values
        
    Returns: 
        rmse : float with Root Mean Squared Error
    """
    ### BEGIN SOLUTION
    mse = sklearn_metrics.mean_squared_error(y, y_pred)
    rmse = np.sqrt(mse)
    return rmse
    ### END SOLUTION

Make sure your function passes the tests below.

In [16]:
rmse = sklearn_root_mean_squared_error(y_diabetes, y_hat_diabetes)
print('Sklearn RMSE Diabetes dataset: {}'.format(rmse))
np.testing.assert_almost_equal(rmse, 53.4760, 3)

Sklearn RMSE Diabetes dataset: 53.476128764026576


### 2.2 Adjusted R² score

Implement the adjusted R² score below using scikit-learn.

In [17]:
def sklearn_adjusted_r_squared(y, y_pred, K): 
    """
    Args: 
        y_pred : numpy.array with shape (num_samples,) - predictions
        y : numpy.array with shape (num_samples,) - true values
        K : integer - Number of features used in the model that computed y_pred.

    Returns: 
        r2_adj : float with adjusted R squared value
    """
    ### BEGIN SOLUTION
    r2 = sklearn_metrics.r2_score(y, y_pred)
    N = y.shape[0]
    r2_adj = 1 - ((N - 1) / (N - K - 1)) * (1 - r2)
    return r2_adj
    ### END SOLUTION

Make sure your function passes the tests below.

In [18]:
r2 = sklearn_adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])
print('Sklearn Adjusted R² Diabetes dataset: {}'.format(r2))
np.testing.assert_almost_equal(r2, 0.5065, 3)

Sklearn Adjusted R² Diabetes dataset: 0.5065592904853231


Finally, compare the sklearn-based metrics with your own implementations for the diabetes dataset.

In [19]:
MAE = mean_absolute_error(y_diabetes, y_hat_diabetes)
MSE = mean_squared_error(y_diabetes, y_hat_diabetes)
RMSE = root_mean_squared_error(y_diabetes, y_hat_diabetes)
R2 = r_squared(y_diabetes, y_hat_diabetes)
R2_adj = adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])

print('Metric for diabetes dataset with base implementation:')
print('Mean Absolute Error diabetes dataset: {}'.format(MAE))
print('Mean Squared Error diabetes dataset: {}'.format(MSE))
print('Root Mean Squared Error diabetes dataset: {}'.format(RMSE))
print('R² diabetes dataset: {}'.format(R2))
print('Adjusted R² diabetes dataset: {}'.format(R2_adj))
print('\n')

SK_MAE = sklearn_metrics.mean_absolute_error(y_diabetes, y_hat_diabetes)
SK_MSE = sklearn_metrics.mean_squared_error(y_diabetes, y_hat_diabetes)
SK_RMSE = sklearn_root_mean_squared_error(y_diabetes, y_hat_diabetes)
SK_R2 = sklearn_metrics.r2_score(y_diabetes, y_hat_diabetes)
SK_R2_adj = sklearn_adjusted_r_squared(y_diabetes, y_hat_diabetes, x_diabetes.shape[1])

print('Metric for diabetes dataset with scikitlearn:')
print('Mean Absolute Error diabetes dataset: {}'.format(SK_MAE))
print('Mean Squared Error diabetes dataset: {}'.format(SK_MSE))
print('Root Mean Squared Error diabetes dataset: {}'.format(SK_RMSE))
print('R² diabetes dataset: {}'.format(SK_R2))
print('Adjusted R² diabetes dataset: {}'.format(SK_R2_adj))

Metric for diabetes dataset with base implementation:
Mean Absolute Error diabetes dataset: 43.27745202531506
Mean Squared Error diabetes dataset: 2859.6963475867506
Root Mean Squared Error diabetes dataset: 53.476128764026576
R² diabetes dataset: 0.5177484222203498
Adjusted R² diabetes dataset: 0.5065592904853231


Metric for diabetes dataset with scikitlearn:
Mean Absolute Error diabetes dataset: 43.27745202531506
Mean Squared Error diabetes dataset: 2859.6963475867506
Root Mean Squared Error diabetes dataset: 53.476128764026576
R² diabetes dataset: 0.5177484222203498
Adjusted R² diabetes dataset: 0.5065592904853231


## 3 Using the Metrics

Now you'll fit the `LinearRegression` and `SGDRegressor` models and check their performance using the metrics. Your function should be able to calculate any of the three scikit-learn metrics used above (`mean_absolute_error`, `mean_squared_error` or `r2_score`), depending on the *scoring* parameter.

Implement the missing steps below.

In [20]:
def estimate_metrics_holdout(X, y, lr_choice='linear', scoring='mean_absolute_error'):
    """
    Fits the provided linear model to the data.
    Uses the hold-out method, splitting the data into train and test dataset.
    Calculates the provided metric on the test data prediction.
    
    Args: 
        X : numpy.array with shape (num_samples, num_features) - sample features
        y : numpy.array with shape (num_samples,) - true sample values
        lr_choice : string - which model to fit, either 'linear' or 'sgd'
        scoring : string - name of the sklearn metric: 'mean_absolute_error', 'mean_squared_error' or 'r2_score'

    Returns: 
        lr: model trained with the train dataset
        score : metric calculated for the test data
    """
    
    np.random.seed(42)
    
    if lr_choice == 'linear':
        lr = LinearRegression()
    elif lr_choice == 'sgd':
        lr = SGDRegressor(max_iter=10000, random_state=42)
    else:
        print('Invalid estimator')
        return None
    
    # Split the data into train and test (use a test size of 0.4 and a random_state = 42)
    ### BEGIN SOLUTION
    x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)
    ### END SOLUTION
    
    # Train your model using your train data
    ### BEGIN SOLUTION
    lr.fit(x_train, y_train)
    ### END SOLUTION
    
    # Use the trained model to create prediction for the test dataset (y_hat = ...)
    ### BEGIN SOLUTION
    y_hat = lr.predict(x_test)
    ### END SOLUTION
    
    # Calculate the metric for the test dataset prediction
    # score = ...
    ### BEGIN SOLUTION
    if scoring == 'mean_absolute_error':
        score = sklearn_metrics.mean_absolute_error(y_test, y_hat)
    elif scoring == 'mean_squared_error':
        score = sklearn_metrics.mean_squared_error(y_test, y_hat)
    elif scoring == 'r2_score':
        score = sklearn_metrics.r2_score(y_test, y_hat)
    else:
        print('Invalid scoring function')
        return None
    ### END SOLUTION
    
    return lr, score
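
The `if`/`elif` chain that selects the metric can also be written as a dictionary lookup. The sketch below is one possible alternative, not part of the exercise solution: the names `SCORERS` and `compute_score` are hypothetical, and raising `ValueError` for an unknown name is a design choice that differs from the `print` and `return None` used above.

```python
from sklearn import metrics as sklearn_metrics

# Hypothetical lookup table from scoring name to the sklearn metric function.
SCORERS = {
    'mean_absolute_error': sklearn_metrics.mean_absolute_error,
    'mean_squared_error': sklearn_metrics.mean_squared_error,
    'r2_score': sklearn_metrics.r2_score,
}

def compute_score(y_true, y_pred, scoring):
    """Return the metric named by `scoring`, or raise ValueError if it is unknown."""
    if scoring not in SCORERS:
        raise ValueError('Invalid scoring function: {}'.format(scoring))
    return SCORERS[scoring](y_true, y_pred)

# Usage on made-up values: absolute errors are 0 and 2, so the MAE is 1.0.
print(compute_score([1.0, 2.0], [1.0, 4.0], 'mean_absolute_error'))
```

Raising an exception makes a misspelled metric name fail at the call site, whereas returning `None` would only surface later, when the caller tries to unpack the result into `lr, score`.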

Let's calculate metrics using your function.

In [21]:
lr_lr, mae_lr = estimate_metrics_holdout(x_diabetes, y_diabetes, lr_choice='linear', scoring='mean_absolute_error')
np.testing.assert_almost_equal(mae_lr, 42.4965, 2)

lr_sgd, mae_sgd = estimate_metrics_holdout(x_diabetes, y_diabetes, lr_choice='sgd', scoring='mean_absolute_error')
np.testing.assert_almost_equal(mae_sgd, 42.9081, 2)

lr_lr, mse_lr = estimate_metrics_holdout(x_diabetes, y_diabetes, lr_choice='linear', scoring='mean_squared_error')
np.testing.assert_almost_equal(mse_lr, 2832.9913, 2)

lr_sgd, mse_sgd = estimate_metrics_holdout(x_diabetes, y_diabetes, lr_choice='sgd', scoring='mean_squared_error')
np.testing.assert_almost_equal(mse_sgd, 2871.8621, 2)

lr_lr, r2_lr = estimate_metrics_holdout(x_diabetes, y_diabetes, lr_choice='linear', scoring='r2_score')
np.testing.assert_almost_equal(r2_lr, 0.5157, 2)

lr_sgd, r2_sgd = estimate_metrics_holdout(x_diabetes, y_diabetes, lr_choice='sgd', scoring='r2_score')
np.testing.assert_almost_equal(r2_sgd, 0.5091, 2)

In [22]:
print('Hold-out method evaluation for Diabetes dataset:')

print('MAE with Linear Regression: {}'.format(mae_lr))
print('MAE with SGD: {}'.format(mae_sgd))
print('MSE with Linear Regression: {}'.format(mse_lr))
print('MSE with SGD: {}'.format(mse_sgd))
print('R² Score with Linear Regression: {}'.format(r2_lr))
print('R² Score with SGD: {}'.format(r2_sgd))

Hold-out method evaluation for Diabetes dataset:
MAE with Linear Regression: 42.49666434788084
MAE with SGD: 42.908235451680795
MSE with Linear Regression: 2832.9962397611503
MSE with SGD: 2871.8668811690122
R² Score with Linear Regression: 0.5157436313902429
R² Score with SGD: 0.5090993035971039
