## Metrics for Regression Problems

- **MAE**: Mean Absolute Error
- **MSE**: Mean Squared Error   
- **RMSE**: Root Mean Squared Error
- **MSLE**: Mean Squared Logarithmic Error
- **RMSLE**: Root Mean Squared Logarithmic Error
- **MAPE**: Mean Absolute Percentage Error
- **R²**: R squared
- **MCC**: Matthews Correlation Coefficient (classification)
- **QWK**: Quadratic Weighted Kappa (ordinal classification)

### MAE: Mean Absolute Error

$$ MAE = \frac{1}{N} \sum^{N}_{i=1} \left| y_{t_{i}} - y_{p_{i}} \right| $$

In [None]:
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """
    This function calculates mae
    :param y_true: list of real numbers, true values
    :param y_pred: list of real numbers, predicted values
    :return: mean absolute error
    """
    
    # initialize error at 0
    error = 0
    # loop over all samples in the true and predicted list
    for yt, yp in zip(y_true, y_pred):
        # calculate absolute error
        # and add to error
        error += np.abs(yt - yp)

    # return mean error
    return error / len(y_true)


### MSE: Mean Squared Error

$$ MSE = \frac{1}{N} \sum^{N}_{i=1} (y_{t_{i}} - y_{p_{i}})^2 $$

In [None]:
def mean_squared_error(y_true, y_pred):
    """
    This function calculates mse
    :param y_true: list of real numbers, true values
    :param y_pred: list of real numbers, predicted values
    :return: mean squared error
    """
    # initialize error at 0
    error = 0
    
    # loop over all samples in the true and predicted list
    
    for yt, yp in zip(y_true, y_pred):
        # calculate squared error
        # and add to error
        error += (yt - yp) ** 2
    
    # return mean error
    return error / len(y_true)

### RMSE: Root Mean Squared Error

$$RMSE= \sqrt{MSE}$$
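
RMSE is the only metric in this part without a code cell. A minimal sketch, assuming the inputs can be converted to NumPy arrays; the name `root_mean_squared_error` and the sample values are chosen here purely for illustration:

```python
import numpy as np


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared error (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # mean of the squared differences, i.e. MSE
    mse = np.mean((y_true - y_pred) ** 2)
    # the square root brings the error back to the units of the target
    return np.sqrt(mse)


# errors are 0.5, 0.0, -1.5 and -1.0, so MSE = 3.5 / 4 = 0.875
rmse = root_mean_squared_error([3, 5, 2.5, 7], [2.5, 5, 4, 8])
print(rmse)
```

Because of the square root, RMSE is expressed in the same units as the target, which usually makes it easier to interpret than MSE.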

### MSLE: Mean Squared Logarithmic Error

In [None]:
def mean_squared_log_error(y_true, y_pred):
    """
    This function calculates msle
    :param y_true: list of real numbers, true values
    :param y_pred: list of real numbers, predicted values
    :return: mean squared logarithmic error

    """
    # initialize error at 0
    error = 0
    
    # loop over all samples in true and predicted list
    for yt, yp in zip(y_true, y_pred):
        # calculate squared log error
        # and add to error
        error += (np.log(1 + yt) - np.log(1 + yp)) ** 2
    
    # return mean error
    return error / len(y_true)
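
The loop above can also be written without an explicit loop, and taking the square root of its result gives RMSLE. The following is a sketch under the assumption that all values are greater than -1 (otherwise the logarithm is undefined); `msle_np` and `rmsle_np` are illustrative names, not part of any library:

```python
import numpy as np


def msle_np(y_true, y_pred):
    """Vectorized mean squared logarithmic error (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # np.log1p(x) computes log(1 + x)
    return np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)


def rmsle_np(y_true, y_pred):
    """Root mean squared logarithmic error: square root of MSLE."""
    return np.sqrt(msle_np(y_true, y_pred))


y_true = [3, 5, 2.5, 7]
y_pred = [2.5, 5, 4, 8]
print(msle_np(y_true, y_pred), rmsle_np(y_true, y_pred))
```

Since the error is computed on logarithms, it depends on the ratio (1 + true) / (1 + predicted) rather than on the absolute difference: predicting 19 for a true value of 9 yields the same error as predicting 199 for a true value of 99.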

### RMSLE: Root Mean Squared Logarithmic Error

$$RMSLE = \sqrt{MSLE}$$

### MPE: Mean Percentage Error

$$ Percentage\ Error_{i} = \frac{y_{t_{i}} - y_{p_{i}}}{y_{t_{i}}} \times 100 $$

In [None]:
def mean_percentage_error(y_true, y_pred):
    """
    This function calculates mpe
    :param y_true: list of real numbers, true values
    :param y_pred: list of real numbers, predicted values
    :return: mean percentage error
    """
    # initialize error at 0
    error = 0
    # loop over all samples in true and predicted list
    for yt, yp in zip(y_true, y_pred):
        # calculate percentage error
        # and add to error
        error += (yt - yp) / yt
    # return mean percentage error
    return error / len(y_true)

### MAPE: Mean Absolute Percentage Error

In [None]:
def mean_abs_percentage_error(y_true, y_pred):
    """
    This function calculates MAPE
    :param y_true: list of real numbers, true values
    :param y_pred: list of real numbers, predicted values
    :return: mean absolute percentage error
    """
    # initialize error at 0
    error = 0
    
    # loop over all samples in true and predicted list
    for yt, yp in zip(y_true, y_pred):
        # calculate percentage error
        # and add to error
        error += np.abs(yt - yp) / yt
    
    # return mean percentage error
    return error / len(y_true)
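
To see why the absolute value matters, the sketch below compares the signed and the absolute percentage error on two predictions that are off by the same amount in opposite directions. The names `mpe_np` and `mape_np` are illustrative, and raising an error when a true value is 0 is an assumption made here, not something the functions above do:

```python
import numpy as np


def mpe_np(y_true, y_pred):
    """Signed mean percentage error, returned as a fraction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) / y_true)


def mape_np(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # assumption: refuse to divide by zero instead of returning inf or nan
    if np.any(y_true == 0):
        raise ValueError("percentage errors are undefined when a true value is 0")
    return np.mean(np.abs((y_true - y_pred) / y_true))


y_true = [100, 100]
y_pred = [110, 90]
mpe = mpe_np(y_true, y_pred)    # over- and under-prediction cancel out
mape = mape_np(y_true, y_pred)  # both errors count as 10%
print(mpe, mape)
```

The signed version can report an error of zero for a model that is never right, which is why MAPE is usually the more informative of the two.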

### $R^2$: R-squared

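In simple words, R-squared (the coefficient of determination) says how well your
model fits the data. A value close to 1.0 means that the model fits the data quite
well, whereas a value close to 0 means that the model is barely better than always
predicting the mean of the true values. R-squared can also be negative when the
model does worse than that baseline.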

$$ R^2 = 1 - \frac{\sum^{N}_{i=1} (y_{t_{i}} - y_{p_{i}})^2}{\sum^{N}_{i=1} (y_{t_{i}} - y_{t_{mean}})^2} $$

In [None]:
def r2(y_true, y_pred):
    """
    This function calculates r-squared score
    :param y_true: list of real numbers, true values
    :param y_pred: list of real numbers, predicted values
    :return: r2 score
    """
    # calculate the mean value of true values
    mean_true_value = np.mean(y_true)
    
    # initialize numerator with 0
    numerator = 0
    
    # initialize denominator with 0
    denominator = 0
    
    # loop over all true and predicted values
    for yt, yp in zip(y_true, y_pred):
        # update numerator
        numerator += (yt - yp) ** 2
        # update denominator
        denominator += (yt - mean_true_value) ** 2
        
    # calculate the ratio
    ratio = numerator / denominator
    
    # return 1 - ratio
    return 1 - ratio
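
A short sketch of how the score behaves at its reference points, using a vectorized version of the same formula. The name `r2_np` and the toy values are assumptions made for illustration, and the function assumes the true values are not all identical (otherwise the denominator is 0):

```python
import numpy as np


def r2_np(y_true, y_pred):
    """Vectorized R-squared (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # residual sum of squares: numerator of the ratio
    ss_res = np.sum((y_true - y_pred) ** 2)
    # total sum of squares around the mean: denominator of the ratio
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot


y_true = [1, 2, 3, 4]
r2_perfect = r2_np(y_true, [1, 2, 3, 4])           # perfect predictions
r2_mean = r2_np(y_true, [2.5, 2.5, 2.5, 2.5])      # always predicting the mean
r2_reversed = r2_np(y_true, [4, 3, 2, 1])          # worse than the mean baseline
print(r2_perfect, r2_mean, r2_reversed)
```

Perfect predictions give 1, the constant mean prediction gives 0, and the reversed predictions give a negative score, so R-squared is not bounded below by 0.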

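All these metrics are implemented in sklearn. Most of them can also be written in a
single line with NumPy, for example:

```
def mae_np(y_true, y_pred):
    # y_true and y_pred are expected to be NumPy arrays
    return np.mean(np.abs(y_true - y_pred))
```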
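
For reference, this is roughly what calling the sklearn implementations looks like. It is a sketch that assumes scikit-learn is installed; the sample values are made up for illustration, and RMSE is obtained by taking the square root of the MSE result:

```python
import numpy as np
from sklearn import metrics

# toy values, chosen only for illustration
y_true = np.array([3, 5, 2.5, 7])
y_pred = np.array([2.5, 5, 4, 8])

mae = metrics.mean_absolute_error(y_true, y_pred)
mse = metrics.mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
msle = metrics.mean_squared_log_error(y_true, y_pred)
r2 = metrics.r2_score(y_true, y_pred)

print(mae, mse, rmse, msle, r2)
```

Comparing these values with the output of the hand-written functions is a quick way to check an implementation.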

### QWK: Quadratic Weighted Kappa (weighted Cohen’s kappa)

QWK measures the agreement between two sets of ratings.

The ratings can be any integers from 0 to N, and the predictions are in the same
range. Agreement is defined by how close these ratings are to each other, with
larger disagreements penalized quadratically. So, it’s **suitable for a
classification problem with N different ordered categories/classes**.

If the agreement is high, the score is close to 1.0.
In the case of low agreement, the score is close to 0, and it can drop below 0
when the agreement is worse than what chance would produce.

In [3]:
from sklearn import metrics

y_true = [1, 2, 3, 1, 2, 3, 1, 2, 3]
y_pred = [2, 1, 3, 1, 2, 3, 3, 1, 2]

qwk = metrics.cohen_kappa_score(y_true, y_pred, weights="quadratic")

acc = metrics.accuracy_score(y_true, y_pred)

print('Cohens Kappa', qwk)
print('Accuracy Score', acc)

Cohens Kappa 0.33333333333333337
Accuracy Score 0.4444444444444444


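You can see that QWK (0.33) is lower than accuracy (0.44): it corrects for the
agreement expected by chance and penalizes predictions more the further they are
from the true rating. A QWK greater than 0.85 is considered to be very good!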
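
To make the computation less of a black box, here is a sketch of quadratic weighted kappa written from scratch with NumPy. It assumes at least two distinct labels and treats the sorted labels as equally spaced ordinal levels; the function name is illustrative and this is not sklearn's internal code:

```python
import numpy as np


def quadratic_weighted_kappa(y_true, y_pred):
    """Quadratic weighted kappa from the observed and expected rating matrices."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)

    # observed[i, j]: number of samples with true rating i and predicted rating j
    observed = np.zeros((n, n))
    for yt, yp in zip(y_true, y_pred):
        observed[index[yt], index[yp]] += 1

    # counts expected if true and predicted ratings were independent
    row_totals = observed.sum(axis=1)
    col_totals = observed.sum(axis=0)
    expected = np.outer(row_totals, col_totals) / observed.sum()

    # quadratic penalty: 0 on the diagonal, 1 for the largest possible disagreement
    i, j = np.indices((n, n))
    weights = (i - j) ** 2 / (n - 1) ** 2

    return 1 - (weights * observed).sum() / (weights * expected).sum()


y_true = [1, 2, 3, 1, 2, 3, 1, 2, 3]
y_pred = [2, 1, 3, 1, 2, 3, 3, 1, 2]
qwk_manual = quadratic_weighted_kappa(y_true, y_pred)
print(qwk_manual)
```

On these ratings the weighted observed disagreement is 2.0 and the weighted expected disagreement is 3.0, so the score is 1 - 2/3, which matches the value printed by `cohen_kappa_score` above.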

### MCC: Matthews Correlation Coefficient

MCC ranges from -1 to 1. 1 is perfect prediction, -1 is completely inverse prediction
(every prediction is wrong), and 0 is no better than random prediction.

        
$$ MCC = \frac{TP \times TN - FP \times FN}{[ (TP + FP) \times (FN + TN) \times (FP + TN) \times (TP + FN) ] ^ {0.5}} $$


In [None]:
def mcc(y_true, y_pred):
    """
    This function calculates Matthew's Correlation Coefficient
    for binary classification.
    :param y_true: list of true values
    :param y_pred: list of predicted values
    :return: mcc score
    """
    tp = true_positive(y_true, y_pred)
    tn = true_negative(y_true, y_pred)
    fp = false_positive(y_true, y_pred)
    fn = false_negative(y_true, y_pred)

    numerator = (tp * tn) - (fp * fn)

    denominator = ( (tp + fp) * (fn + tn) * (fp + tn) * (tp + fn) )

    denominator = denominator ** 0.5

    return numerator/denominator
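
The function above relies on `true_positive`, `true_negative`, `false_positive` and `false_negative`, which are not defined in this part. The sketch below is a self-contained variant that assumes binary labels encoded as 0 and 1; `confusion_counts` and `mcc_binary` are illustrative names, and returning 0.0 when the denominator is 0 is a convention chosen here, not part of the formula:

```python
def confusion_counts(y_true, y_pred):
    """Count tp, tn, fp, fn for labels encoded as 0 (negative) and 1 (positive)."""
    tp = tn = fp = fn = 0
    for yt, yp in zip(y_true, y_pred):
        if yt == 1 and yp == 1:
            tp += 1
        elif yt == 0 and yp == 0:
            tn += 1
        elif yt == 0 and yp == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc_binary(y_true, y_pred):
    """Matthews correlation coefficient for binary labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    numerator = tp * tn - fp * fn
    denominator = ((tp + fp) * (fn + tn) * (fp + tn) * (tp + fn)) ** 0.5
    # assumption: report 0.0 when a row or column of the confusion matrix is empty
    if denominator == 0:
        return 0.0
    return numerator / denominator


# tp = 2, tn = 4, fp = 1, fn = 1, so MCC = (8 - 1) / sqrt(3 * 5 * 5 * 3) = 7 / 15
score = mcc_binary([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
print(score)
```

Flipping every prediction, for example `mcc_binary([1, 0, 1, 0], [0, 1, 0, 1])`, gives -1, the lower end of the range.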

One thing to keep in mind is that to evaluate unsupervised methods, for example,
some kind of clustering, it’s better to create or manually label the test set and keep
it separate from everything that is going on in your modelling part. When you are
done with clustering, you can evaluate the performance on the test set simply by
using any of the supervised learning metrics.
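
One practical detail is that cluster ids are arbitrary, so they have to be matched to the manual labels before a supervised metric can be applied. The sketch below uses a majority vote per cluster, which is only one possible choice and an assumption of this example; the function name and the toy data are illustrative:

```python
from collections import Counter, defaultdict


def map_clusters_to_labels(y_true, cluster_ids):
    """Assign each cluster the most frequent true label among its members."""
    members = defaultdict(list)
    for yt, cid in zip(y_true, cluster_ids):
        members[cid].append(yt)
    return {cid: Counter(labels).most_common(1)[0][0]
            for cid, labels in members.items()}


# manually labelled test set and the cluster id assigned to each sample
y_true = [0, 0, 0, 1, 1, 1, 2, 2]
cluster_ids = [2, 2, 1, 0, 0, 0, 1, 1]

mapping = map_clusters_to_labels(y_true, cluster_ids)
y_pred = [mapping[cid] for cid in cluster_ids]

# any supervised metric can now be applied; plain accuracy is used here
accuracy = sum(yt == yp for yt, yp in zip(y_true, y_pred)) / len(y_true)
print(mapping, accuracy)
```

Here one sample with true label 0 falls into the cluster dominated by label 2, so 7 of the 8 samples are counted as correct.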

Once we understand what metric to use for a given problem, we can start looking
more deeply into our models for improvements.