# Model Evaluation Techniques

This notebook covers only the commonly used evaluation metrics for regression and classification. The list is not exhaustive; you are encouraged to explore the other metrics that are available.

**References**:  
(1) Scikit-Learn : https://scikit-learn.org/stable/modules/model_evaluation.html  
(2) https://github.com/maykulkarni/Machine-Learning-Notebooks

**Useful Resources :**  
https://scikit-learn.org/stable/modules/model_evaluation.html  
https://scikit-learn.org/stable/modules/model_evaluation.html#mean-absolute-error

# 1. Regression Metrics

## 1.1 R^2 score (Coefficient of Determination)

More information on the math behind the use of the R^2 score can be found here : https://nbviewer.jupyter.org/github/maykulkarni/Machine-Learning-Notebooks/blob/master/05.%20Model%20Evaluation/R%20Squared.ipynb

In [7]:
# Most of the metrics implemented in the scikit-learn library will be found here. (sklearn.metrics)
from sklearn.metrics import r2_score

In [30]:
"""
    Arbitrarily define y_true and y_pred.
    Assume that the following are the ground truth (y_true) and the model predictions (y_pred).
"""
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

In [11]:
# The signature of the function is : r2_score(y_true, y_pred)
# An r2_score of 1 means that the predictions match the ground truth exactly. A model that always predicts the
# mean of y_true scores 0, and a worse model can score below 0. In practice, a well-fitted model scores close to 1.
r2_score(y_true, y_pred)

0.9486081370449679
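
As a rough sketch of what `r2_score` computes, the coefficient of determination can be written as 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of `y_true` from its mean. The variable names below (`ss_res`, `ss_tot`, `r2_manual`) are illustrative and not part of scikit-learn.

```python
# Manual R^2 on the same toy data, using only plain Python.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

mean_true = sum(y_true) / len(y_true)

# Residual sum of squares: squared error left over after the model's predictions.
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
# Total sum of squares: squared error of a baseline that always predicts the mean.
ss_tot = sum((t - mean_true) ** 2 for t in y_true)

r2_manual = 1 - ss_res / ss_tot
print(r2_manual)  # agrees with r2_score(y_true, y_pred) up to floating-point rounding
```

Written this way, it is easier to see why the score is 0 for a model that only predicts the mean (SS_res equals SS_tot) and negative for anything worse.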

## 1.2 Mean Absolute Error (MAE)

In [13]:
# Import the function.
from sklearn.metrics import mean_absolute_error

In [15]:
# Signature of the function is : mean_absolute_error(y_true, y_pred)
mean_absolute_error(y_true, y_pred)

0.5

For more information on MAE, refer to the following link : https://scikit-learn.org/stable/modules/model_evaluation.html#mean-absolute-error
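
For intuition, MAE is simply the average of the absolute differences between each prediction and its ground truth. The following minimal sketch reproduces the value above by hand; `abs_errors` and `mae_manual` are names chosen here for illustration.

```python
# Manual MAE on the same toy data.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Absolute error of each prediction: [0.5, 0.5, 0, 1]
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

mae_manual = sum(abs_errors) / len(abs_errors)
print(mae_manual)  # 0.5
```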

## 1.3 Mean Squared Error (MSE)

In [26]:
# Import the function.
from sklearn.metrics import mean_squared_error

In [27]:
# Signature of the function is : mean_squared_error(y_true, y_pred)
mean_squared_error(y_true, y_pred)

0.375

For the theory behind mean_squared_error, refer to : https://scikit-learn.org/stable/modules/model_evaluation.html#mean-squared-error
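
MSE follows the same pattern as MAE, except that each error is squared before averaging, so large errors are penalised more heavily. A minimal sketch, with `squared_errors` and `mse_manual` as illustrative names:

```python
# Manual MSE on the same toy data.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Squared error of each prediction: [0.25, 0.25, 0, 1]
squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]

mse_manual = sum(squared_errors) / len(squared_errors)
print(mse_manual)  # 0.375
```

Note that the last sample, with an error of 1, contributes two thirds of the total squared error, whereas it contributes only half of the total absolute error.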

## 1.4 Root Mean Squared Error (RMSE)

In [28]:
# Import the function.
from sklearn.metrics import mean_squared_error

# Import numpy (will be using some of its functionality)
import numpy as np

In [29]:
# RMSE is the square root of MSE, so we apply np.sqrt to mean_squared_error(y_true, y_pred).
np.sqrt(mean_squared_error(y_true, y_pred))

0.6123724356957945
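
Because RMSE is expressed in the same units as the target, it can be compared directly with MAE. The sketch below computes both by hand with the standard library only; the names `errors`, `mae_manual` and `rmse_manual` are assumptions made for this example.

```python
import math

# Same toy data as in the regression sections above.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

errors = [t - p for t, p in zip(y_true, y_pred)]

mae_manual = sum(abs(e) for e in errors) / len(errors)
rmse_manual = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# RMSE is never smaller than MAE; the gap grows when a few errors are much larger than the rest.
print(mae_manual, rmse_manual)
```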

# 2. Classification Metrics

## 2.1 Accuracy Score

In [35]:
# Import the function.
from sklearn.metrics import accuracy_score

In [37]:
"""
    Assume y_true and y_pred to be the following.
"""
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]

In [38]:
# Compute the accuracy score. A value of 0.5 means that 50% of the test samples were classified correctly.
accuracy_score(y_true, y_pred)

0.5

In [39]:
# If normalize=False, the number of correctly classified samples is returned instead of the fraction.
accuracy_score(y_true, y_pred, normalize=False)

2
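
Both results can be reproduced by counting the positions where the prediction equals the true label. A minimal sketch, where `n_correct` and `accuracy_manual` are illustrative names:

```python
# Manual accuracy on the same toy labels.
y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]

# Number of exact matches (what normalize=False returns).
n_correct = sum(t == p for t, p in zip(y_true, y_pred))

# Fraction of exact matches (the default behaviour).
accuracy_manual = n_correct / len(y_true)

print(n_correct, accuracy_manual)  # 2 0.5
```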

## 2.2 Confusion Matrix

In [40]:
# Import the function.
from sklearn.metrics import confusion_matrix

In [41]:
"""
    Assumption of y_true and y_pred.
"""
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]

In [42]:
"""
    To understand a confusion matrix, you'll need to understand the terms : true positive, true negative,
    false negative and false positive. For more information on them, refer to the following link.
    In the output, row i corresponds to the true class and column j to the predicted class.
    
    Link : https://en.wikipedia.org/wiki/Confusion_matrix
"""
confusion_matrix(y_true, y_pred)

array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])

The confusion matrix is usually plotted (for example, as a heatmap), because that is much easier to read. We will not do that here, but you are free to explore it on your own.
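
To see where each entry comes from, the matrix can be built by hand: entry (i, j) counts the samples whose true class is i and whose predicted class is j. This is only a sketch of the idea, not scikit-learn's implementation; `labels`, `index` and `matrix` are names chosen for this example.

```python
# Build the confusion matrix by counting (true, predicted) pairs.
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]

# Sorted list of every class that appears: [0, 1, 2]
labels = sorted(set(y_true) | set(y_pred))
index = {label: i for i, label in enumerate(labels)}

# Rows are true classes, columns are predicted classes.
matrix = [[0] * len(labels) for _ in labels]
for t, p in zip(y_true, y_pred):
    matrix[index[t]][index[p]] += 1

for row in matrix:
    print(row)
# [2, 0, 0]
# [0, 0, 1]
# [1, 0, 2]
```

The diagonal holds the correctly classified samples; every off-diagonal entry is a misclassification, such as the single class 1 sample that was predicted as class 2.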

## 2.3 Classification Report

The classification_report function builds a text report showing the main classification metrics.

In [43]:
# Import the function.
from sklearn.metrics import classification_report

In [44]:
# Dummy Dataset (Assumptions)
y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 1, 0]
target_names = ['class 0', 'class 1', 'class 2']

In [47]:
# Think about why print() is used here, when it was not needed anywhere above.
print(classification_report(y_true, y_pred, target_names=target_names))

              precision    recall  f1-score   support

     class 0       0.67      1.00      0.80         2
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.50      0.67         2

    accuracy                           0.60         5
   macro avg       0.56      0.50      0.49         5
weighted avg       0.67      0.60      0.59         5
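
The last two rows of the report differ only in how the per-class values are combined: "macro avg" is a plain mean over classes, while "weighted avg" weights each class by its support. The sketch below checks this for the precision column, using the exact per-class values (2/3, 0 and 1) rather than the rounded ones; the variable names are illustrative.

```python
# Per-class precision and support, as shown in the report above.
precision_per_class = [2 / 3, 0.0, 1.0]
support = [2, 1, 2]

# Macro average: every class counts equally.
macro_avg = sum(precision_per_class) / len(precision_per_class)

# Weighted average: each class counts in proportion to its support.
weighted_avg = sum(p * s for p, s in zip(precision_per_class, support)) / sum(support)

print(round(macro_avg, 2), round(weighted_avg, 2))  # 0.56 0.67
```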



## 2.4 Precision, Recall and F1 Score

These three metrics are generally used together, because computing the F1 score requires the values of both precision and recall.

### 2.4.1 Precision

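Precision is the ability of the classifier not to label a negative sample as positive. It is the ratio tp / (tp + fp), where tp is the number of true positives and fp is the number of false positives. The best value is 1 and the worst value is 0.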

In [50]:
# Import the function.
from sklearn.metrics import precision_score

In [52]:
# Computing the precision score. average="weighted" averages the per-class precision, weighted by each class's support.
precision_score(y_true, y_pred, average="weighted")

0.6666666666666666
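
In the multiclass case, precision is computed one class at a time by treating that class as "positive" and all others as "negative". The sketch below does this by counting; the helper `precision_for` is hypothetical, and returning 0.0 when a class is never predicted is an assumption made here for simplicity.

```python
# Same labels as in the classification report example.
y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 1, 0]
labels = [0, 1, 2]

def precision_for(label):
    """Hypothetical helper: tp / (tp + fp) for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)  # tp + fp
    return tp / predicted if predicted else 0.0

per_class_precision = [precision_for(c) for c in labels]
support = [y_true.count(c) for c in labels]

# Support-weighted average of the per-class values.
weighted_precision = sum(p * s for p, s in zip(per_class_precision, support)) / len(y_true)
print(per_class_precision, weighted_precision)
```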

### 2.4.2 Recall

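Recall is the ability of the classifier to find all the positive samples. It is the ratio tp / (tp + fn), where tp is the number of true positives and fn is the number of false negatives. The best value is 1 and the worst value is 0.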

In [53]:
# Import the function.
from sklearn.metrics import recall_score

In [57]:
# Computing the recall score. average="weighted" averages the per-class recall, weighted by each class's support.
recall_score(y_true, y_pred, average="weighted")

0.6
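
Per-class recall divides the true positives of a class by the number of samples that truly belong to it. The sketch below, with the hypothetical helper `recall_for`, also illustrates a handy identity: the support-weighted recall equals the overall accuracy, because the supports cancel out.

```python
# Same labels as in the classification report example.
y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 1, 0]
labels = [0, 1, 2]

def recall_for(label):
    """Hypothetical helper: tp / (tp + fn) for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    actual = y_true.count(label)  # tp + fn
    return tp / actual if actual else 0.0

per_class_recall = [recall_for(c) for c in labels]
support = [y_true.count(c) for c in labels]

weighted_recall = sum(r * s for r, s in zip(per_class_recall, support)) / len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(per_class_recall, weighted_recall, accuracy)
```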

### 2.4.3 F1 Score

The F1 score can be interpreted as the harmonic mean of precision and recall; it reaches its best value at 1 and its worst value at 0. Precision and recall contribute equally to the F1 score. The formula for the F1 score is:

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

In [56]:
# Import the function.
from sklearn.metrics import f1_score

In [58]:
# Computing the f1 score. average="weighted" averages the per-class F1 score, weighted by each class's support.
f1_score(y_true, y_pred, average="weighted")

0.5866666666666667
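
The weighted F1 value can be traced back to the per-class precision and recall from the classification report, using F1 = 2 * precision * recall / (precision + recall) for each class. This is a minimal sketch; the helper `f1` is hypothetical, and treating the 0 / 0 case as 0.0 is an assumption made here.

```python
# Exact per-class precision, recall and support for the example above.
precision = [2 / 3, 0.0, 1.0]
recall = [1.0, 0.0, 0.5]
support = [2, 1, 2]

def f1(p, r):
    """Hypothetical helper: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

f1_per_class = [f1(p, r) for p, r in zip(precision, recall)]

# Support-weighted average of the per-class F1 scores.
weighted_f1 = sum(f * s for f, s in zip(f1_per_class, support)) / sum(support)
print([round(f, 2) for f in f1_per_class], round(weighted_f1, 4))
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a class needs both high precision and high recall to obtain a high F1 score.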

## Optional

Another commonly used classification metric is ROC-AUC (the area under the receiver operating characteristic curve). You can read more about it here : https://scikit-learn.org/stable/modules/model_evaluation.html