# Model Evaluation

## 5.1 Cross-validation

**k-fold cross-validation**: Instead of relying on a single train/test split, the data is divided into k parts, called folds (usually 5 to 10). Each fold serves once as the test set while the model is trained on the remaining folds, which gives k test scores that can be averaged.

The spread of scores across the folds shows how sensitive the model is to the particular split, which gives an idea of how reliably it will generalise to new data. To summarise the performance of the model, we use the mean.

**Stratified k-fold cross-validation** is what scikit-learn uses by default for classifiers when `cv` is an integer. The splits preserve the class proportions of the full dataset in every fold. For example, if your data has 90% label A and 10% label B, each fold has roughly 90% label A samples and 10% label B samples.
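
The 90/10 example can be checked with a small sketch. The labels below are made up for illustration (90 samples of class 0 standing in for label A, 10 of class 1 for label B), and the feature matrix is a placeholder, since the splitter only looks at the labels.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced labels: 90 samples of class 0 ("A"), 10 of class 1 ("B")
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # placeholder features; stratification only uses y

skf = StratifiedKFold(n_splits=5)
fold_fractions = []
for train_idx, test_idx in skf.split(X, y):
    # Fraction of class 1 among the test samples of this fold
    fold_fractions.append(float(y[test_idx].mean()))

print(fold_fractions)
```

Each test fold should hold 2 samples of class 1 out of 20, so the 10% share of label B is kept in every fold.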

**IMPORTANT** Cross-validation is not a way to build a model that can predict on new data. It only tells you how well a given algorithm is expected to generalise when trained on a specific dataset. The models fitted on each fold are discarded, and `cross_val_score` returns only the scores, not a model.
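
A minimal sketch of this point, assuming the iris data and a logistic regression (the variable names are chosen for illustration): after `cross_val_score` runs, the estimator passed in is still unfitted, because scikit-learn fits clones of it internally. A usable model has to be fitted explicitly.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5)

# cross_val_score works on clones, so `model` itself has no learned coefficients yet
fitted_after_cv = hasattr(model, "coef_")

# To get a model that can predict, fit it explicitly (here on all the data)
model.fit(X, y)
fitted_after_fit = hasattr(model, "coef_")

print(fitted_after_cv, fitted_after_fit)  # False True
```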

In [10]:
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()

logreg = LogisticRegression(max_iter=100000)

scores = cross_val_score(logreg, iris.data, iris.target, cv=5)
# cv=5 is also the default number of folds; change the cv parameter to use a different number

print(f"Cross-validation scores: {scores}")
print(f"Mean cross-validation: {scores.mean():.3f}")

Cross-validation scores: [0.96666667 1.         0.93333333 0.96666667 1.        ]
Mean cross-validation: 0.973
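
Since the spread across folds matters as well as the mean, it can help to report both. The following is one possible way to summarise the scores, reusing the same dataset and estimator; which statistics to report is a choice, not a fixed convention.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=100000), X, y, cv=5)

# The mean summarises performance; std and min/max show the variation between folds
print(f"Mean:  {scores.mean():.3f}")
print(f"Std:   {scores.std():.3f}")
print(f"Range: {scores.min():.3f} to {scores.max():.3f}")
```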


The `cv` parameter also accepts a splitter object, which gives finer control over how the data is split. Say that, to reproduce some results, you want plain k-fold instead of the default stratified k-fold. You can do that by passing a `KFold` instance as `cv`, as below:

In [13]:
from sklearn.model_selection import KFold

kfold = KFold(n_splits=3)
print(f"Cross-validation scores: {cross_val_score(logreg, iris.data, iris.target, cv=kfold)}")
# all three scores are 0: iris is ordered by class, so with 3 unshuffled folds
# each test fold holds a single class that never appears in its training folds


# an alternative to stratified k-fold is to shuffle the data before splitting
# remember to set random_state to get reproducible results
kfold_shuffle = KFold(n_splits=3, shuffle=True, random_state=0)
print(f"Cross-validation scores with shuffling: {cross_val_score(logreg, iris.data, iris.target, cv=kfold_shuffle)}")

Cross-validation scores: [0. 0. 0.]
Cross-validation scores with shuffling: [0.98 0.96 0.96]
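
To see why unshuffled `KFold` scores 0 on iris, it can help to inspect which classes land in each split. This sketch only looks at the labels and does not train anything; `fold_classes` is a name introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold

iris = load_iris()

# Without shuffling, KFold cuts the samples into consecutive blocks
fold_classes = []
for train_idx, test_idx in KFold(n_splits=3).split(iris.data):
    train_labels = sorted(set(iris.target[train_idx].tolist()))
    test_labels = sorted(set(iris.target[test_idx].tolist()))
    fold_classes.append((train_labels, test_labels))

for train_labels, test_labels in fold_classes:
    print(f"train classes: {train_labels}, test classes: {test_labels}")
# train classes: [1, 2], test classes: [0]
# train classes: [0, 2], test classes: [1]
# train classes: [0, 1], test classes: [2]
```

In every split the test class is missing from the training data, so the classifier can never predict it.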


**Leave-one-out cross-validation** is k-fold where each fold is a single sample, so the model is trained once per sample. It is rarely used on large datasets because of that cost, but it can be useful on small ones.

In [14]:
from sklearn.model_selection import LeaveOneOut

loo = LeaveOneOut()

scores = cross_val_score(logreg, iris.data, iris.target, cv=loo)

print(f"Number of cv iterations: {len(scores)}")
print(f"Mean accuracy: {scores.mean():.2f}")

Number of cv iterations: 150
Mean accuracy: 0.97
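
The relationship to k-fold can be made concrete on a toy array (5 made-up samples, used only to show the indices): leave-one-out produces the same splits as `KFold` with `n_splits` equal to the number of samples.

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut

X = np.arange(10).reshape(5, 2)  # toy data: 5 samples, 2 features

# Collect (train indices, test indices) for each splitter
loo_splits = [(tr.tolist(), te.tolist()) for tr, te in LeaveOneOut().split(X)]
kfold_splits = [(tr.tolist(), te.tolist()) for tr, te in KFold(n_splits=len(X)).split(X)]

print(len(loo_splits))             # 5, one iteration per sample
print(loo_splits[0])               # ([1, 2, 3, 4], [0])
print(loo_splits == kfold_splits)  # True
```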
