[Docs] Use this package for Nested Cross-Validation #10

WenjieZ opened this issue Mar 8, 2020 · 0 comments


WenjieZ commented Mar 8, 2020

This issue documents how to use this package for nested cross-validation. If you have any questions, feel free to comment below.

Flat cross-validation vs. nested cross-validation

To clarify the meaning of these two terms in this specific issue, let me first describe them.

Flat cross-validation

Let us use 5-fold as an example. In 5-fold flat cross-validation, you split the dataset into 5 subsets. Each time, you train a model on 4 of them and test it on the remaining one. Afterwards, you average the 5 scores obtained on the 5 test subsets.

ooooo: training subset
*****: test subset

ooooo ooooo ooooo ooooo *****
ooooo ooooo ooooo ***** ooooo
ooooo ooooo ***** ooooo ooooo
ooooo ***** ooooo ooooo ooooo
***** ooooo ooooo ooooo ooooo

Naturally, the model you train depends on both the algorithm you use and the hyperparameters you supply. Therefore, the averaged score provides a criterion for evaluating both the algorithms and the hyperparameters. I will explain later whether these evaluations are accurate enough, but for now it suffices to understand the basic procedure.
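The procedure above can be sketched in a few lines of plain Python. This is only an illustration, not part of the package: the helper names (`kfold_indices`, `flat_cv_score`), the toy "shrunken mean" predictor, its `shrinkage` hyperparameter, and the made-up data are all assumptions introduced here.

```python
from statistics import mean

def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, equally sized folds."""
    fold_size = n_samples // n_splits  # assumes n_samples is divisible by n_splits
    for i in range(n_splits):
        test = list(range(i * fold_size, (i + 1) * fold_size))
        train = [j for j in range(n_samples) if j not in test]
        yield train, test

def flat_cv_score(y, n_splits, shrinkage):
    """Average test error of a toy 'shrunken mean' predictor over all folds."""
    fold_errors = []
    for train, test in kfold_indices(len(y), n_splits):
        prediction = shrinkage * mean(y[j] for j in train)                # "training"
        fold_errors.append(mean((y[j] - prediction) ** 2 for j in test))  # "testing"
    return mean(fold_errors)

y = [float(v % 7) for v in range(25)]  # 25 made-up target values

# One averaged score per hyperparameter value; the same score is used
# both to pick the hyperparameter and to judge the algorithm.
scores = {s: flat_cv_score(y, 5, s) for s in (0.5, 0.8, 1.0)}
best_shrinkage = min(scores, key=scores.get)
```

Note that the single averaged score does double duty here, which is exactly the property that nested cross-validation changes.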

Nested cross-validation

In contrast to flat cross-validation, which evaluates both the algorithms and the hyperparameters in one fell swoop, nested cross-validation evaluates them hierarchically. At the upper level, it evaluates the algorithms; at the lower level, it evaluates the hyperparameters within each algorithm.

Let us stick with the 5-fold setup. As before, we first split the dataset into 5 subsets. Let us call this the macro split; it lets us run the same experiment 5 times. In each run, we further split the training set into 5 sub-subsets. Let us call this the micro split. If the whole dataset has 25 samples, then the macro split sets aside 20 samples for training and 5 samples for testing in each run, and the micro split further divides the 20 training samples into 16 for training and 4 for testing.

Macro split:

12345 12345 12345 12345 *****   =>  further split to micro split -- No. 1
12345 12345 12345 ***** 12345   =>  further split to micro split -- No. 2
12345 12345 ***** 12345 12345   =>  further split to micro split -- No. 3
12345 ***** 12345 12345 12345   =>  further split to micro split -- No. 4
***** 12345 12345 12345 12345   =>  further split to micro split -- No. 5

(Indicative) micro split -- No. 1 (5 in total):

1111 2222 3333 4444 xxxx
1111 2222 3333 xxxx 5555
1111 2222 xxxx 4444 5555
1111 xxxx 3333 4444 5555
xxxx 2222 3333 4444 5555

In the upper-level macro split, we choose a target algorithm and dive into the lower-level micro split. With the target algorithm fixed, we vary the hyperparameters to get the evaluation for each hyperparameter and choose the optimal one. Then, we return to the upper level by fixing the hyperparameter as the optimal one and evaluate the target algorithm. Then, we choose another target algorithm and repeat the same procedure.

Let us call this 5x5 nested cross-validation. Of course, you can use an mxn nested cross-validation in general. The essence is to separate the evaluation of the algorithm from the evaluation of the hyperparameters.
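Here is a minimal sketch of the mxn procedure in plain Python, under stated assumptions: the toy "shrunken mean" predictor and the helper names (`kfold`, `error`, `nested_cv`) are hypothetical and not part of the package, and the hyperparameter is re-selected inside every macro run, which is the usual reading of the procedure described above.

```python
from statistics import mean

def kfold(indices, n_splits):
    """Split a list of indices into (train, test) pairs of contiguous folds."""
    fold_size = len(indices) // n_splits  # assumes an exact division
    for i in range(n_splits):
        test = indices[i * fold_size:(i + 1) * fold_size]
        train = indices[:i * fold_size] + indices[(i + 1) * fold_size:]
        yield train, test

def error(y, train, test, shrinkage):
    """Squared error of a toy 'shrunken mean' predictor fitted on `train`."""
    prediction = shrinkage * mean(y[j] for j in train)
    return mean((y[j] - prediction) ** 2 for j in test)

def nested_cv(y, grid, m=5, n=5):
    """Score one algorithm; hyperparameters are tuned only on the micro splits."""
    macro_errors = []
    for macro_train, macro_test in kfold(list(range(len(y))), m):
        # Lower level: evaluate each hyperparameter on the micro splits
        # of the macro training set only.
        def micro_error(shrinkage):
            return mean(error(y, tr, te, shrinkage)
                        for tr, te in kfold(macro_train, n))
        best = min(grid, key=micro_error)
        # Upper level: fix the hyperparameter and evaluate on the macro test set,
        # which played no part in choosing it.
        macro_errors.append(error(y, macro_train, macro_test, best))
    return mean(macro_errors)

y = [float(v % 7) for v in range(25)]  # 25 made-up target values
algorithm_score = nested_cv(y, grid=(0.5, 0.8, 1.0))
```

To compare several algorithms, one would call `nested_cv` once per algorithm (each with its own hyperparameter grid) and compare the returned scores.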

Use nested cross-validation for time series

In time series cross-validation, you need to introduce gaps between the training and test sets, which makes the problem tricky. Luckily, there is an easy workaround: 2xn nested cross-validation comes for free.

2x4 nested cross-validation

Macro split:

ooooo ooooo ooooo ooooo gap *****
***** gap ooooo ooooo ooooo ooooo

Micro split -- No. 1 (2 in total):

oooo oooo oooo gap ****
ooo ooo gap **** gap ooo
ooo gap **** gap ooo ooo
**** gap oooo oooo oooo

You can use my package tscv for this kind of 2xn nested cross-validation.
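To make the 2xn idea concrete, here is a hand-rolled sketch of the index bookkeeping in plain Python. It deliberately does not call tscv; the function names (`gap_kfold`, `two_by_n_splits`) and the parameters `test_size` and `gap` are assumptions made for illustration. The point it shows is that when the macro test set sits at either end of the series, the macro training set stays contiguous, so the micro split is just an ordinary gapped K-fold on it.

```python
def gap_kfold(indices, n_splits, gap):
    """Gapped K-fold on a contiguous index list.

    `gap` indices on each side of the test fold are dropped from training.
    """
    fold_size = len(indices) // n_splits  # assumes an exact division
    for i in range(n_splits):
        start, stop = i * fold_size, (i + 1) * fold_size
        test = indices[start:stop]
        train = indices[:max(start - gap, 0)] + indices[stop + gap:]
        yield train, test

def two_by_n_splits(n_samples, test_size, n, gap):
    """Yield (macro_train, macro_test, micro_splits) for the two macro runs."""
    idx = list(range(n_samples))
    macro = [
        # Run 1: test block at the end, gap just before it.
        (idx[:n_samples - test_size - gap], idx[n_samples - test_size:]),
        # Run 2: test block at the start, gap just after it.
        (idx[test_size + gap:], idx[:test_size]),
    ]
    for macro_train, macro_test in macro:
        yield macro_train, macro_test, list(gap_kfold(macro_train, n, gap))

# 26 time steps: 20 for training, 1 for the gap, 5 for testing in each macro run.
runs = list(two_by_n_splits(n_samples=26, test_size=5, n=4, gap=1))
```

In practice, the package's own splitters would replace `gap_kfold` here; the sketch only illustrates why no special handling is needed for the two macro runs.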

Why nested cross-validation?

The reason is that algorithms with more hyperparameters have an edge in flat cross-validation. The number of hyperparameters can be seen as the algorithm's capacity for "bribery": the more hyperparameters an algorithm has, the more severely it compromises the test dataset. Flat cross-validation, by nature, favours algorithms with many hyperparameters. In contrast, nested cross-validation puts every algorithm on the same starting line. That is why nested cross-validation is preferred when comparing algorithms whose numbers of hyperparameters differ significantly.

Then, does nested cross-validation provide an accurate way to evaluate the final chosen model? No. Although it helps you pick the best algorithm and its hyperparameters, the resulting model's performance is not what is being measured. Explaining this requires some advanced statistics. To avoid bloating this issue, I will only mention here that model(x*) is different from model(x)|x=x*. The good news, however, is that if your algorithm does not have too many hyperparameters, the cross-validation error will not be too far from the resulting model's error. Therefore, an algorithm that performs better in nested cross-validation is likely to yield a model with a better generalization error.

@WenjieZ WenjieZ self-assigned this Mar 8, 2020
@WenjieZ WenjieZ closed this as completed Mar 16, 2021