Chapter 18
# Estimation with Cross-Validation

Cross-validation is a resampling procedure used to evaluate the skill of machine learning models on a limited data sample.

The procedure has a single parameter called k that refers to the number of groups that a given data sample is to be split into.  As such, the procedure is often called k-fold cross-validation.

It is a popular method because it is simple to understand and generally gives a less biased, less optimistic estimate of model skill than other methods, such as a simple train/test split.  The general procedure is as follows:

1. shuffle the dataset randomly
2. split the dataset into k groups of approximately equal size
3. for each unique group:
   - take the group as a hold out or test data set
   - take the remaining groups as a training data set
   - fit a model on the training set
   - evaluate the model on the test set
   - retain the evaluation score and discard the model
4. summarise the skill of the model using the sample of model evaluation scores
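
The steps above can be sketched by hand with NumPy. This is a minimal illustration, not a reference implementation: the helper names `k_fold_indices` and `cross_validate_mean_model` are hypothetical, and the "model" is a toy that always predicts the mean of its training data.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k groups."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    return np.array_split(indices, k)

def cross_validate_mean_model(y, k, seed=0):
    """Run k-fold cross-validation for a toy model that predicts the training mean."""
    folds = k_fold_indices(len(y), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # every group except the current one forms the training set
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        prediction = y[train_idx].mean()                # "fit" the model
        mse = np.mean((y[test_idx] - prediction) ** 2)  # evaluate on the hold-out group
        scores.append(mse)                              # keep the score, discard the model
    return np.array(scores)

y = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
scores = cross_validate_mean_model(y, k=3)
print('scores:', scores)
print('mean: %.4f' % scores.mean())
```

Each index lands in exactly one group, so every observation is tested once and used for training in the other k-1 iterations.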

Importantly, each observation in the data sample is assigned to an individual group, and stays in that group for the duration of the procedure.  Thus each observation is used in the test set exactly once, and in the training set (k-1) times.

It is important that any preparation of the data prior to fitting the model occur on the cross-validation-assigned training dataset within the loop, rather than on the broader dataset.  This prevents data leakage and an optimistic estimate of the model skill.
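
One way to keep data preparation inside the loop is to wrap it in a scikit-learn pipeline, so that the transform is re-fitted on the training folds of every split. The sketch below assumes a synthetic regression dataset and a `StandardScaler` followed by `LinearRegression`; both choices are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic regression data (an assumption, for illustration only)
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=60)

# the scaler is a pipeline step, so it is fitted on the training folds only
model = make_pipeline(StandardScaler(), LinearRegression())
cv = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(model, X, y, cv=cv, scoring='neg_mean_squared_error')
print('MSE per fold:', -scores)
```

Scaling the whole of `X` before calling `cross_val_score` would let statistics from the test folds influence the training data, which is the leakage described above.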

The results of a k-fold cross-validation run are often summarised with the mean of the model skill scores.  It is also good practice to include a measure of the variance, such as the standard deviation or standard error.
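
As a small sketch of that summary, the snippet below computes the mean, sample standard deviation and standard error for a set of fold scores. The score values are hypothetical, and because the folds share training data the standard error should be read as a rough guide only.

```python
import numpy as np

# hypothetical skill scores from a 5-fold run
scores = np.array([0.81, 0.79, 0.84, 0.80, 0.76])

mean = scores.mean()
std = scores.std(ddof=1)           # sample standard deviation
sem = std / np.sqrt(len(scores))   # standard error of the mean
print('mean=%.3f std=%.3f sem=%.3f' % (mean, std, sem))
```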

# Configuration of k
The k-value must be chosen carefully for your data sample.  A poorly chosen value may give a misleading picture of model skill, such as a score with high variance (one that changes a lot depending on the data used to fit the model) or high bias (such as an overestimate of skill).  Common tactics for choosing k:
- Representative - k is chosen such that each train/test group of data samples is large enough to be statistically representative of the broader dataset
- k=10 - this value has been found through experimentation to generally result in a model skill estimate with low bias and a modest variance
- k=n - k is set to n, the size of the dataset, so that each observation is used once as the hold out test set.  This approach is called leave-one-out cross-validation

The choice of k is usually 5 or 10, but there is no formal rule.  As k gets larger, the difference in size between the training set and resampling subsets gets smaller, and the bias of the technique reduces.

If you are struggling to choose a value, k=10 is recommended.  However, it is preferable to split the data sample into k groups with the same number of samples if possible, so that the model skill scores are all computed on test sets of the same size.

# Worked Example
Imagine we have a data set with 6 observations:
- [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

Pick a value for k - we will choose k=3.  Shuffle the data and then split it into 3 groups of 2 observations each:
- Fold1: [0.5, 0.2]
- Fold2: [0.1, 0.3]
- Fold3: [0.4, 0.6]

Use the sample to evaluate the skill of a machine learning algorithm.  Three models are trained and evaluated, with each fold given a chance to be the test set:
- Model1: trained on Fold1 + Fold2, Tested on Fold3
- Model2: trained on Fold2 + Fold3, Tested on Fold1
- Model3: trained on Fold1 + Fold3, Tested on Fold2

The models are discarded after they are evaluated.  The skill scores are collected for each model and summarised for use.
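
The same three train/test combinations can be written out in plain Python. This is only a sketch of the bookkeeping, using the fold contents listed above; no model is fitted, and the iteration order differs from the Model1 to Model3 numbering.

```python
# the three folds from the worked example
folds = {
    'Fold1': [0.5, 0.2],
    'Fold2': [0.1, 0.3],
    'Fold3': [0.4, 0.6],
}

splits = []
for test_name, test in folds.items():
    # the training set is every observation from the other two folds
    train = [x for name, fold in folds.items() if name != test_name for x in fold]
    splits.append((train, test))
    print('train: %s, test: %s (%s held out)' % (train, test, test_name))
```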

We do not need to implement k-fold cross-validation manually.  The scikit-learn library provides the KFold() class to split a given data sample.  It takes as arguments:
- the number of splits
- whether or not to shuffle the sample
- the seed for the pseudorandom number generator used prior to the shuffle

The split() method can then be called on the KFold instance, with the data sample provided as an argument.  On each iteration it yields a pair of arrays containing the indices into the original data sample of the observations to use for the train and test sets.

Usually the k-fold cross-validation implementation in scikit-learn is provided as a component operation within broader methods, such as grid-searching model hyperparameters and scoring a model on a dataset.
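
As a hedged illustration of k-fold used inside a broader method, the sketch below passes a `KFold` object to `GridSearchCV` through its `cv` argument. The synthetic data, the `Ridge` model and the `alpha` grid are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold

# synthetic regression data (an assumption, for illustration only)
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, 0.0, -1.0, 2.0]) + rng.normal(scale=0.5, size=50)

# every candidate alpha is scored with the same 5-fold split
cv = KFold(n_splits=5, shuffle=True, random_state=1)
param_grid = {'alpha': [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(Ridge(), param_grid=param_grid, cv=cv)
search.fit(X, y)

print('best alpha:', search.best_params_['alpha'])
print('mean CV score of best alpha: %.3f' % search.best_score_)
```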

In [2]:
# scikit-learn k-fold cross-validation
from numpy import array
from sklearn.model_selection import KFold

# data sample
data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

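# prepare cross validation to split the dataset into 3 folds, shuffling prior to the split, with a seed of 1
# (shuffle and random_state must be passed by keyword in recent scikit-learn versions)
kfold = KFold(n_splits=3, shuffle=True, random_state=1)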

# enumerate splits
for train, test in kfold.split(data):
    print('train: %s, test: %s' % (data[train], data[test]))

train: [0.1 0.4 0.5 0.6], test: [0.2 0.3]
train: [0.2 0.3 0.4 0.6], test: [0.1 0.5]
train: [0.1 0.2 0.3 0.5], test: [0.4 0.6]


# Variations on Cross-Validation
Commonly used variations on the k-fold cross-validation procedure are:
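- Train/Test Split - taken to one extreme, k may be set to 2 (the smallest valid value, since k=1 would leave no data to train on), which splits the data into two halves that each serve once as the test set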
- LOOCV - taken to another extreme, k may be set to the total number of observations in the dataset, so that each observation is given a chance to be the test dataset.  This is called leave-one-out cross-validation (LOOCV)
- Stratified - the splitting of data into folds may be governed by criteria such as ensuring that each fold has the same proportion of observations with a given categorical value, such as the class outcome value
- Repeated - the k-fold cross-validation procedure is repeated n times where, importantly, the data sample is shuffled prior to each repetition, thus resulting in a different split of the sample
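
A few of these variations can be checked directly with scikit-learn's splitter classes. The sketch below uses a tiny made-up dataset of 6 observations with an imbalanced class label; the sizes and seeds are arbitrary.

```python
import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut, RepeatedKFold, StratifiedKFold

X = np.arange(12).reshape(6, 2)
y = np.array([0, 0, 0, 0, 1, 1])

# LOOCV is k-fold with k equal to the number of observations
kfold_n = [test.tolist() for _, test in KFold(n_splits=len(X)).split(X)]
loo = [test.tolist() for _, test in LeaveOneOut().split(X)]
print('k=n matches LeaveOneOut:', kfold_n == loo)

# repeated k-fold yields n_splits * n_repeats train/test pairs
rkf = RepeatedKFold(n_splits=3, n_repeats=4, random_state=1)
n_repeated = rkf.get_n_splits(X)
print('repeated splits:', n_repeated)

# stratified folds keep the class proportions of y in every test fold
skf = StratifiedKFold(n_splits=2)
class_counts = [np.bincount(y[test], minlength=2).tolist() for _, test in skf.split(X, y)]
print('class counts per test fold:', class_counts)
```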

# Extensions

In [3]:
# split a larger random data sample using k-fold cross-validation (here with scikit-learn's KFold rather than a hand-written function)
import numpy as np
from sklearn.model_selection import KFold

# unseeded data sample
data = np.random.randint(0, 100, 100)
print('Data:', data)

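# prepare cross validation to split the dataset into 10 folds, shuffling prior to the split, with a seed of 1
# (shuffle and random_state must be passed by keyword in recent scikit-learn versions)
kfold = KFold(n_splits=10, shuffle=True, random_state=1)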

# enumerate splits
for train, test in kfold.split(data):
    print('train: %s, test: %s' % (data[train], data[test]))

Data: [35 27 78 64 18 24 82 87 60 37 95 65 68 61 18 42 92 85 24 73 33 45 10 38
 76 78 88 47 23 56 69 42 12 85 64 13  5 43  7  7 84 13 38 54 77 54 69 61
 50  6 29 78 96 90  1 57 38 74 36 59 37  0 91  2 20 84 63 87 41 14 88  4
 96 16 56 61 92 55 29  0 13 70  4 56 44  1 81 62 34 65 91 91 22 39 99 81
 20 13 59 59]
train: [35 27 78 64 18 24 82 87 60 37 95 65 68 61 18 42 92 24 73 33 45 10 38 76
 78 88 47 23 56 69 42 12 64 13 43  7  7 84 13 38 54 77 54 69 61 50  6 29
 78 96 90  1 57 38 74 36 59 37  0 91  2 20 63 87 41 88  4 96 16 56 61 92
 55 29  0 56  1 81 62 34 65 91 91 22 99 81 20 13 59 59], test: [85 85  5 84 14 13 70  4 44 39]
train: [35 27 78 64 18 24 82 87 60 37 65 68 61 18 42 92 85 24 73 33 45 10 38 76
 78 88 47 23 56 69 85 64 13  5 43  7 84 13 38 54 54 69 61 50  6 29 90  1
 57 74 36 59 37  0 91  2 20 84 63 87 41 14 88  4 96 16 56 61 92 55  0 13
 70  4 56 44  1 81 62 34 65 91 91 39 99 81 20 13 59 59], test: [95 42 12  7 77 78 96 38 29 22]
train: [35 27 64 18 24 82 87 60 37 95 65 68 61

Develop examples to demonstrate each of the main types of cross-validation supported by scikit-learn:
- GroupKFold
- GroupShuffleSplit
- LeaveOneGroupOut
- LeavePGroupsOut
- LeaveOneOut
- LeavePOut
- PredefinedSplit
- RepeatedKFold
- RepeatedStratifiedKFold
- ShuffleSplit
- StratifiedKFold
- StratifiedShuffleSplit
- TimeSeriesSplit

In [4]:
# GroupKFold example
# k-fold iterator variant with non-overlapping groups - the same group will not appear in two different folds.  The number of distinct groups must be at least equal to the number of folds, and the number of distinct groups is approximately the same in each fold
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([10, 20, 30, 40])
groups = np.array([0, 0, 2, 2])
group_kfold = GroupKFold(n_splits=2)
group_kfold.get_n_splits(X, y, groups)

for train_index, test_index in group_kfold.split(X, y, groups):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X_train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')

TRAIN INDEX: [0 1]
X_train [[1 2]
 [3 4]]
y train [10 20]
TEST INDEX: [2 3]
X test [[5 6]
 [7 8]]
y test [30 40] 

TRAIN INDEX: [2 3]
X_train [[5 6]
 [7 8]]
y train [30 40]
TEST INDEX: [0 1]
X test [[1 2]
 [3 4]]
y test [10 20] 



In [5]:
# GroupShuffleSplit example
# provides randomised train/test indices to split data according to a third-party provided group.  This group information can be used to encode arbitrary domain specific stratifications of the samples as integers
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

X = np.ones(shape=(8, 2))
y = np.ones(shape=(8, 1))
groups = np.array([1, 1, 2, 2, 2, 3, 3, 3])
print(groups.shape)

gss = GroupShuffleSplit(n_splits=2, train_size=.7, random_state=42)
gss.get_n_splits()

for train_idx, test_idx in gss.split(X, y, groups):
    print("TRAIN INDEX:", train_idx)
    print("TEST INDEX:", test_idx)

(8,)
TRAIN INDEX: [2 3 4 5 6 7]
TEST INDEX: [0 1]
TRAIN INDEX: [0 1 5 6 7]
TEST INDEX: [2 3 4]


In [6]:
# LeaveOneGroupOut example
# provides test/train indices to split data according to a third-party provided group.  This group information can be used to encode arbitrary domain specific stratifications of samples as integers
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([10, 20, 11, 22])
groups = np.array([1, 1, 2, 2])
logo = LeaveOneGroupOut() 
# logo = LeavePGroupsOut(n_groups = 2) # number of groups to leave out in the test split
logo.get_n_splits(X, y, groups)

for train_index, test_index in logo.split(X, y, groups):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')

# the difference between LeaveOneGroupOut and LeavePGroupsOut is that:
# - the former builds each test set from all the samples assigned to a single group
# - the latter builds each test set from all the samples assigned to p different groups

TRAIN INDEX: [2 3]
X train [[5 6]
 [7 8]]
y train [11 22]
TEST INDEX: [0 1]
X test [[1 2]
 [3 4]]
y test [10 20] 

TRAIN INDEX: [0 1]
X train [[1 2]
 [3 4]]
y train [10 20]
TEST INDEX: [2 3]
X test [[5 6]
 [7 8]]
y test [11 22] 
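
LeavePGroupsOut is only mentioned in a comment above, so here is a minimal sketch of it on a made-up sample with three groups. With `n_groups=2`, every pair of groups is held out once.

```python
import numpy as np
from sklearn.model_selection import LeavePGroupsOut

X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([10, 20, 30])
groups = np.array([1, 2, 3])

# hold out every combination of 2 groups as the test set
lpgo = LeavePGroupsOut(n_groups=2)
n_splits = lpgo.get_n_splits(X, y, groups)
print('number of splits:', n_splits)

splits = [(train.tolist(), test.tolist()) for train, test in lpgo.split(X, y, groups)]
for train, test in splits:
    print("TRAIN INDEX:", train, "TEST INDEX:", test)
```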



In [7]:
# LeaveOneOut example
# provides train/test indices to split data in train/test sets.  Each sample is used once as a test set (singleton) while the remaining samples form the training set
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([10, 20, 30])
loo = LeaveOneOut()
loo.get_n_splits(X)

for train_index, test_index in loo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')

TRAIN INDEX: [1 2]
X train [[3 4]
 [5 6]]
y train [20 30]
TEST INDEX: [0]
X test [[1 2]]
y test [10] 

TRAIN INDEX: [0 2]
X train [[1 2]
 [5 6]]
y train [10 30]
TEST INDEX: [1]
X test [[3 4]]
y test [20] 

TRAIN INDEX: [0 1]
X train [[1 2]
 [3 4]]
y train [10 20]
TEST INDEX: [2]
X test [[5 6]]
y test [30] 



In [8]:
# LeavePOut example
# provides train/test indices to split data in train/test sets.  This results in testing on all distinct samples of size p, whilst the remaining (n-p) samples form the training set in each iteration
import numpy as np
from sklearn.model_selection import LeavePOut

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 3, 4])
lpo = LeavePOut(2)
lpo.get_n_splits(X)

for train_index, test_index in lpo.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')

# Note that LeavePOut(p) is NOT equivalent to KFold(n_splits = n_samples // p), which creates non-overlapping test sets

TRAIN INDEX: [2 3]
X train [[5 6]
 [7 8]]
y train [3 4]
TEST INDEX: [0 1]
X test [[1 2]
 [3 4]]
y test [1 2] 

TRAIN INDEX: [1 3]
X train [[3 4]
 [7 8]]
y train [2 4]
TEST INDEX: [0 2]
X test [[1 2]
 [5 6]]
y test [1 3] 

TRAIN INDEX: [1 2]
X train [[3 4]
 [5 6]]
y train [2 3]
TEST INDEX: [0 3]
X test [[1 2]
 [7 8]]
y test [1 4] 

TRAIN INDEX: [0 3]
X train [[1 2]
 [7 8]]
y train [1 4]
TEST INDEX: [1 2]
X test [[3 4]
 [5 6]]
y test [2 3] 

TRAIN INDEX: [0 2]
X train [[1 2]
 [5 6]]
y train [1 3]
TEST INDEX: [1 3]
X test [[3 4]
 [7 8]]
y test [2 4] 

TRAIN INDEX: [0 1]
X train [[1 2]
 [3 4]]
y train [1 2]
TEST INDEX: [2 3]
X test [[5 6]
 [7 8]]
y test [3 4] 



In [9]:
# PredefinedSplit example
# provides train/test indices to split data into train/test sets using a predefined scheme specified by the user with the test_fold parameter
import numpy as np
from sklearn.model_selection import PredefinedSplit

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([10, 20, 30, 40])
# the entry test_fold[i] represents the index of the test set that sample i belongs to
# where test_fold[i] = -1, sample i is included in every training set
test_fold = [0, 1, -1, 1]
ps = PredefinedSplit(test_fold)
ps.get_n_splits()

for train_index, test_index in ps.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')


TRAIN INDEX: [1 2 3]
X train [[3 4]
 [5 6]
 [7 8]]
y train [20 30 40]
TEST INDEX: [0]
X test [[1 2]]
y test [10] 

TRAIN INDEX: [0 2]
X train [[1 2]
 [5 6]]
y train [10 30]
TEST INDEX: [1 3]
X test [[3 4]
 [7 8]]
y test [20 40] 



In [10]:
# RepeatedKFold example
# repeats k-fold n times, with different randomisation in each repetition
import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 1, 1])
rkf = RepeatedKFold(n_splits=2, n_repeats=2, random_state=2652124)

for train_index, test_index in rkf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')

TRAIN INDEX: [0 1]
X train [[1 2]
 [3 4]]
y train [0 0]
TEST INDEX: [2 3]
X test [[1 2]
 [3 4]]
y test [1 1] 

TRAIN INDEX: [2 3]
X train [[1 2]
 [3 4]]
y train [1 1]
TEST INDEX: [0 1]
X test [[1 2]
 [3 4]]
y test [0 0] 

TRAIN INDEX: [1 2]
X train [[3 4]
 [1 2]]
y train [0 1]
TEST INDEX: [0 3]
X test [[1 2]
 [3 4]]
y test [0 1] 

TRAIN INDEX: [0 3]
X train [[1 2]
 [3 4]]
y train [0 1]
TEST INDEX: [1 2]
X test [[3 4]
 [1 2]]
y test [0 1] 



In [11]:
# ShuffleSplit example
# Yields indices to split data into training and test sets
# Randomised CV splitters may return different results for each call of split.  You can make the results identical by setting random_state to an integer
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [3, 4], [5, 6]])
y = np.array([1, 2, 1, 2, 1, 2])

rs = ShuffleSplit(n_splits=5, test_size=.25, random_state=0)
rs.get_n_splits(X)
for train_index, test_index in rs.split(X):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)
print('')

rs = ShuffleSplit(n_splits=5, train_size=0.5, test_size=.25, random_state=0)
for train_index, test_index in rs.split(X):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)

# Note that contrary to other cross-validation strategies, random splits do not guarantee that all folds will be different, although this is still very likely for sizeable datasets

TRAIN INDEX: [1 3 0 4] TEST INDEX: [5 2]
TRAIN INDEX: [4 0 2 5] TEST INDEX: [1 3]
TRAIN INDEX: [1 2 4 0] TEST INDEX: [3 5]
TRAIN INDEX: [3 4 1 0] TEST INDEX: [5 2]
TRAIN INDEX: [3 5 1 0] TEST INDEX: [2 4]

TRAIN INDEX: [1 3 0] TEST INDEX: [5 2]
TRAIN INDEX: [4 0 2] TEST INDEX: [1 3]
TRAIN INDEX: [1 2 4] TEST INDEX: [3 5]
TRAIN INDEX: [3 4 1] TEST INDEX: [5 2]
TRAIN INDEX: [3 5 1] TEST INDEX: [2 4]


In [12]:
# StratifiedKFold example
# provides train/test indices to split data in train/test data sets.  This cross-validation is a variation of KFold that returns stratified folds.  The folds are made by preserving the percentage of samples for each class.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 1, 1])
skf = StratifiedKFold(n_splits=2)
skf.get_n_splits(X, y)

for train_index, test_index in skf.split(X, y):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

TRAIN INDEX: [1 3] TEST INDEX: [0 2]
TRAIN INDEX: [0 2] TEST INDEX: [1 3]


In [13]:
# StratifiedShuffleSplit example
# provides train/test indices to split data in train/test sets.  This cross-validation object is a merge of StratifiedKFold and ShuffleSplit, which returns stratified randomised folds.  The folds are made by preserving the percentage of samples for each class.
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 0, 1, 1, 1])
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5, random_state=0)
sss.get_n_splits(X, y)

for train_index, test_index in sss.split(X, y):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

# Note that like the ShuffleSplit strategy, stratified random splits do not guarantee that all folds will be different, although this is still very likely for sizeable datasets.


TRAIN INDEX: [5 2 3] TEST INDEX: [4 1 0]
TRAIN INDEX: [5 1 4] TEST INDEX: [0 2 3]
TRAIN INDEX: [5 0 2] TEST INDEX: [4 3 1]
TRAIN INDEX: [4 1 0] TEST INDEX: [2 3 5]
TRAIN INDEX: [0 5 1] TEST INDEX: [3 4 2]


In [14]:
# RepeatedStratifiedKFold example
# repeats stratified k-fold n times, with different randomisation in each repetition
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([0, 0, 1, 1])
rskf = RepeatedStratifiedKFold(n_splits=2, n_repeats=2, random_state=36851234)

for train_index, test_index in rskf.split(X, y):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    print("TRAIN INDEX:", train_index)
    print('X train', X_train)
    print('y train', y_train)
    print("TEST INDEX:", test_index)
    print('X test', X_test)
    print('y test', y_test, '\n')

TRAIN INDEX: [1 2]
X train [[3 4]
 [1 2]]
y train [0 1]
TEST INDEX: [0 3]
X test [[1 2]
 [3 4]]
y test [0 1] 

TRAIN INDEX: [0 3]
X train [[1 2]
 [3 4]]
y train [0 1]
TEST INDEX: [1 2]
X test [[3 4]
 [1 2]]
y test [0 1] 

TRAIN INDEX: [1 3]
X train [[3 4]
 [3 4]]
y train [0 1]
TEST INDEX: [0 2]
X test [[1 2]
 [1 2]]
y test [0 1] 

TRAIN INDEX: [0 2]
X train [[1 2]
 [1 2]]
y train [0 1]
TEST INDEX: [1 3]
X test [[3 4]
 [3 4]]
y test [0 1] 



In [15]:
# TimeSeriesSplit example
# provides train/test indices to split time series data samples, observed at fixed time intervals, into train/test sets.  In each split, the test indices must be higher than in the previous split, so shuffling in the cross-validator is inappropriate.
# This cross-validation object is a variation of KFold.  In the kth split, it returns first k folds as train set, and the (k+1)th fold as test set
# Note that unlike standard cross-validation methods, successive training sets are supersets of those that come before them
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([1, 2, 3, 4, 5, 6])
tscv = TimeSeriesSplit()

for train_index, test_index in tscv.split(X):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
print('')

# Fix test_size to 2 with 12 samples
X = np.random.randn(12, 2)
y = np.random.randint(0, 2, 12)
tscv = TimeSeriesSplit(n_splits=3, test_size=2)
for train_index, test_index in tscv.split(X):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
print('')   

# Add in a 2 period gap
tscv = TimeSeriesSplit(n_splits=3, test_size=2, gap=2)
for train_index, test_index in tscv.split(X):
    print("TRAIN INDEX:", train_index, "TEST INDEX:", test_index)
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]

TRAIN INDEX: [0] TEST INDEX: [1]
TRAIN INDEX: [0 1] TEST INDEX: [2]
TRAIN INDEX: [0 1 2] TEST INDEX: [3]
TRAIN INDEX: [0 1 2 3] TEST INDEX: [4]
TRAIN INDEX: [0 1 2 3 4] TEST INDEX: [5]

TRAIN INDEX: [0 1 2 3 4 5] TEST INDEX: [6 7]
TRAIN INDEX: [0 1 2 3 4 5 6 7] TEST INDEX: [8 9]
TRAIN INDEX: [0 1 2 3 4 5 6 7 8 9] TEST INDEX: [10 11]

TRAIN INDEX: [0 1 2 3] TEST INDEX: [6 7]
TRAIN INDEX: [0 1 2 3 4 5] TEST INDEX: [8 9]
TRAIN INDEX: [0 1 2 3 4 5 6 7] TEST INDEX: [10 11]
