```python
from numpy import array
from sklearn.model_selection import KFold

# data sample
data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])

# prepare cross validation (n_splits=3, shuffle=True, random_state=1)
kfold = KFold(n_splits=3, shuffle=True, random_state=1)

# enumerate splits
for train, test in kfold.split(data):
    print('train: %s, test: %s' % (data[train], data[test]))
```

```
train: [0.1 0.4 0.5 0.6], test: [0.2 0.3]
train: [0.2 0.3 0.4 0.6], test: [0.1 0.5]
train: [0.1 0.2 0.3 0.5], test: [0.4 0.6]
```


### Variations on Cross-Validation

There are a number of variations on the k-fold cross-validation procedure. Four commonly used variations are as follows:

- **Train/Test Split**:  
  Taken to one extreme, *k* may be set to 2 (not 1, which would leave no held-out data) such that a single train/test split is created to evaluate the model.

- **LOOCV (Leave-One-Out Cross-Validation)**:  
  Taken to another extreme, *k* may be set to the total number of observations in the dataset such that each observation is given a chance to be the held-out test set. This is called **leave-one-out cross-validation**, or **LOOCV** for short.

- **Stratified**:  
  The splitting of data into folds may be governed by criteria such as ensuring that each fold has the same proportion of observations with a given categorical value, such as the class outcome value. This is called **stratified cross-validation**.

- **Repeated**:  
  The k-fold cross-validation procedure is repeated *n* times; importantly, the data sample is shuffled prior to each repetition, so each repetition produces a different split of the sample.
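
The variations above can be sketched with scikit-learn's splitters. This is a minimal illustration rather than a recommended setup: it reuses the six-value sample from the earlier example, the `labels` array is a hypothetical class column invented here so that stratification has something to balance, and the 50/50 split size and `n_repeats=2` are arbitrary choices.

```python
from numpy import array
from sklearn.model_selection import (
    LeaveOneOut,
    RepeatedKFold,
    StratifiedKFold,
    train_test_split,
)

# same data sample as before
data = array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
# hypothetical class labels (not part of the original example)
labels = array([0, 0, 0, 1, 1, 1])

# Train/Test Split: a single shuffled split into two halves
train_half, test_half = train_test_split(data, test_size=0.5, shuffle=True, random_state=1)
print('train/test sizes: %d, %d' % (len(train_half), len(test_half)))

# LOOCV: one split per observation, each holding out a single value
loo_splits = list(LeaveOneOut().split(data))
print('LOOCV splits: %d' % len(loo_splits))

# Stratified: each test fold keeps the same class proportions as the sample
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
strat_splits = list(skf.split(data, labels))
for train, test in strat_splits:
    print('stratified test labels: %s' % sorted(labels[test].tolist()))

# Repeated: 3-fold cross-validation run twice, reshuffled for each repetition
rkf = RepeatedKFold(n_splits=3, n_repeats=2, random_state=1)
rep_splits = list(rkf.split(data))
print('repeated k-fold splits: %d' % len(rep_splits))
```

With three observations per class and three folds, each stratified test fold should hold one observation of each class, and the repeated procedure yields `n_splits * n_repeats` splits in total.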
