In k-fold cross-validation, the training set is split into k folds (k is an integer) and the procedure runs for k iterations. In each iteration, one fold is held out for performance evaluation and the remaining k-1 folds are used for training, so every fold serves as the test fold exactly once. We then average the performance of the k models over the different, independent test folds to obtain a performance estimate. For example, with k = 10 the training dataset is divided into 10 folds; in each of the 10 iterations, 9 folds are used for training and 1 fold is used as the test dataset for model evaluation.
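
To make the splitting scheme concrete, here is a minimal sketch in plain Python. The helper `k_fold_indices` is a hypothetical name introduced only for illustration (it is not part of scikit-learn), and it assumes the samples are split in order without shuffling or stratification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k iterations.

    Hypothetical helper for illustration only: contiguous folds,
    no shuffling, no stratification.
    """
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]                 # the held-out fold
        train_idx = indices[:start] + indices[start + size:]   # the other k-1 folds
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=20, k=10))
for i, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"iteration {i}: train on {len(train_idx)} samples, test on {test_idx}")
```

Each sample index appears in exactly one test fold, which is why the k test folds can be treated as separate evaluation sets.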

Let's work with the breast cancer data provided by scikit-learn.

# Import the breast cancer dataset from scikit-learn

In [22]:
from sklearn import datasets
breast = datasets.load_breast_cancer()

# Divide the dataset into train and test sets 

In [23]:
from sklearn.model_selection import train_test_split
X = breast.data
y = breast.target

X_train, X_test, y_train, y_test = train_test_split(X, y,
                 test_size=0.2, random_state=123, stratify=y)

# Make a pipeline

The pipeline chains two transformers (StandardScaler and PCA, applied in that order) with an estimator (LogisticRegression).

In [24]:
from sklearn.linear_model import LogisticRegression
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

pipe_lr = make_pipeline(StandardScaler(),
                        PCA(n_components=3),
                        LogisticRegression())

# StratifiedKFold from scikit-learn

In [25]:
import numpy as np
from sklearn.model_selection import StratifiedKFold

# StratifiedKFold is used instead of the standard KFold so that each fold
# preserves the class proportions of the training dataset.
# The number of folds is set via n_splits.
kfold = StratifiedKFold(n_splits=10).split(X_train, y_train)

scores = []  # per-fold accuracies, used later for the mean and standard deviation

for k, (train_folds, test_fold) in enumerate(kfold):
    pipe_lr.fit(X_train[train_folds], y_train[train_folds])
    score = pipe_lr.score(X_train[test_fold], y_train[test_fold])
    scores.append(score)
    
    print(f'fold: {k+1},'
          f'accuracy: {score:.2f}')


fold: 1,accuracy: 0.91
fold: 2,accuracy: 1.00
fold: 3,accuracy: 0.98
fold: 4,accuracy: 0.98
fold: 5,accuracy: 0.96
fold: 6,accuracy: 0.96
fold: 7,accuracy: 0.93
fold: 8,accuracy: 0.91
fold: 9,accuracy: 0.96
fold: 10,accuracy: 0.89
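
The reason for preferring StratifiedKFold over KFold is easiest to see on a small, imbalanced label array. The sketch below uses made-up toy labels (80 samples of class 0 followed by 20 of class 1, an assumption chosen purely for illustration) and counts the classes in each test fold.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Toy data (illustrative assumption): labels sorted by class, 80% class 0 and 20% class 1.
y_toy = np.array([0] * 80 + [1] * 20)
X_toy = np.zeros((len(y_toy), 1))  # the features are irrelevant for the split itself

# Class counts [n_class_0, n_class_1] in every test fold, for both splitters.
plain_counts = [np.bincount(y_toy[test_idx], minlength=2)
                for _, test_idx in KFold(n_splits=5).split(X_toy)]
strat_counts = [np.bincount(y_toy[test_idx], minlength=2)
                for _, test_idx in StratifiedKFold(n_splits=5).split(X_toy, y_toy)]

for i, (plain, strat) in enumerate(zip(plain_counts, strat_counts), start=1):
    print(f"fold {i}: KFold {plain.tolist()}, StratifiedKFold {strat.tolist()}")
```

Without shuffling, the plain KFold test folds on this sorted array each contain only one class, while the stratified folds keep the 80/20 ratio.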


# Calculate the mean accuracy and its standard deviation

In [26]:
mean_acc = np.mean(scores)
std_acc = np.std(scores)

print(f" Accuracy: {mean_acc:.2f}, +/- {std_acc:.2f}")

 Accuracy: 0.95, +/- 0.03
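
As a more compact alternative to the manual loop, scikit-learn's `cross_val_score` can run the same kind of evaluation in one call. This is a sketch rather than part of the original walkthrough; it rebuilds the data split and pipeline with the same settings as above so that it runs on its own, and `cv_scores` is a name introduced here.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same data split and pipeline settings as in the cells above.
breast = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    breast.data, breast.target,
    test_size=0.2, random_state=123, stratify=breast.target)

pipe_lr = make_pipeline(StandardScaler(),
                        PCA(n_components=3),
                        LogisticRegression())

# cross_val_score clones and refits the whole pipeline on each set of training folds
# and scores it on the corresponding held-out fold.
cv_scores = cross_val_score(pipe_lr, X_train, y_train,
                            cv=StratifiedKFold(n_splits=10),
                            scoring="accuracy")

print(f"per-fold accuracy: {np.round(cv_scores, 2)}")
print(f"Accuracy: {np.mean(cv_scores):.2f}, +/- {np.std(cv_scores):.2f}")
```

Because StratifiedKFold does not shuffle by default, this should evaluate on the same folds as the manual loop.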
