## K-Fold Cross Validation 
- K-Fold cross validation is a technique for evaluating predictive models. The dataset is divided into k subsets, or folds.
- The model is trained and evaluated k times, using a different fold as the validation set each time and the remaining k-1 folds for training.
- Performance metrics from each fold are averaged to estimate the model's generalization performance.
- This method aids in model assessment, selection, and hyperparameter tuning, providing a more reliable measure of a model's effectiveness than a single train/test split.
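
The procedure described in the bullets above can be sketched by hand before turning to scikit-learn's helpers. This is a minimal illustration, not a definitive implementation: `manual_kfold_scores` is a hypothetical helper name, and the random forest with `random_state=0` is an assumption made only so the example is reproducible.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold


def manual_kfold_scores(model, X, y, n_splits=3):
    """Fit a fresh copy of `model` on each training split and score it on the held-out fold."""
    kf = KFold(n_splits=n_splits)
    scores = []
    for train_idx, val_idx in kf.split(X):
        fold_model = clone(model)  # unfitted copy, so folds do not share state
        fold_model.fit(X[train_idx], y[train_idx])
        scores.append(fold_model.score(X[val_idx], y[val_idx]))
    return np.array(scores)


digits = load_digits()
fold_scores = manual_kfold_scores(
    RandomForestClassifier(n_estimators=20, random_state=0),  # assumed model choice
    digits.data,
    digits.target,
)
print(fold_scores)         # one accuracy per fold
print(fold_scores.mean())  # averaged estimate of generalization accuracy
```

Each sample is used for validation exactly once and for training in the other k-1 rounds, which is why the averaged score tends to be less sensitive to one lucky or unlucky split.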

In [1]:
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC 
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt


In [2]:
digits = load_digits()
dir(digits)

['DESCR', 'data', 'feature_names', 'frame', 'images', 'target', 'target_names']

In [3]:
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, test_size=0.2, random_state=42)

In [4]:
from sklearn.model_selection import KFold
kf = KFold(n_splits=3)
kf

KFold(n_splits=3, random_state=None, shuffle=False)

In [5]:
# Each iteration yields two index arrays: the training rows and the held-out rows
for train_index, test_index in kf.split([1,2,3,4,5,6,7,8,9]):
    print(train_index, test_index)

[3 4 5 6 7 8] [0 1 2]
[0 1 2 6 7 8] [3 4 5]
[0 1 2 3 4 5] [6 7 8]


In [6]:
from sklearn.model_selection import cross_val_score

In [8]:
cross_val_score(LogisticRegression(solver='liblinear', multi_class='ovr'), digits.data, digits.target, cv=3)

array([0.89482471, 0.95325543, 0.90984975])

In [10]:
# gamma='auto' sets gamma to 1 / n_features
cross_val_score(SVC(gamma='auto'), digits.data, digits.target, cv=3)

array([0.38063439, 0.41068447, 0.51252087])

In [11]:
cross_val_score(RandomForestClassifier(n_estimators=20), digits.data, digits.target, cv=3)

array([0.91986644, 0.94657763, 0.90984975])
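
To compare models on a single number, the per-fold scores returned by `cross_val_score` can be averaged, as the introduction describes. The sketch below does this for two of the classifiers used above; the dictionary labels and `random_state=0` are assumptions added for illustration, and other estimators can be added to the dictionary in the same way. Note that for a classifier with an integer `cv`, `cross_val_score` uses stratified folds rather than plain `KFold`.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

digits = load_digits()

# Labels are illustrative; random_state is fixed only for reproducibility
models = {
    "svc_gamma_auto": SVC(gamma='auto'),
    "random_forest_20": RandomForestClassifier(n_estimators=20, random_state=0),
}

# Mean accuracy across the 3 folds for each model
mean_scores = {
    name: cross_val_score(model, digits.data, digits.target, cv=3).mean()
    for name, model in models.items()
}

for name, score in mean_scores.items():
    print(f"{name}: {score:.3f}")
```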