<a href="https://colab.research.google.com/github/dauvannam321/K_Fold/blob/main/K_Fold.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

https://miai.vn/2021/01/18/k-fold-cross-validation-tuyet-chieu-train-khi-it-du-lieu/

In [1]:
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from sklearn.model_selection import KFold
import numpy as np

In [4]:
# Model configuration
batch_size = 50
no_classes = 10  # CIFAR-10 has 10 classes
no_epochs = 25
num_folds = 10

# Define per-fold score containers
accuracy_list = []
loss_list = []

In [2]:
def load_data():

  # Load the CIFAR-10 dataset bundled with Keras
  (X_train, y_train), (X_test, y_test) = cifar10.load_data()

  # Normalize pixel values to the [0, 1] range
  X_train = X_train.astype('float32')
  X_test = X_test.astype('float32')
  X_test = X_test / 255
  X_train = X_train / 255

  # CIFAR-10 ships pre-split into train and test sets, so merge them back before the K-Fold split
  X = np.concatenate((X_train, X_test), axis=0)
  y = np.concatenate((y_train, y_test), axis=0)

  return X, y

In [3]:
def get_model():

  model = Sequential()
  model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 3)))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))
  model.add(Flatten())
  model.add(Dense(256, activation='relu'))
  model.add(Dense(128, activation='relu'))
  model.add(Dense(no_classes, activation='softmax'))

  # Compile the model
  model.compile(loss="sparse_categorical_crossentropy",
                optimizer="Adam",
                metrics=['accuracy'])

  return model
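
As a sanity check on the architecture above, the following sketch works out the flattened feature size and the trainable parameter count by hand. It assumes the Keras defaults used in `get_model` (3x3 convolutions with `'valid'` padding and stride 1, 2x2 non-overlapping max pooling). The helper names `conv_out`, `pool_out` and `count_params` are hypothetical and exist only for this illustration.

```python
def conv_out(size, kernel):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output width of non-overlapping max pooling (floor division)."""
    return size // pool


def count_params(num_classes, side=32, channels=3):
    """Return (flattened feature size, total trainable parameters)."""
    total = 0
    # Conv2D(64, 3x3): one 3x3 kernel per input channel plus a bias, per filter
    total += (3 * 3 * channels + 1) * 64
    side = pool_out(conv_out(side, 3), 2)  # 32 -> 30 -> 15
    # Conv2D(128, 3x3) on the 64-channel feature map
    total += (3 * 3 * 64 + 1) * 128
    side = pool_out(conv_out(side, 3), 2)  # 15 -> 13 -> 6
    flat = side * side * 128               # size of the Flatten() output
    # Dense layers: (inputs + 1 bias) * units
    for units_in, units_out in [(flat, 256), (256, 128), (128, num_classes)]:
        total += (units_in + 1) * units_out
    return flat, total


flat, total = count_params(num_classes=10)
print(f"Flattened features: {flat}, trainable parameters: {total}")
```

Most of the parameters sit in the first `Dense` layer, which is one reason a small dataset can overfit this model and why a per-fold evaluation is informative.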

In [6]:
X, y = load_data()

# Define the K-Fold cross-validation splitter
kfold = KFold(n_splits=num_folds, shuffle=True)

# K-fold Cross Validation model evaluation
fold_idx = 1

for train_ids, val_ids in kfold.split(X, y):

  # Build a fresh model so no weights carry over between folds
  model = get_model()

  print("Starting training for fold", fold_idx)

  # Train model
  model.fit(X[train_ids], y[train_ids],
            batch_size=batch_size,
            epochs=no_epochs,
            verbose=1)

  # Evaluate on the held-out fold
  scores = model.evaluate(X[val_ids], y[val_ids], verbose=0)
  print("Finished training fold", fold_idx)

  # Append this fold's accuracy and loss to the lists
  accuracy_list.append(scores[1] * 100)
  loss_list.append(scores[0])

  # Move on to the next fold
  fold_idx = fold_idx + 1

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
Starting training for fold 1
Epoch 1/25
...
Epoch 25/25
Finished training fold 1
Starting training for fold 2
Epoch 1/25
...
Epoch 25/25
Finished training fold 2
Starting training for fold 3
Epoch 1/25
...
Epoch 20/25
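
To make the index handling in the loop above concrete, here is a minimal pure-Python sketch of how a shuffled K-fold split partitions sample indices. `k_fold_indices` is a hypothetical helper written for illustration, not scikit-learn's implementation. `KFold.split` may assign individual samples to different folds, but it offers the same guarantee: every sample lands in the validation set of exactly one fold.

```python
import random


def k_fold_indices(n_samples, n_splits, seed=0):
    """Yield (train_ids, val_ids) pairs that partition range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, before splitting
    # The first n_samples % n_splits folds receive one extra sample
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        val_ids = indices[start:start + size]
        train_ids = indices[:start] + indices[start + size:]
        yield train_ids, val_ids
        start += size


folds = list(k_fold_indices(n_samples=10, n_splits=5))
for fold_idx, (train_ids, val_ids) in enumerate(folds, start=1):
    print(f"Fold {fold_idx}: train={len(train_ids)} val={len(val_ids)}")
```

With the 60,000 merged CIFAR-10 images and 10 folds, the same scheme gives 54,000 training and 6,000 validation images per fold.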

In [7]:
# Print the overall results
print('* Per-fold details')
for i in range(len(accuracy_list)):
  print(f'> Fold {i+1} - Loss: {loss_list[i]} - Accuracy: {accuracy_list[i]}%')

print('* Overall evaluation across folds:')
print(f'> Accuracy: {np.mean(accuracy_list)} (std +- {np.std(accuracy_list)})')
print(f'> Loss: {np.mean(loss_list)}')

* Per-fold details
> Fold 1 - Loss: 2.105957269668579 - Accuracy: 71.48333191871643%
> Fold 2 - Loss: 2.1373438835144043 - Accuracy: 71.11666798591614%
> Fold 3 - Loss: 2.131883382797241 - Accuracy: 70.48333287239075%
> Fold 4 - Loss: 2.3989880084991455 - Accuracy: 69.23333406448364%
> Fold 5 - Loss: 2.2207534313201904 - Accuracy: 69.9500024318695%
> Fold 6 - Loss: 2.037099838256836 - Accuracy: 71.83333039283752%
> Fold 7 - Loss: 2.094999313354492 - Accuracy: 71.74999713897705%
> Fold 8 - Loss: 2.2892210483551025 - Accuracy: 69.84999775886536%
> Fold 9 - Loss: 1.9811723232269287 - Accuracy: 70.74999809265137%
> Fold 10 - Loss: 2.315394639968872 - Accuracy: 69.0666675567627%
* Overall evaluation across folds:
> Accuracy: 70.55166602134705 (std +- 0.9543296933687476)
> Loss: 2.1712813138961793
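
The spread reported above comes from `np.std`, which by default computes the population standard deviation (`ddof=0`). The sketch below recomputes the summary from the ten per-fold accuracies printed above using only the standard library, and also shows the sample standard deviation (`ddof=1`) for comparison. The variable names are illustrative, and which of the two to report is a judgment call rather than something the notebook prescribes.

```python
import statistics

# Per-fold accuracies (%) copied from the output above
fold_accuracies = [
    71.48333191871643, 71.11666798591614, 70.48333287239075,
    69.23333406448364, 69.9500024318695, 71.83333039283752,
    71.74999713897705, 69.84999775886536, 70.74999809265137,
    69.0666675567627,
]

mean_acc = statistics.fmean(fold_accuracies)
pop_std = statistics.pstdev(fold_accuracies)    # same convention as np.std(..., ddof=0)
sample_std = statistics.stdev(fold_accuracies)  # same convention as np.std(..., ddof=1)

print(f"mean accuracy: {mean_acc:.4f}%")
print(f"population std: {pop_std:.4f}")
print(f"sample std: {sample_std:.4f}")
```

With only ten folds the sample standard deviation is noticeably larger than the population one, so it helps to state which convention a reported spread uses.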
