# Putting it all together - Simple Kfold method

Last time we worked with KFOLD in order to validate our results: to ensure that our splits were not just lucky, and to check that the model was actually generalizing.

Today we will look at a much simpler setup for doing KFOLD validation.
First, we will use the MNIST digits dataset, since it is a simple image-based dataset that our CPUs can process.
Second, we will use the Sklearn (scikit-learn) library, which contains lots of terrific functions!

In [1]:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

First we import the dataset.

Notice that we have to define the number of classes and the input_shape ourselves, since this information is not directly supplied to us by the mnist.load_data() function.

Furthermore, we have to normalize the data by scaling each pixel to be within 0-1.
Afterwards we use numpy magic to get the correct shape for the tensor.
Don't worry if numpy doesn't make sense the first few times you use it. Sometimes keras needs the data to be in a specific order or shape, and we use numpy magic to make that happen.
Often this simply means adding or removing a dimension. In our example here we add a trailing channel dimension, so each (28, 28) image becomes (28, 28, 1).

Last, we convert the labels (y) to categorical (one-hot) form.
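
To see what these three steps do to the arrays, here is a minimal sketch on a tiny made-up batch instead of MNIST. The random images and the labels are assumptions for illustration only, and np.eye is used as a numpy stand-in for keras.utils.to_categorical so that the snippet runs without keras.

```python
import numpy as np

num_classes = 10

# Hypothetical stand-in for MNIST: 4 random grayscale "images" of 28x28 pixels (0-255)
rng = np.random.RandomState(0)
x = rng.randint(0, 256, size=(4, 28, 28)).astype("uint8")
y = np.array([3, 0, 9, 3])

# Scale pixels to the [0, 1] range
x = x.astype("float32") / 255

# Add a trailing channel dimension: (4, 28, 28) -> (4, 28, 28, 1)
x = np.expand_dims(x, -1)

# One-hot encode the labels: row i has a single 1 in column y[i]
y_onehot = np.eye(num_classes, dtype="float32")[y]

print(x.shape, x.min(), x.max())
print(y_onehot.shape)
print(y_onehot[0])
```

The first label is 3, so the first one-hot row has its 1 in position 3 and zeros everywhere else.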

In [2]:
num_classes = 10
input_shape = (28, 28, 1)

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255

# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")


# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples


Now we simply define a network.
This small network should do okay and will run on a CPU.

In [3]:
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1600)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                1
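
The output shapes and parameter counts in the summary can be checked by hand. The sketch below redoes the arithmetic in plain Python, assuming the keras defaults used above (no padding and stride 1 for Conv2D, pooling stride equal to the pool size). The helper names are made up for this illustration.

```python
def conv_out(size, kernel):
    # An unpadded convolution with stride 1 shrinks each side by kernel - 1
    return size - kernel + 1


def pool_out(size, pool):
    # Non-overlapping max pooling keeps one value per pool x pool window
    return size // pool


def conv_params(kernel, in_channels, filters):
    # Each filter has kernel * kernel * in_channels weights plus one bias
    return (kernel * kernel * in_channels + 1) * filters


side = 28
side = conv_out(side, 3)   # 26
side = pool_out(side, 2)   # 13
side = conv_out(side, 3)   # 11
side = pool_out(side, 2)   # 5
flat = side * side * 64    # 1600 values after Flatten

conv1 = conv_params(3, 1, 32)    # first Conv2D: 1 input channel, 32 filters
conv2 = conv_params(3, 32, 64)   # second Conv2D: 32 input channels, 64 filters
dense = flat * 10 + 10           # Dense: one weight per input per class, plus 10 biases

print(conv1, conv2, dense, conv1 + conv2 + dense)
```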

Now we train!
Notice that the fit function is used with the validation_split parameter.

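In short, this is a keras parameter that holds out the last fraction of the training data (here 10%) for validation, taken before any shuffling. You could also make the split manually, for example with Sklearn's train_test_split function, before providing data to the fit function.
Then we would simply provide the datasets with a validation tuple as in earlier notebooks:
validation_data = (x_val, y_val)

But let's allow keras to do this for us for this example.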
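
For reference, a manual split that mimics what validation_split=0.1 does (take the last 10% of the samples) could look roughly like this. The toy arrays and the names x_tr, y_tr, x_val and y_val are assumptions for illustration, not part of the notebook.

```python
import numpy as np

# Hypothetical toy data: 20 samples with 3 features each
x = np.arange(60, dtype="float32").reshape(20, 3)
y = np.arange(20)

validation_split = 0.1
# Index where the validation part starts: everything after it is held out
split_at = int(len(x) * (1 - validation_split))

x_tr, y_tr = x[:split_at], y[:split_at]
x_val, y_val = x[split_at:], y[split_at:]

print(len(x_tr), "train samples,", len(x_val), "validation samples")
```

These arrays would then be passed as fit(x_tr, y_tr, validation_data=(x_val, y_val)). If the data were sorted by class, it would need to be shuffled first, otherwise the validation part would contain only the last classes.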

In [4]:
batch_size = 128
epochs = 15

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Train on 54000 samples, validate on 6000 samples
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x1c646d2b408>

In [5]:
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.02563383817092981
Test accuracy: 0.9917


As you can see, we have trained the model and gotten a good result!

Now here's the problem: we don't know if our model is actually generalizing or if it was just a lucky split!

KFOLD to the rescue again!

We can use the KFold class directly from Sklearn.
The number of splits we will use is 5 (a 5-fold), and we want to shuffle the data before the split so that it is not in any kind of specific order.

Lastly, we set random_state to, in our case, 3.
This number is a seed and will ensure that the splits are reproducible.
The integer can be any value from 0 to 2**32 - 1.
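
To make the effect of these settings concrete, here is a small sketch on ten made-up samples instead of MNIST. It assumes nothing beyond the KFold arguments described above; the toy array is hypothetical.

```python
import numpy as np
from sklearn.model_selection import KFold

x = np.arange(10)  # hypothetical toy data: 10 samples

kfold = KFold(n_splits=5, shuffle=True, random_state=3)
splits = list(kfold.split(x))

# Each fold gets 8 training indices and 2 test indices
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"Fold {fold_no}: train={train} test={test}")

# A second KFold with the same seed produces exactly the same folds
same_again = list(KFold(n_splits=5, shuffle=True, random_state=3).split(x))
```

Every sample lands in the test part of exactly one fold, and rerunning with the same random_state gives the same assignment.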

In [6]:
from sklearn.model_selection import KFold

# used to hold results from folds during training.
acc_fold = []
loss_fold = []

# setup folds
kfold = KFold(n_splits=5, random_state=3, shuffle=True)

# initialize fold tracking
fold_no = 1

Now we can simply iterate over the folds. Each iteration gives us the indices of the train and test set of that specific fold.

Remember here that we have packed a lot of code into the for loop, since we need to build and train a new model for each fold!

Towards the end of the loop we evaluate the model, report the scores, and append them to our previously defined acc and loss lists so that we can summarize them at the end of the notebook.
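
The shape of such a loop can be sketched in isolation. To keep the snippet fast and free of keras, a toy nearest-centroid classifier and synthetic data stand in for the CNN and MNIST. CentroidClassifier and build_model are hypothetical names for this illustration. The point is only the pattern: call a factory inside the loop so that every fold starts from an untrained model.

```python
import numpy as np
from sklearn.model_selection import KFold


class CentroidClassifier:
    """Toy stand-in for the CNN: predicts the class whose mean is nearest."""

    def fit(self, x, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([x[y == c].mean(axis=0) for c in self.classes_])
        return self

    def score(self, x, y):
        # Distance from every sample to every centroid: shape (n_samples, n_classes)
        dists = np.linalg.norm(x[:, None, :] - self.centroids_[None, :, :], axis=2)
        pred = self.classes_[dists.argmin(axis=1)]
        return float((pred == y).mean())


def build_model():
    # Returns a brand-new, untrained model on every call
    return CentroidClassifier()


# Synthetic data: two well-separated clusters of 50 points each
rng = np.random.RandomState(0)
x = np.concatenate([rng.normal(0.0, 1.0, (50, 2)), rng.normal(6.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

kfold = KFold(n_splits=5, shuffle=True, random_state=3)
acc_fold = []
for fold_no, (train, test) in enumerate(kfold.split(x), start=1):
    model = build_model()  # nothing is carried over from earlier folds
    model.fit(x[train], y[train])
    acc_fold.append(model.score(x[test], y[test]))
    print(f"Fold {fold_no}: accuracy {acc_fold[-1]:.3f}")
```

Reusing one model object across folds would let weights learned on earlier folds leak into later ones, and some of that training data is exactly what the later folds test on. Using enumerate also removes the need for a manual fold counter.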

In [7]:
for train, test in kfold.split(x_train,y_train):
    
    batch_size = 128
    epochs = 5
    
    FoldModel = keras.Sequential(
        [
            keras.Input(shape=input_shape),
            layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
            layers.MaxPooling2D(pool_size=(2, 2)),
            layers.Flatten(),
            layers.Dropout(0.5),
            layers.Dense(num_classes, activation="softmax"),
        ]
    )
    
    #FoldModel.summary()
    FoldModel.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    History = FoldModel.fit(x_train[train], y_train[train], batch_size=batch_size, epochs=epochs, validation_split=0.1, verbose=3)
    
    scores = FoldModel.evaluate(x_train[test], y_train[test])
    print(f'Score for fold {fold_no}: {FoldModel.metrics_names[0]} of {scores[0]}; {FoldModel.metrics_names[1]} of {scores[1]}')
    acc_fold.append(scores[1]*100)
    loss_fold.append(scores[0])
    
    fold_no = fold_no + 1
    print("\n")
    
print("------------------------------------------------------------------------")
print("Done training")

Train on 43200 samples, validate on 4800 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Score for fold 1: loss of 0.05121044223786642; accuracy of 0.9848333597183228


Train on 43200 samples, validate on 4800 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Score for fold 2: loss of 0.06511581431026571; accuracy of 0.981083333492279


Train on 43200 samples, validate on 4800 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Score for fold 3: loss of 0.04494275243704517; accuracy of 0.9861666560173035


Train on 43200 samples, validate on 4800 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Score for fold 4: loss of 0.04993980676587671; accuracy of 0.9859166741371155


Train on 43200 samples, validate on 4800 samples
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Score for fold 5: loss of 0.05040370714183276; accuracy of 0.984749972820282


------------------------------------------------------------------------
Done training


In [8]:
# == Provide average scores ==
print('------------------------------------------------------------------------')
print('Score per fold')
for i in range(0, len(acc_fold)):
  print('------------------------------------------------------------------------')
  print(f'> Fold {i+1} - Loss: {loss_fold[i]} - Accuracy: {acc_fold[i]}%')
print('------------------------------------------------------------------------')
print('Average scores for all folds:')
print(f'> Accuracy: {np.mean(acc_fold)} (+- {np.std(acc_fold)})')
print(f'> Loss: {np.mean(loss_fold)}')
print('------------------------------------------------------------------------')

------------------------------------------------------------------------
Score per fold
------------------------------------------------------------------------
> Fold 1 - Loss: 0.05121044223786642 - Accuracy: 98.48333597183228%
------------------------------------------------------------------------
> Fold 2 - Loss: 0.06511581431026571 - Accuracy: 98.1083333492279%
------------------------------------------------------------------------
> Fold 3 - Loss: 0.04494275243704517 - Accuracy: 98.61666560173035%
------------------------------------------------------------------------
> Fold 4 - Loss: 0.04993980676587671 - Accuracy: 98.59166741371155%
------------------------------------------------------------------------
> Fold 5 - Loss: 0.05040370714183276 - Accuracy: 98.4749972820282%
------------------------------------------------------------------------
Average scores for all folds:
> Accuracy: 98.45499992370605 (+- 0.18231529507923663)
> Loss: 0.052322504578577365
------------------------------------------------------------------------

Finally, we have arrived at the end.

The summary shows both the results for each fold and the average performance across all folds.

In our case here, the accuracies of the five folds lie close together (about 98.1% to 98.6%, with a standard deviation of roughly 0.18 percentage points), so the model is most likely generalizing and the earlier result was not just a lucky split!

Lastly, it should be mentioned that KFOLD is normally only used for validating results, not for developing the model architecture itself.