# Week 5: Using K-Fold Cross-validation on the MNIST dataset
Hands-on Machine Learning<br>
20 December, 2018<br><br>


## Step 1.  Load the MNIST dataset from Keras and prepare it for modelling

Keras provides the MNIST dataset through `keras.datasets.mnist`.  It contains 70,000 labeled 28x28 grayscale images of handwritten digits, split into 60,000 training images and 10,000 test images.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from keras import models, layers
from keras.datasets import mnist
from keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print(train_images.shape, train_labels.shape, test_images.shape, test_labels.shape)

In [None]:
# Flatten each 28x28 image into a 784-element vector, giving an (n, 784) array.
# Then rescale the pixel values from [0, 255] to the [0, 1] range.
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32')/255

test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32')/255

# Convert the label data to categorical (one-hot encoding)
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
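
To make the one-hot encoding concrete, here is a minimal NumPy sketch of the kind of array `to_categorical` produces from integer labels. The `one_hot` helper and the `sample_labels` array are made up for illustration; they are not part of the notebook or of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    # Row i of the identity matrix is the one-hot vector for class i,
    # so indexing with the labels picks one row per sample.
    return np.eye(num_classes, dtype='float32')[labels]

sample_labels = np.array([5, 0, 4])   # hypothetical digit labels
encoded = one_hot(sample_labels, num_classes=10)

print(encoded.shape)   # one row per label, one column per class
print(encoded[0])      # a single 1.0 at index 5, zeros elsewhere
```

Each row has exactly one non-zero entry, which is the form the `categorical_crossentropy` loss expects for its targets.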

## Step 2.  Create a function to build network models 
<br>
With K-fold cross-validation, we train and validate each model configuration on several different train/validation splits.  Every fold must start from a freshly initialized network, so we need a 'constructor' that builds and compiles identically configured models.  Let's create a Python function to do this.<br><br>
Whenever we want to adjust the hyper-parameters, we make the changes in this function.

In [None]:
def build_model():
    model = models.Sequential()
    model.add(layers.Dense(512,
                        activation='relu',
                        input_shape=(28 * 28,)))
    
    model.add(layers.Dense(10,
                        activation='softmax'))
    
    model.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])
    
    return model

## Step 3. Train and validate, using K-fold cross-validation
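
K-fold cross-validation splits the training set into k equally sized folds. Each fold is held out for validation once while a fresh model trains on the remaining k-1 folds, and the k validation scores are then averaged. The sketch below shows only the slicing arithmetic, on a hypothetical 12-sample dataset instead of MNIST; `fold_slices` and its arguments are illustrative names, not part of the notebook.

```python
def fold_slices(n_samples, k):
    """Return a (validation_indices, training_indices) pair for each of k folds."""
    fold_size = n_samples // k   # any remainder samples are never used for validation
    splits = []
    for fold in range(k):
        start, stop = fold_size * fold, fold_size * (fold + 1)
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        splits.append((val_idx, train_idx))
    return splits

for val_idx, train_idx in fold_slices(n_samples=12, k=4):
    print('validate on', val_idx, '| train on', len(train_idx), 'samples')
```

With 60,000 training images and k = 4, the same arithmetic gives 15,000 validation images and 45,000 training images per fold.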


In [None]:
# Split the training set into k folds of equal size
k = 4
foldSize = len(train_labels) // k

validationScores=[] # create an empty list to hold the validation results at each fold.

for fold in range(k):
    # fold images/labels will be held out for VALIDATION
    val_images = train_images[foldSize * fold: foldSize * (fold+1)]
    val_labels = train_labels[foldSize * fold: foldSize * (fold+1)]
    
    # remaining images/labels will be used for TRAINING
    partial_train_images = np.concatenate((train_images[:foldSize * fold],
                                           train_images[foldSize * (fold+1):]))
    
    partial_train_labels = np.concatenate((train_labels[:foldSize * fold],
                                           train_labels[foldSize * (fold+1):]))

    # build, train and validate the model
    model = build_model()
    model.fit(partial_train_images, partial_train_labels,
              epochs=2, 
              batch_size=128, 
              verbose=0)
    
    validationScore = model.evaluate(val_images, val_labels)
    print(validationScore)
    validationScores.append(validationScore)

# Average the [loss, accuracy] pairs across all folds (column-wise mean)
validationScores = np.mean(validationScores, axis=0)
print()
print('Results: Loss:', validationScores[0], 'Accuracy:', validationScores[1])
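
Because the model is compiled with `metrics=['accuracy']`, `model.evaluate` returns a `[loss, accuracy]` pair, so the list of scores has one such row per fold and `axis=0` averages each column separately. A small sketch with toy numbers (placeholders chosen for easy arithmetic, not results from this notebook) shows the effect, along with the per-column standard deviation as an optional measure of how much the folds disagree:

```python
import numpy as np

# Toy per-fold scores, one [loss, accuracy] row per fold (placeholder values).
toy_scores = [[0.5, 0.75],
              [0.25, 1.0]]

mean_loss, mean_acc = np.mean(toy_scores, axis=0)   # column-wise average
std_loss, std_acc = np.std(toy_scores, axis=0)      # column-wise spread

print('mean loss:', mean_loss, 'mean accuracy:', mean_acc)
print('std loss:', std_loss, 'std accuracy:', std_acc)
```

A large spread between folds suggests the averaged score depends heavily on which samples were held out.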

## Step 4. Test the network, using previously unseen data
<br>
Once we are satisfied with our model design, it is time to test it on unseen data.<br>
Before we can do this, however, we need to build and train a new model.<br>
The current model holds the weights from the final fold only, so it never trained on the images held out for that fold.  We want to train this network design on the ENTIRE training set, so that the final model learns from every labeled training example.  The test set stays untouched until the evaluation in 4b.
<br>

### 4a.  Build and train a new copy of the network using the ENTIRE training set.

In [None]:
testModel = build_model()
testModel.fit(train_images, train_labels,
              epochs=2, # IMPORTANT: Use the value from your final model above!
              batch_size=128,  # IMPORTANT: Use the value from your final model above!
              verbose=0)



### 4b.  Evaluate the model using the test set.

In [None]:

test_loss, test_acc = testModel.evaluate(test_images, test_labels)
print('test_loss', test_loss)
print('test_acc', test_acc)