## Validating your model

### Simple holdout validation
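This is the simplest evaluation protocol: you set apart some fraction of your data as a validation set, train on the remaining data, and evaluate on the held-out fraction. It suffers from one flaw: if little data is available, your validation and test sets may contain too few samples to be statistically representative of the data at hand.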
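To get a rough feel for what "too few samples" means, the sketch below computes the binomial standard error of an accuracy estimate for several validation-set sizes. It assumes the validation samples are independent draws and uses a normal approximation for the interval. The helper name `accuracy_standard_error` and the accuracy value 0.88 are illustrative assumptions, not results from this notebook.

```python
import math


def accuracy_standard_error(accuracy, num_samples):
    """Binomial standard error of an accuracy measured on num_samples examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / num_samples)


# Illustrative numbers only: 0.88 is an assumed accuracy, not a measured one.
for n in (100, 1000, 10000):
    se = accuracy_standard_error(0.88, n)
    # 1.96 standard errors is the usual normal-approximation 95% half-width.
    print(f"n={n:>5}: accuracy 0.88 +/- {1.96 * se:.3f}")
```

The half-width shrinks with the square root of the validation-set size, so a tenfold smaller validation set gives an estimate roughly three times noisier.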

In [3]:
from tensorflow import keras
from tensorflow.keras.datasets import imdb
from tensorflow.keras import layers
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(
    num_words=10000)

model = keras.Sequential([
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])

model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])

import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # Multi-hot encode each sequence of word indices into a vector of size `dimension`.
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        for j in sequence:
            results[i, j] = 1.
    return results

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
y_train = np.asarray(train_labels).astype("float32")
y_test = np.asarray(test_labels).astype("float32")

In [4]:
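# Hold out the first 10,000 training samples as the validation set.
x_val = x_train[:10000]
partial_x_train = x_train[10000:]
y_val = y_train[:10000]
partial_y_train = y_train[10000:]

# Train on the remaining samples and monitor metrics on the held-out set.
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=20,
                    batch_size=512,
                    validation_data=(x_val, y_val))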



### K-fold validation
With this approach, you split your data into K partitions of equal size. For each partition i, you train a model on the remaining K - 1 partitions and evaluate it on partition i. Your final score is then the average of the K scores obtained.
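Partitions of exactly equal size are only possible when the number of samples is divisible by K. The sketch below shows one way to compute the fold boundaries so that every sample lands in exactly one validation partition, by giving the leftover samples to the first folds. The helper name `k_fold_slices` is hypothetical, and this is just one reasonable convention. A plain integer division such as `len(data) // k` instead leaves the remainder samples in the training set of every fold.

```python
def k_fold_slices(num_samples, k):
    """Yield (fold, start, stop) so the K validation slices cover every sample."""
    base, extra = divmod(num_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds each receive one additional sample.
        stop = start + base + (1 if fold < extra else 0)
        yield fold, start, stop
        start = stop


for fold, start, stop in k_fold_slices(25000, 3):
    print(f"fold {fold}: validate on rows [{start}, {stop}), train on the rest")
```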

In [6]:
def get_model():
    model = keras.Sequential([
        layers.Dense(16, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid")
    ])

    model.compile(optimizer="rmsprop",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

In [20]:
import numpy as np

# Append the labels as a last column so that features and labels
# stay aligned when the rows are shuffled later.
y_train_two = np.reshape(y_train, (y_train.shape[0], 1))
print(f"{x_train.shape}, {y_train_two.shape}")
data = np.concatenate((x_train, y_train_two), axis=1)
print(data)

(25000, 10000), (25000, 1)
[[0. 1. 1. ... 0. 0. 1.]
 [0. 1. 1. ... 0. 0. 0.]
 [0. 1. 1. ... 0. 0. 0.]
 ...
 [0. 1. 1. ... 0. 0. 0.]
 [0. 1. 1. ... 0. 0. 1.]
 [0. 1. 1. ... 0. 0. 0.]]


In [28]:

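k = 3
num_validation_samples = len(data) // k
# Shuffle rows in place; each label travels with its features in the same row.
np.random.shuffle(data)
all_acc_histories = []  # per fold: validation accuracy at every epoch
validation_scores = []  # per fold: validation accuracy after the final epoch
for fold in range(k):
    # Select the validation partition for this fold.
    validation_data = data[num_validation_samples * fold:
                           num_validation_samples * (fold + 1)]
    # Use every remaining row as training data.
    training_data = np.concatenate((
        data[:num_validation_samples * fold],
        data[num_validation_samples * (fold + 1):]))
    # Build a brand-new, untrained model for each fold.
    model = get_model()
    history = model.fit(training_data[:, :-1],
                        training_data[:, -1],
                        epochs=20,
                        batch_size=512,
                        validation_data=(validation_data[:, :-1],
                                         validation_data[:, -1]))
    acc_history = history.history["val_accuracy"]
    all_acc_histories.append(acc_history)
    validation_scores.append(acc_history[-1])
    print(acc_history)

print(all_acc_histories)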

0
8333
[0.8624745011329651, 0.8797551989555359, 0.8907956480979919, 0.8936757445335388, 0.8947557806968689, 0.8895955681800842, 0.8909156322479248, 0.8793951869010925, 0.874114990234375, 0.8834753632545471, 0.878075122833252, 0.8820352554321289, 0.8714748620986938, 0.8784351348876953, 0.8751950263977051, 0.8750749826431274, 0.8679947257041931, 0.8727949261665344, 0.8735149502754211, 0.8706348538398743]
8333
16666
[0.8461538553237915, 0.8756750226020813, 0.8888755440711975, 0.868594765663147, 0.8892355561256409, 0.886355459690094, 0.

[0.8726748824119568, 0.8833553194999695, 0.8765150308609009, 0.8889955878257751, 0.8757950067520142, 0.8841953873634338, 0.8759150505065918, 0.8786751627922058, 0.8786751627922058, 0.8802351951599121, 0.8803552389144897, 0.8773550987243652, 0.8720749020576477, 0.8723148703575134, 0.8689547777175903, 0.874114990234375, 0.8706348538398743, 0.8715948462486267, 0.8631945252418518, 0.8700348138809204]
[[0.8624745011329651, 0.8797551989555359, 0.8907956480979919, 0.8936757445335388, 0.8947557806968689, 0.8895955681800842, 0.8909156322479248, 0.8793951869010925, 0.874114990234375, 0.8834753632545471, 0.878075122833252, 0.8820352554321289, 0.8714748620986938, 0.8784351348876953, 0.8751950263977051, 0.8750749826431274, 0.8679947257041931, 0.8727949261665344, 0.8735149502754211, 0.8706348538398743], [0.8461538553237915, 0.8756750226020813, 0.8888755440711975, 0.868594765663147, 0.8892355561256409, 0.8863554596900

In [29]:
# Single number: validation accuracy averaged over all folds and all epochs.
print(np.mean(all_acc_histories))

0.8762170523405075
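Taking the mean of the whole nested list collapses folds and epochs into one number. A complementary view, sketched here, averages across folds separately for each epoch, which shows at which epoch the mean validation accuracy peaks. The helper name `mean_per_epoch` and the toy histories are made up for illustration; they are not the values printed above.

```python
from statistics import mean


def mean_per_epoch(histories):
    """Average equal-length per-fold histories epoch by epoch."""
    return [mean(epoch_values) for epoch_values in zip(*histories)]


# Toy data: 2 folds x 3 epochs, invented purely for illustration.
toy_histories = [[0.80, 0.90, 0.85],
                 [0.82, 0.88, 0.86]]
per_epoch = mean_per_epoch(toy_histories)
# Epochs are reported 1-based, matching the Keras training log.
best_epoch = max(range(len(per_epoch)), key=per_epoch.__getitem__) + 1
print(per_epoch, best_epoch)
```

With the notebook's variables, the same helper could be called as `mean_per_epoch(all_acc_histories)`, assuming every fold was trained for the same number of epochs.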


### Iterated K-fold validation with shuffling
This one is for situations in which you have relatively little data available and you need to evaluate your model as precisely as possible. I’ve found it to be extremely helpful in Kaggle competitions. It consists of applying K-fold validation multiple times, shuffling the data every time before splitting it K ways. The final score is the average of the scores obtained at each run of K-fold validation. Note that you end up training and evaluating P * K models (where P is the number of iterations you use), which can be very expensive.
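A minimal sketch of the procedure follows. To stay independent of any particular model, it takes a caller-supplied function `train_and_score(train_idx, val_idx)` that is assumed to train a fresh model on the training indices and return a validation score. That function, the name `iterated_k_fold`, and the `seed` parameter are hypothetical and introduced only for this example.

```python
import random
from statistics import mean


def iterated_k_fold(num_samples, k, num_iterations, train_and_score, seed=0):
    """Run K-fold validation num_iterations times, reshuffling before each run."""
    rng = random.Random(seed)
    indices = list(range(num_samples))
    fold_size = num_samples // k
    run_scores = []
    for _ in range(num_iterations):
        rng.shuffle(indices)  # new random order before every K-way split
        fold_scores = []
        for fold in range(k):
            val_idx = indices[fold_size * fold: fold_size * (fold + 1)]
            train_idx = (indices[:fold_size * fold]
                         + indices[fold_size * (fold + 1):])
            fold_scores.append(train_and_score(train_idx, val_idx))
        run_scores.append(mean(fold_scores))  # score of one K-fold run
    return mean(run_scores)  # final score: average over the P runs


# Stand-in scorer that only records split sizes; a real one would train a model.
calls = []

def dummy_train_and_score(train_idx, val_idx):
    calls.append((len(train_idx), len(val_idx)))
    return 0.5

score = iterated_k_fold(num_samples=12, k=3, num_iterations=2,
                        train_and_score=dummy_train_and_score)
print(score, len(calls))  # len(calls) equals P * K models
```

With the model from this notebook, `train_and_score` could call `get_model()`, fit on `x_train[train_idx]` and `y_train[train_idx]`, and return the last entry of `history.history["val_accuracy"]`. That wiring is a suggestion and has not been run here.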