<h4 align="center">Hold-out validation</h4> 
![holdoutvalidation](img/simpleholdoutvalidation.jpeg)

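Hold-out validation is the simplest evaluation protocol: set apart part of your data as a validation set, train on the rest, and evaluate on the held-out part.

Its weakness is that if little data is available, your validation and test sets may contain too few samples to be statistically representative of the data at hand.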
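To get a rough feel for how few samples is "too few", here is a minimal sketch. It assumes the metric is accuracy and that validation samples can be treated as independent draws; neither assumption comes from the text above. Under those assumptions the standard error of a measured accuracy `p` on `n` samples is `sqrt(p * (1 - p) / n)`. The helper name `accuracy_standard_error` is made up for illustration.

```python
import math

def accuracy_standard_error(accuracy, num_samples):
    """Binomial standard error of an accuracy measured on num_samples samples.

    Assumes independent samples; this is an approximation, not a guarantee.
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / num_samples)

# Assumed true accuracy of 0.9; only the validation-set size changes.
for n in (100, 1000, 10000):
    print(n, round(accuracy_standard_error(0.9, n), 4))
```

With 100 validation samples the estimate is uncertain by roughly 3 percentage points, which can easily be larger than the difference between two candidate models. With 10,000 samples it drops to roughly 0.3 points.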

In [None]:
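import numpy as np

# `data` is assumed to be a NumPy array holding all non-test samples
num_validation_samples = 10000

# shuffling the data before splitting is usually appropriate
np.random.shuffle(data)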

# validation set
validation_data = data[:num_validation_samples]

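# training set: everything that is not in the validation set
# (`data` itself is left untouched so it still holds all non-test samples)
training_data = data[num_validation_samples:]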

# train model on training data; evaluate on validation data
model = get_model()
model.train(training_data)
validation_score = model.evaluate(validation_data)

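# at this point you can tune the model, retrain it, evaluate it, tune it again...

# once you've tuned the hyperparameters, train the final model from scratch
# on all non-test data available
model = get_model()
model.train(np.concatenate([training_data, validation_data]))

# evaluate once on the test set (`test_data` is assumed to have been set aside beforehand)
test_score = model.evaluate(test_data)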
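The cell above relies on `get_model`, `data`, and `test_data` being defined elsewhere. As a self-contained sketch of the same workflow, the version below fills those in with placeholders: `MeanModel` is a hypothetical stand-in (it predicts the mean of the last column and reports mean squared error), and the data is synthetic. The 200-sample set sizes are arbitrary.

```python
import numpy as np

class MeanModel:
    """Hypothetical stand-in model: predicts the mean of the training targets."""

    def train(self, data):
        # The last column is treated as the target.
        self.mean = data[:, -1].mean()

    def evaluate(self, data):
        # Mean squared error on the target column (lower is better).
        return float(np.mean((data[:, -1] - self.mean) ** 2))

def get_model():
    return MeanModel()

rng = np.random.default_rng(0)
all_data = rng.normal(size=(1000, 3))  # synthetic rows; last column is the target

# Set the test set aside first, then work only with the remainder.
test_data = all_data[:200].copy()
data = all_data[200:].copy()

num_validation_samples = 200
rng.shuffle(data)  # shuffles rows in place

validation_data = data[:num_validation_samples]
training_data = data[num_validation_samples:]

# Train on the training set, score on the validation set.
model = get_model()
model.train(training_data)
validation_score = model.evaluate(validation_data)

# After tuning, retrain from scratch on all non-test data and test once.
final_model = get_model()
final_model.train(np.concatenate([training_data, validation_data]))
test_score = final_model.evaluate(test_data)

print(validation_score, test_score)
```

Splitting off `test_data` before any shuffling or tuning keeps it out of every decision made with the validation score.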

<h4 align="center">K-fold validation</h4> 
![kfoldvalidation](img/3foldvalidation.jpeg)

Split your data into K partitions of equal size. For each partition i, train a model on the remaining K – 1 partitions and evaluate it on partition i. Your final score is the average of the K scores obtained. This method is helpful when your model's performance varies significantly depending on the train-test split. Like hold-out validation, it doesn't exempt you from using a distinct validation set for model calibration.

In [None]:
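k = 4
# integer division: if len(data) is not a multiple of k, the leftover samples
# at the end are used for training only and never for validation
num_validation_samples = len(data) // k
np.random.shuffle(data)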
validation_scores = []

for fold in range(k):
    # validation-data partition
    validation_data = data[num_validation_samples*fold : num_validation_samples*(fold+1)]
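    # the rest goes into the training data; np.concatenate is needed because
    # `+` on NumPy arrays adds element-wise instead of joining them
    training_data = np.concatenate([
        data[:num_validation_samples * fold],
        data[num_validation_samples * (fold + 1):],
    ])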
    
    # create new, untrained instance of model
    model = get_model()
    model.train(training_data)
    
    validation_score = model.evaluate(validation_data)
    validation_scores.append(validation_score)
    
# average of all validation scores of the k folds
validation_score = np.average(validation_scores)

# train final model on all non-test data available
model = get_model()
model.train(data)
test_score = model.evaluate(test_data)
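The loop above slices `data` directly and assumes its length divides evenly by `k`. One alternative, sketched below, is to split shuffled indices with `np.array_split`, which lets fold sizes differ by at most one so every sample is used for validation exactly once. The helper name `kfold_indices` and its `seed` parameter are made up for illustration.

```python
import numpy as np

def kfold_indices(num_samples, k, seed=0):
    """Yield (train_idx, val_idx) index arrays for each of the k folds."""
    if k < 2:
        raise ValueError("k must be at least 2")
    rng = np.random.default_rng(seed)
    indices = rng.permutation(num_samples)
    # array_split tolerates num_samples not being a multiple of k.
    folds = np.array_split(indices, k)
    for i in range(k):
        val_idx = folds[i]
        # Join every fold except the i-th into the training indices.
        train_idx = np.concatenate(folds[:i] + folds[i + 1:])
        yield train_idx, val_idx

# Usage: 10 samples split into 4 folds.
for train_idx, val_idx in kfold_indices(10, 4):
    print(len(train_idx), len(val_idx))
```

Inside the loop you would index your array with `data[train_idx]` and `data[val_idx]`, then train and evaluate a fresh model exactly as in the cell above.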