## 1. Hold-out validation

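Hold-out validation is the simplest evaluation protocol: set apart a fraction of the data as a validation set, train on the rest, and evaluate on the held-out part. However, if little data is available, your validation and test sets may contain too few samples to be statistically representative of the data at hand.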

In [None]:
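import numpy as np

# `data`, `test_data`, and `get_model()` are assumed to be defined elsewhere.
num_validation_samples = 10000

# Shuffle before splitting so that both sets are representative
np.random.shuffle(data)

# Hold out the first samples for validation and train on the rest
validation_data = data[:num_validation_samples]
training_data = data[num_validation_samples:]

model = get_model()
model.train(training_data)
validation_score = model.evaluate(validation_data)

# At this point, you can tune your model, retrain it, evaluate it and tune it again

# Once the hyperparameters are tuned, train the final model from scratch
# on all non-test data and evaluate it once on the test set
model = get_model()
model.train(np.concatenate([training_data, validation_data]))
test_score = model.evaluate(test_data)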



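## 2. K-fold validation

Split your data into K partitions of equal size. For each partition, train a model on the remaining K - 1 partitions and evaluate it on that partition. The final score is the average of the K scores obtained.
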
In [None]:
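import numpy as np

# `data`, `test_data`, and `get_model()` are assumed to be defined elsewhere.
k = 4
num_validation_samples = len(data) // k

np.random.shuffle(data)

validation_scores = []
for fold in range(k):
    # Select the validation partition for this fold
    start = num_validation_samples * fold
    stop = num_validation_samples * (fold + 1)
    validation_data = data[start:stop]
    # Use the remaining samples for training; np.concatenate joins the two
    # slices, whereas `+` would add NumPy arrays element-wise
    training_data = np.concatenate([data[:start], data[stop:]])

    # Train a fresh, untrained model on each fold
    model = get_model()
    model.train(training_data)
    validation_score = model.evaluate(validation_data)
    validation_scores.append(validation_score)

# The validation score is the average over the k folds
validation_score = np.average(validation_scores)

# Train the final model on all non-test data
model = get_model()
model.train(data)
test_score = model.evaluate(test_data)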

## 3. Iterated K-fold validation with shuffling

Use this when relatively little data is available and you need to evaluate your model as precisely as possible. It can be extremely helpful in Kaggle competitions. It consists of applying K-fold validation multiple times, shuffling the data every time before splitting it K ways. The final score is the average of the scores obtained at each run of K-fold validation.
Note that this means training and evaluating P × K models (where P is the number of iterations), which can be very expensive.
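
The procedure described above can be sketched as follows. This is a minimal illustration and not a reference implementation: the function name `iterated_k_fold`, the `train_and_evaluate` callback, and the toy `mean_baseline` model are all hypothetical names introduced here. The callback stands in for the `get_model()` / `train` / `evaluate` steps used in the earlier cells.

```python
import numpy as np


def iterated_k_fold(data, k, iterations, train_and_evaluate, seed=0):
    """Average the validation score over `iterations` shuffled runs of K-fold.

    `train_and_evaluate(training_data, validation_data)` is a hypothetical
    callback that trains a fresh model and returns its validation score.
    """
    rng = np.random.default_rng(seed)
    fold_size = len(data) // k
    run_scores = []
    for _ in range(iterations):
        # Reshuffle before every run so that each run uses different folds
        shuffled = data[rng.permutation(len(data))]
        fold_scores = []
        for fold in range(k):
            start, stop = fold_size * fold, fold_size * (fold + 1)
            validation_data = shuffled[start:stop]
            training_data = np.concatenate([shuffled[:start], shuffled[stop:]])
            fold_scores.append(train_and_evaluate(training_data, validation_data))
        # Score of one full K-fold run
        run_scores.append(np.mean(fold_scores))
    # Final score: average over all runs (iterations * k models were trained)
    return float(np.mean(run_scores))


# Toy usage: the "model" predicts the training mean, and the score is the
# mean absolute error on the validation fold (assumption for illustration).
def mean_baseline(training_data, validation_data):
    prediction = training_data.mean()
    return float(np.abs(validation_data - prediction).mean())


toy_data = np.arange(20, dtype=float)
score = iterated_k_fold(toy_data, k=4, iterations=3, train_and_evaluate=mean_baseline)
```

With `k=4` and `iterations=3`, the callback runs 12 times, which is the P × K cost mentioned above. As in the K-fold cell, samples left over when `len(data)` is not divisible by `k` always end up in the training split.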

## Things to pay attention to
- Data representativeness: both the training set and the test set should be representative of the data at hand. Usually you should randomly shuffle the data before splitting it into training and test sets.
- The arrow of time: when predicting the future from the past, do not randomly shuffle the data before splitting, because that creates a temporal leak. Always make sure all data in your test set is posterior to the data in the training set.
- Redundancy in your data: if some data points appear more than once, shuffling and splitting can put the same point in both the training set and the validation set. Make sure the two sets are disjoint.
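
The last two points can be handled with a few lines of NumPy. The sketch below is only one possible approach, under the assumptions that each sample has a sortable timestamp and that duplicates are exact row copies. The helper names `chronological_split` and `drop_duplicate_rows` are hypothetical.

```python
import numpy as np


def chronological_split(timestamps, test_fraction=0.2):
    """Return (train_idx, test_idx) with no test sample earlier than any training sample."""
    # Sort sample indices from oldest to newest
    order = np.argsort(timestamps, kind="stable")
    num_test = int(len(order) * test_fraction)
    split = len(order) - num_test
    # The most recent samples form the test set
    return order[:split], order[split:]


def drop_duplicate_rows(data):
    """Remove exact duplicate rows before splitting (rows come back sorted)."""
    return np.unique(data, axis=0)


# Toy usage with made-up values
timestamps = np.array([5, 1, 4, 2, 3])
train_idx, test_idx = chronological_split(timestamps, test_fraction=0.4)

rows = np.array([[1, 2], [3, 4], [1, 2]])
unique_rows = drop_duplicate_rows(rows)
```

Note that `np.unique` does not preserve the original row order, so deduplicate before shuffling or sorting by time, and keep any timestamp column inside the rows only if a repeated sample at a different time should count as distinct.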
