# Cross Validation in Python
Splitting your data into training, validation, and test sets may seem straightforward, but there are a few advanced ways to do it that can come in handy when little data is available.

Here, we will review **three** classic evaluation recipes:

- Hold-Out Validation
- K-Fold Validation
- Iterated K-Fold Validation with Shuffling

## Hold-Out Validation
Hold-out validation sets aside a fixed number of samples as the validation set and never swaps them for a different sample of the data. This is the simplest evaluation protocol, but it has the largest flaw: **if little data is available, your validation and test sets may contain too few samples to be statistically representative of the data at hand.**

#### Hold-out validation in Python
```python
import numpy as np

# Assumes `data` and `test_data` are NumPy arrays and `get_model()` returns
# an object with `train` and `evaluate` methods, all defined elsewhere.
num_validation_samples = 10000

np.random.shuffle(data)  # Shuffling the data is usually appropriate

validation_data = data[:num_validation_samples]  # Defines the validation set
training_data = data[num_validation_samples:]  # Defines the training set

# Trains a model on the training data, and evaluates it on the validation data
model = get_model()
model.train(training_data)
validation_score = model.evaluate(validation_data)

###########################################
# At this point you can tune your model,
# retrain it, evaluate it, tune it again...
###########################################

# Once you’ve tuned your hyperparameters,
# it’s common to train your final model
# from scratch on all non-test data available.
model = get_model()
model.train(np.concatenate([training_data, validation_data]))
test_score = model.evaluate(test_data)
```

## K-Fold Validation
K-fold validation splits the data into $K$ partitions of equal size. For each partition $i$, train a model on the remaining $K - 1$ partitions and evaluate it on partition $i$. Your final score is then the average of the $K$ scores obtained.

**This method is helpful when the performance of your model shows significant variance based on your train-test split.**

#### K-Fold Validation in Python

```python
import numpy as np

# Assumes `data`, `test_data`, and `get_model()` are defined elsewhere
k = 4
num_validation_samples = len(data) // k

np.random.shuffle(data)

validation_scores = []
for fold in range(k):
    # Selects the validation-data partition
    start = num_validation_samples * fold
    stop = num_validation_samples * (fold + 1)
    validation_data = data[start:stop]
    # Uses the remainder of the data as training data. np.concatenate joins
    # the two slices; `+` would add NumPy arrays element-wise instead.
    training_data = np.concatenate([data[:start], data[stop:]])

    # Creates a brand-new instance of the model (untrained)
    model = get_model()
    model.train(training_data)  # Trains the model
    validation_score = model.evaluate(validation_data)  # Evaluates on this fold's validation data
    validation_scores.append(validation_score)  # Stores the result

# Validation score: average of the validation scores of the k folds
validation_score = np.average(validation_scores)

# Trains the final model on all non-test data available
model = get_model()
model.train(data)
test_score = model.evaluate(test_data)
```

## Iterated K-Fold Validation with Shuffling
**This one is for situations in which you have relatively little data available and you need to evaluate your model as precisely as possible.**  It consists of applying $K$-fold validation multiple times, shuffling the data every time before splitting it $K$ ways. The final score is the average of the scores obtained at each run of $K$-fold validation. Note that you end up training and evaluating $P \times K$ models (where $P$ is the number of iterations you use), which can be very expensive.
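
#### Iterated K-Fold Validation in Python
The description above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `MeanModel` is a hypothetical toy stand-in for a real model, `k_fold_scores` and `iterated_k_fold` are names invented here, and the data is a plain Python list so the example runs with the standard library alone.

```python
import random
import statistics


class MeanModel:
    """Hypothetical toy model: predicts the mean of the training targets."""

    def train(self, samples):
        self.mean = statistics.fmean([y for _, y in samples])

    def evaluate(self, samples):
        # Mean absolute error on the given samples (lower is better)
        return statistics.fmean([abs(y - self.mean) for _, y in samples])


def k_fold_scores(data, k):
    """Return one validation score per fold; data is assumed already shuffled."""
    fold_size = len(data) // k
    scores = []
    for fold in range(k):
        start, stop = fold_size * fold, fold_size * (fold + 1)
        validation_data = data[start:stop]
        training_data = data[:start] + data[stop:]  # list concatenation
        model = MeanModel()  # brand-new, untrained model for every fold
        model.train(training_data)
        scores.append(model.evaluate(validation_data))
    return scores


def iterated_k_fold(data, k, p, seed=0):
    """Run K-fold validation p times, reshuffling the data before each run."""
    rng = random.Random(seed)
    data = list(data)  # copy, so the caller's list is not reordered
    run_scores = []
    all_scores = []
    for _ in range(p):
        rng.shuffle(data)
        scores = k_fold_scores(data, k)
        all_scores.extend(scores)
        run_scores.append(statistics.fmean(scores))
    # Final score: average of the scores obtained at each K-fold run
    return statistics.fmean(run_scores), all_scores


data = [(x, 2.0 * x) for x in range(20)]
final_score, all_scores = iterated_k_fold(data, k=4, p=3)
print(len(all_scores))  # 12 = P * K models trained and evaluated
```

With $P = 3$ and $K = 4$, the loop trains and evaluates 12 models, which is where the cost of this protocol comes from.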

### Things to keep in mind
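Whether to shuffle depends on the attributes of your data, and the right choice is usually obvious. With a time series data set, where the order of the samples carries meaning, do not shuffle before splitting. If the data is ordered in a way that makes a contiguous slice unrepresentative of the data as a whole (for example, samples sorted by class), shuffle before splitting.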
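
For time-ordered data, shuffling would let the model train on samples that come after the ones it is validated on. One option is to split chronologically instead. The sketch below assumes the samples are already sorted by time; `chronological_split` and the `series` values are hypothetical names and data made up for illustration.

```python
def chronological_split(samples, validation_fraction=0.2):
    """Split time-ordered samples without shuffling.

    The oldest samples go to training and the newest to validation.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    num_validation_samples = int(len(samples) * validation_fraction)
    split_index = len(samples) - num_validation_samples
    return samples[:split_index], samples[split_index:]


# Hypothetical daily readings as (day, value) pairs, already sorted by day
series = [(day, 10.0 + 0.5 * day) for day in range(10)]
training_data, validation_data = chronological_split(series, validation_fraction=0.2)

# Last training day, then first validation day
print(training_data[-1][0], validation_data[0][0])  # 7 8
```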

**Additionally, make sure your training set and validation set are disjoint.** That is, if some data points in your data appear twice (fairly common with real-world data), then shuffling the data and splitting it into a training set and a validation set can result in redundancy between the training and validation sets. In effect, you’ll be testing on part of your training data.
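
One way to guard against this is to drop exact duplicates before shuffling and then check that the two splits share no samples. The sketch below assumes samples are hashable (tuples here); `deduplicate`, `overlap`, and the `raw` data are hypothetical names and values used only for illustration.

```python
import random


def deduplicate(samples):
    """Drop exact duplicates, keeping the first occurrence of each sample."""
    return list(dict.fromkeys(samples))  # samples must be hashable


def overlap(training_data, validation_data):
    """Return the set of samples that appear in both splits."""
    return set(training_data) & set(validation_data)


# Hypothetical data set in which two samples appear twice
raw = [(1, "a"), (2, "b"), (1, "a"), (3, "c"), (2, "b"), (4, "d")]

unique = deduplicate(raw)
rng = random.Random(0)
rng.shuffle(unique)

validation_data = unique[:2]
training_data = unique[2:]

print(len(unique))  # 4
print(overlap(training_data, validation_data))  # set()
```

Splitting `raw` directly, without deduplicating, can place a copy of the same sample on both sides: `overlap(raw[:3], raw[3:])` contains `(2, "b")`.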