### Simple Holdout validation

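- Set apart some fraction of the data as a test set.
- Train on the remaining data.
- Evaluate on the test set.
- To avoid information leaks, also reserve a validation set: tuning hyperparameters against the test set would leak information about it into the model.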

![image.png](./../public/image.png)

```python
import numpy as np

# `data`, `test_data`, and `get_model()` are assumed to be defined elsewhere.

# Hyperparameter tuning
num_validation_samples = 10000
np.random.shuffle(data)  # shuffle in place before splitting
validation_data = data[:num_validation_samples]
training_data = data[num_validation_samples:]
model = get_model()
model.fit(training_data, ...)
validation_score = model.evaluate(validation_data, ...)

...

# Final model, trained with the best hyperparameters found
# on all non-test data and evaluated once on the test set
model = get_model()
model.fit(np.concatenate([training_data, validation_data]), ...)
test_score = model.evaluate(test_data, ...)
```

This is the simplest evaluation protocol.

### Problems with it

- With little data available, the validation and test sets may contain too few samples to be statistically representative of the data at hand.
- This is easy to recognize: if different random shuffling rounds of the data before splitting yield very different measures of model performance, the holdout estimate is unreliable.
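The shuffle sensitivity described above can be checked directly. The following is a minimal sketch, not part of the original notes: it uses synthetic data and a least-squares fit (`np.linalg.lstsq`) as a stand-in for a real model, and the helper name `holdout_score` is hypothetical. It repeats the holdout split with different shuffles and reports how far the validation scores spread.

```python
import numpy as np

def holdout_score(data, targets, num_validation_samples, rng):
    """Shuffle, split once, fit least squares on the training part, return validation MSE."""
    indices = rng.permutation(len(data))
    val_idx = indices[:num_validation_samples]
    train_idx = indices[num_validation_samples:]
    # Stand-in "model" (an assumption for illustration): ordinary least squares.
    weights, *_ = np.linalg.lstsq(data[train_idx], targets[train_idx], rcond=None)
    predictions = data[val_idx] @ weights
    return float(np.mean((predictions - targets[val_idx]) ** 2))

# Small synthetic dataset: 40 samples, 3 features, noisy linear targets.
rng = np.random.default_rng(0)
data = rng.normal(size=(40, 3))
targets = data @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=1.0, size=40)

# Repeat the holdout split 20 times with a different shuffle each time.
scores = [holdout_score(data, targets, 8, rng) for _ in range(20)]
spread = max(scores) - min(scores)
print(f"mean MSE: {np.mean(scores):.3f}, min-max spread: {spread:.3f}")
```

A spread that is large relative to the mean suggests that a single holdout split is too noisy for this amount of data.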

# K-fold validation

- Split the data into K partitions of equal size.
- For each partition i, train a model on the remaining K-1 partitions and evaluate it on partition i.
- The final score is the average of the K scores obtained.
- This method helps when the model's performance varies significantly depending on the train/test split.

![image.png](./../public/image2.png)


### K-fold cross validation

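```python
import numpy as np

# `data`, `test_data`, and `get_model()` are assumed to be defined elsewhere.
k = 3
num_validation_samples = len(data) // k
np.random.shuffle(data)
validation_scores = []
for fold in range(k):
    # Select the validation partition for this fold.
    validation_data = data[num_validation_samples * fold:
                           num_validation_samples * (fold + 1)]
    # Train on everything else; np.concatenate takes a sequence of arrays.
    training_data = np.concatenate([
        data[:num_validation_samples * fold],
        data[num_validation_samples * (fold + 1):],
    ])
    model = get_model()  # fresh, untrained model for each fold
    model.fit(training_data, ...)
    validation_score = model.evaluate(validation_data, ...)
    validation_scores.append(validation_score)

# Final validation score: average over the K folds.
validation_score = np.average(validation_scores)
# Train the final model on all non-test data.
model = get_model()
model.fit(data, ...)
test_score = model.evaluate(test_data, ...)
```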
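One detail of the loop above: because the fold size is `len(data) // k`, any leftover samples (when the length is not divisible by K) always land in the training part and are never used for validation. The following is a sketch of one possible alternative, assuming it is acceptable for fold sizes to differ by one. It builds index pairs with `np.array_split`; the helper name `kfold_indices` is hypothetical.

```python
import numpy as np

def kfold_indices(num_samples, k, rng):
    """Yield (train_idx, val_idx) pairs so that every sample is validated exactly once."""
    indices = rng.permutation(num_samples)
    folds = np.array_split(indices, k)  # fold sizes differ by at most one
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

rng = np.random.default_rng(0)
splits = list(kfold_indices(10, 3, rng))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```

Working with indices rather than slicing the array itself also makes it easy to keep features and targets aligned, since both can be indexed with the same `train_idx` and `val_idx`.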


# Iterated K-Fold Validation with shuffling

- Intended for situations in which we have relatively little data available and need to evaluate the model as precisely as possible.

- Consists of applying K-fold validation multiple times, shuffling the data each time before splitting it K ways.

- The final score is the average of the scores obtained at each run of K-fold validation.

- We end up training and evaluating P*K models (where P is the number of iterations), which can be very expensive.
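
The procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the data is synthetic, `fit_and_score` is a hypothetical least-squares stand-in for the `get_model()` / `fit` / `evaluate` steps, and the score is mean squared error. The counter confirms that P*K models are trained.

```python
import numpy as np

def fit_and_score(train_x, train_y, val_x, val_y):
    """Hypothetical stand-in for get_model()/fit/evaluate: least squares, returns validation MSE."""
    weights, *_ = np.linalg.lstsq(train_x, train_y, rcond=None)
    return float(np.mean((val_x @ weights - val_y) ** 2))

def iterated_kfold(data, targets, k, p, rng):
    """Run K-fold validation p times, reshuffling before each run."""
    run_scores = []
    models_trained = 0
    for _ in range(p):
        indices = rng.permutation(len(data))  # new shuffle for every run
        folds = np.array_split(indices, k)
        fold_scores = []
        for i in range(k):
            val_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            fold_scores.append(fit_and_score(
                data[train_idx], targets[train_idx],
                data[val_idx], targets[val_idx],
            ))
            models_trained += 1
        run_scores.append(np.mean(fold_scores))  # score of this K-fold run
    # Final score: average over the p runs.
    return float(np.mean(run_scores)), models_trained

rng = np.random.default_rng(0)
data = rng.normal(size=(30, 2))
targets = data @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=30)
final_score, models_trained = iterated_kfold(data, targets, k=3, p=5, rng=rng)
print(final_score, models_trained)
```

With `k=3` and `p=5` this trains 15 models, which illustrates why the cost grows quickly for models that are slow to train.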