# Validation Techniques

### What is Validation?
Validation is a way of testing and experimenting with your model before **evaluating** it on the test data. It works by holding out part of your data and using it to **tune** your model. We therefore split the data into three sets: training, validation and test.
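
As a rough illustration of the three-way split described above, here is a minimal NumPy sketch. The helper name `train_val_test_split`, the 20% fractions and the toy arrays are assumptions made for this example only; they are not prescribed values.

```python
import numpy as np

def train_val_test_split(X, y, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut the data into training, validation and test sets."""
    n = len(X)
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)          # one permutation keeps X and y aligned
    n_test = int(n * test_fraction)
    n_val = int(n * val_fraction)
    test_idx = order[:n_test]
    val_idx = order[n_test:n_test + n_val]
    train_idx = order[n_test + n_val:]
    return ((X[train_idx], y[train_idx]),
            (X[val_idx], y[val_idx]),
            (X[test_idx], y[test_idx]))

# Toy data (illustrative only): 50 samples with 2 features each
X = np.arange(100, dtype="float32").reshape(50, 2)
y = np.arange(50)

(train_X, train_y), (val_X, val_y), (test_X, test_y) = train_val_test_split(X, y)
print(len(train_X), len(val_X), len(test_X))  # 30 10 10
```

The model is fitted on the training set, tuned against the validation set, and scored once on the test set at the very end.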

### Why is Validation Important?

It has been said that to understand things, we have to observe them. In machine learning, we use validation techniques to observe and tune our models (setting their **hyperparameters**) before finally evaluating them on the test data.

This tuning and experimentation helps your model generalise better to data it has never seen before.

*Note*: *Hyperparameters are the configuration settings of your model, chosen before training rather than learned from the data. For instance, the number of layers your model possesses.*

*Note*: *Information leaks: Every time you tune a hyperparameter of your model based on the model’s performance on the validation set, some information about the validation data leaks into the model. That's why we don't use the validation set to report our model's final performance: the model risks becoming optimised for the validation set. Instead, we use a set of data the model has never seen before, the test data, to evaluate the model's performance.*

### Examples of Validation Techniques

1. Hold-Out Validation

2. K-fold Validation

3. Iterated K-fold Validation with shuffling
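
The second and third techniques in the list can be sketched with plain NumPy index arithmetic. K-fold validation splits the data into K partitions, uses each partition once as the validation set while training on the other K - 1, and averages the K scores. The iterated variant repeats this several times and reshuffles the data before each pass. All function names below (`k_fold_splits`, `iterated_k_fold_splits`, `cross_validate`, `mean_baseline_score`) are hypothetical helpers written for this sketch, and the scoring function is only a stand-in for training and evaluating a real model.

```python
import numpy as np

def k_fold_splits(num_samples, k):
    """Yield (train_idx, val_idx) pairs, one per fold."""
    if k < 2:
        raise ValueError("k must be at least 2")
    folds = np.array_split(np.arange(num_samples), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate(folds[:i] + folds[i + 1:])
        yield train_idx, val_idx

def iterated_k_fold_splits(num_samples, k, iterations, seed=0):
    """Repeat K-fold `iterations` times, reshuffling before each pass."""
    rng = np.random.default_rng(seed)
    for _ in range(iterations):
        order = rng.permutation(num_samples)
        for train_idx, val_idx in k_fold_splits(num_samples, k):
            yield order[train_idx], order[val_idx]

def cross_validate(score_fn, X, y, splits):
    """Average the validation score returned by score_fn over all splits."""
    scores = [score_fn(X[tr], y[tr], X[va], y[va]) for tr, va in splits]
    return float(np.mean(scores))

def mean_baseline_score(X_tr, y_tr, X_va, y_va):
    # Stand-in for "train a fresh model, then score it on the fold":
    # predict the training mean and return the negative mean absolute error.
    return -float(np.mean(np.abs(y_va - y_tr.mean())))

# Toy data (illustrative only)
X = np.arange(20, dtype="float32").reshape(10, 2)
y = np.arange(10, dtype="float32")

print(cross_validate(mean_baseline_score, X, y, k_fold_splits(len(X), k=5)))
print(cross_validate(mean_baseline_score, X, y,
                     iterated_k_fold_splits(len(X), k=5, iterations=3)))
```

With a real network, the scoring function would build and train a fresh model on every fold so that no weights carry over between folds.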

#### Hold-Out Validation
Set apart some fraction of your data as your validation set. Train on the remaining data, and evaluate on the validation set. Once the model is tuned, evaluate it one final time on the test set.

#### Setting up the model

In [None]:
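import numpy as np
# Loading the data
from keras.datasets import mnist # type: ignore
from keras.utils import to_categorical
# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Building the Model
from keras import models
from keras import layers

def build_model():
    # Return a freshly compiled, untrained network
    network = models.Sequential()
    network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
    network.add(layers.Dense(10, activation='softmax'))
    # Compilation Step
    network.compile(optimizer='rmsprop',
                    loss='categorical_crossentropy',
                    metrics=['accuracy'])
    return network

# Preprocessing: flatten each 28x28 image and scale pixel values to [0, 1]
X_train = X_train.reshape((60000, 28 * 28)).astype('float32') / 255
X_test = X_test.reshape((10000, 28 * 28)).astype('float32') / 255

# One-hot encode the labels, as categorical_crossentropy expects
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)


In [7]:
# Numpy
num_validation_set = 10000

# Shuffle images and labels with the same permutation so they stay aligned
indices = np.random.permutation(len(X_train))
X_train, y_train = X_train[indices], y_train[indices]

# Hold out the first 10,000 samples as the validation set
X_val, y_val = X_train[:num_validation_set], y_train[:num_validation_set]

# Train on the remaining 50,000 samples
X_train, y_train = X_train[num_validation_set:], y_train[num_validation_set:]

model = build_model()
# epochs and batch_size are illustrative values to tune, not prescribed ones
model.fit(X_train, y_train, epochs=5, batch_size=128,
          validation_data=(X_val, y_val))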

