# Hyperparameters and validation sets

## Prerequisites
- [Bias, variance and generalisation]()
- [Tensorboard]()

Run `tensorboard --logdir=runs` in your terminal to start TensorBoard. Then run the cell below to get some data, create a training data loader, train a neural network on the training set and plot its loss in TensorBoard (view it at [localhost:6006](http://localhost:6006)).

In [8]:
import torch
import torch.nn.functional as F # functional API, used for F.relu in the forward pass
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torch.utils.tensorboard import SummaryWriter

batch_size = 32

train_data = datasets.MNIST(
    root='MNIST-data',
    transform=transforms.ToTensor(),
    train=True,
    download=True
)

train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)

print(len(train_loader))

class NN(torch.nn.Module): # create a neural network class
    def __init__(self): # initialiser
        super().__init__() # initialise the parent class
        self.layer1 = torch.nn.Linear(784, 1024) # create our first linear layer
        self.layer2 = torch.nn.Linear(1024, 256) # create our second linear layer
        self.layer3 = torch.nn.Linear(256, 10) # create our third linear layer
        
    def forward(self, x): # define the forward pass
        x = x.view(-1, 784) # flatten out our image features into vectors
        x = self.layer1(x) # pass through the first linear layer
        x = F.relu(x) # apply activation function
        x = self.layer2(x) # pass through the second linear layer
        x = F.relu(x) # apply activation function
        x = self.layer3(x) # pass through the third linear layer to get raw class scores (logits)
        return x # return logits; CrossEntropyLoss applies log-softmax internally, so no softmax here
    
my_nn = NN() # initialise our model

# CREATE OUR OPTIMISER
optimiser = torch.optim.Adam(              # what optimiser should we use?
    my_nn.parameters(),          # what should it optimise?
)
        
# CREATE OUR CRITERION
criterion = torch.nn.CrossEntropyLoss() # returns a callable object that compares our predictions to our labels and returns our loss

# SET UP TRAINING VISUALISATION
writer = SummaryWriter() # we will use this to show our model's performance on a graph
    
# TRAINING LOOP
def train(model, epochs):
    for epoch in range(epochs):
        for idx, minibatch in enumerate(train_loader): # for each mini-batch sampled from the training dataloader
            inputs, labels = minibatch # unpack the inputs and labels from the minibatch
            prediction = model(inputs) # pass the data forward through the model
            loss = criterion(prediction, labels) # compute the loss
            print('Epoch:', epoch, '\tBatch:', idx, '\tLoss:', loss.item()) # .item() gives a plain Python float
            optimiser.zero_grad() # reset the gradients attribute of each of the model's params to zero
            loss.backward() # backward pass to compute and set all of the model param's gradients
            optimiser.step() # update the model's parameters
            writer.add_scalar('Loss/Train', loss, epoch*len(train_loader) + idx) # write loss to a graph
            
            
train(my_nn, 8) # train for 8 epochs

1875

## Why should we not tune our hyperparameters based on our model's score on the test set?

If we adjust our model's hyperparameters by evaluating its performance on the test set, then we are fitting those hyperparameters to the test set, and the test score stops being a fair estimate of how the model performs on unseen data.

This is like training for a test and evaluating your performance based on how well you can answer the exact questions that come up.
In real life you are unlikely to encounter exactly the same challenges, and so by training on them you will overfit and not be able to generalise to *different*, unseen questions.

You may find that a certain set of hyperparameters performs well on the test set, but then fails to perform as well in the wild, when the model is evaluated on data it has never seen.

## What else can we test them on? 

Training our model's weights by evaluating them on the test set will cause our model to overfit. The same is true for tuning our model's hyperparameters, so we can't adjust our hyperparameters by evaluating on the test set.

We can take some of the data that we plan to train the neural network's weights on and separate it from the main training set.
We can then use this split-off data, the validation set, to check that the current hyperparameters make our model perform well on unseen data (both the validation set and the test set are unseen).
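As a rough sketch of what validating hyperparameters could look like, the snippet below trains a small model once per candidate learning rate and keeps the one with the lowest validation loss. Everything in it is an assumption made for illustration: the synthetic random data standing in for MNIST, the `evaluate` helper, the tiny `Sequential` model, the candidate learning rates and the single epoch of training.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion):
    """Return the mean per-example loss of `model` over every batch in `loader`."""
    model.eval()  # switch off training-only behaviour such as dropout
    total_loss, n_examples = 0.0, 0
    with torch.no_grad():  # no gradients are needed when only measuring performance
        for inputs, labels in loader:
            loss = criterion(model(inputs), labels)  # mean loss over this batch
            total_loss += loss.item() * len(labels)
            n_examples += len(labels)
    return total_loss / n_examples

torch.manual_seed(0)

# Hypothetical stand-in data: random 784-feature vectors with labels 0-9
train_set = TensorDataset(torch.rand(256, 784), torch.randint(0, 10, (256,)))
val_set = TensorDataset(torch.rand(64, 784), torch.randint(0, 10, (64,)))
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)
criterion = torch.nn.CrossEntropyLoss()

val_losses = {}
for lr in [1e-2, 1e-3, 1e-4]:  # candidate learning rates (assumed values)
    model = torch.nn.Sequential(
        torch.nn.Linear(784, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, 10),
    )
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for inputs, labels in train_loader:  # one epoch on the training split only
        loss = criterion(model(inputs), labels)
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
    # Score this hyperparameter setting on data the weights were not trained on
    val_losses[lr] = evaluate(model, val_loader, criterion)

best_lr = min(val_losses, key=val_losses.get)
print(val_losses, best_lr)
```

The test set is never touched in this loop, so it can still be used once at the end to estimate how the chosen model generalises.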

PyTorch has a utility function `torch.utils.data.random_split()` that makes it easy to randomly split a dataset. Check out the [docs](https://pytorch.org/docs/stable/data.html#torch.utils.data.random_split).
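A minimal sketch of such a split might look like the following. To keep it runnable without a download, it uses a synthetic `TensorDataset` as a stand-in for the MNIST training data; the 80/20 ratio, the seed and the variable names are assumptions, and the `generator` argument requires a reasonably recent PyTorch release.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in for the MNIST training set: 1000 fake 28x28 images with labels 0-9
# (hypothetical data, used only so the sketch runs without a download)
images = torch.rand(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))
full_train_data = TensorDataset(images, labels)

# Hold out 20% of the training data for validation (the fraction is an assumption)
val_size = int(0.2 * len(full_train_data))
train_size = len(full_train_data) - val_size

# A seeded generator makes the split reproducible between runs
train_data, val_data = random_split(
    full_train_data,
    [train_size, val_size],
    generator=torch.Generator().manual_seed(0),
)

# Shuffle the training data each epoch; validation order does not matter
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)
val_loader = DataLoader(val_data, batch_size=32, shuffle=False)

print(len(train_data), len(val_data))
```

With the real data, `full_train_data` would be replaced by the `datasets.MNIST(...)` training set created earlier, and the lengths passed to `random_split` must sum to the size of that dataset.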