# Training

In this chapter we will use everything we have learned so far to finally train a neural network 🤖 using neuroimaging data 🧠.

So, get ready for your new favorite hobby 🤓

<p>
<img src="https://programmerhumor.io/wp-content/uploads/2023/05/programmerhumor-io-python-memes-programming-memes-d917ae7c7cb4095-758x495.jpg" width=500/>
<figcaption>Taken from <a href="https://programmerhumor.io/python-memes/its-been-a-decade/">https://programmerhumor.io/python-memes/its-been-a-decade/</a></figcaption>
</p>

## 1. What is training?

During model training (the sentence continues here 👇)

```python
...
neural_net = NeuralNet()          # the neural net              
for nifti, age in dl:             # is provided with training input
  nifti_age = neural_net(nifti)   # and the neural net's output
  error = (age - nifti_age) ** 2  # is compared to the desired output
  error.backward()                # and the gradients for adjusting the net are computed (using Autograd 🦮)
```
Wow, this code is already pretty close to our goal (only 3 more lines to make it work). We are missing just one final ingredient: the optimizer!
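
To see why the optimizer is still needed, here is a minimal sketch showing that `error.backward()` only fills in the gradients and does not change any parameter by itself. `toy_net`, `toy_input` and `toy_age` are made-up stand-ins (assumptions) for `NeuralNet`, a preprocessed NIfTI and its age:

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for NeuralNet: one linear layer, 4 inputs -> 1 output
toy_net = torch.nn.Linear(4, 1)
toy_input = torch.randn(1, 4)     # stand-in for one preprocessed NIfTI
toy_age = torch.tensor([30.0])    # stand-in for the true age

weights_before = toy_net.weight.detach().clone()

toy_prediction = toy_net(toy_input).squeeze(1)
error = ((toy_age - toy_prediction) ** 2).mean()
error.backward()                  # fills .grad for every parameter

print(toy_net.weight.grad)                                   # the gradients now exist
print(torch.equal(weights_before, toy_net.weight.detach()))  # the weights did not change
```

The gradients are there, but nobody has used them yet. That is the optimizer's job.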

## 2. `torch.optim`

The module `torch.optim` provides a collection of optimizers which handle the adjustment of (neural net) parameters.

First the optimizer is given the parameters it should update

```python
...   
neural_net = NeuralNet()
optimizer = torch.optim.Adam(neural_net.parameters())  # creating the optimizer
```
then it can be applied in the above code by adding one line at the beginning and one at the end of the training loop

```python          
for nifti, age in dl:
  optimizer.zero_grad()           # resetting the parameters' gradients to zero
  nifti_age = neural_net(nifti)
  error = (age - nifti_age) ** 2
  error.backward()
  optimizer.step()                # adjusting the parameters (using the gradients from Autograd 🦮)
```
```
Those are the three missing lines which make the above code actually work.
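
If you want to try the complete loop without loading any NIfTI files, the following sketch runs it on synthetic data. Everything here is an assumption for illustration: `toy_net` is a single linear layer instead of `NeuralNet`, `toy_dl` serves random feature vectors instead of brain scans, and the error is averaged over the batch so that `backward()` receives a single number.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-ins (assumptions): 64 "scans" with 10 features each
# and "ages" which depend linearly on those features
features = torch.randn(64, 10)
ages = features @ torch.randn(10)
toy_dl = DataLoader(TensorDataset(features, ages), batch_size=8)

toy_net = torch.nn.Linear(10, 1)  # hypothetical stand-in for NeuralNet
optimizer = torch.optim.Adam(toy_net.parameters(), lr=1e-2)

epoch_errors = []
for epoch in range(50):
    total = 0.0
    for x, age in toy_dl:
        optimizer.zero_grad()                        # reset the gradients
        predicted_age = toy_net(x).squeeze(1)        # shape (8, 1) -> (8,)
        error = ((age - predicted_age) ** 2).mean()  # one number per batch
        error.backward()                             # compute the gradients
        optimizer.step()                             # adjust the parameters
        total += error.item()
    epoch_errors.append(total / len(toy_dl))

print(f"first epoch: {epoch_errors[0]:.3f}, last epoch: {epoch_errors[-1]:.3f}")
```

The `.squeeze(1)` matters in this sketch: without it, subtracting a tensor of shape `(8, 1)` from one of shape `(8,)` would silently broadcast to shape `(8, 8)`.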

The two most common ways to tinker with the optimizer are by
1. changing the learning rate to e.g. `1e-2` via `torch.optim.Adam(neural_net.parameters(), lr=1e-2)`
2. changing the optimizer e.g. using `torch.optim.SGD(...)` instead of `torch.optim.Adam(...)`

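`torch.optim.SGD(...)` - Stochastic Gradient Descent - simply changes each parameter $\theta$ by $-lr \cdot \nabla_\theta$, i.e. $\theta \leftarrow \theta - lr \cdot \nabla_\theta$, where $\nabla_\theta$ is the gradient of the error with respect to $\theta$ (re-read 5_PyTorch, 2. Autograd 🦮)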
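
The SGD update can be checked by hand. This is a small sketch with made-up data and a hypothetical `toy_net`, using plain SGD without momentum or weight decay (the settings used when only `lr` is passed):

```python
import torch

torch.manual_seed(0)
lr = 0.1

toy_net = torch.nn.Linear(3, 1)   # hypothetical stand-in for NeuralNet
optimizer = torch.optim.SGD(toy_net.parameters(), lr=lr)

x = torch.randn(5, 3)             # made-up inputs
target = torch.randn(5)           # made-up desired outputs

optimizer.zero_grad()
error = ((target - toy_net(x).squeeze(1)) ** 2).mean()
error.backward()

# The update rule computed by hand: parameter - lr * gradient
expected_weight = toy_net.weight.detach() - lr * toy_net.weight.grad

optimizer.step()                  # the same update performed by the optimizer

print(torch.allclose(toy_net.weight.detach(), expected_weight))
```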

`torch.optim.Adam(...)` is typically **your optimizer of choice** (doing [a bit more](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html)) and usually works well with `lr` between `1e-5` and `1e-1` (the default is `1e-3`)
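
If you are unsure which learning rate an optimizer is actually using, you can look it up in its `param_groups`. A short sketch, again with a hypothetical `toy_net` in place of `NeuralNet`:

```python
import torch

toy_net = torch.nn.Linear(3, 1)   # hypothetical stand-in for NeuralNet

adam_default = torch.optim.Adam(toy_net.parameters())          # default learning rate
adam_custom = torch.optim.Adam(toy_net.parameters(), lr=1e-2)  # changed learning rate
sgd = torch.optim.SGD(toy_net.parameters(), lr=1e-2)           # changed optimizer

optimizers = [
    ("Adam (default)", adam_default),
    ("Adam (lr=1e-2)", adam_custom),
    ("SGD (lr=1e-2)", sgd),
]
for name, opt in optimizers:
    # every optimizer stores its settings in a list of parameter groups
    print(name, opt.param_groups[0]["lr"])
```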


# Exercise

## 🚨 Warning 🚨

This Notebook builds on 1_Introduction and the exercise of 2_Data_Exploration and 4_Preprocessing.

You have to run these Notebooks (if you haven't already) and mount your Google Drive to this Notebook via
```python
from google.colab import drive
drive.mount('/content/drive')
```
then you are ready to go!

1. Use the exercise solution of 6_Convolutional_Neural_Net and add the training loop shown in section 2 `torch.optim`!

2. Print `error`, `nifti_age` and `age` inside the training loop!

3. Use the exercise solution of 5_PyTorch to **create separate training and validation DataLoaders**.

4. Run the training loop (with print statements) **5 times** - using an additional `for` loop - with the **training DataLoader**!

5. Use the **trained `neural_net`** to run the loop (with print statements) **one time** with the **validation DataLoader**. **Delete/comment out all lines which could adjust the neural net's parameters** as we do not want to train during validation.