Now the fun part begins. **Let's train the I3D network on the HMDB-51 dataset!**  

First, download the pretrained weights listed in [MODEL_ZOO.md](https://github.com/kiyoon/PyVideoAI/blob/master/MODEL_ZOO.md).  
We'll use the I3D pretrained on the Kinetics-400 dataset, with 8-frame input.  

Note that the path to the pretrained weights is defined in `model_configs/i3d_resnet50.py` as below.  
```python
kinetics400_pretrained_path_8x8 = os.path.join(DATA_DIR, 'pretrained', 'kinetics400/i3d_resnet50/I3D_8x8_R50.pkl')
```
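
Before launching training, it can help to confirm that the weight file sits where the model config expects it. The sketch below rebuilds the same path and compares the file size with the 224598662 bytes that `wget` reports in this tutorial. The helper names `pretrained_path_8x8` and `check_pretrained_weights`, and the fallback to a local `data` directory, are assumptions for illustration and not part of PyVideoAI.

```python
import os

# Size in bytes of I3D_8x8_R50.pkl, as reported by wget in this tutorial.
EXPECTED_SIZE = 224598662


def pretrained_path_8x8(data_dir):
    """Mirror the path defined in model_configs/i3d_resnet50.py."""
    return os.path.join(data_dir, 'pretrained',
                        'kinetics400/i3d_resnet50/I3D_8x8_R50.pkl')


def check_pretrained_weights(data_dir):
    """Return 'ok', 'missing' or 'wrong size' for the weight file."""
    path = pretrained_path_8x8(data_dir)
    if not os.path.isfile(path):
        return 'missing'
    return 'ok' if os.path.getsize(path) == EXPECTED_SIZE else 'wrong size'


# Assumption: DATA_DIR is exported as an environment variable; else use ./data.
print(check_pretrained_weights(os.environ.get('DATA_DIR', 'data')))
```

A result of `wrong size` usually points to an interrupted download, in which case downloading the file again is the simplest fix.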

In [1]:
## IMPORTANT: You must change path values in `00-storage_location.py` before executing below.
# Environments for future use

from pyvideoai.config import PYVIDEOAI_DIR, DATA_DIR
%env PYVIDEOAI_DIR=$PYVIDEOAI_DIR
%env DATA_DIR=$DATA_DIR

import os
exec(open(os.path.join(PYVIDEOAI_DIR, 'examples', '00-storage_location.py')).read())
%env HDD_PATH=$HDD_PATH

env: PYVIDEOAI_DIR=/home/kiyoon/project/PyVideoAI
env: DATA_DIR=/home/kiyoon/project/PyVideoAI/data
env: HDD_PATH=/mnt/hdd/kiyoon


In [2]:
# Link the pretrained weight directory to HDD
!mkdir -p "$HDD_PATH/pretrained/kinetics400/i3d_resnet50"
!ln -s "$HDD_PATH/pretrained" "$DATA_DIR/"

# Download
!wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl -P "$DATA_DIR/pretrained/kinetics400/i3d_resnet50"

ln: failed to create symbolic link '/home/kiyoon/project/PyVideoAI/data/pretrained': File exists
--2022-08-19 17:22:40--  https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 2606:4700:10::ac43:904, 2606:4700:10::6816:4a8e, 2606:4700:10::6816:4b8e, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|2606:4700:10::ac43:904|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 224598662 (214M) [application/octet-stream]
Saving to: ‘/home/kiyoon/project/PyVideoAI/data/pretrained/kinetics400/i3d_resnet50/I3D_8x8_R50.pkl’


2022-08-19 17:22:52 (19.1 MB/s) - ‘/home/kiyoon/project/PyVideoAI/data/pretrained/kinetics400/i3d_resnet50/I3D_8x8_R50.pkl’ saved [224598662/224598662]



Now we have our datasets processed, and the pretrained weights ready.  
We want to use three config files.  
- `hmdb.py` in `dataset_configs`,  
- `i3d_resnet50.py` in `model_configs`,  
- `hmdb/i3d_resnet50-crop224_8x8_largejit_plateau_1scrop5tcrop_split1.py` in `exp_configs`.

NOTE: The only difference from the inference example is the definition of the `load_pretrained(model)` function in the exp_config file!

In [3]:
# Optional: Setup Telegram bot to report the experiment stats
import os
if not os.path.isfile(f"{PYVIDEOAI_DIR}/tools/key.ini"):
    !cp "$PYVIDEOAI_DIR/tools/key.ini"{.template,}
# EDIT the `tools/key.ini` file on your own.

### Optional: Set up Weights & Biases

To log to Weights & Biases, install the client and log in first:
```bash
pip install wandb
wandb login
```
Then add the `--wandb_project test` argument to the training command below.

In [5]:
%env CUDA_VISIBLE_DEVICES=0
!"$PYVIDEOAI_DIR/tools/run_singlenode.sh" train 1 -D hmdb -M i3d_resnet50 -E crop224_8x8_largejit_plateau_1scrop5tcrop_split1

env: CUDA_VISIBLE_DEVICES=0
pyvideoai.train_multiprocess:  122 - INFO - PyTorch==1.10.1
pyvideoai.train_multiprocess:  123 - INFO - PyVideoAI==v0.3+236.g4034acb.dirty
pyvideoai.train_multiprocess:  124 - INFO - Experiment folder: /mnt/hdd/kiyoon/PyVideoAI_experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_000 on host rossum
pyvideoai.train_multiprocess:  181 - INFO - args: {
    "num_epochs": "DEPRECATED",
    "experiment_root": "/mnt/hdd/kiyoon/PyVideoAI_experiments",
    "dataset": "hmdb",
    "model": "i3d_resnet50",
    "experiment_name": "crop224_8x8_largejit_plateau_1scrop5tcrop_split1",
    "subfolder_name": null,
    "dataset_channel": null,
    "model_channel": null,
    "experiment_channel": null,
    "save_mode": "last_and_peaks",
    "training_speed": "standard",
    "load_epoch": null,
    "seed": 12,
    "multi_crop_val_period": -1,
    "telegram_post_period

### Saving model weights
Training keeps only the checkpoints that reached a peak validation accuracy, plus the checkpoint of the last epoch. That's why the log shows the previous checkpoint being removed whenever it is not better than the current one.  
This is due to the command line argument `--save_mode last_and_peaks` which is set by default.  
Use `--save_mode all` in order to keep checkpoints from every epoch.  
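
To make the retention rule concrete, here is a minimal sketch of one way `last_and_peaks` could be interpreted: an epoch counts as a peak if its validation accuracy is higher than that of every earlier epoch. This is an assumption based on the description above, not PyVideoAI's actual implementation, and `epochs_to_keep` is a hypothetical helper.

```python
def epochs_to_keep(val_accuracies):
    """Return the epoch indices whose checkpoints would be retained.

    Assumed rule: keep every epoch that set a new best validation
    accuracy, plus the last epoch.
    """
    keep = []
    best = float('-inf')
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best = acc
            keep.append(epoch)
    last = len(val_accuracies) - 1
    if last >= 0 and last not in keep:
        keep.append(last)
    return keep


# Epochs 0, 1 and 3 are new peaks; epoch 4 is kept because it is the last one.
print(epochs_to_keep([0.10, 0.30, 0.20, 0.40, 0.35]))  # [0, 1, 3, 4]
```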

### Experiment output structure
The output directory will be organised as follows:  

```
${experiment_root}
└── ${dataset}
    └── ${model_name}
        └── ${experiment_name}
            └── ${subfolder_name}
                └── version_000
                    ├── configs
                    │   └── args.json
                    ├── logs
                    │   └── summary.csv
                    ├── plots
                    │   ├── accuracy.pdf
                    │   ├── loss.pdf
                    │   ├── video_accuracy_top1.pdf
                    │   └── video_accuracy_top5.pdf
                    ├── (predictions)
                    ├── tensorboard_runs
                    └── weights
                        ├── epoch_0000.pth
                        └── epoch_0001.pth
```

You can define `${subfolder_name}` by adding the `-S` argument. By default, there is no subfolder level.
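
The layout above can be expressed as a small path builder, which is handy when a script needs to locate `summary.csv` or the weights of a finished run. This is a sketch under the assumption that the directory levels are joined exactly as drawn; `experiment_dir` is a hypothetical helper, not a PyVideoAI function.

```python
from pathlib import PurePosixPath


def experiment_dir(experiment_root, dataset, model_name, experiment_name,
                   subfolder_name=None, version=0):
    """Build the output directory of one experiment version."""
    parts = [dataset, model_name, experiment_name]
    if subfolder_name:
        # The subfolder level only exists when -S is given.
        parts.append(subfolder_name)
    parts.append(f'version_{version:03d}')
    return PurePosixPath(experiment_root).joinpath(*parts)


run_dir = experiment_dir('/mnt/hdd/kiyoon/PyVideoAI_experiments', 'hmdb',
                         'i3d_resnet50',
                         'crop224_8x8_largejit_plateau_1scrop5tcrop_split1')
print(run_dir / 'logs' / 'summary.csv')
```

With the arguments shown, `run_dir` matches the experiment folder printed in the training log above.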


### Visualisation
#### Telegram
If you've set up the Telegram bot, it will report the training stats (and even notify you of errors raised while running the script) like this:

<img src="https://user-images.githubusercontent.com/12980409/122335586-7cb10a80-cf76-11eb-950f-af08c20055d4.png" alt="Telegram bot stat report example" width="400">

Not satisfied with the looks of the plots and message? Don't worry, you can customise them easily.  
If the exp_config file doesn't define `telegram_reporter`, the default is as follows:
```python
from pyvideoai.visualisations.telegram_reporter import DefaultTelegramReporter
telegram_reporter = DefaultTelegramReporter()
```

Copy `pyvideoai/visualisations/telegram_reporter.py` and change it as you like. Then use your custom Telegram reporter by adding a few lines to the exp_config, as below.  
```python
from my_telegram_reporter import MyTelegramReporter
telegram_reporter = MyTelegramReporter()
```

#### TensorBoard
Point TensorBoard at the `tensorboard_runs` directory of the experiment, for example `${experiment_root}/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_000/tensorboard_runs`.  
[Example tensorboard link](https://tensorboard.dev/experiment/mGSBcdZfQmWJNd658zHLbQ)

### Early Stopping
The training will stop when the validation loss and accuracy saturate for 20 epochs.  
See `early_stopping_condition()` in the exp_config.
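
As an illustration of a patience-based rule, the sketch below stops once neither the validation loss nor the validation accuracy has improved over the last `patience` epochs. The function name `should_stop`, its arguments and this exact reading of "saturate" are assumptions; the real `early_stopping_condition()` in the exp_config may use a different signature and criterion.

```python
def should_stop(val_losses, val_accuracies, patience=20):
    """Return True if loss and accuracy both stalled for `patience` epochs."""
    if len(val_losses) <= patience:
        return False  # not enough history yet

    # Best values seen before the most recent `patience` epochs.
    best_loss_before = min(val_losses[:-patience])
    best_acc_before = max(val_accuracies[:-patience])

    # Stalled means the recent window never beat the earlier best.
    loss_stalled = min(val_losses[-patience:]) >= best_loss_before
    acc_stalled = max(val_accuracies[-patience:]) <= best_acc_before
    return loss_stalled and acc_stalled


# Small patience for demonstration: no improvement in the last 2 epochs.
print(should_stop([1.0, 0.5, 0.6, 0.7], [0.1, 0.5, 0.4, 0.5], patience=2))  # True
```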

## Evaluate using the saved weight

In [None]:
# -p will save the predictions into a pickle file.
# -l -2 will pick the best model (highest validation accuracy).
# -l -1 will pick the last model
%env CUDA_VISIBLE_DEVICES=0
!"$PYVIDEOAI_DIR/tools/run_singlenode.sh" eval 1 -D hmdb -M i3d_resnet50 -E crop224_8x8_largejit_plateau_1scrop5tcrop_split1 -l -2 -p

## Resume training

### From the last checkpoint
Adding `-l -1` resumes training from the last checkpoint, and `-l -2` loads the best checkpoint instead.  
Alternatively, `-l 50` resumes from the checkpoint of the 50th epoch (and training continues from the 51st epoch).
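
The three forms of `-l` can be summarised with a small resolver. This is only a sketch of the semantics described above: `resolve_checkpoint` is a hypothetical helper, and it assumes the file name encodes the epoch number in the `epoch_0000.pth` style shown in the output structure and that "best" means highest validation accuracy.

```python
def resolve_checkpoint(load_epoch, val_accuracies):
    """Map a -l value to a checkpoint file name.

    val_accuracies[i] is assumed to be the validation accuracy of the
    checkpoint saved as epoch_{i:04d}.pth.
    """
    num_epochs = len(val_accuracies)
    if num_epochs == 0:
        raise ValueError('no checkpoints available')
    if load_epoch == -1:      # last checkpoint
        epoch = num_epochs - 1
    elif load_epoch == -2:    # best checkpoint
        epoch = max(range(num_epochs), key=lambda e: val_accuracies[e])
    elif 0 <= load_epoch < num_epochs:
        epoch = load_epoch
    else:
        raise ValueError(f'invalid load_epoch: {load_epoch}')
    return f'epoch_{epoch:04d}.pth'


accuracies = [0.2, 0.5, 0.4]
print(resolve_checkpoint(-1, accuracies))  # epoch_0002.pth
print(resolve_checkpoint(-2, accuracies))  # epoch_0001.pth
```

Keep in mind that with `--save_mode last_and_peaks` an arbitrary epoch's checkpoint may already have been removed, so an explicit epoch number is only safe for checkpoints that were kept.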

In [None]:
%env CUDA_VISIBLE_DEVICES=0
%env VAI_NUM_EPOCHS=1000
!"$PYVIDEOAI_DIR/tools/run_singlenode.sh" train 1 -D hmdb -M i3d_resnet50 -E crop224_8x8_largejit_plateau_1scrop5tcrop_split1 -l -1

***
# Running TSN / TRN / TSM

Let's train a different model on the same dataset.  


### Dense sampling vs Sparse sampling
The difference between I3D and TSN/TRN/TSM is that the former samples video frames densely, whereas the latter sample them sparsely.  
Note the different dataloader in `i3d_resnet50-crop224_8x8_largejit_plateau_1scrop5tcrop_split1.py` and `tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py` in `exp_configs/hmdb`.  
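
The two strategies can be illustrated with frame indices alone. The sketch below is conceptual and is not the code of either dataloader: it assumes that "8x8" means 8 frames taken with a stride of 8, and that sparse sampling picks the centre frame of each of 8 equal segments, as is common for TSN-style models. Both function names are hypothetical.

```python
def dense_indices(num_video_frames, num_frames=8, stride=8, start=0):
    """Frames at a fixed stride from one clip, clamped to the video length."""
    return [min(start + i * stride, num_video_frames - 1)
            for i in range(num_frames)]


def sparse_indices(num_video_frames, num_segments=8):
    """One frame from the centre of each equal-length segment."""
    segment_length = num_video_frames / num_segments
    return [int(segment_length * i + segment_length / 2)
            for i in range(num_segments)]


# Dense sampling covers a short window; sparse sampling spans the whole video.
print(dense_indices(240))   # [0, 8, 16, 24, 32, 40, 48, 56]
print(sparse_indices(240))  # [15, 45, 75, 105, 135, 165, 195, 225]
```

For a 240-frame video, the dense clip covers only the first 57 frames, which is why dense models are typically evaluated on several temporal crops, while the sparse indices already span the entire video.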


### LR policy
The new exp_config defines an optimiser policy. Refer to `get_optim_policies(model)` in the config file.  
This lets you set different learning rates for different layers of the model.
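
To show the general idea of per-layer learning rates, the sketch below splits named parameters into two groups and gives the classification head a larger learning rate. It is not the policy used by `get_optim_policies(model)`; the helper `build_param_groups`, the `fc` prefix and the multiplier of 10 are all assumptions chosen for illustration.

```python
def build_param_groups(named_params, base_lr, head_prefix='fc',
                       head_lr_mult=10.0):
    """Group parameters so that the head trains with a larger learning rate."""
    backbone, head = [], []
    for name, param in named_params:
        (head if name.startswith(head_prefix) else backbone).append(param)
    return [
        {'params': backbone, 'lr': base_lr},
        {'params': head, 'lr': base_lr * head_lr_mult},
    ]


# Stand-in parameters; with PyTorch you would pass model.named_parameters().
demo_params = [('conv1.weight', 'w0'), ('layer1.0.weight', 'w1'),
               ('fc.weight', 'w2'), ('fc.bias', 'b2')]
for group in build_param_groups(demo_params, base_lr=0.001):
    print(len(group['params']), group['lr'])
```

A list of dictionaries in this form, each with a `params` entry and optional per-group options such as `lr`, is what PyTorch optimisers such as `torch.optim.SGD` accept in place of a flat parameter list.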

### Training

In [None]:
%env CUDA_VISIBLE_DEVICES=0
!"$PYVIDEOAI_DIR/tools/run_singlenode.sh" train 1 -D hmdb -M tsm_resnet50 -E crop224_8frame_largejit_plateau_5scrop_split1

## Changing dataset or model
To change the dataset, simply copy the exp_config into the folder of the new dataset.  
This changes the dataset_config to `dataset_configs/something_v1.py` but keeps the other settings unchanged.



In [3]:
# You can replace `cp` with `ln -s` if you want to link the two files.
!cp "$PYVIDEOAI_DIR/exp_configs/hmdb/tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py" "$PYVIDEOAI_DIR/exp_configs/something_v1/tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py"

To run another 2D network (TSN), simply copy the exp_config to a file name that starts with the new model name.  
This changes the model_config to `model_configs/tsn_resnet50.py` but keeps other settings, such as the dataloader and optimiser, unchanged.

In [4]:
# You can replace `cp` with `ln -s` if you want to link the two files.
!cp "$PYVIDEOAI_DIR/exp_configs/hmdb/tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py" "$PYVIDEOAI_DIR/exp_configs/hmdb/tsn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py"

### Importing from another config file

Okay, cool. But is copying the only way to change experiment settings? What if you have 1000 experiments and can no longer tell which settings differ between them?

Indeed, copying large amounts of code is one of the easiest ways to make mistakes.  
The alternative is to "import" another config file and override only the settings you want to change.  
However, a normal Python import won't work here. For one thing, the file name contains hyphens, which are not allowed in a module name in an `import` statement.

```python
# Example exp_config that loads config from another one.
# Good idea, but would NOT work.
from tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1 import *
input_frame_length = 32
```

Instead, use `_exec_relative_(relpath)`.  
Open a new config file like `exp_configs/hmdb/tsm_resnet50-32frame.py` in an editor and try below.  
```python
# Example exp_config that loads config from another one. Keep all settings except the number of frames.
# Right way
_exec_relative_('tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py')

input_frame_length = 32
```
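
To give an intuition for why this works where `import` does not, here is a minimal sketch of how a loader could provide such a helper: it executes the referenced file in the same namespace as the config that called it, so later assignments simply override earlier ones. This is an assumed mechanism for illustration only, not PyVideoAI's implementation; `load_config`, the demo file names and the `batch_size` setting are hypothetical.

```python
import os
import tempfile


def load_config(path):
    """Execute a config file and return the names it defines."""
    namespace = {}

    def _exec_relative_(relpath):
        # Resolve relpath against the directory of the config being loaded.
        target = os.path.join(os.path.dirname(path), relpath)
        with open(target) as f:
            exec(compile(f.read(), target, 'exec'), namespace)

    namespace['_exec_relative_'] = _exec_relative_
    with open(path) as f:
        exec(compile(f.read(), path, 'exec'), namespace)
    return namespace


# Demo with two throwaway config files whose names contain hyphens.
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, 'base-8frame.py')
    child = os.path.join(tmp, 'child-32frame.py')
    with open(base, 'w') as f:
        f.write("input_frame_length = 8\nbatch_size = 4\n")
    with open(child, 'w') as f:
        f.write("_exec_relative_('base-8frame.py')\ninput_frame_length = 32\n")
    config = load_config(child)

print(config['input_frame_length'], config['batch_size'])  # 32 4
```

Because the file is located by path rather than by module name, hyphens in the file name are not a problem. Note that this simple version resolves every relative path against the top-level config, so it does not handle nested includes from other directories.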

### Change config without copying. Use one config for all!

What if your task is to find the best hyperparameter and you want to try 1000 different learning rates?  
Would you create 1000 config files and edit the learning rate in each one by hand?  
Here's a better way to do so: use ENVIRONMENT VARIABLES!

Notice that the learning rate in the exp_config is defined as
```python
base_learning_rate = float(os.getenv('VAI_BASE_LR', 1e-5))
```
in `optimiser()`. Thus, you can override the learning rate by setting the `VAI_BASE_LR` environment variable; otherwise the default of 1e-5 is used.
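
The snippet below demonstrates this pattern in isolation. `os.getenv` returns the variable as a string when it is set and the given default otherwise, and `float()` handles both cases. The wrapper function `base_learning_rate` is a hypothetical name used only for this demonstration.

```python
import os


def base_learning_rate(default=1e-5):
    """Read VAI_BASE_LR from the environment, falling back to the default."""
    return float(os.getenv('VAI_BASE_LR', default))


os.environ.pop('VAI_BASE_LR', None)   # make sure the variable is unset
print(base_learning_rate())           # 1e-05

os.environ['VAI_BASE_LR'] = '1e-3'    # what `VAI_BASE_LR=1e-3 ...` does in a shell
print(base_learning_rate())           # 0.001
```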

Your bash script may look like:
```bash
for LR in {1..10}    # 1e-1, 1e-2, ..., 1e-10
do
    VAI_BASE_LR=1e-$LR CUDA_VISIBLE_DEVICES=0 ~/PyVideoAI/tools/run_singlenode.sh train 1 \
    -D hmdb -M tsm_resnet50 -E crop224_8frame_largejit_plateau_5scrop_split1 \
    -S base_lr-1e-$LR
done
```

The `-S` argument creates a subfolder for each run, named `base_lr-1e-10` for example.
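
If you prefer to drive the sweep from Python, the same idea can be sketched as below. The code only builds the environment and command for each run and prints them; it does not start any training. The function `sweep_jobs` is a hypothetical helper, the variable name `VAI_BASE_LR` follows the `os.getenv` call in the exp_config, and the script path is an assumption that you should adapt to your installation.

```python
def sweep_jobs(exponents, script='tools/run_singlenode.sh'):
    """Build (env, command) pairs for a learning-rate sweep."""
    jobs = []
    for exponent in exponents:
        env = {'VAI_BASE_LR': f'1e-{exponent}', 'CUDA_VISIBLE_DEVICES': '0'}
        cmd = [script, 'train', '1',
               '-D', 'hmdb', '-M', 'tsm_resnet50',
               '-E', 'crop224_8frame_largejit_plateau_5scrop_split1',
               '-S', f'base_lr-1e-{exponent}']
        jobs.append((env, cmd))
    return jobs


# Learning rates 1e-1, 1e-2, ..., 1e-10, one subfolder per run.
for env, cmd in sweep_jobs(range(1, 11)):
    print(env['VAI_BASE_LR'], ' '.join(cmd))
```

To actually launch a run, each pair could be passed to `subprocess.run(cmd, env={**os.environ, **env}, check=True)`, which merges the overrides into the current environment.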
