Now the fun part begins. **Let's train the I3D network on the HMDB-51 dataset!**  

First, download the pre-trained weights listed in [MODEL_ZOO.md](https://github.com/kiyoon/PyVideoAI/blob/master/MODEL_ZOO.md).  
We'll use the I3D pretrained on the Kinetics-400 dataset, with 8-frame input.  

Note that the path to the pretrained weights is defined in `model_configs/i3d_resnet50.py` as below.  
```python
kinetics400_pretrained_path_8x8 = os.path.join(DATA_DIR, 'pretrained', 'kinetics400/i3d_resnet50/I3D_8x8_R50.pkl')
```
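To check that the weights will be found, you can build the same path yourself and test whether the file exists. This is a minimal sketch: `pretrained_path` is a hypothetical helper (not part of PyVideoAI), and the `DATA_DIR` value is a placeholder that you should replace with your own.

```python
import os

def pretrained_path(data_dir):
    # Mirrors the os.path.join() call in model_configs/i3d_resnet50.py
    return os.path.join(data_dir, 'pretrained', 'kinetics400/i3d_resnet50/I3D_8x8_R50.pkl')

# Placeholder DATA_DIR: use the value from pyvideoai.config instead.
path = pretrained_path('/path/to/PyVideoAI/data')
print(path, '(found)' if os.path.isfile(path) else '(not downloaded yet)')
```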

In [1]:
## IMPORTANT: You must change path values in `00-storage_location.py` before executing below.
# Environments for future use

from pyvideoai.config import PYVIDEOAI_DIR, DATA_DIR
%env PYVIDEOAI_DIR=$PYVIDEOAI_DIR
%env DATA_DIR=$DATA_DIR

import os
exec(open(os.path.join(PYVIDEOAI_DIR, 'examples', '00-storage_location.py')).read())
%env HDD_PATH=$HDD_PATH

env: PYVIDEOAI_DIR=/home/kiyoon/project/PyVideoAI
env: DATA_DIR=/home/kiyoon/project/PyVideoAI/data
env: HDD_PATH=/fast/kiyoon


In [3]:
# Link the pretrained weight directory to HDD
!mkdir -p "$HDD_PATH/pretrained/kinetics400/i3d_resnet50"
!ln -s "$HDD_PATH/pretrained" "$DATA_DIR/"

# Download
!wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl -P "$DATA_DIR/pretrained/kinetics400/i3d_resnet50"

--2021-06-14 03:17:54--  https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 104.22.74.142, 104.22.75.142, 172.67.9.4, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.74.142|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 224598662 (214M) [application/octet-stream]
Saving to: ‘/home/kiyoon/project/PyVideoAI/data/pretrained/I3D_8x8_R50.pkl’


2021-06-14 03:18:15 (10.6 MB/s) - ‘/home/kiyoon/project/PyVideoAI/data/pretrained/I3D_8x8_R50.pkl’ saved [224598662/224598662]



Now we have our datasets processed, and the pretrained weights ready.  
We want to use three config files.  
- `hmdb.py` in `dataset_configs`,  
- `i3d_resnet50.py` in `model_configs`,  
- `hmdb/i3d_resnet50-crop224_8x8_largejit_plateau_1scrop5tcrop_split1.py` in `exp_configs`.

NOTE: The only difference from the inference example is the `load_pretrained(model)` function definition in the exp_config file!
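The three config files correspond to the `-D`, `-M` and `-E` arguments of `run_train.py` used below. The sketch here only illustrates the naming convention described above; `config_paths` is a hypothetical helper, and the assumption that all three folders live under one root directory is mine.

```python
import os

def config_paths(dataset, model, experiment_name, root='.'):
    """Hypothetical helper: map the -D / -M / -E values to the three config files."""
    return {
        'dataset_config': os.path.join(root, 'dataset_configs', f'{dataset}.py'),
        'model_config': os.path.join(root, 'model_configs', f'{model}.py'),
        # The exp_config file name is "<model>-<experiment_name>.py" inside the dataset folder.
        'exp_config': os.path.join(root, 'exp_configs', dataset, f'{model}-{experiment_name}.py'),
    }

for name, path in config_paths(
        'hmdb', 'i3d_resnet50',
        'crop224_8x8_largejit_plateau_1scrop5tcrop_split1').items():
    print(f'{name}: {path}')
```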

In [3]:
# Optional: Setup Telegram bot to report the experiment stats
import os
if not os.path.isfile(f"{PYVIDEOAI_DIR}/tools/key.ini"):
    !cp "$PYVIDEOAI_DIR/tools/key.ini"{.template,}
# EDIT the `tools/key.ini` file on your own.

In [2]:
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -E crop224_8x8_largejit_plateau_1scrop5tcrop_split1

env: CUDA_VISIBLE_DEVICES=0


pyvideoai.tasks.task:   35 - INFO - cfg.last_activation not defined. Using softmax
experiment_utils.experiment_builder:  170 - INFO - Telegram bot initialised with keys in /home/kiyoon/project/PyVideoAI/tools/key.ini and using the bot number 0
pyvideoai.train_multiprocess:  136 - INFO - PyTorch==1.8.1
pyvideoai.train_multiprocess:  137 - INFO - PyVideoAI==v0.1+178.gddcd26d.dirty
pyvideoai.train_multiprocess:  138 - INFO - Experiment folder: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003 on host aislab-2
pyvideoai.train_multiprocess:  142 - INFO - args: {
    "local_world_size": 1,
    "shard_id": 0,
    "num_shards": 1,
    "init_method": "tcp://localhost:19999",
    "backend": "nccl",
    "num_epochs": 100,
    "experiment_root": "/fast/kiyoon/experiments",
    "dataset": "hmdb",
    "model": "i3d_resnet50",
    "experiment_name": "crop224_8x8_largejit_plateau_1scrop5tcrop_split1",
    "dataset_channel": null,
    "model_channel


 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 3.7140 - loss: 3.8231 - acc: 0.0957

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 83s - lr: 0.00064000 - loss: 3.8231 - acc: 0.0957                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.9180 - val_acc: 0.3190

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 26s - val_loss: 3.9180 - val_acc: 0.3190        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0000.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 1/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 3.2681 - loss: 3.3617 - acc: 0.3341

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 78s - lr: 0.00064000 - loss: 3.3617 - acc: 0.3341                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.8438 - val_acc: 0.4046

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.8438 - val_acc: 0.4046        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0001.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0000.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 2/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 2.5326 - loss: 2.8910 - acc: 0.4182

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 2.8910 - acc: 0.4182                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.7692 - val_acc: 0.4902

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.7692 - val_acc: 0.4902        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0002.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0001.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 3/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 2.3915 - loss: 2.5583 - acc: 0.4804

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 2.5583 - acc: 0.4804                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.7197 - val_acc: 0.5314

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.7197 - val_acc: 0.5314        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0003.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0002.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 4/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 2.2755 - loss: 2.2977 - acc: 0.5312

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 78s - lr: 0.00064000 - loss: 2.2977 - acc: 0.5312                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.6743 - val_acc: 0.5699

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.6743 - val_acc: 0.5699        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0004.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0003.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 5/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 2.0701 - loss: 2.0678 - acc: 0.5673

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 2.0678 - acc: 0.5673                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.6376 - val_acc: 0.6065

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.6376 - val_acc: 0.6065        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0005.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0004.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 6/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 1.7807 - loss: 1.8762 - acc: 0.5974

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 1.8762 - acc: 0.5974                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.6092 - val_acc: 0.6216

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.6092 - val_acc: 0.6216        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0006.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0005.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 7/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 1.7178 - loss: 1.7382 - acc: 0.6330

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 1.7382 - acc: 0.6330                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.5820 - val_acc: 0.6484

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.5820 - val_acc: 0.6484        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0007.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0006.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 8/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 1.5136 - loss: 1.6054 - acc: 0.6520

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 1.6054 - acc: 0.6520                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.5567 - val_acc: 0.6601

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.5567 - val_acc: 0.6601        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0008.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0007.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 9/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 1.3608 - loss: 1.5025 - acc: 0.6551

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 1.5025 - acc: 0.6551                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.5405 - val_acc: 0.6667

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.5405 - val_acc: 0.6667        
pyvideoai.train_multiprocess:  565 - INFO - Sending plots to Telegram.
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0009.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0008.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 10/99



 Train Iter:   55/  55 - Sample:   3520/  3570 - ETA:    0s - lr: 0.00064000 - batch_loss: 1.6279 - loss: 1.4124 - acc: 0.6832

pyvideoai.train_and_eval:  201 - INFO -  Train Iter:   55/  55 - Sample:   3520/  3570 - 77s - lr: 0.00064000 - loss: 1.4124 - acc: 0.6832                             


 One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.5235 - val_acc: 0.6758

pyvideoai.train_and_eval:  421 - INFO -  One-clip Eval Iter:   24/  24 - Sample:   1530/  1530 - 25s - val_loss: 3.5235 - val_acc: 0.6758        
pyvideoai.train_multiprocess:  600 - INFO - Saving model to /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0010.pth
pyvideoai.train_multiprocess:  620 - INFO - Removing previous model: /fast/kiyoon/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_003/weights/epoch_0009.pth
pyvideoai.train_multiprocess:  467 - INFO - Epoch 11/99
-----------------------------------------------------------
----------OUTPUT SIMPLIFIED FOR NOTEBOOK VIEWERS-----------
-----------------------------------------------------------


### Saving model weights
Only the checkpoints with peak validation accuracy are kept, plus the one from the last epoch. That's why the log shows the previous model being removed whenever it is not better than the current one.  
This behaviour comes from the command line argument `--save_mode last_and_peaks`, which is the default.  
Use `--save_mode all` to keep checkpoints from every epoch.  
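To make the rule concrete, here is a small sketch of which checkpoints would survive. It is not PyVideoAI's implementation: `checkpoints_to_keep` is a hypothetical function, and it assumes the rule is exactly "after each epoch, delete the previous checkpoint unless its validation accuracy is better than the current one".

```python
def checkpoints_to_keep(val_accs):
    """Return the epochs whose checkpoints survive under the assumed rule."""
    kept = []
    for epoch, acc in enumerate(val_accs):
        # kept[-1] is always the previous epoch, because the newest is always saved.
        if kept and val_accs[kept[-1]] <= acc:
            kept.pop()      # previous checkpoint is not better: remove it
        kept.append(epoch)  # the newest checkpoint is always saved
    return kept

# Accuracy keeps improving (as in the log above): only the last epoch remains.
print(checkpoints_to_keep([0.3190, 0.4046, 0.4902]))
# A dip after epoch 1 makes epoch 1 a peak, so it is kept alongside the last epoch.
print(checkpoints_to_keep([0.30, 0.40, 0.35, 0.50]))
```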

### Experiment output structure
The output directory will be organised as follows:  

```
${experiment_root}
└── ${dataset}
    └── ${model_name}
        └── ${experiment_name}
            └── version_000
                ├── configs
                │   └── args.json
                ├── logs
                │   └── summary.csv
                ├── plots
                │   ├── accuracy.pdf
                │   ├── loss.pdf
                │   ├── video_accuracy_top1.pdf
                │   └── video_accuracy_top5.pdf
                ├── (predictions)
                ├── tensorboard_runs
                └── weights
                    ├── epoch_0000.pth
                    └── epoch_0001.pth
```


### Visualisation
#### Telegram
If you've set up the Telegram bot, it will report the training stats (and even notify you of errors raised while running the script) like this:

<img src="https://user-images.githubusercontent.com/12980409/122335586-7cb10a80-cf76-11eb-950f-af08c20055d4.png" alt="Telegram bot stat report example" width="400">

Not satisfied with the looks of the plots and message? Don't worry, you can customise them easily.  
If the exp_config file doesn't define `telegram_reporter`, the default is as follows:
```python
from pyvideoai.visualisations.telegram_reporter import DefaultTelegramReporter
telegram_reporter = DefaultTelegramReporter()
```

Copy `pyvideoai/visualisations/telegram_reporter.py` and change it as you want. Then, use your custom Telegram reporter by adding a few lines to the exp_config, like below.  
```python
from my_telegram_reporter import MyTelegramReporter
telegram_reporter = MyTelegramReporter()
```

#### TensorBoard
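You can point TensorBoard at the `tensorboard_runs` directory inside the experiment output folder (see the structure above), e.g. `data/experiments/hmdb/i3d_resnet50/crop224_8x8_largejit_plateau_1scrop5tcrop_split1/version_000/tensorboard_runs`.  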
[Example tensorboard link](https://tensorboard.dev/experiment/mGSBcdZfQmWJNd658zHLbQ)

### Early Stopping
The training will stop when the validation loss and accuracy saturate for 20 epochs.  
See `early_stopping_condition()` in the exp_config.
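
As a rough illustration of such a condition, the sketch below stops once neither the best validation loss nor the best validation accuracy has occurred within the last 20 epochs. The function name, its arguments and the exact definition of "saturate" are assumptions; the real `early_stopping_condition()` in the exp_config may differ.

```python
def should_stop(val_losses, val_accs, patience=20):
    """Hypothetical stand-in for early_stopping_condition()."""
    if len(val_losses) <= patience:
        return False  # not enough history yet
    # Epoch index of the best (lowest) loss and the best (highest) accuracy so far.
    best_loss_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    best_acc_epoch = max(range(len(val_accs)), key=val_accs.__getitem__)
    last_epoch = len(val_losses) - 1
    # Stop only if BOTH metrics have failed to improve for `patience` epochs.
    return (last_epoch - best_loss_epoch >= patience
            and last_epoch - best_acc_epoch >= patience)

# Best values at epoch 0, followed by 20 epochs without improvement.
print(should_stop([1.0] + [2.0] * 20, [0.9] + [0.1] * 20))
```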

## Evaluate using the saved weight

In [None]:
# -p will save the predictions into a pickle file.
# -l -2 will pick the best model (highest validation accuracy).
# -l -1 will pick the last model
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_eval.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -E crop224_8x8_largejit_plateau_1scrop5tcrop_split1 -l -2 -p

## Resume training

### From the last checkpoint
Simply add `-l -1` to resume from the last checkpoint.  
Alternatively, `-l 50` resumes from the 50th epoch's checkpoint (and starts from the 51st epoch).  
In the cell below, `-e 200` additionally sets the total number of epochs to 200.

In [5]:
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -E crop224_8x8_largejit_plateau_1scrop5tcrop_split1 -e 200 -l -1

experiment_utils.experiment_builder:  103 - INFO - Telegram bot initialised with keys in /home/kiyoon/project/PyVideoAI/tools/key.ini and using the bot number 0
pyvideoai.train_multiprocess:  121 - INFO - git hash: 48a5d20dc2e17fcd7b4343cc562f043b9839d6e2
pyvideoai.train_multiprocess:  125 - INFO - args: {
    "local_world_size": 1,
    "shard_id": 0,
    "num_shards": 1,
    "init_method": "tcp://localhost:19999",
    "backend": "nccl",
    "num_epochs": 200,
    "experiment_root": "/home/kiyoon/project/PyVideoAI/data/experiments",
    "dataset": "hmdb",
    "model": "i3d_resnet50",
    "experiment_name": "crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1",
    "dataset_channel": null,
    "model_channel": null,
    "experiment_channel": null,
    "save_mode": "last_and_peaks",
    "load_epoch": -1,
    "seed": 12,
    "multi_crop_val_freq": 5,
    "telegram_post_freq": 5,
    "telegram_bot_idx": 0,
    "dataloader_num_workers": 4
}
pyvideoai.dataloaders.frames_densesampl

env: CUDA_VISIBLE_DEVICES=0


pyvideoai.utils.misc:   52 - INFO - Model:
ResNetModel(
  (s1): VideoModelStem(
    (pathway0_stem): ResNetBasicStem(
      (conv): Conv3d(3, 64, kernel_size=[5, 7, 7], stride=[1, 2, 2], padding=[2, 3, 3], bias=False)
      (bn): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (pool_layer): MaxPool3d(kernel_size=[1, 3, 3], stride=[1, 2, 2], padding=[0, 1, 1], dilation=1, ceil_mode=False)
    )
  )
-----------------------------------------------------------
----------OUTPUT SIMPLIFIED FOR NOTEBOOK VIEWERS-----------
-----------------------------------------------------------
  (head): ResNetBasicHead(
    (pathway0_avgpool): AvgPool3d(kernel_size=[4, 7, 7], stride=1, padding=0)
    (dropout): Dropout(p=0.5, inplace=False)
    (projection): Linear(in_features=2048, out_features=51, bias=True)
    (act): Softmax(dim=4)
  )
)
pyvideoai.utils.misc:   53 - INFO - Params: 27,328,371
pyvideoai.utils.misc:   54 - INFO - Mem:


 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00000001 - batch_loss: 0.9854 - loss: 0.5750 - batch_acc: 0.6250 - acc: 0.8487        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 95s - lr: 0.00000001 - loss: 0.5750 - acc: 0.8487                                                            


 One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.3217 - val_acc: 0.7085        

pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.3217 - val_acc: 0.7085
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0100.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0099.pth
pyvideoai.train_multiprocess:  333 - INFO - Epoch 101/199


-----------------------------------------------------------
----------OUTPUT SIMPLIFIED FOR NOTEBOOK VIEWERS-----------
-----------------------------------------------------------


pyvideoai.train_and_val:  324 - INFO -  Multi-clip Eval Iter:  957/ 957 - Sample:   7650/  7650 - 75s - val_loss: 3.3233 - val_acc: 0.7082 - val_vid_acc_top1: 0.7294 - val_vid_acc_top5: 0.9405
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0199.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0198.pth
pyvideoai.train_multiprocess:  455 - SUCCESS - Finished training


***
# Running TSN / TRN / TSM

Let's train a different model on the same dataset.  


### Dense sampling vs Sparse sampling
The difference between I3D and TRN is that the former samples video frames densely, whereas the latter samples them sparsely.  
Note the different dataloader in `i3d_resnet50-crop224_8x8_largejit_plateau_1scrop5tcrop_split1.py` and `trn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py` in `exp_configs/hmdb`.  
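
The two strategies can be sketched as follows. This is an illustration rather than the PyVideoAI dataloader code: the function names are hypothetical, and it assumes that "8x8" means 8 frames taken with a temporal stride of 8, and that sparse sampling picks one random frame from each of 8 equal segments (TSN-style).

```python
import random

def dense_indices(num_video_frames, num_frames=8, stride=8, rng=random):
    """Dense: frames `stride` apart, starting from a random position."""
    span = (num_frames - 1) * stride + 1
    start = rng.randint(0, max(num_video_frames - span, 0))
    # Clamp in case the video is shorter than the sampled span.
    return [min(start + i * stride, num_video_frames - 1) for i in range(num_frames)]

def sparse_indices(num_video_frames, num_frames=8, rng=random):
    """Sparse: split the video into equal segments and pick one random frame from each."""
    seg = num_video_frames / num_frames
    return [min(int(seg * i + rng.random() * seg), num_video_frames - 1)
            for i in range(num_frames)]

print('dense :', dense_indices(300))   # covers a short 57-frame window
print('sparse:', sparse_indices(300))  # spread over the whole video
```

Dense sampling sees a short clip at a fixed frame rate, while sparse sampling covers the whole video with the same number of frames.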


### LR policy
The new exp_config has an optimiser policy. Refer to `get_optim_policies(model)` in the config file.  
This lets you set a different learning rate for different layers in the model.
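
For intuition, per-layer learning rates in PyTorch are usually expressed as a list of parameter groups, each a dict with a `params` entry and its own `lr`. The sketch below builds such a list in plain Python. `split_param_groups`, the `head` keyword and the 10x multiplier are all hypothetical, and the format actually returned by `get_optim_policies(model)` may differ.

```python
def split_param_groups(named_params, base_lr, head_keyword='head', head_lr_mult=10.0):
    """Hypothetical policy: the classifier head trains with a larger learning rate."""
    backbone, head = [], []
    for name, param in named_params:
        (head if head_keyword in name else backbone).append(param)
    return [
        {'params': backbone, 'lr': base_lr},
        {'params': head, 'lr': base_lr * head_lr_mult},
    ]

# Strings stand in for real parameter tensors here. With a real model you could pass
# model.named_parameters() and hand the result to e.g. torch.optim.SGD(groups, lr=base_lr).
groups = split_param_groups(
    [('s1.pathway0_stem.conv.weight', 'w_stem'), ('head.projection.weight', 'w_head')],
    base_lr=1e-4)
for group in groups:
    print(group)
```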

In [None]:
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M trn_resnet50 -E crop224_8frame_largejit_plateau_5scrop_split1 -e 100

## Changing dataset or model
To change the dataset, simply copy the exp_config into the folder of the new dataset.  
This changes the dataset_config to `dataset_configs/something_v1.py`, but keeps the other settings unchanged.



In [3]:
# You can replace `cp` with `ln -s` if you want to link the two files.
!cp "$PYVIDEOAI_DIR/exp_configs/hmdb/trn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py" "$PYVIDEOAI_DIR/exp_configs/something_v1/trn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py"

To run TSM, you can simply copy the exp_config under a new file name.  
This changes the model_config to `model_configs/tsm_resnet50.py`, but keeps other settings such as the dataloader and optimiser unchanged.

In [4]:
# You can replace `cp` with `ln -s` if you want to link the two files.
!cp "$PYVIDEOAI_DIR/exp_configs/hmdb/trn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py" "$PYVIDEOAI_DIR/exp_configs/hmdb/tsm_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py"

### Importing from another config file

Okay, cool. But is copying the only way to change experiment settings? What if you have 1000 experiments and can no longer tell which settings changed between them?

Indeed. Copying too much code is the easiest way to make mistakes.  
The other way is to "import" the other config file and change only part of the configuration.  
However, a normal Python import wouldn't work: the file name contains hyphens, so the statement below is a syntax error.

```python
# Example exp_config that loads config from another one.
# Good idea, but wouldn't work.
from trn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1 import *
input_frame_length = 32
```

The workaround is to use `exec()`.  
Open a new config file like `exp_configs/hmdb/trn_resnet50-32frame.py` in an editor and try the following.  
```python
# Example exp_config that loads config from another one. Keep all settings except the number of frames.
# Right way
import os
_SCRIPT_DIR = os.path.dirname(os.path.abspath( __file__ ))
exec(open(f'{_SCRIPT_DIR}/trn_resnet50-crop224_8frame_largejit_plateau_5scrop_split1.py').read())

input_frame_length = 32
```
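
To see why this works, here is a self-contained toy version that you can run anywhere. The file name and the settings (`base-config.py`, `batch_size`) are made up for the demonstration. The key point is that `exec()` without extra arguments runs the base file in the current namespace, so every variable it defines becomes available and can then be overridden.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as cfg_dir:
    # A hyphenated file name, like the real exp_configs (cannot be imported normally).
    base = os.path.join(cfg_dir, 'base-config.py')
    with open(base, 'w') as f:
        f.write("input_frame_length = 8\nbatch_size = 16\n")

    # The "child" config: load everything from the base, then override one setting.
    child_source = (
        f"exec(open({base!r}).read())\n"
        "input_frame_length = 32\n"
    )
    settings = {}
    exec(child_source, settings)  # the variables end up in the `settings` dict

print(settings['input_frame_length'], settings['batch_size'])  # prints: 32 16
```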

### Change config without copying. Use one config for all!

What if your task is to find the best hyperparameter and you want to try 1000 different learning rates?  
Do you create 1000 config files and write the learning rate in each one manually?  
Here's a better way to do so: use ENVIRONMENT VARIABLES!

Notice that the learning rate in the exp_config is defined as
```python
base_learning_rate = float(os.getenv('BASE_LR', 1e-5))
```
in `optimiser()`. Thus, you can override the learning rate by setting the `BASE_LR` environment variable; otherwise it defaults to 1e-5.
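
The pattern can be tried in isolation. In this small sketch, `read_base_lr` is a hypothetical wrapper around the same `os.getenv()` call. Note that environment variables are always strings, which is why the `float()` conversion is needed.

```python
import os

def read_base_lr(default=1e-5):
    # Same pattern as in the exp_config: environment variable first, default otherwise.
    return float(os.getenv('BASE_LR', default))

os.environ.pop('BASE_LR', None)   # make sure the variable is unset
print(read_base_lr())             # prints: 1e-05

os.environ['BASE_LR'] = '1e-3'    # what `BASE_LR=1e-3 python ...` would do in a shell
print(read_base_lr())             # prints: 0.001
```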

Your bash script may look like:
```bash
for LR in {1..10}    # 1e-1, 1e-2, ..., 1e-10
do
    BASE_LR=1e-$LR CUDA_VISIBLE_DEVICES=0 ~/PyVideoAI/tools/run_train.py --local_world_size 1 \
    -D hmdb -M trn_resnet50 -E crop224_8frame_largejit_plateau_5scrop_split1 \
    -R ~/experiments/base_lr-1e-$LR
done
```

The `-R` argument changes the experiment output folder location.
