Now the fun part begins. **Let's train the I3D network on the HMDB-51 dataset!**  

First, download the pre-trained weights listed in [MODEL_ZOO.md](https://github.com/kiyoon/PyVideoAI/blob/master/MODEL_ZOO.md).  
We'll use the I3D pretrained on the Kinetics-400 dataset, with 8-frame input.  

Note that the path to the pretrained weights is defined in `model_configs/i3d_resnet50.py` as below.  
```python
kinetics400_pretrained_path_8x8 = os.path.join(DATA_DIR, 'pretrained', 'kinetics400/i3d_resnet50/I3D_8x8_R50.pkl')
```
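
After the download step that follows, it can be worth confirming that the weights file sits where the model config expects it. The sketch below rebuilds the same path and reports whether the file exists. `expected_weights_path` is a hypothetical helper, not part of PyVideoAI, and `data_dir` is a stand-in for the `DATA_DIR` value from `pyvideoai.config`.

```python
import os


def expected_weights_path(data_dir):
    """Rebuild the path stored in kinetics400_pretrained_path_8x8 (hypothetical helper)."""
    return os.path.join(data_dir, 'pretrained', 'kinetics400/i3d_resnet50/I3D_8x8_R50.pkl')


# Assumption: replace this with the DATA_DIR imported from pyvideoai.config.
data_dir = '/home/kiyoon/project/PyVideoAI/data'
weights_path = expected_weights_path(data_dir)

# Report whether the file is present without raising an error.
status = 'found' if os.path.isfile(weights_path) else 'missing'
print(f'{weights_path}: {status}')
```

If the status is `missing`, check that the symbolic link and the `wget -P` target directory point to the same place.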

In [1]:
# Environments for future use

from pyvideoai.config import PYVIDEOAI_DIR, DATA_DIR
%env PYVIDEOAI_DIR=$PYVIDEOAI_DIR
%env DATA_DIR=$DATA_DIR

# !! CHANGE BELOW
%env HDD_PATH=/storage/kiyoon

env: PYVIDEOAI_DIR=/home/kiyoon/project/PyVideoAI
env: DATA_DIR=/home/kiyoon/project/PyVideoAI/data
env: HDD_PATH=/storage/kiyoon


In [3]:
# Link the pretrained weight directory to HDD
!mkdir -p "$HDD_PATH/pretrained/kinetics400/i3d_resnet50"
!ln -s "$HDD_PATH/pretrained" "$DATA_DIR/"

# Download
!wget https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl -P "$DATA_DIR/pretrained/kinetics400/i3d_resnet50"

--2021-06-14 03:17:54--  https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 104.22.74.142, 104.22.75.142, 172.67.9.4, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.74.142|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 224598662 (214M) [application/octet-stream]
Saving to: ‘/home/kiyoon/project/PyVideoAI/data/pretrained/I3D_8x8_R50.pkl’


2021-06-14 03:18:15 (10.6 MB/s) - ‘/home/kiyoon/project/PyVideoAI/data/pretrained/I3D_8x8_R50.pkl’ saved [224598662/224598662]



Now we have our datasets processed, and the pretrained weights ready.  
We want to use three config files.  
- `hmdb.py` in `dataset_configs`,  
- `i3d_resnet50.py` in `model_configs`,  
- `hmdb/i3d_resnet50-crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1.py` in `exp_configs`.

NOTE: The only difference from the inference example is the `load_pretrained(model)` function definition in the exp_config file!

In [3]:
# Optional: Setup Telegram bot to report the experiment stats
import os
if not os.path.isfile(f"{PYVIDEOAI_DIR}/tools/key.ini"):
    !cp "$PYVIDEOAI_DIR/tools/key.ini"{.template,}
# EDIT the `tools/key.ini` file on your own.

In [2]:
# If you have more than one GPU, set CUDA_VISIBLE_DEVICES to the comma-separated GPU indices and --local_world_size to the number of GPUs.
# This increases the total batch size (1 GPU = 8, 2 GPUs = 16, ...) and speeds up training.
# However, Jupyter Notebook doesn't seem to print stdout from multiple processes. Use a regular shell in that case.

# -e sets the number of epochs.
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -E crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1 -e 100

env: CUDA_VISIBLE_DEVICES=0


experiment_utils.experiment_builder:  103 - INFO - Telegram bot initialised with keys in /home/kiyoon/project/PyVideoAI/tools/key.ini and using the bot number 0
pyvideoai.train_multiprocess:  121 - INFO - git hash: 48a5d20dc2e17fcd7b4343cc562f043b9839d6e2
pyvideoai.train_multiprocess:  125 - INFO - args: {
    "local_world_size": 1,
    "shard_id": 0,
    "num_shards": 1,
    "init_method": "tcp://localhost:19999",
    "backend": "nccl",
    "num_epochs": 100,
    "experiment_root": "/home/kiyoon/project/PyVideoAI/data/experiments",
    "dataset": "hmdb",
    "model": "i3d_resnet50",
    "experiment_name": "crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1",
    "dataset_channel": null,
    "model_channel": null,
    "experiment_channel": null,
    "save_mode": "last_and_peaks",
    "load_epoch": null,
    "seed": 12,
    "multi_crop_val_freq": 5,
    "telegram_post_freq": 5,
    "telegram_bot_idx": 0,
    "dataloader_num_workers": 4
}
pyvideoai.dataloaders.frames_densesam


 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00010000 - batch_loss: 3.8874 - loss: 3.8073 - batch_acc: 0.1250 - acc: 0.1085        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 96s - lr: 0.00010000 - loss: 3.8073 - acc: 0.1085                                                            


 One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.9147 - val_acc: 0.3732        

pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.9147 - val_acc: 0.3732
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0000.pth
pyvideoai.train_multiprocess:  333 - INFO - Epoch 1/99



 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00010000 - batch_loss: 3.5042 - loss: 3.4460 - batch_acc: 0.2500 - acc: 0.3310        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 96s - lr: 0.00010000 - loss: 3.4460 - acc: 0.3310                                                            


 One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.8353 - val_acc: 0.4196        

pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.8353 - val_acc: 0.4196
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0001.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0000.pth
pyvideoai.train_multiprocess:  333 - INFO - Epoch 2/99



 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00010000 - batch_loss: 2.3087 - loss: 3.0303 - batch_acc: 0.8750 - acc: 0.4268        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 96s - lr: 0.00010000 - loss: 3.0303 - acc: 0.4268                                                            
-----------------------------------------------------------
----------OUTPUT SIMPLIFIED FOR NOTEBOOK VIEWERS-----------
-----------------------------------------------------------
pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.3184 - val_acc: 0.7033
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0097.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0096.pth
pyvideoai.train_multiprocess:  333 - INFO - Epoch 98


 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00000001 - batch_loss: 1.1730 - loss: 0.5969 - batch_acc: 0.6250 - acc: 0.8461        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 96s - lr: 0.00000001 - loss: 0.5969 - acc: 0.8461                                                            


 One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.3158 - val_acc: 0.7046        

pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.3158 - val_acc: 0.7046
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0098.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0097.pth
pyvideoai.train_multiprocess:  333 - INFO - Epoch 99/99



 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00000001 - batch_loss: 0.8652 - loss: 0.5864 - batch_acc: 0.7500 - acc: 0.8422        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 96s - lr: 0.00000001 - loss: 0.5864 - acc: 0.8422                                                            


 One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.3208 - val_acc: 0.7033        

pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.3208 - val_acc: 0.7033


 Multi-clip Eval Iter:  957/ 957 - Sample:   7650/  7650 - ETA:    0s - val_loss: 3.3233 - val_acc: 0.7082        

pyvideoai.train_and_val:  324 - INFO -  Multi-clip Eval Iter:  957/ 957 - Sample:   7650/  7650 - 75s - val_loss: 3.3233 - val_acc: 0.7082 - val_vid_acc_top1: 0.7301 - val_vid_acc_top5: 0.9405
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0099.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0098.pth
pyvideoai.train_multiprocess:  455 - SUCCESS - Finished training


### Multicrop evaluation
A single clip is too short to cover an entire video, so training also runs a multicrop evaluation (1 centre crop, 5 temporal crops) every 5 epochs by default (`-t 5`).  
FYI, most Facebook AI Research papers (e.g. SlowFast) use 3 spatial crops and 10 temporal crops (totalling 30 clips per video).  
To change the frequency, add `-t EPOCHS`.  
To skip the multicrop evaluation, add `-t -1`.  
To change the number of spatial and temporal crops, edit `test_num_spatial_crops` and `test_num_ensemble_views` in `exp_configs/hmdb/i3d_resnet50-crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1.py`.  
Alternatively, change the dataloader by editing the `get_torch_dataset(split)` function in the config file.
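
As a quick sanity check on the crop settings, the number of clips evaluated per video is the product of the spatial and temporal crop counts. The sketch below works this out for the tutorial's setting and for the 3 x 10 setting mentioned above. `clips_per_video` is a hypothetical helper used only for illustration.

```python
def clips_per_video(num_spatial_crops, num_temporal_crops):
    """Each video is evaluated once per (spatial crop, temporal crop) pair."""
    return num_spatial_crops * num_temporal_crops


# This tutorial's setting: 1 centre crop x 5 temporal crops.
tutorial_clips = clips_per_video(1, 5)
# The 3 spatial x 10 temporal setting mentioned above.
slowfast_clips = clips_per_video(3, 10)

# The one-clip evaluation log reports 1530 validation samples, one per video.
num_val_videos = 1530
print(tutorial_clips, slowfast_clips, num_val_videos * tutorial_clips)
# prints: 5 30 7650
```

The last number matches the `Sample: 7650/ 7650` count in the multi-clip evaluation log.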

### Saving model weights
Only the checkpoints with peak validation accuracy are kept, plus the one from the last epoch. That is why the log shows the previous checkpoint being removed when it is no better than the current one.  
This behaviour comes from the `--save_mode last_and_peaks` command-line argument, which is the default.  
Use `--save_mode all` to keep the checkpoints from every epoch.  

### Experiment output structure
The output directory will be organised as follows:  

```
${experiment_root}
└── ${dataset}
    └── ${model_name}
        └── ${experiment_name}
            ├── configs
            │   └── args.json
            ├── logs
            │   ├── summary.csv
            │   └── train.log
            ├── plots
            │   ├── accuracy.pdf
            │   ├── loss.pdf
            │   ├── video_accuracy_top1.pdf
            │   └── video_accuracy_top5.pdf
            ├── (predictions)
            ├── tensorboard_runs
            └── weights
                ├── epoch_0000.pth
                └── epoch_0001.pth
```
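
Given this layout, the paths to individual outputs can be assembled from the same four components. The following is a minimal sketch, assuming the relative `data/experiments` root used in this tutorial. `experiment_dir` is a hypothetical helper, and the experiment name should be replaced with the one your own run logs.

```python
from pathlib import Path


def experiment_dir(experiment_root, dataset, model_name, experiment_name):
    """Join the four components in the order shown in the tree above."""
    return Path(experiment_root) / dataset / model_name / experiment_name


# Assumption: substitute the experiment name printed in your own training log.
exp_dir = experiment_dir(
    'data/experiments',
    'hmdb',
    'i3d_resnet50',
    'crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1',
)
summary_csv = exp_dir / 'logs' / 'summary.csv'
weights_dir = exp_dir / 'weights'

# List the saved checkpoints in epoch order (empty if nothing has been saved yet).
checkpoints = sorted(weights_dir.glob('epoch_*.pth')) if weights_dir.is_dir() else []
print(summary_csv)
print([p.name for p in checkpoints])
```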


### Visualisation
#### Telegram
If you've set up the Telegram bot, it will report the training stats (and even notify you of errors while running the script) like this:

<img src="https://user-images.githubusercontent.com/12980409/122335586-7cb10a80-cf76-11eb-950f-af08c20055d4.png" alt="Telegram bot stat report example" width="400">

#### TensorBoard
You can point TensorBoard at `data/experiments/hmdb/i3d_resnet50/crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1/tensorboard_runs`.  
[Example tensorboard link](https://tensorboard.dev/experiment/mGSBcdZfQmWJNd658zHLbQ)

## Evaluate using the saved weight

In [None]:
# -p will save the predictions into a pickle file.
# -l -2 will pick the best model (highest validation accuracy).
# -l -1 will pick the last model
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_val.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -N crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1 -l -2 -p

## Resume training

### From the last checkpoint
Adding `-l -1` resumes training from the last checkpoint.

In [5]:
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -N crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1 -e 200 -l -1

experiment_utils.experiment_builder:  103 - INFO - Telegram bot initialised with keys in /home/kiyoon/project/PyVideoAI/tools/key.ini and using the bot number 0
pyvideoai.train_multiprocess:  121 - INFO - git hash: 48a5d20dc2e17fcd7b4343cc562f043b9839d6e2
pyvideoai.train_multiprocess:  125 - INFO - args: {
    "local_world_size": 1,
    "shard_id": 0,
    "num_shards": 1,
    "init_method": "tcp://localhost:19999",
    "backend": "nccl",
    "num_epochs": 200,
    "experiment_root": "/home/kiyoon/project/PyVideoAI/data/experiments",
    "dataset": "hmdb",
    "model": "i3d_resnet50",
    "experiment_name": "crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1",
    "dataset_channel": null,
    "model_channel": null,
    "experiment_channel": null,
    "save_mode": "last_and_peaks",
    "load_epoch": -1,
    "seed": 12,
    "multi_crop_val_freq": 5,
    "telegram_post_freq": 5,
    "telegram_bot_idx": 0,
    "dataloader_num_workers": 4
}
pyvideoai.dataloaders.frames_densesampl

env: CUDA_VISIBLE_DEVICES=0


pyvideoai.utils.misc:   52 - INFO - Model:
ResNetModel(
  (s1): VideoModelStem(
    (pathway0_stem): ResNetBasicStem(
      (conv): Conv3d(3, 64, kernel_size=[5, 7, 7], stride=[1, 2, 2], padding=[2, 3, 3], bias=False)
      (bn): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (pool_layer): MaxPool3d(kernel_size=[1, 3, 3], stride=[1, 2, 2], padding=[0, 1, 1], dilation=1, ceil_mode=False)
    )
  )
-----------------------------------------------------------
----------OUTPUT SIMPLIFIED FOR NOTEBOOK VIEWERS-----------
-----------------------------------------------------------
  (head): ResNetBasicHead(
    (pathway0_avgpool): AvgPool3d(kernel_size=[4, 7, 7], stride=1, padding=0)
    (dropout): Dropout(p=0.5, inplace=False)
    (projection): Linear(in_features=2048, out_features=51, bias=True)
    (act): Softmax(dim=4)
  )
)
pyvideoai.utils.misc:   53 - INFO - Params: 27,328,371
pyvideoai.utils.misc:   54 - INFO - Mem:


 Train Iter:  446/ 446 - Sample:   3568/  3570 - ETA:    0s - lr: 0.00000001 - batch_loss: 0.9854 - loss: 0.5750 - batch_acc: 0.6250 - acc: 0.8487        

pyvideoai.train_and_val:  146 - INFO -  Train Iter:  446/ 446 - Sample:   3568/  3570 - 95s - lr: 0.00000001 - loss: 0.5750 - acc: 0.8487                                                            


 One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - ETA:    0s - val_loss: 3.3217 - val_acc: 0.7085        

pyvideoai.train_and_val:  324 - INFO -  One-clip Eval Iter:  192/ 192 - Sample:   1530/  1530 - 15s - val_loss: 3.3217 - val_acc: 0.7085
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0100.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0099.pth
pyvideoai.train_multiprocess:  333 - INFO - Epoch 101/199


-----------------------------------------------------------
----------OUTPUT SIMPLIFIED FOR NOTEBOOK VIEWERS-----------
-----------------------------------------------------------


pyvideoai.train_and_val:  324 - INFO -  Multi-clip Eval Iter:  957/ 957 - Sample:   7650/  7650 - 75s - val_loss: 3.3233 - val_acc: 0.7082 - val_vid_acc_top1: 0.7294 - val_vid_acc_top5: 0.9405
pyvideoai.train_multiprocess:  424 - INFO - Saving model to /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0199.pth
pyvideoai.train_multiprocess:  445 - INFO - Removing previous model: /home/kiyoon/project/PyVideoAI/data/experiments/hmdb/i3d_resnet50/crop224_lr0001_batch8_8x8_largejit_plateau_1scrop5tcrop_split1/weights/epoch_0198.pth
pyvideoai.train_multiprocess:  455 - SUCCESS - Finished training


### From the intermediate checkpoint
Go to the experiment folder and open the summary file at `data/experiments/hmdb/i3d_resnet50/crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1/logs/summary.csv`.  
It contains the training stats. Remove the lines after your resuming point.  
For example, to resume from the 50th epoch and continue with the 51st, remove every line from epoch 51 to the end.  
Then add `-l 50` to load the 50th epoch's checkpoint and resume from the 51st epoch.
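
The trimming step can also be scripted. The sketch below is one possible approach, not a PyVideoAI tool: it assumes that `summary.csv` has a header row with a numeric column named `epoch` (the actual column layout is not shown here, so check your file first), and it writes a `.bak` copy before modifying anything. `trim_summary` is a hypothetical helper.

```python
import csv
import shutil


def trim_summary(csv_path, last_epoch_to_keep, epoch_column='epoch'):
    """Drop rows whose epoch is greater than last_epoch_to_keep.

    Assumption: the CSV has a header row and a numeric column named `epoch`.
    A backup copy (<csv_path>.bak) is written before the file is changed.
    Returns the number of rows kept.
    """
    shutil.copy2(csv_path, str(csv_path) + '.bak')

    # Read everything first so the file can be safely rewritten afterwards.
    with open(csv_path, newline='') as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = [row for row in reader
                if int(float(row[epoch_column])) <= last_epoch_to_keep]

    with open(csv_path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For the example above, the call would be `trim_summary('<experiment folder>/logs/summary.csv', 50)`, which keeps every row up to and including epoch 50.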

In [None]:
# IMPORTANT: Edit the summary.csv file as described above before running this cell!
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M i3d_resnet50 -N crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1 -e 200 -l 50

# Running TSN / TRN

Let's train a different model on the same dataset.  


### Dense sampling vs Sparse sampling
The difference between I3D and TRN is that the former samples video frames densely, while the latter samples them sparsely.  
Note the different dataloaders in `i3d_resnet50-crop224_lr0001_8x8_largejit_plateau_1scrop5tcrop_split1.py` and `trn_resnet50-crop224_lr0001_8frame_largejit_plateau_5scrop_split1.py` in `exp_configs/hmdb`.  
With sparse sampling, multicrop operates only spatially, and in my experience it didn't make much difference compared to temporal multicrop. Thus, we disable it.  
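
To make the distinction concrete, here is a minimal sketch of how frame indices might be chosen under each scheme. It is an illustration under stated assumptions, not the PyVideoAI dataloader code: `8x8` is read as 8 frames with a stride of 8 (the SlowFast naming convention), and the sparse variant follows the TSN-style idea of splitting the video into equal segments and taking one frame per segment (here the centre frame). Both function names are hypothetical.

```python
def dense_indices(start, num_frames=8, stride=8):
    """Dense sampling: consecutive frames at a fixed stride from one short clip."""
    return [start + i * stride for i in range(num_frames)]


def sparse_indices(video_length, num_segments=8):
    """Sparse sampling: one frame (the centre) from each equal-length segment."""
    segment_length = video_length / num_segments
    return [int(segment_length * (i + 0.5)) for i in range(num_segments)]


# A dense clip starting at frame 0 covers only the first 57 frames of the video.
print(dense_indices(0))
# prints: [0, 8, 16, 24, 32, 40, 48, 56]

# Sparse sampling spreads the same 8 frames over a whole 300-frame video.
print(sparse_indices(300))
# prints: [18, 56, 93, 131, 168, 206, 243, 281]
```

Because the sparse frames already span the entire video, there is no separate temporal position to vary at test time, which is consistent with multicrop operating only spatially in that setting.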


### Early stopping & LR policy
The new exp_config defines early stopping and an LR policy. Refer to the config file for details.

In [None]:
# -t -1 disables multicrop evaluation
%env CUDA_VISIBLE_DEVICES=0
%run "$PYVIDEOAI_DIR/tools/run_train.py" --local_world_size 1 -D hmdb -M trn_resnet50 -N crop224_lr0001_8frame_largejit_plateau_5scrop_split1 -e 100 -t -1