# MMAction2 Tutorial

Welcome to MMAction2! This is the official Colab tutorial for MMAction2. In this tutorial, you will learn how to:
- Perform inference with an MMAction2 recognizer.
- Train a new recognizer with a new dataset.

Let's start!

## Install MMAction2

In [13]:
# Load the Drive helper and mount
from google.colab import drive

# This will prompt for authorization.
drive.mount('/content/drive')

# Change the working directory to your Google Drive.
%cd /content/drive/My Drive

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/My Drive


In [None]:
# Execute this only once so that the repository is cloned into your "Workspace" folder.
! git clone https://github.com/zacktohsh/Workspace_Action

# Change the working directory to the cloned repository.
%cd /content/drive/My Drive/Workspace_Action

#https://drive.google.com/file/d/1R3Be-PIy3ZeXNpKsGaQa7gR82wOLhHlZ/view?usp=sharing
!gdown --id 1R3Be-PIy3ZeXNpKsGaQa7gR82wOLhHlZ

!unzip -a Workspace_Action.zip
!rm Workspace_Action.zip
# Verify correct path and content downloaded
! pwd
! ls -l

Mounted at /content/drive
/content/drive/My Drive
Cloning into 'Workspace_Action'...
remote: Enumerating objects: 32, done.
remote: Counting objects: 100% (32/32), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 32 (delta 11), reused 17 (delta 2), pack-reused 0
Unpacking objects: 100% (32/32), done.
/content/drive/My Drive/Workspace_Action
Downloading...
From: https://drive.google.com/uc?id=1R3Be-PIy3ZeXNpKsGaQa7gR82wOLhHlZ
To: /content/drive/My Drive/Workspace_Action/Workspace_Action.zip
886MB [00:10, 83.4MB/s]
Archive:  Workspace_Action.zip
   creating: mmaction2/
   creating: mmaction2/.git/
   creating: mmaction2/.github/
  inflating: mmaction2/.github/CODE_OF_CONDUCT.md  [binary]
  inflating: mmaction2/.github/CONTRIBUTING.md  [binary]
   creating: mmaction2/.github/ISSUE_TEMPLATE/
  inflating: mmaction2/.github/ISSUE_TEMPLATE/config.yml  [binary]
  inflating: mmaction2/.github/ISSUE_TEMPLATE/error-report.md  [binary]
  inflating: mmaction2/.github

In [2]:
# Check nvcc version
!nvcc -V
# Check GCC version
!gcc --version

# Check python version
!python --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Python 3.6.9


In [17]:
#Installation of required libraries and files.
!pip install torch-1.5.1+cu101-cp36-cp36m-linux_x86_64.whl
!pip install torchvision-0.6.1+cu101-cp36-cp36m-linux_x86_64.whl
!pip install mmcv_full-latest+torch1.5.0+cu101-cp36-cp36m-manylinux1_x86_64.whl

%cd mmaction2
!pip install -e .

# Install some optional requirements
!pip install -r requirements/optional.txt

Processing ./torch-1.5.1+cu101-cp36-cp36m-linux_x86_64.whl
ERROR: torchvision 0.8.1+cu101 has requirement torch==1.7.0, but you'll have torch 1.5.1+cu101 which is incompatible.
Installing collected packages: torch
  Found existing installation: torch 1.7.0+cu101
    Uninstalling torch-1.7.0+cu101:
      Successfully uninstalled torch-1.7.0+cu101
Successfully installed torch-1.5.1+cu101
Processing ./torchvision-0.6.1+cu101-cp36-cp36m-linux_x86_64.whl
Installing collected packages: torchvision
  Found existing installation: torchvision 0.8.1+cu101
    Uninstalling torchvision-0.8.1+cu101:
      Successfully uninstalled torchvision-0.8.1+cu101
Successfully installed torchvision-0.6.1+cu101
Processing ./mmcv_full-latest+torch1.5.0+cu101-cp36-cp36m-manylinux1_x86_64.whl
Collecting addict
  Downloading https://files.pythonhosted.org/packages/6a/00/b08f23b7d7e1e14ce01419a467b583edbb93c6cdb8654e54a9cc579cd61f/addict-2.4.0-py3-none-any.whl
Collecting yapf
  Downloading https://fi

In [4]:
# Check Pytorch installation
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())

# Check MMAction2 installation
import mmaction
print(mmaction.__version__)

# Check MMCV installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())

1.5.1+cu101 True
0.9.0
10.1
GCC 7.3


## Perform inference with an MMAction2 recognizer
MMAction2 provides high-level APIs for inference and training.

In [5]:
#!mkdir checkpoints
#!wget -c https://download.openmmlab.com/mmaction/recognition/tsn/tsn_r50_1x1x3_100e_kinetics400_rgb/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth \
#      -O checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth

In [6]:
from mmaction.apis import inference_recognizer, init_recognizer

# Choose the config file used to initialize the recognizer
config = 'configs/recognition/tsn/tsn_r50_video_inference_1x1x3_100e_kinetics400_rgb.py'
# Set the checkpoint file to load
checkpoint = 'checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth'
# Initialize the recognizer
model = init_recognizer(config, checkpoint, device='cuda:0')

In [7]:
# Use the recognizer to do inference
video = 'demo/demo.mp4'
label = 'demo/label_map.txt'
results = inference_recognizer(model, video, label)

In [8]:
# Let's show the results
for result in results:
    print(f'{result[0]}: ', result[1])

arm wrestling:  1.0
rock scissors paper:  6.4344654e-09
shaking hands:  2.7599913e-09
clapping:  1.3454664e-09
massaging feet:  5.555122e-10
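
The five lines printed above are (label, score) pairs ordered from the highest score to the lowest. As a rough illustration of that ranking step, and not MMAction2's actual implementation, the sketch below pairs a list of scores with label names and keeps the `k` best. The helper name `top_k_labels` is hypothetical, and the scores are copied from the output above in shuffled order.

```python
def top_k_labels(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    if len(scores) != len(labels):
        raise ValueError('scores and labels must have the same length')
    # Sort the pairs by score, highest first, then keep the first k.
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Scores taken from the output above, deliberately out of order.
labels = ['clapping', 'arm wrestling', 'massaging feet',
          'shaking hands', 'rock scissors paper']
scores = [1.3454664e-09, 1.0, 5.555122e-10, 2.7599913e-09, 6.4344654e-09]

top3 = top_k_labels(scores, labels, k=3)
for name, score in top3:
    print(f'{name}: {score}')
```

Asking for more entries than there are classes simply returns every class, which is worth keeping in mind when a model has very few classes.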


## Train a recognizer on customized dataset

To train a new recognizer, there are usually three things to do:
1. Support a new dataset
2. Modify the config
3. Train a new recognizer

### Support a new dataset

In this tutorial, we give an example of converting the data into the format of an existing dataset. Other methods and more advanced usage can be found in the [doc](/docs/tutorials/new_dataset.md).

First, let's download a tiny dataset obtained from [Kinetics-400](https://deepmind.com/research/open-source/open-source-datasets/kinetics/). We select 30 videos with their labels as the training set and 10 videos with their labels as the test set.

In [9]:
# download, decompress the data
#!rm kinetics400_tiny.zip*
#!rm -rf kinetics400_tiny
#!wget https://download.openmmlab.com/mmaction/kinetics400_tiny.zip
#!unzip kinetics400_tiny.zip > /dev/null

In [9]:
# Check the directory structure of the tiny data

# Install tree first
!apt-get -q install tree
!tree kinetics400_tiny

Reading package lists...
Building dependency tree...
Reading state information...
tree is already the newest version (1.7.0-5).
0 upgraded, 0 newly installed, 0 to remove and 14 not upgraded.
kinetics400_tiny
├── kinetics_tiny_train_video.txt
├── kinetics_tiny_val_video.txt
├── train
│   ├── 27_CSXByd3s.mp4
│   ├── 34XczvTaRiI.mp4
│   ├── A-wiliK50Zw.mp4
│   ├── D32_1gwq35E.mp4
│   ├── D92m0HsHjcQ.mp4
│   ├── DbX8mPslRXg.mp4
│   ├── FMlSTTpN3VY.mp4
│   ├── h10B9SVE-nk.mp4
│   ├── h2YqqUhnR34.mp4
│   ├── iRuyZSKhHRg.mp4
│   ├── IyfILH9lBRo.mp4
│   ├── kFC3KY2bOP8.mp4
│   ├── LvcFDgCAXQs.mp4
│   ├── O46YA8tI530.mp4
│   ├── oMrZaozOvdQ.mp4
│   ├── oXy-e_P_cAI.mp4
│   ├── P5M-hAts7MQ.mp4
│   ├── phDqGd0NKoo.mp4
│   ├── PnOe3GZRVX8.mp4
│   ├── R8HXQkdgKWA.mp4
│   ├── RqnKtCEoEcA.mp4
│   ├── soEcZZsBmDs.mp4
│   ├── TkkZPZHbAKA.mp4
│   ├── T_TMNGzVrDk.mp4
│   ├── WaS0qwP46Us.mp4
│   ├── Wh_YPQdH1Zg.mp4
│   ├── WWP5HZJsg-o.mp4
│   ├── xGY2dP0YUjA.mp4
│   ├── yLC9CtWU5ws.mp4
│   └── ZQV4U2KQ370

In [10]:
# After downloading the data, we need to check the annotation format
!cat kinetics400_tiny/kinetics_tiny_train_video.txt

D32_1gwq35E.mp4 0
iRuyZSKhHRg.mp4 1
oXy-e_P_cAI.mp4 0
34XczvTaRiI.mp4 1
h2YqqUhnR34.mp4 0
O46YA8tI530.mp4 0
kFC3KY2bOP8.mp4 1
WWP5HZJsg-o.mp4 1
phDqGd0NKoo.mp4 1
yLC9CtWU5ws.mp4 0
27_CSXByd3s.mp4 1
IyfILH9lBRo.mp4 1
T_TMNGzVrDk.mp4 1
TkkZPZHbAKA.mp4 0
PnOe3GZRVX8.mp4 1
soEcZZsBmDs.mp4 1
FMlSTTpN3VY.mp4 1
WaS0qwP46Us.mp4 0
A-wiliK50Zw.mp4 1
oMrZaozOvdQ.mp4 1
ZQV4U2KQ370.mp4 0
DbX8mPslRXg.mp4 1
h10B9SVE-nk.mp4 1
P5M-hAts7MQ.mp4 0
R8HXQkdgKWA.mp4 0
D92m0HsHjcQ.mp4 0
RqnKtCEoEcA.mp4 0
LvcFDgCAXQs.mp4 0
xGY2dP0YUjA.mp4 0
Wh_YPQdH1Zg.mp4 0


According to the format defined in [`VideoDataset`](./datasets/video_dataset.py), each line describes one sample video by its file path and label, separated by whitespace.
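
To make the annotation format concrete, here is a minimal sketch of reading such lines in plain Python. It is not the loader MMAction2 uses; the helper name `parse_annotation_line` is hypothetical, and the three sample lines are copied from the file printed above.

```python
def parse_annotation_line(line):
    """Split one annotation line into (file path, integer label)."""
    # Split once from the right so only the final field is the label.
    filename, label = line.strip().rsplit(maxsplit=1)
    return filename, int(label)


sample = """D32_1gwq35E.mp4 0
iRuyZSKhHRg.mp4 1
oXy-e_P_cAI.mp4 0"""

records = [parse_annotation_line(line)
           for line in sample.splitlines() if line.strip()]
num_classes = len({label for _, label in records})

print(records)
print('distinct labels:', num_classes)
```

Counting the distinct labels this way is a quick sanity check before setting `num_classes` in the config.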

### Modify the config

Next, we modify the config for training.
To speed up the process, we fine-tune a pre-trained recognizer instead of training one from scratch.

In [19]:
from mmcv import Config
cfg = Config.fromfile('./configs/recognition/tsn/tsn_r50_video_1x1x8_100e_kinetics400_rgb.py')

Given a config that trains a TSN model on the full Kinetics-400 dataset, we need to modify some values to use it for training TSN on the Kinetics-400 Tiny dataset.


In [20]:
from mmcv.runner import set_random_seed

# Modify dataset type and path
cfg.dataset_type = 'VideoDataset'
cfg.data_root = 'kinetics400_tiny/train/'
cfg.data_root_val = 'kinetics400_tiny/val/'
cfg.ann_file_train = 'kinetics400_tiny/kinetics_tiny_train_video.txt'
cfg.ann_file_val = 'kinetics400_tiny/kinetics_tiny_val_video.txt'
cfg.ann_file_test = 'kinetics400_tiny/kinetics_tiny_val_video.txt'

cfg.data.test.type = 'VideoDataset'
cfg.data.test.ann_file = 'kinetics400_tiny/kinetics_tiny_val_video.txt'
cfg.data.test.data_prefix = 'kinetics400_tiny/val/'

cfg.data.train.type = 'VideoDataset'
cfg.data.train.ann_file = 'kinetics400_tiny/kinetics_tiny_train_video.txt'
cfg.data.train.data_prefix = 'kinetics400_tiny/train/'

cfg.data.val.type = 'VideoDataset'
cfg.data.val.ann_file = 'kinetics400_tiny/kinetics_tiny_val_video.txt'
cfg.data.val.data_prefix = 'kinetics400_tiny/val/'

# The flag is used to determine whether it is omnisource training
cfg.setdefault('omnisource', False)
# Modify num classes of the model in cls_head
cfg.model.cls_head.num_classes = 2
# We can use the pre-trained TSN model
cfg.load_from = './checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth'

# Set up working dir to save files and logs.
cfg.work_dir = './tutorial_exps'

# The original learning rate (LR) is set for 8-GPU training.
# We divide it by 8 since we only use one GPU, and by a further 16
# because the per-GPU batch size is also reduced by a factor of 16.
cfg.data.videos_per_gpu = cfg.data.videos_per_gpu // 16
cfg.optimizer.lr = cfg.optimizer.lr / 8 / 16
cfg.total_epochs = 30

# We can set the checkpoint saving interval to reduce the storage cost
cfg.checkpoint_config.interval = 10
# We can set the log print interval (in iterations) to control how often the log is printed
cfg.log_config.interval = 5

# Set the seed so that the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)


# Have a look at the final config used for training
print(f'Config:\n{cfg.pretty_text}')


Config:
model = dict(
    type='Recognizer2D',
    backbone=dict(
        type='ResNet',
        pretrained='torchvision://resnet50',
        depth=50,
        norm_eval=False),
    cls_head=dict(
        type='TSNHead',
        num_classes=2,
        in_channels=2048,
        spatial_type='avg',
        consensus=dict(type='AvgConsensus', dim=1),
        dropout_ratio=0.4,
        init_std=0.01))
train_cfg = None
test_cfg = dict(average_clips=None)
dataset_type = 'VideoDataset'
data_root = 'kinetics400_tiny/train/'
data_root_val = 'kinetics400_tiny/val/'
ann_file_train = 'kinetics400_tiny/kinetics_tiny_train_video.txt'
ann_file_val = 'kinetics400_tiny/kinetics_tiny_val_video.txt'
ann_file_test = 'kinetics400_tiny/kinetics_tiny_val_video.txt'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_bgr=False)
train_pipeline = [
    dict(type='DecordInit'),
    dict(type='SampleFrames', clip_len=1, frame_interval=1, num_clips=8),
    dict(type='DecordDeco
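
The factors of 8 and 16 in the cell above can be checked against the training log. The sketch below redoes the arithmetic, assuming the base config sets `lr=0.01` and `videos_per_gpu=32`. Those two base values are not printed in this tutorial; they are inferred from the learning rate reported later in the training log (`7.813e-05`) and from the 15 iterations per epoch over 30 training videos.

```python
# Assumed base values (inferred from the training log, not shown in the tutorial).
base_lr = 0.01
base_videos_per_gpu = 32
num_train_videos = 30  # size of the tiny training set

# Same scaling as in the config cell: batch size / 16, LR / 8 (GPUs) / 16 (batch).
videos_per_gpu = base_videos_per_gpu // 16
lr = base_lr / 8 / 16
iters_per_epoch = num_train_videos // videos_per_gpu

print(f'videos_per_gpu: {videos_per_gpu}')
print(f'lr: {lr:.4g}')
print(f'iterations per epoch: {iters_per_epoch}')
```

Scaling the learning rate in proportion to the total batch size is a common heuristic rather than a guarantee, so the value may still need tuning on other datasets.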

### Train a new recognizer

Finally, let's initialize the dataset and recognizer, then train a new recognizer!

In [14]:
import os.path as osp

from mmaction.datasets import build_dataset
from mmaction.models import build_model
from mmaction.apis import train_model

import mmcv

# Build the dataset
datasets = [build_dataset(cfg.data.train)]

# Build the recognizer
model = build_model(cfg.model, train_cfg=cfg.train_cfg, test_cfg=cfg.test_cfg)

# Create work_dir
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_model(model, datasets, cfg, distributed=False, validate=True)

Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /root/.cache/torch/checkpoints/resnet50-19c8e357.pth



2020-12-22 04:57:29,633 - mmaction - INFO - These parameters in pretrained checkpoint are not loaded: {'fc.bias', 'fc.weight'}





2020-12-22 04:57:29,699 - mmaction - INFO - load checkpoint from ./checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth

size mismatch for cls_head.fc_cls.weight: copying a param with shape torch.Size([400, 2048]) from checkpoint, the shape in current model is torch.Size([2, 2048]).
size mismatch for cls_head.fc_cls.bias: copying a param with shape torch.Size([400]) from checkpoint, the shape in current model is torch.Size([2]).
2020-12-22 04:57:29,883 - mmaction - INFO - Start running, host: root@daa45c12bfe6, work_dir: /content/drive/My Drive/Workspace_Action/mmaction2/tutorial_exps
2020-12-22 04:57:29,884 - mmaction - INFO - workflow: [('train', 1)], max: 30 epochs
2020-12-22 04:57:34,514 - mmaction - INFO - Epoch [1][5/15]	lr: 7.813e-05, eta: 0:06:51, time: 0.924, data_time: 0.685, memory: 2918, top1_acc: 0.7000, top5_acc: 1.0000, loss_cls: 0.6865, loss: 0.6865, grad_norm: 12.7663
2020-12-22 04:57:35,512 - mmaction - INFO - Epoch [1][10/15]	lr: 7.813e-05, eta: 0:04:

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 5.0 task/s, elapsed: 2s, ETA:     0s

2020-12-22 04:58:05,481 - mmaction - INFO - Evaluating top_k_accuracy ...
2020-12-22 04:58:05,486 - mmaction - INFO - 
top1_acc	0.7000
top5_acc	1.0000
2020-12-22 04:58:05,486 - mmaction - INFO - Evaluating mean_class_accuracy ...
2020-12-22 04:58:05,488 - mmaction - INFO - 
mean_acc	0.7000
2020-12-22 04:58:05,489 - mmaction - INFO - Now best checkpoint is epoch_5.pth
2020-12-22 04:58:05,498 - mmaction - INFO - Epoch(val) [5][15]	top1_acc: 0.7000, top5_acc: 1.0000, mean_class_accuracy: 0.7000
2020-12-22 04:58:10,292 - mmaction - INFO - Epoch [6][5/15]	lr: 7.813e-05, eta: 0:02:54, time: 0.954, data_time: 0.729, memory: 2918, top1_acc: 0.5000, top5_acc: 1.0000, loss_cls: 0.6696, loss: 0.6696, grad_norm: 11.0199
2020-12-22 04:58:11,236 - mmaction - INFO - Epoch [6][10/15]	lr: 7.813e-05, eta: 0:02:46, time: 0.189, data_time: 0.002, memory: 2918, top1_acc: 0.6000, top5_acc: 1.0000, loss_cls: 0.6824, loss: 0.6824, grad_norm: 11.8895
2020-12-22 04:58:12,098 - mmaction - INFO - Epoch [6][15/15]

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 4.8 task/s, elapsed: 2s, ETA:     0s

2020-12-22 04:58:41,753 - mmaction - INFO - Evaluating top_k_accuracy ...
2020-12-22 04:58:41,754 - mmaction - INFO - 
top1_acc	0.9000
top5_acc	1.0000
2020-12-22 04:58:41,757 - mmaction - INFO - Evaluating mean_class_accuracy ...
2020-12-22 04:58:41,760 - mmaction - INFO - 
mean_acc	0.9000
2020-12-22 04:58:41,761 - mmaction - INFO - Now best checkpoint is epoch_10.pth
2020-12-22 04:58:41,768 - mmaction - INFO - Epoch(val) [10][15]	top1_acc: 0.9000, top5_acc: 1.0000, mean_class_accuracy: 0.9000
2020-12-22 04:58:46,146 - mmaction - INFO - Epoch [11][5/15]	lr: 7.813e-05, eta: 0:02:13, time: 0.874, data_time: 0.636, memory: 2918, top1_acc: 0.8000, top5_acc: 1.0000, loss_cls: 0.5357, loss: 0.5357, grad_norm: 8.8030
2020-12-22 04:58:47,738 - mmaction - INFO - Epoch [11][10/15]	lr: 7.813e-05, eta: 0:02:09, time: 0.319, data_time: 0.111, memory: 2918, top1_acc: 0.8000, top5_acc: 1.0000, loss_cls: 0.5830, loss: 0.5830, grad_norm: 10.5931
2020-12-22 04:58:48,602 - mmaction - INFO - Epoch [11][15

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 4.9 task/s, elapsed: 2s, ETA:     0s

2020-12-22 04:59:17,064 - mmaction - INFO - Evaluating top_k_accuracy ...
2020-12-22 04:59:17,066 - mmaction - INFO - 
top1_acc	0.8000
top5_acc	1.0000
2020-12-22 04:59:17,068 - mmaction - INFO - Evaluating mean_class_accuracy ...
2020-12-22 04:59:17,071 - mmaction - INFO - 
mean_acc	0.8000
2020-12-22 04:59:17,071 - mmaction - INFO - Epoch(val) [15][15]	top1_acc: 0.8000, top5_acc: 1.0000, mean_class_accuracy: 0.8000
2020-12-22 04:59:21,321 - mmaction - INFO - Epoch [16][5/15]	lr: 7.813e-05, eta: 0:01:38, time: 0.847, data_time: 0.613, memory: 2918, top1_acc: 0.8000, top5_acc: 1.0000, loss_cls: 0.5428, loss: 0.5428, grad_norm: 10.0089
2020-12-22 04:59:22,505 - mmaction - INFO - Epoch [16][10/15]	lr: 7.813e-05, eta: 0:01:34, time: 0.238, data_time: 0.037, memory: 2918, top1_acc: 0.7000, top5_acc: 1.0000, loss_cls: 0.6095, loss: 0.6095, grad_norm: 11.1777
2020-12-22 04:59:23,439 - mmaction - INFO - Epoch [16][15/15]	lr: 7.813e-05, eta: 0:01:31, time: 0.187, data_time: 0.006, memory: 2918, 

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 4.9 task/s, elapsed: 2s, ETA:     0s

2020-12-22 04:59:53,346 - mmaction - INFO - Evaluating top_k_accuracy ...
2020-12-22 04:59:53,350 - mmaction - INFO - 
top1_acc	0.9000
top5_acc	1.0000
2020-12-22 04:59:53,351 - mmaction - INFO - Evaluating mean_class_accuracy ...
2020-12-22 04:59:53,353 - mmaction - INFO - 
mean_acc	0.9000
2020-12-22 04:59:53,355 - mmaction - INFO - Epoch(val) [20][15]	top1_acc: 0.9000, top5_acc: 1.0000, mean_class_accuracy: 0.9000
2020-12-22 04:59:57,503 - mmaction - INFO - Epoch [21][5/15]	lr: 7.813e-05, eta: 0:01:04, time: 0.828, data_time: 0.603, memory: 2918, top1_acc: 0.8000, top5_acc: 1.0000, loss_cls: 0.4515, loss: 0.4515, grad_norm: 8.3872
2020-12-22 04:59:59,020 - mmaction - INFO - Epoch [21][10/15]	lr: 7.813e-05, eta: 0:01:01, time: 0.304, data_time: 0.078, memory: 2918, top1_acc: 0.8000, top5_acc: 1.0000, loss_cls: 0.4542, loss: 0.4542, grad_norm: 9.0976
2020-12-22 04:59:59,925 - mmaction - INFO - Epoch [21][15/15]	lr: 7.813e-05, eta: 0:00:59, time: 0.181, data_time: 0.002, memory: 2918, to

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 5.0 task/s, elapsed: 2s, ETA:     0s

2020-12-22 05:00:28,877 - mmaction - INFO - Evaluating top_k_accuracy ...
2020-12-22 05:00:28,879 - mmaction - INFO - 
top1_acc	1.0000
top5_acc	1.0000
2020-12-22 05:00:28,880 - mmaction - INFO - Evaluating mean_class_accuracy ...
2020-12-22 05:00:28,885 - mmaction - INFO - 
mean_acc	1.0000
2020-12-22 05:00:28,887 - mmaction - INFO - Now best checkpoint is epoch_25.pth
2020-12-22 05:00:28,893 - mmaction - INFO - Epoch(val) [25][15]	top1_acc: 1.0000, top5_acc: 1.0000, mean_class_accuracy: 1.0000
2020-12-22 05:00:33,616 - mmaction - INFO - Epoch [26][5/15]	lr: 7.813e-05, eta: 0:00:31, time: 0.943, data_time: 0.731, memory: 2918, top1_acc: 0.9000, top5_acc: 1.0000, loss_cls: 0.3755, loss: 0.3755, grad_norm: 7.9569
2020-12-22 05:00:34,727 - mmaction - INFO - Epoch [26][10/15]	lr: 7.813e-05, eta: 0:00:28, time: 0.222, data_time: 0.031, memory: 2918, top1_acc: 0.5000, top5_acc: 1.0000, loss_cls: 0.5395, loss: 0.5395, grad_norm: 10.7605
2020-12-22 05:00:35,610 - mmaction - INFO - Epoch [26][15

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 4.9 task/s, elapsed: 2s, ETA:     0s

2020-12-22 05:01:05,254 - mmaction - INFO - Evaluating top_k_accuracy ...
2020-12-22 05:01:05,256 - mmaction - INFO - 
top1_acc	1.0000
top5_acc	1.0000
2020-12-22 05:01:05,257 - mmaction - INFO - Evaluating mean_class_accuracy ...
2020-12-22 05:01:05,261 - mmaction - INFO - 
mean_acc	1.0000
2020-12-22 05:01:05,261 - mmaction - INFO - Epoch(val) [30][15]	top1_acc: 1.0000, top5_acc: 1.0000, mean_class_accuracy: 1.0000


### Understand the log
From the log, we can get a basic understanding of the training process and see how well the recognizer is trained.

First, the ResNet-50 backbone pre-trained on ImageNet is loaded. This is common practice, since training from scratch is more costly. The log shows that all the weights of the ResNet-50 backbone are loaded except `fc.bias` and `fc.weight`.

Second, since the dataset we are using is small, we load a pre-trained TSN model and fine-tune it for action recognition.
The original TSN is trained on the original Kinetics-400 dataset, which contains 400 classes, but the Kinetics-400 Tiny dataset has only 2 classes. Therefore, the last FC layer of the pre-trained TSN has a different weight shape and is not loaded.

Third, the recognizer is evaluated with the default evaluation metrics. The results show that the recognizer reaches 100% top-1 accuracy and 100% top-5 accuracy on the validation set.
 
Not bad!
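
The `size mismatch` messages in the log correspond to the second point: parameters whose shapes differ between the checkpoint and the current model are skipped. The sketch below illustrates that idea with plain tuples standing in for tensor shapes. It is not the checkpoint-loading code from MMCV; `split_by_shape` is a hypothetical helper, the `fc_cls` shapes come from the log, and `backbone.conv1.weight` is included only as an illustrative matching entry.

```python
def split_by_shape(checkpoint_shapes, model_shapes):
    """Separate parameter names into those that can be loaded and those skipped."""
    loadable, skipped = [], []
    for name, shape in checkpoint_shapes.items():
        # A parameter is loadable only if the model has it with the same shape.
        if model_shapes.get(name) == shape:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


checkpoint_shapes = {
    'cls_head.fc_cls.weight': (400, 2048),  # 400 Kinetics-400 classes
    'cls_head.fc_cls.bias': (400,),
    'backbone.conv1.weight': (64, 3, 7, 7),  # illustrative backbone entry
}
model_shapes = {
    'cls_head.fc_cls.weight': (2, 2048),  # 2 classes in the tiny dataset
    'cls_head.fc_cls.bias': (2,),
    'backbone.conv1.weight': (64, 3, 7, 7),
}

loadable, skipped = split_by_shape(checkpoint_shapes, model_shapes)
print('loaded: ', loadable)
print('skipped:', skipped)
```

The skipped classification layer keeps its fresh initialization and is learned during fine-tuning, while the matching backbone weights are reused.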

## Test the trained recognizer

After finetuning the recognizer, let's check the prediction results!

In [15]:
from mmaction.apis import single_gpu_test
from mmaction.datasets import build_dataloader
from mmcv.parallel import MMDataParallel

# Build a test dataloader
dataset = build_dataset(cfg.data.test, dict(test_mode=True))
data_loader = build_dataloader(
        dataset,
        videos_per_gpu=1,
        workers_per_gpu=cfg.data.workers_per_gpu,
        dist=False,
        shuffle=False)
model = MMDataParallel(model, device_ids=[0])
outputs = single_gpu_test(model, data_loader)

eval_config = cfg.evaluation
eval_config.pop('interval')
eval_res = dataset.evaluate(outputs, **eval_config)
for name, val in eval_res.items():
    print(f'{name}: {val:.04f}')

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 10/10, 2.1 task/s, elapsed: 5s, ETA:     0s
Evaluating top_k_accuracy ...

top1_acc	1.0000
top5_acc	1.0000

Evaluating mean_class_accuracy ...

mean_acc	1.0000
top1_acc: 1.0000
top5_acc: 1.0000
mean_class_accuracy: 1.0000
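
For readers unfamiliar with these metrics, the sketch below computes top-k accuracy and mean class accuracy in plain Python. It is a simplified illustration rather than MMAction2's implementation, and the scores and labels are made-up toy values. It also shows why top-5 accuracy carries no information here: with only 2 classes, the true label is always among the top 5.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


def mean_class_accuracy(scores, labels):
    """Average of per-class top-1 accuracies over the classes present."""
    per_class = {}  # label -> (correct, total)
    for row, label in zip(scores, labels):
        pred = max(range(len(row)), key=lambda i: row[i])
        correct, total = per_class.get(label, (0, 0))
        per_class[label] = (correct + (pred == label), total + 1)
    return sum(c / t for c, t in per_class.values()) / len(per_class)


# Toy two-class example (made-up values): one class-1 sample is misclassified.
scores = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
labels = [0, 1, 1, 1]

top1 = top_k_accuracy(scores, labels, k=1)
top5 = top_k_accuracy(scores, labels, k=5)
mca = mean_class_accuracy(scores, labels)
print(f'top1_acc: {top1:.4f}')
print(f'top5_acc: {top5:.4f}')
print(f'mean_class_accuracy: {mca:.4f}')
```

Mean class accuracy weights every class equally, so it differs from top-1 accuracy when the classes are imbalanced, as in the toy data.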


In [21]:
from mmaction.apis import inference_recognizer, init_recognizer

# Reuse the modified `cfg` from above, whose classification head has 2 classes;
# the 400-class Kinetics-400 inference config would not match the fine-tuned weights
# Set the fine-tuned checkpoint file to load
checkpoint = 'tutorial_exps/epoch_30.pth'
# Initialize the recognizer
model = init_recognizer(cfg, checkpoint, device='cuda:0')

In [26]:
# Use the recognizer to do inference
video = 'tutorial_exps/0pVGiAU6XEA.mp4'
label = 'tutorial_exps/label_map.txt'
results = inference_recognizer(model, video, label)

In [27]:
# Let's show the results
for result in results:
    print(f'{result[0]}: ', result[1])

rope climbing:  0.80804515
blow glass:  -1.0268286
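
The second score is negative, so these values are not probabilities. The config printed earlier sets `test_cfg = dict(average_clips=None)`, which appears to leave the class scores unnormalized. Assuming they are raw logits, a softmax converts them into values that sum to 1. The sketch below applies one to the two scores printed above; it is an illustration in plain Python, not part of the MMAction2 API.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ['rope climbing', 'blow glass']
raw_scores = [0.80804515, -1.0268286]  # copied from the output above

probs = softmax(raw_scores)
for name, p in zip(labels, probs):
    print(f'{name}: {p:.4f}')
```

The ranking of the classes is unchanged by the softmax; only the scale of the scores differs.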
