# Using MMAction2 to solve a video action recognition problem

Train a recognizer on a new, small dataset.



## Install MMAction2

In [5]:
# Check nvcc version
!nvcc -V
# Check GCC version
!gcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



In [6]:
# Install dependencies (use the cu111 wheels; they work with the CUDA 11.2 toolkit reported above)
!pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html

# Install mmcv-full so that we can use its CUDA operators
!pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html

# Install mmaction2
!rm -rf mmaction2
!git clone https://github.com/open-mmlab/mmaction2.git
%cd mmaction2

!pip install -e .

# Install some optional requirements
!pip install -r requirements/optional.txt

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
Cloning into 'mmaction2'...
remote: Enumerating objects: 19835, done.
remote: Counting objects: 100% (370/370), done.
remote: Compressing objects: 100% (265/265), done.
remote: Total 19835 (delta 133), reused 261 (delta 98), pack-reused 19465
Receiving objects: 100% (19835/19835), 71.93 MiB | 20.89 MiB/s, done.
Resolving deltas: 100% (13924/13924), done.
/content/mmaction2
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Obtaining file:///content/mmaction2
Installing collected packages: mmaction2
  Running setup.py develop for mmaction2
Successfully installed mmaction2-0.24.1

In [7]:
# Check PyTorch installation
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())

# Check MMAction2 installation
import mmaction
print(mmaction.__version__)

# Check MMCV installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())

1.9.0+cu111 True
0.24.1
11.1
GCC 7.3


## Train a recognizer on customized dataset

To train a new recognizer, there are usually three things to do:
1. Support a new dataset
2. Modify the config
3. Train a new recognizer

### Support a new dataset

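First, let's download a small dataset from the GitHub repository cloned below. It has 10 action classes with 10 videos each. We use 8 videos per class (80 in total) with their labels as the training set and the remaining 2 per class (20 in total) as the validation set, which also serves as the test set.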

In [8]:
# Clone the data from GitHub
!git clone https://github.com/IvoryCandy/Skyrim-Human-Actions.git

Cloning into 'Skyrim-Human-Actions'...
remote: Enumerating objects: 135, done.
remote: Total 135 (delta 0), reused 0 (delta 0), pack-reused 135
Receiving objects: 100% (135/135), 12.26 MiB | 17.63 MiB/s, done.
Resolving deltas: 100% (11/11), done.


In [9]:
# Check the directory structure of the data
# Install tree first
!apt-get -q install tree
!tree Skyrim-Human-Actions

Reading package lists...
Building dependency tree...
Reading state information...
tree is already the newest version (1.7.0-5).
The following package was automatically installed and is no longer required:
  libnvidia-common-460
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 20 not upgraded.
Skyrim-Human-Actions
├── archery
│   ├── archery_10.mp4
│   ├── archery_1.mp4
│   ├── archery_2.mp4
│   ├── archery_3.mp4
│   ├── archery_4.mp4
│   ├── archery_5.mp4
│   ├── archery_6.mp4
│   ├── archery_7.mp4
│   ├── archery_8.mp4
│   └── archery_9.mp4
├── breaststroke
│   ├── breaststroke_10.mp4
│   ├── breaststroke_1.mp4
│   ├── breaststroke_2.mp4
│   ├── breaststroke_3.mp4
│   ├── breaststroke_4.mp4
│   ├── breaststroke_5.mp4
│   ├── breaststroke_6.mp4
│   ├── breaststroke_7.mp4
│   ├── breaststroke_8.mp4
│   └── breaststroke_9.mp4
├── crossbow
│   ├── crossbow_10.mp4
│   ├── crossbow_1.mp4
│   ├── crossbow_2.mp4
│   ├── crossbow_3.mp4
│   ├── crossbow_4.mp4
│ 

In [10]:
# Create directories to store the training and validation data
! mkdir /content/mmaction2/Skyrim-Human-Actions_final
! mkdir /content/mmaction2/Skyrim-Human-Actions_final/train
! mkdir /content/mmaction2/Skyrim-Human-Actions_final/val

In [11]:
import os
import shutil

In [13]:
# Fill the annotation files and the train and validation folders
src_root = "/content/mmaction2/Skyrim-Human-Actions"
dst_root = "/content/mmaction2/Skyrim-Human-Actions_final"
# Entries in the repository root that are not class folders
skip = {".git", ".gitignore", "extract_frame.py", "README.md", "LICENSE"}

def append_line(path, line):
    # Open in append & read mode ('a+'); add a newline first if the file is not empty
    with open(path, "a+") as file_object:
        file_object.seek(0)
        if file_object.read(1):
            file_object.write("\n")
        file_object.write(line)

label = 0
for folder in os.listdir(src_root):
    if folder in skip:
        continue
    for count, name in enumerate(os.listdir(os.path.join(src_root, folder))):
        # The first 8 videos of each class go to train, the rest go to val
        split = "train" if count < 8 else "val"
        append_line(os.path.join(dst_root, split + ".txt"), f"{name} {label}")
        shutil.copy(os.path.join(src_root, folder, name), os.path.join(dst_root, split))
    label += 1
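
Each line of `train.txt` and `val.txt` has the form `<video file name> <class index>`. As a quick sanity check, a helper along the following lines could count how many videos each class received. The name `count_labels` is hypothetical and not part of MMAction2; this is a minimal sketch that only assumes the whitespace-separated format written by the cell above.

```python
from collections import Counter

def count_labels(annotation_text):
    """Count annotation lines per class index."""
    counts = Counter()
    for line in annotation_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split on the last whitespace so the label is always the final field
        _name, label = line.rsplit(None, 1)
        counts[int(label)] += 1
    return counts

# Small made-up sample in the same format as train.txt
sample = "archery_1.mp4 0\narchery_2.mp4 0\ncrossbow_1.mp4 2"
print(count_labels(sample))
```

On the real files, one would pass the file contents (for example `open(path).read()`) and expect 8 entries per class in `train.txt` and 2 per class in `val.txt`.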
     

### Modify the config

In the next step, we need to modify the config for the training.
To accelerate the process, we finetune a recognizer using a pre-trained recognizer.

In [14]:
!mkdir checkpoints
!wget -c https://download.openmmlab.com/mmaction/recognition/tsn/tsn_r50_1x1x3_100e_kinetics400_rgb/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth \
      -O checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth

--2022-12-10 19:18:02--  https://download.openmmlab.com/mmaction/recognition/tsn/tsn_r50_1x1x3_100e_kinetics400_rgb/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth
Resolving download.openmmlab.com (download.openmmlab.com)... 47.75.20.5
Connecting to download.openmmlab.com (download.openmmlab.com)|47.75.20.5|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 97579339 (93M) [application/octet-stream]
Saving to: ‘checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth’


2022-12-10 19:18:04 (68.0 MB/s) - ‘checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth’ saved [97579339/97579339]



Given a config that trains a TSN model on the full Kinetics-400 dataset, we need to modify some values to use it for training TSN on the Skyrim-Human-Actions dataset.


In [19]:
from mmcv.runner import set_random_seed
# Modify dataset type and path
from mmcv import Config
cfg = Config.fromfile('./configs/recognition/tsn/tsn_r50_video_1x1x8_100e_kinetics400_rgb.py')
cfg.dataset_type = 'VideoDataset'
cfg.data_root = 'Skyrim-Human-Actions_final/train/'
cfg.data_root_val = 'Skyrim-Human-Actions_final/val/'
cfg.ann_file_train = 'Skyrim-Human-Actions_final/train.txt'
cfg.ann_file_val = 'Skyrim-Human-Actions_final/val.txt'
cfg.ann_file_test = 'Skyrim-Human-Actions_final/val.txt'

cfg.data.test.type = 'VideoDataset'
cfg.data.test.ann_file = 'Skyrim-Human-Actions_final/val.txt'
cfg.data.test.data_prefix = 'Skyrim-Human-Actions_final/val/'

cfg.data.train.type = 'VideoDataset'
cfg.data.train.ann_file = 'Skyrim-Human-Actions_final/train.txt'
cfg.data.train.data_prefix = 'Skyrim-Human-Actions_final/train/'

cfg.data.val.type = 'VideoDataset'
cfg.data.val.ann_file = 'Skyrim-Human-Actions_final/val.txt'
cfg.data.val.data_prefix = 'Skyrim-Human-Actions_final/val/'
#cfg.data.video_per_gpu=32

# The flag is used to determine whether it is omnisource training
cfg.setdefault('omnisource', False)
# Modify num classes of the model in cls_head
cfg.model.cls_head.num_classes = 10
# We can use the pre-trained TSN model
cfg.load_from = './checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth'

# Set up working dir to save files and logs.
cfg.work_dir = './tutorial_exps'

# The original learning rate (LR) is set for 8-GPU training.
# We divide it by 8 since we only use one GPU, and by a further 16
# because we also reduce videos_per_gpu by a factor of 16.
cfg.data.videos_per_gpu = cfg.data.videos_per_gpu // 16
cfg.optimizer.lr = cfg.optimizer.lr / 8 / 16
cfg.total_epochs = 40
# We can set the checkpoint saving interval to reduce the storage cost
cfg.checkpoint_config.interval = 5
# We can set the log print interval to reduce how often the log is printed
cfg.log_config.interval = 5

# Set the seed so that the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)

# Save the best
cfg.evaluation.save_best='auto'


# We can initialize the logger for training and have a look
# at the final config used for training
print(f'Config:\n{cfg.pretty_text}')

Config:
model = dict(
    type='Recognizer2D',
    backbone=dict(
        type='ResNet',
        pretrained='torchvision://resnet50',
        depth=50,
        norm_eval=False),
    cls_head=dict(
        type='TSNHead',
        num_classes=10,
        in_channels=2048,
        spatial_type='avg',
        consensus=dict(type='AvgConsensus', dim=1),
        dropout_ratio=0.4,
        init_std=0.01),
    train_cfg=None,
    test_cfg=dict(average_clips=None))
optimizer = dict(type='SGD', lr=7.8125e-05, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=40, norm_type=2))
lr_config = dict(policy='step', step=[40, 80])
total_epochs = 40
checkpoint_config = dict(interval=5)
log_config = dict(interval=5, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = './checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth'
resume_from = None
workflow = [('train', 1)]
opencv_num_threads = 0
mp_start_method = 
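
The learning-rate adjustment in the config cell above scales the LR in proportion to the total batch size (GPUs times videos per GPU). The sketch below reproduces that arithmetic. The helper `scale_lr` is hypothetical, and the reference values (base LR 0.01, 32 videos per GPU) are assumptions inferred from the printed `lr=7.8125e-05` and the division factors, not values shown in the output above.

```python
import math

def scale_lr(base_lr, base_batch, new_batch):
    """Scale the learning rate in proportion to the total batch size."""
    return base_lr * new_batch / base_batch

# Assumed reference setup: 8 GPUs x 32 videos per GPU, lr = 0.01
base_batch = 8 * 32
# Setup used in this notebook: 1 GPU x 2 videos per GPU (32 // 16)
new_batch = 1 * 2

lr = scale_lr(0.01, base_batch, new_batch)
# Same result as dividing by 8 and then by 16, as in the config cell
assert math.isclose(lr, 0.01 / 8 / 16)
print(lr)
```

Scaling both quantities together keeps the step size per sample roughly comparable to the reference setup, which is a common heuristic rather than a guarantee of equal accuracy.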

### Train a new recognizer

Finally, let's initialize the dataset and recognizer, then train a new recognizer!

In [20]:
import os.path as osp

from mmaction.datasets import build_dataset
from mmaction.models import build_model
from mmaction.apis import train_model

import mmcv

# Build the dataset
datasets = [build_dataset(cfg.data.train)]

# Build the recognizer
model = build_model(cfg.model, train_cfg=cfg.get('train_cfg'), test_cfg=cfg.get('test_cfg'))

# Create work_dir
mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
train_model(model, datasets, cfg, distributed=False, validate=True)

2022-12-10 19:47:34,245 - mmaction - INFO - These parameters in pretrained checkpoint are not loaded: {'fc.bias', 'fc.weight'}
2022-12-10 19:47:34,296 - mmaction - INFO - load checkpoint from local path: ./checkpoints/tsn_r50_1x1x3_100e_kinetics400_rgb_20200614-e508be42.pth


load checkpoint from torchvision path: torchvision://resnet50



size mismatch for cls_head.fc_cls.weight: copying a param with shape torch.Size([400, 2048]) from checkpoint, the shape in current model is torch.Size([10, 2048]).
size mismatch for cls_head.fc_cls.bias: copying a param with shape torch.Size([400]) from checkpoint, the shape in current model is torch.Size([10]).
2022-12-10 19:47:34,392 - mmaction - INFO - Start running, host: root@dfe2d096301f, work_dir: /content/mmaction2/tutorial_exps
2022-12-10 19:47:34,393 - mmaction - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) StepLrUpdaterHook                  
(NORMAL      ) CheckpointHook                     
(LOW         ) EvalHook                           
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_train_epoch:
(VERY_HIGH   ) StepLrUpdaterHook                  
(LOW         ) IterTimerHook                      
(LOW         ) EvalHook                           
(VERY_LOW    ) TextLoggerHook                     
 ---

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 26.1 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:48:22,589 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:48:22,591 - mmaction - INFO - 
top1_acc	0.1000
top5_acc	0.8500
2022-12-10 19:48:22,592 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:48:22,596 - mmaction - INFO - 
mean_acc	0.1000
2022-12-10 19:48:23,339 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_5.pth.
2022-12-10 19:48:23,340 - mmaction - INFO - Best top1_acc is 0.1000 at 5 epoch.
2022-12-10 19:48:23,341 - mmaction - INFO - Epoch(val) [5][10]	top1_acc: 0.1000, top5_acc: 0.8500, mean_class_accuracy: 0.1000
2022-12-10 19:48:26,505 - mmaction - INFO - Epoch [6][5/40]	lr: 7.813e-05, eta: 0:05:35, time: 0.631, data_time: 0.445, memory: 4075, top1_acc: 0.0000, top5_acc: 0.8000, loss_cls: 2.2495, loss: 2.2495, grad_norm: 15.8859
2022-12-10 19:48:27,368 - mmaction - INFO - Epoch [6][10/40]	lr: 7.813e-05, eta: 0:05:31, time: 0.173, data_time: 0.001, memory: 4075, top1_acc: 0.3000, top5_acc: 0.5000, loss_cl

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 25.7 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:49:11,850 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:49:11,852 - mmaction - INFO - 
top1_acc	0.3500
top5_acc	0.9000
2022-12-10 19:49:11,853 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:49:11,857 - mmaction - INFO - 
mean_acc	0.3500
2022-12-10 19:49:11,882 - mmaction - INFO - The previous best checkpoint /content/mmaction2/tutorial_exps/best_top1_acc_epoch_5.pth was removed
2022-12-10 19:49:12,629 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_10.pth.
2022-12-10 19:49:12,630 - mmaction - INFO - Best top1_acc is 0.3500 at 10 epoch.
2022-12-10 19:49:12,631 - mmaction - INFO - Epoch(val) [10][10]	top1_acc: 0.3500, top5_acc: 0.9000, mean_class_accuracy: 0.3500
2022-12-10 19:49:15,816 - mmaction - INFO - Epoch [11][5/40]	lr: 7.813e-05, eta: 0:04:42, time: 0.636, data_time: 0.445, memory: 4075, top1_acc: 0.3000, top5_acc: 0.8000, loss_cls: 2.1635, loss: 2.1635, grad_norm: 16.1371
2022-12-10 19:49:16,688 - mmac

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 25.5 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:50:01,727 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:50:01,729 - mmaction - INFO - 
top1_acc	0.6500
top5_acc	0.9000
2022-12-10 19:50:01,732 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:50:01,734 - mmaction - INFO - 
mean_acc	0.6500
2022-12-10 19:50:01,758 - mmaction - INFO - The previous best checkpoint /content/mmaction2/tutorial_exps/best_top1_acc_epoch_10.pth was removed
2022-12-10 19:50:02,461 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_15.pth.
2022-12-10 19:50:02,463 - mmaction - INFO - Best top1_acc is 0.6500 at 15 epoch.
2022-12-10 19:50:02,465 - mmaction - INFO - Epoch(val) [15][10]	top1_acc: 0.6500, top5_acc: 0.9000, mean_class_accuracy: 0.6500
2022-12-10 19:50:05,664 - mmaction - INFO - Epoch [16][5/40]	lr: 7.813e-05, eta: 0:03:54, time: 0.638, data_time: 0.445, memory: 4075, top1_acc: 0.5000, top5_acc: 0.7000, loss_cls: 2.1864, loss: 2.1864, grad_norm: 15.7069
2022-12-10 19:50:06,564 - mma

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 19.5 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:50:52,994 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:50:52,997 - mmaction - INFO - 
top1_acc	0.7500
top5_acc	0.9500
2022-12-10 19:50:52,999 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:50:53,001 - mmaction - INFO - 
mean_acc	0.7500
2022-12-10 19:50:53,038 - mmaction - INFO - The previous best checkpoint /content/mmaction2/tutorial_exps/best_top1_acc_epoch_15.pth was removed
2022-12-10 19:50:53,897 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_20.pth.
2022-12-10 19:50:53,899 - mmaction - INFO - Best top1_acc is 0.7500 at 20 epoch.
2022-12-10 19:50:53,901 - mmaction - INFO - Epoch(val) [20][10]	top1_acc: 0.7500, top5_acc: 0.9500, mean_class_accuracy: 0.7500
2022-12-10 19:50:57,111 - mmaction - INFO - Epoch [21][5/40]	lr: 7.813e-05, eta: 0:03:08, time: 0.640, data_time: 0.449, memory: 4075, top1_acc: 0.6000, top5_acc: 1.0000, loss_cls: 1.7890, loss: 1.7890, grad_norm: 16.0251
2022-12-10 19:50:58,020 - mma

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 25.6 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:51:43,799 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:51:43,802 - mmaction - INFO - 
top1_acc	0.8000
top5_acc	1.0000
2022-12-10 19:51:43,804 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:51:43,806 - mmaction - INFO - 
mean_acc	0.8000
2022-12-10 19:51:43,832 - mmaction - INFO - The previous best checkpoint /content/mmaction2/tutorial_exps/best_top1_acc_epoch_20.pth was removed
2022-12-10 19:51:44,629 - mmaction - INFO - Now best checkpoint is saved as best_top1_acc_epoch_25.pth.
2022-12-10 19:51:44,631 - mmaction - INFO - Best top1_acc is 0.8000 at 25 epoch.
2022-12-10 19:51:44,635 - mmaction - INFO - Epoch(val) [25][10]	top1_acc: 0.8000, top5_acc: 1.0000, mean_class_accuracy: 0.8000
2022-12-10 19:51:47,873 - mmaction - INFO - Epoch [26][5/40]	lr: 7.813e-05, eta: 0:02:21, time: 0.646, data_time: 0.450, memory: 4075, top1_acc: 0.6000, top5_acc: 1.0000, loss_cls: 1.6804, loss: 1.6804, grad_norm: 16.1786
2022-12-10 19:51:48,786 - mma

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 26.0 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:52:34,710 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:52:34,713 - mmaction - INFO - 
top1_acc	0.6500
top5_acc	1.0000
2022-12-10 19:52:34,714 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:52:34,717 - mmaction - INFO - 
mean_acc	0.6500
2022-12-10 19:52:34,718 - mmaction - INFO - Epoch(val) [30][10]	top1_acc: 0.6500, top5_acc: 1.0000, mean_class_accuracy: 0.6500
2022-12-10 19:52:37,912 - mmaction - INFO - Epoch [31][5/40]	lr: 7.813e-05, eta: 0:01:33, time: 0.637, data_time: 0.447, memory: 4075, top1_acc: 0.5000, top5_acc: 0.9000, loss_cls: 1.6601, loss: 1.6601, grad_norm: 16.3921
2022-12-10 19:52:38,816 - mmaction - INFO - Epoch [31][10/40]	lr: 7.813e-05, eta: 0:01:32, time: 0.181, data_time: 0.001, memory: 4075, top1_acc: 0.3000, top5_acc: 1.0000, loss_cls: 1.9064, loss: 1.9064, grad_norm: 16.3769
2022-12-10 19:52:39,727 - mmaction - INFO - Epoch [31][15/40]	lr: 7.813e-05, eta: 0:01:31, time: 0.182, data_time: 0.001, memory: 4075, 

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 25.4 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:53:24,633 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:53:24,635 - mmaction - INFO - 
top1_acc	0.7000
top5_acc	1.0000
2022-12-10 19:53:24,636 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:53:24,640 - mmaction - INFO - 
mean_acc	0.7000
2022-12-10 19:53:24,642 - mmaction - INFO - Epoch(val) [35][10]	top1_acc: 0.7000, top5_acc: 1.0000, mean_class_accuracy: 0.7000
2022-12-10 19:53:27,863 - mmaction - INFO - Epoch [36][5/40]	lr: 7.813e-05, eta: 0:00:46, time: 0.643, data_time: 0.449, memory: 4075, top1_acc: 0.5000, top5_acc: 1.0000, loss_cls: 1.6977, loss: 1.6977, grad_norm: 16.0621
2022-12-10 19:53:28,772 - mmaction - INFO - Epoch [36][10/40]	lr: 7.813e-05, eta: 0:00:45, time: 0.182, data_time: 0.001, memory: 4075, top1_acc: 0.8000, top5_acc: 1.0000, loss_cls: 1.3501, loss: 1.3501, grad_norm: 14.9392
2022-12-10 19:53:29,685 - mmaction - INFO - Epoch [36][15/40]	lr: 7.813e-05, eta: 0:00:43, time: 0.183, data_time: 0.001, memory: 4075, 

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 26.1 task/s, elapsed: 1s, ETA:     0s

2022-12-10 19:54:14,546 - mmaction - INFO - Evaluating top_k_accuracy ...
2022-12-10 19:54:14,548 - mmaction - INFO - 
top1_acc	0.8000
top5_acc	1.0000
2022-12-10 19:54:14,551 - mmaction - INFO - Evaluating mean_class_accuracy ...
2022-12-10 19:54:14,553 - mmaction - INFO - 
mean_acc	0.8000
2022-12-10 19:54:14,555 - mmaction - INFO - Epoch(val) [40][10]	top1_acc: 0.8000, top5_acc: 1.0000, mean_class_accuracy: 0.8000


### Understand the log
From the log, we can get a basic understanding of the training process and see how well the recognizer is trained.

First, the ResNet-50 backbone pre-trained on ImageNet is loaded. This is common practice, since training from scratch is more costly. The log shows that all the weights of the ResNet-50 backbone are loaded except `fc.bias` and `fc.weight`.

Second, since the dataset we are using is small, we load a pre-trained TSN model and finetune it for action recognition.
The original TSN is trained on the Kinetics-400 dataset, which contains 400 classes, but our dataset has 10 classes. Therefore, the last FC layer of the pre-trained TSN has a different weight shape (`[400, 2048]` versus `[10, 2048]` in the log) and is not used.
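
The size-mismatch messages in the log can be pictured as a shape comparison between checkpoint and model parameters. The following is a dependency-free sketch of that idea, not MMCV's actual checkpoint loader. The function `split_by_shape` and the key `backbone.example.weight` are hypothetical; the two `cls_head.fc_cls` entries and their shapes are taken from the log.

```python
def split_by_shape(checkpoint_shapes, model_shapes):
    """Separate parameters whose shapes match the model from those that do not."""
    loadable, skipped = [], []
    for name, shape in checkpoint_shapes.items():
        if model_shapes.get(name) == shape:
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped

# Shapes in the Kinetics-400 checkpoint (400 classes)
checkpoint_shapes = {
    "cls_head.fc_cls.weight": (400, 2048),
    "cls_head.fc_cls.bias": (400,),
    "backbone.example.weight": (64, 3, 7, 7),  # hypothetical backbone entry
}
# Shapes in the model built with num_classes = 10
model_shapes = {
    "cls_head.fc_cls.weight": (10, 2048),
    "cls_head.fc_cls.bias": (10,),
    "backbone.example.weight": (64, 3, 7, 7),
}

loadable, skipped = split_by_shape(checkpoint_shapes, model_shapes)
print("loaded:", loadable)
print("skipped:", skipped)
```

The skipped classification layer keeps its fresh initialization and is learned during finetuning, while the matching backbone weights are reused.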

Third, after training, the recognizer is evaluated by the default evaluation. The results show that the recognizer achieves 80% top-1 accuracy and 100% top-5 accuracy on the val dataset.

Not bad!
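
To follow the validation accuracy across epochs without reading the whole log, the `Epoch(val)` lines can be parsed with a regular expression. This is a minimal sketch: the pattern is written against the line format shown in the log above, and `parse_val_lines` is a hypothetical helper, not an MMAction2 function.

```python
import re

# Matches lines such as:
# "... Epoch(val) [5][10]<tab>top1_acc: 0.1000, top5_acc: 0.8500, ..."
VAL_PATTERN = re.compile(
    r"Epoch\(val\) \[(\d+)\]\[\d+\]\s+top1_acc: ([\d.]+), top5_acc: ([\d.]+)"
)

def parse_val_lines(log_text):
    """Return a list of (epoch, top1_acc, top5_acc) tuples found in the log."""
    results = []
    for match in VAL_PATTERN.finditer(log_text):
        epoch, top1, top5 = match.groups()
        results.append((int(epoch), float(top1), float(top5)))
    return results

# Two validation lines copied from the training log above
log_text = (
    "2022-12-10 19:48:23,341 - mmaction - INFO - Epoch(val) [5][10]\t"
    "top1_acc: 0.1000, top5_acc: 0.8500, mean_class_accuracy: 0.1000\n"
    "2022-12-10 19:54:14,555 - mmaction - INFO - Epoch(val) [40][10]\t"
    "top1_acc: 0.8000, top5_acc: 1.0000, mean_class_accuracy: 0.8000\n"
)

records = parse_val_lines(log_text)
best = max(records, key=lambda record: record[1])
print(best)
```

Applied to the full log file in the work directory, the same helper would yield one record per validation run (every 5 epochs in this notebook).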

## Test the trained recognizer

After finetuning the recognizer, let's check the prediction results!

In [21]:
from mmaction.apis import single_gpu_test
from mmaction.datasets import build_dataloader
from mmcv.parallel import MMDataParallel

# Build a test dataloader
dataset = build_dataset(cfg.data.test, dict(test_mode=True))
data_loader = build_dataloader(
        dataset,
        videos_per_gpu=1,
        workers_per_gpu=cfg.data.workers_per_gpu,
        dist=False,
        shuffle=False)
model = MMDataParallel(model, device_ids=[0])
outputs = single_gpu_test(model, data_loader)

eval_config = cfg.evaluation
eval_config.pop('interval')
eval_res = dataset.evaluate(outputs, **eval_config)
for name, val in eval_res.items():
    print(f'{name}: {val:.04f}')

[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 20/20, 0.7 task/s, elapsed: 30s, ETA:     0s
Evaluating top_k_accuracy ...

top1_acc	0.8000
top5_acc	1.0000

Evaluating mean_class_accuracy ...

mean_acc	0.8000
top1_acc: 0.8000
top5_acc: 1.0000
mean_class_accuracy: 0.8000
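
For readers unfamiliar with the three reported metrics, the sketch below computes them on a tiny made-up example in plain Python. It is a simplified illustration, not MMAction2's implementation: the function names, scores and labels are all hypothetical, and edge cases (such as classes that appear only among the predictions) are not handled.

```python
import math

def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)

def mean_class_accuracy(scores, labels):
    """Average of the per-class top-1 accuracies over classes present in labels."""
    per_class = {}
    for row, label in zip(scores, labels):
        pred = max(range(len(row)), key=lambda i: row[i])
        correct, total = per_class.get(label, (0, 0))
        per_class[label] = (correct + (pred == label), total + 1)
    return sum(c / t for c, t in per_class.values()) / len(per_class)

# Made-up class scores for 4 videos and 3 classes
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]
labels = [0, 1, 2, 2]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
print(mean_class_accuracy(scores, labels))
```

In the notebook's results, top-1 accuracy and mean class accuracy coincide (0.8000), which is expected when every class has the same number of validation videos.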
