# Action recognition using TAO ActionRecognitionNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

Train Adapt Optimize (TAO) Toolkit  is a simple and easy-to-use Python based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">

## Sample prediction of ActionRecognitionNet
<img align="center" src="https://github.com/vpraveen-nv/model_card_images/blob/main/cv/notebook/action_recognition/ARNet_inference.png?raw=true" width="960">

## Learning Objectives

In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Train 3D RGB only for action recognition on the subset of [HMDB51](https://serre-lab.clps.brown.edu/resource/hmdb-a-large-human-motion-database/) dataset.
* Evaluate the trained model.
* Run Inference on the trained model.
* Export the trained model to a .etlt file for deployment to DeepStream.

At the end of this notebook, you will have a trained and optimized `action_recognition` model that you
may deploy via [DeepStream](https://developer.nvidia.com/deepstream-sdk).

## Table of Contents

This notebook shows an example usecase of ActionRecognitionNet using Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Inferences](#head-6)
7. [Deploy](#head-7)


## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key as mentioned in the model overview. Failing to do so, can lead to errors when trying to load them as pretrained models.

The TAO launcher uses docker containers under the hood, and **for our data and results directory to be visible to the docker, they need to be mapped**. The launcher can be configured using the config file `~/.tao_mounts.json`. Apart from the mounts, you can also configure additional options like the Environment Variables and amount of Shared Memory available to the TAO launcher. <br>

`IMPORTANT NOTE:` The code below creates a sample `~/.tao_mounts.json`  file. Here, we can map directories in which we save the data, specs, results and cache. You should configure it for your specific case so these directories are correctly visible to the docker container.


In [1]:
import os

# Please define this local project directory that needs to be mapped to the TAO docker session.
%env LOCAL_PROJECT_DIR=/Users/jaswanthngade/Documents/Github/TAO_setup

os.environ["HOST_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data", "actionrecognitionnet")
os.environ["HOST_RESULTS_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "actionrecognitionnet")

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=/path/to/local/tao-experiments/action_recognition_net
# The sample spec files are present in the same path as the downloaded samples.
os.environ["HOST_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)

# Set your encryption key, and use the same key for all commands
%env KEY = nvidia_tao

env: LOCAL_PROJECT_DIR=/Users/jaswanthngade/Documents/Github/TAO_setup
env: KEY=nvidia_tao


In [2]:
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR

In [3]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tlt_configs = {
   "Mounts":[
       # Mapping the data directory
       {
           "source": os.environ["LOCAL_PROJECT_DIR"],
           "destination": "/workspace/tao-experiments"
       },
       {
           "source": os.environ["HOST_DATA_DIR"],
           "destination": "/data"
       },
       {
           "source": os.environ["HOST_SPECS_DIR"],
           "destination": "/specs"
       },
       {
           "source": os.environ["HOST_RESULTS_DIR"],
           "destination": "/results"
       },
   ],
   "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
         }
   }
}
# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(tlt_configs, mfile, indent=4)

In [4]:
!cat ~/.tao_mounts.json

{
    "Mounts": [
        {
            "source": "/Users/jaswanthngade/Documents/Github/TAO_setup",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet",
            "destination": "/data"
        },
        {
            "source": "/Users/jaswanthngade/Documents/Github/TAO_setup/getting_started_v5.0.0/notebooks/tao_launcher_starter_kit/action_recognition_net/specs",
            "destination": "/specs"
        },
        {
            "source": "/Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet",
            "destination": "/results"
        }
    ],
    "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
        }
    }
}

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a python package distributed as a python wheel listed in PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends users to run the TAO launcher in a virtual env with python 3.6.9. You may follow the instruction in this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have setup virtualenvwrapper, please set the version of python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+

Once you have installed the pre-requisites, please log in to the docker registry nvcr.io by following the command below

```sh
docker login nvcr.io
```

You will be triggered to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.


In [5]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-tao

Collecting nvidia-tao
  Downloading nvidia_tao-5.1.0-py3-none-any.whl.metadata (8.0 kB)
Collecting chardet==3.0.4 (from nvidia-tao)
  Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m133.4/133.4 kB[0m [31m3.0 MB/s[0m eta [36m0:00:00[0m00:01[0m
[?25hCollecting docker==4.3.1 (from nvidia-tao)
  Downloading docker-4.3.1-py2.py3-none-any.whl (145 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m145.2/145.2 kB[0m [31m12.8 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting docker-pycreds==0.4.0 (from nvidia-tao)
  Downloading docker_pycreds-0.4.0-py2.py3-none-any.whl (9.0 kB)
Collecting idna==2.10 (from nvidia-tao)
  Downloading idna-2.10-py2.py3-none-any.whl (58 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m58.8/58.8 kB[0m [31m7.3 MB/s[0m eta [36m0:00:00[0m
Collecting six==1.15.0 (from nvidia-tao)
  Downloading six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting tabulate==0.

In [6]:
# View the versions of the TAO launcher
!tao info

Configuration of the TAO Toolkit Instance
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.1.0
published_date: 10/10/2023


## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

### 2.1 Prepare dataset

 We will be using the [HMDB51](https://serre-lab.clps.brown.edu/resource/hmdb-a-large-human-motion-database/) dataset for the tutorial. Download the HMDB51 dataset and unrar them firstly (We choose fall_floor/ride_bike for this tutorial): 

In [17]:
# install unrar
# NOTE: The following commands require `sudo`. You can run the command outside the notebook.
# !apt update
!brew install rar

[34m==>[0m [1mDownloading https://formulae.brew.sh/api/formula.jws.json[0m
######################################################################### 100.0%
[34m==>[0m [1mDownloading https://formulae.brew.sh/api/cask.jws.json[0m
######################################################################### 100.0%


In [22]:
# download the dataset and unrar the files
# !wget -P $HOST_DATA_DIR http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/hmdb51_org.rar
!mkdir -p $HOST_DATA_DIR/videos && unrar x -o+ $HOST_DATA_DIR/hmdb51_org.rar $HOST_DATA_DIR/videos
!mkdir -p $HOST_DATA_DIR/raw_data
!unrar x -o+ $HOST_DATA_DIR/videos/fall_floor.rar $HOST_DATA_DIR/raw_data
!unrar x -o+ $HOST_DATA_DIR/videos/ride_bike.rar $HOST_DATA_DIR/raw_data


UNRAR 6.24 freeware      Copyright (c) 1993-2023 Alexander Roshal


Extracting from /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/hmdb51_org.rar

Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/shoot_gun.rar       1  OK 
Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/sit.rar         3  OK 
Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/situp.rar       4  OK 
Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/smile.rar       5  OK 
Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/smoke.rar         7  OK 
Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/somersault.rar         9  OK 
Extracting  /Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/videos/stand.rar      1 11  OK

Clone the dataset process script

In [24]:
!if [ -d tao_toolkit_recipes ]; then rm -rf tao_toolkit_recipes; fi
!git clone https://github.com/NVIDIA-AI-IOT/tao_toolkit_recipes

Cloning into 'tao_toolkit_recipes'...
remote: Enumerating objects: 269, done.[K
remote: Counting objects: 100% (72/72), done.[K
remote: Compressing objects: 100% (41/41), done.[K
remote: Total 269 (delta 31), reused 44 (delta 19), pack-reused 197[K
Receiving objects: 100% (269/269), 741.55 KiB | 2.17 MiB/s, done.
Resolving deltas: 100% (91/91), done.


Install the dependency for data generator:

In [19]:
!pip3 install xmltodict opencv-python



Run the process script. 

In [25]:
!cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && bash ./preprocess_HMDB_RGB.sh $HOST_DATA_DIR/raw_data $HOST_DATA_DIR/processed_data

/Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/raw_data
/Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/processed_data
Preprocess fall_floor
f cnt: 55.0
f cnt: 51.0
f cnt: 64.0
f cnt: 34.0
f cnt: 110.0
f cnt: 63.0
f cnt: 72.0
f cnt: 49.0
f cnt: 48.0
f cnt: 74.0
f cnt: 72.0
f cnt: 47.0
f cnt: 72.0
f cnt: 79.0
f cnt: 47.0
f cnt: 55.0
f cnt: 77.0
f cnt: 60.0
f cnt: 79.0
f cnt: 57.0
f cnt: 79.0
f cnt: 49.0
f cnt: 50.0
f cnt: 48.0
f cnt: 59.0
f cnt: 86.0
f cnt: 50.0
f cnt: 43.0
f cnt: 49.0
f cnt: 46.0
f cnt: 79.0
f cnt: 54.0
f cnt: 63.0
f cnt: 148.0
f cnt: 49.0
f cnt: 50.0
f cnt: 73.0
f cnt: 54.0
f cnt: 48.0
f cnt: 50.0
f cnt: 48.0
f cnt: 74.0
f cnt: 50.0
f cnt: 74.0
f cnt: 45.0
f cnt: 47.0
f cnt: 78.0
f cnt: 48.0
f cnt: 51.0
f cnt: 49.0
f cnt: 49.0
f cnt: 49.0
f cnt: 55.0
f cnt: 49.0
f cnt: 49.0
f cnt: 51.0
f cnt: 49.0
f cnt: 51.0
f cnt: 78.0
f cnt: 50.0
f cnt: 48.0
f cnt: 47.0
f cnt: 56.0
f cnt: 76.0
f cnt: 79.0
f cnt: 56.0
f cnt: 49.0


We also provide scripts to preprocess optical flow dataset. The following cells for processing optical flow dataset is `Optional`.

`OPTIONAL:` Download the app based on NVOF SDK to generate optical flow. It is packaged with this notebook.

In [26]:
#!echo <passwd> | sudo -S apt install -y libfreeimage-dev

`OPTIONAL` Run the process script for HMDB. 

`IMPORTANT NOTE`: to run the `preprocess_HMDB.sh` generating optical flow, a Turing or Ampere above GPU is needed. 

In [27]:
#!cp ./AppOFCuda tao_toolkit_recipes/tao_action_recognition/data_generation/
#!cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && bash ./preprocess_HMDB.sh $HOST_DATA_DIR/raw_data $HOST_DATA_DIR/processed_data

In [28]:
# download the split files and unrar
!wget -P $HOST_DATA_DIR http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/test_train_splits.rar
!mkdir -p $HOST_DATA_DIR/splits && unrar x -o+ $HOST_DATA_DIR/test_train_splits.rar $HOST_DATA_DIR/splits

--2023-11-23 15:58:22--  http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/test_train_splits.rar
Resolving serre-lab.clps.brown.edu (serre-lab.clps.brown.edu)... 128.148.254.114
Connecting to serre-lab.clps.brown.edu (serre-lab.clps.brown.edu)|128.148.254.114|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/test_train_splits.rar [following]
--2023-11-23 15:58:22--  https://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/test_train_splits.rar
Connecting to serre-lab.clps.brown.edu (serre-lab.clps.brown.edu)|128.148.254.114|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 199521 (195K)
Saving to: ‘/Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/test_train_splits.rar.2’


2023-11-23 15:58:23 (626 KB/s) - ‘/Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/test_train_splits.rar.2’ saved [199521/199521]


UNRAR 6.24 f

In [29]:
# run split_HMDB to generate training split
!if [ -d $HOST_DATA_DIR/train ]; then rm -rf $HOST_DATA_DIR/train $HOST_DATA_DIR/test; fi
!cd tao_toolkit_recipes/tao_action_recognition/data_generation/ && python3 ./split_dataset.py $HOST_DATA_DIR/processed_data $HOST_DATA_DIR/splits/testTrainMulti_7030_splits $HOST_DATA_DIR/train  $HOST_DATA_DIR/test

Traceback (most recent call last):
  File "/Users/jaswanthngade/Documents/Github/TAO_setup/getting_started_v5.0.0/notebooks/tao_launcher_starter_kit/action_recognition_net/tao_toolkit_recipes/tao_action_recognition/data_generation/./split_dataset.py", line 46, in <module>
    with open(split_files, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/jaswanthngade/Documents/Github/TAO_setup/data/actionrecognitionnet/splits/testTrainMulti_7030_splits/.DS_Store_test_split1.txt'


In [30]:
# verify
!ls -l $HOST_DATA_DIR/train
!ls -l $HOST_DATA_DIR/train/ride_bike
!ls -l $HOST_DATA_DIR/test
!ls -l $HOST_DATA_DIR/test/ride_bike

total 0
drwxr-xr-x  72 jaswanthngade  staff  2304 Nov 23 15:58 [34mride_bike[m[m
total 0
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m#437_How_To_Ride_A_Bike_ride_bike_f_cm_np1_ba_med_0[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m#437_How_To_Ride_A_Bike_ride_bike_f_cm_np1_ba_med_1[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m#437_How_To_Ride_A_Bike_ride_bike_f_cm_np1_ba_med_3[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m#437_How_To_Ride_A_Bike_ride_bike_f_cm_np1_fr_med_2[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m1989_Tour_de_France_Final_Time_Trial_ride_bike_f_cm_np1_ba_med_0[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m1989_Tour_de_France_Final_Time_Trial_ride_bike_f_cm_np1_ba_med_1[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m1989_Tour_de_France_Final_Time_Trial_ride_bike_f_cm_np1_ba_med_2[m[m
drwxr-xr-x  3 jaswanthngade  staff  96 Nov 23 15:57 [34m1989_To

### 2.2 Download pretrained model from NGC

We will use NGC CLI to get the pre-trained models. For more details, go to https://ngc.nvidia.com and click the SETUP on the navigation bar.

In [31]:
# Installing NGC CLI on the local machine.
## Download and install
import os
%env CLI=ngccli_cat_linux.zip
!mkdir -p $HOST_RESULTS_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $HOST_RESULTS_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $HOST_RESULTS_DIR/ngccli
!unzip -u "$HOST_RESULTS_DIR/ngccli/$CLI" -d $HOST_RESULTS_DIR/ngccli/
!rm $HOST_RESULTS_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("HOST_RESULTS_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_cat_linux.zip
zsh:1: no matches found: /Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/ngccli/*
--2023-11-23 15:58:24--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 18.155.173.28, 18.155.173.81, 18.155.173.22, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|18.155.173.28|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 45879114 (44M) [application/zip]
Saving to: ‘/Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/ngccli/ngccli_cat_linux.zip’


2023-11-23 15:58:28 (13.5 MB/s) - ‘/Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/ngccli/ngccli_cat_linux.zip’ saved [45879114/45879114]

Archive:  /Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/ngccli/ngccli_cat_linux.zip
   creating: /Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/ngccli/ngc-cli/
  inflating: /Users/jaswanthngade/Documents/Github

In [32]:
!ngc registry model list nvidia/tao/actionrecognitionnet:*

zsh:1: no matches found: nvidia/tao/actionrecognitionnet:*


In [33]:
!mkdir -p $HOST_RESULTS_DIR/pretrained

In [34]:
# Pull pretrained model from NGC 
!ngc registry model download-version "nvidia/tao/actionrecognitionnet:trainable_v1.0" --dest $HOST_RESULTS_DIR/pretrained
# Download the optical flow model from NGC
# !ngc registry model download-version "nvidia/tao/actionrecognitionnet:trainable_v2.0" --dest $HOST_RESULTS_DIR/pretrained

Getting files to download...
[?25l[32m⠋[0m [36m━━[0m • [32m0…[0m • [36mRemaining:[0m [36m…[0m • [31m?[0m • [33mElapsed:[0m [33m0…[0m • [34mTotal: 2 - Completed: 0 - Failed: 0[0m
[2K[1A[2K[32m⠙[0m [36m━━[0m • [32m0…[0m • [36mRemaining:[0m [36m…[0m • [31m?[0m • [33mElapsed:[0m [33m0…[0m • [34mTotal: 2 - Completed: 0 - Failed: 0[0m
[2K[1A[2K[32m⠹[0m [36m━━[0m • [32m0…[0m • [36mRemaining:[0m [36m…[0m • [31m?[0m • [33mElapsed:[0m [33m0…[0m • [34mTotal: 2 - Completed: 0 - Failed: 0[0m
[2K[1A[2K[32m⠼[0m [36m━━[0m • [32m0…[0m • [36mRemaining:[0m [36m…[0m • [31m?[0m • [33mElapsed:[0m [33m0…[0m • [34mTotal: 2 - Completed: 0 - Failed: 0[0m
[2K[1A[2K[32m⠴[0m [36m━━[0m • [32m0…[0m • [36mRemaining:[0m [36m…[0m • [31m?[0m • [33mElapsed:[0m [33m0…[0m • [34mTotal: 2 - Completed: 0 - Failed: 0[0m
[2K[1A[2K[32m⠦[0m [36m━━[0m • [32m0…[0m • [36mRemaining:[0m [36m…[0m • [31m?[0m • [33mElaps

In [35]:
print("Check that model is downloaded into dir.")
!ls -l $HOST_RESULTS_DIR/pretrained/actionrecognitionnet_vtrainable_v1.0

Check that model is downloaded into dir.
total 884944
-rw-------  1 jaswanthngade  staff  114846245 Nov 23 15:59 resnet18_2d_rgb_hmdb5_32.tlt
-rw-------  1 jaswanthngade  staff  332019621 Nov 23 15:59 resnet18_3d_rgb_hmdb5_32.tlt


## 3. Provide training specification <a class="anchor" id="head-2"></a>

We provide specification files to configure the training parameters including:

* model: configure the model setting
    * model_type: type of model, rgb/of/joint
    * backbone: resnet18/34/50/101/152 
    * rgb_seq_length: length of RGB input sequence
    * input_type: 2d/3d
    * sample_strategy: consecutive
    * dropout_ratio: probability to drop the hidden units
* train: configure the training hyperparameters
    * optim_config
    * epochs
    * checkpoint_interval
* dataset: configure the dataset and augmentation methods
    * train_dataset_dir
    * val_dataset_dir
    * label_map: map the class label to id
    * output_shape
    * batch_size
    * workers: number of workers to do data loading
    * clips_per_video: number of clips to be sampled from single video
    * augmentation_config

Please refer to the TAO documentation about ActionRecognitionNet to get all the parameters that are configurable.

In [36]:
!cat $HOST_SPECS_DIR/experiment_rgb_3d_finetune.yaml

results_dir: /results/rgb_3d_ptm
encryption_key: nvidia_tao
model:
  model_type: rgb
  backbone: resnet_18
  rgb_seq_length: 3
  input_height: 224
  input_width: 224
  input_type: 3d
  sample_strategy: consecutive
  dropout_ratio: 0.0
dataset:
  train_dataset_dir: /data/train
  val_dataset_dir: /data/test
  label_map:
    fall_floor: 0
    ride_bike: 1
  batch_size: 32
  workers: 8
  clips_per_video: 5
  augmentation_config:
    train_crop_type: no_crop
    horizontal_flip_prob: 0.5
    rgb_input_mean: [0.5]
    rgb_input_std: [0.5]
    val_center_crop: False
train:
  optim:
    lr: 0.001
    momentum: 0.9
    weight_decay: 0.0001
    lr_scheduler: MultiStep
    lr_steps: [5, 15, 20]
    lr_decay: 0.1
  num_epochs: 20
  checkpoint_interval: 1
evaluate:
  checkpoint: "??"
  test_dataset_dir: "??"
inference:
  checkpoint: "??"
  inference_dataset_dir: "??"
export:
  checkpoint: "??"


## 4. Run TAO training <a class="anchor" id="head-3"></a>
* Provide the sample spec file and the output directory location for models
* WARNING: training will take several hours or one day to complete

In [37]:
# NOTE: The following paths are set from the perspective of the TAO Docker.

# The data is saved here
%env DATA_DIR = /data
%env SPECS_DIR = /specs
%env RESULTS_DIR = /results

env: DATA_DIR=/data
env: SPECS_DIR=/specs
env: RESULTS_DIR=/results


### 4.1 Train 3D model:

We provide pretrained RGB-only model trained on HMDB5 dataset. With the pretrained model, we can even get better accuracy with less epochs.

`KNOWN ISSUE`: 
- 1) The training log will be corrupted by pytorch warning in the notebook:

     `[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)` 
     
     To see the full log in std out, please run the command in terminal. 
- 2) "=" in the checkpoint file name should removed before using the checkpoint in command.

In [38]:
print("Train RGB only model with PTM")
!tao model action_recognition train \
                  -e $SPECS_DIR/experiment_rgb_3d_finetune.yaml \
                  -k $KEY \
                  results_dir=$RESULTS_DIR/rgb_3d_ptm \
                  model.rgb_pretrained_model_path=$RESULTS_DIR/pretrained/actionrecognitionnet_vtrainable_v1.0/resnet18_3d_rgb_hmdb5_32.tlt  \
                  model.rgb_pretrained_num_classes=5

Train RGB only model with PTM
2023-11-23 15:59:44,478 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-11-23 15:59:44,514 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt
2023-11-23 15:59:46,837 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 296: The required docker doesn't exist locally/the manifest has changed. Pulling a new docker.
2023-11-23 15:59:46,837 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 155: Pulling the required container. This may take several minutes if you're doing this for the first time. Please wait here.
...
Pulling from repository: nvcr.io/nvidia/tao/tao-toolkit
2023-11-23 16:15:25,174 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 165: Container pull complete.
Docker will run the commands as root. If you would like to retain your
local host permissions, pleas

In [39]:
## Training command for multi-gpu training. We can define the number of gpus and specify which GPU's are to be used by setting the `train.gpu_ids` parameter.
## The following command will trigger multi-gpu training on gpu 0 and gpu 1.
# !tao model action_recognition train \
#                   -e $SPECS_DIR/experiment_rgb_3d_finetune.yaml \
#                   -k $KEY \
#                   train.gpu_ids=[0,1] \
#                   results_dir=$RESULTS_DIR/rgb_3d_ptm \
#                   model.rgb_pretrained_model_path=$RESULTS_DIR/pretrained/actionrecognitionnet_vtrainable_v1.0/resnet18_3d_rgb_hmdb5_32.tlt  \
#                   model.rgb_pretrained_num_classes=5

In [40]:
print('Encrypted checkpoints:')
print('---------------------')
!ls -ltrh $HOST_RESULTS_DIR/rgb_3d_ptm/train

Encrypted checkpoints:
---------------------
ls: /Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/rgb_3d_ptm/train: No such file or directory


In [41]:
print('Rename a model: Note that the training is not deterministic, so you may change the model name accordingly.')
print('---------------------')
# NOTE: The following command may require `sudo`. You can run the command outside the notebook.
!find $HOST_RESULTS_DIR/rgb_3d_ptm/train -name *epoch=019* | xargs realpath | xargs -I {} mv {} $HOST_RESULTS_DIR/rgb_3d_ptm/train/rgb_only_model.tlt 
!ls -ltrh $HOST_RESULTS_DIR/rgb_3d_ptm/train/rgb_only_model.tlt

Rename a model: Note that the training is not deterministic, so you may change the model name accordingly.
---------------------
zsh:1: no matches found: *epoch=019*
ls: /Users/jaswanthngade/Documents/Github/TAO_setup/actionrecognitionnet/rgb_3d_ptm/train/rgb_only_model.tlt: No such file or directory


### `OPTIONAL` 4.2 Train optical flow only model

`Important Note` The following cells are using optical flow dataset. 

We will train a 3D OF-only model

In [42]:
# print("Train 3D OF-only model")
# !tao model action_recognition train \
#                   -e $SPECS_DIR/experiment_of_3d_finetune.yaml \
#                   -k $KEY \
#                   results_dir=$RESULTS_DIR/of_3d_ptm \
#                   dataset.train_dataset_dir=$DATA_DIR/train \
#                   dataset.val_dataset_dir=$DATA_DIR/test \
#                   model.of_pretrained_model_path=$RESULTS_DIR/pretrained/actionrecognitionnet_vtrainable_v2.0/resnet18_3d_of_hmdb5_32_a100.tlt  \
#                   model.of_pretrained_num_classes=5

In [43]:
# print("To resume training from a checkpoint, set the resume_training_checkpoint_path option to be the .tlt you want to resume from")
# print("remember to remove the `=` in the checkpoint's file name")
# !tao model action_recognition train \
#                   -e $SPECS_DIR/experiment_of_3d_finetune.yaml \
#                   -k $KEY \
#                   results_dir=$RESULTS_DIR/of_3d_ptm \
#                   train.resume_training_checkpoint_path=

In [44]:
# print('Encrypted checkpoints:')
# print('---------------------')
# !ls -ltrh $HOST_RESULTS_DIR/of_3d_ptm/train

In [45]:
# print('Rename a model: ')
# print('---------------------')
# # NOTE: The following command may require `sudo`. You can run the command outside the notebook.
# !find $HOST_RESULTS_DIR/of_3d_ptm/train -name *epoch=019* | xargs realpath | xargs -I {} mv {} $HOST_RESULTS_DIR/of_3d_ptm/train/of_only_model.tlt 
# !ls -ltrh $HOST_RESULTS_DIR/of_3d_ptm/train/of_only_model.tlt

## 5. Evaluate trained models <a class="anchor" id="head-4"></a>

We provide two different sample strategy to evaluate the pretrained model on video clips.

* `center` mode: pick up the middle frames of a sequence to do inference. For example, if the model requires 32 frames as input and a video clip has 128 frames, then we will choose the frames from index 48 to index 79 to do the inference. 
* `conv` mode: convolutionly sample 10 sequences out of a single video and do inference. The final results are averaged.

Evaluate RGB model trained with PTM

In [46]:
!tao model action_recognition evaluate \
                    -e $SPECS_DIR/experiment_rgb_3d_finetune.yaml \
                    -k $KEY \
                    results_dir=$RESULTS_DIR/rgb_3d_ptm \
                    dataset.workers=0 \
                    evaluate.checkpoint=$RESULTS_DIR/rgb_3d_ptm/train/rgb_only_model.tlt  \
                    evaluate.batch_size=1 \
                    evaluate.test_dataset_dir=$DATA_DIR/test \
                    evaluate.video_eval_mode=center

2023-11-23 16:15:30,578 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-11-23 16:15:30,761 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/Users/jaswanthngade/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-11-23 16:15:30,781 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("could not select device driver "" with capabilities: [[gpu]]")


`Optional:` Evaluate OF model

In [47]:
# !tao model action_recognition evaluate \
#                     -e $SPECS_DIR/experiment_of_3d_finetune.yaml \
#                     -k $KEY \
#                     results_dir=$RESULTS_DIR/of_3d_ptm \
#                     dataset.workers=0 \
#                     evaluate.checkpoint=$RESULTS_DIR/of_3d_ptm/train/of_only_model.tlt  \
#                     evaluate.batch_size=1 \
#                     evaluate.test_dataset_dir=$DATA_DIR/test \
#                     evaluate.video_eval_mode=center

## 6. Inferences <a class="anchor" id="head-5"></a>
In this section, we run the action recognition inference tool to generate inferences with the trained RGB models and print the results. 

There are also two modes for inference just like evaluation: `center` mode and `conv` mode. And the final output will show each input sequence label in the videos like:
`[video_sample_path] [labels list for sequences in the video sample]`

In [48]:
!tao model action_recognition inference \
                    -e $SPECS_DIR/experiment_rgb_3d_finetune.yaml \
                    -k $KEY \
                    results_dir=$RESULTS_DIR/rgb_3d_ptm \
                    dataset.workers=0 \
                    inference.checkpoint=$RESULTS_DIR/rgb_3d_ptm/train/rgb_only_model.tlt \
                    inference.inference_dataset_dir=$DATA_DIR/test/ride_bike \
                    inference.video_inf_mode=center

2023-11-23 16:15:31,653 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-11-23 16:15:31,822 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/Users/jaswanthngade/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-11-23 16:15:31,832 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("could not select device driver "" with capabilities: [[gpu]]")


`Optional:` Inference with OF-only model

In [49]:
# !tao model action_recognition inference \
#                     -e $SPECS_DIR/experiment_of_3d_finetune.yaml \
#                     -k $KEY \
#                     results_dir=$RESULTS_DIR/of_3d_ptm \
#                     dataset.workers=0 \
#                     inference.checkpoint=$RESULTS_DIR/of_3d_ptm/train/of_only_model.tlt \
#                     inference.inference_dataset_dir=$DATA_DIR/test/ride_bike \
#                     inference.video_inf_mode=center

## 7. Deploy! <a class="anchor" id="head-6"></a>

In [50]:
!mkdir -p $HOST_RESULTS_DIR/export

In [51]:
# Export the RGB model to encrypted ONNX model
!tao model action_recognition export \
                   -e $SPECS_DIR/experiment_rgb_3d_finetune.yaml \
                   -k $KEY \
                   results_dir=$RESULTS_DIR/rgb_3d_ptm \
                   export.checkpoint=$RESULTS_DIR/rgb_3d_ptm/train/rgb_only_model.tlt \
                   export.onnx_file=$RESULTS_DIR/export/rgb_resnet18_3.onnx

2023-11-23 16:15:32,606 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-11-23 16:15:32,767 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-pyt
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/Users/jaswanthngade/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-11-23 16:15:32,778 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("could not select device driver "" with capabilities: [[gpu]]")


In [52]:
print('Exported model:')
print('------------')
!ls -lth $HOST_RESULTS_DIR/export

Exported model:
------------
total 0


This notebook has come to an end. You may continue by deploying this RGB model to [DeepStream](https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_3D_Action.html)

`Optional` Export OF-Only model

In [53]:
# # Export the OF model to encrypted ONNX model
# !tao model action_recognition export \
#                    -e $SPECS_DIR/experiment_of_3d_finetune.yaml \
#                    -k $KEY \
#                    results_dir=$RESULTS_DIR/of_3d_ptm \
#                    export.checkpoint=$RESULTS_DIR/of_3d_ptm/of_only_model.tlt\
#                    export.onnx_file=$RESULTS_DIR/export/of_resnet18_3.onnx

OF model is not supported in DeepStream. But you can play with stand-alone TensorRT inference in [tao_toolkit_recipes](https://github.com/NVIDIA-AI-IOT/tao_toolkit_recipes/tree/main/tao_action_recognition/tensorrt_inference) 