# Skeleton-based action recognition using TAO PoseClassificationNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is re-trained for use on a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with your own data.

<img align="center" src="https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">


## Learning Objectives

In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Train a model for skeleton-based action recognition on the [Kinetics](https://deepmind.com/research/open-source/kinetics) dataset.
* Evaluate the trained model.
* Run Inference on the trained model.
* Export the trained model to an .onnx file (encrypted ONNX model) for deployment to DeepStream or TensorRT.
* Convert the pose data from [deepstream-bodypose-3d](https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/deepstream-bodypose-3d) to skeleton arrays for inference.

At the end of this notebook, you will have generated a trained and optimized `PoseClassification` model, 
which you may deploy with this [end-to-end sample](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps) with Triton.

## Table of Contents

This notebook shows an example use case of PoseClassificationNet using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Inferences](#head-6)
7. [Deploy](#head-7)
8. [Convert pose data](#head-8)


## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

The TAO launcher uses Docker containers under the hood, and **for our data and results directories to be visible to the container, they need to be mapped**. The launcher can be configured using the config file `~/.tao_mounts.json`. Apart from the mounts, you can also configure additional options such as environment variables and the amount of shared memory available to the TAO launcher. <br>

`IMPORTANT NOTE:` The code below creates a sample `~/.tao_mounts.json` file. Here, we map the directories in which we save the data, specs, results and cache. You should configure it for your specific case so that these directories are correctly visible to the Docker container.


In [None]:
import os

# Please define this local project directory that needs to be mapped to the TAO docker session.
%env LOCAL_PROJECT_DIR=/path/to/local/tao-experiments

os.environ["HOST_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data", "poseclassificationnet")
os.environ["HOST_RESULTS_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "poseclassificationnet")

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=/path/to/local/tao-experiments/pose_classification_net
# The sample spec files are present in the same path as the downloaded samples.
os.environ["HOST_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
os.environ["PROJECT_DIR"]=FIXME

# Set your encryption key, and use the same key for all commands
%env KEY = nvidia_tao

In [None]:
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR

In [None]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tlt_configs = {
   "Mounts":[
       # Mapping the local project, data, specs, and results directories
       {
           "source": os.environ["LOCAL_PROJECT_DIR"],
           "destination": "/workspace/tao-experiments"
       },
       {
           "source": os.environ["HOST_DATA_DIR"],
           "destination": "/data"
       },
       {
           "source": os.environ["HOST_SPECS_DIR"],
           "destination": "/specs"
       },
       {
           "source": os.environ["HOST_RESULTS_DIR"],
           "destination": "/results"
       }
   ],
   "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
         }
   }
}
# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(tlt_configs, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json
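
Before launching any TAO task, it can help to confirm that every `source` directory listed in the mounts file actually exists on the host. The following is a minimal sketch of such a check. The helper name `missing_mount_sources` is hypothetical and not part of the TAO launcher, and the demonstration runs against a temporary file rather than your real `~/.tao_mounts.json`.

```python
import json
import os
import tempfile


def missing_mount_sources(mounts_path):
    """Return the 'source' paths in a mounts file that do not exist on the host."""
    with open(mounts_path, "r") as mfile:
        config = json.load(mfile)
    return [
        mount["source"]
        for mount in config.get("Mounts", [])
        if not os.path.isdir(mount["source"])
    ]


# Self-contained demonstration on a temporary mounts file (not ~/.tao_mounts.json).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_mounts.json")
    demo_config = {
        "Mounts": [
            {"source": tmp_dir, "destination": "/data"},
            {"source": os.path.join(tmp_dir, "does_not_exist"), "destination": "/results"},
        ]
    }
    with open(demo_path, "w") as mfile:
        json.dump(demo_config, mfile, indent=4)
    demo_missing = missing_mount_sources(demo_path)
    print(demo_missing)
```

To check your own configuration, you could call `missing_mount_sources(os.path.expanduser("~/.tao_mounts.json"))`; an empty list means every source directory exists.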

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a wheel on PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of Python to be used in the virtual env with the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure the following software requirements are met:
* python >=3.7, <=3.10.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
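
As a quick sanity check of the Python requirement in the list above, the interpreter version can be compared against the stated bounds. This is a minimal sketch; the helper name `python_version_ok` is an assumption for illustration, and the bounds are copied from the `python >=3.7, <=3.10.x` entry.

```python
import sys

# Bounds taken from the requirement list above (python >=3.7, <=3.10.x).
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 10)


def python_version_ok(version_info=None):
    """Return True if the (major, minor) version lies within the listed range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}; "
      f"within listed range: {python_version_ok()}")
```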

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.


In [None]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-tao

In [None]:
# View the versions of the TAO launcher
!tao info

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>
We will be using the [Kinetics](https://deepmind.com/research/open-source/kinetics) dataset for the tutorial. Download the pre-processed Kinetics-Skeleton data [here](https://drive.google.com/uc?id=1dmzCRQsFXJ18BlXj1G9sbDnsclXIdDdR) and extract it first:

In [None]:
# download the dataset.
!pip3 install -U gdown
!gdown https://drive.google.com/uc?id=1dmzCRQsFXJ18BlXj1G9sbDnsclXIdDdR -O $HOST_DATA_DIR/st-gcn-processed-data.zip

In [None]:
# extract the files
!unzip -o $HOST_DATA_DIR/st-gcn-processed-data.zip -d $HOST_DATA_DIR
!mv $HOST_DATA_DIR/data/Kinetics/kinetics-skeleton $HOST_DATA_DIR/kinetics
!rm -r $HOST_DATA_DIR/data
!rm $HOST_DATA_DIR/st-gcn-processed-data.zip

In [None]:
# verify
!ls -l $HOST_DATA_DIR/kinetics

In [None]:
# Install required dependencies from the notebook.
!pip3 install Cython==0.29.36
!pip3 install -r $PROJECT_DIR/deps/requirements-pip.txt

In [None]:
# select actions
import os
import pickle
import numpy as np

data_dir = os.path.join(os.environ["HOST_DATA_DIR"], "kinetics")

# front_raises: 134
# pull_ups: 255
# clean_and_jerk: 59
# presenting_weather_forecast: 254
# deadlifting: 88
selected_actions = {
    134: 0,
    255: 1,
    59: 2,
    254: 3,
    88: 4
}

def select_actions(selected_actions, data_dir, split_name):
    """Select a subset of actions and their corresponding labels.
    
    Args:
        selected_actions (dict): Map from selected class IDs to new class IDs.
        data_dir (str): Path to the directory of data arrays (.npy) and labels (.pkl).
        split_name (str): Name of the split to be processed, e.g., "train" and "val".
        
    Returns:
        No explicit returns
    """
    data_path = os.path.join(data_dir, f"{split_name}_data.npy")
    label_path = os.path.join(data_dir, f"{split_name}_label.pkl")

    data_array = np.load(file=data_path)
    with open(label_path, "rb") as label_file:
        labels = pickle.load(label_file)

    assert(len(labels) == 2)
    assert(data_array.shape[0] == len(labels[0]))
    assert(len(labels[0]) == len(labels[1]))

    print(f"No. total samples for {split_name}: {data_array.shape[0]}")

    selected_indices = []
    for i in range(data_array.shape[0]):
        if labels[1][i] in selected_actions.keys():
            selected_indices.append(i)

    data_array = data_array[selected_indices, :, :, :, :]
    selected_sample_names = [labels[0][x] for x in selected_indices]
    selected_labels = [selected_actions[labels[1][x]] for x in selected_indices]
    labels = (selected_sample_names, selected_labels)

    print(f"No. selected samples for {split_name}: {data_array.shape[0]}")

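    # NOTE: the original files are overwritten in place. Re-running this function on
    # the already filtered files selects no samples, because the remapped labels
    # are no longer keys of `selected_actions`.
    np.save(file=data_path, arr=data_array, allow_pickle=False)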
    with open(label_path, "wb") as label_file:
        pickle.dump(labels, label_file, protocol=4)

select_actions(selected_actions, data_dir, "train")
select_actions(selected_actions, data_dir, "val")
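
To see the selection and relabeling logic in isolation, the following minimal sketch applies the same idea to a small synthetic array instead of the real Kinetics files. All `toy_*` names and values are made up for illustration, and the interpretation of the five axes as (samples, channels, frames, joints, persons) is an assumption; the code above only confirms that the data array is 5-dimensional.

```python
import numpy as np

# Toy stand-ins for the real arrays: 6 samples with a 5-D layout.
# The axis meaning (samples, channels, frames, joints, persons) is an assumption.
toy_data = np.arange(6 * 3 * 4 * 5 * 2, dtype=np.float32).reshape(6, 3, 4, 5, 2)
toy_names = [f"sample_{i}" for i in range(6)]
toy_class_ids = [134, 7, 255, 88, 300, 134]
toy_labels = (toy_names, toy_class_ids)

# Same mapping idea as `selected_actions`: original class ID -> new class ID.
toy_selected_actions = {134: 0, 255: 1, 88: 4}

# Keep only the samples whose original class ID is in the mapping.
keep = [i for i, class_id in enumerate(toy_labels[1]) if class_id in toy_selected_actions]
subset_data = toy_data[keep]
subset_names = [toy_labels[0][i] for i in keep]
subset_labels = [toy_selected_actions[toy_labels[1][i]] for i in keep]

print(subset_data.shape)
print(subset_names)
print(subset_labels)
```

Samples with class IDs outside the mapping (7 and 300 here) are dropped, and the remaining labels are rewritten to the new, compact class IDs.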

We also provide scripts to process the NVIDIA dataset generated by [deepstream-bodypose-3d](https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/deepstream-bodypose-3d). The following cells for processing the NVIDIA dataset are `Optional`.

`OPTIONAL:` Download the NVIDIA dataset and extract the files.

In [None]:
# # Download the dataset
# !pip3 install -U gdown
# !gdown https://drive.google.com/uc?id=1GhSt53-7MlFfauEZ2YkuzOaZVNIGo_c- -O $HOST_DATA_DIR/data_3dbp_nvidia.zip

In [None]:
# # Extract the files
# !mkdir -p $HOST_DATA_DIR/nvidia
# !unzip $HOST_DATA_DIR/data_3dbp_nvidia.zip -d $HOST_DATA_DIR/nvidia
# !rm $HOST_DATA_DIR/data_3dbp_nvidia.zip

In [None]:
# # Verify
# !ls -l $HOST_DATA_DIR/nvidia

`OPTIONAL:` Download the pretrained model from NGC. We will use the NGC CLI to get the data and model. For more details, go to https://ngc.nvidia.com and click SETUP in the navigation bar.

In [None]:
# # Installing NGC CLI on the local machine.
# ## Download and install
# import os
# import platform

# if platform.machine() == "x86_64":
#     os.environ["CLI"]="ngccli_linux.zip"
# else:
#     os.environ["CLI"]="ngccli_arm64.zip"

# # Remove any previously existing CLI installations
# !rm -rf $HOST_RESULTS_DIR/ngccli/*
# !wget "https://ngc.nvidia.com/downloads/$CLI" -P $HOST_RESULTS_DIR/ngccli
# !unzip -u "$HOST_RESULTS_DIR/ngccli/$CLI" -d $HOST_RESULTS_DIR/ngccli/
# !rm $HOST_RESULTS_DIR/ngccli/*.zip 
# os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("HOST_RESULTS_DIR", ""), os.getenv("PATH", ""))

In [None]:
# !ngc registry model list nvidia/tao/poseclassificationnet:*

In [None]:
# !mkdir -p $HOST_RESULTS_DIR/pretrained

In [None]:
# # Pull pretrained model from NGC 
# !ngc registry model download-version "nvidia/tao/poseclassificationnet:trainable_v1.0" --dest $HOST_RESULTS_DIR/pretrained

In [None]:
# print("Check that model is downloaded into dir.")
# !ls -l $HOST_RESULTS_DIR/pretrained/poseclassificationnet_vtrainable_v1.0

## 3. Provide training specification <a class="anchor" id="head-3"></a>

We provide specification files to configure the training parameters including:

* model: configure the model setting
    * model_type: type of model, ST-GCN
    * pretrained_model_path: path for the input model
    * input_channels: number of input channels
    * dropout: probability to drop the hidden units
    * graph_layout: type of graph layout, nvidia/openpose/human3.6m/ntu-rgb+d/ntu_edge/coco
    * graph_strategy: type of graph strategy, uniform/distance/spatial
    * edge_importance_weighting: enabling edge importance weighting
* dataset: configure the dataset and augmentation methods
    * train_dataset: paths for the training data and label file
    * val_dataset: paths for the validation data and label file
    * num_classes: number of classes
    * label_map: map from labels to class IDs
    * random_choose: enabling randomly choosing a portion of the input sequence
    * random_move: enabling randomly moving the input sequence
    * window_size: length of the output sequence
    * batch_size: number of arrays in 1 batch
    * num_workers: number of workers to do data loading
* train: configure the training hyperparameters
    * optim: configure optimizer
    * num_epochs: number of epochs
    * checkpoint_interval: interval at which model checkpoints are saved
    * grad_clip: gradient clipping setting

Please refer to the TAO documentation about PoseClassificationNet to get all the parameters that are configurable.
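
The commands in this notebook override individual spec fields on the command line with dotted keys such as `train.gpu_ids=[0,1]` or `model.pretrained_model_path=...`. The sketch below illustrates how a nested spec corresponds to such dotted keys. The helper `flatten_spec` is hypothetical, and the values in `example_spec` are placeholders for illustration, not the defaults of the sample spec file.

```python
def flatten_spec(spec, prefix=""):
    """Flatten a nested dict into a {dotted_key: value} dict."""
    flat = {}
    for key, value in spec.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_spec(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Placeholder values for illustration only; they are not the defaults of the sample spec.
example_spec = {
    "model": {"model_type": "ST-GCN", "graph_layout": "openpose", "graph_strategy": "spatial"},
    "dataset": {"num_classes": 5, "batch_size": 16},
    "train": {"num_epochs": 10, "checkpoint_interval": 5},
}

overrides = [f"{key}={value}" for key, value in flatten_spec(example_spec).items()]
for override in overrides:
    print(override)
```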

In [None]:
!cat $HOST_SPECS_DIR/experiment_kinetics.yaml

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models.
* WARNING: Training may take anywhere from several hours to a full day to complete.

In [None]:
# NOTE: The following paths are set from the perspective of the TAO Docker.

# The data is saved here
%env DATA_DIR = /data
%env SPECS_DIR = /specs
%env RESULTS_DIR = /results
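
Paths passed to `tao` commands are resolved inside the container, while the `ls` and `cp` cells use host paths. The following minimal sketch shows how a host path could be translated to its container counterpart using the same structure as the `Mounts` list in `~/.tao_mounts.json`. The helper `to_container_path` is hypothetical, the host paths are placeholders, and choosing the longest matching source is a design choice of this sketch (a path under several mounts is reachable through each of them).

```python
import posixpath


def to_container_path(host_path, mounts):
    """Translate a host path to the path seen inside the container.

    Picks the mount with the longest matching source prefix; returns None
    if the path is not under any mounted directory.
    """
    best = None
    for mount in mounts:
        source = mount["source"].rstrip("/")
        if host_path == source or host_path.startswith(source + "/"):
            if best is None or len(source) > len(best["source"].rstrip("/")):
                best = mount
    if best is None:
        return None
    relative = host_path[len(best["source"].rstrip("/")):].lstrip("/")
    return posixpath.join(best["destination"], relative) if relative else best["destination"]


# Example mounts mirroring the structure of ~/.tao_mounts.json; the host paths are placeholders.
example_mounts = [
    {"source": "/home/user/tao-experiments", "destination": "/workspace/tao-experiments"},
    {"source": "/home/user/tao-experiments/data/poseclassificationnet", "destination": "/data"},
    {"source": "/home/user/tao-experiments/poseclassificationnet", "destination": "/results"},
]

print(to_container_path("/home/user/tao-experiments/data/poseclassificationnet/kinetics/val_data.npy", example_mounts))
print(to_container_path("/home/user/tao-experiments/poseclassificationnet/kinetics/train", example_mounts))
print(to_container_path("/tmp/elsewhere", example_mounts))
```

A result of `None` indicates that the path is not visible to the container and that the mounts file needs another entry.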

### 4.1 Train Kinetics model

We will train a Kinetics model from scratch.

In [None]:
print("Train model")
!tao model pose_classification train \
                  -e $SPECS_DIR/experiment_kinetics.yaml \
                  results_dir=$RESULTS_DIR/kinetics \
                  encryption_key=$KEY

In [None]:
# print("Train model using multiple (2) GPUs")
# !tao model pose_classification train \
#                   -e $SPECS_DIR/experiment_kinetics.yaml \
#                   results_dir=$RESULTS_DIR/kinetics \
#                   encryption_key=$KEY \
#                   train.gpu_ids=[0,1]

In [None]:
print('Encrypted checkpoints:')
print('---------------------')
!ls -ltrh $HOST_RESULTS_DIR/kinetics/train

In [None]:
# You can set NUM_EPOCH to the epoch corresponding to any saved checkpoint
# %env NUM_EPOCH=029

# Get the name of the checkpoint corresponding to your set epoch
# tmp=!ls $HOST_RESULTS_DIR/kinetics/train/*.pth | grep epoch_$NUM_EPOCH
# %env CHECKPOINT={tmp[0]}

# Or get the latest checkpoint
os.environ["CHECKPOINT"] = os.path.join(os.getenv("HOST_RESULTS_DIR"), "kinetics/train/pc_model_latest.pth")

print('Rename a trained model: ')
print('---------------------')
!cp $CHECKPOINT $HOST_RESULTS_DIR/kinetics/train/kinetics_model.tlt
!ls -ltrh $HOST_RESULTS_DIR/kinetics/train/kinetics_model.tlt
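
The commented-out lines in the cell above select a checkpoint by matching `epoch_$NUM_EPOCH` in the file name. A pure-Python version of that lookup might look like the sketch below. The helper `find_checkpoint` and the file names in `example_files` are hypothetical; the three-digit zero padding is inferred from the `NUM_EPOCH=029` example, so check the actual names in your results directory.

```python
def find_checkpoint(filenames, epoch):
    """Return the first .pth filename containing 'epoch_<epoch>' (zero-padded to 3 digits), else None."""
    tag = f"epoch_{epoch:03d}"
    for name in sorted(filenames):
        if tag in name and name.endswith(".pth"):
            return name
    return None


# Hypothetical filenames for illustration; check the actual names in your results directory.
example_files = ["model_epoch_009.pth", "model_epoch_019.pth", "model_epoch_029.pth", "status.json"]
print(find_checkpoint(example_files, 29))
print(find_checkpoint(example_files, 30))
```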

### `OPTIONAL` 4.2 Train NVIDIA model

In [None]:
# print("Train model from scratch")
# !tao model pose_classification train \
#                   -e $SPECS_DIR/experiment_nvidia.yaml \
#                   results_dir=$RESULTS_DIR/nvidia \
#                   encryption_key=$KEY

In [None]:
# print("Train model from scratch using multiple (2) GPUs")
# !tao model pose_classification train \
#                   -e $SPECS_DIR/experiment_nvidia.yaml \
#                   results_dir=$RESULTS_DIR/nvidia \
#                   encryption_key=$KEY \
#                   train.gpu_ids=[0,1]

We provide a pre-trained ST-GCN model trained on the NVIDIA dataset. Starting from the pre-trained model, we can reach better accuracy in fewer epochs.

In [None]:
# print("To resume training from a checkpoint, set the model.pretrained_model_path option to be the .tlt you want to resume from")
# print("remember to remove the `=` in the checkpoint's file name")
# !tao model pose_classification train \
#                   -e $SPECS_DIR/experiment_nvidia.yaml \
#                   results_dir=$RESULTS_DIR/nvidia \
#                   encryption_key=$KEY \
#                   model.pretrained_model_path=

In [None]:
# print('Encrypted checkpoints:')
# print('---------------------')
# !ls -ltrh $HOST_RESULTS_DIR/nvidia/train

In [None]:
# You can set NUM_EPOCH to the epoch corresponding to any saved checkpoint
# %env NUM_EPOCH=029

# Get the name of the checkpoint corresponding to your set epoch
# tmp=!ls $HOST_RESULTS_DIR/nvidia/train/*.pth | grep epoch_$NUM_EPOCH
# %env CHECKPOINT={tmp[0]}

# Or get the latest checkpoint
# os.environ["CHECKPOINT"] = os.path.join(os.getenv("HOST_RESULTS_DIR"), "nvidia/train/pc_model_latest.pth")

# print('Rename a trained model: ')
# print('---------------------')
# !cp $CHECKPOINT $HOST_RESULTS_DIR/nvidia/train/nvidia_model.tlt
# !ls -ltrh $HOST_RESULTS_DIR/nvidia/train/nvidia_model.tlt

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>
Evaluate the trained model on the validation split.

In [None]:
!tao model pose_classification evaluate \
                    -e $SPECS_DIR/experiment_kinetics.yaml \
                    results_dir=$RESULTS_DIR/kinetics \
                    encryption_key=$KEY \
                    evaluate.checkpoint=$RESULTS_DIR/kinetics/train/kinetics_model.tlt \
                    evaluate.test_dataset.data_path=$DATA_DIR/kinetics/val_data.npy \
                    evaluate.test_dataset.label_path=$DATA_DIR/kinetics/val_label.pkl

## 6. Inferences <a class="anchor" id="head-6"></a>
In this section, we run the pose classification inference tool to generate inferences with the trained models and save the results under `$RESULTS_DIR`. 

In [None]:
!tao model pose_classification inference \
                    -e $SPECS_DIR/experiment_kinetics.yaml \
                    results_dir=$RESULTS_DIR/kinetics \
                    encryption_key=$KEY \
                    inference.checkpoint=$RESULTS_DIR/kinetics/train/kinetics_model.tlt \
                    inference.output_file=$RESULTS_DIR/kinetics/inference/inference.txt \
                    inference.test_dataset.data_path=$DATA_DIR/kinetics/val_data.npy

## 7. Deploy <a class="anchor" id="head-7"></a>
Export the trained model to an encrypted ONNX model.

In [None]:
!tao model pose_classification export \
                   -e $SPECS_DIR/experiment_kinetics.yaml \
                   results_dir=$RESULTS_DIR/kinetics \
                   encryption_key=$KEY \
                   export.checkpoint=$RESULTS_DIR/kinetics/train/kinetics_model.tlt \
                   export.onnx_file=$RESULTS_DIR/kinetics/export/kinetics_model.onnx

In [None]:
print('Exported model:')
print('------------')
!ls -lth $HOST_RESULTS_DIR/kinetics/export

You may continue by deploying the exported model to [Triton Inference Server](https://developer.nvidia.com/nvidia-triton-inference-server). Please refer to the [TAO Toolkit Triton Apps](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps), where a sample for end-to-end inference from video is also provided. 

## `OPTIONAL` 8. Convert pose data <a class="anchor" id="head-8"></a>
Convert the JSON pose data from [deepstream-bodypose-3d](https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/deepstream-bodypose-3d) to NumPy arrays for inference.

In [None]:
# !tao model pose_classification dataset_convert \
#                    -e $SPECS_DIR/experiment_nvidia.yaml \
#                    results_dir=$RESULTS_DIR/nvidia \
#                    encryption_key=$KEY \
#                    dataset_convert.data=/absolute/path/to/your/json/pose/data

In [None]:
# print('Converted pose data:')
# print('------------')
# !ls -lth $HOST_RESULTS_DIR/nvidia/dataset_convert

This notebook has come to an end.