# Gesture Classification using TAO GestureNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is re-trained for use on a different task.

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

## Learning Objectives

In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train a GestureNet model on the HGR dataset
* Run inference on the trained model
* Export the retrained model to a .etlt file for deployment to DeepStream SDK

### Table of Contents

This notebook shows an example of classifying gestures using GestureNet in the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables, map drives, and install dependencies](#head-0)
1. [Install the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2) <br>
    A. [Verify and prepare dataset](#head-2-1) <br>
    B. [Generate hand crops and dataset json](#head-2-2) <br>
    C. [Download pre-trained model](#head-2-3) <br>
3. [Provide training specification](#head-3) <br>
4. [Run TAO training](#head-4) <br>
5. [Evaluate the trained model](#head-5) <br>
6. [Inference](#head-7) <br>
7. [Export](#head-6) <br>

## 0. Set up env variables, map drives and install dependencies <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` to the path of the user's workspace. Please note that the dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/gesturenet/data`, while the collaterals generated by the TAO experiment will be output to `$LOCAL_PROJECT_DIR/gesturenet`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: By default, this notebook is set up to run training using 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*
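
As a rough illustration of how the host-side and container-side locations relate, the sketch below derives both sets of paths from a single project directory. The helper name `derive_paths` is hypothetical and not part of TAO; the container root `/workspace/tao-experiments` is taken from the env variables set in the following cell.

```python
import posixpath

# Container-side root used by the env variables in this notebook.
CONTAINER_ROOT = "/workspace/tao-experiments"


def derive_paths(local_project_dir):
    """Return host paths and their container-side counterparts.

    `derive_paths` is a hypothetical helper for illustration; it is not part of TAO.
    """
    return {
        # Host side: where you place the dataset and where results appear.
        "LOCAL_DATA_DIR": posixpath.join(local_project_dir, "gesturenet/data"),
        "LOCAL_EXPERIMENT_DIR": posixpath.join(local_project_dir, "gesturenet"),
        # Container side: what the `tao` commands see once the directory is mounted.
        "DATA_DOWNLOAD_DIR": posixpath.join(CONTAINER_ROOT, "gesturenet/data"),
        "USER_EXPERIMENT_DIR": posixpath.join(CONTAINER_ROOT, "gesturenet"),
    }


for name, path in derive_paths("/path/to/local/experiments").items():
    print(f"{name}={path}")
```

This is why shell checks in this notebook use the `LOCAL_*` variables, while arguments passed to `tao` commands use the `/workspace/...` variables.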

In [None]:
# Setting up env variables for cleaner command-line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/gesturenet
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/gesturenet/data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/gesturenet

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/gesturenet/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/gesturenet
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
%env LOCAL_PROJECT_DIR=/path/to/local/experiments

# $PROJECT_DIR is the path to the sample notebook folder and the dependency folder
# $PROJECT_DIR/deps should exist for dependency installation
%env PROJECT_DIR=/path/to/local/samples_dir

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "gesturenet/data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "gesturenet"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)

%env SPECS_DIR=/workspace/tao-experiments/gesturenet/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are shared between the host and the docker container. For more information, please refer to the [launcher instance](https://docs.nvidia.com/tao/tao-toolkit/tao_launcher.html) in the user guide.

When running this cell on AWS, update the drive_map entry with the dictionary defined below, so that you don't have permission issues when writing data into folders created by the TAO docker.

```python
drive_map = {
    "Mounts": [
            # Mapping the data directory
            {
                "source": os.environ["LOCAL_PROJECT_DIR"],
                "destination": "/workspace/tao-experiments"
            },
            # Mapping the specs directory.
            {
                "source": os.environ["LOCAL_SPECS_DIR"],
                "destination": os.environ["SPECS_DIR"]
            },
        ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}
```
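
One possible way to keep both variants in a single place is a small builder that adds the `DockerOptions` entry only on request. This is a minimal sketch: the function name `build_drive_map` and its parameters are assumptions for illustration, while the dictionary keys mirror the ones shown in this notebook.

```python
import json
import os


def build_drive_map(project_dir, local_specs_dir, container_specs_dir,
                    run_as_current_user=False):
    """Build the dictionary that is written to ~/.tao_mounts.json.

    Hypothetical helper: the keys mirror the dictionaries shown in this notebook.
    """
    drive_map = {
        "Mounts": [
            # Mapping the data directory.
            {"source": project_dir, "destination": "/workspace/tao-experiments"},
            # Mapping the specs directory.
            {"source": local_specs_dir, "destination": container_specs_dir},
        ]
    }
    if run_as_current_user:
        # Run the container as the calling user so files it creates stay writable.
        # os.getuid/os.getgid are only available on Unix-like systems.
        drive_map["DockerOptions"] = {
            "user": "{}:{}".format(os.getuid(), os.getgid())
        }
    return drive_map


example = build_drive_map(
    "/path/to/local/experiments",
    "/path/to/local/samples_dir/specs",
    "/workspace/tao-experiments/gesturenet/specs",
    run_as_current_user=hasattr(os, "getuid"),
)
print(json.dumps(example, indent=4))
```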

In [None]:
# Mapping up the local directories to the TAO docker.
import json
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json

In [None]:
# Install requirement
!pip3 install -r $PROJECT_DIR/deps/requirements-pip.txt

## 1. Install the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a python package distributed as a python wheel listed in the `nvidia-pyindex` python index. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends users run the TAO launcher in a virtual env with python 3.6.9. You may follow the instruction on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
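
The Python requirement can be checked from within the interpreter. The sketch below is only an illustration and rests on one assumption: it reads "< 3.8.x" as "strictly below 3.8". The function name `python_version_supported` is hypothetical.

```python
import sys


def python_version_supported(version_info):
    """Return True if the (major, minor, micro) triple is in the listed range.

    Assumption: "< 3.8.x" is read as "strictly below 3.8".
    """
    version = tuple(version_info[:3])
    return (3, 6, 9) <= version < (3, 8, 0)


print(
    "Python", ".".join(map(str, sys.version_info[:3])),
    "supported:", python_version_supported(sys.version_info),
)
```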

Once you have installed the pre-requisites, please log in to the docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

In [None]:
# Skip this cell if the TAO launcher was already installed.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tao

In [None]:
# View the version of the TAO launcher
!tao info

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the database for hand gesture recognition (HGR) for the tutorial. To find more details, please visit http://sun.aei.polsl.pl/~mkawulok/gestures/. Please download the HGR1 [images](http://sun.aei.polsl.pl/~mkawulok/gestures/hgr1_images.zip), [feature points](http://sun.aei.polsl.pl/~mkawulok/gestures/hgr1_feature_pts.zip) and HGR2B [images](http://sun.aei.polsl.pl/~mkawulok/gestures/hgr2b_images.zip), [feature points](http://sun.aei.polsl.pl/~mkawulok/gestures/hgr2b_feature_pts.zip) and place the zip files in `$LOCAL_DATA_DIR`. 

### A. Verify and prepare dataset <a class="anchor" id="head-2-1"></a>

In [None]:
# Check the zip files are present.
!mkdir -p $LOCAL_DATA_DIR
!if [ ! -f $LOCAL_DATA_DIR/hgr1_images.zip ]; then echo 'hgr1_images zip file not found, please download.'; else echo 'Found hgr1_images zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/hgr1_feature_pts.zip ]; then echo 'hgr1_feature_pts zip file not found, please download.'; else echo 'Found hgr1_feature_pts zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/hgr2b_images.zip ]; then echo 'hgr2b_images zip file not found, please download.'; else echo 'Found hgr2b_images zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/hgr2b_feature_pts.zip ]; then echo 'hgr2b_feature_pts zip file not found, please download.'; else echo 'Found hgr2b_feature_pts zip file.';fi
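
The same check can also be written in Python, which makes it easy to collect all missing archives at once. This is a minimal sketch; the helper name `missing_archives` is hypothetical, and the file names are the four archives listed above.

```python
import os

# The four archives this notebook expects in $LOCAL_DATA_DIR.
EXPECTED_ARCHIVES = [
    "hgr1_images.zip",
    "hgr1_feature_pts.zip",
    "hgr2b_images.zip",
    "hgr2b_feature_pts.zip",
]


def missing_archives(data_dir):
    """Return the expected archive names that are not present in data_dir."""
    return [
        name for name in EXPECTED_ARCHIVES
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


# Fall back to the current directory if LOCAL_DATA_DIR is not set.
data_dir = os.environ.get("LOCAL_DATA_DIR", os.getcwd())
missing = missing_archives(data_dir)
print("Missing archives:", missing if missing else "none")
```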

In [None]:
# Unpack the downloaded datasets into $LOCAL_DATA_DIR.
# The contents will be under $LOCAL_DATA_DIR/original_images and $LOCAL_DATA_DIR/feature_points
!unzip -u ${LOCAL_DATA_DIR}/hgr1_images.zip -d ${LOCAL_DATA_DIR}
!unzip -u ${LOCAL_DATA_DIR}/hgr1_feature_pts.zip -d ${LOCAL_DATA_DIR}
!unzip -u ${LOCAL_DATA_DIR}/hgr2b_images.zip -d ${LOCAL_DATA_DIR}
!unzip -u ${LOCAL_DATA_DIR}/hgr2b_feature_pts.zip -d ${LOCAL_DATA_DIR}

In [None]:
# Convert dataset to required format for gesturenet dataset_convert
!python3.6 convert_hgr_to_tlt_data.py --input_image_dir=$LOCAL_DATA_DIR/original_images \
                                      --input_label_file=$LOCAL_DATA_DIR/feature_points \
                                      --output_dir=$LOCAL_EXPERIMENT_DIR

In [None]:
# verify
import os

LOCAL_EXPERIMENT_DIR = os.environ.get('LOCAL_EXPERIMENT_DIR')
num_labels = len(os.listdir(os.path.join(LOCAL_EXPERIMENT_DIR, "original/data/annotation")))
print("Number of labels in the dataset: {}".format(num_labels))

### B. Generate hand crops and dataset json <a class="anchor" id="head-2-2"></a>

* Update the `dataset_config.json` and `dataset_experiment_config.json` spec files
* Create the crop and json using the gesturenet dataset_convert 

*Note: Crops and dataset json only need to be generated once.*

In [None]:
print("Hand crop generation spec file")
!cat $LOCAL_SPECS_DIR/dataset_config.json

In [None]:
print("Dataset experiment spec file")
!cat $LOCAL_SPECS_DIR/dataset_experiment_config.json

In [None]:
!tao gesturenet dataset_convert --dataset_spec $SPECS_DIR/dataset_config.json \
                                --k_folds 0 \
                                --experiment_spec $SPECS_DIR/dataset_experiment_config.json \
                                --output_filename $USER_EXPERIMENT_DIR/data.json \
                                --experiment_name v1

In [None]:
# Check to see if proper json file is generated.
!if [ ! -f $LOCAL_EXPERIMENT_DIR/data.json ]; then echo "Json file was not generated properly."; else echo "Json was generated properly."; fi
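
Beyond checking that the file exists, it can help to look at how many images ended up in each set. The sketch below assumes that every set in `data.json` is a dictionary with an `images` list, which is how `kpi_set` is read in the visualization cell later in this notebook; other top-level entries are skipped. The helper name `count_images_per_set` is hypothetical.

```python
import json
import os


def count_images_per_set(full_spec):
    """Map each top-level set name to the number of entries in its "images" list.

    Assumption: every set is a dictionary with an "images" list, as `kpi_set`
    is in the visualization code of this notebook. Other entries are skipped.
    """
    counts = {}
    for set_name, set_spec in full_spec.items():
        if isinstance(set_spec, dict) and isinstance(set_spec.get("images"), list):
            counts[set_name] = len(set_spec["images"])
    return counts


json_path = os.path.join(
    os.environ.get("LOCAL_EXPERIMENT_DIR", os.getcwd()), "data.json"
)
if os.path.isfile(json_path):
    with open(json_path, "r") as f:
        print(count_images_per_set(json.load(f)))
else:
    print("data.json not found at", json_path)
```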

### C. Download pre-trained model <a class="anchor" id="head-2-3"></a>

Please follow the instructions below to download and verify the pretrained model for GestureNet.

For the GestureNet pretrained model, please download the model `nvidia/tao/gesturenet:trainable_v1.0`.

After obtaining the pre-trained model, please place it in `$LOCAL_EXPERIMENT_DIR/pretrained_models`.

You will then have the following path:

* pretrained model in `$LOCAL_EXPERIMENT_DIR/pretrained_models/gesturenet_vtrainable_v1.0/model.tlt`

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
# List models available in the model registry.
!ngc registry model list nvidia/tao/gesturenet:*

In [None]:
# Create the target destination to download the model.
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_models/

In [None]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tao/gesturenet:trainable_v1.0 \
    --dest $LOCAL_EXPERIMENT_DIR/pretrained_models/

In [None]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/pretrained_models/gesturenet_vtrainable_v1.0 

In [None]:
# Check the model is present
!if [ ! -f $LOCAL_EXPERIMENT_DIR/pretrained_models/gesturenet_vtrainable_v1.0/model.tlt ]; then echo 'Pretrained model file not found, please download.'; else echo 'Found pretrained model file.';fi

## 3. Provide training specification <a class="anchor" id="head-3"></a>

* Dataset configuration
    * To load the data properly, set `dataset:data_path` to the `json` file generated in part B above. With the commands in this notebook, it is written to `$USER_EXPERIMENT_DIR/data.json` (`$LOCAL_EXPERIMENT_DIR/data.json` on the host).
    * Update the number of classes and the class number to name map
* Pre-trained models. There is an optional parameter to load the head of the model. Only set `add_new_head: false` if you want to fine-tune on a dataset with the same gestures as the pretrained model. Please ensure that the gesture class to index map matches that of the pretrained model.
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate, etc.
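
Because predictions are reported as class indices, it is worth confirming that the class map can be inverted unambiguously before training. The sketch below assumes the map has the form `{gesture_name: index}`, which is how the visualization cell in this notebook reads `dataset.classes`. The helper name and the gesture names are made up for illustration.

```python
def invert_class_map(classes):
    """Turn a {gesture_name: index} map into {index: gesture_name}.

    Raises ValueError if two gestures share an index, since predictions could
    then not be mapped back to a unique name.
    """
    index_to_name = {}
    for name, index in classes.items():
        index = int(index)
        if index in index_to_name:
            raise ValueError(
                "index {} is used by both {!r} and {!r}".format(
                    index, index_to_name[index], name
                )
            )
        index_to_name[index] = name
    return index_to_name


# Hypothetical class map for illustration; use the one from your train_spec.json.
example_classes = {"fist": 0, "ok": 1, "stop": 2}
print(invert_class_map(example_classes))
```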

In [None]:
!cat $LOCAL_SPECS_DIR/train_spec.json

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the encryption key

In [None]:
!tao gesturenet train -e $SPECS_DIR/train_spec.json \
                      -k $KEY

## 5. Evaluate the trained model <a class="anchor" id="head-5"></a>

* Please update the model path to the location where the trained model is saved

In [None]:
!tao gesturenet evaluate -e $USER_EXPERIMENT_DIR/model/train_spec.json \
                         -m $USER_EXPERIMENT_DIR/model/model.tlt \
                         -k $KEY

## 6. Inference <a class="anchor" id="head-7"></a>
In this section, we run the `gesturenet inference` tool to generate inferences on the trained models. Please ensure the spec file `inference.json` is configured correctly. 

In [None]:
!tao gesturenet inference -e $USER_EXPERIMENT_DIR/model/train_spec.json \
                          -m $USER_EXPERIMENT_DIR/model/model.tlt \
                          -k $KEY \
                          --image_root_path $USER_EXPERIMENT_DIR \
                          --data_json $USER_EXPERIMENT_DIR/data.json \
                          --data_type kpi_set \
                          --results_dir $USER_EXPERIMENT_DIR/model

In [None]:
import os
import cv2
import IPython.display
import json
import PIL.Image

json_spec_path = os.path.join(os.environ.get('LOCAL_EXPERIMENT_DIR'), 'data.json')
data_type = "kpi_set"
result_file = os.path.join(os.environ.get('LOCAL_EXPERIMENT_DIR'), 'model/results.txt')
model_spec_path = os.path.join(os.environ.get('LOCAL_EXPERIMENT_DIR'), 'model/train_spec.json')

# Read in json spec.
with open(json_spec_path, 'r') as file:
    full_spec = json.load(file)
spec = full_spec[data_type]

# Read in model spec.
with open(model_spec_path, 'r') as file:
    model_spec = json.load(file)

class_labels = model_spec['dataset']['classes']

images = spec['images']

# Open the results file in a context manager so that it is always closed.
with open(result_file, 'r') as results:
    for image_dict in images:
        # Each image owns one line of the results file. Read it before any
        # early `continue` so images and predictions stay aligned.
        image_result = results.readline()

        image_path = os.path.join(os.environ.get('LOCAL_EXPERIMENT_DIR'), image_dict['full_image_path'])
        bbox = image_dict['bbox_coordinates']
        image = cv2.imread(image_path)
        # cv2.imread returns None if the image cannot be read; skip it before
        # calling any other OpenCV function on it.
        if image is None:
            continue
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # Get corners of rectangle.
        upper_left = tuple(bbox[0])
        bottom_right = tuple(bbox[3])
        # Draw rectangle onto image.
        cv2.rectangle(image, upper_left, bottom_right, (0, 255, 0), 2)

        prediction = image_result.split(' ')[1]
        # Get class label.
        label = list(class_labels.keys())[list(class_labels.values()).index(int(prediction))]
        # Place the label near the bottom-left corner.
        x = 0
        y = image.shape[0]-5
        # Display Image.
        image = cv2.putText(image, label, (x, y), cv2.FONT_HERSHEY_COMPLEX_SMALL, 1, (255, 24, 8))
        IPython.display.display(PIL.Image.fromarray(image))
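
For a quick numerical summary instead of one image per prediction, the results can also be tallied per gesture. This sketch assumes the line format used by the cell above: space-separated fields, with the predicted class index in the second field. The function names and sample lines are made up for illustration.

```python
from collections import Counter


def parse_predictions(lines):
    """Extract the predicted class index from each non-empty results line.

    Assumption: fields are separated by single spaces and the second field is
    the predicted index, which is how the visualization cell reads results.txt.
    """
    return [int(line.split(' ')[1]) for line in lines if line.strip()]


def count_predictions(lines, class_labels):
    """Count predictions per gesture name, given a {name: index} class map."""
    index_to_name = {int(index): name for name, index in class_labels.items()}
    return Counter(index_to_name[index] for index in parse_predictions(lines))


# Made-up lines and labels, for illustration only.
sample_lines = ["img_0.jpg 1\n", "img_1.jpg 0\n", "img_2.jpg 1\n"]
print(count_predictions(sample_lines, {"fist": 0, "ok": 1}))
```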

## 7. Export <a class="anchor" id="head-6"></a>

### 7.1 Export .etlt model

Use the export functionality to export an encrypted model in fp32 format without any optimizations.

* Modify `-m` to your model path
* Modify `--out_file` to your desired output path

In [None]:
!tao gesturenet export -m $USER_EXPERIMENT_DIR/model/model.tlt \
                       -k ${KEY} \
                       --out_file $USER_EXPERIMENT_DIR/model/model.etlt

In [None]:
# Check that the deployment file is present
!if [ ! -f $LOCAL_EXPERIMENT_DIR/model/model.etlt ]; then echo 'Deployment file not found, please generate.'; else echo 'Found deployment file.';fi

### 7.2 INT8 Optimization

The GestureNet model supports int8 inference mode in TensorRT. To enable this, the model is first calibrated to run 8-bit inferences. The process is as follows:

* Provide a directory with a set of images to be used for calibration.
* A calibration tensorfile is generated and saved in `--cal_data_file`
* This tensorfile is used to calibrate the model, and the calibration table is stored in `--cal_cache_file`
* The calibration table, in addition to the model, is used to generate the int8 TensorRT engine at the path `--engine_file`

*Note: For this example, we generate a calibration tensorfile containing 100 batches of training data. Ideally, it is best to use at least 10-20% of the training data to do so. The more data provided during calibration, the closer int8 inferences are to fp32 inferences.*
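
To turn the 10-20% guideline into a concrete number, the sketch below computes that range for a given training-set size using integer arithmetic, rounding up. The function name and the example size of 1000 images are assumptions for illustration.

```python
def calibration_sample_range(num_training_images):
    """Return (low, high): 10% and 20% of the training set, rounded up."""
    low = (num_training_images * 10 + 99) // 100
    high = (num_training_images * 20 + 99) // 100
    return low, high


# Hypothetical training-set size; substitute the size of your own training split.
low, high = calibration_sample_range(1000)
print("Suggested NUM_CALIB_SAMPLES between {} and {}".format(low, high))
```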

In [None]:
# Number of calibration samples to use
%set_env NUM_CALIB_SAMPLES=100

In [None]:
!python3 sample_calibration_images.py \
    -a $LOCAL_EXPERIMENT_DIR/data.json \
    -i $LOCAL_EXPERIMENT_DIR \
    -o $LOCAL_EXPERIMENT_DIR/calibration_samples/ \
    -n $NUM_CALIB_SAMPLES \
    --randomize

### 7.3 Export Deployable INT8 Model

In [None]:
!tao gesturenet export -m $USER_EXPERIMENT_DIR/model/model.tlt \
                       -k $KEY \
                       --engine_file $USER_EXPERIMENT_DIR/model/model.int8.engine \
                       --data_type int8 \
                       --cal_image_dir $USER_EXPERIMENT_DIR/calibration_samples/ \
                       --cal_cache_file $USER_EXPERIMENT_DIR/model/int8_calibration.bin \
                       --cal_data_file $USER_EXPERIMENT_DIR/model/int8_calibration.tensorfile \
                       --batches 100

### 7.4 Run Inference on Exported INT8 Engine File

In [None]:
!tao gesturenet inference -e $USER_EXPERIMENT_DIR/model/train_spec.json \
                          -m $USER_EXPERIMENT_DIR/model/model.int8.engine \
                          -k $KEY \
                          --image_root_path $USER_EXPERIMENT_DIR \
                          --data_json $USER_EXPERIMENT_DIR/data.json \
                          --data_type kpi_set \
                          --results_dir $USER_EXPERIMENT_DIR/model/model_int8_engine

In [None]:
# check the results file is generated
!if [ ! -f $LOCAL_EXPERIMENT_DIR/model/model_int8_engine/results.txt ]; then echo 'Results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/model/model_int8_engine/results.txt;fi
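
Section 7.2 notes that more calibration data brings int8 inferences closer to fp32 inferences. One way to gauge this is to compare the two `results.txt` files line by line. The sketch below is an illustration under two assumptions: both files list the images in the same order, and the predicted class index is the second space-separated field of each line, as in the visualization cell of section 6. The function names and sample lines are made up.

```python
def predicted_indices(lines):
    """Read the predicted class index (second space-separated field) per line."""
    return [int(line.split(' ')[1]) for line in lines if line.strip()]


def agreement_rate(fp32_lines, int8_lines):
    """Fraction of images on which both result files predict the same class."""
    fp32 = predicted_indices(fp32_lines)
    int8 = predicted_indices(int8_lines)
    if len(fp32) != len(int8):
        raise ValueError("result files cover a different number of images")
    if not fp32:
        return 0.0
    matches = sum(1 for a, b in zip(fp32, int8) if a == b)
    return matches / len(fp32)


# Made-up lines for illustration; in practice read both results.txt files.
fp32_lines = ["img_0.jpg 1\n", "img_1.jpg 0\n", "img_2.jpg 2\n", "img_3.jpg 2\n"]
int8_lines = ["img_0.jpg 1\n", "img_1.jpg 0\n", "img_2.jpg 1\n", "img_3.jpg 2\n"]
print("Agreement: {:.0%}".format(agreement_rate(fp32_lines, int8_lines)))
```

In practice, the two inputs would be the lines of `$LOCAL_EXPERIMENT_DIR/model/results.txt` and `$LOCAL_EXPERIMENT_DIR/model/model_int8_engine/results.txt`.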