# Bodypose Estimation using TAO BodyposeNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Train a Bodypose Estimation model on the Common Objects in Context (COCO) dataset
* Evaluate the model's performance
* Run Inference on the trained model
* Prune and re-train the pruned model
* Export the model to a .etlt file for deployment to DeepStream SDK
* Optimize the standard FP32 model into an INT8 TensorRT engine for deployment on the system GPU


### Table of Contents

1. [Set up env variables, map drives, and install dependencies](#head-1) <br>
2. [Install the TAO launcher](#head-2) <br>
3. [Prepare dataset and pre-trained model](#head-3) <br>
    3.1 [Generate masks and tfrecords from labels in json format](#head-3-1) <br>
    3.2 [Convert dataset format](#head-3-2) <br>
    3.3 [Download pre-trained model](#head-3-3) <br>
4. [Provide training specification](#head-4) <br>
5. [Run TAO training](#head-5) <br>
6. [Evaluate trained models](#head-6) <br>
7. [Run inference for a set of images](#head-7) <br>
    7.1 [Visualize annotations](#head-7-1) <br>
    7.2 [Visualize annotations manually from detections](#head-7-2) <br>
8. [Pruning workflow](#head-8) <br>
    8.1 [Prune trained models](#head-8-1) <br>
    8.2 [Retrain pruned models](#head-8-2) <br>
    8.3 [Evaluate retrained model](#head-8-3) <br>
    8.4 [Inference using retrained model](#head-8-4) <br>
    8.5 [Visualize retrained model inferences](#head-8-5) <br>
9. [Model Export and INT8 Quantization](#head-9) <br>
    9.1 [Choose network input resolution for deployment](#head-9-1) <br>
    9.2 [Export `.etlt` model](#head-9-2) <br>
    9.3 [Int8 Optimization](#head-9-3) <br>
    9.4 [Generate TensorRT Engine](#head-9-4) <br>
10. [Verify TensorRT models and Deploy](#head-10) <br>
    10.1 [Inference using TensorRT Engine](#head-10-1) <br>
    10.2 [Visualize TensorRT Inferences](#head-10-2) <br>
    10.3 [Evaluate the TensorRT engine](#head-10-3) <br>
    10.4 [Export Deployable Model](#head-10-4) <br>

## 1. Set up env variables, map drives, and install dependencies <a class="anchor" id="head-1"></a>

This notebook requires you to set an environment variable, `$LOCAL_PROJECT_DIR`, to the path of your workspace. The dataset used by this notebook is expected to reside in `$LOCAL_PROJECT_DIR/bpnet/data`, while the collateral generated by the TAO experiments is written to `$LOCAL_PROJECT_DIR/bpnet`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*
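
Once the variables are set, a quick sanity check can catch a forgotten `FIXME` before any long-running step starts. The snippet below is a minimal sketch, not part of the TAO tooling; `find_workspace_problems` is a hypothetical helper name, and the checks cover only the two variables discussed above.

```python
import os


def find_workspace_problems(env):
    """Return a list of problems found in the workspace-related variables.

    `env` is any mapping of variable names to strings, e.g. os.environ.
    This helper is hypothetical and only illustrates the checks.
    """
    problems = []

    project_dir = env.get("LOCAL_PROJECT_DIR", "")
    if project_dir in ("", "FIXME"):
        problems.append("LOCAL_PROJECT_DIR is not set to a real path")
    elif not os.path.isdir(project_dir):
        problems.append("LOCAL_PROJECT_DIR does not exist: " + project_dir)

    num_gpus = env.get("NUM_GPUS", "1")
    if not num_gpus.isdigit() or int(num_gpus) < 1:
        problems.append("NUM_GPUS must be a positive integer, got: " + num_gpus)

    return problems


# Report anything suspicious in the current environment.
for problem in find_workspace_problems(os.environ):
    print("WARNING:", problem)
```

An empty list means both variables look usable; the function takes a plain mapping so it can also be exercised without touching the real environment.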

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/bpnet

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/bpnet/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/bpnet
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
%env LOCAL_PROJECT_DIR=FIXME

# $SAMPLES_DIR is the path to the sample notebook folder and the dependency folder
# $SAMPLES_DIR/deps should exist for dependency installation
%env SAMPLES_DIR=FIXME

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "bpnet/data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "bpnet"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)

os.environ["LOCAL_DATA_POSE_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "data_pose_config"
)

os.environ["LOCAL_MODEL_POSE_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "model_pose_config"
)

%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/bpnet
%env DATA_DIR=/workspace/tao-experiments/bpnet/data
%env SPECS_DIR=/workspace/examples/bpnet/specs
%env DATA_POSE_SPECS_DIR=/workspace/examples/bpnet/data_pose_config
%env MODEL_POSE_SPECS_DIR=/workspace/examples/bpnet/model_pose_config

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR
!ls -rlt $LOCAL_DATA_POSE_SPECS_DIR
!ls -rlt $LOCAL_MODEL_POSE_SPECS_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that data and results can be shared between the host and the container. For more information please refer to the [launcher instance](https://docs.nvidia.com/tao/tao-toolkit/tao_launcher.html) in the user guide.

When running this cell on AWS, update the drive_map entry with the dictionary defined below, so that you don't have permission issues when writing data into folders created by the TAO docker.

```python
drive_map = {
    "Mounts": [
            # Mapping the data directory
            {
                "source": os.environ["LOCAL_PROJECT_DIR"],
                "destination": "/workspace/tao-experiments"
            },
            # Mapping the specs directory.
            {
                "source": os.environ["LOCAL_SPECS_DIR"],
                "destination": os.environ["SPECS_DIR"]
            },
            {
                "source": os.environ["LOCAL_DATA_POSE_SPECS_DIR"],
                "destination": os.environ["DATA_POSE_SPECS_DIR"]
            },
            {
                "source": os.environ["LOCAL_MODEL_POSE_SPECS_DIR"],
                "destination": os.environ["MODEL_POSE_SPECS_DIR"]
            },
        ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}
```
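
The two variants of `drive_map` differ only in the `DockerOptions` entry, so they can be produced by one small function. The following is a sketch under that assumption; `build_drive_map` and its `run_as_host_user` flag are hypothetical names introduced here for illustration, and the mount paths in the usage lines are placeholders.

```python
import json
import os


def build_drive_map(mounts, run_as_host_user=False):
    """Build the dictionary written to ~/.tao_mounts.json.

    `mounts` is a list of (source, destination) pairs. When
    `run_as_host_user` is True, the container runs with the host
    user's uid:gid, as in the AWS variant shown above.
    """
    drive_map = {
        "Mounts": [
            {"source": source, "destination": destination}
            for source, destination in mounts
        ]
    }
    if run_as_host_user:
        drive_map["DockerOptions"] = {
            "user": "{}:{}".format(os.getuid(), os.getgid())
        }
    return drive_map


# Placeholder paths, for illustration only.
example = build_drive_map(
    [("/home/user/project", "/workspace/tao-experiments")],
    run_as_host_user=True,
)
print(json.dumps(example, indent=4))
```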

In [None]:
# Mapping up the local directories to the TAO docker.
import json
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
        {
            "source": os.environ["LOCAL_DATA_POSE_SPECS_DIR"],
            "destination": os.environ["DATA_POSE_SPECS_DIR"]
        },
        {
            "source": os.environ["LOCAL_MODEL_POSE_SPECS_DIR"],
            "destination": os.environ["MODEL_POSE_SPECS_DIR"]
        },
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json 

In [None]:
# Install requirement
!pip3 install -r $SAMPLES_DIR/deps/requirements-pip.txt

## 2. Install the TAO launcher <a class="anchor" id="head-2"></a>
The TAO launcher is a python package distributed as a python wheel listed in PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual environment with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of Python to be used in the virtual environment with the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+

Once you have installed the pre-requisites, please log in to the docker registry nvcr.io by following the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key. <a class="anchor" id="head-1"></a>

In [None]:
# Install the TAO launcher from PyPI.
# SKIP this step IF you have already installed the TAO launcher wheel.
!pip3 install nvidia-tao

In [None]:
# Initialize the TAO launcher
!tao info

## 3. Prepare dataset and pre-trained model <a class="anchor" id="head-3"></a>

We will be using the COCO (Common Objects in Context) 2017 dataset for this example. To find more details please visit https://cocodataset.org/#keypoints-2017 and https://cocodataset.org/#keypoints-eval.
Please download the dataset and extract as per instructions below.

Links to download the data: [train_data](http://images.cocodataset.org/zips/train2017.zip), [val_data](http://images.cocodataset.org/zips/val2017.zip) and [annotations](http://images.cocodataset.org/annotations/annotations_trainval2017.zip). Please unzip the images into the `$LOCAL_DATA_DIR` directory and the annotations into the `$LOCAL_DATA_DIR/annotations`. You may use this notebook with your own dataset as well. To use this example with your own dataset, please refer to `Use your own dataset` section below.

*Note: There are no labels for the testing images, therefore we use COCO validation set to evaluate the trained model.*

In [None]:
# Modify dataset_config for data preparation
# verify all paths
!cat $LOCAL_DATA_POSE_SPECS_DIR/coco_spec.json

In [None]:
# Check the dataset is present
!if [ ! -d $LOCAL_DATA_DIR ]; then echo 'Data folder not found, please download.'; else echo 'Found Data folder.';fi
!if [ ! -d $LOCAL_DATA_DIR/annotations ]; then echo 'Annotations folder not found, please download.'; else echo 'Found Annotations folder.';fi
!if [ ! -d $LOCAL_DATA_DIR/train2017 ]; then echo 'Train Images folder not found, please download.'; else echo 'Found Train Images folder.';fi
!if [ ! -d $LOCAL_DATA_DIR/val2017 ]; then echo 'Val Images folder not found, please download.'; else echo 'Found Val Images folder.';fi

In [None]:
# Check the labels are present
!if [ ! -f $LOCAL_DATA_DIR/annotations/person_keypoints_train2017.json ]; then echo 'Train labels not found, please regenerate.'; else echo 'Found Train Labels.';fi
!if [ ! -f $LOCAL_DATA_DIR/annotations/person_keypoints_val2017.json ]; then echo 'Val labels not found, please regenerate.'; else echo 'Found Val Labels.';fi

In [None]:
# Sample json label.
!sed -n 1,201p $LOCAL_DATA_DIR/annotations/person_keypoints_val2017.json

In [None]:
os.path.join(os.getenv("LOCAL_DATA_DIR"), "train2017/000000304473.jpg")

In [None]:
# Sample image.
import os
from IPython.display import Image
Image(filename=os.path.join(
    os.getenv("LOCAL_DATA_DIR"), "train2017/000000304473.jpg"))

### 3.1. Generate segmentation masks and tfrecords from annotations <a class="anchor" id="head-3-1"></a>
* Create the tfrecords using the `bpnet dataset_convert` tool
* Generate and save masks of regions with unlabeled people. These are used to mask out the loss for those regions during training.
* Mask folder is created based on the `coco_spec.json` file path. `mask_root_dir_path` directory is relative to `root_directory_path`. Similarly for `images_root_dir_path` and `annotation_root_dir_path`
* Use `-m 'train'` to process data specified under `train_data` in `coco_spec.json`. Similarly, `-m 'test'` for `test_data`.

*Note: TfRecords and masks only need to be generated once.*
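
To make the relative-path rule concrete, the sketch below resolves the per-split directories against `root_directory_path`. The key names come from the bullets above; the nesting of the spec and every value in `example_spec` are assumptions made for illustration, so check them against your own `coco_spec.json`. `resolve_dataset_paths` is a hypothetical helper, not a TAO function.

```python
import posixpath

# Keys that are interpreted relative to root_directory_path.
RELATIVE_KEYS = (
    "images_root_dir_path",
    "mask_root_dir_path",
    "annotation_root_dir_path",
)


def resolve_dataset_paths(spec, mode):
    """Return absolute container paths for the 'train' or 'test' split."""
    section = spec["train_data"] if mode == "train" else spec["test_data"]
    root = spec["root_directory_path"]
    # Paths live inside the container, so POSIX joining is used.
    return {
        key: posixpath.join(root, section[key])
        for key in RELATIVE_KEYS
        if key in section
    }


# Assumed layout with made-up values, for illustration only.
example_spec = {
    "root_directory_path": "/workspace/tao-experiments/bpnet/data",
    "train_data": {
        "images_root_dir_path": "train2017",
        "mask_root_dir_path": "train_mask2017",
        "annotation_root_dir_path": "annotations",
    },
    "test_data": {
        "images_root_dir_path": "val2017",
        "mask_root_dir_path": "val_mask2017",
        "annotation_root_dir_path": "annotations",
    },
}

for key, path in resolve_dataset_paths(example_spec, "train").items():
    print(key, "->", path)
```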

In [None]:
# Generate TFRecords for training dataset
!tao bpnet dataset_convert \
        -m 'train' \
        -o $DATA_DIR/train \
        --generate_masks \
        --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json

In [None]:
# Generate TFRecords for validation dataset
!tao bpnet dataset_convert \
        -m 'test' \
        -o $DATA_DIR/val \
        --generate_masks \
        --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json

In [None]:
# check the tfrecords are generated
!if [ ! -f $LOCAL_DATA_DIR/train-fold-000-of-001 ]; then echo 'Train Tfrecords not found, please generate.'; else echo 'Found train Tfrecords.';fi
!if [ ! -f $LOCAL_DATA_DIR/val-fold-000-of-001 ]; then echo 'Val Tfrecords not found, please generate.'; else echo 'Found val Tfrecords.';fi

### 3.2. Use your own dataset by converting to COCO dataset format <a class="anchor" id="head-3-2"></a>

You may use this notebook with your own dataset as well. This section briefly talks about how you can use your own dataset with BodyposeNet. 

*Note: If you are already using the coco dataset for this notebook example, you may skip this section.*

To use this example with your own dataset:
* Prepare the data and annotations in a format similar to COCO dataset.
* Create a dataset spec under `data_pose_config` similar to `coco_spec.json` which includes the dataset paths, pose configuration, occlusion labeling convention etc.
* Convert your annotations to COCO annotations format.
* Follow the same instructions from section 3 through the last section.

This section outlines COCO annotations dataset format that the data must be in for BodyposeNet. Although COCO annotations have many fields (please see snippet view of annotations above), only the attributes that are needed by BodyposeNet are specified here. The dataset should use the following overall structure (in a `.json` format):
```
{
    "images": [...],
    "annotations": [...],
    "categories": [...]
}
```

The `images` section contains the complete list of images in the dataset with some metadata. *Note: each image id must be unique within the dataset.*

```
"images": [
    {
        "file_name": "000000001000.jpg",
        "height": 480,
        "width": 640,
        "id": 1000
    },
    {
        "file_name": "000000580197.jpg",
        "height": 480,
        "width": 640,
        "id": 580197
    },
    ...
]
```

The `annotations` section follows this format:
```
"annotations": [
    {
        "segmentation": [[162.46,152.13,150.73,...173.92,156.23]],
        "num_keypoints": 17,
        "area": 8720.28915,
        "iscrowd": 0,
        "keypoints": [162,174,2,...,149,352,2],
        "image_id": 1000,
        "bbox": [115.16,152.13,83.23,228.41],
        "category_id": 1,
        "id": 1234574
    }
]
```

Where:
* `segmentation` is a list of polygons which has a list of vertices - for a given person / group.
* `num_keypoints` is the number of keypoints that are labeled
* `iscrowd` if `1` indicates that the annotation mask is for multiple people
* `category_id` is always `1` which is for a `person`
* `id` is the id of the annotation and `image_id` is the id of the associated image
* `keypoints` is a list of keypoints with format as follows `[x1, y1, v1, x2, y2, v2 ...]` where `x` and `y` are pixel locations and `v` is visibility/occlusion flag. 
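
As a small illustration of the flat `keypoints` layout, the sketch below splits the list into `(x, y, v)` triples and counts the labeled ones, which is what `num_keypoints` records in COCO. The helper names and the shortened example list are ours, not part of the dataset tooling.

```python
def split_keypoints(flat):
    """Turn [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def count_labeled(flat):
    """Count keypoints whose visibility flag is non-zero (i.e. labeled)."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)


# Shortened, made-up example with three keypoints instead of 17.
example = [162, 174, 2, 0, 0, 0, 149, 352, 1]
print(split_keypoints(example))
print(count_labeled(example))
```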

The `categories` section follows the format below, which lists the keypoint names and the skeleton (pairs of 1-based keypoint indices). The keypoint convention in your dataset needs to be converted to this format.
```
"categories": [
    {
        "supercategory": "person",
        "id": 1,
        "name": "person",
        "keypoints": [
            "nose","left_eye","right_eye","left_ear","right_ear",
            "left_shoulder","right_shoulder","left_elbow","right_elbow",
            "left_wrist","right_wrist","left_hip","right_hip",
            "left_knee","right_knee","left_ankle","right_ankle"
        ],
        "skeleton": [
            [16,14],[14,12],[17,15],[15,13],[12,13],[6,12],[7,13],[6,7],
            [6,8],[7,9],[8,10],[9,11],[2,3],[1,2],[1,3],[2,4],[3,5],[4,6],[5,7]
        ]
    }
]
```

COCO dataset follows the given visibility flag convention:
```
"visibility_flags": {
    "value": {
        "visible": 2,
        "occluded": 1,
        "not_labeled": 0
    },
    "mapping": {
        "visible": "visible",
        "occluded": "occluded",
        "not_labeled": "not_labeled"
    }
}
```
You can either convert your dataset to this format, or provide the mapping as above. `value` maps the visibility flag to value. `mapping` maps your naming convention with the convention used in BodyposeNet. You need to map all your states to these three categories: (`visible`, `occluded`, `not_labeled`)
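
The sketch below shows how the two tables can be combined to translate a flag from your own convention into the COCO values. It is an illustration only, not BodyposeNet's internal code: the custom state names (`clear`, `hidden`, `missing`) and their numbers are made up, and we assume that the keys of `mapping` are your names and the values are the BodyposeNet names.

```python
# COCO visibility values, as listed above.
COCO_VALUES = {"visible": 2, "occluded": 1, "not_labeled": 0}

# Hypothetical custom convention, for illustration only.
custom_flags = {
    "value": {"clear": 0, "hidden": 1, "missing": 2},
    "mapping": {
        "clear": "visible",
        "hidden": "occluded",
        "missing": "not_labeled",
    },
}


def to_coco_flag(raw_value, visibility_flags):
    """Translate a raw flag from a custom convention to the COCO value."""
    # Invert `value` so that a raw number leads back to its state name.
    name_by_value = {v: k for k, v in visibility_flags["value"].items()}
    own_name = name_by_value[raw_value]
    return COCO_VALUES[visibility_flags["mapping"][own_name]]


for raw in (0, 1, 2):
    print(raw, "->", to_coco_flag(raw, custom_flags))
```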


### 3.3. Download pre-trained model <a class="anchor" id="head-3-3"></a>

Download the correct pretrained model from the NGC model registry for your experiment. For optimum results please download model templates from `nvidia/tao/bodyposenet`. The templates are organized by version string. For example, to download a pretrained model suitable for bpnet, use the NGC object `nvidia/tao/bodyposenet:trainable_v1.0`.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
# List models available in the model registry.
!ngc registry model list nvidia/tao/bodyposenet:*

In [None]:
# Create the target destination to download the model.
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_model/

In [None]:
 # Download the pretrained model from NGC
!ngc registry model download-version nvidia/tao/bodyposenet:trainable_v1.0 \
    --dest $LOCAL_EXPERIMENT_DIR/pretrained_model

In [None]:
# Check if the pretrained model is present 
!ls -rlt $LOCAL_EXPERIMENT_DIR/pretrained_model/bodyposenet_vtrainable_v1.0

## 4. Provide training specification <a class="anchor" id="head-4"></a>

Update the training spec at `$SPECS_DIR/bpnet_train_m1_coco.yaml` as needed. Some guidelines:
* Tfrecords for the train datasets
    * In order to use the newly generated tfrecords for training, update the 'tfrecords_directory_path' and 'train_records_path' parameters of 'dataset_config' section in the spec file at `$SPECS_DIR/bpnet_train_m1_coco.yaml`
* Update `pose_config_path` with spec file at `$MODEL_POSE_SPECS_DIR/bpnet_18joints.json`.
* Update `dataset_specs` with `{'coco': $DATA_POSE_SPECS_DIR/coco_spec.json}`. If using other datasets, `{'<dataset>': $DATA_POSE_SPECS_DIR/<dataset>_spec.json}` 
* Augmentation parameters for on the fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.

In [None]:
!cat $LOCAL_SPECS_DIR/bpnet_train_m1_coco.yaml

## 5. Run TAO training <a class="anchor" id="head-5"></a>

* Provide the sample spec file and the output directory location for models

*Note: The training may take days to complete. Also, the rest of the notebook assumes that the training was done in single-GPU mode. In multi-GPU mode, training time will decrease by roughly a factor of `$NUM_GPUS`.*

When running the training in multi-GPU mode (`$NUM_GPUS` > 1), you may need to modify the `learning_rate` and/or `batch_size` to get similar accuracy as a 1GPU training run. In most cases, scaling down the batch-size by a factor of `$NUM_GPUS` or scaling up the learning rate by a factor of `$NUM_GPUS` would be a good place to start.
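
The two starting points described above amount to simple arithmetic. The sketch below computes both options from a single-GPU configuration; the function name and the example numbers are hypothetical, so substitute the `batch_size` and `learning_rate` from your own spec file.

```python
def scale_for_multi_gpu(batch_size, learning_rate, num_gpus):
    """Return the two suggested (batch_size, learning_rate) starting points.

    Option 1 divides the batch size by the number of GPUs.
    Option 2 multiplies the learning rate by the number of GPUs.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {
        "scale_batch_size": (max(1, batch_size // num_gpus), learning_rate),
        "scale_learning_rate": (batch_size, learning_rate * num_gpus),
    }


# Made-up single-GPU settings, for illustration only.
options = scale_for_multi_gpu(batch_size=8, learning_rate=1e-4, num_gpus=4)
for name, (bs, lr) in options.items():
    print(name, "batch_size =", bs, "learning_rate =", lr)
```

Either option is only a place to start; the resulting accuracy should still be compared against the single-GPU run.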

BodyposeNet supports restart from checkpoint. In case the training job is killed prematurely, you may resume training from the closest checkpoint by simply re-running the same command line. Please do make sure to use the same number of GPUs when restarting the training.


In [None]:
!tao bpnet train -e $SPECS_DIR/bpnet_train_m1_coco.yaml \
                 -r $USER_EXPERIMENT_DIR/models/exp_m1_unpruned \
                 -k $KEY \
                 --gpus $NUM_GPUS

In [None]:
# check the training folder for generated files
!ls -lh $LOCAL_EXPERIMENT_DIR/models/exp_m1_unpruned

In [None]:
# Set env for model to use for the remaining steps: infer, evaluate, export.
# NOTE: The last epoch model will be tagged/saved as `bpnet_model.tlt` additionally.
# If you want to evaluate model from any other step, please change the below env
# variable accordingly with the filename of the checkpoint.
# Example:
# %set_env MODEL_CHECKPOINT=model.step-1152500.tlt
%set_env MODEL_CHECKPOINT=bpnet_model.tlt

## 6. Evaluate the trained model <a class="anchor" id="head-6"></a>

Evaluate the trained model using the latest checkpoint (or any other checkpoint). 

To keep the evaluation consistent with bottom-up human pose estimation research, we have two modes to evaluate the model.
* `infer_spec.yaml`: This configuration does a single-scale inference on the input image. Aspect ratio of the input image is retained by fixing one of the sides of the network input (height or width), and adjusting the other side to match the aspect ratio of the input image. 
* `infer_spec_refine.yaml`: This configuration does a multi-scale inference on the input image. The scales are configurable. By default, the following scales are used: (0.5, 1.0, 1.5, 2.0)

We also have another mode which is used primarily to verify against the final exported TRT models. *We will be using this in the later sections*  
* `infer_spec_strict.yaml`: This configuration does a single-scale inference on the input image. Aspect ratio of the input image is retained by padding the image on the sides as needed to fit the network input size since the TRT model input dims are fixed.  

*Note: The `--model_filename` arg will override the `model_path` in the `infer_spec.yaml`*
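
The difference between the default and the strict mode is easiest to see in the resize geometry. The sketch below is a simplified illustration of the two strategies described above, not the code used by `bpnet evaluate`; the network sizes in the usage lines are made-up values, and details such as stride alignment are ignored.

```python
def fit_keep_aspect(image_hw, target_height):
    """Default mode: fix the height and let the width follow the aspect ratio."""
    h, w = image_hw
    scale = target_height / float(h)
    return target_height, int(round(w * scale))


def fit_with_padding(image_hw, net_hw):
    """Strict mode: resize to fit inside a fixed input, then pad the rest.

    Returns the resized (height, width) and the total (height, width) padding.
    """
    h, w = image_hw
    net_h, net_w = net_hw
    scale = min(net_h / float(h), net_w / float(w))
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    return (new_h, new_w), (net_h - new_h, net_w - new_w)


# A 480x640 COCO image with hypothetical network sizes.
print(fit_keep_aspect((480, 640), target_height=368))
print(fit_with_padding((480, 640), net_hw=(320, 448)))
```

In the first case the network input width changes with every image; in the second the input stays fixed, which is what a TensorRT engine with static dimensions requires.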


In [None]:
# Single-scale inference
!tao bpnet evaluate  --inference_spec $SPECS_DIR/infer_spec.yaml \
                     --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/$MODEL_CHECKPOINT \
                     --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
                     --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_unpruned/eval_default \
                     -k $KEY

# Uncomment this section for Multi-scale inference
# !tao bpnet evaluate  --inference_spec $SPECS_DIR/infer_spec_refine.yaml \
#                      --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/$MODEL_CHECKPOINT \
#                      --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
#                      --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_unpruned/eval_refine \
#                      -k $KEY

In [None]:
# Check if the bodypose evaluation results file is generated.
!if [ ! -f $LOCAL_EXPERIMENT_DIR/results/exp_m1_unpruned/eval_default/results.csv ]; then echo 'Bodypose Evaluation results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/results/exp_m1_unpruned/eval_default/results.csv;fi

## 7. Run inference on validation set <a class="anchor" id="head-7"></a>

In this section, we run the inference tool to generate inferences on the trained models.

Set-up:
* The model to be used for inference can be either specified in the `model_path` of the `infer_spec.yaml` or as command line argument.
* The `train_spec` in the `inference_spec` file should be the bodypose training spec file used for training. 

The inference tool produces two outputs
* Overlaid images in `$USER_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_default/images_annotated` (this is when `--dump_visualizations` is enabled)
* Frame by frame keypoint labels in `$USER_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_default/detections.json`. The section *Visualize annotations manually from detections* shows how to parse this result json.

*Note: The tool supports multiple input types (`image`, `dir` and `json`). To run inference on any of these, set `--input_type` and pass the path with `--input`.*

In [None]:
import json
filenames = ['000000214720.jpg', '000000283520.jpg', '000000239537.jpg', '000000001000.jpg',
             '000000006954.jpg', '000000032081.jpg', '000000033759.jpg', '000000076468.jpg',
             '000000121673.jpg', '000000130599.jpg', '000000160864.jpg', '000000140270.jpg']
data = [os.path.join(os.getenv("DATA_DIR"), "val2017", filename) for filename in filenames]
with open(os.path.join(os.getenv("LOCAL_DATA_DIR"), "viz_example_data.json"), 'w') as f:
    json.dump(data, f)

In [None]:
# Single-scale inference
!tao bpnet inference  --inference_spec $SPECS_DIR/infer_spec.yaml \
                      --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/$MODEL_CHECKPOINT \
                      --input_type json \
                      --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
                      --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_default \
                      --dump_visualizations \
                      -k $KEY

# Uncomment this section for Multi-scale inference
# !tao bpnet inference  --inference_spec $SPECS_DIR/infer_spec_refine.yaml \
#                       --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/$MODEL_CHECKPOINT \
#                       --input_type json \
#                       --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
#                       --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_refine \
#                       --dump_visualizations \
#                       -k $KEY

In [None]:
# check the results file is generated
!if [ ! -f $LOCAL_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_default/detections.json ]; then echo 'Results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_default/detections.json;fi

### 7.1. Visualize annotations <a class="anchor" id="head-7-1"></a>

In [None]:
!ls $LOCAL_EXPERIMENT_DIR/results/exp_m1_unpruned/infer_default/images_annotated

In [None]:
# Simple grid visualizer
%matplotlib inline
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(output_path, num_cols=2, num_images=4):
    """Show up to `num_images` images from `output_path` in a grid."""
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr two-dimensional even for a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Sort the listing so the same images are shown on every run.
    a = [os.path.join(output_path, image) for image in sorted(os.listdir(output_path))
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
 # Visualizing sampled images.
OUTPUT_PATH = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_unpruned/infer_default/images_annotated/')

# Uncomment to visualize multi-scale results
# OUTPUT_PATH = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_unpruned/infer_refine/images_annotated/')

COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)
# Note that the accuracy is not guaranteed for these visualization examples.

### 7.2. Visualize annotations manually from detections <a class="anchor" id="head-7-2"></a>

This section illustrates the following:
* How the result `detections.json` can be parsed
* How the skeleton is built from the `categories` inside the result file.
* How to visualize the skeleton on an image


In [None]:
# Helper Functions
import cv2
import numpy as np


def gen_topology(skeleton):
    """Generate skeleton topology.

    Row k holds [2k, 2k + 1, start_joint, end_joint], where the joint
    indices are zero-based (the skeleton in the result file is 1-based).
    """
    K = len(skeleton)
    # np.int was removed from recent NumPy releases; the builtin int works everywhere.
    topology = np.zeros((K, 4), dtype=int)
    for k in range(K):
        topology[k][0] = 2 * k
        topology[k][1] = 2 * k + 1
        topology[k][2] = skeleton[k][0] - 1
        topology[k][3] = skeleton[k][1] - 1
    return topology


def draw_on_image(img, topology, keypoints):
    """Draw the joints and limbs of every detected person onto `img` in place."""
    peak_color = (0, 150, 255)
    edge_color = (190, 0, 254)
    stick_width = 2

    # Loop over the limbs, then over the detected people.
    for i in range(topology.shape[0]):
        start_idx = topology[i][2]
        end_idx = topology[i][3]
        for person in keypoints:
            start_joint = person[start_idx]
            end_joint = person[end_idx]
            # Skip the limb if either joint contains a zero entry (treated as missing).
            if 0 in start_joint or 0 in end_joint:
                continue
            start_pt = (int(start_joint[0]), int(start_joint[1]))
            end_pt = (int(end_joint[0]), int(end_joint[1]))
            cv2.circle(img, start_pt, 4, peak_color, thickness=-1)
            cv2.circle(img, end_pt, 4, peak_color, thickness=-1)
            cv2.line(img, start_pt, end_pt, edge_color, thickness=stick_width)

In [None]:
import os
import cv2
import IPython.display
import PIL.Image
import json
import numpy as np
# read results
results_file = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), "results/exp_m1_unpruned/infer_default/detections.json")
with open(results_file) as f:
    results = json.load(f)

# Generate the topology
skeleton = results['categories'][0]['skeleton']
topology = gen_topology(skeleton)

# get predictions
image_data = results['images'][10]
keypoints = image_data['keypoints']
image_path = image_data['full_image_path'] \
                .replace(os.getenv("USER_EXPERIMENT_DIR"), os.getenv("LOCAL_EXPERIMENT_DIR"))

# read image
img = cv2.imread(image_path)
image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
draw_on_image(image, topology, keypoints)

# display image
IPython.display.display(PIL.Image.fromarray(image))
# Note that the accuracy is not guaranteed for this visualization example.

## 8. Pruning Workflow <a class="anchor" id="head-8"></a>

### 8.1. Prune the trained model <a class="anchor" id="head-8-1"></a>

* Specify pre-trained model
* Equalization criterion (Applicable for resnets and mobilenets)
* Threshold for pruning.
* A key to save and load the model
* Output directory to store the model

*Usually, you only need to adjust `-pth` (threshold) to trade accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The threshold to use depends on the dataset. A `pth` value of 5.2e-6 is just a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.*

In some internal studies, we have noticed that a `pth` value of 0.05 is a good starting point for BodyposeNet models.
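
To build intuition for why a higher threshold yields a smaller model, the toy example below keeps only the channels whose weight norm, normalized by the largest norm in the layer, reaches the threshold. This is a simplified illustration of threshold-based channel pruning in general and makes no claim about how `tao bpnet prune` is implemented; the weights are made up.

```python
import math


def surviving_channels(channel_weights, threshold):
    """Return indices of channels whose normalized L2 norm is >= threshold."""
    norms = [math.sqrt(sum(w * w for w in ws)) for ws in channel_weights]
    largest = max(norms)
    if largest == 0:
        return []
    return [i for i, n in enumerate(norms) if n / largest >= threshold]


# Three toy channels with clearly different magnitudes.
weights = [[3.0, 4.0], [0.3, 0.4], [0.03, 0.04]]

for pth in (0.005, 0.05, 0.2):
    kept = surviving_channels(weights, pth)
    print("pth =", pth, "keeps", len(kept), "of", len(weights), "channels")
```

Raising the threshold removes more of the low-magnitude channels, which is why the retrained accuracy has to be checked after each change.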

In [None]:
 # Create an output directory if it doesn't exist.
!mkdir -p $LOCAL_EXPERIMENT_DIR/models/exp_m1_pruned

In [None]:
!tao bpnet prune -m $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/$MODEL_CHECKPOINT \
                 -o $USER_EXPERIMENT_DIR/models/exp_m1_pruned/bpnet_model.pruned-0.2.tlt \
                 -eq union \
                 -pth 0.2 \
                 -k $KEY

In [None]:
# Check if the file exists
!ls -rlt $LOCAL_EXPERIMENT_DIR/models/exp_m1_pruned/

### 8.2. Retrain the pruned model <a class="anchor" id="head-8-2"></a>

* The model needs to be retrained to recover accuracy after pruning.
* Specify the retraining specification with the pruned model as the pretrained weights.
* Follow the same instructions as in the *Run TAO Training* section for multi-GPU support.

*Note: For retraining, please set the `load_graph` option to `true` in the `model_config` to load the pruned model graph. Also, if the model shows some decrease in mAP after retraining, the originally trained model may have been pruned a little too much. Please try reducing the pruning threshold, thereby reducing the pruning ratio, and use the new model to retrain.*

In [None]:
# Print the retrain experiment file.
# Note: We have updated the experiment file to use the
# newly pruned model as the pretrained weights, and the
# load_graph option is set to true.
!cat $LOCAL_SPECS_DIR/bpnet_retrain_m1_coco.yaml

In [None]:
# Retraining using the pruned model as model graph 
!tao bpnet train -e $SPECS_DIR/bpnet_retrain_m1_coco.yaml \
                 -r $USER_EXPERIMENT_DIR/models/exp_m1_retrain \
                 -k $KEY \
                 --gpus $NUM_GPUS

In [None]:
# Listing the newly retrained model.
!ls -rlt $LOCAL_EXPERIMENT_DIR/models/exp_m1_retrain

In [None]:
# Set env for model to use for the remaining steps: infer, evaluate, export.
# NOTE: The last epoch model will additionally be tagged/saved as `bpnet_model.tlt`.
# If you want to evaluate the model from any other step, please change the env
# variable below to the filename of that checkpoint.
# Example:
# %set_env RETRAIN_MODEL_CHECKPOINT=model.step-1152500.tlt
%set_env RETRAIN_MODEL_CHECKPOINT=bpnet_model.tlt

### 8.3. Evaluate the retrained model <a class="anchor" id="head-8-3"></a>

This section evaluates the pruned and retrained model using `bpnet evaluate`. If you see a large drop in accuracy, please adjust the pruning threshold or retraining parameters accordingly.

In [None]:
# Single-scale inference
!tao bpnet evaluate --inference_spec $SPECS_DIR/infer_spec_retrained.yaml \
                    --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
                    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
                    --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_retrain/eval_default \
                    -k $KEY

# Uncomment this section for Multi-scale inference
# !tao bpnet evaluate  --inference_spec $SPECS_DIR/infer_spec_retrained_refine.yaml \
#                      --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
#                      --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
#                      --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_retrain/eval_refine \
#                      -k $KEY

In [None]:
# Check if the bodypose evaluation results file is generated.
!if [ ! -f $LOCAL_EXPERIMENT_DIR/results/exp_m1_retrain/eval_default/results.csv ]; then echo 'Bodypose Evaluation results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/results/exp_m1_retrain/eval_default/results.csv;fi

### 8.4. Inference using retrained model <a class="anchor" id="head-8-4"></a>


In [None]:
# Single-scale inference
!tao bpnet inference  --inference_spec $SPECS_DIR/infer_spec_retrained.yaml \
                      --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
                      --input_type json \
                      --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
                      --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_retrain/infer_default \
                      --dump_visualizations \
                      -k $KEY

# Uncomment this section for Multi-scale inference
# !tao bpnet inference  --inference_spec $SPECS_DIR/infer_spec_retrained_refine.yaml \
#                       --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
#                       --input_type json \
#                       --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
#                       --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_retrain/infer_refine \
#                       --dump_visualizations \
#                       -k $KEY

### 8.5. Visualize retrained model inferences <a class="anchor" id="head-8-5"></a>


In [None]:
# Visualize retrained model inferences
OUTPUT_PATH = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_retrain/infer_default/images_annotated/')

# Uncomment to visualize multi-scale results
# OUTPUT_PATH = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_retrain/infer_refine/images_annotated/')

COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 9. Model Export and INT8 Quantization <a class="anchor" id="head-9"></a>

### 9.1. Choose network input resolution for deployment  <a class="anchor" id="head-9-1"></a>

The network input resolution is one of the major factors that determine the accuracy of bottom-up approaches. Bottom-up methods feed the whole image to the network at once, which results in a smaller resolution per person. Hence, a higher input resolution yields better accuracy, especially for persons at small and medium scales (relative to the image scale). However, a higher input resolution also increases the runtime of the CNN. The accuracy/runtime tradeoff should therefore be decided based on the accuracy and runtime requirements of the target use case.

**Height of the desired network**
Depending on the target use case and the compute or latency constraints, choose the resolution that works best for you. If your application involves pose estimation for one or more persons close to the camera, such that the scale of each person is relatively large, you can go with a smaller network input resolution. If you are targeting persons with smaller relative scales, as in crowded scenes, you might want a higher network input resolution. For instance, if a person in your application has a height of about 25% of the image, the final resized person height would be 56px for a network height of 224, 72px for a network height of 288, and 80px for a network height of 320. The network with a height of 320 gives the person the highest resolution and hence would be more accurate.

**Width of the desired network**
Once you fix the height of the network, the width can be decided based on the aspect ratio of the input data used at deployment time. Alternatively, you can pick the multiple of 32/64 that is closest to the width implied by that aspect ratio.

*NOTE: The height and width should be multiples of 8, preferably multiples of 16/32/64.*
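The arithmetic above can be sketched in a few lines of Python. The helper names (`person_height_px`, `nearest_multiple`, `network_width`) are hypothetical and not part of TAO, and the 4:3 aspect ratio is only an assumed example of deployment data.

```python
def person_height_px(network_height, person_fraction):
    """Height in pixels of a person after the image is resized to the network height."""
    return network_height * person_fraction


def nearest_multiple(value, multiple=32):
    """Round value to the nearest positive multiple of `multiple`."""
    return max(multiple, int(value / multiple + 0.5) * multiple)


def network_width(network_height, aspect_ratio, multiple=32):
    """Pick a network width close to the given width/height aspect ratio."""
    return nearest_multiple(network_height * aspect_ratio, multiple)


# A person covering 25% of the image height, for the three heights discussed above.
for height in (224, 288, 320):
    print(height, person_height_px(height, 0.25))  # 56.0, 72.0 and 80.0 pixels

# Assumed 4:3 deployment aspect ratio with a network height of 288.
print(network_width(288, 4 / 3))  # 384
```

Rounding to a multiple of 32 also satisfies the multiple-of-8 requirement in the note above.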

**Illustration of accuracy/runtime variation for different resolutions**

*Note: These are approximate runtimes/accuracies for the default architecture and spec used in the notebook. Any changes to the architecture or parameters will yield different results. This is primarily to give a better sense of which resolution would suit your needs. The runtimes provided are for the CNN.*

| Input Resolution | Precision | Runtime (GeForce RTX 2080) | Runtime (Jetson AGX) |
| :-----------: | :-----------: | :-----------: | :-----------: |
| 320x448     | FP16        | 3.13ms    | 18.8ms    |
| 288x384     | FP16        | 2.58ms    | 12.8ms    |
| 224x320     | FP16        | 2.27ms    | 10.1ms    |
| 320x448     | INT8        | 1.80ms    | 8.90ms    |
| 288x384     | INT8        | 1.56ms    | 6.38ms    |
| 224x320     | INT8        | 1.33ms    | 5.07ms    |

You can expect a 7-10% mAP increase in the `area=medium` category when going from 224x320 to 288x384, and an additional 7-10% mAP when going to 320x448. The accuracy for `area=large` remains almost the same across these resolutions, so you can stick to a lower resolution if large persons are all you need. As per [COCO keypoint evaluation](https://cocodataset.org/#keypoints-eval), the `medium` category covers persons occupying an area between 32^2 and 96^2 pixels. Anything above that is categorized as `large`.
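As a quick way to see which category the persons in your own data fall into, the area thresholds can be written as a small helper. This is a minimal sketch: `coco_area_category` is a hypothetical function, it uses the 32^2 and 96^2 pixel thresholds documented for COCO evaluation, and it assumes the input is the `area` value of a COCO annotation. The `small` label is included only for completeness, since the keypoint evaluation reports `medium` and `large`.

```python
MEDIUM_MIN_AREA = 32 ** 2  # 1024 pixels
LARGE_MIN_AREA = 96 ** 2   # 9216 pixels


def coco_area_category(area_px):
    """Map a person's area in pixels to a COCO-style size category."""
    if area_px < MEDIUM_MIN_AREA:
        return "small"
    if area_px < LARGE_MIN_AREA:
        return "medium"
    return "large"


print(coco_area_category(50 * 50))    # medium
print(coco_area_category(120 * 120))  # large
```

If most persons in your deployment data land in `medium`, the higher resolutions in the table above are more likely to pay off.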


**Default size used in the notebook**
We use a default size of `288x384` as a tradeoff between accuracy and runtime. The remainder of the notebook assumes this configuration. If you would like to use a different resolution, you need to make the following changes:
1. Update the environment variables in the cell below with the desired shape.
2. Update the `input_shape` in `infer_spec_strict.yaml` and `infer_spec_retrained_strict.yaml`, which allows you to do a sanity evaluation of the exported TRT model. By default, it is set to `[288, 384]`.
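When changing the resolution, it is easy to update one value and forget another. The following sketch derives all of the values from a single height/width pair so that they stay consistent. `deployment_shape_settings` is a hypothetical helper, not part of TAO. The keys mirror the environment variables set in the next cell, plus the `input_shape` entry of the strict inference specs.

```python
def deployment_shape_settings(height, width, channels=3):
    """Derive the shape-related settings used in this notebook from one resolution."""
    for name, value in (("height", height), ("width", width)):
        if value % 8 != 0:
            raise ValueError("{} must be a multiple of 8, got {}".format(name, value))
    return {
        "IN_HEIGHT": str(height),
        "IN_WIDTH": str(width),
        "IN_CHANNELS": str(channels),
        "INPUT_SHAPE": "{}x{}x{}".format(height, width, channels),
        "input_shape": [height, width],  # value for the strict inference spec files
    }


settings = deployment_shape_settings(288, 384)
print(settings["INPUT_SHAPE"])  # 288x384x3
print(settings["input_shape"])  # [288, 384]
```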


In [None]:
# Set dimensions of desired output model for inference/deployment
%set_env IN_HEIGHT=288
%set_env IN_WIDTH=384
%set_env IN_CHANNELS=3
%set_env INPUT_SHAPE=288x384x3

# Set input name
%set_env INPUT_NAME=input_1:0

### 9.2. Export `.etlt` model.<a class="anchor" id="head-9-2"></a>

Use the export functionality to export an encrypted model in `fp32` format without any optimizations.

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/models/exp_m1_final
# Remove any pre-existing copy of the etlt file.
output_file = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), "models/exp_m1_final/bpnet_model.etlt")

if os.path.exists(output_file):
    os.remove(output_file)

# Export the pruned model as is, in fp32, with no optimizations.
!tao bpnet export -m $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
                  -e $SPECS_DIR/bpnet_retrain_m1_coco.yaml \
                  -o $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                  -k $KEY \
                  -t tfonnx

# Use command below, if you'd like to export the unpruned version.
# !tao bpnet export -m $USER_EXPERIMENT_DIR/models/exp_m1_unpruned/$MODEL_CHECKPOINT \
#                   -e $SPECS_DIR/bpnet_train_m1_coco.yaml \
#                   -o $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
#                   -k $KEY \
#                   -t tfonnx

In [None]:
# Check that the deployment file is present.
!if [ ! -f $LOCAL_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt ]; then echo 'Deployment file not found, please generate.'; else echo 'Found deployment file.';fi

### 9.3. Int8 Optimization<a class="anchor" id="head-9-3"></a>

The BodyposeNet model supports int8 inference mode in TensorRT. To use it, the model is first calibrated to run 8-bit inferences. This is the process:
* Provide a directory with a set of images to be used for calibration.
* A calibration tensorfile is generated and saved in `--cal_data_file`.
* This tensorfile is used to calibrate the model, and the calibration table is stored in `--cal_cache_file`.
* The calibration table, in addition to the model, is used to generate the int8 TensorRT engine at the path `--engine_file`.

Since the COCO dataset also contains many non-person images, which might not be useful for calibration, we use a sampling script that parses the annotations and randomly samples the required number of images based on certain criteria. The following command ensures that there is at least one person in each image picked (`pth` corresponds to the threshold for the minimum number of persons per image). You can remove the `--randomize` flag to always pick the same subset of qualified images.

*Note: For this example, we generate a calibration tensorfile containing 2000 batches of training data. Ideally, it is best to use at least 10-20% of the training data to do so. The more data provided during calibration, the closer int8 inferences are to fp32 inferences.*
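To make the sampling criterion concrete, here is a minimal sketch of how images could be filtered by person count. It is not the contents of `sample_calibration_images.py`. The function name, its parameters, and the tiny in-memory annotation dictionary are assumptions for illustration. It relies only on the standard COCO annotation layout, in which each annotation carries an `image_id` and a `category_id` (1 for person).

```python
import random
from collections import Counter


def sample_calibration_image_ids(coco, num_samples, min_persons=1,
                                 randomize=True, seed=None):
    """Return up to num_samples image ids that contain at least min_persons persons."""
    person_counts = Counter(
        ann["image_id"] for ann in coco["annotations"]
        if ann.get("category_id") == 1
    )
    qualified = sorted(
        image_id for image_id, count in person_counts.items()
        if count >= min_persons
    )
    if randomize:
        # Without shuffling, the same leading subset is returned on every call.
        random.Random(seed).shuffle(qualified)
    return qualified[:num_samples]


# Toy annotations: image 1 has two persons, image 2 has one.
toy = {"annotations": [
    {"image_id": 1, "category_id": 1},
    {"image_id": 1, "category_id": 1},
    {"image_id": 2, "category_id": 1},
]}
print(sample_calibration_image_ids(toy, 10, min_persons=2, randomize=False))  # [1]
```

Raising `min_persons` biases the calibration set toward crowded scenes, which may or may not match your deployment data.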

In [None]:
# Number of calibration samples to use
%set_env NUM_CALIB_SAMPLES=2000

In [None]:
!python3 sample_calibration_images.py \
    -a $LOCAL_EXPERIMENT_DIR/data/annotations/person_keypoints_train2017.json \
    -i $LOCAL_EXPERIMENT_DIR/data/train2017/ \
    -o $LOCAL_EXPERIMENT_DIR/data/calibration_samples/ \
    -n $NUM_CALIB_SAMPLES \
    -pth 1 \
    --randomize

In [None]:
output_file = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), "models/exp_m1_final/bpnet_model.etlt")
# NOTE: If you are trying to re-run calibration, please remove the calibration table (cal_cache_file).
# If you are trying to re-generate calibration data, please remove cal_data_file as well.

if os.path.exists(output_file):
    os.remove(output_file)

!tao bpnet export \
    -m $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
    -o $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
    -k $KEY \
    -d $IN_HEIGHT,$IN_WIDTH,$IN_CHANNELS \
    -e $SPECS_DIR/bpnet_retrain_m1_coco.yaml \
    -t tfonnx \
    --data_type int8 \
    --engine_file $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
    --cal_image_dir $USER_EXPERIMENT_DIR/data/calibration_samples/ \
    --cal_cache_file $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin  \
    --cal_data_file $USER_EXPERIMENT_DIR/models/exp_m1_final/coco.$IN_HEIGHT.$IN_WIDTH.tensorfile \
    --batch_size 1 \
    --batches $NUM_CALIB_SAMPLES \
    --max_batch_size 1 \
    --data_format channels_last

### 9.4. Generate TensorRT engine<a class="anchor" id="head-9-4"></a>
Here, we use another method of generating the TensorRT engine. If you have another NVIDIA GPU device for which you'd like to optimize the `.etlt` model, you can use the `tao-converter` command on that device with your `.etlt` model and calibration cache to generate the optimized TensorRT engine.

Verify engine generation using the `tao-converter` utility included with the docker.

The `tao-converter` produces optimized TensorRT engines for the platform that it resides on. Therefore, to get maximum performance, please instantiate this docker and execute the `tao-converter` command with the exported `.etlt` file and calibration cache (for int8 mode) on your target device.

The `tao-converter` utility included in this docker only works for x86 devices with discrete NVIDIA GPUs. For Jetson devices, please download the `tao-converter` for Jetson from the dev zone link [here](https://developer.nvidia.com/tao-converter).
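The converter cells in this section pass an optimization profile through `-p`, made up of the input name followed by the minimum, optimal, and maximum input shapes. The sketch below shows how that string is assembled from the environment variables used in this notebook. `optimization_profile` is a hypothetical helper written only to illustrate the format.

```python
def optimization_profile(input_name, input_shape, opt_batch, max_batch, min_batch=1):
    """Build '<name>,<min_shape>,<opt_shape>,<max_shape>' with batch-prefixed shapes."""
    if not min_batch <= opt_batch <= max_batch:
        raise ValueError("Expected min_batch <= opt_batch <= max_batch.")
    shapes = ["{}x{}".format(batch, input_shape)
              for batch in (min_batch, opt_batch, max_batch)]
    return ",".join([input_name] + shapes)


# Values matching INPUT_NAME, INPUT_SHAPE, OPT_BATCH_SIZE and MAX_BATCH_SIZE in this notebook.
print(optimization_profile("input_1:0", "288x384x3", opt_batch=1, max_batch=1))
# input_1:0,1x288x384x3,1x288x384x3,1x288x384x3
```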


In [None]:
# Set opt profile shapes
%set_env MAX_BATCH_SIZE=1
%set_env OPT_BATCH_SIZE=1

In [None]:
# Convert to TensorRT engine(FP32).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t fp32 \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.fp32.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE

In [None]:
# Convert to TensorRT engine(FP16).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t fp16 \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.fp16.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE

In [None]:
# Convert to TensorRT engine(INT8).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t int8 \
                -c $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE

## 10. Verify TensorRT model and Deploy <a class="anchor" id="head-10"></a>

Verify the exported model by visualizing inferences on TensorRT.
In addition to running inference on a `.tlt` model, the inference tool is also capable of consuming the converted TensorRT engine.


### 10.1. Inference Using TensorRT Engine <a class="anchor" id="head-10-1"></a>

Please make sure to update the inference_spec file if you are using a resolution other than the default.

In [None]:
# Set helper envs
%set_env INFER_DIR_NAME=infer_strict_${IN_HEIGHT}_${IN_WIDTH}
%set_env EVAL_DIR_NAME=eval_strict_${IN_HEIGHT}_${IN_WIDTH}

In [None]:
# INT8 inference
!tao bpnet inference --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
                     --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                     --input_type json \
                     --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
                     --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_final/${INFER_DIR_NAME}_int8 \
                     --dump_visualizations

# FP16 inference
# !tao bpnet inference --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
#                      --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.fp16.engine \
#                      --input_type json \
#                      --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
#                      --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_final/${INFER_DIR_NAME}_fp16 \
#                      --dump_visualizations

# FP32 inference
# !tao bpnet inference --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
#                      --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.fp32.engine \
#                      --input_type json \
#                      --input $USER_EXPERIMENT_DIR/data/viz_example_data.json \
#                      --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_final/${INFER_DIR_NAME}_fp32 \
#                      --dump_visualizations

### 10.2. Visualize TensorRT Inferences<a class="anchor" id="head-10-2"></a>

In [None]:
# Visualize trt inferences
OUTPUT_PATH = os.path.join(
    os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_final/infer_strict_{}_{}_int8/images_annotated/'.format(
        os.getenv("IN_HEIGHT"), os.getenv("IN_WIDTH")))

# Uncomment to visualize FP16/FP32 results
# OUTPUT_PATH = os.path.join(
#     os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_final/infer_strict_{}_{}_fp16/images_annotated/'.format(
#         os.getenv("IN_HEIGHT"), os.getenv("IN_WIDTH")))
# OUTPUT_PATH = os.path.join(
#     os.getenv("LOCAL_EXPERIMENT_DIR"), 'results/exp_m1_final/infer_strict_{}_{}_fp32/images_annotated/'.format(
#         os.getenv("IN_HEIGHT"), os.getenv("IN_WIDTH")))

COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

### 10.3. Evaluate the TensorRT Engine<a class="anchor" id="head-10-3"></a>

*Note: This evaluation is mainly used as a sanity check for the exported TRT (INT8/FP16) models. It doesn't reflect the true accuracy of the model, as the input aspect ratio here can vary a lot from the aspect ratio of the images in the validation set (which has a collection of images with various resolutions). Here, we retain a strict input resolution and pad the image to retain the aspect ratio. So the accuracy here might vary based on the aspect ratio and the network resolution you choose.*

We run the evaluation of the `.tlt` model in strict mode as well to compare with the accuracies of the INT8/FP16/FP32 models for any drop in accuracy. 

The FP16/FP32 models should have no or minimal drop in accuracy when compared to the `.tlt` model in this step. The INT8 models would have similar accuracies (or comparable within 2-3% mAP range) to the `.tlt` model. 

Note: If the accuracy of the INT8 inferences seems to degrade after INT8 calibration, there are a few possible reasons:
- There wasn't enough data in the calibration tensorfile used to calibrate the model.
- The training data is not entirely representative of your test images, and the calibration may be incorrect. In that case, you may either regenerate the calibration tensorfile with more batches of the training data and recalibrate the model, or add a few images from the test set.
- When using calibration data sampling, the randomly sampled subset may not be a good representative of the test dataset, which can also lead to poor calibration. You can rerun the sampling script, increase the number of samples, or modify criteria such as the minimum-person and minimum-keypoint thresholds.

*For more information, please follow the instructions in the USER GUIDE. Alternatively, you can opt for the corresponding `fp16` model instead of `int8`.*
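Once you have read the mAP values from the evaluation results, the comparison described above can be expressed as a simple check. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the mAP values are made-up placeholders rather than real results, and the "2-3% mAP" range is interpreted as an absolute difference on mAP expressed as a fraction between 0 and 1.

```python
def accuracy_drop(reference_map, candidate_map):
    """Absolute mAP drop of the candidate engine relative to the reference model."""
    return reference_map - candidate_map


def within_tolerance(reference_map, candidate_map, max_drop=0.03):
    """True if the candidate loses no more than max_drop mAP versus the reference."""
    return accuracy_drop(reference_map, candidate_map) <= max_drop


# Placeholder numbers for illustration only; substitute the values you measured.
tlt_map = 0.50
int8_map = 0.48
if within_tolerance(tlt_map, int8_map):
    print("INT8 accuracy is within the expected range of the .tlt model.")
else:
    print("INT8 accuracy dropped more than expected; consider recalibrating or using FP16.")
```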

In [None]:
# .tlt model evaluation in strict mode
# Single-scale inference
!tao bpnet evaluate --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
                    --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
                    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
                    --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_retrain/$EVAL_DIR_NAME \
                    -k $KEY

In [None]:
# Check if the tao bodypose evaluation results file is generated.
!if [ ! -f $LOCAL_EXPERIMENT_DIR/results/exp_m1_retrain/eval_strict_${IN_HEIGHT}_${IN_WIDTH}/results.csv ]; then echo '.tlt model evaluation results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/results/exp_m1_retrain/eval_strict_${IN_HEIGHT}_${IN_WIDTH}/results.csv;fi

In [None]:
# FP16 evaluation
!tao bpnet evaluate --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
                    --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.fp16.engine \
                    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
                    --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_final/${EVAL_DIR_NAME}_fp16

# Uncomment if you'd like to evaluate FP32 model
# # FP32 evaluation
# !tao bpnet evaluate --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
#                     --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.fp32.engine \
#                     --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
#                     --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_final/${EVAL_DIR_NAME}_fp32

In [None]:
# INT8 evaluation
!tao bpnet evaluate --inference_spec $SPECS_DIR/infer_spec_retrained_strict.yaml \
                    --model_filename $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                    --dataset_spec $DATA_POSE_SPECS_DIR/coco_spec.json \
                    --results_dir $USER_EXPERIMENT_DIR/results/exp_m1_final/${EVAL_DIR_NAME}_int8

In [None]:
# Check if the INT8/FP16 model evaluation results files are generated.
!if [ ! -f $LOCAL_EXPERIMENT_DIR/results/exp_m1_final/eval_strict_${IN_HEIGHT}_${IN_WIDTH}_int8/results.csv ]; then echo 'INT8 model evaluation results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/results/exp_m1_final/eval_strict_${IN_HEIGHT}_${IN_WIDTH}_int8/results.csv;fi
!if [ ! -f $LOCAL_EXPERIMENT_DIR/results/exp_m1_final/eval_strict_${IN_HEIGHT}_${IN_WIDTH}_fp16/results.csv ]; then echo 'FP16 model evaluation results file not found!'; else cat $LOCAL_EXPERIMENT_DIR/results/exp_m1_final/eval_strict_${IN_HEIGHT}_${IN_WIDTH}_fp16/results.csv;fi

### 10.4. Export Deployable Model <a class="anchor" id="head-10-4"></a>

Once the model is verified, we need to re-export it so that it can run on inference platforms like TAO CV Inference or DeepStream. The guidelines are the same as in the `Export .etlt` and `INT8 Optimization` sections, but we need to add the `--sdk_compatible_model` flag to the export command. This adds a few non-trainable post-processing layers to the model.

Please make sure to re-use the calibration tensorfile (`--cal_data_file`) generated in the previous step to keep it consistent, but you will need to regenerate the `cal_cache_file` and the `.etlt` model.

*NOTE: This model will not work with the bpnet inference / evaluate commands. It is for deployment only.*
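If you re-run this export, a stale calibration table from an earlier run may need to be removed first, while the tensorfile is kept for reuse. The sketch below shows one way to do that. Both helpers are hypothetical, and the file name patterns are taken from the export commands in this notebook (`calibration.<height>.<width>[.deploy].bin` and `coco.<height>.<width>.tensorfile`).

```python
import os


def stale_calibration_files(model_dir, height, width, suffix="", regenerate_data=False):
    """List calibration artifacts that should be removed before re-running export."""
    tag = "{}.{}".format(height, width)
    paths = [os.path.join(model_dir, "calibration.{}{}.bin".format(tag, suffix))]
    if regenerate_data:
        # Only drop the tensorfile when the calibration data itself must be rebuilt.
        paths.append(os.path.join(model_dir, "coco.{}.tensorfile".format(tag)))
    return paths


def remove_if_present(paths):
    """Delete the given files if they exist and return the ones actually removed."""
    removed = []
    for path in paths:
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed


# Example usage for the deployable export (uncomment to run):
# model_dir = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), "models/exp_m1_final")
# remove_if_present(stale_calibration_files(model_dir, 288, 384, suffix=".deploy"))
```

With the default `regenerate_data=False`, the tensorfile is left in place, which matches the advice above to reuse it across exports.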

In [None]:
output_file = os.path.join(os.getenv("LOCAL_EXPERIMENT_DIR"), "models/exp_m1_final/bpnet_model.deploy.etlt")
if os.path.exists(output_file):
    os.remove(output_file)

!tao bpnet export \
    -m $USER_EXPERIMENT_DIR/models/exp_m1_retrain/$RETRAIN_MODEL_CHECKPOINT \
    -o $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.deploy.etlt \
    -k $KEY \
    -d $IN_HEIGHT,$IN_WIDTH,$IN_CHANNELS \
    -e $SPECS_DIR/bpnet_retrain_m1_coco.yaml \
    -t tfonnx \
    --data_type int8 \
    --engine_file $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.deploy.engine \
    --cal_image_dir $USER_EXPERIMENT_DIR/data/calibration_samples/ \
    --cal_cache_file $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.deploy.bin  \
    --cal_data_file $USER_EXPERIMENT_DIR/models/exp_m1_final/coco.$IN_HEIGHT.$IN_WIDTH.tensorfile \
    --batch_size 1 \
    --batches $NUM_CALIB_SAMPLES \
    --max_batch_size 1 \
    --data_format channels_last \
    --sdk_compatible_model

You can now generate the corresponding TRT engines (INT8 / FP16 / FP32) for the target platforms using `tao-converter`, as shown in the previous section, with the generated etlt model (`bpnet_model.deploy.etlt`) and calibration table (`calibration.*.deploy.bin`).