# Object Detection using TAO FasterRCNN

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with your own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080"> 

 ## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained ResNet-18 model and train a ResNet-18 FasterRCNN model on a KITTI-format dataset
* Prune the trained FasterRCNN model
* Retrain the pruned model to recover lost accuracy
* Run evaluation & inference on the trained model to verify the accuracy
* Export & deploy the model in DeepStream/TensorRT
* Run the Quantization-Aware Training (QAT) workflow for the best accuracy-performance trade-off

At the end of this notebook, you will have generated a trained and optimized `faster_rcnn` model
which you may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps)
or [DeepStream](https://developer.nvidia.com/deepstream-sdk).
 
 ### Table of Contents

 This notebook shows an example use case of FasterRCNN using Train Adapt Optimize (TAO) Toolkit.

 0. [Set up env variables and map drives](#head-0)
 1. [Install the TAO launcher](#head-1)
 2. [Prepare dataset and pretrained model](#head-2)<br>
     2.1 [Download the dataset](#head-2-1)<br>
     2.2 [Verify the downloaded dataset](#head-2-2)<br>
     2.3 [Prepare tfrecords from kitti format dataset](#head-2-3)<br>
     2.4 [Download pretrained model](#head-2-4)
 3. [Provide training specification](#head-3)
 4. [Run TAO training](#head-4)
 5. [Evaluate trained models](#head-5)
 6. [Prune trained models](#head-6)
 7. [Retrain pruned models](#head-7)
 8. [Evaluate retrained model](#head-8)
 9. [Visualize inferences](#head-9)
 10. [Deploy](#head-10)
 11. [QAT workflow](#head-11)<br>
     11.1 [Training](#head-11.1)<br>
     11.2 [Evaluation](#head-11.2)<br>
     11.3 [Pruning](#head-11.3)<br>
     11.4 [Retraining](#head-11.4)<br>
     11.5 [Evaluation of the retrained model](#head-11.5)<br>
     11.6 [Inference of the retrained model](#head-11.6)<br>
     11.7 [Deployment of the QAT model](#head-11.7)

 ## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>
 
This notebook requires you to set an environment variable called `$LOCAL_PROJECT_DIR` to the path of your workspace. More information on how to set up the dataset and on the supported steps in the TAO workflow is provided in the subsequent cells.

In [19]:
import os

print("Please replace the variable with your key.")
%env GPU_INDEX=0
%env KEY=amgyMTMzcDc3ZDY0MHUyN3FrMWFpa2E5bHI6MGQ2ODg0YzEtYmZkOC00YWJlLTk5NjQtYmMyMDYxZTU1NjNl
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/faster_rcnn
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
%env NOTEBOOK_ROOT=/home/msc1/workspace/tao-experiments/faster_rcnn

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/faster_rcnn
%env LOCAL_PROJECT_DIR=/home/msc1/workspace/tao-experiments
os.environ["LOCAL_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data")
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "faster_rcnn")

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tao-experiments/faster_rcnn/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

Please replace the variable with your key.
env: GPU_INDEX=0
env: KEY=amgyMTMzcDc3ZDY0MHUyN3FrMWFpa2E5bHI6MGQ2ODg0YzEtYmZkOC00YWJlLTk5NjQtYmMyMDYxZTU1NjNl
env: USER_EXPERIMENT_DIR=/workspace/tao-experiments/faster_rcnn
env: DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
env: NOTEBOOK_ROOT=/home/msc1/workspace/tao-experiments/faster_rcnn
env: LOCAL_PROJECT_DIR=/home/msc1/workspace/tao-experiments
env: SPECS_DIR=/workspace/tao-experiments/faster_rcnn/specs
total 68
-rw-r--r-- 1 msc1 msc1 3735 Jul 11 23:54 default_spec_resnet10.txt
-rw-r--r-- 1 msc1 msc1 3756 Jul 11 23:54 default_spec_mobilenet_v2.txt
-rw-r--r-- 1 msc1 msc1 3753 Jul 11 23:54 default_spec_mobilenet_v1.txt
-rw-r--r-- 1 msc1 msc1 3738 Jul 11 23:54 default_spec_googlenet.txt
-rw-r--r-- 1 msc1 msc1 3830 Jul 11 23:54 default_spec_efficientnet_b1.txt
-rw-r--r-- 1 msc1 msc1 3830 Jul 11 23:54 default_spec_efficientnet_b0.txt
-rw-r--r-- 1 msc1 msc1 3740 Jul 11 23:54 default_spec_darknet53.txt
-rw-r--r-- 1 msc1 msc1 3740 J
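
Every later cell depends on the variables set above, so it can help to fail early when one of them is missing. The following is a minimal sketch and not part of the TAO samples; the helper name `missing_env_vars` and the list `REQUIRED_VARS` are hypothetical and only collect the names used in the cell above.

```python
import os

# Variables read by the cells in this notebook (names taken from the cell above).
REQUIRED_VARS = [
    "GPU_INDEX", "KEY", "USER_EXPERIMENT_DIR", "DATA_DOWNLOAD_DIR",
    "LOCAL_PROJECT_DIR", "LOCAL_DATA_DIR", "LOCAL_EXPERIMENT_DIR",
    "LOCAL_SPECS_DIR", "SPECS_DIR",
]


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Unset variables: {}".format(", ".join(missing)))
else:
    print("All required variables are set.")
```

Passing a plain dictionary as `environ` makes the helper easy to try out without touching the real environment.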

In [18]:
# Create local dir
!mkdir -p $LOCAL_DATA_DIR
!mkdir -p $LOCAL_EXPERIMENT_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that data and results are shared between the host and the container. For more information, please refer to the [launcher instance](https://docs.nvidia.com/tao/tao-toolkit/tao_launcher.html) section in the user guide.

When running this cell on AWS, replace the `drive_map` definition with the dictionary below (note the added `DockerOptions` entry), so that you don't run into permission issues when writing data into folders created by the TAO docker.

```python
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid())
    },
    # set gpu index for tao-converter
    "Envs": [
        {"variable": "CUDA_VISIBLE_DEVICES", "value": os.getenv("GPU_INDEX")},
    ]
}
```

In [21]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    # set gpu index for tao-converter
    "Envs": [
        {"variable": "CUDA_VISIBLE_DEVICES", "value": os.getenv("GPU_INDEX")},
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [22]:
!cat ~/.tao_mounts.json

{
    "Mounts": [
        {
            "source": "/home/msc1/workspace/tao-experiments",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/msc1/workspace/tao-experiments/faster_rcnn/specs",
            "destination": "/workspace/tao-experiments/faster_rcnn/specs"
        }
    ],
    "Envs": [
        {
            "variable": "CUDA_VISIBLE_DEVICES",
            "value": "0"
        }
    ]
}
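
Because of this mapping, paths passed to `tao` commands must be written as the container sees them, not as they appear on the host. The sketch below illustrates the translation under the assumption that the most specific (longest) matching mount wins; `to_container_path` is a hypothetical helper for illustration and not part of the TAO launcher.

```python
import posixpath


def to_container_path(host_path, mounts):
    """Map a host path to the path seen inside the container.

    `mounts` is a list of {"source": ..., "destination": ...} dictionaries,
    shaped like the "Mounts" section of ~/.tao_mounts.json.
    """
    host_path = posixpath.normpath(host_path)
    best_source, best_destination = None, None
    for mount in mounts:
        source = posixpath.normpath(mount["source"])
        inside = host_path == source or host_path.startswith(source + "/")
        # Assumption: prefer the most specific (longest) matching source directory.
        if inside and (best_source is None or len(source) > len(best_source)):
            best_source, best_destination = source, mount["destination"]
    if best_source is None:
        raise ValueError("{} is not under any mounted directory".format(host_path))
    relative = posixpath.relpath(host_path, best_source)
    if relative == ".":
        return best_destination
    return posixpath.join(best_destination, relative)


mounts = [
    {"source": "/home/msc1/workspace/tao-experiments",
     "destination": "/workspace/tao-experiments"},
    {"source": "/home/msc1/workspace/tao-experiments/faster_rcnn/specs",
     "destination": "/workspace/tao-experiments/faster_rcnn/specs"},
]
print(to_container_path("/home/msc1/workspace/tao-experiments/data/train", mounts))
```

With the mounts shown above, the host data directory resolves to the same `/workspace/tao-experiments/data/train` path that the spec files use.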

## 1. Install the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a wheel on PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual environment with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual environment through the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where `x >= 6` and `x <= 8`.

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 460+
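
Checking an installed version against minimums like those listed above comes down to comparing version strings numerically rather than alphabetically. The following is a rough sketch under the assumption that every run of digits in the version string is one comparable component (so a package revision such as `-1` becomes a fourth component); `version_tuple` and `is_newer` are hypothetical helpers, and the installed versions in the example are made up for illustration.

```python
import re


def version_tuple(version):
    """Turn a version such as '19.03.5' or '1.3.0-1' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def is_newer(installed, minimum):
    """Return True if `installed` is strictly newer than `minimum`."""
    return version_tuple(installed) > version_tuple(minimum)


# Illustrative values only; substitute the versions reported on your machine.
print(is_newer("20.10.7", "19.03.5"))
print(is_newer("1.3.0-1", "1.3.0-1"))
```

Comparing tuples of integers avoids the pitfall of plain string comparison, where `"9.0"` would sort after `"19.03"`.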

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

In [None]:
# Skip this step if you have already installed the TAO launcher.
!pip3 install --upgrade nvidia-tao

In [3]:
# View the versions of the TAO launcher
!tao info

Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit']
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023


 ## 2. Prepare dataset and pretrained model <a class="anchor" id="head-2"></a>


The dataset is expected to be extracted so that you have
 * training images in `$LOCAL_DATA_DIR/train/images`
 * training labels in `$LOCAL_DATA_DIR/train/labels`
 * testing images in `$LOCAL_DATA_DIR/test/images`
 
You may use this notebook with your own dataset as well. To do so, please follow the same directory structure as described above.

*Note: There are no labels for the testing images, therefore we use it just to visualize inferences for the trained model.*

In [5]:
# verify
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "train/images")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "train/labels")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "test/images")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))

Number of images in the train/val set. 6884
Number of labels in the train/val set. 6884
Number of images in the test set. 981
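
Matching counts do not guarantee that every image has its own label file. A stricter check compares file names. The sketch below assumes that an image and its label share the same file stem (for example `x.jpg` and `x.txt`); `unpaired_files` is a hypothetical helper, not part of the TAO samples.

```python
import os


def unpaired_files(images_dir, labels_dir):
    """Return (image stems without a label, label stems without an image)."""
    image_stems = {os.path.splitext(name)[0] for name in os.listdir(images_dir)}
    label_stems = {os.path.splitext(name)[0] for name in os.listdir(labels_dir)}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


data_dir = os.environ.get("LOCAL_DATA_DIR", "")
images_dir = os.path.join(data_dir, "train", "images")
labels_dir = os.path.join(data_dir, "train", "labels")
# Only run the check when the dataset directories actually exist.
if os.path.isdir(images_dir) and os.path.isdir(labels_dir):
    no_label, no_image = unpaired_files(images_dir, labels_dir)
    print("Images without labels: {}".format(len(no_label)))
    print("Labels without images: {}".format(len(no_image)))
```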


In [5]:
# Sample kitti label.
!cat $LOCAL_DATA_DIR/train/labels/0_jpg.rf.c4f611bd2d74c025eba5621935579ef6.txt

knife 0.00 0 0.0 289.00 228.00 380.00 388.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
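
Each line of a KITTI label file holds 15 whitespace-separated fields: the class name, truncation, occlusion, the observation angle `alpha`, the 2D bounding box (`xmin ymin xmax ymax` in pixels), the 3D dimensions, the 3D location, and `rotation_y`. For 2D detection, only the class name and the bounding box carry information, which is why the remaining fields in the sample are zero. A minimal parsing sketch follows; the `KittiLabel` tuple and `parse_kitti_line` are hypothetical names introduced for illustration.

```python
from collections import namedtuple

KittiLabel = namedtuple(
    "KittiLabel",
    ["name", "truncated", "occluded", "alpha",
     "bbox", "dimensions", "location", "rotation_y"],
)


def parse_kitti_line(line):
    """Parse one line of a KITTI label file into a KittiLabel."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got {}".format(len(fields)))
    values = [float(value) for value in fields[1:]]
    return KittiLabel(
        name=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),          # xmin, ymin, xmax, ymax
        dimensions=tuple(values[7:10]),   # height, width, length
        location=tuple(values[10:13]),    # x, y, z
        rotation_y=values[13],
    )


sample = "knife 0.00 0 0.0 289.00 228.00 380.00 388.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
label = parse_kitti_line(sample)
xmin, ymin, xmax, ymax = label.bbox
print(label.name, label.bbox, "width:", xmax - xmin, "height:", ymax - ymin)
```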

### 2.3 Prepare tfrecords from kitti format dataset <a class="anchor" id="head-2-3"></a>

* Update the tfrecords spec file to take in your KITTI-format dataset
* Create the tfrecords using the `dataset_convert` subcommand
* TFRecords only need to be generated once.

In [6]:
print("TFrecords conversion spec file for training")
!cat $LOCAL_SPECS_DIR/frcnn_tfrecords_kitti_trainval.txt

TFrecords conversion spec file for training
kitti_config {
  root_directory_path: "/workspace/tao-experiments/data/train"
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 14
  num_shards: 10
}
image_directory_path: "/workspace/tao-experiments/data/train/images"


In [23]:
#KITTI trainval
!tao faster_rcnn dataset_convert --gpu_index $GPU_INDEX -d $SPECS_DIR/frcnn_tfrecords_kitti_trainval.txt \
                     -o $DATA_DOWNLOAD_DIR/train/tfrecords

2023-07-26 02:04:40,606 [INFO] root: Registry: ['nvcr.io']
2023-07-26 02:04:40,641 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/msc1/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-07-26 01:04:41.435674: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-07-26 01:04:46,071 [INFO] iva.detectnet_v2.dataio.build_converter: Instantiating a kitti converter
2023-07-26 01:04:46,084 [INFO] iva.detectnet_v2.dataio.kitti_converter_lib: Num images in
Train: 5921	Val: 963
2023-07-26 01:04:46,084 [INFO] iva.detectnet_v2.dataio.kitti_converter_lib: V
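
The train/validation counts in the log follow from `val_split: 14` applied to the 6884 labeled images. The arithmetic can be sketched as below, assuming the validation count is rounded down; that rounding is an assumption chosen because it reproduces the logged numbers, and `split_counts` is a hypothetical helper rather than the converter's actual code.

```python
def split_counts(num_images, val_split_percent):
    """Return (train, val) counts for a percentage split, rounding val down."""
    num_val = num_images * val_split_percent // 100
    return num_images - num_val, num_val


train_count, val_count = split_counts(6884, 14)
print("Train: {}\tVal: {}".format(train_count, val_count))
```

Which images land in each partition is decided by `partition_mode: "random"`; the sketch only covers the sizes.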

In [30]:
!ls -rlt $LOCAL_DATA_DIR/train/tfrecords*

-rw-r--r-- 1 root root  62079 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root  62271 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root  62305 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root  62728 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root  62018 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root  62185 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root  62152 Jul 26 02:04 /home/msc1/workspace/tao-experiments/data/train/tfrecords-fold-000-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root  621

 ### 2.4 Download pre-trained model <a class="anchor" id="head-2-4"></a>

In [8]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_cat_linux.zip
--2023-07-26 01:56:37--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 18.172.89.60, 18.172.89.76, 18.172.89.74, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|18.172.89.60|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 45656890 (44M) [application/zip]
Saving to: ‘/home/msc1/workspace/tao-experiments/ngccli/ngccli_cat_linux.zip’


2023-07-26 01:56:41 (18.5 MB/s) - ‘/home/msc1/workspace/tao-experiments/ngccli/ngccli_cat_linux.zip’ saved [45656890/45656890]

Archive:  /home/msc1/workspace/tao-experiments/ngccli/ngccli_cat_linux.zip
   creating: /home/msc1/workspace/tao-experiments/ngccli/ngc-cli/
   creating: /home/msc1/workspace/tao-experiments/ngccli/ngc-cli/boto3/
   creating: /home/msc1/workspace/tao-experiments/ngccli/ngc-cli/boto3/examples/
  inflating: /home/msc1/workspace/tao-experiments/ngccli/ngc-cli/boto3/examples/s3.rst  

In [9]:
!ngc registry model list nvidia/tao/pretrained_object_detection

[{
    "application": "Other",
    "createdDate": "2021-08-16T15:53:38.516Z",
    "description": "Pretrained weights to facilitate transfer learning using TAO Toolkit.",
    "displayName": "TAO Pretrained Object Detection",
    "framework": "Other",
    "isPublic": true,
    "labels": [
        {
            "key": "general",
            "values": [
                "yolo",
                "tao",
                "ssd",
                "retinanet",
                "dssd",
                "resnet",
                "Retail",
                "industrial",
                "cv",
                "public safety",
                "efficientnet",
                "fasterrcnn",
                "inspection",
                "smart city",
                "smart infrastructure"
            ]
        },
        {
            "key": "framework",
            "values": [
                "Other"
            ]
        },
        {
            "key": "precision",
         

In [10]:
# Download model from NGC.
!ngc registry model download-version nvidia/tao/pretrained_object_detection:resnet18

{
    "download_end": "2023-07-26 01:57:18",
    "download_start": "2023-07-26 01:57:06",
    "download_time": "11s",
    "files_downloaded": 1,
    "local_path": "/home/msc1/workspace/tao-experiments/faster_rcnn/pretrained_object_detection_vresnet18",
    "size_downloaded": "88.96 MB",
    "status": "COMPLETED"
}


In [11]:
# Copy weights to experiment directory.
!cp pretrained_object_detection_vresnet18/resnet_18.hdf5 $LOCAL_EXPERIMENT_DIR
!rm -rf pretrained_object_detection_vresnet18
!ls -rlt $LOCAL_EXPERIMENT_DIR

total 101848
drwxr-xr-x 2 msc1 msc1     4096 Jul 25 15:39 specs
-rw-r--r-- 1 msc1 msc1 10952551 Jul 26 01:42 yolo_v4.ipynb
-rw-r--r-- 1 msc1 msc1    50479 Jul 26 01:55 faster_rcnn.ipynb
-rw------- 1 msc1 msc1 93278448 Jul 26 01:57 resnet_18.hdf5


 ## 3. Provide training specification <a class="anchor" id="head-3"></a>

In [34]:
!sed -i 's/$KEY/'"$KEY/g" $LOCAL_SPECS_DIR/default_spec_resnet18.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18.txt

# Copyright (c) 2017-2020, NVIDIA CORPORATION.  All rights reserved.
random_seed: 42
enc_key: 'amgyMTMzcDc3ZDY0MHUyN3FrMWFpa2E5bHI6MGQ2ODg0YzEtYmZkOC00YWJlLTk5NjQtYmMyMDYxZTU1NjNl'
verbose: True
model_config {
input_image_config {
image_type: RGB
image_channel_order: 'bgr'
size_height_width {
height: 384
width: 1248
}
    image_channel_mean {
        key: 'b'
        value: 103.939
}
    image_channel_mean {
        key: 'g'
        value: 116.779
}
    image_channel_mean {
        key: 'r'
        value: 123.68
}
image_scaling_factor: 1.0
max_objects_num_per_image: 100
}
arch: "resnet:18"
anchor_box_config {
scale: 64.0
scale: 128.0
scale: 256.0
ratio: 1.0
ratio: 0.5
ratio: 2.0
}
freeze_bn: True
freeze_blocks: 0
freeze_blocks: 1
roi_mini_batch: 256
rpn_stride: 16
use_bias: False
roi_pooling_config {
pool_size: 7
pool_size_2x: False
}
all_projections: True
use_pooling:False
}
dataset_config {
  data_sources: {
    tfrecords_path: "/wor
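
To get a feel for `anchor_box_config` and `rpn_stride` in the spec above, the sketch below enumerates the anchor shapes and counts the anchor positions. It assumes that `ratio` means width divided by height and that each anchor keeps an area of `scale**2`; this is a common convention and may differ from what TAO does internally. It also assumes the feature map is the input size divided by the stride. The function names are hypothetical.

```python
import math


def anchor_shapes(scales, ratios):
    """Return (width, height) for each scale/ratio pair.

    Assumption: ratio = width / height, and the anchor area stays scale**2.
    """
    return [
        (scale * math.sqrt(ratio), scale / math.sqrt(ratio))
        for scale in scales
        for ratio in ratios
    ]


def feature_map_size(height, width, stride):
    """Return (rows, cols) of the feature map for a given stride."""
    return height // stride, width // stride


shapes = anchor_shapes([64.0, 128.0, 256.0], [1.0, 0.5, 2.0])
rows, cols = feature_map_size(384, 1248, 16)
total_anchors = rows * cols * len(shapes)

for width, height in shapes:
    print("{:7.1f} x {:7.1f}".format(width, height))
print("anchors per location:", len(shapes))
print("feature map:", rows, "x", cols)
print("total anchors:", total_anchors)
```

Under these assumptions, three scales and three ratios give 9 anchors per location on a 24 x 78 feature map.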

 ## 4. Run TAO training <a class="anchor" id="head-4"></a>
 * Provide the sample spec file for training.

In [38]:
!tao faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

2023-07-26 03:40:40,452 [INFO] root: Registry: ['nvcr.io']
2023-07-26 03:40:40,488 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/msc1/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-07-26 02:40:41.264083: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-07-26 02:40:46,530 [INFO] iva.faster_rcnn.spec_loader.spec_loader: Loading experiment spec at /workspace/tao-experiments/faster_rcnn/specs/default_spec_resnet18.txt.
2023-07-26 02:40:46,718 [INFO] iva.common.logging.logging: Log file already exists at /workspace/tao-experiments/faster_r







Pretrained weights loading status summary:
None: layer has no weights at all.
Yes: layer has weights and loaded successfully by name.
No: layer has weights but names not match, skipped.
Layer(Type):                                                                              Status:  
---------------------------------------------------------------------------------------------------
input_image(InputLayer)                                                                   None     
---------------------------------------------------------------------------------------------------
conv1(Conv2D)                                                                             Yes      
---------------------------------------------------------------------------------------------------
bn_conv1(BatchNormalization)                                                              Yes      
---------------------------------------------------------------------------------------------------
activati

2023-07-26 02:41:10,246 [INFO] __main__: Building validation dataset...
2023-07-26 02:41:11,854 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2023-07-26 02:41:11,855 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2023-07-26 02:41:11,855 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2023-07-26 02:41:11,855 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2023-07-26 02:41:11,855 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 20, io threads: 40, compute threads: 20, buffered batches: 4
2023-07-26 02:41:11,855 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 963, number of sources: 1, batch size per gpu: 1, steps: 963
2023-07-26 02:41:11,876 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates we





2023-07-26 02:45:20,728 [INFO] root: Starting Training Loop.
Epoch 1/22
  % delta_t_median)
147ddb6f6a97:162:225 [0] NCCL INFO Bootstrap : Using eth0:172.17.0.4<0>
147ddb6f6a97:162:225 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v6 symbol.
147ddb6f6a97:162:225 [0] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin (v5)
147ddb6f6a97:162:225 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v6 symbol.
147ddb6f6a97:162:225 [0] NCCL INFO NET/Plugin: Loaded coll plugin SHARP (v5)
147ddb6f6a97:162:225 [0] NCCL INFO cudaDriverVersion 12000
NCCL version 2.15.1+cuda11.8
147ddb6f6a97:162:225 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
147ddb6f6a97:162:225 [0] NCCL INFO P2P plugin IBext
147ddb6f6a97:162:225 [0] NCCL INFO NET/IB : No device found.
147ddb6f6a97:162:225 [0] NCCL INFO NET/IB : No device found.
147ddb6f6a97:162:225 [0] NCCL INFO NET/Socket : Using [0]eth0:172.17.0.4<0>
147ddb6f6a97:162:225 [0] NCCL INFO Using network So

Doing validation at epoch 4(1-based index)...
100%|█████████████████████████████████████████| 963/963 [00:22<00:00, 43.74it/s]
Class               AP                  precision           recall              RPN_recall          
------------------------------------------------------------------------------------------
gun                 0.4871              0.0122              0.7778              0.9128              
------------------------------------------------------------------------------------------
knife               0.4355              0.0113              0.7271              0.8965              
------------------------------------------------------------------------------------------
mAP@0.5 = 0.4613              
Validation done!
2023-07-26 03:04:12,315 [INFO] root: Training loop in progress
Epoch 5/22
Doing validation at epoch 5(1-based index)...
100%|█████████████████████████████████████████| 963/963 [00:21<00:00, 43.80it/s]
Class               AP                  precisio

gun                 0.5798              0.0353              0.7795              0.8496              
------------------------------------------------------------------------------------------
knife               0.5659              0.0276              0.8235              0.9106              
------------------------------------------------------------------------------------------
mAP@0.5 = 0.5729              
Validation done!
2023-07-26 03:34:15,322 [INFO] root: Training loop in progress
Epoch 12/22
Doing validation at epoch 12(1-based index)...
100%|█████████████████████████████████████████| 963/963 [00:22<00:00, 43.72it/s]
Class               AP                  precision           recall              RPN_recall          
------------------------------------------------------------------------------------------
gun                 0.6666              0.0203              0.8393              0.9556              
------------------------------------------------------------------------

Doing validation at epoch 19(1-based index)...
100%|█████████████████████████████████████████| 963/963 [00:21<00:00, 43.80it/s]
Class               AP                  precision           recall              RPN_recall          
------------------------------------------------------------------------------------------
gun                 0.7049              0.0346              0.8444              0.9111              
------------------------------------------------------------------------------------------
knife               0.7263              0.0362              0.8871              0.9553              
------------------------------------------------------------------------------------------
mAP@0.5 = 0.7156              
Validation done!
2023-07-26 04:08:26,844 [INFO] root: Training loop in progress
Epoch 20/22
Doing validation at epoch 20(1-based index)...
100%|█████████████████████████████████████████| 963/963 [00:21<00:00, 43.81it/s]
Class               AP                  preci
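
Each validation block in the training output above pairs an epoch number with an `mAP@0.5` value. As a minimal sketch, assuming the output has been captured into a string, the per-epoch mAP can be pulled out to help decide which checkpoint to evaluate and prune. The helper name `parse_map_per_epoch` is hypothetical and not part of TAO; it relies only on the two line patterns visible in the log, and the sample values are copied from that log.

```python
import re

def parse_map_per_epoch(log_text):
    """Return {epoch: mAP@0.5} parsed from captured training output."""
    epoch_re = re.compile(r"Doing validation at epoch (\d+)")
    map_re = re.compile(r"mAP@0\.5 = ([0-9.]+)")
    results = {}
    current_epoch = None
    for line in log_text.splitlines():
        m = epoch_re.search(line)
        if m:
            # Remember the epoch until its mAP line shows up.
            current_epoch = int(m.group(1))
            continue
        m = map_re.search(line)
        if m and current_epoch is not None:
            results[current_epoch] = float(m.group(1))
            current_epoch = None
    return results

sample_log = """\
Doing validation at epoch 4(1-based index)...
mAP@0.5 = 0.4613
Doing validation at epoch 5(1-based index)...
mAP@0.5 = 0.5729
Doing validation at epoch 19(1-based index)...
mAP@0.5 = 0.7156
"""
scores = parse_map_per_epoch(sample_log)
best_epoch = max(scores, key=scores.get)
print(best_epoch, scores[best_epoch])
```

Validation blocks whose mAP line was cut off in the captured output are skipped rather than guessed.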

In [39]:
print('Model for each epoch:')
print('---------------------')
!ls -lht $LOCAL_EXPERIMENT_DIR

Model for each epoch:
---------------------
total 2.2G
-rw-r--r-- 1 msc1 msc1 515K Jul 26 05:21 faster_rcnn.ipynb
-rw-r--r-- 1 root root  21K Jul 26 05:21 status.json
-rw-r--r-- 1 root root  97M Jul 26 05:20 frcnn_kitti_resnet18.epoch22.tlt
-rw-r--r-- 1 root root  97M Jul 26 05:16 frcnn_kitti_resnet18.epoch21.tlt
-rw-r--r-- 1 root root  97M Jul 26 05:12 frcnn_kitti_resnet18.epoch20.tlt
-rw-r--r-- 1 root root  97M Jul 26 05:08 frcnn_kitti_resnet18.epoch19.tlt
-rw-r--r-- 1 root root  97M Jul 26 05:03 frcnn_kitti_resnet18.epoch18.tlt
-rw-r--r-- 1 root root  97M Jul 26 04:59 frcnn_kitti_resnet18.epoch17.tlt
-rw-r--r-- 1 root root  97M Jul 26 04:55 frcnn_kitti_resnet18.epoch16.tlt
-rw-r--r-- 1 root root  97M Jul 26 04:50 frcnn_kitti_resnet18.epoch15.tlt
-rw-r--r-- 1 root root  97M Jul 26 04:46 frcnn_kitti_resnet18.epoch14.tlt
-rw-r--r-- 1 root root  97M Jul 26 04:42 frcnn_kitti_resnet18.epoch13.tlt
-rw-r--r-- 1 root root  97M Jul 26 04:38 frcnn_kitti_resnet18.epoch12.tlt
-rw-r

In [None]:
print("For multi-GPU data parallelism, uncomment and run this instead. Change --gpus and --gpu_index to match your machine.")
# !tao faster_rcnn train -e $SPECS_DIR/default_spec_resnet18.txt \
#                    --gpus 2 \
#                    --gpu_index 0 1

In [None]:
print("""
For multi-GPU model parallelism, please uncomment and run this instead.
Also add related parameters in training_config to enable model parallelism. E.g., 

             model_parallelism: 50
             model_parallelism: 50

""")

#!tao faster_rcnn train -e $SPECS_DIR/default_spec_resnet18.txt \
#                   --gpus 2 \
#                   --gpu_index 0 1\
#                   -np 1

In [None]:
print("To resume training from a checkpoint, uncomment and run this instead. Change or add the 'resume_from_model' field in the spec file.")
# !tao faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

In [None]:
print("For Automatic Mixed Precision (AMP) training, uncomment and run this. AMP requires a GPU with the Volta architecture or newer.")
# !tao faster_rcnn train --gpu_index $GPU_INDEX --use_amp -e $SPECS_DIR/default_spec_resnet18.txt

 ## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [None]:
!tao faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt -m /workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18.epoch12.tlt

2023-07-26 17:25:32,486 [INFO] root: Registry: ['nvcr.io']
2023-07-26 17:25:32,521 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/msc1/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-07-26 16:25:33.365993: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-07-26 16:25:38,860 [INFO] iva.faster_rcnn.spec_loader.spec_loader: Loading experiment spec at /workspace/tao-experiments/faster_rcnn/specs/default_spec_resnet18.txt.


2023-07-26 16:25:39,124 [INFO] __main__: Running evaluation with TLT as backend.














2023-07-26 16:25:44,884 

 ## 6. Prune trained models <a class="anchor" id="head-6"></a>
 * Specify pre-trained model
 * Equalization criterion
 * Threshold for pruning
 * A key to save and load the model
 * Output directory to store the model
 
Usually, you only need to adjust `-pth` (the pruning threshold) to trade accuracy against model size. A higher `pth` gives a smaller model (and thus faster inference) but lower accuracy. The right threshold depends on the dataset, so the `pth` value below is only a starting point. If the accuracy after retraining is good, increase the value to get a smaller model; otherwise, lower it to recover accuracy.
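
To judge the size side of this trade-off, one option is to compare the on-disk size of the model before and after pruning. The sketch below assumes both `.tlt` files are reachable from the notebook (for example under `$LOCAL_EXPERIMENT_DIR`). `pruning_size_ratio` is a hypothetical helper, and file size is only a rough proxy for parameter count.

```python
import os
import tempfile

def pruning_size_ratio(unpruned_path, pruned_path):
    """Return pruned size / unpruned size (lower means a smaller pruned model)."""
    unpruned = os.path.getsize(unpruned_path)
    pruned = os.path.getsize(pruned_path)
    if unpruned == 0:
        raise ValueError("unpruned model file is empty")
    return pruned / unpruned

# Demonstration with stand-in files; replace these with real .tlt paths.
with tempfile.TemporaryDirectory() as tmp:
    big = os.path.join(tmp, "unpruned.bin")
    small = os.path.join(tmp, "pruned.bin")
    with open(big, "wb") as f:
        f.write(b"\0" * 1000)
    with open(small, "wb") as f:
        f.write(b"\0" * 250)
    ratio = pruning_size_ratio(big, small)
    print(f"pruned/unpruned size ratio: {ratio:.2f}")
```

A ratio close to 1.0 suggests the threshold removed very little, so there may be room to raise `pth` if retrained accuracy holds up.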

In [None]:
!tao faster_rcnn prune --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18.epoch12.tlt \
           -o $USER_EXPERIMENT_DIR/model_1_pruned.tlt  \
           -eq union  \
           -pth 0.2 \
           -k $KEY

In [None]:
!ls -lht $LOCAL_EXPERIMENT_DIR

 ## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
 * Model needs to be re-trained to bring back accuracy after pruning
 * Specify re-training specification
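
The retraining spec carries a literal `$KEY` placeholder that has to be replaced with the key used to save and load the model. The notebook does this with `sed`; a rough Python equivalent is sketched here. The helper name `fill_key_placeholder` and the sample fragment are hypothetical and only illustrate the substitution.

```python
def fill_key_placeholder(spec_text, key, placeholder="$KEY"):
    """Replace every literal placeholder occurrence with the actual key."""
    if not key:
        raise ValueError("key must be a non-empty string")
    return spec_text.replace(placeholder, key)

# Made-up spec fragment; the field name is for illustration only.
fragment = 'some_key_field: "$KEY"\n'
filled = fill_key_placeholder(fragment, "my_secret_key")
print(filled)
```

Plain string replacement avoids the quoting pitfalls of passing `$KEY` through the shell, at the cost of reading and writing the file yourself.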

In [None]:
# The spec file has been updated to use the newly pruned model as the pretrained weights.
!sed -i 's/$KEY/'"$KEY/g" $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# Retraining using the pruned model as pretrained weights 
!tao faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# Listing the newly retrained model.
!ls -lht $LOCAL_EXPERIMENT_DIR

 ## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>

In [None]:
!tao faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt -m /workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18_retrain.epoch12.tlt

 ## 9. Visualize inferences <a class="anchor" id="head-9"></a>
 In this section, we run the inference tool to generate inferences on the trained models.

In [None]:
# Copy some test images
!mkdir -p $LOCAL_DATA_DIR/test_samples
!cp $LOCAL_DATA_DIR/testing/image_2/00000* $LOCAL_DATA_DIR/test_samples

In [None]:
# Running inference for detection on n images
# Please go to $LOCAL_EXPERIMENT_DIR/inference_results_imgs_retrain to see the visualizations.
!tao faster_rcnn inference --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt -m /workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18_retrain.epoch12.tlt

The `inference` tool produces two outputs. 
1. Images with the detections overlaid, in `$LOCAL_EXPERIMENT_DIR/inference_results_imgs_retrain`
2. Frame-by-frame bounding box labels in KITTI format, in `$LOCAL_EXPERIMENT_DIR/inference_dump_labels_retrain`
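
The KITTI-format label files can be read back for further analysis. The sketch below assumes the standard KITTI object-detection field layout, with the detection confidence appended as an optional 16th field; check a real file from the label directory before relying on that. `parse_kitti_line` is a hypothetical helper and the example line is made up.

```python
def parse_kitti_line(line):
    """Parse one KITTI-format label line into a small dict.

    Assumed layout: type, truncated, occluded, alpha,
    bbox (left, top, right, bottom), dimensions (3), location (3),
    rotation_y, and an optional score.
    """
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return {
        "class": fields[0],
        "bbox": (left, top, right, bottom),
        # The 16th field, when present, is treated as the confidence score.
        "score": float(fields[15]) if len(fields) > 15 else None,
    }

# Made-up example line, not taken from an actual inference dump.
example = "gun 0.00 0 0.00 100.0 50.0 200.0 150.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.87"
det = parse_kitti_line(example)
print(det)
```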

In [None]:
# Simple grid visualizer
!pip3 install matplotlib==3.3.3
%matplotlib inline
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path) 
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img) 

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

 ## 10. Deploy! <a class="anchor" id="head-10"></a>

In [None]:
# Generate .etlt file using tao container
!if [ -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt ]; then rm -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt; fi
!tao faster_rcnn export --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.epoch12.tlt  \
                        -o $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --target_opset 12 \
                        --gen_ds_config

Using the `tao-deploy` container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference. 

`tao-deploy` produces TensorRT engines optimized for the platform it runs on. For maximum performance, run the `tao-deploy` command, which instantiates a deploy container, with the exported `.etlt` file on your target device. The `tao-deploy` container only works on x86 systems with discrete NVIDIA GPUs. 

For Jetson devices, download the tao-converter for Jetson from the developer zone [here](https://developer.nvidia.com/tao-converter). 

If you choose to integrate your model into DeepStream directly, copy the exported `.etlt` file along with the calibration cache to the target device, then update the spec file that configures the `gst-nvinfer` element to point to the newly exported model. This file is usually called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.
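
As a rough illustration of that last step, the sketch below rewrites the `[property]` group of a `gst-nvinfer` config so that it references a newly exported model. It assumes the config uses the `tlt-encoded-model`, `tlt-model-key`, and `int8-calib-file` property names; verify these against the DeepStream documentation for your version. The helper `point_nvinfer_at_model`, the sample config, and the paths are hypothetical.

```python
import configparser
import io

def point_nvinfer_at_model(config_text, etlt_path, key, calib_path=None):
    """Return config text whose [property] group references the exported model."""
    # interpolation=None so that '%' characters in values are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section("property"):
        parser.add_section("property")
    prop = parser["property"]
    prop["tlt-encoded-model"] = etlt_path  # assumed property name
    prop["tlt-model-key"] = key            # assumed property name
    if calib_path is not None:
        prop["int8-calib-file"] = calib_path  # assumed property name
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

# Placeholder config and paths, for illustration only.
original = "[property]\ngpu-id=0\ntlt-encoded-model=old_model.etlt\n"
updated = point_nvinfer_at_model(
    original,
    etlt_path="frcnn_kitti_resnet18_retrain.etlt",
    key="<your key>",
    calib_path="cal.bin",
)
print(updated)
```

Note that `configparser` drops comments when writing, so for a heavily commented config file a manual edit may be preferable.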

In [None]:
# Convert to TensorRT engine (FP32).
!tao-deploy faster_rcnn gen_trt_engine --gpu_index $GPU_INDEX \
                        -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type fp32 \
                        --batch_size 8 \
                        --max_batch_size 4 \
                        --engine_file $USER_EXPERIMENT_DIR/trt.fp32.engine

In [None]:
# Convert to TensorRT engine (FP16).
!tao-deploy faster_rcnn gen_trt_engine --gpu_index $GPU_INDEX \
                        -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type fp16 \
                        --batch_size 8 \
                        --max_batch_size 4 \
                        --engine_file $USER_EXPERIMENT_DIR/trt.fp16.engine

In [None]:
# Convert to TensorRT engine (INT8).
!tao-deploy faster_rcnn gen_trt_engine --gpu_index $GPU_INDEX \
                        -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type int8 \
                        --batch_size 8 \
                        --max_batch_size 4 \
                        --batches 10 \
                        --cal_cache_file $USER_EXPERIMENT_DIR/cal.bin \
                        --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
                        --cal_data_file $USER_EXPERIMENT_DIR/cal.tensorfile \
                        --engine_file $USER_EXPERIMENT_DIR/trt.int8.engine

In [None]:
print('Exported model and converted TensorRT engine:')
print('------------')
!ls -lht $LOCAL_EXPERIMENT_DIR

In [None]:
# Do inference with TensorRT on the generated TensorRT engine
# Please go to $LOCAL_EXPERIMENT_DIR/images_annotated to see the visualizations.
!tao-deploy faster_rcnn inference  --gpu_index $GPU_INDEX \
                                   -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                                   -m $USER_EXPERIMENT_DIR/trt.fp32.engine \
                                   -i $DATA_DOWNLOAD_DIR/test_samples

As before, the `inference` tool produces two outputs, written to the same paths as in the first `inference` command.

In [None]:
# Visualizing the sample images from TensorRT inference.
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

In [None]:
import os
import shutil
val_split = 0.14  # Val split from the spec file
img_ext = '.png'
lab_ext = '.txt'

# Get images and labels from the partitioned validation set
images_root = f"{os.environ['LOCAL_DATA_DIR']}/training/image_2"
labels_root = f"{os.environ['LOCAL_DATA_DIR']}/training/label_2"
images_list = [os.path.splitext(imfile)[0] for imfile in sorted(os.listdir(images_root)) if imfile.endswith(img_ext)]
num_val_images = int(len(images_list) * val_split)

# Copy the data to a separate directory for evaluation
images_val = os.path.join(os.environ['LOCAL_DATA_DIR'], "training", "images_val")
labels_val = os.path.join(os.environ['LOCAL_DATA_DIR'], "training", "labels_val")
os.makedirs(images_val, exist_ok=True)
os.makedirs(labels_val, exist_ok=True)

for fname in images_list[:num_val_images]:
    shutil.copy(os.path.join(images_root, fname + img_ext), os.path.join(images_val, fname + img_ext))
    shutil.copy(os.path.join(labels_root, fname + lab_ext), os.path.join(labels_val, fname + lab_ext))

In [None]:
# Doing evaluation with the generated TensorRT engine
!tao-deploy faster_rcnn evaluate --gpu_index $GPU_INDEX \
                                 -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                                 -m $USER_EXPERIMENT_DIR/trt.int8.engine \
                                 -i $DATA_DOWNLOAD_DIR/training/images_val \
                                 -l $DATA_DOWNLOAD_DIR/training/labels_val

 ## 11. QAT workflow <a class="anchor" id="head-11"></a>

In this section, we explore the typical Quantization-Aware Training (QAT) workflow with TAO. The QAT workflow is almost the same as the non-QAT workflow, with two major differences:
1. Set `enable_qat` to `True` in the training and retraining spec files to enable QAT for training and retraining.
2. When exporting in INT8 mode, the calibration cache is extracted directly from the QAT `.tlt` model, so there is no need to specify any TensorRT INT8 calibration arguments for `export`.
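
The first difference amounts to flipping a single field in the spec text. The notebook does this with `sed`; a minimal Python sketch of the same edit follows. It assumes the field appears as `enable_qat: False` or `enable_qat: True`; the helper `set_enable_qat` and the sample fragment are hypothetical.

```python
import re

def set_enable_qat(spec_text, enabled=True):
    """Return spec text with every `enable_qat: <bool>` set to the given value."""
    value = "True" if enabled else "False"
    new_text, count = re.subn(
        r"(enable_qat:\s*)(True|False)", r"\g<1>" + value, spec_text
    )
    if count == 0:
        # Fail loudly instead of silently training without QAT.
        raise ValueError("no enable_qat field found in spec text")
    return new_text

# Made-up fragment; a real spec file contains many more fields.
fragment = "some_field: 1\nenable_qat: False\n"
qat_fragment = set_enable_qat(fragment)
print(qat_fragment)
```

Raising when the field is missing guards against a spec file that never declared `enable_qat`, a case the `sed` command would pass over without any message.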

 ### 11.1. Training <a class="anchor" id="head-10.1"></a>

In [None]:
# set enable_qat to True in training spec file to enable QAT training
!sed -i 's/enable_qat: False/enable_qat: True/' $LOCAL_SPECS_DIR/default_spec_resnet18.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18.txt

In [None]:
# run QAT training
!tao faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

 ### 11.2. Evaluation <a class="anchor" id="head-10.2"></a>

In [None]:
!tao faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt -m /workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18.epoch12.tlt

 ### 11.3. Pruning <a class="anchor" id="head-10.3"></a>

In [None]:
!tao faster_rcnn prune --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18.epoch12.tlt \
           -o $USER_EXPERIMENT_DIR/model_1_pruned.tlt  \
           -eq union  \
           -pth 0.2 \
           -k $KEY

 ### 11.4. Retraining <a class="anchor" id="head-10.4"></a>

In [None]:
# set enable_qat to True in retraining spec file to enable QAT
!sed -i 's/enable_qat: False/enable_qat: True/' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
!tao faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

 ### 11.5. Evaluation of the retrained model <a class="anchor" id="head-10.5"></a>

In [None]:
# do evaluation with .tlt model
!tao faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt -m /workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18_retrain.epoch12.tlt

 ### 11.6. Inference of the retrained model <a class="anchor" id="head-10.6"></a>

In [None]:
# do inference with .tlt model
!tao faster_rcnn inference --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt -m /workspace/tao-experiments/faster_rcnn/frcnn_kitti_resnet18_retrain.epoch12.tlt

In [None]:
# Visualizing the sample images
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

 ### 11.7. Deployment of the QAT model <a class="anchor" id="head-10.7"></a>

In [None]:
# Calibration JSON file is required for INT8 engine generation from tao-deploy
!if [ -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt ]; then rm -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt; fi
!if [ -f $LOCAL_EXPERIMENT_DIR/cal.bin ]; then rm -f $LOCAL_EXPERIMENT_DIR/cal.bin; fi
!tao faster_rcnn export --gpu_index $GPU_INDEX \
                        -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.epoch12.tlt  \
                        -o $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --cal_json_file $USER_EXPERIMENT_DIR/cal.json \
                        --target_opset 12 \
                        --gen_ds_config

In [None]:
# Convert to TensorRT engine(INT8).
# No calibration dataset is needed for INT8 export of a QAT model
!tao-deploy faster_rcnn gen_trt_engine --gpu_index $GPU_INDEX \
                        -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type int8 \
                        --cal_cache_file $USER_EXPERIMENT_DIR/cal.bin \
                        --cal_json_file $USER_EXPERIMENT_DIR/cal.json \
                        --batch_size 8 \
                        --max_batch_size 4 \
                        --engine_file $USER_EXPERIMENT_DIR/trt.int8.qat.engine

In [None]:
print('Exported model and converted TensorRT engine:')
print('------------')
!ls -lht $LOCAL_EXPERIMENT_DIR

In [None]:
# Do inference with TensorRT on the generated TensorRT engine
# Please go to $LOCAL_EXPERIMENT_DIR/images_annotated to see the visualizations.
!tao-deploy faster_rcnn inference --gpu_index $GPU_INDEX \
                                  -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                                  -m $USER_EXPERIMENT_DIR/trt.int8.qat.engine \
                                  -i $DATA_DOWNLOAD_DIR/test_samples

In [None]:
# Visualizing the sample images from TensorRT inference.
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

In [None]:
# Doing evaluation with the generated TensorRT engine
!tao-deploy faster_rcnn evaluate --gpu_index $GPU_INDEX \
                                 -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                                 -m $USER_EXPERIMENT_DIR/trt.int8.qat.engine \
                                 -i $DATA_DOWNLOAD_DIR/training/images_val \
                                 -l $DATA_DOWNLOAD_DIR/training/labels_val