# Object Detection using TLT FasterRCNN

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Transfer Learning Toolkit (TLT) is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

 ## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TLT to:

* Take a pretrained resnet18 model and train a ResNet-18 FasterRCNN model on the KITTI dataset
* Prune the trained FasterRCNN model
* Retrain the pruned model to recover lost accuracy
* Run evaluation & inference on the trained model to verify the accuracy
* Export & deploy the model in DeepStream/TensorRT
* Use the Quantization-Aware Training (QAT) workflow for the best accuracy-performance trade-off
 
 ### Table of Contents

 This notebook shows an example use case of FasterRCNN using Transfer Learning Toolkit.

 0. [Set up env variables and map drives](#head-0)
 1. [Install the TLT launcher](#head-1)
 2. [Prepare dataset and pretrained model](#head-2)<br>
     2.1 [Download the dataset](#head-2-1)<br>
     2.2 [Verify the downloaded dataset](#head-2-2)<br>
     2.3 [Prepare tfrecords from kitti format dataset](#head-2-3)<br>
     2.4 [Download pretrained model](#head-2-4)
 3. [Provide training specification](#head-3)
 4. [Run TLT training](#head-4)
 5. [Evaluate trained models](#head-5)
 6. [Prune trained models](#head-6)
 7. [Retrain pruned models](#head-7)
 8. [Evaluate retrained model](#head-8)
 9. [Visualize inferences](#head-9)
 10. [Deploy](#head-10)
 11. [QAT workflow](#head-11)<br>
     11.1 [Training](#head-11.1)<br>
     11.2 [Evaluation](#head-11.2)<br>
     11.3 [Pruning](#head-11.3)<br>
     11.4 [Retraining](#head-11.4)<br>
     11.5 [Evaluation of the retrained model](#head-11.5)<br>
     11.6 [Inference of the retrained model](#head-11.6)<br>
     11.7 [Deployment of the QAT model](#head-11.7)

 ## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>
 
This notebook requires you to set an env variable called `$LOCAL_PROJECT_DIR` to the path of your workspace. More information on how to set up the dataset and on the supported steps in the TLT workflow is provided in the subsequent cells.

In [None]:
# Setting up env variables for cleaner command line commands.
import os

print("Please replace the variables with your own.")
%env GPU_INDEX=0
%env KEY=tlt

# Please define this local project directory that needs to be mapped to the TLT docker session.
%env LOCAL_PROJECT_DIR=/path/to/your/tlt-experiments
os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "faster_rcnn"
)
%env USER_EXPERIMENT_DIR=/workspace/tlt-experiments/faster_rcnn
%env DATA_DOWNLOAD_DIR=/workspace/tlt-experiments/data
# The sample spec files are present in the same path as the downloaded samples.
# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tlt-samples/faster_rcnn
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tlt-experiments/faster_rcnn/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

The cell below maps the project directory on your local host to a workspace directory in the TLT docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

In [None]:
# Mapping up the local directories to the TLT docker.
import json
import os
mounts_file = os.path.expanduser("~/.tlt_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tlt-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    # set gpu index for tlt-converter
    "Envs": [
        {"variable": "CUDA_VISIBLE_DEVICES", "value": os.getenv("GPU_INDEX")},
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [None]:
!cat ~/.tlt_mounts.json
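
Because the notebook refers to the same files through both host paths (`$LOCAL_*`) and container paths (`$USER_EXPERIMENT_DIR`, `$SPECS_DIR`), it can help to translate between the two. The following is a minimal sketch, not part of TLT: `to_host_path` is a hypothetical helper and the host paths in `example_mounts` are made up; only the `source`/`destination` structure mirrors the mounts file written above.

```python
import posixpath


def to_host_path(container_path, mounts):
    """Translate a path inside the container to the matching host path."""
    # Try the longest destination first so that nested mounts resolve correctly.
    for mount in sorted(mounts, key=lambda m: len(m["destination"]), reverse=True):
        dest = mount["destination"].rstrip("/")
        if container_path == dest or container_path.startswith(dest + "/"):
            relative = container_path[len(dest):].lstrip("/")
            if not relative:
                return mount["source"]
            return posixpath.join(mount["source"], relative)
    raise ValueError("No mount covers {}".format(container_path))


# Hypothetical host paths; only the dictionary layout follows ~/.tlt_mounts.json.
example_mounts = [
    {"source": "/home/user/tlt-experiments",
     "destination": "/workspace/tlt-experiments"},
    {"source": "/home/user/tlt-samples/faster_rcnn/specs",
     "destination": "/workspace/tlt-experiments/faster_rcnn/specs"},
]

print(to_host_path("/workspace/tlt-experiments/data/training", example_mounts))
print(to_host_path(
    "/workspace/tlt-experiments/faster_rcnn/specs/default_spec_resnet18.txt",
    example_mounts,
))
```

Matching the longest destination first matters here because the specs mount is nested inside the project mount.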

## 1. Install the TLT launcher <a class="anchor" id="head-1"></a>
The TLT launcher is a python package distributed as a python wheel listed in the `nvidia-pyindex` python index. You may install the launcher by executing the following cell.

Please note that TLT recommends running the TLT launcher in a virtual env with python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TLT python package, please make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
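
The Python requirement can be checked from inside the notebook before installing anything. This is a small sketch under one assumption: it reads "python >=3.6.9 < 3.8.x" as "at least 3.6.9 and below 3.8.0". The function name `python_version_supported` is illustrative and not part of TLT.

```python
import sys


def python_version_supported(version=None):
    """Return True if the version lies in the assumed range 3.6.9 <= v < 3.8.0."""
    major, minor, micro = tuple(version or sys.version_info)[:3]
    return (3, 6, 9) <= (major, minor, micro) < (3, 8, 0)


print("Python {}.{}.{} supported: {}".format(
    sys.version_info[0], sys.version_info[1], sys.version_info[2],
    python_version_supported(),
))
```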

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

After setting up your virtual environment with the above requirements, install the TLT pip package by running the cell below, which installs it from the `nvidia-pyindex` python index.

In [None]:
# Skip this step if you have already installed the tlt launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tlt

In [None]:
# View the versions of the TLT launcher
!tlt info

 ## 2. Prepare dataset and pretrained model <a class="anchor" id="head-2"></a>

 We will be using the KITTI detection dataset for the tutorial. To find more details please visit
 http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d. Please download the KITTI detection images (http://www.cvlibs.net/download.php?file=data_object_image_2.zip) and labels (http://www.cvlibs.net/download.php?file=data_object_label_2.zip) to $DATA_DOWNLOAD_DIR.
 
 The data will then be extracted to have
 * training images in `$LOCAL_DATA_DIR/training/image_2`
 * training labels in `$LOCAL_DATA_DIR/training/label_2`
 * testing images in `$LOCAL_DATA_DIR/testing/image_2`
 
You may use this notebook with your own dataset as well. To do so, please follow the same directory structure as described above.

*Note: There are no labels for the testing images, therefore we use it just to visualize inferences for the trained model.*
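
If you bring your own dataset, a quick layout check can catch a misplaced folder early. The sketch below only tests for the three directories listed above; `missing_dataset_dirs` is a hypothetical helper name, and falling back to the current directory when `LOCAL_DATA_DIR` is unset is an assumption made for illustration.

```python
import os

# Subdirectories expected under $LOCAL_DATA_DIR, as listed in this section.
EXPECTED_SUBDIRS = ("training/image_2", "training/label_2", "testing/image_2")


def missing_dataset_dirs(data_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that do not exist under data_dir."""
    return [sub for sub in expected
            if not os.path.isdir(os.path.join(data_dir, sub))]


data_dir = os.environ.get("LOCAL_DATA_DIR", os.getcwd())
missing = missing_dataset_dirs(data_dir)
print("Missing directories:", missing if missing else "none")
```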

### 2.1 Download the dataset <a class="anchor" id="head-2-1"></a>

Once you have received the download links by email, please populate them in place of `KITTI_IMAGES_DOWNLOAD_URL` and `KITTI_LABELS_DOWNLOAD_URL`. The next cell will download the data and place it in `$LOCAL_DATA_DIR`.

In [None]:
import os
!mkdir -p $LOCAL_DATA_DIR
os.environ["URL_IMAGES"]=KITTI_IMAGES_DOWNLOAD_URL
!if [ ! -f $LOCAL_DATA_DIR/data_object_image_2.zip ]; then wget $URL_IMAGES -O $LOCAL_DATA_DIR/data_object_image_2.zip; else echo "image archive already downloaded"; fi 
os.environ["URL_LABELS"]=KITTI_LABELS_DOWNLOAD_URL
!if [ ! -f $LOCAL_DATA_DIR/data_object_label_2.zip ]; then wget $URL_LABELS -O $LOCAL_DATA_DIR/data_object_label_2.zip; else echo "label archive already downloaded"; fi 

### 2.2 Verify the downloaded dataset <a class="anchor" id="head-2-2"></a>

In [None]:
# Check the dataset is present
!mkdir -p $LOCAL_DATA_DIR
!if [ ! -f $LOCAL_DATA_DIR/data_object_image_2.zip ]; then echo 'Image zip file not found, please download.'; else echo 'Found Image zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/data_object_label_2.zip ]; then echo 'Label zip file not found, please download.'; else echo 'Found Labels zip file.';fi

In [None]:
# This may take a while: verify integrity of zip files 
!sha256sum $LOCAL_DATA_DIR/data_object_image_2.zip | cut -d ' ' -f 1 | grep -xq '^351c5a2aa0cd9238b50174a3a62b846bc5855da256b82a196431d60ff8d43617$' ; \
if test $? -eq 0; then echo "images OK"; else echo "images corrupt, re-download!" && rm -f $LOCAL_DATA_DIR/data_object_image_2.zip; fi 
!sha256sum $LOCAL_DATA_DIR/data_object_label_2.zip | cut -d ' ' -f 1 | grep -xq '^4efc76220d867e1c31bb980bbf8cbc02599f02a9cb4350effa98dbb04aaed880$' ; \
if test $? -eq 0; then echo "labels OK"; else echo "labels corrupt, re-download!" && rm -f $LOCAL_DATA_DIR/data_object_label_2.zip; fi 
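
The same integrity check can be written in Python, which may be convenient where `sha256sum` is unavailable. This is a sketch rather than a replacement for the cell above: `sha256_of_file` and `archive_is_intact` are illustrative names, and the expected digest is the one used in the shell command for the image archive.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected_sha256):
    """Return True if the file's digest matches the expected hex digest."""
    return sha256_of_file(path) == expected_sha256.lower()


# Digest copied from the shell-based check for the image archive.
IMAGES_SHA256 = "351c5a2aa0cd9238b50174a3a62b846bc5855da256b82a196431d60ff8d43617"
images_zip = os.path.join(os.environ.get("LOCAL_DATA_DIR", ""), "data_object_image_2.zip")
if os.path.isfile(images_zip):
    print("images OK" if archive_is_intact(images_zip, IMAGES_SHA256) else "images corrupt")
else:
    print("image archive not found, skipping check")
```

Reading in chunks keeps memory use flat even though the image archive is several gigabytes.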

In [None]:
# unpack 
!unzip -u $LOCAL_DATA_DIR/data_object_image_2.zip -d $LOCAL_DATA_DIR
!unzip -u $LOCAL_DATA_DIR/data_object_label_2.zip -d $LOCAL_DATA_DIR

In [None]:
# verify
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "training/image_2")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "training/label_2")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "testing/image_2")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))

In [None]:
# Sample kitti label.
!cat $LOCAL_DATA_DIR/training/label_2/000110.txt
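
Each line of a KITTI label file describes one object with 15 whitespace-separated fields: class name, truncation, occlusion, observation angle, 2D box (left, top, right, bottom), 3D dimensions, 3D location and rotation. The sketch below parses one such line; the `KittiObject` tuple and `parse_kitti_line` are illustrative names, and the sample line is made up for demonstration rather than taken from `000110.txt`.

```python
from collections import namedtuple

KittiObject = namedtuple(
    "KittiObject",
    ["cls", "truncated", "occluded", "alpha",
     "bbox", "dimensions", "location", "rotation_y"],
)


def parse_kitti_line(line):
    """Parse one KITTI label line into a KittiObject."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError("expected at least 15 fields, got {}".format(len(fields)))
    values = [float(v) for v in fields[1:15]]
    return KittiObject(
        cls=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),         # left, top, right, bottom in pixels
        dimensions=tuple(values[7:10]),  # height, width, length in meters
        location=tuple(values[10:13]),   # x, y, z in camera coordinates
        rotation_y=values[13],
    )


# Illustrative line, not copied from the dataset.
sample = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
obj = parse_kitti_line(sample)
print(obj.cls, obj.bbox)
```

For 2D detection only the class name and the four `bbox` values are typically needed.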

### 2.3 Prepare tfrecords from kitti format dataset <a class="anchor" id="head-2-3"></a>

* Update the tfrecords spec file to take in your KITTI format dataset
* Create the tfrecords using `dataset_convert`
* TFRecords only need to be generated once.

In [None]:
print("TFrecords conversion spec file for training")
!cat $LOCAL_SPECS_DIR/frcnn_tfrecords_kitti_trainval.txt

In [None]:
# Creating a new directory for the output tfrecords dump.
!mkdir -p $LOCAL_DATA_DIR/tfrecords/kitti_trainval
#KITTI trainval
!tlt faster_rcnn dataset_convert --gpu_index $GPU_INDEX -d $SPECS_DIR/frcnn_tfrecords_kitti_trainval.txt \
                     -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval

In [None]:
!ls -rlt $LOCAL_DATA_DIR/tfrecords/kitti_trainval

 ### 2.4 Download pre-trained model <a class="anchor" id="head-2-4"></a>

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_reg_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tlt_pretrained_object_detection*

In [None]:
# Download model from NGC.
!ngc registry model download-version nvidia/tlt_pretrained_object_detection:resnet18

In [None]:
# Copy weights to experiment directory.
!cp tlt_pretrained_object_detection_vresnet18/resnet_18.hdf5 $LOCAL_EXPERIMENT_DIR
!rm -rf tlt_pretrained_object_detection_vresnet18
!ls -rlt $LOCAL_EXPERIMENT_DIR

 ## 3. Provide training specification <a class="anchor" id="head-3"></a>

In [None]:
!sed -i 's/$KEY/'"$KEY/g" $LOCAL_SPECS_DIR/default_spec_resnet18.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18.txt

 ## 4. Run TLT training <a class="anchor" id="head-4"></a>
 * Provide the sample spec file for training.

In [None]:
!tlt faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -lht $LOCAL_EXPERIMENT_DIR

In [None]:
print("For multi-GPU, please uncomment and run this instead. Change --gpus  and --gpu_index based on your machine.")
# !tlt faster_rcnn train -e $SPECS_DIR/default_spec_resnet18.txt \
#                    --gpus 2 \
#                    --gpu_index 1 2

In [None]:
print("To resume training from a checkpoint, please uncomment and run this instead. Change/Add the 'resume_from_model' field in the spec file.")
# !tlt faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

In [None]:
print("For Automatic Mixed Precision (AMP) training, please uncomment and run this. Make sure you use a GPU with the Volta architecture or newer to enable AMP.")
# !tlt faster_rcnn train --gpu_index $GPU_INDEX --use_amp -e $SPECS_DIR/default_spec_resnet18.txt

 ## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [None]:
!tlt faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

 ## 6. Prune trained models <a class="anchor" id="head-6"></a>
 * Specify pre-trained model
 * Equalization criterion
 * Threshold for pruning
 * A key to save and load the model
 * Output directory to store the model
 
Usually, you just need to adjust `-pth` (threshold) to trade off accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The right threshold depends on the dataset. The `pth` value below is just a starting point. If the retrained accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.
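
To build intuition for the threshold, here is a toy illustration of magnitude-based filter pruning. It is not TLT's pruning implementation. It assumes that each filter's L2 norm is normalized by the largest norm in the layer and that filters at or below the threshold are dropped; the weights in `toy_filters` are invented.

```python
import math


def filters_kept(filters, threshold):
    """Return indices of filters whose normalized L2 norm exceeds threshold."""
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    largest = max(norms)
    if largest == 0:
        return []
    return [i for i, n in enumerate(norms) if n / largest > threshold]


# Four invented filters with three weights each.
toy_filters = [
    [0.9, -0.8, 0.7],
    [0.05, 0.02, -0.01],
    [0.3, 0.2, -0.1],
    [0.1, 0.1, 0.1],
]

for pth in (0.1, 0.2, 0.5):
    kept = filters_kept(toy_filters, pth)
    print("pth={}: keep {} of {} filters".format(pth, len(kept), len(toy_filters)))
```

Raising the threshold in this toy setting removes more filters, which mirrors the size-versus-accuracy trade-off described above.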

In [None]:
!tlt faster_rcnn prune --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18.epoch12.tlt \
           -o $USER_EXPERIMENT_DIR/model_1_pruned.tlt  \
           -eq union  \
           -pth 0.2 \
           -k $KEY

In [None]:
!ls -lht $LOCAL_EXPERIMENT_DIR

 ## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
 * Model needs to be re-trained to bring back accuracy after pruning
 * Specify re-training specification

In [None]:
# Here we have updated the spec file to include the newly pruned model as a pretrained weights.
!sed -i 's/$KEY/'"$KEY/g" $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# Retraining using the pruned model as pretrained weights 
!tlt faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# Listing the newly retrained model.
!ls -lht $LOCAL_EXPERIMENT_DIR

 ## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>

In [None]:
!tlt faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

 ## 9. Visualize inferences <a class="anchor" id="head-9"></a>
 In this section, we run the inference tool to generate inferences on the trained models.

In [None]:
# Running inference for detection on n images
# Please go to $LOCAL_EXPERIMENT_DIR/inference_results_imgs_retrain to see the visualizations.
!tlt faster_rcnn inference --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

The `inference` tool produces two outputs. 
1. Overlain images in `$LOCAL_EXPERIMENT_DIR/inference_results_imgs_retrain`
2. Frame by frame bbox labels in kitti format located in `$LOCAL_EXPERIMENT_DIR/inference_dump_labels_retrain`

In [None]:
# Simple grid visualizer
!pip3 install matplotlib==3.3.3
%matplotlib inline
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
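    # squeeze=False keeps axarr two-dimensional even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)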
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path) 
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img) 

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

 ## 10. Deploy! <a class="anchor" id="head-10"></a>

In [None]:
# Export in FP32 mode.
!if [ -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt ]; then rm -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt; fi
!tlt faster_rcnn export --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.epoch12.tlt  \
                        -o $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY

In [None]:
# Export in FP16 mode.
# Note that the .etlt model in FP16 mode is the same as in FP32 mode.
!if [ -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_fp16.etlt ]; then rm -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_fp16.etlt; fi
!tlt faster_rcnn export --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.epoch12.tlt  \
                        -o $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_fp16.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type fp16

In [None]:
# Export in INT8 mode(generate calibration cache file).
# Note that the .etlt model in INT8 mode is the same as in FP32 mode.
!if [ -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8.etlt ]; then rm -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8.etlt; fi
!tlt faster_rcnn export --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.epoch12.tlt  \
                        -o $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type int8 \
                        --batch_size 8 \
                        --batches 10 \
                        --cal_cache_file $USER_EXPERIMENT_DIR/cal.bin

In [None]:
# Converting to TensorRT engine(FP32) is omitted here as this is trivial.
# Convert to TensorRT engine(FP16).
# Make sure your GPU type supports the FP16 data type before running this cell.
!tlt tlt-converter -k $KEY  \
               -d 3,384,1248 \
               -o NMS \
               -e $USER_EXPERIMENT_DIR/trt.fp16.engine \
               -m 4 \
               -t fp16 \
               -i nchw \
               $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_fp16.etlt

In [None]:
# Convert to TensorRT engine(INT8).
# Make sure your GPU type supports the INT8 data type before running this cell.
!tlt tlt-converter -k $KEY  \
               -d 3,384,1248 \
               -o NMS \
               -c $USER_EXPERIMENT_DIR/cal.bin \
               -e $USER_EXPERIMENT_DIR/trt.int8.engine \
               -b 8 \
               -m 4 \
               -t int8 \
               -i nchw \
               $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8.etlt

In [None]:
print('Exported model and converted TensorRT engine:')
print('------------')
!ls -lht $LOCAL_EXPERIMENT_DIR

In [None]:
# Do inference with TensorRT on the generated TensorRT engine
# Please go to $LOCAL_EXPERIMENT_DIR/inference_results_imgs_retrain to see the visualizations.
# Here we use the INT8 engine for inference, if you want to use FP16 engine instead please
# customize the 'trt_engine' parameter in the spec file below to point to the FP16 engine.
!TRT_LINES=$(grep -n 'trt_inference' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt | cut -d: -f1) && printf '%ds/#//g\n' $(seq $TRT_LINES $((TRT_LINES+4))) | sed -i -f - $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!tlt faster_rcnn inference  --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

As with the first `inference` command, the `inference` tool produces two outputs, and they are written to exactly the same paths as before.
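
The `grep`/`printf`/`sed` one-liner used in these cells finds the line containing a marker such as `trt_inference` and strips every `#` from that line and the four lines after it; the QAT section later uses the reverse form to prefix `#` again. A Python sketch of the same logic is shown below. `toggle_block` is a hypothetical helper, and `spec_lines` uses placeholder content rather than the real spec file.

```python
def toggle_block(lines, marker, span=5, enable=True):
    """Uncomment (enable=True) or comment out (enable=False) span lines from marker."""
    start = next((i for i, line in enumerate(lines) if marker in line), None)
    if start is None:
        raise ValueError("marker {!r} not found".format(marker))
    result = list(lines)
    for i in range(start, min(start + span, len(lines))):
        if enable:
            # Same effect as sed 's/#//g': drop every '#' on the line.
            result[i] = result[i].replace("#", "")
        else:
            # Same effect as sed 's/^/#/': prefix the line with '#'.
            result[i] = "#" + result[i]
    return result


# Placeholder lines; the real spec file has different fields.
spec_lines = [
    "batch_size: 1",
    "# trt_inference {",
    "#   (hypothetical field 1)",
    "#   (hypothetical field 2)",
    "#   (hypothetical field 3)",
    "# }",
    "other_option: true",
]

enabled = toggle_block(spec_lines, "trt_inference")
print("\n".join(enabled))
```

Like the shell version, this assumes the block is exactly five lines long and that the marker's first occurrence is the one to toggle.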

In [None]:
# Visualizing the sample images from TensorRT inference.
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

In [None]:
# Doing evaluation with the generated TensorRT engine
# modify the spec file a little for tensorrt_evaluation configuration
# compare the mAP below with that of `evaluate` with retrained tlt model
!TRT_LINES=$(grep -n 'trt_evaluation' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt | cut -d: -f1) && printf '%ds/#//g\n' $(seq $TRT_LINES $((TRT_LINES+4))) | sed -i -f - $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
# do evaluation with tensorrt engine
!tlt faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

 ## 11. QAT workflow <a class="anchor" id="head-11"></a>

In this section, we will explore the typical Quantization-Aware Training (QAT) workflow with TLT. The QAT workflow is almost the same as the non-QAT workflow, except for two major differences:
1. Set `enable_qat` to `True` in the training and retraining spec files to enable QAT for training and retraining.
2. When exporting in INT8 mode, the calibration cache is extracted directly from the QAT .tlt model, so there is no need to specify any TensorRT INT8 calibration arguments for `export`.

 ### 11.1. Training <a class="anchor" id="head-11.1"></a>

In [None]:
# set enable_qat to True in training spec file to enable QAT training
!sed -i 's/enable_qat: False/enable_qat: True/' $LOCAL_SPECS_DIR/default_spec_resnet18.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18.txt

In [None]:
# run QAT training
!tlt faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

 ### 11.2. Evaluation <a class="anchor" id="head-11.2"></a>

In [None]:
!tlt faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18.txt

 ### 11.3. Pruning <a class="anchor" id="head-11.3"></a>

In [None]:
!tlt faster_rcnn prune --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18.epoch12.tlt \
           -o $USER_EXPERIMENT_DIR/model_1_pruned.tlt  \
           -eq union  \
           -pth 0.2 \
           -k $KEY

 ### 11.4. Retraining <a class="anchor" id="head-11.4"></a>

In [None]:
# set enable_qat to True in retraining spec file to enable QAT
!sed -i 's/enable_qat: False/enable_qat: True/' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!cat $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
!tlt faster_rcnn train --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

 ### 11.5. Evaluation of the retrained model <a class="anchor" id="head-11.5"></a>

In [None]:
# disable the tensorrt evaluation config in spec file
!TRT_LINES=$(grep -n 'trt_evaluation' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt | cut -d: -f1) && printf '%ds/^/#/g\n' $(seq $TRT_LINES $((TRT_LINES+4))) | sed -i -f - $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# do evaluation with .tlt model
!tlt faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

 ### 11.6. Inference of the retrained model <a class="anchor" id="head-11.6"></a>

In [None]:
# disable the tensorrt inference config in spec file
!TRT_LINES=$(grep -n 'trt_inference' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt | cut -d: -f1) && printf '%ds/^/#/g\n' $(seq $TRT_LINES $((TRT_LINES+4))) | sed -i -f - $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# do inference with .tlt model
!tlt faster_rcnn inference --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# Visualizing the sample images
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

 ### 11.7. Deployment of the QAT model <a class="anchor" id="head-11.7"></a>

In [None]:
# Export in INT8 mode(generate calibration cache file).
# No need for calibration dataset for QAT model INT8 export
!if [ -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt ]; then rm -f $LOCAL_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt; fi
!if [ -f $LOCAL_EXPERIMENT_DIR/cal.bin ]; then rm -f $LOCAL_EXPERIMENT_DIR/cal.bin; fi
!tlt faster_rcnn export --gpu_index $GPU_INDEX -m $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain.epoch12.tlt  \
                        -o $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt \
                        -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt \
                        -k $KEY \
                        --data_type int8 \
                        --cal_cache_file $USER_EXPERIMENT_DIR/cal.bin

In [None]:
# Convert to TensorRT engine(INT8).
# Make sure your GPU type supports the INT8 data type before running this cell.
!tlt tlt-converter -k $KEY  \
               -d 3,384,1248 \
               -o NMS \
               -c $USER_EXPERIMENT_DIR/cal.bin \
               -e $USER_EXPERIMENT_DIR/trt.int8.engine \
               -b 8 \
               -m 4 \
               -t int8 \
               -i nchw \
               $USER_EXPERIMENT_DIR/frcnn_kitti_resnet18_retrain_int8_qat.etlt

In [None]:
print('Exported model and converted TensorRT engine:')
print('------------')
!ls -lht $LOCAL_EXPERIMENT_DIR

In [None]:
# Do inference with TensorRT on the generated TensorRT engine
# Please go to $LOCAL_EXPERIMENT_DIR/inference_results_imgs_retrain to see the visualizations.
!TRT_LINES=$(grep -n 'trt_inference' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt | cut -d: -f1) && printf '%ds/#//g\n' $(seq $TRT_LINES $((TRT_LINES+4))) | sed -i -f - $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!tlt faster_rcnn inference --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt

In [None]:
# Visualizing the sample images from TensorRT inference.
OUTPUT_PATH = 'inference_results_imgs_retrain' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

In [None]:
# Doing evaluation with the generated TensorRT engine
# compare the mAP below with that of `evaluate` with retrained tlt model
!TRT_LINES=$(grep -n 'trt_evaluation' $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt | cut -d: -f1) && printf '%ds/#//g\n' $(seq $TRT_LINES $((TRT_LINES+4))) | sed -i -f - $LOCAL_SPECS_DIR/default_spec_resnet18_retrain_spec.txt
!tlt faster_rcnn evaluate --gpu_index $GPU_INDEX -e $SPECS_DIR/default_spec_resnet18_retrain_spec.txt