# Object Detection using TAO YOLOv4

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">


## Overview
In this notebook, I'll leverage the simplicity and convenience of TAO to:

* Take a pretrained ResNet-18 model and train a ResNet-18 YOLOv4 model on a KITTI-format dataset
* Prune the trained YOLOv4 model
* Retrain the pruned model to recover lost accuracy
* Export the pruned model
* Quantize the pruned model using QAT
* Run Inference on the trained model
* Export the pruned, quantized and retrained model to a .etlt file for deployment to DeepStream
* Run inference on the exported .etlt model to verify deployment using TensorRT

At the end of this notebook, I'll generate a trained and optimized `YOLOv4` model
which you may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps)
or [DeepStream](https://developer.nvidia.com/deepstream-sdk).

## Table of Contents

This notebook demonstrates fine-tuning a YOLOv4 object detection model using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Install the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2) <br>
     2.1 [Download the dataset](#head-2-1)<br>
     2.2 [Verify the downloaded dataset](#head-2-2)<br>
     2.3 [Generate tfrecords](#head-2-3)<br>
     2.4 [Download pretrained model](#head-2-4)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained models](#head-6)
7. [Retrain pruned models](#head-7)
8. [Evaluate retrained model](#head-8)
9. [Visualize inferences](#head-9)
10. [Model Export](#head-10)
11. [Verify deployed model](#head-11)
12. [QAT workflow](#head-12) <br>
    12.1 [QAT Training](#head-12-1) <br>
    12.2 [QAT Evaluation](#head-12-2) <br>
    12.3 [Pruning QAT model](#head-12-3)<br>
    12.4 [Retraining](#head-12-4)<br>
    12.5 [Evaluation of the retrained model](#head-12-5)<br>
    12.6 [Inference of the retrained QAT model](#head-12-6)<br>
    12.7 [Deployment of the QAT model](#head-12-7)<br>
    12.8 [Verify the deployed QAT model](#head-12-8)<br>


## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, set the `$KEY` environment variable to the key given in the model overview. Failing to do so can lead to errors when loading them as pretrained models.

This notebook requires an environment variable called `$LOCAL_PROJECT_DIR` that holds the path to the user's workspace. The dataset is expected to reside in `$LOCAL_PROJECT_DIR/data`, and the collateral generated by the TAO experiment is written to `$LOCAL_PROJECT_DIR/yolo_v4`. More information on how to set up the dataset and on the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Remove any stray artifacts or files from the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths mentioned below that may have been generated by previous experiments. Leftover checkpoint files, for example, may interfere with creating a training graph for a new experiment.*
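
One way to spot such leftovers before starting a new run is to scan the experiment directory for checkpoint-like files. The sketch below is illustrative only: the helper `find_stray_artifacts` and the suffix list are assumptions made for this example and are not part of TAO, so adjust them to whatever your earlier runs produced.

```python
import os

# Hypothetical suffixes of files left behind by earlier runs; adjust as needed.
ARTIFACT_SUFFIXES = (".hdf5", ".tlt", ".etlt")


def find_stray_artifacts(root, suffixes=ARTIFACT_SUFFIXES):
    """Return sorted paths under `root` whose names end with one of `suffixes`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(suffixes)):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Calling `find_stray_artifacts(os.environ["LOCAL_EXPERIMENT_DIR"])` once the variables are set lists candidates to review; the function only reports files and deletes nothing.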

In [1]:
# Setting up env variables for cleaner command line commands.
import os

print("Please replace the variable with your key.")
%env KEY=amgyMTMzcDc3ZDY0MHUyN3FrMWFpa2E5bHI6MGQ2ODg0YzEtYmZkOC00YWJlLTk5NjQtYmMyMDYxZTU1NjNl
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
%env NOTEBOOK_ROOT=/home/msc1/workspace/tao-experiments/yolo_v4

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/yolo_v4
%env LOCAL_PROJECT_DIR=/home/msc1/workspace/tao-experiments
os.environ["LOCAL_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data")
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "yolo_v4")

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tao-experiments/yolo_v4/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

Please replace the variable with your key.
env: KEY=amgyMTMzcDc3ZDY0MHUyN3FrMWFpa2E5bHI6MGQ2ODg0YzEtYmZkOC00YWJlLTk5NjQtYmMyMDYxZTU1NjNl
env: USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4
env: DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
env: NOTEBOOK_ROOT=/home/msc1/workspace/tao-experiments/yolo_v4
env: LOCAL_PROJECT_DIR=/home/msc1/workspace/tao-experiments
env: SPECS_DIR=/workspace/tao-experiments/yolo_v4/specs
total 48
-rw-r--r-- 1 msc1 msc1 2468 Jul 11 23:54 yolo_v4_retrain_resnet18_kitti.txt
-rw-r--r-- 1 msc1 msc1 2490 Jul 11 23:54 yolo_v4_train_resnet18_kitti.txt
-rw-r--r-- 1 msc1 msc1  312 Jul 11 23:54 yolo_v4_tfrecords_kitti_val_16bit_grayscale.txt
-rw-r--r-- 1 msc1 msc1  326 Jul 11 23:54 yolo_v4_tfrecords_kitti_train_16bit_grayscale.txt
-rw-r--r-- 1 msc1 msc1 2420 Jul 11 23:54 yolo_v4_retrain_resnet18_kitti_seq.txt
-rw-r--r-- 1 msc1 msc1 2457 Jul 11 23:54 yolo_v4_retrain_resnet18_kitti_qat.txt
-rw-r--r-- 1 msc1 msc1 2580 Jul 11 23:54 yolo_v4_retrain_resne

In [2]:
# Create local dir
!mkdir -p $LOCAL_DATA_DIR
!mkdir -p $LOCAL_EXPERIMENT_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

In [2]:
# Mapping up the local directories to the TAO docker.
import json
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [3]:
!cat ~/.tao_mounts.json

{
    "Mounts": [
        {
            "source": "/home/msc1/workspace/tao-experiments",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/msc1/workspace/tao-experiments/yolo_v4/specs",
            "destination": "/workspace/tao-experiments/yolo_v4/specs"
        }
    ]
}
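
Because of this mapping, the same file has two names: a host path (used by `$LOCAL_*` variables and plain shell commands) and a container path (used in arguments passed to `tao`). The following sketch shows how a host path corresponds to a container path for the mounts printed above. The helper `to_container_path` is hypothetical and written for this example; it is not how the TAO launcher resolves paths internally.

```python
import os


def to_container_path(host_path, mounts):
    """Translate a host path to the matching path inside the container.

    `mounts` is the list stored under "Mounts" in ~/.tao_mounts.json.
    """
    host_path = os.path.normpath(host_path)
    # Try the most specific (longest) source first.
    for mount in sorted(mounts, key=lambda m: len(m["source"]), reverse=True):
        source = os.path.normpath(mount["source"])
        if host_path == source:
            return mount["destination"]
        if host_path.startswith(source + os.sep):
            relative = os.path.relpath(host_path, source)
            return os.path.join(mount["destination"], relative)
    raise ValueError("{} is not under any mounted source".format(host_path))


mounts = [
    {"source": "/home/msc1/workspace/tao-experiments",
     "destination": "/workspace/tao-experiments"},
    {"source": "/home/msc1/workspace/tao-experiments/yolo_v4/specs",
     "destination": "/workspace/tao-experiments/yolo_v4/specs"},
]
# On a POSIX host this prints /workspace/tao-experiments/data
print(to_container_path("/home/msc1/workspace/tao-experiments/data", mounts))
```

This is why `$DATA_DOWNLOAD_DIR` and `$USER_EXPERIMENT_DIR` start with `/workspace/tao-experiments`, while `$LOCAL_DATA_DIR` and `$LOCAL_EXPERIMENT_DIR` start with the host project directory.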

## 1. Install the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a wheel on PyPI. You may install the launcher by executing the following cell.

TAO Toolkit recommends running the TAO launcher in a virtual environment with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual environment through the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
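
The Python requirement can be checked from inside the virtual environment before installing anything. The snippet below is a minimal sketch that reads the first bullet as 3.6.9 <= version < 3.8; that reading, and the helper name `python_version_supported`, are assumptions made for this example.

```python
import sys

# Bounds taken from the requirement list, read as 3.6.9 <= version < 3.8 (assumption).
MIN_VERSION = (3, 6, 9)
MAX_VERSION_EXCLUSIVE = (3, 8)


def python_version_supported(version=None):
    """Return True when `version` (major, minor, micro) lies within the bounds."""
    if version is None:
        version = sys.version_info[:3]
    return MIN_VERSION <= tuple(version) < MAX_VERSION_EXCLUSIVE
```

Calling `python_version_supported()` with no argument checks the interpreter that is running the notebook kernel.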

Once you have installed the prerequisites, log in to the docker registry nvcr.io by running the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

After setting up your virtual environment with the above requirements, install the TAO pip package.

In [12]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-tao

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com


In [9]:
# View the versions of the TAO launcher
!tao info

Configuration of the TAO Toolkit Instance
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.0.0
published_date: 07/14/2023


## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

In [5]:
# verify
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "train/images")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "train/labels")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "test/images")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))

Number of images in the train/val set. 6884
Number of labels in the train/val set. 6884
Number of images in the test set. 981
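
Equal counts do not guarantee that every image has its own label file. A slightly stricter check compares file names with the extension stripped. The sketch below assumes that an image and its label share the same stem (for example `name.jpg` and `name.txt`); the helper `unpaired_stems` is hypothetical.

```python
import os


def unpaired_stems(image_names, label_names):
    """Return (images_without_label, labels_without_image) as sorted stem lists."""
    image_stems = {os.path.splitext(name)[0] for name in image_names}
    label_stems = {os.path.splitext(name)[0] for name in label_names}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Made-up file names, for illustration only.
images = ["0_jpg.rf.abc.jpg", "1_jpg.rf.def.jpg"]
labels = ["0_jpg.rf.abc.txt"]
print(unpaired_stems(images, labels))  # prints (['1_jpg.rf.def'], [])
```

In the notebook, the two lists could come from `os.listdir(os.path.join(DATA_DIR, "train/images"))` and `os.listdir(os.path.join(DATA_DIR, "train/labels"))`; two empty lists mean every image is paired.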


In [6]:
# Sample kitti label.
!cat $LOCAL_DATA_DIR/train/labels/0_jpg.rf.c4f611bd2d74c025eba5621935579ef6.txt

knife 0.00 0 0.0 289.00 228.00 380.00 388.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0
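
Each label line holds 15 whitespace-separated fields. The following sketch parses the sample line above into named fields, following the public KITTI object label layout (class, truncation, occlusion, alpha, 2D box, 3D dimensions, 3D location, rotation). The `KittiLabel` tuple and `parse_kitti_line` helper are illustrative names, not TAO APIs.

```python
from collections import namedtuple

KittiLabel = namedtuple(
    "KittiLabel",
    "cls truncated occluded alpha left top right bottom "
    "dim_height dim_width dim_length loc_x loc_y loc_z rotation_y",
)


def parse_kitti_line(line):
    """Parse one 15-field KITTI label line into a KittiLabel."""
    fields = line.split()
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got {}".format(len(fields)))
    cls = fields[0]
    truncated = float(fields[1])
    occluded = int(fields[2])
    rest = [float(value) for value in fields[3:]]  # alpha through rotation_y
    return KittiLabel(cls, truncated, occluded, *rest)


label = parse_kitti_line(
    "knife 0.00 0 0.0 289.00 228.00 380.00 388.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
)
# Class name, box width and box height in pixels.
print(label.cls, label.right - label.left, label.bottom - label.top)  # prints knife 91.0 160.0
```

In this sample only the class name and the 2D box (`left`, `top`, `right`, `bottom`) carry information; the remaining fields are zero.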

### 2.3 Generate tfrecords <a class="anchor" id="head-2-3"></a>

The default YOLOv4 data format requires generating TFRecords. The older sequence data format (image folders and label txt folders) is still supported; if you prefer it, you can skip this section and use the spec files `yolo_v4_train_resnet18_kitti_seq.txt` and `yolo_v4_retrain_resnet18_kitti_seq.txt` instead. See the [user guide](https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/yolo_v4.html#dataset-config) for more details about TFRecords generation and sequence data format usage.

Note: for YOLOv4, we observe that the sequence format trains faster when mosaic augmentation is turned on (`mosaic_prob` > 0).

Note: we observe that the TFRecords format sometimes results in a CUDA error during evaluation. Setting `force_on_cpu` to `true` in `nms_config` can help prevent this problem.

In [16]:
!tao model yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt \
                             -o $DATA_DOWNLOAD_DIR/train/tfrecords

2023-07-21 00:45:08,050 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-07-21 00:45:08,092 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/msc1/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-07-21 00:45:08,156 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Using TensorFlow backend.
2023-07-20 23:45:08.911490: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-07-20 23:45:10,934 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
2023-07-20 23:45

In [17]:
!tao model yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_val.txt \
                             -o $DATA_DOWNLOAD_DIR/val/tfrecords

2023-07-21 00:45:22,742 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-07-21 00:45:22,789 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/msc1/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-07-21 00:45:22,867 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Using TensorFlow backend.
2023-07-20 23:45:23.591756: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-07-20 23:45:25,581 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
2023-07-20 23:45
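
The launcher message in the output above suggests adding a `"user": "UID:GID"` entry under `DockerOptions` in the mounts file so that files written by the container keep your host ownership. A minimal sketch of that change follows; the helper `with_user_option` is hypothetical, and the mount shown is the project mount from section 0.

```python
import json
import os


def with_user_option(drive_map, uid, gid):
    """Return a copy of `drive_map` with DockerOptions.user set to "UID:GID"."""
    updated = dict(drive_map)
    options = dict(updated.get("DockerOptions", {}))
    options["user"] = "{}:{}".format(uid, gid)
    updated["DockerOptions"] = options
    return updated


drive_map = {
    "Mounts": [
        {"source": "/home/msc1/workspace/tao-experiments",
         "destination": "/workspace/tao-experiments"},
    ]
}
# os.getuid() and os.getgid() give the same values as `id -u` and `id -g` (POSIX only).
print(json.dumps(with_user_option(drive_map, os.getuid(), os.getgid()), indent=4))
```

To apply it, write the returned dictionary to `~/.tao_mounts.json` with `json.dump`, in the same way the mounts file was written in section 0.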

### 2.4 Download pre-trained model <a class="anchor" id="head-2-4"></a>

We will use the NGC CLI to get the pre-trained models. For more details, go to [ngc.nvidia.com](https://ngc.nvidia.com) and click SETUP on the navigation bar.

In [20]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_cat_linux.zip
--2023-07-21 00:49:17--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 13.33.52.64, 13.33.52.55, 13.33.52.102, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|13.33.52.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43272608 (41M) [application/zip]
Saving to: ‘/home/msc1/workspace/tao-experiments/ngccli/ngccli_cat_linux.zip’


2023-07-21 00:49:18 (110 MB/s) - ‘/home/msc1/workspace/tao-experiments/ngccli/ngccli_cat_linux.zip’ saved [43272608/43272608]

Archive:  /home/msc1/workspace/tao-experiments/ngccli/ngccli_cat_linux.zip

In [21]:
!ngc registry model list nvidia/tao/pretrained_object_detection

[{
    "application": "Other",
    "createdDate": "2021-08-16T15:53:38.516Z",
    "description": "Pretrained weights to facilitate transfer learning using TAO Toolkit.",
    "displayName": "TAO Pretrained Object Detection",
    "framework": "Other",
    "isPublic": true,
    "labels": [
        {
            "key": "general",
            "values": [
                "yolo",
                "tao",
                "ssd",
                "retinanet",
                "dssd",
                "resnet",
                "Retail",
                "industrial",
                "cv",
                "public safety",
                "efficientnet",
                "fasterrcnn",
                "inspection",
                "smart city",
                "smart infrastructure"
            ]
        },
        {
            "key": "framework",
            "values": [
                "Other"
            ]
        },
        {
            "key": "precision",
         

In [22]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/

In [23]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_object_detection:resnet18 \
                    --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet18

{
    "download_end": "2023-07-21 00:49:56",
    "download_start": "2023-07-21 00:49:44",
    "download_time": "11s",
    "files_downloaded": 1,
    "local_path": "/home/msc1/workspace/tao-experiments/yolo_v4/pretrained_resnet18/pretrained_object_detection_vresnet18",
    "size_downloaded": "88.96 MB",
    "status": "COMPLETED"
}


In [24]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/pretrained_object_detection_vresnet18

Check that model is downloaded into dir.
total 91096
-rw------- 1 msc1 msc1 93278448 Jul 21 00:49 resnet_18.hdf5


## 3. Provide training specification <a class="anchor" id="head-3"></a>
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.
* Whether to use quantization aware training (QAT)

In [28]:
# Provide pretrained model path
!sed -i 's,EXPERIMENT_DIR,'"$USER_EXPERIMENT_DIR"',' $LOCAL_SPECS_DIR/yolo_v4_train_resnet18_kitti.txt

In [33]:
!cat $LOCAL_SPECS_DIR/yolo_v4_train_resnet18_kitti.txt

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 1.0
  loss_neg_obj_weights: 1.0
  loss_class_weights: 1.0
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  visualizer {
      enabled: False
      num_images: 3
  }
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight
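
The `soft_start_cosine_annealing_schedule` block in the spec can be pictured as a linear warm-up followed by a cosine decay. The following is a rough sketch of that shape using the three values from the spec (`min_learning_rate`, `max_learning_rate`, `soft_start`). The function name `soft_start_cosine_lr` is hypothetical, and the exact curve TAO uses may differ in detail.

```python
import math


def soft_start_cosine_lr(progress, min_lr=1e-7, max_lr=1e-4, soft_start=0.3):
    """Learning rate at training `progress` in [0, 1].

    Linear ramp from min_lr to max_lr over the first `soft_start` fraction
    of training, then a half-cosine decay back down to min_lr.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be within [0, 1]")
    if not 0.0 <= soft_start < 1.0:
        raise ValueError("soft_start must be within [0, 1)")
    if progress < soft_start:
        return min_lr + (max_lr - min_lr) * progress / soft_start
    # Fraction of the decay phase completed: 0 at the peak, 1 at the end.
    decay = (progress - soft_start) / (1.0 - soft_start)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * decay))
```

With `num_epochs: 80` and `soft_start: 0.3`, this shape puts the peak learning rate at about epoch 24 (`0.3 * 80`), after which the rate falls back toward `min_learning_rate`.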

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models
* WARNING: training will take several hours or one day to complete

In [30]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned

In [39]:
print("To run with multigpu, please change --gpus based on the number of available GPUs in your machine.")
!tao model yolo_v4 train -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                   -k $KEY \
                   --gpus 1

To run with multigpu, please change --gpus based on the number of available GPUs in your machine.
2023-07-21 04:07:32,930 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-07-21 04:07:32,986 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/msc1/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-07-21 04:07:33,063 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Using TensorFlow backend.
2023-07-21 03:07:33.840775: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-07-21 03:07:35,906 [TAO Toolkit] [INFO]

INFO: Serial augmentation enabled = False
INFO: Pseudo sharding enabled = False
INFO: Max Image Dimensions (all sources): (0, 0)
INFO: number of cpus: 20, io threads: 40, compute threads: 20, buffered batches: -1
INFO: total dataset size 1969, number of sources: 1, batch size per gpu: 8, steps: 247
INFO: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
INFO: shuffle: False - shard 0 of 1
INFO: sampling 1 datasets with weights:
INFO: source: 0 weight: 1.000000


INFO: Log file already exists at /workspace/tao-experiments/yolo_v4/experiment_dir_unpruned/status.json
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Input (InputLayer)              (None, 3, None, None 0                                            
______________________________



INFO: Starting Training Loop.
Epoch 1/80


INFO: Training loop in progress
Epoch 2/80
INFO: Training loop in progress
Epoch 3/80
INFO: Training loop in progress
Epoch 4/80
INFO: Training loop in progress
Epoch 5/80
INFO: Training loop in progress
Epoch 6/80
INFO: Training loop in progress
Epoch 7/80
INFO: Training loop in progress
Epoch 8/80
INFO: Training loop in progress
Epoch 9/80
INFO: Training loop in progress
Epoch 10/80
Producing predictions: 100%|██████████████████| 247/247 [00:46<00:00,  5.27it/s]
Start to calculate AP for each class
*******************************
gun           AP    0.3155
knife         AP    0.26015
              mAP   0.28782
*******************************
Validation loss: 121.67760516950476
INFO: Evaluation metrics generated.

Epoch 00010: saving model to /workspace/tao-experiments/yolo_v4/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_010.hdf5
INFO: Training loop in progress
Epoch 11/80
INFO: Training loop in progress
Epoch 12/80
INFO: Training

INFO: Training loop in progress
Epoch 48/80
INFO: Training loop in progress
Epoch 49/80
INFO: Training loop in progress
Epoch 50/80
Producing predictions: 100%|██████████████████| 247/247 [00:34<00:00,  7.07it/s]
Start to calculate AP for each class
*******************************
gun           AP    0.77849
knife         AP    0.88239
              mAP   0.83044
*******************************
Validation loss: 11.673840368807557
INFO: Evaluation metrics generated.

Epoch 00050: saving model to /workspace/tao-experiments/yolo_v4/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_050.hdf5
INFO: Training loop in progress
Epoch 51/80
INFO: Training loop in progress
Epoch 52/80
INFO: Training loop in progress
Epoch 53/80
INFO: Training loop in progress
Epoch 54/80
INFO: Training loop in progress
Epoch 55/80
INFO: Training loop in progress
Epoch 56/80
INFO: Training loop in progress
Epoch 57/80
INFO: Training loop in progress
Epoch 58/80
INFO: Training loop in progress
Epoch 59/80
INFO: 
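
Two numbers in the log above can be cross-checked by hand. The step count is consistent with the dataset size divided by the per-GPU batch size, rounded up, and the reported mAP is consistent with the unweighted mean of the per-class APs. The snippet below reproduces both; it assumes a single GPU and that a final partial batch counts as one step.

```python
import math

# Values copied from the training log above.
dataset_size = 1969
batch_size_per_gpu = 8
ap_epoch_50 = {"gun": 0.77849, "knife": 0.88239}

# A final partial batch still counts as one step (assumption).
steps = math.ceil(dataset_size / batch_size_per_gpu)
# Unweighted mean over classes.
mean_ap = sum(ap_epoch_50.values()) / len(ap_epoch_50)

print(steps)              # prints 247
print(round(mean_ap, 5))  # prints 0.83044
```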

In [None]:
print("To resume from checkpoint, please change pretrain_model_path to resume_model_path in config file.")

In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/weights

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/yolov4_training_log_resnet18.csv
%set_env EPOCH=080
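
Instead of reading the CSV by eye, the best epoch can be picked programmatically. The sketch below assumes the log has columns named `epoch` and `mAP`, which is an assumption: check the header of your own CSV and pass the actual column names. The helper `best_epoch` is hypothetical, and the sample rows reuse the two validation results shown in the training output.

```python
import csv
import io
import math


def best_epoch(csv_text, metric="mAP", epoch_column="epoch"):
    """Return (epoch, value) for the row with the highest `metric`, or None.

    Rows where the metric is missing or not a number are skipped.
    """
    best = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            value = float(row.get(metric, ""))
            epoch = int(float(row[epoch_column]))
        except (TypeError, ValueError, KeyError):
            continue
        if math.isnan(value):
            continue
        if best is None or value > best[1]:
            best = (epoch, value)
    return best


# Sample rows built from the validation results in the training output.
sample = "epoch,mAP\n10,0.28782\n50,0.83044\n"
epoch, value = best_epoch(sample)
print("EPOCH={:03d}".format(epoch))  # prints EPOCH=050
```

The zero-padded three-digit form matches the checkpoint names written during training (for example `yolov4_resnet18_epoch_010.hdf5`), so the result can be assigned to `os.environ["EPOCH"]` in place of the hard-coded value.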

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [None]:
!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

## 6. Prune trained models <a class="anchor" id="head-6"></a>
* Specify pre-trained model
* Equalization criterion (only applies to ResNets and MobileNets, as they have element-wise operations)
* Threshold for pruning.
* A key to save and load the model
* Output directory to store the model

Usually, you only need to adjust `-pth` (threshold) to trade off accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The right threshold depends on the dataset and the model; `0.1` in the block below is just a starting point. If the retrained accuracy is good, you can increase this value to get a smaller model. Otherwise, lower it to get better accuracy.
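To build intuition for why a higher threshold yields a smaller model, here is a toy sketch of magnitude-based filter pruning. It assumes a filter is dropped when its L2 norm, relative to the largest norm in the layer, falls below the threshold. This is an illustration only and not TAO's actual pruning code; `keep_mask` and the sample weights are hypothetical.

```python
import math

def keep_mask(filters, threshold):
    """Mark which filters survive pruning at `threshold`.

    Each filter is a flat list of weights. A filter is kept when its L2 norm,
    divided by the largest norm in the layer, is at least `threshold`.
    Illustrative only; this is not TAO's actual pruning code.
    """
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    largest = max(norms)
    if largest == 0.0:
        return [False] * len(filters)  # all-zero layer: nothing worth keeping
    return [n / largest >= threshold for n in norms]

# Hypothetical layer with four filters of decreasing magnitude.
layer = [[3.0, 4.0], [0.9, 1.2], [0.24, 0.32], [0.03, 0.04]]
for pth in (0.05, 0.1, 0.5):
    kept = sum(keep_mask(layer, pth))
    print("pth=%.2f keeps %d of %d filters" % (pth, kept, len(layer)))
```

Raising the threshold removes progressively more low-magnitude filters, which is why retraining is needed afterwards to recover accuracy.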

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned

In [None]:
!tao yolo_v4 prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                   -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                   -o $USER_EXPERIMENT_DIR/experiment_dir_pruned/yolov4_resnet18_pruned.tlt \
                   -eq intersection \
                   -pth 0.1 \
                   -k $KEY

In [None]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned/

## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
* Model needs to be re-trained to bring back accuracy after pruning
* Specify re-training specification
* WARNING: training will take several hours or one day to complete

In [None]:
# Printing the retrain spec file. 
# Here we have updated the spec file to use the newly pruned model as the pretrained weights.
!sed -i 's,EXPERIMENT_DIR,'"$USER_EXPERIMENT_DIR"',' $LOCAL_SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt
!cat $LOCAL_SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain

In [None]:
# Retraining using the pruned model as pretrained weights 
!tao yolo_v4 train --gpus 1 \
                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
                   -k $KEY

In [None]:
# Listing the newly retrained model.
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/weights

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/yolov4_training_log_resnet18.csv
%set_env EPOCH=080

## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>

In [None]:
!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

## 9. Visualize inferences <a class="anchor" id="head-9"></a>
In this section, we run the `infer` tool to generate inferences on the trained models and visualize the results.

In [None]:
# Copy some test images
!mkdir -p $LOCAL_DATA_DIR/test_samples
!cp $LOCAL_DATA_DIR/testing/image_2/00000* $LOCAL_DATA_DIR/test_samples/

In [None]:
# Running inference for detection on n images
!tao yolo_v4 inference -i $DATA_DOWNLOAD_DIR/test_samples \
                       -o $USER_EXPERIMENT_DIR/yolo_infer_images \
                       -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                       -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                       -l $USER_EXPERIMENT_DIR/yolo_infer_labels \
                       -k $KEY

The `inference` tool produces two outputs:
1. Images with the predicted bounding boxes overlaid, in `$LOCAL_EXPERIMENT_DIR/yolo_infer_images`
2. Frame-by-frame bounding-box labels in KITTI format, in `$LOCAL_EXPERIMENT_DIR/yolo_infer_labels`
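The label files can be post-processed with a few lines of Python. The sketch below parses a single KITTI-format line; it assumes the standard 15-field KITTI layout and treats an optional 16th field as the confidence score. The `Detection` tuple, the `parse_kitti_line` helper and the sample line are hypothetical.

```python
from collections import namedtuple

Detection = namedtuple("Detection", "label left top right bottom score")

def parse_kitti_line(line):
    """Parse one KITTI label line into a Detection.

    KITTI lines hold 15 space-separated fields: class name, truncation,
    occlusion, alpha, the 2D box (left, top, right, bottom), 3D dimensions,
    3D location and rotation_y. A 16th field, when present, is treated as
    the confidence score.
    """
    fields = line.split()
    if len(fields) < 15:
        raise ValueError("expected at least 15 fields, got %d" % len(fields))
    left, top, right, bottom = (float(v) for v in fields[4:8])
    score = float(fields[15]) if len(fields) > 15 else None
    return Detection(fields[0], left, top, right, bottom, score)

# Hypothetical label line, for illustration only.
sample = ("knife 0.00 0 0.00 100.50 80.00 220.25 190.75 "
          "0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.912")
det = parse_kitti_line(sample)
print(det.label, det.right - det.left, det.score)
```

Applying this to every line of every file in the labels directory gives a simple way to count detections per class or filter them by score.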

In [None]:
# Simple grid visualizer
!pip3 install matplotlib==3.3.3
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    """Show up to num_images images from image_dir (relative to $LOCAL_EXPERIMENT_DIR) in a grid."""
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Sort the paths so the grid order is the same on every run.
    a = sorted(os.path.join(output_path, image) for image in os.listdir(output_path)
               if os.path.splitext(image)[1].lower() in valid_image_ext)
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'yolo_infer_images' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 10. Model Export <a class="anchor" id="head-10"></a>

If you trained a non-QAT model, you may export in FP32, FP16 or INT8 mode using the code block below. For INT8, you need to provide a calibration image directory.

In [None]:
# tao <task> export will fail if .etlt already exists. So we clear the export folder before tao <task> export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
# Generate .etlt file using tao container
!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                    -k $KEY \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config

Using the `tao-deploy` container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference.

`tao-deploy` produces TensorRT engines optimized for the platform it runs on. Therefore, to get maximum performance, run the `tao-deploy` command, which instantiates a deploy container, with the exported `.etlt` file on your target device. The `tao-deploy` container only works on x86 machines with discrete NVIDIA GPUs.

For Jetson devices, please download the tao-converter for Jetson and refer to [the tao-converter installation instructions](https://docs.nvidia.com/tao/tao-toolkit/text/tensorrt.html#installing-the-tao-converter) for more details.

If you choose to integrate your model into DeepStream directly, you may do so by copying the exported `.etlt` file, along with the calibration cache, to the target device and updating the spec file that configures the `gst-nvinfer` element to point to the newly exported model. This file is usually called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.

In [None]:
# Convert to TensorRT engine (FP32). 
!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --data_type fp32 \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine

In [None]:
# Convert to TensorRT engine (FP16). 
!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --data_type fp16 \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.fp16

`Note:` In this example, for ease of execution, we restrict the number of calibration batches to 10. TAO Toolkit recommends using at least 10% of the training dataset for INT8 calibration.
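With `--batch_size 16`, 10 batches cover only 160 images. As a rough guide to what the 10% recommendation implies, the sketch below computes the number of batches needed for a given dataset size. The helper `calibration_batches` and the figure of 5000 training images are hypothetical; substitute your own image count.

```python
def calibration_batches(num_train_images, batch_size, percent=10):
    """Batches needed to cover `percent` of the training set (rounded up)."""
    images_needed = -(-num_train_images * percent // 100)  # ceiling division
    return -(-images_needed // batch_size)

# 5000 is a hypothetical training-set size; substitute your own image count.
print(calibration_batches(5000, batch_size=16))
```

The result is the value you would pass to `--batches`, provided the calibration image directory contains at least that many images.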

In [None]:
# To export in INT8 mode (generate calibration cache file). 
!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                                   --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
                                   --data_type int8 \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --batches 10 \
                                   --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin  \
                                   --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.int8

In [None]:
print('Exported model:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export

## 11. Verify the deployed model <a class="anchor" id="head-11"></a>
Verify the converted engine by visualizing TensorRT inferences.


In [None]:
# Infer using TensorRT engine
!tao-deploy yolo_v4 inference -m $USER_EXPERIMENT_DIR/export/trt.engine \
                              -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                              -i $DATA_DOWNLOAD_DIR/test_samples \
                              -r $USER_EXPERIMENT_DIR/yolo_infer_images \
                              -t 0.6

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'yolo_infer_images/images_annotated' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 12. QAT workflow <a class="anchor" id="head-12"></a>
In this section, we will explore the typical Quantization-Aware Training (QAT) workflow with TAO. The QAT workflow is almost the same as the non-QAT workflow, with two major differences:
1. Set `enable_qat` to `True` in the training and retraining spec files to enable QAT for training and retraining.
2. When exporting in INT8 mode, `tao <task> export` extracts the calibration JSON file that stores the scales used during QAT. The `.etlt` file and the calibration JSON file are then used to generate the engine file through `tao-deploy`.
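To make the notion of a quantization scale concrete, the following sketch shows a generic symmetric quantize-dequantize step, the kind of operation QAT simulates during training so that the network learns to tolerate INT8 rounding. It is a textbook illustration under that assumption, not TAO's implementation; `fake_quantize` and the sample values are hypothetical.

```python
def fake_quantize(x, scale, num_bits=8):
    """Quantize `x` to a signed integer grid and map it back to float.

    This mimics the rounding and clipping error a symmetric INT8 deployment
    introduces; `scale` is the float value represented by one integer step.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    q = max(-qmax, min(qmax, round(x / scale)))
    return q * scale

# Hypothetical tensor range of [-127, 127], so one integer step equals 1.0.
scale = 127.0 / 127
for value in (3.4, -0.2, 200.0):
    print(value, "->", fake_quantize(value, scale))
```

Values inside the range are rounded to the nearest step, while values outside it are clipped, which is why a well-chosen scale per tensor matters for INT8 accuracy.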

 ### 12.1. QAT Training <a class="anchor" id="head-12-1"></a>

In [None]:
# To enable QAT training on sample spec file, we need to set `enable_qat` to `True` in training spec files
!sed -i "s/enable_qat: false/enable_qat: true/g" $LOCAL_SPECS_DIR/yolo_v4_train_resnet18_kitti_qat.txt
!sed -i 's,EXPERIMENT_DIR,'"$USER_EXPERIMENT_DIR"',' $LOCAL_SPECS_DIR/yolo_v4_train_resnet18_kitti_qat.txt
!cat $LOCAL_SPECS_DIR/yolo_v4_train_resnet18_kitti_qat.txt

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned_QAT

In [None]:
print("To run with multigpu, please change --gpus based on the number of available GPUs in your machine.")
!tao yolo_v4 train -e $SPECS_DIR/yolo_v4_train_resnet18_kitti_qat.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned_QAT \
                   -k $KEY \
                   --gpus 1

In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned_QAT/weights

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned_QAT/yolov4_training_log_resnet18.csv
%set_env EPOCH=080

 ### 12.2. QAT Evaluation <a class="anchor" id="head-12-2"></a>

In [None]:
!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_train_resnet18_kitti_qat.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned_QAT/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

 ### 12.3. Pruning QAT model <a class="anchor" id="head-12-3"></a>

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned_QAT

In [None]:
!tao yolo_v4 prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned_QAT/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                   -e $SPECS_DIR/yolo_v4_train_resnet18_kitti_qat.txt \
                   -o $USER_EXPERIMENT_DIR/experiment_dir_pruned_QAT/yolov4_resnet18_qat_pruned.tlt \
                   -eq intersection \
                   -pth 0.1 \
                   -k $KEY

In [None]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned_QAT/

 ### 12.4. Retraining <a class="anchor" id="head-12-4"></a>

In [None]:
!sed -i "s/enable_qat: false/enable_qat: true/g" $LOCAL_SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt
!sed -i 's,EXPERIMENT_DIR,'"$USER_EXPERIMENT_DIR"',' $LOCAL_SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt
!cat $LOCAL_SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain_qat

In [None]:
# Retraining using the pruned model as pretrained weights 
!tao yolo_v4 train --gpus 1 \
                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_retrain_qat \
                   -k $KEY

In [None]:
# Listing the newly retrained model.
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain_qat/weights

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain_qat/yolov4_training_log_resnet18.csv
%set_env EPOCH=080

 ### 12.5. Evaluation of the retrained model <a class="anchor" id="head-12-5"></a>

In [None]:
!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_retrain_qat/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

 ### 12.6. Inference of the retrained QAT model <a class="anchor" id="head-12-6"></a>

In [None]:
!tao yolo_v4 inference -i $DATA_DOWNLOAD_DIR/test_samples \
                       -o $USER_EXPERIMENT_DIR/yolo_infer_images_qat \
                       -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt \
                       -m $USER_EXPERIMENT_DIR/experiment_dir_retrain_qat/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                       -l $USER_EXPERIMENT_DIR/yolo_infer_labels_qat \
                       -k $KEY

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'yolo_infer_images_qat' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

 ### 12.7. Deployment of the QAT model <a class="anchor" id="head-12-7"></a>

 #### Generate .etlt file using tao container
If you trained a QAT model, you may only export in INT8 mode, using the following code block. This generates an `.etlt` file and the corresponding calibration JSON file that stores the scales used during QAT. You can use the `.etlt` file together with the calibration JSON file to generate an INT8 engine through `tao-deploy`, or use the `.etlt` file alone in DeepStream for FP32 or FP16 mode. Please note, however, that deploying a QAT model in FP32 or FP16 gives sub-optimal results. If you want to deploy in FP32 or FP16, you should disable QAT in training.

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/export_qat
!tao yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain_qat/weights/yolov4_resnet18_epoch_$EPOCH.tlt  \
                    -o $USER_EXPERIMENT_DIR/export_qat/yolov4_resnet18_epoch_$EPOCH.etlt \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt \
                    -k $KEY \
                    --cal_json_file $USER_EXPERIMENT_DIR/export_qat/cal.json \
                    --target_opset 12 \
                    --gen_ds_config

In [None]:
print('Exported model:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export_qat

 #### Generate a TensorRT engine using tao-deploy

In [None]:
!tao-deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export_qat/yolov4_resnet18_epoch_$EPOCH.etlt \
                                   -k $KEY \
                                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt \
                                   --data_type int8 \
                                   --batch_size 8 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --cal_json_file $USER_EXPERIMENT_DIR/export_qat/cal.json \
                                   --engine_file $USER_EXPERIMENT_DIR/export_qat/trt.engine.int8

In [None]:
print('Exported engine:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export_qat/

 ### 12.8. Verify the deployed QAT model <a class="anchor" id="head-12-8"></a>

In [None]:
!tao-deploy yolo_v4 inference -m $USER_EXPERIMENT_DIR/export_qat/trt.engine.int8 \
                              -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti_qat.txt \
                              -i $DATA_DOWNLOAD_DIR/test_samples \
                              -r $USER_EXPERIMENT_DIR/yolo_infer_images_qat_trt \
                              -t 0.6

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'yolo_infer_images_qat_trt/images_annotated' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)