# Object Detection using TAO YOLOv4 Tiny

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is retrained for use on a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with your own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080">


## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train a YOLO v4 Tiny model on the KITTI dataset
* Prune the trained YOLO v4 Tiny model
* Retrain the pruned model to recover lost accuracy
* Export the pruned model
* Quantize the pruned model using QAT
* Run Inference on the trained model
* Export the pruned, quantized and retrained model to a .etlt file for deployment to DeepStream
* Run inference on the exported .etlt model to verify deployment using TensorRT

## Table of Contents

This notebook shows an example use case of YOLO v4 Tiny object detection using Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Install the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2) <br>
     2.1 [Download the dataset](#head-2-1)<br>
     2.2 [Verify the downloaded dataset](#head-2-2)<br>
     2.3 [Generate tfrecords](#head-2-3)<br>
     2.4 [Download pretrained model](#head-2-4)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained models](#head-6)
7. [Retrain pruned models](#head-7)
8. [Evaluate retrained model](#head-8)
9. [Visualize inferences](#head-9)
10. [Model Export](#head-10)
11. [Verify deployed model](#head-11)


## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, set the `$KEY` environment variable to the key given in the model overview. Failing to do so can lead to errors when loading them as pretrained models.

This notebook requires you to set an environment variable called `$LOCAL_PROJECT_DIR` to the path of your workspace. The dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collateral generated by the TAO experiments is written to `$LOCAL_PROJECT_DIR/yolo_v4_tiny`. More information on how to set up the dataset and on the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Remove any stray artifacts or files left over from previous experiments from the `$USER_EXPERIMENT_DIR` and `$DATA_DOWNLOAD_DIR` paths set below. Leftover checkpoint files, for example, may interfere with creating the training graph for a new experiment.*
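
A quick way to spot leftovers before starting a new run is to list candidate files first and delete them by hand. The sketch below is illustrative only: the function name `find_stale_artifacts` and the suffix list are assumptions, not part of TAO, and the suffixes should be adjusted to whatever your earlier experiments produced.

```python
import os

# Hypothetical set of suffixes treated as leftovers from earlier runs.
STALE_SUFFIXES = (".hdf5", ".tlt", ".ckpt")


def find_stale_artifacts(root, suffixes=STALE_SUFFIXES):
    """Return the sorted paths under `root` whose names end with one of `suffixes`."""
    suffixes = tuple(suffixes)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `find_stale_artifacts(os.environ["LOCAL_EXPERIMENT_DIR"])` only reports files and never deletes anything. Review the list before removing files, since a pretrained `.hdf5` model stored under the same directory would also match.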

In [1]:
# Setting up env variables for cleaner command line commands.
import os

print("Please replace the variable with your key.")
# The key comes from the official NGC website.
%env KEY=YzcwZDAydXZjZWY1bXNrZGdsa2hmZm91a2U6Yzk3Y2NkNzctOGZjZS00ZDcwLTljMTgtNjE2NjcyZDg2YTgw
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4_tiny
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/yolo_v4_tiny

# Define the local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/yolo_v4_tiny
#%env LOCAL_PROJECT_DIR=YOUR_LOCAL_PROJECT_DIR_PATH
%env LOCAL_PROJECT_DIR=/home/hirain/cv_samples_v1.4.0/
os.environ["LOCAL_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data")
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "yolo_v4_tiny")

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tao-experiments/yolo_v4_tiny/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

Please replace the variable with your key.
env: KEY=YzcwZDAydXZjZWY1bXNrZGdsa2hmZm91a2U6Yzk3Y2NkNzctOGZjZS00ZDcwLTljMTgtNjE2NjcyZDg2YTgw
env: USER_EXPERIMENT_DIR=/workspace/tao-experiments/yolo_v4_tiny
env: DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
env: LOCAL_PROJECT_DIR=/home/hirain/cv_samples_v1.4.0/
env: SPECS_DIR=/workspace/tao-experiments/yolo_v4_tiny/specs
total 28
-rw-rw-r-- 1 hirain hirain 2316 Jun  3  2022 yolo_v4_tiny_train_kitti_seq.txt
-rw-rw-r-- 1 hirain hirain  296 Jun  3  2022 yolo_v4_tiny_tfrecords_kitti_val.txt
-rw-rw-r-- 1 hirain hirain  310 Jun  3  2022 yolo_v4_tiny_tfrecords_kitti_train.txt
-rw-rw-r-- 1 hirain hirain 2306 Jun  3  2022 yolo_v4_tiny_retrain_kitti_seq.txt
-rw-rw-r-- 1 hirain hirain 2354 Apr  5 15:58 yolo_v4_tiny_train_kitti.txt.bak
-rw-rw-r-- 1 hirain hirain 2343 Apr  5 16:16 yolo_v4_tiny_retrain_kitti.txt
-rw-rw-r-- 1 hirain hirain 2109 Apr  5 16:32 yolo_v4_tiny_train_kitti.txt


In [2]:
# Create local dir
!mkdir -p $LOCAL_DATA_DIR
!mkdir -p $LOCAL_EXPERIMENT_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

In [3]:
# Mapping up the local directories to the TAO docker.
import json
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [4]:
!cat ~/.tao_mounts.json

{
    "Mounts": [
        {
            "source": "/home/hirain/cv_samples_v1.4.0/",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/hirain/cv_samples_v1.4.0/yolo_v4_tiny/specs",
            "destination": "/workspace/tao-experiments/yolo_v4_tiny/specs"
        }
    ]
}
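
To make the effect of the mounts file concrete, the sketch below translates a host path into the path the same file would have inside the container. It is only an illustration of the mapping shown above, not the launcher's own logic; the helper name `to_container_path` and the longest-prefix rule are assumptions.

```python
import posixpath


def to_container_path(host_path, mounts):
    """Map `host_path` to its container path using the longest matching mount source."""
    host_path = posixpath.normpath(host_path)
    best = None  # (source, destination) of the longest matching mount so far
    for mount in mounts:
        source = posixpath.normpath(mount["source"])
        if host_path == source or host_path.startswith(source + "/"):
            if best is None or len(source) > len(best[0]):
                best = (source, posixpath.normpath(mount["destination"]))
    if best is None:
        raise ValueError("no mount covers {!r}".format(host_path))
    relative = posixpath.relpath(host_path, best[0])
    return best[1] if relative == "." else posixpath.join(best[1], relative)
```

With the two mounts written above, `/home/hirain/cv_samples_v1.4.0/data` would map to `/workspace/tao-experiments/data`, which is the value assigned to `DATA_DOWNLOAD_DIR` earlier.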

## 1. Install the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a Python wheel listed on PyPI. You may install the launcher by executing the following cell.

TAO Toolkit recommends running the TAO launcher in a virtual environment with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual environment with the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
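
The Python requirement can be checked from inside the virtual environment before installing anything. The sketch below is a minimal example; the default bounds are one reading of the text above (3.6.9 as the minimum, and `python3.x` with x up to 8) and are an assumption, so adjust `below` if your TAO release documents a different range.

```python
import sys


def python_version_supported(version, minimum=(3, 6, 9), below=(3, 9, 0)):
    """Return True if minimum <= version < below, comparing (major, minor, micro)."""
    return tuple(minimum) <= tuple(version[:3]) < tuple(below)


# Check the interpreter that is running this code.
current_ok = python_version_supported(sys.version_info)
```

`current_ok` is `True` only when the running interpreter falls inside the assumed range.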

Once you have installed the prerequisites, log in to the docker registry nvcr.io by running the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

After setting up your virtual environment with the above requirements, install the TAO pip package.

In [14]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install --upgrade nvidia-pyindex
!pip3 install --upgrade nvidia-tao

Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting nvidia-pyindex
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/92/59/813c065a5434541c1fff59b3a25f2767e5d176b470ef3a9b025f21edc71d/nvidia-pyindex-1.0.9.tar.gz (10 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: nvidia-pyindex
  Building wheel for nvidia-pyindex (setup.py) ... done
  Created wheel for nvidia-pyindex: filename=nvidia_pyindex-1.0.9-py3-none-any.whl size=8416 sha256=5d4a4bfdcb495de23e0022c90ef69289ff0d0c68b3ecdc53cc16737a20cb5bad
  Stored in directory: /home/hirain/.cache/pip/wheels/17/f0/ea/0fd424b6ce8e07884c96b129a621dca8319cf094984cfad86b
Successfully built nvidia-pyindex
Installing collected packages: nvidia-pyindex
Successfully installed nvidia-pyindex-1.0.9
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple, https://pypi.ngc.nvidia.com


In [5]:
# View the versions of the TAO launcher
!tao info --verbose

Configuration of the TAO Toolkit Instance

dockers: 		
	nvidia/tao/tao-toolkit: 			
		4.0.0-tf2.9.1: 				
			docker_registry: nvcr.io
			tasks: 
				1. classification_tf2
				2. efficientdet_tf2
		4.0.0-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. augment
				2. bpnet
				3. classification_tf1
				4. detectnet_v2
				5. dssd
				6. emotionnet
				7. efficientdet_tf1
				8. faster_rcnn
				9. fpenet
				10. gazenet
				11. gesturenet
				12. heartratenet
				13. lprnet
				14. mask_rcnn
				15. multitask_classification
				16. retinanet
				17. ssd
				18. unet
				19. yolo_v3
				20. yolo_v4
				21. yolo_v4_tiny
				22. converter
		4.0.1-tf1.15.5: 				
			docker_registry: nvcr.io
			tasks: 
				1. mask_rcnn
				2. unet
		4.0.0-pyt: 				
			docker_registry: nvcr.io
			tasks: 
				1. action_recognition
				2. deformable_detr
				3. segformer
				4. re_identification
				5. pointpillars
				6. pose_classification
				7. n_gra

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

 We will be using the KITTI detection dataset for the tutorial. To find more details please visit
 http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d. Please download the KITTI detection images (http://www.cvlibs.net/download.php?file=data_object_image_2.zip) and labels (http://www.cvlibs.net/download.php?file=data_object_label_2.zip) to $DATA_DOWNLOAD_DIR.
 
 The data will then be extracted to have
 * training images in `$LOCAL_DATA_DIR/training/image_2`
 * training labels in `$LOCAL_DATA_DIR/training/label_2`
 * testing images in `$LOCAL_DATA_DIR/testing/image_2`
 
You may use this notebook with your own dataset as well. To use this example with your own dataset, follow the same directory structure as described above.

*Note: There are no labels for the testing images, therefore we use it just to visualize inferences for the trained model.*

### 2.1 Download the dataset <a class="anchor" id="head-2-1"></a>

Once you have received the download links by email, substitute them for `KITTI_IMAGES_DOWNLOAD_URL` and `KITTI_LABELS_DOWNLOAD_URL`. The next cell downloads the data and places it in `$LOCAL_DATA_DIR`.

In [None]:
import os
!mkdir -p $LOCAL_DATA_DIR
os.environ["URL_IMAGES"]=KITTI_IMAGES_DOWNLOAD_URL
!if [ ! -f $LOCAL_DATA_DIR/data_object_image_2.zip ]; then wget $URL_IMAGES -O $LOCAL_DATA_DIR/data_object_image_2.zip; else echo "image archive already downloaded"; fi 
os.environ["URL_LABELS"]=KITTI_LABELS_DOWNLOAD_URL
!if [ ! -f $LOCAL_DATA_DIR/data_object_label_2.zip ]; then wget $URL_LABELS -O $LOCAL_DATA_DIR/data_object_label_2.zip; else echo "label archive already downloaded"; fi 

### 2.2 Verify the downloaded dataset <a class="anchor" id="head-2-2"></a>

In [None]:
# Check the dataset is present
!mkdir -p $LOCAL_DATA_DIR
!if [ ! -f $LOCAL_DATA_DIR/data_object_image_2.zip ]; then echo 'Image zip file not found, please download.'; else echo 'Found Image zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/data_object_label_2.zip ]; then echo 'Label zip file not found, please download.'; else echo 'Found Labels zip file.';fi

In [None]:
# This may take a while: verify integrity of zip files 
!sha256sum $LOCAL_DATA_DIR/data_object_image_2.zip | cut -d ' ' -f 1 | grep -xq '^351c5a2aa0cd9238b50174a3a62b846bc5855da256b82a196431d60ff8d43617$' ; \
if test $? -eq 0; then echo "images OK"; else echo "images corrupt, re-download!" && rm -f $LOCAL_DATA_DIR/data_object_image_2.zip; fi 
!sha256sum $LOCAL_DATA_DIR/data_object_label_2.zip | cut -d ' ' -f 1 | grep -xq '^4efc76220d867e1c31bb980bbf8cbc02599f02a9cb4350effa98dbb04aaed880$' ; \
if test $? -eq 0; then echo "labels OK"; else echo "labels corrupt, re-download!" && rm -f $LOCAL_DATA_DIR/data_object_label_2.zip; fi 
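
The same integrity check can be done from Python, which is convenient on systems without `sha256sum`. The sketch below streams the file in chunks so the large image archive is never loaded into memory at once; the helper names are illustrative, and the expected digest is whatever you pass in (for example, the values used in the shell commands above).

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's SHA-256 digest equals `expected_hex`."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```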

In [None]:
# unpack 
!unzip -u $LOCAL_DATA_DIR/data_object_image_2.zip -d $LOCAL_DATA_DIR
!unzip -u $LOCAL_DATA_DIR/data_object_label_2.zip -d $LOCAL_DATA_DIR

In [None]:
# verify
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "training/image_2")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "training/label_2")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "testing/image_2")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))
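
Matching counts do not guarantee that every image has a label. A stricter check, sketched below with an illustrative helper name, compares file names without their extensions and reports anything that lacks a counterpart.

```python
import os


def unpaired_samples(image_dir, label_dir):
    """Return (images without labels, labels without images) as sorted name stems."""
    image_stems = {os.path.splitext(name)[0] for name in os.listdir(image_dir)}
    label_stems = {os.path.splitext(name)[0] for name in os.listdir(label_dir)}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

For a consistent training set, both returned lists are empty when the function is called on `training/image_2` and `training/label_2`.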

In [None]:
# Sample kitti label.
!cat $LOCAL_DATA_DIR/training/label_2/000110.txt
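
Each line of a KITTI label file describes one object with 15 space-separated fields: the class name, truncation, occlusion, observation angle, the 2D bounding box, the 3D dimensions, the 3D location, and the rotation around the Y axis. The sketch below parses one such line into a named tuple; the type and function names are illustrative, and only the class name and bounding box typically matter for 2D detection.

```python
from typing import NamedTuple, Tuple


class KittiObject(NamedTuple):
    label: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # left, top, right, bottom in pixels
    dimensions: Tuple[float, ...]  # height, width, length in metres
    location: Tuple[float, ...]    # x, y, z in camera coordinates
    rotation_y: float


def parse_kitti_line(line):
    """Parse one KITTI label line; extra trailing fields (e.g. a score) are ignored."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError("expected at least 15 fields, got {}".format(len(fields)))
    values = [float(v) for v in fields[1:15]]
    return KittiObject(
        label=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation_y=values[13],
    )
```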

In [None]:
# Generate val dataset out of training dataset
!python3 ../ssd/generate_val_dataset.py --input_image_dir=$LOCAL_DATA_DIR/training/image_2 \
                                        --input_label_dir=$LOCAL_DATA_DIR/training/label_2 \
                                        --output_dir=$LOCAL_DATA_DIR/val

Additionally, if your own dataset already resides on a volume (or in a folder), you can mount the volume at `LOCAL_DATA_DIR` (or create a soft link), for example:
```bash
# if your dataset is in /dev/sdc1
mount /dev/sdc1 $LOCAL_DATA_DIR

# if your dataset is in folder /var/dataset
ln -sf /var/dataset $LOCAL_DATA_DIR
```

In [None]:
# If you use your own dataset, you will need to run the code below to generate the best anchor shape

# !tao yolo_v4_tiny kmeans -l $DATA_DOWNLOAD_DIR/training/label_2 \
#                          -i $DATA_DOWNLOAD_DIR/training/image_2 \
#                          -n 6 \
#                          -x 1248 \
#                          -y 384

# The anchor shapes generated by this script are sorted. With 6 anchors, write the first 3 into
# mid_anchor_shape in the config file and the last 3 into big_anchor_shape.
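
The sample training spec printed later in this notebook defines only `mid_anchor_shape` and `big_anchor_shape`, so six anchors split into two groups of three. The sketch below formats such a split in the string style used by the spec. It assumes the list is already in the order printed by `kmeans` and that the first three belong to the mid group; the helper name is illustrative.

```python
def split_anchor_shapes(anchors):
    """Split six (width, height) anchors into spec-style mid and big anchor strings."""
    if len(anchors) != 6:
        raise ValueError("expected 6 anchors, got {}".format(len(anchors)))

    def fmt(group):
        return "[" + ", ".join("({:.2f}, {:.2f})".format(w, h) for w, h in group) + "]"

    return {
        "mid_anchor_shape": fmt(anchors[:3]),
        "big_anchor_shape": fmt(anchors[3:]),
    }
```

The two returned strings can be pasted into the `yolov4_config` block of the training and retraining spec files.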

### 2.3 Generate tfrecords <a class="anchor" id="head-2-3"></a>

The default YOLOv4 Tiny data format requires TFRecords to be generated. The older sequence data format (image folders plus label txt folders) is still supported; if you prefer it, you can skip this section and use the spec files `yolo_v4_tiny_train_kitti_seq.txt` and `yolo_v4_tiny_retrain_kitti_seq.txt` instead. See the user guide for more details on TFRecords generation and on using the sequence data format.

Note: For YOLOv4 Tiny, we observe that the sequence format trains faster when mosaic augmentation is turned on (`mosaic_prob` > 0).

Note: We observe that the TFRecords format sometimes results in a CUDA error during evaluation. Setting `force_on_cpu` in `nms_config` to `true` can help prevent this problem.

In [6]:
#!tao yolo_v4_tiny dataset_convert -d $SPECS_DIR/yolo_v4_tiny_tfrecords_kitti_train.txt \
#    -o $DATA_DOWNLOAD_DIR/training/tfrecords/train

!tao yolo_v4_tiny dataset_convert -d /workspace/tao-experiments/data/spec-train.txt \
                                  -o /workspace/tao-experiments/data/training/tfrecords/train --gpu_index 0

2023-04-05 17:53:56,987 [INFO] root: Registry: ['nvcr.io']
2023-04-05 17:53:57,015 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-05 09:53:58.472609: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-05 09:54:03,164 [INFO] iva.detectnet_v2.dataio.build_converter: Instantiating a coco converter
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2023-04-05 09:54:03,165 [INFO] iva.detectnet_v2.dataio.coco_converter_lib: Writing partition 0, shard 0


In [7]:
#!tao yolo_v4_tiny dataset_convert -d $SPECS_DIR/yolo_v4_tiny_tfrecords_kitti_val.txt \
#                            -o $DATA_DOWNLOAD_DIR/val/tfrecords/val
!tao yolo_v4_tiny dataset_convert -d /workspace/tao-experiments/data/spec-val.txt \
                                  -o /workspace/tao-experiments/data/val/tfrecords/val --gpu_index 0

2023-04-05 17:54:31,213 [INFO] root: Registry: ['nvcr.io']
2023-04-05 17:54:31,241 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-05 09:54:32.765717: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-05 09:54:37,439 [INFO] iva.detectnet_v2.dataio.build_converter: Instantiating a coco converter
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2023-04-05 09:54:37,440 [INFO] iva.detectnet_v2.dataio.coco_converter_lib: Writing partition 0, shard 0


### 2.4 Download pre-trained model <a class="anchor" id="head-2-4"></a>

We will use the NGC CLI to get the pretrained models. For more details, go to [ngc.nvidia.com](https://ngc.nvidia.com) and click SETUP in the navigation bar.

In [5]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_cat_linux.zip
--2023-04-05 11:31:24--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 13.35.121.102, 13.35.121.39, 13.35.121.8, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|13.35.121.102|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 42733033 (41M) [application/zip]
Saving to: ‘/home/hirain/cv_samples_v1.4.0//ngccli/ngccli_cat_linux.zip’


2023-04-05 11:31:36 (3.76 MB/s) - ‘/home/hirain/cv_samples_v1.4.0//ngccli/ngccli_cat_linux.zip’ saved [42733033/42733033]

Archive:  /home/hirain/cv_samples_v1.4.0//ngccli/ngccli_cat_linux.zip

In [8]:
#!ngc registry model list nvstaging/tao/pretrained_object_detection:*
!ngc registry model list nvidia/tao/pretrained_object_detection:*

+------------+----------+--------+------------+-----------+------------------+-----------+-----------------+--------------+
| Version    | Accuracy | Epochs | Batch Size | GPU Model | Memory Footprint | File Size | Status          | Created Date |
+------------+----------+--------+------------+-----------+------------------+-----------+-----------------+--------------+
| vgg19      | 77.56    | 80     | 1          | V100      | 153.7            | 153.72 MB | UPLOAD_COMPLETE | Aug 18, 2021 |
| vgg16      | 77.17    | 80     | 1          | V100      | 113.2            | 113.16 MB | UPLOAD_COMPLETE | Aug 18, 2021 |
| squeezenet | 65.13    | 80     | 1          | V100      | 6.5              | 6.46

In [3]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_cspdarknet_tiny

In [5]:
# Pull pretrained model from NGC
#!ngc registry model download-version nvstaging/tao/pretrained_object_detection:cspdarknet_tiny \
#                  --dest $LOCAL_EXPERIMENT_DIR/pretrained_cspdarknet_tiny

!ngc registry model download-version nvidia/tao/pretrained_object_detection:cspdarknet_tiny \
                  --dest $LOCAL_EXPERIMENT_DIR/pretrained_cspdarknet_tiny

Downloaded 26.38 MB in 22m 23s, Download speed: 20.1 KB/s                
--------------------------------------------------------------------------------
   Transfer id: pretrained_object_detection_vcspdarknet_tiny
   Download status: Completed
   Downloaded local path: /home/hirain/cv_samples_v1.4.0/yolo_v4_tiny/pretrained_cspdarknet_tiny/pretrained_object_detection_vcspdarknet_tiny
   Total files downloaded: 1
   Total downloaded size: 26.38 MB
   Started at: 2023-04-05 15:53:05.316463
   Completed at: 2023-04-05 16:15:28.712430
   Duration taken: 22m 23s
--------------------------------------------------------------------------------


In [6]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_cspdarknet_tiny/pretrained_object_detection_vcspdarknet_tiny

Check that model is downloaded into dir.
total 29256
-rw------- 1 hirain hirain 29955696 Apr  5 16:15 cspdarknet_tiny.hdf5


## 3. Provide training specification <a class="anchor" id="head-3"></a>
The training specification file defines:
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.
* Whether to use quantization aware training (QAT)

In [7]:
# Provide pretrained model path
!sed -i 's,EXPERIMENT_DIR,'"$USER_EXPERIMENT_DIR"',' $LOCAL_SPECS_DIR/yolo_v4_tiny_train_kitti.txt

# The following lines enable QAT in the sample spec files. Comment them out to train without QAT.
!sed -i "s/enable_qat: false/enable_qat: true/g" $LOCAL_SPECS_DIR/yolo_v4_tiny_train_kitti.txt
!sed -i "s/enable_qat: false/enable_qat: true/g" $LOCAL_SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt

In [8]:
# The sample spec files ship with QAT disabled. To switch back to non-QAT training, uncomment and run the lines below.
# !sed -i "s/enable_qat: true/enable_qat: false/g" $LOCAL_SPECS_DIR/yolo_v4_tiny_train_kitti.txt
# !sed -i "s/enable_qat: true/enable_qat: false/g" $LOCAL_SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt
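
For readers who prefer to stay in Python, the `sed` substitutions above can be approximated with a regular expression. This is a minimal sketch and not part of the TAO tooling: `set_enable_qat` is a hypothetical helper, and it assumes the spec text contains an entry of the form `enable_qat: true` or `enable_qat: false`.

```python
import re


def set_enable_qat(spec_text: str, enabled: bool) -> str:
    """Return spec_text with every `enable_qat: <bool>` entry set to `enabled`."""
    value = "true" if enabled else "false"
    # Keep the key and its spacing (group 1) and replace only the boolean value.
    return re.sub(r"(enable_qat:\s*)(true|false)", rf"\g<1>{value}", spec_text)


# Hypothetical spec fragment used only for illustration.
sample_spec = "training_config {\n  num_epochs: 80\n  enable_qat: false\n}\n"
qat_spec = set_enable_qat(sample_spec, True)
print(qat_spec)
```

To apply it to a real spec, read the file, pass its contents through the helper, and write the result back. Unlike `sed -i`, this leaves the file untouched until you write it explicitly.
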

In [9]:
!cat $LOCAL_SPECS_DIR/yolo_v4_tiny_train_kitti.txt

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(260.69, 172.35), (125.91, 81.47), (72.27, 42.42)]"
  mid_anchor_shape: "[(30.80, 71.40), (38.97, 26.86), (18.88, 17.11)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "cspdarknet_tiny"
  loss_loc_weight: 1.0
  loss_neg_obj_weights: 1.0
  loss_class_weights: 1.0
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.05
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  visualizer {
      enabled: False
      num_images: 3
  }
  batch_size_per_gpu: 1
  num_epochs: 80
  enable_qat: true
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  }
 

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models
* WARNING: training time depends on the dataset size and hardware and can range up to a day; the run logged below took about one hour

In [10]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned

In [10]:
print("To run with multigpu, please change --gpus based on the number of available GPUs in your machine.")
!tao yolo_v4_tiny train -e $SPECS_DIR/yolo_v4_tiny_train_kitti.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                   -k $KEY \
                   --gpus 1

To run with multigpu, please change --gpus based on the number of available GPUs in your machine.
2023-04-05 17:56:35,838 [INFO] root: Registry: ['nvcr.io']
2023-04-05 17:56:35,868 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-05 09:56:37.261785: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.


INFO: Log file already exists at /workspace/tao-experiments/yolo_v4_tiny/experiment_dir_unpruned/status.json
INFO: Starting Yolo_V4 Training job

INFO: Serial augmentation e

INFO: shuffle: True - shard 0 of 1
INFO: sampling 1 datasets with weights:
INFO: source: 0 weight: 1.000000

INFO: Serial augmentation enabled = False
INFO: Pseudo sharding enabled = False
INFO: Max Image Dimensions (all sources): (0, 0)
INFO: number of cpus: 20, io threads: 40, compute threads: 20, buffered batches: -1
INFO: total dataset size 29, number of sources: 1, batch size per gpu: 1, steps: 29
INFO: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
INFO: shuffle: False - shard 0 of 1
INFO: sampling 1 datasets with weights:
INFO: source: 0 weight: 1.000000
INFO: Log file already exists at /workspace/tao-experiments/yolo_v4_tiny/experiment_dir_unpruned/status.json
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Input (InputLayer)              (None

INFO: Starting Training Loop.
Epoch 1/80
ac55251a1b7d:168:231 [0] NCCL INFO Bootstrap : Using eth0:172.17.0.4<0>
ac55251a1b7d:168:231 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v6 symbol.
ac55251a1b7d:168:231 [0] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin (v5)
ac55251a1b7d:168:231 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v6 symbol.
ac55251a1b7d:168:231 [0] NCCL INFO NET/Plugin: Loaded coll plugin SHARP (v5)
ac55251a1b7d:168:231 [0] NCCL INFO cudaDriverVersion 11010
NCCL version 2.15.1+cuda11.8
ac55251a1b7d:168:231 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
ac55251a1b7d:168:231 [0] NCCL INFO P2P plugin IBext
ac55251a1b7d:168:231 [0] NCCL INFO NET/IB : No device found.
ac55251a1b7d:168:231 [0] NCCL INFO NET/IB : No device found.
ac55251a1b7d:168:231 [0] NCCL INFO NET/Socket : Using [0]eth0:172.17.0.4<0>
ac55251a1b7d:168:231 [0] NCCL INFO Using network Socket
ac55251a1b7d:168:231 [0] NCCL INFO Channel 00/32 :

Producing predictions: 100%|████████████████████| 29/29 [00:02<00:00, 14.47it/s]
Start to calculate AP for each class
*******************************
airpods       AP    0.89935
              mAP   0.89935
*******************************
Validation loss: 5.141962643327384
INFO: Evaluation metrics generated.

Epoch 00080: saving model to /workspace/tao-experiments/yolo_v4_tiny/experiment_dir_unpruned/weights/yolov4_cspdarknet_tiny_epoch_080.tlt
INFO: Training loop in progress
INFO: Training loop complete.
INFO: YOLO_V4 training finished successfully.
INFO: Training finished successfully.
ac55251a1b7d:168:231 [0] NCCL INFO comm 0x7f49c84e1030 rank 0 nranks 1 cudaDev 0 busId 1000 - Destroy COMPLETE
Telemetry data couldn't be sent, but the command ran successfully.
Execution status: PASS
2023-04-05 18:59:38,753 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.


In [None]:
print("To resume from checkpoint, please change pretrain_model_path to resume_model_path in config file.")

In [11]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/weights

Model for each epoch:
---------------------
total 541M
-rw-r--r-- 1 root root 68M Apr  5 18:06 yolov4_cspdarknet_tiny_epoch_010.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:14 yolov4_cspdarknet_tiny_epoch_020.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:21 yolov4_cspdarknet_tiny_epoch_030.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:29 yolov4_cspdarknet_tiny_epoch_040.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:36 yolov4_cspdarknet_tiny_epoch_050.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:44 yolov4_cspdarknet_tiny_epoch_060.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:52 yolov4_cspdarknet_tiny_epoch_070.tlt
-rw-r--r-- 1 root root 68M Apr  5 18:59 yolov4_cspdarknet_tiny_epoch_080.tlt


In [12]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/yolov4_training_log_cspdarknet_tiny.csv
%set_env EPOCH=080

epoch,AP_airpods,loss,lr,mAP,validation_loss
1,nan,4168.8843,4.2624997e-06,nan,nan
2,nan,4580.679,8.425e-06,nan,nan
3,nan,4200.3926,1.2587499e-05,nan,nan
4,nan,3805.0322,1.6749998e-05,nan,nan
5,nan,2240.1448,2.09125e-05,nan,nan
6,nan,2604.8767,2.5074998e-05,nan,nan
7,nan,1396.5005,2.9237499e-05,nan,nan
8,nan,1295.835,3.3399996e-05,nan,nan
9,nan,1034.1371,3.75625e-05,nan,nan
10,0.13049690587914267,556.0489,4.1724998e-05,0.13049690587914267,670.206227269666
11,nan,400.9969,4.5887497e-05,nan,nan
12,nan,446.4676,5.0049995e-05,nan,nan
13,nan,368.70074,5.4212498e-05,nan,nan
14,nan,305.8533,5.8374997e-05,nan,nan
15,nan,294.18298,6.2537496e-05,nan,nan
16,nan,193.53893,6.67e-05,nan,nan
17,nan,222.64996,7.0862494e-05,nan,nan
18,nan,171.73407,7.5025e-05,nan,nan
19,nan,145.27596,7.918749e-05,nan,nan
20,0.36893622232138,147.1913,8.3349994e-05,0.36893622232138,112.79889389564251
21,nan,144.01213,8.75125e-05,nan,nan
22,nan,124.603775,9.167499e-05,nan,nan
23,nan,113.79615,9.5837495e-05,nan,nan
24,nan,
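
The `lr` column in the log above reflects the `soft_start_cosine_annealing_schedule` block of the training spec. The sketch below is one plausible reading of that schedule, not TAO's source code. The linear warm-up is consistent with the logged values for the first epochs. The cosine decay after `soft_start` is an assumption, because the log shown here is cut off before the warm-up ends. The function name `soft_start_cosine_lr` is hypothetical.

```python
import math


def soft_start_cosine_lr(progress, min_lr=1e-7, max_lr=1e-4, soft_start=0.3):
    """Learning rate at a training progress in [0, 1] (fraction of total training)."""
    if progress < soft_start:
        # Linear warm-up from min_lr to max_lr over the first `soft_start` fraction.
        return min_lr + (max_lr - min_lr) * progress / soft_start
    # Assumed cosine decay from max_lr back to min_lr over the remaining progress.
    decay = (progress - soft_start) / (1.0 - soft_start)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * decay))


num_epochs = 80
for epoch in (1, 10, 20):
    # Compare with the `lr` column of the CSV log for the same epochs.
    print(epoch, soft_start_cosine_lr(epoch / num_epochs))
```

With `soft_start: 0.3` and 80 epochs, the warm-up spans 24 epochs, so the rate grows by (1e-4 - 1e-7) / 24 per epoch, which matches the step between consecutive rows of the log.
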

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [None]:
!tao yolo_v4_tiny evaluate -e $SPECS_DIR/yolo_v4_tiny_train_kitti.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
                      -k $KEY

## 6. Prune trained models <a class="anchor" id="head-6"></a>
* Specify pre-trained model
* Equalization criterion (applies only to ResNets and MobileNets, which have element-wise operations)
* Threshold for pruning.
* A key to save and load the model
* Output directory to store the model

Usually, you only need to adjust `-pth` (threshold) to trade accuracy against model size. A higher `pth` gives a smaller model (and thus faster inference) but lower accuracy. The best threshold depends on the dataset and the model; the `0.1` used in the block below is just a starting point. If the retrained accuracy is good, you can increase this value to get a smaller model. Otherwise, lower it to recover accuracy.
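
To build intuition for how the threshold drives model size, here is a simplified sketch of magnitude-based filter pruning. It is an illustration under stated assumptions, not TAO's implementation: it assumes each filter is scored by a norm, scores are normalized by the largest norm in the layer, and filters scoring below `pth` are dropped. The helper `filters_to_keep` and the norm values are hypothetical, and the sketch ignores equalization across connected layers.

```python
def filters_to_keep(filter_norms, pth):
    """Indices of filters whose max-normalized norm is at least pth."""
    largest = max(filter_norms)
    return [i for i, norm in enumerate(filter_norms) if norm / largest >= pth]


# Hypothetical norms for the eight filters of one convolution layer.
norms = [0.02, 0.9, 0.35, 0.05, 1.2, 0.7, 0.08, 0.5]
for pth in (0.1, 0.5):
    kept = filters_to_keep(norms, pth)
    print(f"pth={pth}: keep {len(kept)} of {len(norms)} filters -> {kept}")
```

Raising `pth` removes more filters, which is why a higher threshold shrinks the model and usually costs accuracy that the retraining step then tries to recover.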

In [13]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned

In [14]:
#-pth (threshold). If the retrain accuracy is good, you can increase this value to get smaller models.
!tao yolo_v4_tiny prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
                   -e $SPECS_DIR/yolo_v4_tiny_train_kitti.txt \
                   -o $USER_EXPERIMENT_DIR/experiment_dir_pruned/yolov4_cspdarknet_tiny_pruned.tlt \
                   -eq intersection \
                   -pth 0.1 \
                   -k $KEY

2023-04-06 10:24:04,756 [INFO] root: Registry: ['nvcr.io']
2023-04-06 10:24:05,205 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-06 02:24:22.053270: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.

2023-04-06 02:25:09,029 [INFO] modulus.pruning.pruning: Exploring graph for retainable indices
2023-04-06 02:25:10,768 [INFO] modulus.pruning.pruning: Pruning model and appending pruned nodes to new graph


2023-04-06 02:25:54,169 [INFO] __main__: Pruning ratio (pruned model / original model): 1.0
Telemetry data couldn't be sent, but the command ran successfully.
Execution status: PASS
2023-04-06 10:26:04,317 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.


In [15]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned/

total 23268
-rw-r--r-- 1 root root 23826312 Apr  6 10:25 yolov4_cspdarknet_tiny_pruned.tlt


## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
* Model needs to be re-trained to bring back accuracy after pruning
* Specify re-training specification
* WARNING: training time depends on the dataset size and hardware and can range up to a day; the run logged below took about one hour

In [16]:
# Print the retrain spec file.
# The spec file has been updated to use the newly pruned model as the pretrained weights.
!sed -i 's,EXPERIMENT_DIR,'"$USER_EXPERIMENT_DIR"',' $LOCAL_SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt
!cat $LOCAL_SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(260.69, 172.35), (125.91, 81.47), (72.27, 42.42)]"
  mid_anchor_shape: "[(30.80, 71.40), (38.97, 26.86), (18.88, 17.11)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "cspdarknet_tiny"
  loss_loc_weight: 1.0
  loss_neg_obj_weights: 1.0
  loss_class_weights: 1.0
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.05
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  visualizer {
      enabled: False
      num_images: 3
  }
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: true
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: NO_REG
    weight: 3e-9
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      amsgrad: false
    }
  

In [17]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain

In [18]:
# Retraining using the pruned model as pretrained weights 
!tao yolo_v4_tiny train --gpus 1 \
                   -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
                   -r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
                   -k $KEY

2023-04-06 10:49:35,523 [INFO] root: Registry: ['nvcr.io']
2023-04-06 10:49:35,552 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-06 02:49:37.018687: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.


INFO: Starting Yolo_V4 Training job

INFO: Serial augmentation enabled = False
INFO: Pseudo sharding enabled = False
INFO: Max Image Dimensions (all sources): (0, 0)
INFO: number of cpus: 20, io threads: 40, compute threads: 20, buffered batches: -1
INFO: total dataset s

INFO: Serial augmentation enabled = False
INFO: Pseudo sharding enabled = False
INFO: Max Image Dimensions (all sources): (0, 0)
INFO: number of cpus: 20, io threads: 40, compute threads: 20, buffered batches: -1
INFO: total dataset size 29, number of sources: 1, batch size per gpu: 1, steps: 29
INFO: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
INFO: shuffle: False - shard 0 of 1
INFO: sampling 1 datasets with weights:
INFO: source: 0 weight: 1.000000


INFO: Log file already exists at /workspace/tao-experiments/yolo_v4_tiny/experiment_dir_retrain/status.json
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Input (InputLayer)              (1, 3, None, None)   0                                            
___________________________________________________



INFO: Starting Training Loop.
Epoch 1/80
641fb48b811f:168:231 [0] NCCL INFO Bootstrap : Using eth0:172.17.0.4<0>
641fb48b811f:168:231 [0] NCCL INFO NET/Plugin: Failed to find ncclNetPlugin_v6 symbol.
641fb48b811f:168:231 [0] NCCL INFO NET/Plugin: Loaded net plugin NCCL RDMA Plugin (v5)
641fb48b811f:168:231 [0] NCCL INFO NET/Plugin: Failed to find ncclCollNetPlugin_v6 symbol.
641fb48b811f:168:231 [0] NCCL INFO NET/Plugin: Loaded coll plugin SHARP (v5)
641fb48b811f:168:231 [0] NCCL INFO cudaDriverVersion 11010
NCCL version 2.15.1+cuda11.8
641fb48b811f:168:231 [0] NCCL INFO Plugin Path : /opt/hpcx/nccl_rdma_sharp_plugin/lib/libnccl-net.so
641fb48b811f:168:231 [0] NCCL INFO P2P plugin IBext
641fb48b811f:168:231 [0] NCCL INFO NET/IB : No device found.
641fb48b811f:168:231 [0] NCCL INFO NET/IB : No device found.
641fb48b811f:168:231 [0] NCCL INFO NET/Socket : Using [0]eth0:172.17.0.4<0>
641fb48b811f:168:231 [0] NCCL INFO Using network Socket
641fb48b811f:168:231 [0] NCCL INFO Channel 00/32

INFO: Training loop in progress
Epoch 80/80
Producing predictions: 100%|████████████████████| 29/29 [00:01<00:00, 14.97it/s]
Start to calculate AP for each class
*******************************
airpods       AP    0.98827
              mAP   0.98827
*******************************
Validation loss: 2.160589877901406
INFO: Evaluation metrics generated.

Epoch 00080: saving model to /workspace/tao-experiments/yolo_v4_tiny/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_080.tlt
INFO: Training loop in progress
INFO: Training loop complete.
INFO: YOLO_V4 training finished successfully.
INFO: Training finished successfully.
641fb48b811f:168:231 [0] NCCL INFO comm 0x7fcf94424f80 rank 0 nranks 1 cudaDev 0 busId 1000 - Destroy COMPLETE
Telemetry data couldn't be sent, but the command ran successfully.
Execution status: PASS
2023-04-06 11:52:10,973 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.


In [19]:
# Listing the newly retrained model.
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/weights

total 553920
-rw-r--r-- 1 root root 70901440 Apr  6 10:59 yolov4_cspdarknet_tiny_epoch_010.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:06 yolov4_cspdarknet_tiny_epoch_020.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:14 yolov4_cspdarknet_tiny_epoch_030.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:21 yolov4_cspdarknet_tiny_epoch_040.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:29 yolov4_cspdarknet_tiny_epoch_050.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:37 yolov4_cspdarknet_tiny_epoch_060.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:44 yolov4_cspdarknet_tiny_epoch_070.tlt
-rw-r--r-- 1 root root 70901440 Apr  6 11:52 yolov4_cspdarknet_tiny_epoch_080.tlt


In [20]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/yolov4_training_log_cspdarknet_tiny.csv
%set_env EPOCH=080

epoch,AP_airpods,loss,lr,mAP,validation_loss
1,nan,12.48019,4.2624997e-06,nan,nan
2,nan,12.252178,8.425e-06,nan,nan
3,nan,13.571616,1.2587499e-05,nan,nan
4,nan,12.42761,1.6749998e-05,nan,nan
5,nan,11.299309,2.09125e-05,nan,nan
6,nan,11.570363,2.5074998e-05,nan,nan
7,nan,11.58001,2.9237499e-05,nan,nan
8,nan,12.644869,3.3399996e-05,nan,nan
9,nan,10.927196,3.75625e-05,nan,nan
10,0.9414169758812617,11.047786,4.1724998e-05,0.9414169758812617,3.6862680912017822
11,nan,10.47694,4.5887497e-05,nan,nan
12,nan,10.542068,5.0049995e-05,nan,nan
13,nan,10.446078,5.4212498e-05,nan,nan
14,nan,9.788577,5.8374997e-05,nan,nan
15,nan,10.352352,6.2537496e-05,nan,nan
16,nan,9.960376,6.67e-05,nan,nan
17,nan,9.867817,7.0862494e-05,nan,nan
18,nan,11.310824,7.5025e-05,nan,nan
19,nan,10.383048,7.918749e-05,nan,nan
20,0.8776723276723277,11.457218,8.3349994e-05,0.8776723276723277,4.006135315730654
21,nan,12.118472,8.75125e-05,nan,nan
22,nan,11.397757,9.167499e-05,nan,nan
23,nan,10.564539,9.5837495e-05,nan,nan
24,na
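
Picking the best checkpoint by eye works for a short log, but it can also be done programmatically. The sketch below is a minimal example, assuming the CSV layout shown above, where epochs without an evaluation have `nan` in the `mAP` column. The helper name `best_epoch` is hypothetical, and the sample rows are copied from the log above.

```python
import csv
import io
import math


def best_epoch(csv_text):
    """Return (epoch, mAP) for the evaluated epoch with the highest mAP."""
    best = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        m_ap = float(row["mAP"])
        if math.isnan(m_ap):
            continue  # no evaluation was run at this epoch
        if best is None or m_ap > best[1]:
            best = (int(row["epoch"]), m_ap)
    return best


sample_log = """epoch,AP_airpods,loss,lr,mAP,validation_loss
9,nan,10.927196,3.75625e-05,nan,nan
10,0.9414169758812617,11.047786,4.1724998e-05,0.9414169758812617,3.6862680912017822
19,nan,10.383048,7.918749e-05,nan,nan
20,0.8776723276723277,11.457218,8.3349994e-05,0.8776723276723277,4.006135315730654
"""
epoch, m_ap = best_epoch(sample_log)
print(f"{epoch:03d}", m_ap)
```

The zero-padded epoch string has the same form as the value passed to `%set_env EPOCH=...`, so it lines up with the checkpoint file names. Note that only epochs saved by `checkpoint_interval` have a matching `.tlt` file.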

## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>

In [21]:
!tao yolo_v4_tiny evaluate -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
                      -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
                      -k $KEY

2023-04-06 12:20:07,301 [INFO] root: Registry: ['nvcr.io']
2023-04-06 12:20:07,329 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-06 04:20:08.732092: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Input (InputLayer)              (None, 3, 384, 1248) 0                                            
__________________________________________________________________________________________________
Input_qdq (QDQ)                 (None, 3, 384, 1248) 1           Input[0][0]                      
__________________________________________________________________________________________________
conv_0 (QuantizedConv2D)        (None, 32, 192, 624) 864         Input_qdq[0][0]                  
__________________________________________________________________________________________________
conv_0_bn (BatchNormalization)  (None, 32, 192, 624) 128         conv_0[0][0]                     
__________________________________________________________________________________________________


2023-04-06 04:20:18,432 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2023-04-06 04:20:18,433 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2023-04-06 04:20:18,433 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2023-04-06 04:20:18,433 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 20, io threads: 40, compute threads: 20, buffered batches: -1
2023-04-06 04:20:18,433 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 29, number of sources: 1, batch size per gpu: 1, steps: 29


2023-04-06 04:20:18,504 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2023-04-06 04:20:18,691 [INFO] modulus.blocks.data_loaders.mul
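
The `Param #` column in the model summary printed above can be checked by hand. The sketch below reproduces the counts for `conv_0` and `conv_0_bn`. It assumes that `conv_0` is a 3x3 convolution without bias over the 3 input channels, which is inferred from the count of 864 rather than stated in the log. It also assumes that batch normalization stores four values per channel. Both helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=False):
    """Parameter count of a 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)


def batchnorm_params(channels):
    """Gamma, beta, moving mean, and moving variance: four values per channel."""
    return 4 * channels


print(conv2d_params(3, 3, 3, 32))  # compare with the conv_0 row
print(batchnorm_params(32))        # compare with the conv_0_bn row
```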

## 9. Visualize inferences <a class="anchor" id="head-9"></a>
In this section, we run the `infer` tool to generate inferences on the trained models and visualize the results.

In [None]:
# Copy some test images
!mkdir -p $LOCAL_DATA_DIR/test_samples
!cp $LOCAL_DATA_DIR/testing/image_2/00000* $LOCAL_DATA_DIR/test_samples/

In [22]:
!echo $EPOCH

080


In [23]:
# Running inference for detection on n images
!tao yolo_v4_tiny inference -i $DATA_DOWNLOAD_DIR/test_samples \
                       -o $USER_EXPERIMENT_DIR/yolo_infer_images \
                       -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
                       -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
                       -l $USER_EXPERIMENT_DIR/yolo_infer_labels \
                       -k $KEY

2023-04-06 12:23:14,347 [INFO] root: Registry: ['nvcr.io']
2023-04-06 12:23:14,379 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-06 04:23:15.841451: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.

Using TLT model for inference, setting batch size to the one in eval_config: 1
100%|█████████████████████████████████████████████| 4/4 [00:02<00:00,  1.56it/s]
Telemetry data couldn't be sent, but the command ran successfully.
Execution status: PASS
2023-04-06 12:23:30,497 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.


The `inference` tool produces two outputs:
1. Images with the detections overlaid, in `$LOCAL_EXPERIMENT_DIR/yolo_infer_images`
2. Frame-by-frame bounding box labels in KITTI format, in `$LOCAL_EXPERIMENT_DIR/yolo_infer_labels`
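
The label files can be post-processed with a few lines of Python. The sketch below parses one KITTI-format line into a class name, a bounding box, and an optional confidence. It assumes the standard 15-field KITTI layout (class, truncation, occlusion, alpha, four bounding box values, then 3D fields) and that a confidence score, when present, is appended as a 16th field. The names `KittiDetection` and `parse_kitti_line` and the example line are hypothetical.

```python
from typing import NamedTuple, Optional


class KittiDetection(NamedTuple):
    label: str
    bbox: tuple              # (left, top, right, bottom) in pixels
    score: Optional[float]   # confidence, if the line carries a 16th field


def parse_kitti_line(line: str) -> KittiDetection:
    """Parse one whitespace-separated KITTI label line."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    left, top, right, bottom = (float(v) for v in fields[4:8])
    score = float(fields[15]) if len(fields) > 15 else None
    return KittiDetection(fields[0], (left, top, right, bottom), score)


# Made-up label line for illustration only.
example = "airpods 0.00 0 0.00 100.00 120.00 220.00 260.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.93"
det = parse_kitti_line(example)
print(det.label, det.bbox, det.score)
```

Splitting on whitespace assumes class names contain no spaces, which holds for the single `airpods` class used here.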

In [None]:
# Simple grid visualizer
#!pip3 install matplotlib==3.3.3
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    # image_dir is relative to $LOCAL_EXPERIMENT_DIR.
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr two-dimensional even for a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path)
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
# Visualizing the sample images.
#OUTPUT_PATH = 'yolo_infer_images' # relative path from $LOCAL_EXPERIMENT_DIR.
#COLS = 3 # number of columns in the visualizer grid.
#IMAGES = 9 # number of images to visualize.

#visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 10. Model Export <a class="anchor" id="head-10"></a>

If you trained a non-QAT model, you may export in FP32, FP16 or INT8 mode using the code block below. For INT8, you need to provide a calibration image directory.

In [26]:
# tao <task> export will fail if .etlt already exists. So we clear the export folder before tao <task> export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
# Export in FP32 mode. Change --data_type to fp16 for FP16 mode
!tao yolo_v4_tiny export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
                    -k $KEY \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt \
                    -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
                    --batch_size 16 \
                    --data_type fp32 \
                    --gen_ds_config

# Uncomment to export in INT8 mode (generate calibration cache file). 
#!tao yolo_v4_tiny export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt  \
#                     -o $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt \
#                     -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
#                     -k $KEY \
#                     --cal_image_dir $DATA_DOWNLOAD_DIR/testing/image_2 \
#                     --data_type int8 \
#                     --batch_size 16 \
#                     --batches 10 \
#                     --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin  \
#                     --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile \
#                     --gen_ds_config

2023-04-07 16:03:40,003 [INFO] root: Registry: ['nvcr.io']
2023-04-07 16:03:40,033 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-07 08:03:41.577475: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-07 08:03:48,395 [INFO] iva.common.export.keras_exporter: Using input nodes: ['Input']
2023-04-07 08:03:48,395 [INFO] iva.common.export.keras_exporter: Using output nodes: ['BatchedNMS']
The ONNX operator number change on the optimization: 320 -> 158
2023-04-07 08:04:28,709 [INFO] k

`Note:` In this example, for ease of execution, we restrict the number of calibration batches to 10. TAO Toolkit recommends using at least 10% of the training dataset for INT8 calibration.
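
As a worked example of that recommendation, the sketch below computes how many calibration batches are needed to cover a given percentage of the training set at a given batch size. The helper name `calibration_batches` and the dataset size are hypothetical. Substitute the size of your own training set.

```python
import math


def calibration_batches(num_training_images, batch_size, percent=10):
    """Smallest number of batches that covers `percent` of the training set."""
    return math.ceil(num_training_images * percent / (100 * batch_size))


# Hypothetical training set of 6000 images, calibrated in batches of 16.
print(calibration_batches(num_training_images=6000, batch_size=16))
```

The result is the value you would pass to `--batches` for the same `--batch_size`, provided the calibration image directory holds at least that many images.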

If you trained a QAT model, you may only export in INT8 mode, using the following code block. This generates an `.etlt` file and the corresponding calibration cache. You can discard the calibration cache and use just the `.etlt` file in tao-converter or DeepStream for FP32 or FP16 mode, but note that this gives sub-optimal results. If you want to deploy in FP32 or FP16, you should disable QAT in training.

In [None]:
# Uncomment to export QAT model in INT8 mode (generate calibration cache file).
#!rm -rf $LOCAL_EXPERIMENT_DIR/export
#!mkdir -p $LOCAL_EXPERIMENT_DIR/export
#!tao yolo_v4_tiny export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt  \
#                    -o $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt \
#                    -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
#                    -k $KEY \
#                    --data_type int8 \
#                    --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin

In [27]:
print('Exported model:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export

Exported model:
------------
total 23M
-rw-r--r-- 1 root root   8 Apr  7 16:04 labels.txt
-rw-r--r-- 1 root root 277 Apr  7 16:04 nvinfer_config.txt
-rw-r--r-- 1 root root 23M Apr  7 16:04 yolov4_cspdarknet_tiny_epoch_080.etlt


Verify engine generation using the `tao-converter` utility included with the docker.

The `tao-converter` produces optimized TensorRT engines for the platform it runs on. Therefore, to get maximum performance, instantiate this docker and execute the `tao-converter` command with the exported `.etlt` file and calibration cache (for INT8 mode) on your target device. The `tao-converter` utility included in this docker only works on x86 devices with discrete NVIDIA GPUs.

For Jetson devices, please download the Jetson build of `tao-converter` from the dev zone link [here](https://developer.nvidia.com/tao-converter).

The `-p` argument in the following command is the optimization profile. It should be in the format `<input_node>,<min_shape>,<opt_shape>,<max_shape>`. In YOLO v4 Tiny, the three shapes should differ only in the batch dimension.
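
Writing the profile string by hand is error-prone, so it can help to generate it. The sketch below builds a `-p` value from a single channel/height/width shape and three batch sizes, which guarantees that the shapes differ only in the batch dimension. The helper name `optimization_profile` is hypothetical and is not part of the TAO tooling.

```python
def optimization_profile(input_node, chw, min_batch, opt_batch, max_batch):
    """Build a `-p` value whose three shapes differ only in the batch dimension."""
    if not min_batch <= opt_batch <= max_batch:
        raise ValueError("expected min_batch <= opt_batch <= max_batch")

    def shape(batch):
        # Prepend the batch size to the shared channel/height/width dimensions.
        return "x".join(str(dim) for dim in (batch, *chw))

    return ",".join([input_node, shape(min_batch), shape(opt_batch), shape(max_batch)])


profile = optimization_profile("Input", (3, 384, 1248), 1, 8, 16)
print(profile)
```

With these arguments the helper reproduces the profile string used in the converter command in this notebook.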

In [29]:
# Convert to TensorRT engine (FP32)
!tao converter -k $KEY \
                  -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
                  -e $USER_EXPERIMENT_DIR/export/trt.engine \
                  -t fp32 \
                  $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt

#Convert to TensorRT engine (FP16)
#!tao converter -k $KEY \
#                   -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
#                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
#                  -t fp16 \
#                   $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt

# Convert to TensorRT engine (INT8)
#!tao converter -k $KEY  \
#                   -p Input,1x3x384x1248,8x3x384x1248,16x3x384x1248 \
#                   -c $USER_EXPERIMENT_DIR/export/cal.bin \
#                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
#                   -b 8 \
#                   -t int8 \
#                   $USER_EXPERIMENT_DIR/export/yolov4_cspdarknet_tiny_epoch_$EPOCH.etlt

2023-04-07 16:23:23,865 [INFO] root: Registry: ['nvcr.io']
2023-04-07 16:23:23,896 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +328, GPU +0, now: CPU 340, GPU 792 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +445, GPU +118, now: CPU 840, GPU 910 (MiB)
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/fileBUy6gS
[INFO] ONNX IR version:  0.0.8
[INFO] Opset version:    15
[INFO] Producer name:    
[INFO] Producer version: 
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[IN

In [30]:
print('Exported engine:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export/trt.engine

Exported engine:
------------
-rw-r--r-- 1 root root 13M Apr  7 16:30 /home/hirain/cv_samples_v1.4.0/yolo_v4_tiny/export/trt.engine
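
The same check can be done from Python, which is convenient when you want the notebook to stop early if the conversion did not produce a usable file. This is only a sketch: `engine_size_mib` is a hypothetical helper, and it assumes `LOCAL_EXPERIMENT_DIR` is set in the environment as in the `ls` cell above (it falls back to the current directory otherwise).

```python
import os


def engine_size_mib(path):
    """Return the size of the engine file in MiB, raising if it is missing or empty."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"engine not found: {path}")
    size_bytes = os.path.getsize(path)
    if size_bytes == 0:
        raise ValueError(f"engine file is empty: {path}")
    return size_bytes / (1024 * 1024)


# Assumption: LOCAL_EXPERIMENT_DIR was exported earlier in the notebook.
engine_path = os.path.join(
    os.environ.get("LOCAL_EXPERIMENT_DIR", "."), "export", "trt.engine"
)

if os.path.isfile(engine_path):
    print(f"{engine_path}: {engine_size_mib(engine_path):.1f} MiB")
else:
    print(f"No engine at {engine_path}; rerun the conversion cell first.")
```

A non-empty file only shows that the converter wrote something. The inference step in the next section is what actually verifies the engine.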


## 11. Verify the deployed model <a class="anchor" id="head-11"></a>
Verify the converted engine by visualizing TensorRT inferences.


In [31]:
# Infer using TensorRT engine
!tao yolo_v4_tiny inference -m $USER_EXPERIMENT_DIR/export/trt.engine \
                       -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
                       -i $DATA_DOWNLOAD_DIR/test_samples \
                       -o $USER_EXPERIMENT_DIR/yolo_infer_images \
                       -t 0.6

2023-04-07 16:39:18,680 [INFO] root: Registry: ['nvcr.io']
2023-04-07 16:39:18,708 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/hirain/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
2023-04-07 08:39:20.046801: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.

[04/07/2023-08:39:25] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[04/07/2023-08:39:25] [TRT] [W] CUDA lazy loading is not enabled

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'yolo_infer_images' # relative path from $USER_EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)