## Learning Objectives
In this notebook, you will learn how to take the pruned and retrained model from the `classification_16bit.ipynb` notebook, which was exported as a `.etlt` file, and deploy it to DeepStream. First, run the `classification_16bit.ipynb` notebook, either with the quick deploy option or inside the `tao-toolkit:4.0.0-tf2.9.1` container, to create the exported file for this model. Then work through this notebook.


## TAO Deploy Container
Using the tao-deploy container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference.

tao-deploy produces TensorRT engines optimized for the platform it runs on. Therefore, to get maximum performance, run the tao-deploy command, which instantiates a deploy container, with the exported .etlt file on your target device. The tao-deploy container only works on x86 systems with discrete NVIDIA GPUs.
 
**Use the following commands to run this notebook inside the deploy container:**

```
docker run -it --rm --gpus all -v /path/to/notebooks:/workspace/tao --net=host nvcr.io/nvidia/tao/tao-toolkit:4.0.0-deploy /bin/bash

pip3 install notebook

pip3 install jupyterlab

cd /workspace

jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root

```

For Jetson devices, please download the tao-converter for Jetson and refer to the [TAO Toolkit documentation](https://docs.nvidia.com/tao/tao-toolkit/text/tensorrt.html#installing-the-tao-converter) for more details.

If you choose to integrate your model into DeepStream directly, copy the exported .etlt file along with the calibration cache to the target device and update the spec file that configures the gst-nvinfer element to point to the newly exported model. This file is usually called config_infer_primary.txt for detection models and config_infer_secondary_*.txt for classification models.


### Table of Contents
This notebook shows an example use case for classification using the Train Adapt Optimize (TAO) Toolkit.

1. [Set up env variables](#head-0)
2. [Prepare dataset](#head-2)
    1. [Convert to 16bit image](#head-2-1)
    2. [Split the dataset into train/val/test](#head-2-2)
3. [Deploy](#head-10)
4. [Verify the deployed model](#head-11)



## 1. Set up env variables <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` as the path to the user's workspace. The dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collaterals generated by the TAO experiment are written to `$LOCAL_PROJECT_DIR/classification_16bit`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Please make sure to remove any stray artifacts/files that may have been generated by previous experiments from the `$LOCAL_EXPERIMENT_DIR` or `$LOCAL_DATA_DIR` paths defined below. Leftover checkpoint files, for example, may interfere with creating a training graph for a new experiment.*

*Note: By default, this notebook is set up to run training using 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*
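
As a quick sanity check before running the cells below, the directory layout described above can be verified from Python. This is only a sketch: the helper name `describe_workspace` is hypothetical, and the sub-directory names (`data`, `classification_16bit`, `tao_voc/specs`) are the ones used by the setup cell in this notebook.

```python
import os


def describe_workspace(project_dir):
    """Return the expected sub-directories and whether each one exists."""
    layout = {
        "LOCAL_DATA_DIR": os.path.join(project_dir, "data"),
        "LOCAL_EXPERIMENT_DIR": os.path.join(project_dir, "classification_16bit"),
        "LOCAL_SPECS_DIR": os.path.join(project_dir, "tao_voc/specs"),
    }
    return {name: (path, os.path.isdir(path)) for name, path in layout.items()}


if __name__ == "__main__":
    # Falls back to the current directory if LOCAL_PROJECT_DIR is not set.
    project_dir = os.environ.get("LOCAL_PROJECT_DIR", os.getcwd())
    for name, (path, exists) in describe_workspace(project_dir).items():
        print(f"{name}: {path} ({'found' if exists else 'missing'})")
```

A `missing` entry for the specs or experiment directory usually means that `$LOCAL_PROJECT_DIR` points at the wrong folder or that the training notebook has not been run yet.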

In [1]:
# Setting up env variables for cleaner command line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/classification_16bit
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
# example: os.environ["LOCAL_PROJECT_DIR"] = "/workspace/tao/tao_launcher_starter_kit/classification_tf2"
os.environ["LOCAL_PROJECT_DIR"] = FIX_ME

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "classification_16bit"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "tao_voc/specs"
)

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

env: KEY=nvidia_tlt
env: NUM_GPUS=1
total 20
-rw------- 1 1007 users 1375 Feb  2 18:09 spec_16bit_imgs.yaml
-rw------- 1 1007 users 1402 Feb  2 21:17 spec.yaml
-rw------- 1 1007 users 2494 Feb  3 22:27 spec_retrain_16bit_imgs.yaml
-rw------- 1 1007 users 2046 Feb  4 00:22 spec_retrain_qat.yaml
-rw------- 1 1007 users 2403 Feb  4 00:33 spec_retrain.yaml


The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

## 2. Prepare datasets <a class="anchor" id="head-2"></a>

We will be using the Pascal VOC dataset for this tutorial. For more details, please visit
http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#devkit. Please download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar to `$LOCAL_DATA_DIR`.
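
If you prefer to fetch the archive from Python rather than by hand, a minimal sketch is shown here. The helper name `ensure_dataset` and its `fetch` parameter are hypothetical; the URL is the one given above, and the download is skipped when the tar file is already present.

```python
import os
import urllib.request

VOC_URL = "http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar"


def ensure_dataset(data_dir, url=VOC_URL, fetch=urllib.request.urlretrieve):
    """Download the archive into data_dir unless it is already there."""
    os.makedirs(data_dir, exist_ok=True)
    target = os.path.join(data_dir, os.path.basename(url))
    if not os.path.isfile(target):
        fetch(url, target)  # urlretrieve(url, filename) saves the file to disk
    return target
```

Calling `ensure_dataset(os.environ["LOCAL_DATA_DIR"])` would place the tar file where the check in the next cell expects it. The archive is large (roughly 2 GB), so the call can take a while.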

In [2]:
# Check that file is present
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
print(DATA_DIR)
if not os.path.isfile(os.path.join(DATA_DIR , 'VOCtrainval_11-May-2012.tar')):
    print('tar file for dataset not found. Please download.')
else:
    print('Found dataset.')

/workspace/tao/tao_launcher_starter_kit/classification_tf2/data
Found dataset.


In [None]:
# unpack 
!tar -xvf $LOCAL_DATA_DIR/VOCtrainval_11-May-2012.tar -C $LOCAL_DATA_DIR 

In [5]:
# verify
!ls $LOCAL_DATA_DIR/VOCdevkit/VOC2012

Annotations  JPEGImages			 SegmentationClass
ImageSets    JPEGImages_16bit_grayscale  SegmentationObject


### A. Convert to 16bit image <a class="anchor" id="head-2-1"></a>

In [3]:
# install pip requirements
!pip3 install tqdm
!pip3 install matplotlib==3.3.3

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting matplotlib==3.3.3
  Downloading matplotlib-3.3.3-cp38-cp38-manylinux1_x86_64.whl (11.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 209.0 MB/s eta 0:00:00
Installing collected packages: matplotlib
  Attempting uninstall: matplotlib
    Found existing installation: matplotlib 3.5.0
    Uninstalling matplotlib-3.5.0:
      Successfully uninstalled matplotlib-3.5.0
Successfully installed matplotlib-3.3.3

In [4]:
# Convert RGB images to (fake) 16-bit grayscale
import os
import numpy as np
from PIL import Image
from tqdm import tqdm

def to16bit(images_dir, img_file, images_dir_16_bit):
    image = Image.open(os.path.join(images_dir, img_file)).convert("L")
    # Widen to uint32 before scaling so the multiplication cannot overflow uint8,
    # then shift the 8-bit values into the higher byte to get a fake 16-bit image.
    image_np = np.array(image).astype(np.uint32) * 256
    image16 = Image.fromarray(image_np)
    # Save as PNG in the 16-bit output directory (the source JPEG is left untouched).
    img_file = os.path.splitext(img_file)[0] + '.png'
    image16.save(os.path.join(images_dir_16_bit, img_file))
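
To make the "shift to the higher byte" step in the conversion cell concrete, here is a small pure-Python sketch of the per-pixel arithmetic. The function name `to_fake_16bit` is hypothetical and is not part of the notebook; it only illustrates why the result is called a fake 16-bit image.

```python
def to_fake_16bit(value_8bit):
    """Move an 8-bit intensity into the high byte of a 16-bit value."""
    if not 0 <= value_8bit <= 255:
        raise ValueError("expected an 8-bit value in [0, 255]")
    return value_8bit << 8  # same as value_8bit * 256


if __name__ == "__main__":
    for v in (0, 1, 128, 255):
        v16 = to_fake_16bit(v)
        # The low byte is always zero, so only 256 distinct levels are used.
        print(v, "->", v16, "low byte:", v16 & 0xFF)
```

The largest output is 255 * 256 = 65280, which fits in 16 bits. The images therefore use the 16-bit range without carrying more information than the 8-bit originals.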

In [5]:
from os.path import join as join_path
# Generate 16-bit grayscale images for train/val splits

!mkdir -p $LOCAL_DATA_DIR/training/image_2_16bit_grayscale
DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
source_dir = join_path(DATA_DIR, "VOCdevkit/VOC2012")
images_dir = join_path(source_dir, "JPEGImages")
images_dir_16_bit = images_dir.replace('JPEGImages','JPEGImages_16bit_grayscale')
os.makedirs(images_dir_16_bit,exist_ok=True)
for img_file in tqdm(os.listdir(images_dir)):
    to16bit(images_dir,img_file,images_dir_16_bit)

100%|█████████████████████████████████████████████████████████████████████| 17125/17125 [29:23<00:00,  9.71it/s]


In [6]:
im = Image.open(join_path(images_dir_16_bit,'2008_007890.png'))
print("size:",im.size)
print("mode:",im.mode)
print("format:",im.format)
print(np.array(im).astype(np.uint32).shape)

size: (500, 375)
mode: I
format: PNG
(375, 500)
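
The cell above reports the Pillow mode `I`, which does not by itself show the bit depth stored on disk. One way to check it, sketched here with only the standard library, is to read the PNG `IHDR` chunk directly. The helper name `read_png_header` is hypothetical. For a 16-bit grayscale PNG, the header should report bit depth 16 and color type 0.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(path):
    """Return (width, height, bit_depth, color_type) from a PNG's IHDR chunk."""
    with open(path, "rb") as fh:
        # 8-byte signature + 4-byte length + 4-byte chunk type + first 10 IHDR bytes
        header = fh.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return width, height, bit_depth, color_type
```

For example, `read_png_header(join_path(images_dir_16_bit, '2008_007890.png'))` can be run in the notebook after the conversion cell.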


### B. Split the dataset into train/val/test <a class="anchor" id="head-2-2"></a>

The Pascal VOC dataset is first converted to the classification format (one directory per class) and then split into train/val/test sets in the next two cells.

In [7]:
from os.path import join as join_path
import os
import glob
import re
import shutil

DATA_DIR=os.environ.get('LOCAL_DATA_DIR')
source_dir = join_path(DATA_DIR, "VOCdevkit/VOC2012")
target_dir = join_path(DATA_DIR, "formatted")


suffix = '_trainval.txt'
classes_dir = join_path(source_dir, "ImageSets", "Main")
images_dir = join_path(source_dir, "JPEGImages_16bit_grayscale")
classes_files = glob.glob(classes_dir+"/*"+suffix)
for file in classes_files:
    # get the filename and make output class folder
    classname = os.path.basename(file)
    if classname.endswith(suffix):
        classname = classname[:-len(suffix)]
        target_dir_path = join_path(target_dir, classname)
        if not os.path.exists(target_dir_path):
            os.makedirs(target_dir_path)
    else:
        continue
    print(classname)


    with open(file) as f:
        content = f.readlines()


    for line in content:
        tokens = re.split(r'\s+', line)
        if tokens[1] == '1':
            # copy this image into target dir_path
            target_file_path = join_path(target_dir_path, tokens[0] + '.png')
            src_file_path = join_path(images_dir, tokens[0] + '.png')
            shutil.copyfile(src_file_path, target_file_path)

motorbike
car
train
tvmonitor
pottedplant
chair
cow
cat
bus
boat
aeroplane
person
dog
horse
sheep
sofa
bird
bottle
diningtable
bicycle


In [8]:
import os
import glob
import shutil
from random import shuffle
from tqdm import tqdm

DATA_DIR=os.environ.get('LOCAL_DATA_DIR')
SOURCE_DIR=os.path.join(DATA_DIR, 'formatted')
TARGET_DIR=os.path.join(DATA_DIR,'split_16bit')
# list dir
print(os.walk(SOURCE_DIR))
dir_list = next(os.walk(SOURCE_DIR))[1]
# for each dir, create a new dir in split
for dir_i in tqdm(dir_list):
        newdir_train = os.path.join(TARGET_DIR, 'train', dir_i)
        newdir_val = os.path.join(TARGET_DIR, 'val', dir_i)
        newdir_test = os.path.join(TARGET_DIR, 'test', dir_i)
        
        if not os.path.exists(newdir_train):
                os.makedirs(newdir_train)
        if not os.path.exists(newdir_val):
                os.makedirs(newdir_val)
        if not os.path.exists(newdir_test):
                os.makedirs(newdir_test)

        img_list = glob.glob(os.path.join(SOURCE_DIR, dir_i, '*.png'))
        # shuffle data
        shuffle(img_list)

        for j in range(int(len(img_list)*0.7)):
                shutil.copy2(img_list[j], os.path.join(TARGET_DIR, 'train', dir_i))

        for j in range(int(len(img_list)*0.7), int(len(img_list)*0.8)):
                shutil.copy2(img_list[j], os.path.join(TARGET_DIR, 'val', dir_i))
                
        for j in range(int(len(img_list)*0.8), len(img_list)):
                shutil.copy2(img_list[j], os.path.join(TARGET_DIR, 'test', dir_i))
                
print('Done splitting dataset.')

<generator object walk at 0x7f64ec770ba0>


100%|███████████████████████████████████████████████████████████████████████████| 20/20 [00:21<00:00,  1.10s/it]

Done splitting dataset.
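
The split cell uses an unseeded shuffle, so each run produces a different train/val/test assignment. If a repeatable split is preferred, one option (an assumption on my part, not something the notebook does) is to shuffle with a seeded generator. The sketch below uses the same 70/10/20 boundaries; `split_items` is a hypothetical helper.

```python
import random


def split_items(items, seed=0, train_end=0.7, val_end=0.8):
    """Shuffle items reproducibly and cut them into train/val/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    n = len(items)
    i, j = int(n * train_end), int(n * val_end)  # same boundaries as the cell above
    return {"train": items[:i], "val": items[i:j], "test": items[j:]}
```

The boundaries are passed as cumulative fractions (0.7 and 0.8) rather than as 0.7 plus 0.1, because the floating-point sum `0.7 + 0.1` is slightly below 0.8 and can shift the cut by one item.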





In [9]:
!ls $LOCAL_DATA_DIR/split_16bit/test/cat

2008_000060.png  2008_006999.png  2009_005177.png  2010_003467.png
2008_000062.png  2008_007039.png  2009_005251.png  2010_003468.png
2008_000112.png  2008_007082.png  2010_000001.png  2010_003481.png
2008_000115.png  2008_007085.png  2010_000048.png  2010_003483.png
2008_000116.png  2008_007151.png  2010_000067.png  2010_003497.png
2008_000182.png  2008_007164.png  2010_000099.png  2010_003509.png
2008_000196.png  2008_007165.png  2010_000114.png  2010_003551.png
2008_000227.png  2008_007176.png  2010_000157.png  2010_003641.png
2008_000358.png  2008_007216.png  2010_000163.png  2010_003665.png
2008_000464.png  2008_007256.png  2010_000172.png  2010_003672.png
2008_000502.png  2008_007260.png  2010_000182.png  2010_003675.png
2008_000536.png  2008_007269.png  2010_000224.png  2010_003696.png
2008_000641.png  2008_007289.png  2010_000244.png  2010_003747.png
2008_000660.png  2008_007324.png  2010_000291.png  2010_003752.png
2008_000724.png  2008_007327.png  2010_000303.pn

As explained in the Getting Started Guide, inference writes its predictions to a CSV file in the results directory. A simple Python program can be used to visualize the contents of that CSV file.

## 3. Deploy! <a class="anchor" id="head-10"></a>

In [18]:
# Check if etlt model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/export

total 12224
-rw-r--r-- 1 root root   22348 Feb  3 21:11 cal.bin
-rw-r--r-- 1 root root  699542 Feb  3 21:10 calib.tensorfile
-rw-r--r-- 1 root root 1911485 Feb  1 23:34 efficientnet-b0.etlt
-rw-r--r-- 1 root root 4871853 Feb  3 21:07 efficientnet-b0.fp32.engine
-rw-r--r-- 1 root root 5004695 Feb  3 21:16 efficientnet-b0.int8.engine


Using the `tao-deploy` container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference.

`tao-deploy` produces TensorRT engines optimized for the platform it runs on. Therefore, to get maximum performance, run the `tao-deploy` command, which instantiates a deploy container, with the exported `.etlt` file on your target device. The `tao-deploy` container only works on x86 systems with discrete NVIDIA GPUs.

For Jetson devices, please download the tao-converter for Jetson and refer to [here](https://docs.nvidia.com/tao/tao-toolkit/text/tensorrt.html#installing-the-tao-converter) for more details.

If you choose to integrate your model into DeepStream directly, copy the exported `.etlt` file along with the calibration cache to the target device and update the spec file that configures the `gst-nvinfer` element to point to the newly exported model. This file is usually called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.
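
The config update described above can also be scripted. The following is a rough sketch, not the official procedure: the helper name `point_config_at_model` is hypothetical, and it assumes the `gst-nvinfer` keys `tlt-encoded-model`, `tlt-model-key` and `int8-calib-file` in the `[property]` section, which should be checked against the DeepStream documentation for your version. Note that `configparser` drops comments when it rewrites the file.

```python
import configparser


def point_config_at_model(config_path, etlt_path, key, calib_path=None):
    """Update the [property] section of a gst-nvinfer config file in place."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read(config_path)
    if not parser.has_section("property"):
        parser.add_section("property")
    parser["property"]["tlt-encoded-model"] = etlt_path
    parser["property"]["tlt-model-key"] = key
    if calib_path is not None:
        parser["property"]["int8-calib-file"] = calib_path
    with open(config_path, "w") as fh:
        # Write "key=value" without spaces around the delimiter.
        parser.write(fh, space_around_delimiters=False)
```

A call might pass the exported `efficientnet-b0.etlt` path, the value of `$KEY`, and the `cal.bin` calibration cache. Keep a backup of the original config if it contains comments you want to preserve.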

In [19]:
# Convert to TensorRT engine (FP32).
!classification_tf2 gen_trt_engine -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Loading uff directly from the package source code
python /usr/local/lib/python3.8/dist-packages/nvidia_tao_deploy/cv/classification_tf2/scripts/gen_trt_engine.py  --config-path /workspace/tao/tao_launcher_starter_kit/classification_tf2/tao_voc/specs --config-name spec_retrain_16bit_imgs.yaml 
Loading uff directly from the package source code
'spec_retrain_16bit_imgs.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
'spec_retrain_16bit_imgs.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_j

In [10]:
# Convert to TensorRT engine (INT8).
!sed -i "s|fp32|int8|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml
!classification_tf2 gen_trt_engine -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Loading uff directly from the package source code
python /usr/local/lib/python3.8/dist-packages/nvidia_tao_deploy/cv/classification_tf2/scripts/gen_trt_engine.py  --config-path /workspace/tao/tao_launcher_starter_kit/classification_tf2/tao_voc/specs --config-name spec_retrain_16bit_imgs.yaml 
Loading uff directly from the package source code
'spec_retrain_16bit_imgs.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
'spec_retrain_16bit_imgs.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_j
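
The `sed` substitution in the cell above rewrites every occurrence of `fp32` in the spec file, including occurrences inside file names, and it gives no feedback when nothing matches. A Python equivalent that reports how many replacements were made is sketched here; `replace_in_file` is a hypothetical helper, not part of TAO.

```python
def replace_in_file(path, old, new):
    """Replace every occurrence of old with new in a text file; return the count."""
    with open(path, "r", encoding="utf-8") as fh:
        text = fh.read()
    count = text.count(old)
    if count:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(text.replace(old, new))
    return count
```

A return value of 0 from `replace_in_file(spec_path, "fp32", "int8")` would indicate that the spec was already switched to INT8 or uses a different spelling, which is worth checking before generating the engine.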

In [22]:
print('Exported model:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export/

Exported model:
------------
total 12M
-rw-r--r-- 1 root root  22K Feb  3 21:11 cal.bin
-rw-r--r-- 1 root root 684K Feb  3 21:10 calib.tensorfile
-rw-r--r-- 1 root root 1.9M Feb  1 23:34 efficientnet-b0.etlt
-rw-r--r-- 1 root root 4.7M Feb  3 21:07 efficientnet-b0.fp32.engine
-rw-r--r-- 1 root root 4.7M Feb  3 21:41 efficientnet-b0.int8.engine


## 4. Verify the deployed model <a class="anchor" id="head-11"></a>

Verify the converted engine by visualizing TensorRT inferences.

In [24]:
# Set engine as model_path
!sed -i "s|$LOCAL_EXPERIMENT_DIR/output_16bit_retrain/weights/$LAST_CHECKPOINT|$LOCAL_EXPERIMENT_DIR/export/efficientnet-b0.fp32.engine|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml
!sed -i "s|batch_size: 256|batch_size: 16|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml
# Running inference 
!classification_tf2 inference -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Loading uff directly from the package source code
python /usr/local/lib/python3.8/dist-packages/nvidia_tao_deploy/cv/classification_tf2/scripts/inference.py  --config-path /workspace/tao/tao_launcher_starter_kit/classification_tf2/tao_voc/specs --config-name spec_retrain_16bit_imgs.yaml 
'spec_retrain_16bit_imgs.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
'spec_retrain_16bit_imgs.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
See https://hydra.cc/docs/next/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
[02/03/2023-22:27:34] [TRT] [W] CUDA lazy loading i

In [25]:
!cat $LOCAL_EXPERIMENT_DIR/output_16bit_retrain/result.csv

/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_000021.png,aeroplane,0.4658203
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_000037.png,aeroplane,0.92578125
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_000151.png,aeroplane,0.7739258
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_000585.png,aeroplane,0.53515625
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_000805.png,aeroplane,0.40625
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_001227.png,chair,0.2487793
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_001380.png,aeroplane,0.68896484
/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test/aeroplane/2008_001719.png,
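
Each row printed above has the form `image path, predicted class, confidence`, and the true class is the name of the image's parent directory. Under that assumption, a short sketch for summarizing the results file is shown below; `summarize_results` is a hypothetical helper, and rows with fewer than three fields are skipped.

```python
import csv
import os
from collections import Counter


def summarize_results(csv_path):
    """Compute overall and per-class accuracy from an inference results CSV."""
    correct = Counter()
    total = Counter()
    with open(csv_path, newline="") as fh:
        for row in csv.reader(fh):
            if len(row) < 3:
                continue  # skip blank or incomplete rows
            image_path, predicted = row[0], row[1]
            # The ground-truth label is the parent directory of the image.
            label = os.path.basename(os.path.dirname(image_path))
            total[label] += 1
            correct[label] += int(predicted == label)
    n = sum(total.values())
    accuracy = sum(correct.values()) / n if n else 0.0
    per_class = {c: correct[c] / total[c] for c in total}
    return accuracy, per_class
```

In the notebook, this could be called with `os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"], "output_16bit_retrain", "result.csv")`, the file shown by the `cat` command above.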