# TAO 16-bit Image Classification (TF2)

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Train an efficientnet-b0 model on a sample dataset converted from Pascal VOC
* Prune the finetuned model
* Retrain the pruned model to recover lost accuracy
* Export the pruned model
* Run Inference on the trained model
* Export the pruned and retrained model to a .etlt file for deployment to DeepStream

At the end of this notebook, you will have generated a trained and optimized `classification` model
trained on 16-bit input images.

### Table of Contents
This notebook shows an example use case for classification using the Train Adapt Optimize (TAO) Toolkit.

1. [Set up env variables and map drives](#head-0)
2. [Prepare dataset and pretrained model](#head-2)
    1. [Convert to 16-bit images](#head-2-1)
    2. [Split the dataset into train/val/test](#head-2-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained models](#head-6)
7. [Retrain pruned models](#head-7)
8. [Testing the model](#head-8)
9. [Visualize inferences](#head-9)
10. [Export and Deploy!](#head-10)
11. [Verify the deployed model](#head-11)

## 1. Set up env variables and map drives <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` as the path to the user's workspace. Please note that the dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collaterals generated by the TAO experiment will be output to `$LOCAL_PROJECT_DIR/classification_16bit`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Please make sure to remove any stray artifacts/files from the `$LOCAL_EXPERIMENT_DIR` or `$LOCAL_DATA_DIR` paths mentioned below that may have been generated by previous experiments. Leftover checkpoint files and similar artifacts may interfere with creating a training graph for a new experiment.*

*Note: By default, this notebook is set up to run training using 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*

In [2]:
# Setting up env variables for cleaner command line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/classification
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
#example os.environ["LOCAL_PROJECT_DIR"] = "/workspace/tao/tao_launcher_starter_kit/classification_tf2"
os.environ["LOCAL_PROJECT_DIR"] =FIX_ME
os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "classification_16bit"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "tao_voc/specs"
)

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

env: KEY=nvidia_tlt
env: NUM_GPUS=1
total 20
-rw------- 1 1007 users 1151 Dec 14 20:37 spec_retrain_qat.yaml
-rw------- 1 1007 users 1362 Dec 14 20:37 spec_retrain_16bit_imgs.yaml
-rw------- 1 1007 users 1309 Dec 14 20:37 spec_retrain.yaml
-rw------- 1 1007 users  902 Dec 14 20:37 spec.yaml
-rw------- 1 1007 users 1248 Jan 30 23:12 spec_16bit_imgs.yaml


The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

## 2. Prepare datasets and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the Pascal VOC dataset for this tutorial. For more details, please visit
http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html#devkit. Please download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar to `$LOCAL_DATA_DIR`.

In [None]:
# Check that file is present
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
print(DATA_DIR)
if not os.path.isfile(os.path.join(DATA_DIR , 'VOCtrainval_11-May-2012.tar')):
    print('tar file for dataset not found. Please download.')
else:
    print('Found dataset.')

In [None]:
# unpack 
!tar -xvf $LOCAL_DATA_DIR/VOCtrainval_11-May-2012.tar -C $LOCAL_DATA_DIR 

In [5]:
# verify
!ls $LOCAL_DATA_DIR/VOCdevkit/VOC2012

Annotations  JPEGImages			 SegmentationClass
ImageSets    JPEGImages_16bit_grayscale  SegmentationObject


### A. Convert to 16bit image <a class="anchor" id="head-2-1"></a>

In [6]:
# install pip requirements
!pip3 install tqdm
!pip3 install matplotlib==3.3.3

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting matplotlib==3.3.3
  Downloading matplotlib-3.3.3-cp38-cp38-manylinux1_x86_64.whl (11.6 MB)
Installing collected packages: matplotlib
  Attempting uninstall: matplotlib
    Found existing installation: matplotlib 3.5.0
    Uninstalling matplotlib-3.5.0:
      Successfully uninstalled matplotlib-3.5.0
Successfully installed matplotlib-3.3.3

In [3]:
# Convert RGB images to (fake) 16-bit grayscale
import os
import numpy as np
from PIL import Image
from tqdm import tqdm

def to16bit(images_dir, img_file, images_dir_16_bit):
    # Load the source image as 8-bit grayscale
    image = Image.open(os.path.join(images_dir, img_file)).convert("L")
    # Widen the dtype first, then shift values into the higher byte to get a fake 16-bit image
    image_np = np.array(image, dtype=np.uint32) * 256
    image16 = Image.fromarray(image_np)
    # Save as PNG in the 16-bit output directory; the source image is left untouched
    img_file = os.path.splitext(img_file)[0] + '.png'
    image16.save(os.path.join(images_dir_16_bit, img_file))

In [4]:
from os.path import join as join_path
# Generate 16-bit grayscale images for train/val splits

!mkdir -p $LOCAL_DATA_DIR/training/image_2_16bit_grayscale
DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
source_dir = join_path(DATA_DIR, "VOCdevkit/VOC2012")
images_dir = join_path(source_dir, "JPEGImages")
images_dir_16_bit = images_dir.replace('JPEGImages','JPEGImages_16bit_grayscale')
os.makedirs(images_dir_16_bit,exist_ok=True)
for img_file in tqdm(os.listdir(images_dir)):
    to16bit(images_dir,img_file,images_dir_16_bit)

100%|█████████████████████████████████████████████████████████████████████| 17125/17125 [29:07<00:00,  9.80it/s]


In [5]:
im = Image.open(join_path(images_dir_16_bit,'2008_007890.png'))
print("size:",im.size)
print("mode:",im.mode)
print("format:",im.format)
print(np.array(im).astype(np.uint32).shape)

size: (500, 375)
mode: I
format: PNG
(375, 500)
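To make the "fake 16-bit" conversion concrete, here is a minimal sketch on a tiny synthetic array, so no image files are needed. It assumes only NumPy; the helper name `scale_8bit_to_16bit` is hypothetical and not part of the notebook. Widening the dtype before multiplying avoids overflow or dtype promotion behavior that can differ between NumPy versions.

```python
import numpy as np

def scale_8bit_to_16bit(pixels_8bit):
    """Shift 8-bit values into the high byte of a 16-bit range (hypothetical helper)."""
    # Widen first so the multiplication cannot overflow the uint8 range.
    widened = np.asarray(pixels_8bit, dtype=np.uint32)
    return widened * 256

# Tiny synthetic 8-bit "image" covering the extremes of the range.
sample = np.array([[0, 1], [128, 255]], dtype=np.uint8)
scaled = scale_8bit_to_16bit(sample)
print(scaled.dtype, scaled.min(), scaled.max())
```

The largest possible value is 255 * 256 = 65280, so the result stays inside the 16-bit range (0 to 65535) but uses only 256 distinct levels, which is why the conversion is described as producing a fake 16-bit image.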


### B. Split the dataset into train/val/test <a class="anchor" id="head-2-2"></a>

The next two cells first convert the Pascal VOC dataset into a one-folder-per-class layout for classification, and then split it into train/val/test sets.

In [6]:
from os.path import join as join_path
import os
import glob
import re
import shutil

DATA_DIR=os.environ.get('LOCAL_DATA_DIR')
source_dir = join_path(DATA_DIR, "VOCdevkit/VOC2012")
target_dir = join_path(DATA_DIR, "formatted")


suffix = '_trainval.txt'
classes_dir = join_path(source_dir, "ImageSets", "Main")
images_dir = join_path(source_dir, "JPEGImages_16bit_grayscale")
classes_files = glob.glob(classes_dir+"/*"+suffix)
for file in classes_files:
    # get the filename and make output class folder
    classname = os.path.basename(file)
    if classname.endswith(suffix):
        classname = classname[:-len(suffix)]
        target_dir_path = join_path(target_dir, classname)
        if not os.path.exists(target_dir_path):
            os.makedirs(target_dir_path)
    else:
        continue
    print(classname)


    with open(file) as f:
        content = f.readlines()


    for line in content:
        tokens = re.split(r'\s+', line)
        if tokens[1] == '1':
            # copy this image into target dir_path
            target_file_path = join_path(target_dir_path, tokens[0] + '.png')
            src_file_path = join_path(images_dir, tokens[0] + '.png')
            shutil.copyfile(src_file_path, target_file_path)

motorbike
car
train
tvmonitor
pottedplant
chair
cow
cat
bus
boat
aeroplane
person
dog
horse
sheep
sofa
bird
bottle
diningtable
bicycle


In [7]:
import os
import glob
import shutil
from random import shuffle
from tqdm import tqdm

DATA_DIR=os.environ.get('LOCAL_DATA_DIR')
SOURCE_DIR=os.path.join(DATA_DIR, 'formatted')
TARGET_DIR=os.path.join(DATA_DIR,'split_16bit')
# list dir
print(os.walk(SOURCE_DIR))
dir_list = next(os.walk(SOURCE_DIR))[1]
# for each dir, create a new dir in split
for dir_i in tqdm(dir_list):
        newdir_train = os.path.join(TARGET_DIR, 'train', dir_i)
        newdir_val = os.path.join(TARGET_DIR, 'val', dir_i)
        newdir_test = os.path.join(TARGET_DIR, 'test', dir_i)
        
        if not os.path.exists(newdir_train):
                os.makedirs(newdir_train)
        if not os.path.exists(newdir_val):
                os.makedirs(newdir_val)
        if not os.path.exists(newdir_test):
                os.makedirs(newdir_test)

        img_list = glob.glob(os.path.join(SOURCE_DIR, dir_i, '*.png'))
        # shuffle data
        shuffle(img_list)

        for j in range(int(len(img_list)*0.7)):
                shutil.copy2(img_list[j], os.path.join(TARGET_DIR, 'train', dir_i))

        for j in range(int(len(img_list)*0.7), int(len(img_list)*0.8)):
                shutil.copy2(img_list[j], os.path.join(TARGET_DIR, 'val', dir_i))
                
        for j in range(int(len(img_list)*0.8), len(img_list)):
                shutil.copy2(img_list[j], os.path.join(TARGET_DIR, 'test', dir_i))
                
print('Done splitting dataset.')

<generator object walk at 0x7f260bf4dba0>


100%|███████████████████████████████████████████████████████████████████████████| 20/20 [00:11<00:00,  1.72it/s]

Done splitting dataset.
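The split above uses an unseeded shuffle, so each run produces a different partition. As a minimal sketch of the same 70/10/20 proportions with a reproducible shuffle, the hypothetical helper below (`split_items` is not part of the notebook) works on any list of file names and uses integer arithmetic for the boundaries to avoid floating-point rounding.

```python
import random

def split_items(items, train_pct=70, val_pct=10, seed=0):
    """Return (train, val, test) lists using a seeded shuffle (hypothetical helper)."""
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]

names = [f"img_{i}.png" for i in range(10)]
train, val, test = split_items(names)
print(len(train), len(val), len(test))
```

Because the seed is fixed, calling `split_items(names)` again returns the same three lists, which makes it easier to compare experiments run on the same data.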





In [8]:
!ls $LOCAL_DATA_DIR/split_16bit/test/cat

2008_000112.png  2008_006962.png  2010_000224.png  2010_003641.png
2008_000115.png  2008_006973.png  2010_000244.png  2010_003665.png
2008_000116.png  2008_007039.png  2010_000291.png  2010_003672.png
2008_000227.png  2008_007164.png  2010_000303.png  2010_003696.png
2008_000358.png  2008_007165.png  2010_000442.png  2010_003747.png
2008_000464.png  2008_007216.png  2010_000458.png  2010_003764.png
2008_000536.png  2008_007260.png  2010_000488.png  2010_003773.png
2008_000641.png  2008_007289.png  2010_000497.png  2010_003800.png
2008_000660.png  2008_007327.png  2010_000576.png  2010_003818.png
2008_000724.png  2008_007353.png  2010_000616.png  2010_003861.png
2008_000950.png  2008_007632.png  2010_000647.png  2010_003863.png
2008_000999.png  2008_007640.png  2010_000665.png  2010_003887.png
2008_001071.png  2008_007662.png  2010_000702.png  2010_003971.png
2008_001335.png  2008_007794.png  2010_000712.png  2010_003976.png
2008_001357.png  2008_007888.png  2010_000737.pn

## 3. Provide training specification <a class="anchor" id="head-3"></a>
* Training dataset
* Validation dataset
* Pre-trained models
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.

In [11]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/results_voc_16bit/   
!cat $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml

results_dir: 'RESULTSDIR'
key: 'ENC_KEY'
data:
  train_dataset_path: "/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/train"
  val_dataset_path: "/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/val"
  preprocess_mode: 'torch'
augment:
  enable_color_augmentation: False
  enable_center_crop: True
train:
  qat: False
  pretrained_model_path: ''
  batch_size_per_gpu: 64
  num_epochs: 80
  optim_config:
    optimizer: 'sgd'
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.05
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
model:
  arch: 'efficientnet-b0'
  input_image_size: [1,256,256]
  input_image_depth: 16
evaluate:
  dataset_path: "/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/test"
  model_path: "/workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/output_16bit/weights/efficien

*Note: For 16-bit images, `input_image_depth` must be set to 16.*

If you want to use the `image_mean` of your own 16-bit image dataset, you can calculate it with the cell below.


In [12]:
def norm(img_dir):
    cnt = 0
    means = 0
    print(img_dir)
    for imgfile in tqdm(os.listdir(img_dir)):
        if os.path.splitext(imgfile)[-1] == '.png':
            img = Image.open(os.path.join(img_dir , imgfile))
            img_np = np.array(img)
#             assert img_np.shape[0] == 1
            img_np = img_np.astype(np.float32)
            means += img_np.mean()
            cnt += 1
    means /= cnt
    return means

DATA_DIR=os.environ.get('LOCAL_DATA_DIR')
source_dir = join_path(DATA_DIR, "VOCdevkit/VOC2012")
images_dir_16_bit = join_path(source_dir,'JPEGImages_16bit_grayscale')
mean = norm(images_dir_16_bit)
print(mean)

/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/VOCdevkit/VOC2012/JPEGImages_16bit_grayscale


100%|████████████████████████████████████████████████████████████████████| 17125/17125 [01:20<00:00, 211.77it/s]

28346.274527400776
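The mean printed above is in raw 16-bit pixel units, whereas the retrain spec shown later in this notebook uses `image_mean: [0.449]`, which looks like a value on a 0 to 1 scale. Assuming `image_mean` is expected on that scale (an assumption worth checking against the TAO documentation), a minimal sketch of the conversion is:

```python
# Mean printed by the cell above, in raw 16-bit pixel units.
raw_mean = 28346.274527400776

# Largest value a 16-bit unsigned pixel can hold.
MAX_UINT16 = 2 ** 16 - 1  # 65535

# Assumption: image_mean in the spec is expressed as a fraction of full scale.
normalized_mean = raw_mean / MAX_UINT16
print(round(normalized_mean, 4))
```

Under that assumption, the dataset mean comes out at roughly 0.43 of full scale.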





## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Fill in the results directory and encryption key placeholders in the spec file
* Launch training with the updated spec file

In [15]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/output_16bit
!sed -i "s|RESULTSDIR|$LOCAL_EXPERIMENT_DIR/output_16bit|g" $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml
!sed -i "s|ENC_KEY|$KEY|g" $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml

In [16]:
!classification_tf2 train -e $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml

Setting up communication with ClearML server.
ClearML task init failed with error ClearML configuration could not be found (missing `~/clearml.conf` or Environment CLEARML_API_HOST)
To get started with ClearML: setup your own `clearml-server`, or create a free account at https://app.clear.ml
Training will still continue.
Starting classification training.
Found 15185 images belonging to 20 classes.
Processing dataset (train): /workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/train
Found 3154 images belonging to 20 classes.
Processing dataset (validation): /workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/val
Model: "efficientnet-b0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (InputLayer)             [(None, 1, 256, 256  0           []                               
           

In [17]:
print("To run this training in data parallelism using multiple GPU's, please uncomment the line below and "
      "update the --gpus parameter to the number of GPU's you wish to use.")
# !classification_tf2 train -e $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml --gpus 2

To run this training in data parallelism using multiple GPU's, please uncomment the line below and update the --gpus parameter to the number of GPU's you wish to use.


In [None]:
print("To resume from a checkpoint,  just relaunch training with the same spec file.")
# !classification_tf2 train -e $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml --gpus 2

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In this step, we assume that training is complete and the model from the final epoch (`efficientnet-b0_080.tlt`) is available. To evaluate an earlier checkpoint, edit the spec file at `$LOCAL_SPECS_DIR/spec_16bit_imgs.yaml` to point to the intended model.

In [18]:
# get the last checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'output_16bit', 'weights')):
    if f.startswith('efficientnet-b'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

Last checkpoint: efficientnet-b0_080.tlt
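The cell above picks the last checkpoint by comparing file names as strings, which works as long as the epoch numbers are zero-padded. As a minimal sketch of a variant that compares epoch numbers numerically, the hypothetical helper below assumes checkpoint names follow the `<arch>_<epoch>.tlt` pattern seen in the output above.

```python
import re

def latest_checkpoint(filenames, prefix="efficientnet-b"):
    """Pick the checkpoint with the highest epoch number (hypothetical helper)."""
    # Assumed naming pattern: <arch>_<epoch>.tlt, e.g. efficientnet-b0_080.tlt
    pattern = re.compile(r"_(\d+)\.tlt$")
    best_name, best_epoch = "", -1
    for name in filenames:
        match = pattern.search(name)
        if name.startswith(prefix) and match:
            epoch = int(match.group(1))
            if epoch > best_epoch:
                best_name, best_epoch = name, epoch
    return best_name

print(latest_checkpoint(["efficientnet-b0_009.tlt", "efficientnet-b0_080.tlt", "status.json"]))
```

In the notebook, the list of names would come from `os.listdir` on the `weights` directory, exactly as in the cell above.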


In [19]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$LOCAL_EXPERIMENT_DIR/output_16bit/weights/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml

env: LAST_CHECKPOINT=efficientnet-b0_080.tlt


In [20]:
!classification_tf2 evaluate -e $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml

Log file already exists at /output_16bit/status.json
Starting classification evaluation.
Model: "efficientnet-b0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (InputLayer)             [(None, 1, 256, 256  0           []                               
                                )]                                                                
                                                                                                  
 stem_conv (Conv2D)             (None, 32, 128, 128  288         ['Input[0][0]']                  
                                )                                                                 
                                                                                                  
 stem_bn (BatchNormalization)   (None, 32, 128, 128  128         ['stem_conv[0][0]']          

## 6. Prune trained models <a class="anchor" id="head-6"></a>
* Specify pre-trained model
* Equalization criterion
* Threshold for pruning
* Exclude prediction layer that you don't want pruned (e.g. predictions)

Usually, you only need to adjust `prune.threshold` to trade off accuracy against model size. A higher `threshold` gives you a smaller model (and thus faster inference) but lower accuracy. The right threshold depends on the dataset; 0.68 is just a starting point. If the accuracy after retraining is good, you can increase this value to get a smaller model. Otherwise, lower it to get better accuracy.

In [21]:
# Specifying the checkpoint to be used for the pruning.
!mkdir -p $LOCAL_EXPERIMENT_DIR/output_16bit/efficientnet-b0_pruned
!sed -i "s|PRUNEDMODEL|$LOCAL_EXPERIMENT_DIR/output_16bit/efficientnet-b0_pruned/model_pruned.tlt|g" $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml
!classification_tf2 prune -e $LOCAL_SPECS_DIR/spec_16bit_imgs.yaml

Log file already exists at /output_16bit/status.json
Starting classification pruning.
Model: "efficientnet-b0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (InputLayer)             [(None, 1, 256, 256  0           []                               
                                )]                                                                
                                                                                                  
 stem_conv (Conv2D)             (None, 32, 128, 128  288         ['Input[0][0]']                  
                                )                                                                 
                                                                                                  
 stem_bn (BatchNormalization)   (None, 32, 128, 128  128         ['stem_conv[0][0]']             

In [22]:
print('Pruned model:')
print('------------')
!ls -rlt $LOCAL_EXPERIMENT_DIR/output_16bit/efficientnet-b0_pruned

Pruned model:
------------
total 6780
-rw-r--r-- 1 root root 6942330 Feb  1 21:34 model_pruned.tlt


## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
* Model needs to be re-trained to bring back accuracy after pruning
* Specify re-training specification

In [24]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/output_16bit_retrain
!sed -i "s|RESULTSDIR|$LOCAL_EXPERIMENT_DIR/output_16bit_retrain|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml
!sed -i "s|ENC_KEY|$KEY|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml
!sed -i "s|PRUNEDMODEL|$LOCAL_EXPERIMENT_DIR/output_16bit/efficientnet-b0_pruned/model_pruned.tlt|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

!cat $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

results_dir: '/workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/output_16bit_retrain'
key: 'nvidia_tlt'
data:
  train_dataset_path: "/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/train"
  val_dataset_path: "/workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/val"
  preprocess_mode: 'torch'
  image_mean: [0.449]
augment:
  enable_color_augmentation: True
  enable_center_crop: True
train:
  qat: False
  pretrained_model_path: '/workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/output_16bit/efficientnet-b0_pruned/model_pruned.tlt'
  batch_size_per_gpu: 64
  num_epochs: 80
  optim_config:
    optimizer: 'sgd'
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.05
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
model:
  arch: 'efficientnet-b0'
  input_image_size: [1,256,256]
  input_image

In [25]:
!classification_tf2 train -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Setting up communication with ClearML server.
ClearML task init failed with error ClearML configuration could not be found (missing `~/clearml.conf` or Environment CLEARML_API_HOST)
To get started with ClearML: setup your own `clearml-server`, or create a free account at https://app.clear.ml
Training will still continue.
Starting classification training.
Found 15185 images belonging to 20 classes.
Processing dataset (train): /workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/train
Found 3154 images belonging to 20 classes.
Processing dataset (validation): /workspace/tao/tao_launcher_starter_kit/classification_tf2/data/split_16bit/val
No training configuration found in save file, so the model was *not* compiled. Compile it manually.
Model: "efficientnet-b0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (In

## 8. Testing the model! <a class="anchor" id="head-8"></a>

In this step, we assume that retraining is complete and the model from the final epoch (`efficientnet-b0_080.tlt`) is available. To evaluate an earlier checkpoint, edit the spec file at `$LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml` to point to the intended model.

In [26]:
# get the last checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'output_16bit_retrain', 'weights')):
    if f.startswith('efficientnet-b'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

Last checkpoint: efficientnet-b0_080.tlt


In [27]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$LOCAL_EXPERIMENT_DIR/output_16bit_retrain/weights/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

env: LAST_CHECKPOINT=efficientnet-b0_080.tlt


In [28]:
!classification_tf2 evaluate -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Log file already exists at /workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/output_16bit_retrain/status.json
Starting classification evaluation.
Model: "efficientnet-b0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (InputLayer)             [(None, 1, 256, 256  0           []                               
                                )]                                                                
                                                                                                  
 stem_conv (Conv2D)             (None, 16, 128, 128  144         ['Input[0][0]']                  
                                )                                                                 
                                                                                                  
 stem_b

## 9. Visualize Inferences <a class="anchor" id="head-9"></a>

To see the model's predictions on test images, we can use the `classification_tf2 inference` command. Note that models trained for more epochs usually give better results. Here we run inference in directory mode; a single-image mode is also available.

In [29]:
!classification_tf2 inference -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Log file already exists at /workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/output_16bit_retrain/status.json
Starting classification inference.
Model: "efficientnet-b0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 Input (InputLayer)             [(None, 1, 256, 256  0           []                               
                                )]                                                                
                                                                                                  
 stem_conv (Conv2D)             (None, 16, 128, 128  144         ['Input[0][0]']                  
                                )                                                                 
                                                                                                  
 stem_bn

In [30]:
!cat $LOCAL_EXPERIMENT_DIR/output_16bit_retrain/result.csv

2008_000021.png,person,0.40387055
2008_000585.png,aeroplane,0.7527091
2008_000805.png,aeroplane,0.79433125
2008_001227.png,person,0.550684
2008_001380.png,aeroplane,0.9901938
2008_001801.png,aeroplane,0.47110713
2008_001971.png,aeroplane,0.979126
2008_002000.png,aeroplane,0.74467754
2008_002138.png,aeroplane,0.9153551
2008_002221.png,aeroplane,0.9847112
2008_002551.png,aeroplane,0.8937309
2008_002698.png,aeroplane,0.85423404
2008_002719.png,aeroplane,0.9483667
2008_002977.png,motorbike,0.32037416
2008_003196.png,aeroplane,0.7038804
2008_003275.png,boat,0.46733144
2008_003729.png,car,0.36578465
2008_003743.png,aeroplane,0.54762065
2008_003788.png,aeroplane,0.99011433
2008_003876.png,aeroplane,0.9959776
2008_003905.png,aeroplane,0.77435935
2008_003926.png,aeroplane,0.834433
2008_004000.png,aeroplane,0.48415333
2008_004138.png,aeroplane,0.9358444
2008_004165.png,aeroplane,0.7361653
2008_004190.png,aeroplane,0.96170306
2008_004348.png,aeroplane,0.40460312
2008_00

As explained in the Getting Started Guide, inference writes a `result.csv` file to the results directory. We can use a simple Python program to visualize the contents of this CSV file.
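As a minimal sketch of such a program, the hypothetical helper below (`summarize_predictions` is not part of TAO) counts how often each label is predicted and averages the confidence per label, then prints a small text bar chart. It assumes rows of the form `filename,label,confidence` with no header, as in the output shown above, and uses only the Python standard library.

```python
import csv
import io
from collections import Counter

def summarize_predictions(csv_text):
    """Count predicted labels and average confidence per label (hypothetical helper)."""
    counts, conf_sums = Counter(), Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or truncated rows
        _, label, confidence = row
        counts[label] += 1
        conf_sums[label] += float(confidence)
    return {label: (n, conf_sums[label] / n) for label, n in counts.items()}

# A few rows copied from the result.csv output shown above.
sample = """2008_000021.png,person,0.40387055
2008_000585.png,aeroplane,0.7527091
2008_000805.png,aeroplane,0.79433125
2008_001227.png,person,0.550684
"""
for label, (n, mean_conf) in sorted(summarize_predictions(sample).items()):
    print(f"{label:<10} {'#' * n} n={n} mean_conf={mean_conf:.3f}")
```

To run it on the real file, read the text with `open(path).read()` and pass it to the helper; the per-label counts could also be fed to a matplotlib bar chart.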

## 10. Export and Deploy! <a class="anchor" id="head-10"></a>

In [31]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/export

!sed -i "s|EXPORTDIR|$LOCAL_EXPERIMENT_DIR/export|g" $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml
!classification_tf2 export -e $LOCAL_SPECS_DIR/spec_retrain_16bit_imgs.yaml

Log file already exists at /workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/output_16bit_retrain/status.json
Starting classification export.
Signatures found in model: [serving_default].
Output names: ['predictions']
Using tensorflow=2.9.1, onnx=1.12.0, tf2onnx=1.12.0/ddca3a
Using opset <onnx, 13>
Computed 0 values for constant folding
Optimizing ONNX model
After optimization: BatchNormalization -42 (49->7), Cast -1 (33->32), Const -378 (569->191), GlobalAveragePool +16 (0->16), Identity -2 (2->0), ReduceMean -16 (16->0), Reshape -16 (33->17), Transpose -17 (17->0), Unsqueeze -64 (64->0)
The etlt model is saved at /workspace/tao/tao_launcher_starter_kit/classification_tf2/classification_16bit/export/efficientnet-b0.etlt
Export finished successfully.
Sending telemetry data.
Telemetry data couldn't be sent, but the command ran successfully.
[Error]: <urlopen error [Errno -2] Name or service not known>
Execution status: PASS


In [32]:
# Check if etlt model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/export

total 1868
-rw-r--r-- 1 root root 1911485 Feb  1 23:34 efficientnet-b0.etlt


Using the `tao-deploy` container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference.

`tao-deploy` produces TensorRT engines optimized for the platform it runs on. Therefore, to get maximum performance, please run the `tao-deploy` command, which instantiates a deploy container, with the exported `.etlt` file on your target device. The `tao-deploy` container only works on x86 with discrete NVIDIA GPUs.

For Jetson devices, please download the tao-converter for Jetson and refer to [here](https://docs.nvidia.com/tao/tao-toolkit/text/tensorrt.html#installing-the-tao-converter) for more details.

If you choose to integrate your model into DeepStream directly, you may do so by copying the exported `.etlt` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to the newly exported model. Usually this file is called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.

To deploy the trained and exported model, please go through the `classification_16bit-Deploy.ipynb` notebook and run it inside the deploy container.