# Object Detection using TAO EfficientDet (TF2)

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is retrained for use on a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train an EfficientDet-D0 model on the COCO dataset
* Evaluate the trained model
* Run pruning and finetuning with the trained model
* Run inference with the trained model and visualize the result
* Export the trained model to a .etlt file for deployment to DeepStream
* Run inference on the exported .etlt model to verify deployment using TensorRT

At the end of this notebook, you will have generated a trained and optimized `EfficientDet` model
which you may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps)
or [DeepStream](https://developer.nvidia.com/deepstream-sdk).

### Table of Contents
This notebook shows an example use case for object detection using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO Launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained model](#head-6)
7. [Retrain pruned models](#head-7)
8. [Evaluate retrained model](#head-8)
9. [Visualize inferences](#head-9)
10. [Deploy](#head-10)
11. [Verify the deployed model](#head-11)

## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` as the path to the user's workspace. Please note that the dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collateral generated by the TAO experiments will be output to `$LOCAL_PROJECT_DIR/efficientdet_tf2`. More information on how to set up the dataset and on the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Please make sure to remove any stray artifacts/files from the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths mentioned below that may have been generated by previous experiments. Leftover checkpoint files, for example, may interfere with creating a training graph for a new experiment.*

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*

In [20]:
# Setting up env variables for cleaner command line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/efficientdet_tf2
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
%env SPECS_DIR=/workspace/tao-experiments/efficientdet_tf2/specs

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/efficientdet_tf2
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
# %env LOCAL_PROJECT_DIR=/workspace/tao-experiments/
%env LOCAL_PROJECT_DIR=/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit/

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/efficientdet_tf2
os.environ["NOTEBOOK_ROOT"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "efficientdet_tf2"
)

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "efficientdet_tf2"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

env: KEY=nvidia_tlt
env: NUM_GPUS=1
env: USER_EXPERIMENT_DIR=/workspace/tao-experiments/efficientdet_tf2
env: DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
env: SPECS_DIR=/workspace/tao-experiments/efficientdet_tf2/specs
env: LOCAL_PROJECT_DIR=/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit/
total 28
-rw-r--r-- 1 vscode vscode 1091 Jul  7 02:18 coco_labels.yaml
-rwxr-xr-x 1 vscode vscode 3165 Jul  7 02:18 download_coco.sh
-rw-r--r-- 1 vscode vscode 2118 Jul  7 16:42 spec_retrain_qat.yaml
-rw-r--r-- 1 vscode vscode  674 Jul  7 19:19 convert_train.yaml
-rw-r--r-- 1 vscode vscode  667 Jul  7 19:19 convert_val.yaml
-rw-r--r-- 1 vscode vscode 2676 Jul  7 19:28 spec_retrain.yaml
-rw-r--r-- 1 vscode vscode 2197 Jul  7 19:28 spec_train.yaml


The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are shared between the host and the container. For more information, please refer to the [launcher instance](https://docs.nvidia.com/metropolis/TAO/tlt-user-guide/tlt_launcher.html) section in the user guide.

When running this cell on AWS, update the `drive_map` entry with the dictionary defined below, so that you don't have permission issues when writing data into folders created by the TAO docker.

```python
drive_map = {
    "Mounts": [
            # Mapping the data directory
            {
                "source": os.environ["LOCAL_PROJECT_DIR"],
                "destination": "/workspace/tao-experiments"
            },
            # Mapping the specs directory.
            {
                "source": os.environ["LOCAL_SPECS_DIR"],
                "destination": os.environ["SPECS_DIR"]
            },
        ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid()),
        "network": "host"
    }
}
```

In [21]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid()),
        "network": "host"
    }
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [22]:
!cat ~/.tao_mounts.json

{
    "Mounts": [
        {
            "source": "/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit/",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit/efficientdet_tf2/specs",
            "destination": "/workspace/tao-experiments/efficientdet_tf2/specs"
        }
    ],
    "DockerOptions": {
        "user": "1000:1000",
        "network": "host"
    }
}
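
Since the launcher mounts whatever `source` paths the file lists, a quick check that they exist on the host can catch typos early. This is a minimal sketch, assuming the mounts file has the `Mounts`/`source` structure written above; `missing_mount_sources` is a hypothetical helper name, not a TAO function.

```python
import json
import os

def missing_mount_sources(mounts_path):
    """Return the 'source' paths in a mounts file that do not exist on the host."""
    with open(mounts_path) as mfile:
        config = json.load(mfile)
    return [
        mount["source"]
        for mount in config.get("Mounts", [])
        if not os.path.exists(mount["source"])
    ]

# Only inspect the real mounts file when it is present.
mounts_file = os.path.expanduser("~/.tao_mounts.json")
if os.path.isfile(mounts_file):
    print(missing_mount_sources(mounts_file))
```

An empty list means every mapped source directory was found.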

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a Python wheel on PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of Python to be used in the virtual env through the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running:

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
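
Version strings like the ones in this list can be compared roughly by turning them into tuples of integers. The sketch below is only an illustration: `version_tuple` and `newer_than` are hypothetical helpers, the "installed" versions are made up, and the comparison ignores anything that is not a digit (so a packaging suffix such as `-1` is treated as one more numeric field).

```python
import re

def version_tuple(version):
    """Turn a version string such as '19.03.5' or '1.3.0-1' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", version))

def newer_than(installed, required):
    """Return True when the installed version is strictly newer than the required one."""
    return version_tuple(installed) > version_tuple(required)

# Made-up installed versions, compared against entries from the list above.
print(newer_than("20.10.7", "19.03.5"))   # docker-ce
print(newer_than("1.3.0-1", "1.3.0-1"))   # same version, so not strictly newer
```

How you obtain the installed version (for example from `docker --version`) is left out here on purpose.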

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

After setting up your virtual environment with the above requirements, install the TAO pip package.

In [23]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-tao

The directory '/var/cache/buildkit/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/var/cache/buildkit/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Looking in indexes: http://pypi.mirrors.ustc.edu.cn/simple/, http://192.168.1.10:7104/test/pypi/
You are using pip version 18.1, however version 21.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [24]:
# View the versions of the TAO launcher
!tao info

Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit']
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023


## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the COCO dataset for the tutorial. The following script will download the COCO dataset automatically and convert it to TFRecords.

In [47]:
!echo $DATA_DOWNLOAD_DIR

/workspace/tao-experiments/data


In [45]:
# Create local dir
!mkdir -p $LOCAL_DATA_DIR
!mkdir -p $LOCAL_EXPERIMENT_DIR
# Download and preprocess data
!tao efficientdet_tf2 run bash $SPECS_DIR/download_coco.sh $DATA_DOWNLOAD_DIR

2023-07-07 19:59:47,399 [INFO] root: Registry: ['nvcr.io']
2023-07-07 19:59:47,433 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
+ '[' -z /workspace/tao-experiments/data ']'
+ UNZIP='unzip -nq'
+ OUTPUT_DIR=/workspace/tao-experiments/data
+ SCRATCH_DIR=/workspace/tao-experiments/data/raw-data
+ mkdir -p /workspace/tao-experiments/data
+ mkdir -p /workspace/tao-experiments/data/raw-data
++ pwd
+ CURRENT_DIR=/opt/nvidia
+ cd /workspace/tao-experiments/data/raw-data
+ BASE_IMAGE_URL=http://images.cocodataset.org/zips
+ TRAIN_IMAGE_FILE=train2017.zip
+ download_and_unzip http://images.cocodataset.org/zips train2017.zip
+ local BASE_URL=http://images.cocodataset.org/zips
+ local FILENAME=train2017.zip
+ '[' '!' -f train2017.zip ']'
++ pwd
+ echo 'Downloading train2017.zip to /workspace/tao-experiments/data/raw-data'
Downloading train2017.zip to /workspace/tao-experiments/data/raw-data
+ wget -nd -c http://im

In [49]:
# convert training data to TFRecords
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_unpruned|g" $LOCAL_SPECS_DIR/convert_train.yaml
!tao efficientdet_tf2 dataset_convert -e $SPECS_DIR/convert_train.yaml

2023-07-07 20:14:54,486 [INFO] root: Registry: ['nvcr.io']
2023-07-07 20:14:54,519 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
[1688732097.434074] [ywh-pc:48   :f]        vfs_fuse.c:424  UCX  WARN  failed to connect to vfs socket '�': Invalid argument
2023-07-07 12:14:57,723 [INFO] matplotlib.font_manager: generated new fontManager
'convert_train.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
Starting efficientdet data conversion.
INFO:tensorflow:writing to output path: /workspace/tao-experiments/data/train
writing to output path: /workspace/tao-experiments/data/train
INFO:tensorflow:Building bounding box index.
Building bounding box index.
INFO:tensorflow:1021 images are missing bboxes.
1021 images are mi

In [50]:
# convert validation data to TFRecords
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_unpruned|g" $LOCAL_SPECS_DIR/convert_val.yaml
!tao efficientdet_tf2 dataset_convert -e $SPECS_DIR/convert_val.yaml

2023-07-07 20:21:55,071 [INFO] root: Registry: ['nvcr.io']
2023-07-07 20:21:55,107 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
[1688732518.763149] [ywh-pc:48   :f]        vfs_fuse.c:424  UCX  WARN  failed to connect to vfs socket '�': Invalid argument
2023-07-07 12:21:59,112 [INFO] matplotlib.font_manager: generated new fontManager
'convert_val.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
Log file already exists at /workspace/tao-experiments/efficientdet_tf2/experiment_dir_unpruned/status.json
Starting efficientdet data conversion.
INFO:tensorflow:writing to output path: /workspace/tao-experiments/data/val
writing to output path: /workspace/tao-experiments/data/val
INFO:tensorflow:Building bounding box i

Note that the dataset conversion scripts provided in `specs` are intended for the standard COCO dataset. If your data doesn't have `caption` groundtruth or test set, you can modify `download_and_preprocess_coco.sh` and `create_coco_tf_record.py` by commenting out corresponding variables.

In [51]:
# verify
!ls -l $LOCAL_DATA_DIR

total 20535248
drwxr-xr-x 6 vscode vscode     4096 Jul  7 20:08 raw-data
-rw-r--r-- 1 vscode vscode 75626235 Jul  7 20:21 train-00000-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 75947730 Jul  7 20:21 train-00001-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 76695454 Jul  7 20:21 train-00002-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 77001478 Jul  7 20:21 train-00003-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 75855927 Jul  7 20:21 train-00004-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 74046270 Jul  7 20:21 train-00005-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 76400461 Jul  7 20:21 train-00006-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 78028465 Jul  7 20:21 train-00007-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 77848076 Jul  7 20:21 train-00008-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 78194325 Jul  7 20:21 train-00009-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 78137894 Jul  7 20:21 train-00010-of-00256.tfrecord
-rw-r--r-- 1 vscode vscode 78367787 Jul  7 20:21 train-000
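
With this many shards, a small check that none is missing can be more reliable than reading the listing. The sketch below assumes the `<prefix>-<index>-of-<total>.tfrecord` naming visible in the listing above; `missing_shards` is a hypothetical helper, not part of TAO.

```python
import os
import re

SHARD_PATTERN = re.compile(
    r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.tfrecord$"
)

def missing_shards(filenames, prefix):
    """Return the shard indices absent from filenames for the given prefix."""
    seen, total = set(), 0
    for name in filenames:
        match = SHARD_PATTERN.match(name)
        if match and match.group("prefix") == prefix:
            seen.add(int(match.group("index")))
            total = max(total, int(match.group("total")))
    return sorted(set(range(total)) - seen)

# Only look at the real data directory when it is defined and present.
data_dir = os.getenv("LOCAL_DATA_DIR", "")
if os.path.isdir(data_dir):
    print("Missing train shards:", missing_shards(os.listdir(data_dir), "train"))
```

Note that the helper returns an empty list when no shard with the given prefix is found at all, so an empty result is only meaningful if the directory is known to contain shards.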

### Download pretrained model from NGC

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP on the navigation bar.

In [55]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm -f $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_cat_linux.zip
--2023-07-07 20:26:01--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 18.66.112.117, 18.66.112.93, 18.66.112.75, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|18.66.112.117|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43271714 (41M) [application/zip]
Saving to: ‘/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit//ngccli/ngccli_cat_linux.zip’


2023-07-07 20:26:11 (5.70 MB/s) - ‘/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit//ngccli/ngccli_cat_linux.zip’ saved [43271714/43271714]

Archive:  /data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit//ngccli/ngccli_cat_linux.zip
   creating: /data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit//ngccli/ngc-cli/
   creating: /d

In [30]:
!ngc registry model list nvidia/tao/pretrained_efficientdet_tf2:efficientnet_b0*


+-------+-------+-------+-------+-------+-------+------+-------+-------+
| Versi | Accur | Epoch | Batch | GPU   | Memor | File | Statu | Creat |
| on    | acy   | s     | Size  | Model | y Foo | Size | s     | ed    |
|       |       |       |       |       | tprin |      |       | Date  |
|       |       |       |       |       | t     |      |       |       |
+-------+-------+-------+-------+-------+-------+------+-------+-------+
| effic |       |       |       |       |       | 45.6 | UPLOA | Dec   |
| ientn |       |       |       |       |       | MB   | D_COM | 08,   |
| et_b0 |       |       |       |       |       |      | PLETE | 2022  |
+-------+-------+-------+-------+-------+-------+------+-------+-------+


In [31]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_efficientdet_tf2:efficientnet_b0 --dest $LOCAL_EXPERIMENT_DIR

Getting files to download...

In [56]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_efficientdet_tf2_vefficientnet_b0

Check that model is downloaded into dir.
total 4980
-rw-rw-rw- 1 vscode vscode  506069 Jul  7 19:36 keras_metadata.pb
-rw-rw-rw- 1 vscode vscode 4584557 Jul  7 19:36 saved_model.pb
drwxrw-rw- 2 vscode vscode    4096 Jul  7 14:58 variables


## 3. Provide training specification <a class="anchor" id="head-3"></a>
* TFRecords for the train datasets
    * In order to use the newly generated TFRecords, update the `train_tfrecords` and `val_tfrecords` entries under `data` in the spec file at `$SPECS_DIR/spec_train.yaml`
* Note that the learning rate in the spec file is set for single-GPU training. If you have N GPUs, you should multiply the learning rate by N.
* `num_examples_per_epoch` should be set to the total number of images in the dataset divided by the number of GPUs. For example, if you train COCO with 8 GPUs, you can set `num_examples_per_epoch=14700`
* Pre-trained models
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate, etc.
* **Note that the sample spec is not meant to produce SOTA accuracy on COCO. To reproduce SOTA, you might want to use TAO to train an ImageNet model first and change the total_steps to 100K or above.**
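
The two multi-GPU rules above amount to simple arithmetic. As a minimal sketch (the helper name `scale_for_gpus` is made up for illustration), the learning rate is multiplied by the GPU count and the image count is divided by it. The example uses the 118,287 images of COCO train2017 and the learning rate of 0.2 from the sample spec.

```python
def scale_for_gpus(base_lr, num_images, num_gpus):
    """Scale a single-GPU learning rate and split the dataset across GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return base_lr * num_gpus, num_images // num_gpus

# COCO train2017 has 118,287 images; 0.2 is the learning rate in the sample spec.
lr, examples = scale_for_gpus(base_lr=0.2, num_images=118287, num_gpus=8)
print(lr, examples)
```

For 8 GPUs the integer division gives 14785 examples per epoch, which is in line with the rounded value of 14700 suggested above.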

In [57]:
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_train.yaml
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!cat $LOCAL_SPECS_DIR/spec_train.yaml

data:
  loader:
    prefetch_size: 4
    shuffle_file: False
    shuffle_buffer: 10000
    cycle_length: 32
    block_length: 16
  max_instances_per_image: 100
  skip_crowd_during_training: True
  image_size: '512x512'
  num_classes: 91
  train_tfrecords:
    - '/workspace/tao-experiments/data/train-*'
  val_tfrecords:
    - '/workspace/tao-experiments/data/val-*'
  val_json_file: '/workspace/tao-experiments/data/raw-data/annotations/instances_val2017.json'
train:
  optimizer:
    name: 'sgd'
    momentum: 0.9
  lr_schedule:
    name: 'cosine'
    warmup_epoch: 5
    warmup_init: 0.0001
    learning_rate: 0.2
  amp: True
  checkpoint: "/workspace/tao-experiments/efficientdet_tf2/pretrained_efficientdet_tf2_vefficientnet_b0"
  num_examples_per_epoch: 100
  moving_average_decay: 0.999
  batch_size: 20
  checkpoint_interval: 5
  l2_weight_decay: 0.00004
  l1_weight_decay: 0.0
  clip_gradients_norm: 10.0
  image_preview: True
  qat: False
  random_seed: 42
  pruned_model_path: ''
  num_epo

## 4. Train an EfficientDet model <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models
* Evaluation uses COCO metrics. For more info, please refer to: https://cocodataset.org/#detection-eval
* WARNING: training will take several hours or one day to complete

In [58]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_unpruned|g" $LOCAL_SPECS_DIR/spec_train.yaml
!sed -i "s|ENC_KEY|$KEY|g" $LOCAL_SPECS_DIR/spec_train.yaml
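
One thing to keep in mind with the `sed -i` calls above: they edit the spec file in place, so once a placeholder such as `RESULTSDIR` has been replaced, running the cell again with a different value finds nothing left to replace. The sketch below shows the same substitution in Python and reports placeholders that were not found. `fill_placeholders` is a hypothetical helper, and the two-line spec text is invented for the example rather than taken from the real spec file.

```python
def fill_placeholders(text, values):
    """Replace each placeholder token in text and report tokens that were absent."""
    not_found = []
    for token, value in values.items():
        if token in text:
            text = text.replace(token, value)
        else:
            not_found.append(token)
    return text, not_found

# Invented spec fragment; the real spec file has more fields.
spec = "results_dir: 'RESULTSDIR'\nkey: 'ENC_KEY'\n"
filled, not_found = fill_placeholders(spec, {
    "RESULTSDIR": "/workspace/tao-experiments/efficientdet_tf2/experiment_dir_unpruned",
    "ENC_KEY": "nvidia_tlt",
    "DATADIR": "/workspace/tao-experiments/data",
})
print(filled)
print("Placeholders not found:", not_found)
```

Here `DATADIR` is reported as not found, which is the situation you would be in after the placeholder has already been substituted by an earlier run.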

In [59]:
print("For multi-GPU, change --gpus based on your machine.")
!echo tao efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml --gpus $NUM_GPUS
!tao efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml --gpus $NUM_GPUS

For multi-GPU, change --gpus based on your machine.
2023-07-07 20:36:46,084 [INFO] root: Registry: ['nvcr.io']
2023-07-07 20:36:46,118 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
[1688733409.372456] [ywh-pc:48   :f]        vfs_fuse.c:424  UCX  WARN  failed to connect to vfs socket '�': Invalid argument
2023-07-07 12:36:49,718 [INFO] matplotlib.font_manager: generated new fontManager
[1688733412.357751] [ywh-pc:358  :f]        vfs_fuse.c:424  UCX  WARN  failed to connect to vfs socket '�': Invalid argument
'spec_train.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
INFO:tensorflow:Mixed precision compatibility check (mixed_float16): OK
Your GPU will likely run quickly with dtype policy mixed_float16 as it ha

In [61]:
print("To resume training from a checkpoint, simply run the same training script. It will pick up from where it's left.")
!tao efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml --gpus $NUM_GPUS

To resume training from a checkpoint, simply run the same training script. It will pick up from where it's left.
2023-07-07 20:37:26,874 [INFO] root: Registry: ['nvcr.io']
2023-07-07 20:37:26,909 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf2.9.1
[1688733449.692628] [ywh-pc:48   :f]        vfs_fuse.c:424  UCX  WARN  failed to connect to vfs socket '�': Invalid argument
2023-07-07 12:37:29,982 [INFO] matplotlib.font_manager: generated new fontManager
[1688733452.495055] [ywh-pc:359  :f]        vfs_fuse.c:424  UCX  WARN  failed to connect to vfs socket '�': Invalid argument
'spec_train.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
INFO:tensorflow:Mixed precision compatibility check (mixed_float16): OK
Your GPU wil

In [41]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/

Model for each epoch:
---------------------
total 0


**Tips:** TAO commands use [Hydra](https://hydra.cc) to parse the spec file, so we can use Hydra's override [syntax](https://hydra.cc/docs/advanced/override_grammar/basic/) to change parameters without modifying the spec file.
For example, if we want to train our model longer, we can run:
```
!tao efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml --gpus $NUM_GPUS train.num_epochs=100
```
To check all the existing parameters, we can add `--info` to the command:
```
!tao efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml --info
```

The syntax applies to all TAO commands, including dataset_convert, train, evaluate, prune, inference and export.
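
When several overrides are involved, it can be convenient to assemble the command line programmatically. The following is a minimal sketch that only reproduces the command shape shown above (`tao efficientdet_tf2 <action> -e <spec> [--gpus N] key=value ...`); `tao_command` is a hypothetical helper, not part of the TAO launcher, and it builds a string without running anything.

```python
import shlex

def tao_command(action, spec_path, overrides=None, gpus=None, task="efficientdet_tf2"):
    """Build a TAO launcher command line with optional key=value overrides."""
    args = ["tao", task, action, "-e", spec_path]
    if gpus is not None:
        args += ["--gpus", str(gpus)]
    for key, value in (overrides or {}).items():
        args.append(f"{key}={value}")
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(args)

print(tao_command(
    "train",
    "/workspace/tao-experiments/efficientdet_tf2/specs/spec_train.yaml",
    overrides={"train.num_epochs": 100},
    gpus=1,
))
```

The resulting string can be pasted into a notebook cell after a leading `!`.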

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [62]:
# get the last checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'experiment_dir_unpruned', 'weights')):
    if f.startswith('efficientdet-d'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

FileNotFoundError: [Errno 2] No such file or directory: '/data/Git_Repository/Projects_AI/roadai/TAO_Toolkit_Getting_Started/notebooks/tao_launcher_starter_kit/efficientdet_tf2/experiment_dir_unpruned/weights'
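
The `FileNotFoundError` above appears when training has not yet written a `weights` directory. A more defensive variant of the lookup is sketched below. It assumes that checkpoint file names start with `efficientdet-d` and end with a number (for instance an epoch index) before the extension; the helper name `find_last_checkpoint` is made up, and comparing the trailing number as an integer avoids relying on zero-padding.

```python
import os
import re

def find_last_checkpoint(weights_dir, prefix="efficientdet-d"):
    """Return the checkpoint with the highest trailing number, or None if none exists."""
    if not os.path.isdir(weights_dir):
        return None
    best_name, best_index = None, -1
    for name in os.listdir(weights_dir):
        # Collect the digit groups in the file name without its extension.
        numbers = re.findall(r"\d+", os.path.splitext(name)[0])
        if name.startswith(prefix) and numbers and int(numbers[-1]) > best_index:
            best_name, best_index = name, int(numbers[-1])
    return best_name

weights_dir = os.path.join(
    os.getenv("LOCAL_EXPERIMENT_DIR", os.getcwd()), "experiment_dir_unpruned", "weights"
)
last_checkpoint = find_last_checkpoint(weights_dir)
if last_checkpoint:
    print(f"Last checkpoint: {last_checkpoint}")
else:
    print(f"No checkpoints found in {weights_dir}; check that training completed.")
```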

In [None]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_train.yaml

In [None]:
!tao efficientdet_tf2 evaluate -e $SPECS_DIR/spec_train.yaml

## 6. Prune <a class="anchor" id="head-6"></a>

- Specify pre-trained model
- Equalization criterion
- Threshold for pruning.
- A key to save and load the model
- Output directory to store the model

Usually, you just need to adjust `-pth` (threshold) to trade off accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The threshold value depends on the dataset and the model; 0.4 in the block below is just a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.
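
To build some intuition for how the threshold trades size for accuracy, the toy sketch below keeps only the channels whose L2 norm, normalized by the largest norm in the layer, exceeds the threshold. This illustrates norm-based channel pruning in general and is not TAO's implementation; the function name and the example weights are invented.

```python
import math

def channels_to_keep(channel_weights, threshold):
    """Return indices of channels whose normalized L2 norm is above threshold."""
    norms = [math.sqrt(sum(w * w for w in weights)) for weights in channel_weights]
    largest = max(norms, default=0.0)
    if largest == 0:
        return []
    return [i for i, norm in enumerate(norms) if norm / largest > threshold]

# Four made-up channels with three weights each.
layer = [[0.9, -1.2, 0.4], [0.05, 0.02, -0.01], [0.5, 0.6, -0.3], [1.5, 0.1, 0.2]]
for pth in (0.01, 0.4, 0.7):
    print(pth, channels_to_keep(layer, pth))
```

Raising the threshold from 0.01 to 0.7 shrinks the kept set from all four channels to two, which mirrors the smaller-but-less-accurate behavior described above.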

In [None]:
# Create an output directory to save the pruned model.
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned
!sed -i "s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/experiment_dir_pruned/model_pruned.tlt|g" $LOCAL_SPECS_DIR/spec_train.yaml

In [None]:
!tao efficientdet_tf2 prune -e $SPECS_DIR/spec_train.yaml

In [None]:
!ls -l $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned

**Note** that you should retrain the pruned model first, as it cannot be directly used for evaluation or inference. 

## 7. Retrain pruned models <a class="anchor" id="head-7"></a>

- Model needs to be re-trained to bring back accuracy after pruning
- Specify re-training specification
- WARNING: training will take several hours or one day to complete

In [None]:
!cat $LOCAL_SPECS_DIR/spec_retrain.yaml

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain
!sed -i "s|ENC_KEY|$KEY|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_retrain|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!sed -i "s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/experiment_dir_pruned/model_pruned.tlt|g" $LOCAL_SPECS_DIR/spec_retrain.yaml

In [None]:
!tao efficientdet_tf2 train -e $SPECS_DIR/spec_retrain.yaml --gpus $NUM_GPUS

## Quantization-aware training (QAT) with the pruned model

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain_qat
!sed -i "s|ENC_KEY|$KEY|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_retrain_qat|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!sed -i "s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/experiment_dir_pruned/model_pruned.tlt|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml

In [None]:
!tao efficientdet_tf2 train -e $SPECS_DIR/spec_retrain_qat.yaml --gpus $NUM_GPUS

## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>

In [None]:
# get the last step of saved checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'experiment_dir_retrain', 'weights')):
    if f.startswith('efficientdet-d'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

In [None]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_retrain.yaml

In [None]:
!tao efficientdet_tf2 evaluate -e $SPECS_DIR/spec_retrain.yaml

## 9. Visualize inferences <a class="anchor" id="head-9"></a>
In this section, we run the `inference` tool to generate inferences on the trained models and visualize the results. The `inference` tool produces annotated image outputs.

In [None]:
# Copy some test images
!mkdir -p $LOCAL_DATA_DIR/test_samples
!cp $LOCAL_DATA_DIR/raw-data/test2017/0000000000* $LOCAL_DATA_DIR/test_samples

In [None]:
# Running inference for detection on n images
!tao efficientdet_tf2 inference -e $SPECS_DIR/spec_retrain.yaml

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr two-dimensional even for a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Sort the file names so the grid order is reproducible between runs.
    a = [os.path.join(output_path, image) for image in sorted(os.listdir(output_path))
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'experiment_dir_retrain/annotated_images' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 2 # number of columns in the visualizer grid.
IMAGES = 4 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 10. Deploy! <a class="anchor" id="head-10"></a>

In [None]:
# tao <task> export will fail if .etlt already exists. So we clear the export folder before tao <task> export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export

# Generate .etlt file using tao container
!sed -i "s|EXPORTDIR|$USER_EXPERIMENT_DIR/export|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!tao efficientdet_tf2 export -e $SPECS_DIR/spec_retrain.yaml

In [None]:
# Check if etlt model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/export

Using the `tao-deploy` container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference.

`tao-deploy` produces optimized TensorRT engines for the platform it runs on. Therefore, to get maximum performance, please run the `tao-deploy` command, which instantiates a deploy container, with the exported `.etlt` file on your target device. The `tao-deploy` container only works on x86 with discrete NVIDIA GPUs.

For Jetson devices, please download the tao-converter for Jetson and refer to [here](https://docs.nvidia.com/tao/tao-toolkit/text/tensorrt.html#installing-the-tao-converter) for more details.

If you choose to integrate your model into DeepStream directly, you may do so by simply copying the exported `.etlt` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to this newly exported model. Usually this file is called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.

In [None]:
# Convert to TensorRT engine (FP32).
!tao-deploy efficientdet_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain.yaml

In [None]:
# Convert to TensorRT engine (INT8).
!sed -i "s|fp32|int8|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!tao-deploy efficientdet_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain.yaml

In [None]:
print('Exported models:')
print('------------')
!ls -lth $LOCAL_EXPERIMENT_DIR/export

In [None]:
# get the last QAT checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'experiment_dir_retrain_qat', 'weights')):
    if f.startswith('efficientdet-d'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

In [None]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$USER_EXPERIMENT_DIR/experiment_dir_retrain_qat/weights/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/export_qat
# Export in QAT mode. 
!sed -i "s|EXPORTDIR|$USER_EXPERIMENT_DIR/export_qat|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!tao efficientdet_tf2 export -e $SPECS_DIR/spec_retrain_qat.yaml

In [None]:
# Convert QAT to TRT engine
!tao-deploy efficientdet_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain_qat.yaml

In [None]:
# Check if etlt model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/export_qat

## 11. Verify the deployed model <a class="anchor" id="head-11"></a>

Verify the converted engine by visualizing TensorRT inferences.

In [None]:
# Set engine as model_path
!sed -i "s|$USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/$LAST_CHECKPOINT|$USER_EXPERIMENT_DIR/export/efficientdet-d0.fp32.engine|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
# Running inference for detection on a dir of images
!tao-deploy efficientdet_tf2 inference -e $SPECS_DIR/spec_retrain.yaml \
                                       inference.output_dir=$USER_EXPERIMENT_DIR/export

In [None]:
!ls -l $LOCAL_EXPERIMENT_DIR/export/images_annotated