# Object Detection using TAO EfficientDet (TF2)

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080"> 

## Sample prediction of EfficientDet
<img align="center" src="https://github.com/vpraveen-nv/model_card_images/blob/main/cv/notebook/common/sample.jpg?raw=true" width="960">

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train an EfficientDet-D0 model on the COCO dataset
* Evaluate the trained model
* Run pruning and finetuning with the trained model
* Run inference with the trained model and visualize the result
* Export the trained model to a .onnx file for deployment to DeepStream
* Run inference on the exported .onnx model to verify deployment using TensorRT

At the end of this notebook, you will have generated a trained and optimized `EfficientDet` model
which you may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps)
or [DeepStream](https://developer.nvidia.com/deepstream-sdk).

### Table of Contents
This notebook shows an example use case for object detection with EfficientDet using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO Launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained model](#head-6)
7. [Retrain pruned models](#head-7)
8. [Evaluate retrained model](#head-8)
9. [Visualize inferences](#head-9)
10. [Deploy](#head-10)
11. [Verify the deployed model](#head-11)

## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` to the path of the user's workspace. The dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collaterals generated by the TAO experiment will be written to `$LOCAL_PROJECT_DIR/efficientdet_tf2`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Please make sure to remove any stray artifacts/files that may have been generated by previous experiments from the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths mentioned below. Leftover checkpoint files, for example, may interfere with creating a training graph for a new experiment.*
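
One quick way to spot leftovers before a fresh run is to list checkpoint-like files under the experiment directory. The sketch below is a minimal helper and is not part of TAO: the function name `find_stale_artifacts` and the suffix list are assumptions chosen for illustration, so adjust them to whatever your previous runs produced.

```python
import os

# Hypothetical suffixes for leftover artifacts; adjust to match your own runs.
STALE_SUFFIXES = (".tlt", ".ckpt", ".onnx", ".engine")


def find_stale_artifacts(root, suffixes=STALE_SUFFIXES):
    """Return sorted paths under `root` whose names end with one of `suffixes`."""
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(suffixes)):
                stale.append(os.path.join(dirpath, name))
    return sorted(stale)


# Inspect the host-side experiment directory, if it is already defined.
experiment_dir = os.environ.get("LOCAL_EXPERIMENT_DIR")
if experiment_dir and os.path.isdir(experiment_dir):
    for path in find_stale_artifacts(experiment_dir):
        print("stale:", path)
```

The helper only reports files; deleting them is left as a deliberate manual step.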

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/efficientdet_tf2
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/efficientdet_tf2

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/efficientdet_tf2
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
%env LOCAL_PROJECT_DIR=/workspace/tao-experiments/

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "efficientdet_tf2"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tao-experiments/efficientdet_tf2/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that data and results can be shared between the host and the container. For more information, please refer to the [launcher instance](https://docs.nvidia.com/tao/tao-toolkit-archive/tlt-30/text/tlt_launcher.html) in the user guide.

When running this cell on AWS, update the drive_map entry with the dictionary defined below, so that you don't have permission issues when writing data into folders created by the TAO docker.

```python
drive_map = {
    "Mounts": [
            # Mapping the data directory
            {
                "source": os.environ["LOCAL_PROJECT_DIR"],
                "destination": "/workspace/tao-experiments"
            },
            # Mapping the specs directory.
            {
                "source": os.environ["LOCAL_SPECS_DIR"],
                "destination": os.environ["SPECS_DIR"]
            },
        ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid()),
        "network": "host"
    }
}
```

In [None]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid()),
        "network": "host"
    }
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json
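
Before launching any `tao` command, it can help to confirm that every `source` path in the mounts file exists on the host. The sketch below assumes the file layout written by the cell above (a top-level `"Mounts"` list of `source`/`destination` pairs); the helper name `missing_mount_sources` is hypothetical and not part of the TAO launcher.

```python
import json
import os


def missing_mount_sources(mounts_path):
    """Return the `source` paths in a mounts file that do not exist on the host."""
    with open(os.path.expanduser(mounts_path)) as mfile:
        config = json.load(mfile)
    return [
        mount["source"]
        for mount in config.get("Mounts", [])
        if not os.path.isdir(mount["source"])
    ]


# Only run the check if the mounts file has already been written.
mounts_file = os.path.expanduser("~/.tao_mounts.json")
if os.path.exists(mounts_file):
    missing = missing_mount_sources(mounts_file)
    print("Missing sources:", missing if missing else "none")
```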

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a wheel on PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with a Python version that satisfies the software requirements listed below. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual env via the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running:

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where `3.x` is a Python version that satisfies the requirements listed below.

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure the following software requirements are met:
* python >=3.7, <=3.10.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
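
The Python requirement from the list above can be checked from inside the notebook kernel. This is a minimal sketch: the bounds are copied from the list, and the helper name `python_version_supported` is an assumption made for illustration.

```python
import sys

# Bounds taken from the requirement list above: python >=3.7, <=3.10.x.
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 10)


def python_version_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair falls within the stated range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


print("Python", sys.version.split()[0], "supported:", python_version_supported())
```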

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

After setting up your virtual environment with the above requirements, install the TAO pip package.

In [None]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tao

In [None]:
# View the versions of the TAO launcher
!tao info

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the COCO dataset for the tutorial. The following script downloads the COCO dataset automatically and converts it to TFRecords.

In [None]:
# Create local dir
!mkdir -p $LOCAL_DATA_DIR
!mkdir -p $LOCAL_EXPERIMENT_DIR
# Download and preprocess data
!tao model efficientdet_tf2 run bash $SPECS_DIR/download_coco.sh $DATA_DOWNLOAD_DIR

In [None]:
# convert training data to TFRecords
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_train.yaml
!tao model efficientdet_tf2 dataset_convert -e $SPECS_DIR/spec_train.yaml \
    dataset_convert.results_dir=$DATA_DOWNLOAD_DIR

In [None]:
# convert validation data to TFRecords
!tao model efficientdet_tf2 dataset_convert -e $SPECS_DIR/spec_train.yaml \
    dataset_convert.image_dir="$DATA_DOWNLOAD_DIR/raw-data/val2017/"\
    dataset_convert.annotations_file="$DATA_DOWNLOAD_DIR/raw-data/annotations/instances_val2017.json"\
    dataset_convert.tag='val' \
    dataset_convert.num_shards=32

Note that the dataset conversion scripts provided in `specs` are intended for the standard COCO dataset. If your data doesn't have `caption` ground truth or a test set, you can modify `download_and_preprocess_coco.sh` and `create_coco_tf_record.py` by commenting out the corresponding variables.

In [None]:
# verify
!ls -l $LOCAL_DATA_DIR

### Download pretrained model from NGC

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP on the navigation bar.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
import os
import platform

if platform.machine() == "x86_64":
    os.environ["CLI"]="ngccli_linux.zip"
else:
    os.environ["CLI"]="ngccli_arm64.zip"


# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm -f $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tao/pretrained_efficientdet_tf2:efficientnet_b0*


In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_efficientdet_tf2:efficientnet_b0 --dest $LOCAL_EXPERIMENT_DIR

In [None]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_efficientdet_tf2_vefficientnet_b0

## 3. Provide training specification <a class="anchor" id="head-3"></a>
* Tfrecords for the train datasets
    * In order to use the newly generated TFRecords, update the dataset settings in the spec file at `$SPECS_DIR/spec_train.yaml`
* Note that the learning rate in the spec file is set for 1-GPU training. If you have N GPUs, you should multiply the learning rate by N.
* "num_examples_per_epoch" should be set to the total number of images in the dataset.
* Pre-trained models
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.
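
The learning-rate note above amounts to a simple linear scaling. The sketch below illustrates the arithmetic only: the helper name `scale_learning_rate` and the sample value `0.2` are assumptions made for illustration, so read the real single-GPU value from your spec file.

```python
def scale_learning_rate(single_gpu_lr, num_gpus):
    """Multiply the single-GPU learning rate by the number of GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return single_gpu_lr * num_gpus


# Hypothetical single-GPU value, for illustration only.
base_lr = 0.2
for gpus in (1, 2, 4):
    print(gpus, "GPU(s) ->", scale_learning_rate(base_lr, gpus))
```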

In [None]:
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_train.yaml
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!cat $LOCAL_SPECS_DIR/spec_train.yaml

## 4. Train an EfficientDet model <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models
* Evaluation uses COCO metrics. For more info, please refer to: https://cocodataset.org/#detection-eval
* WARNING: training can take from several hours to a full day to complete
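
COCO detection metrics are built on the intersection over union (IoU) between predicted and ground-truth boxes, with average precision averaged over IoU thresholds from 0.50 to 0.95. The sketch below shows only the IoU step for boxes in COCO's `[x, y, width, height]` format; the function name and the sample boxes are made up for illustration, and this is not the evaluation code that TAO runs.

```python
def box_iou_xywh(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Overlap along each axis, clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# The primary COCO metric averages AP over these ten IoU thresholds.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

prediction = [10, 10, 100, 100]    # hypothetical predicted box
ground_truth = [20, 20, 100, 100]  # hypothetical ground-truth box
iou = box_iou_xywh(prediction, ground_truth)
print(f"IoU = {iou:.3f}")
print("Matches at thresholds:", [t for t in COCO_IOU_THRESHOLDS if iou >= t])
```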

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_unpruned|g" $LOCAL_SPECS_DIR/spec_train.yaml

In [None]:
print("For multi-GPU, change num_gpus based on your machine.")
!tao model efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml num_gpus=$NUM_GPUS \
    results_dir="$USER_EXPERIMENT_DIR/experiment_dir_unpruned"

In [None]:
print("To resume training from a checkpoint, simply run the same training command. It will pick up from where it left off.")
# !tao model efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml num_gpus=$NUM_GPUS

In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/train

**Tip:** TAO commands use [Hydra](https://hydra.cc) to parse the spec file, so we can use Hydra [override syntax](https://hydra.cc/docs/advanced/override_grammar/basic/) to change parameters without modifying the spec file.
For example, if we want to train our model longer, we can run:
```
!tao model efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml num_gpus=$NUM_GPUS train.num_epochs=100
```
To check all the existing parameters, we can add `--info` to the command:
```
!tao model efficientdet_tf2 train -e $SPECS_DIR/spec_train.yaml --info
```

The syntax applies to all TAO commands, including dataset_convert, train, evaluate, prune, inference and export.
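
When several overrides are involved, it can be convenient to assemble the command from a dictionary. The sketch below is a hypothetical helper, not part of the TAO launcher: `build_tao_command` simply appends `key=value` pairs after the spec file, mirroring the commands shown above. The spec path is the container path set in `$SPECS_DIR` earlier in this notebook.

```python
def build_tao_command(task, spec_path, overrides=None):
    """Return the argument list for `tao model efficientdet_tf2 <task>` with overrides."""
    command = ["tao", "model", "efficientdet_tf2", task, "-e", spec_path]
    for key, value in (overrides or {}).items():
        # Each override uses the basic Hydra form: dotted.key=value
        command.append(f"{key}={value}")
    return command


spec = "/workspace/tao-experiments/efficientdet_tf2/specs/spec_train.yaml"
command = build_tao_command("train", spec, {"num_gpus": 1, "train.num_epochs": 100})
print(" ".join(command))
```

The resulting list can be pasted after `!` in a notebook cell or passed to `subprocess.run`.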

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [None]:
# get the last checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'experiment_dir_unpruned', 'train')):
    if f.startswith('efficientdet-d'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')
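
The string comparison in the cell above picks the right file as long as the epoch numbers in the checkpoint names are zero-padded to the same width. If that is not the case in your setup, a numeric comparison is safer. The sketch below assumes that each checkpoint name ends with an epoch number before its extension; that naming pattern, the sample file names, and the helper name `latest_checkpoint` are assumptions for illustration.

```python
import re


def latest_checkpoint(filenames, prefix="efficientdet-d"):
    """Return the name with the largest trailing number, or '' if none match."""
    best_name, best_epoch = "", -1
    for name in filenames:
        if not name.startswith(prefix):
            continue
        # Capture the last run of digits that precedes the file extension.
        match = re.search(r"(\d+)(?:\.\w+)?$", name)
        if match and int(match.group(1)) > best_epoch:
            best_name, best_epoch = name, int(match.group(1))
    return best_name


# Hypothetical file names: a plain string comparison would pick "_9" here.
example_files = ["efficientdet-d0_9.tlt", "efficientdet-d0_10.tlt", "events.out"]
print("Last checkpoint:", latest_checkpoint(example_files))
```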

In [None]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$USER_EXPERIMENT_DIR/experiment_dir_unpruned/train/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_train.yaml

In [None]:
!tao model efficientdet_tf2 evaluate -e $SPECS_DIR/spec_train.yaml

## 6. Prune <a class="anchor" id="head-6"></a>

- Specify pre-trained model
- Equalization criterion
- Threshold for pruning.
- Output directory to store the model

Usually, you only need to adjust the pruning threshold (`-pth`) to trade off accuracy against model size. A higher threshold gives you a smaller model (and thus higher inference speed) but worse accuracy. The right value depends on the dataset and the model; the 0.5 used in the block below is just a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.
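
To build intuition for this trade-off, the toy sketch below keeps only the channels whose norm, relative to the largest norm in the layer, reaches the threshold. It is a simplified illustration of magnitude-based channel pruning and not TAO's actual implementation; the function name and the sample norms are made up.

```python
def channels_kept(channel_norms, threshold):
    """Indices of channels whose norm, normalized by the layer maximum, is >= threshold."""
    largest = max(channel_norms)
    if largest == 0:
        return []
    return [i for i, norm in enumerate(channel_norms) if norm / largest >= threshold]


# Hypothetical per-channel norms for a single layer.
norms = [8.0, 1.0, 4.4, 6.0, 0.5, 3.0]
for threshold in (0.3, 0.5, 0.7):
    kept = channels_kept(norms, threshold)
    print(f"threshold={threshold}: keep {len(kept)} of {len(norms)} channels")
```

Raising the threshold removes more channels, which is why a higher value yields a smaller model at the cost of accuracy.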

In [None]:
# Notes
print("The pruned model by default will be saved under `$LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/prune/`")
print("To change the saving directory, please update `prune.results_dir`")

In [None]:
!tao model efficientdet_tf2 prune -e $SPECS_DIR/spec_train.yaml

In [None]:
!ls -l $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/prune

**Note** that you should retrain the pruned model first, as it cannot be directly used for evaluation or inference. 

## 7. Retrain pruned models <a class="anchor" id="head-7"></a>

- Model needs to be re-trained to bring back accuracy after pruning
- Specify re-training specification
- WARNING: training can take from several hours to a full day to complete

In [None]:
!cat $LOCAL_SPECS_DIR/spec_retrain.yaml

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_retrain|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!sed -i "s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/experiment_dir_unpruned/prune/model_th=0.5_eq=union.tlt|g" $LOCAL_SPECS_DIR/spec_retrain.yaml

In [None]:
!tao model efficientdet_tf2 train -e $SPECS_DIR/spec_retrain.yaml num_gpus=$NUM_GPUS

## Quantization aware training (QAT) with the pruned model

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain_qat
!sed -i "s|DATADIR|$DATA_DOWNLOAD_DIR|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!sed -i "s|RESULTSDIR|$USER_EXPERIMENT_DIR/experiment_dir_retrain_qat|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!sed -i "s|PRUNEDMODEL|$USER_EXPERIMENT_DIR/experiment_dir_unpruned/prune/model_th=0.5_eq=union.tlt|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml

In [None]:
!tao model efficientdet_tf2 train -e $SPECS_DIR/spec_retrain_qat.yaml num_gpus=$NUM_GPUS

## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>

In [None]:
# get the last step of saved checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'experiment_dir_retrain', 'train')):
    if f.startswith('efficientdet-d'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

In [None]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$USER_EXPERIMENT_DIR/experiment_dir_retrain/train/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_retrain.yaml

In [None]:
!tao model efficientdet_tf2 evaluate -e $SPECS_DIR/spec_retrain.yaml

## 9. Visualize inferences <a class="anchor" id="head-9"></a>
In this section, we run the `inference` tool to generate inferences with the trained model and visualize the results. The `inference` tool produces annotated image outputs.

In [None]:
# Copy some test images
!mkdir -p $LOCAL_DATA_DIR/test_samples
!cp $LOCAL_DATA_DIR/raw-data/test2017/0000000000* $LOCAL_DATA_DIR/test_samples

In [None]:
# Running inference for detection on n images
!tao model efficientdet_tf2 inference -e $SPECS_DIR/spec_retrain.yaml

In [None]:
# install deps
!pip3 install "matplotlib>=3.3.3, <4.0"

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path) 
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img) 

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'experiment_dir_retrain/inference' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 2 # number of columns in the visualizer grid.
IMAGES = 4 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 10. Deploy! <a class="anchor" id="head-10"></a>

In [None]:
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export

# Generate .onnx file using tao container
!sed -i "s|EXPORTDIR|$USER_EXPERIMENT_DIR/export|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!tao model efficientdet_tf2 export -e $SPECS_DIR/spec_retrain.yaml

In [None]:
# Check if onnx model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/export

Using the `tao deploy` container, you can generate a TensorRT engine and verify the correctness of the generated engine through evaluation and inference.

`tao deploy` produces TensorRT engines optimized for the platform it runs on. Therefore, to get maximum performance, please run the `tao deploy` command, which instantiates a deploy container, with the exported `.onnx` file on your target device. The `tao deploy` container only works on x86 with discrete NVIDIA GPUs.

For Jetson devices, please download the tao-converter for Jetson and refer to [here](https://docs.nvidia.com/tao/tao-toolkit-archive/tao-30-2108/text/tensorrt.html#installing-the-tao-converter) for more details.

If you choose to integrate your model into DeepStream directly, you may do so by copying the exported `.onnx` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to this newly exported model. Usually this file is called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.

In [None]:
# Convert to TensorRT engine (FP32).
!tao deploy efficientdet_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain.yaml

In [None]:
# Convert to TensorRT engine (INT8).
!sed -i "s|fp32|int8|g" $LOCAL_SPECS_DIR/spec_retrain.yaml
!tao deploy efficientdet_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain.yaml

In [None]:
print('Exported models:')
print('------------')
!ls -lth $LOCAL_EXPERIMENT_DIR/export

In [None]:
# get the last QAT checkpoints
last_checkpoint = ''
for f in os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"],'experiment_dir_retrain_qat', 'train')):
    if f.startswith('efficientdet-d'):
        last_checkpoint = last_checkpoint if last_checkpoint > f else f
print(f'Last checkpoint: {last_checkpoint}')

In [None]:
# Set LAST_CHECKPOINT in the spec file
%env LAST_CHECKPOINT={last_checkpoint}
!sed -i "s|EVALMODEL|$USER_EXPERIMENT_DIR/experiment_dir_retrain_qat/train/$LAST_CHECKPOINT|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/export_qat
# Export in QAT mode. 
!sed -i "s|EXPORTDIR|$USER_EXPERIMENT_DIR/export_qat|g" $LOCAL_SPECS_DIR/spec_retrain_qat.yaml
!tao model efficientdet_tf2 export -e $SPECS_DIR/spec_retrain_qat.yaml

In [None]:
# Convert QAT to TRT engine
!tao deploy efficientdet_tf2 gen_trt_engine -e $SPECS_DIR/spec_retrain_qat.yaml

In [None]:
# Check if onnx model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/export_qat

## 11. Verify the deployed model <a class="anchor" id="head-11"></a>

Verify the converted engine by visualizing TensorRT inferences.

In [None]:
# Running inference for detection on a dir of images
!tao deploy efficientdet_tf2 inference -e $SPECS_DIR/spec_retrain.yaml \
    inference.results_dir=$USER_EXPERIMENT_DIR/export

In [None]:
!ls -l $LOCAL_EXPERIMENT_DIR/export/