<a href="https://colab.research.google.com/github/NobuoTsukamoto/tensorrt-examples/blob/main/cpp/efficientdet/Export_EfficientDetLite_TensorRT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Converting EfficientDet-Lite AutoML Models to ONNX for TensorRT

This notebook shows how to convert an EfficientDet-Lite checkpoint from the AutoML repository into an ONNX model that can run on TensorRT.

References
- [EfficientDet Object Detection in TensorRT](https://github.com/NVIDIA/TensorRT/tree/main/samples/python/efficientdet)
- [EfficientDet](https://github.com/google/automl/tree/master/efficientdet)

# Export Saved Model

## Clone the [google/automl](https://github.com/google/automl) repository and install its dependencies.


In [1]:
%%bash

cd /content
git clone https://github.com/google/automl
cd automl
git checkout c7392f2bab3165244d1c565b66409fa11fa82367
cd efficientdet
pip3 install -r requirements.txt

Collecting git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI (from -r requirements.txt (line 13))
  Cloning https://github.com/cocodataset/cocoapi.git to /tmp/pip-req-build-82ov693r
  Resolved https://github.com/cocodataset/cocoapi.git to commit 8c9bcc3cf640524c4c20a9c40e89cb6a2f2fa0e9
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting Pillow>=9.5.0 (from -r requirements.txt (line 5))
  Downloading Pillow-10.1.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.6/3.6 MB 10.4 MB/s eta 0:00:00
Collecting numpy>=1.19.4 (from -r requirements.txt (line 4))
  Downloading numpy-1.26.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 48.8 MB/s eta 0:00:00
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.20.3 (from tensorflow>=2.10.0->-r requirements.txt 

fatal: destination path 'automl' already exists and is not an empty directory.
Previous HEAD position was 38ecb93 Fix inference bug (#1060)
HEAD is now at c7392f2 p.add_(torch.sign(update), alpha=-group['lr']) has been made more efficient by using the torch.sign function's inplace mode. This will prevent the need to create a new tensor for the updated parameter, which can save a significant amount of time for large models. (#1193)
  Running command git clone --filter=blob:none --quiet https://github.com/cocodataset/cocoapi.git /tmp/pip-req-build-82ov693r
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
cupy-cuda11x 11.0.0 requires numpy<1.26

In [2]:
import os
import yaml

# Use .get() so the cell does not raise KeyError when PYTHONPATH is unset.
os.environ['PYTHONPATH'] = '/content/automl/efficientdet:' + os.environ.get('PYTHONPATH', '')
print(os.environ['PYTHONPATH'])

/content/automl/efficientdet:/env/python
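
Changing `os.environ['PYTHONPATH']` affects processes started afterwards (such as the `!python` commands later in this notebook), not the `sys.path` of the kernel that is already running. The following is a minimal sketch of a helper that prepends a directory without duplicating it on re-runs; the name `prepend_pythonpath` and the optional `env` argument are illustrative and not part of the notebook.

```python
import os


def prepend_pythonpath(path, env=None):
    """Prepend `path` to PYTHONPATH in `env` (defaults to os.environ).

    Hypothetical helper: tolerates an unset variable and skips the
    insertion when `path` is already listed.
    """
    env = os.environ if env is None else env
    current = env.get("PYTHONPATH", "")
    parts = [p for p in current.split(os.pathsep) if p]
    if path not in parts:
        parts.insert(0, path)
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env["PYTHONPATH"]
```

Calling `prepend_pythonpath('/content/automl/efficientdet')` twice leaves a single entry, which keeps the variable tidy when the cell is executed repeatedly.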


## Download an EfficientDet-Lite checkpoint and export a SavedModel.

### Download checkpoint

Select the checkpoint you want to export.

In [3]:
#@title Select EfficientDet-lite model.

checkpoints = 'efficientdet-lite1' #@param ["efficientdet-lite0", "efficientdet-lite1", "efficientdet-lite2", "efficientdet-lite3", "efficientdet-lite3x", "efficientdet-lite4"] {allow-input: false}


In [4]:
file_name = checkpoints + ".tgz"
path = "https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco/" + file_name

!wget $path
!tar xf $file_name

--2023-11-10 23:45:46--  https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco/efficientdet-lite1.tgz
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.6.207, 172.217.214.207, 172.253.114.207, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.6.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 31143920 (30M) [application/octet-stream]
Saving to: ‘efficientdet-lite1.tgz.1’


2023-11-10 23:45:47 (128 MB/s) - ‘efficientdet-lite1.tgz.1’ saved [31143920/31143920]



In [5]:
size = {
    "efficientdet-lite0":"320x320",
    "efficientdet-lite1":"384x384",
    "efficientdet-lite2":"448x448",
    "efficientdet-lite3":"512x512",
    "efficientdet-lite3x":"640x640",
    "efficientdet-lite4":"640x640",
}
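
The same resolution is needed twice in this notebook: as `384x384` for the `image_size` hparam and as `384,384` for the `--input_size` argument of `create_onnx.py`. As a sketch, both forms could be derived from one string instead of maintaining two tables; `parse_image_size` and `to_input_size_arg` are hypothetical helper names, not functions from AutoML or TensorRT.

```python
def parse_image_size(size):
    """Parse a size string such as '384x384' or '384,384' into two ints."""
    sep = "x" if "x" in size else ","
    first, second = (int(value) for value in size.split(sep))
    return first, second


def to_input_size_arg(size):
    """Convert a size string to the comma-separated form, e.g. '384,384'."""
    first, second = parse_image_size(size)
    return f"{first},{second}"
```

For example, `to_input_size_arg(size[checkpoints])` would yield the comma-separated value for the selected model.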

### Set NMS configs

In [6]:
obj = {
    'image_size': size[checkpoints],
    'nms_configs': {
        'method': 'hard',
        'iou_thresh': 0.35,
        'score_thresh': 0.0,
        'sigma': 0.0,
        'pyfunc': False,
        'max_nms_inputs': 0,
        'max_output_size': 100,
    },
}

with open('saved_model.yaml', 'w') as file:
    yaml.dump(obj, file)

In [7]:
!cat saved_model.yaml

image_size: 384x384
nms_configs:
  iou_thresh: 0.35
  max_nms_inputs: 0
  max_output_size: 100
  method: hard
  pyfunc: false
  score_thresh: 0.0
  sigma: 0.0
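
Before exporting, it can help to sanity-check the NMS settings, since a typo in the YAML only surfaces later during export. The sketch below checks a plain dictionary shaped like `nms_configs` above. The function name `check_nms_configs` is hypothetical, and the accepted ranges and the set of methods (`hard`, `gaussian`) are assumptions based on how these fields are commonly used, not a specification taken from AutoML.

```python
def check_nms_configs(cfg):
    """Return a list of human-readable problems found in an NMS config dict."""
    problems = []
    # Assumption: only 'hard' and 'gaussian' NMS are expected here.
    if cfg.get("method") not in ("hard", "gaussian"):
        problems.append(f"unexpected method: {cfg.get('method')!r}")
    # Assumption: both thresholds are scores or IoU values in [0, 1].
    for key in ("iou_thresh", "score_thresh"):
        value = cfg.get(key)
        if not isinstance(value, (int, float)) or not 0.0 <= value <= 1.0:
            problems.append(f"{key} should be a number in [0, 1], got {value!r}")
    # Assumption: at least one detection should be kept.
    max_out = cfg.get("max_output_size")
    if not isinstance(max_out, int) or max_out <= 0:
        problems.append(f"max_output_size should be a positive int, got {max_out!r}")
    return problems
```

An empty list means no problem was detected; with the values used in this notebook, `check_nms_configs(obj['nms_configs'])` returns `[]`.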


### Export Saved Model

In [8]:
model_dir = os.path.join("/content", checkpoints)
saved_model_dir = os.path.join("/content", "saved_model_" + checkpoints)

In [9]:
# Export the checkpoint as a TensorFlow SavedModel
!python /content/automl/efficientdet/tf2/inspector.py \
    --mode=export \
    --model_name=$checkpoints \
    --model_dir=$model_dir \
    --saved_model_dir=$saved_model_dir \
    --hparams=/content/saved_model.yaml

2023-11-10 23:45:48.711726: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-10 23:45:48.711784: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-10 23:45:48.711826: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-10 23:45:48.719534: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-10 23:45:52.304144: I tensorflow/compiler/
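
Because the export log is long, a quick file-system check can confirm that the export produced a SavedModel before moving on. A TensorFlow SavedModel directory contains a `saved_model.pb` file and a `variables/` subdirectory; the sketch below only tests for those two entries and does not load the model. The helper name `looks_like_saved_model` is hypothetical.

```python
import os


def looks_like_saved_model(path):
    """Return True if `path` has the basic layout of a TensorFlow SavedModel."""
    has_graph = os.path.isfile(os.path.join(path, "saved_model.pb"))
    has_variables = os.path.isdir(os.path.join(path, "variables"))
    return has_graph and has_variables
```

In this notebook the check would be called on the directory passed as `--saved_model_dir`.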

# Export ONNX

## Clone the [NVIDIA/TensorRT](https://github.com/NVIDIA/TensorRT) repository and install its dependencies.

In [10]:
%%bash

cd /content
git clone https://github.com/NVIDIA/TensorRT
cd /content/TensorRT/samples/python/efficientdet

pip3 install -r requirements.txt
pip3 install onnx-graphsurgeon --index-url https://pypi.ngc.nvidia.com

Ignoring onnx: markers 'python_version < "3.8"' don't match your environment
Ignoring onnxruntime: markers 'python_version < "3.8"' don't match your environment
Ignoring pywin32: markers 'platform_system == "Windows"' don't match your environment
Ignoring numpy: markers 'python_version < "3.8" and platform_system == "Windows"' don't match your environment
Ignoring numpy: markers 'python_version < "3.8" and platform_system != "Windows"' don't match your environment
Collecting numpy==1.23.2 (from -r requirements.txt (line 14))
  Using cached numpy-1.23.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.0 MB)
Collecting protobuf<=3.20.1,>=3.12.2 (from onnx==1.12.0->-r requirements.txt (line 3))
  Using cached protobuf-3.20.1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
Installing collected packages: protobuf, numpy
  Attempting uninstall: protobuf
    Found existing installation: protobuf 4.25.0
    Uninstalling protobuf-4.25.0:
      Successfully unin

fatal: destination path 'TensorRT' already exists and is not an empty directory.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lida 0.0.10 requires fastapi, which is not installed.
lida 0.0.10 requires kaleido, which is not installed.
lida 0.0.10 requires python-multipart, which is not installed.
lida 0.0.10 requires uvicorn, which is not installed.
google-api-core 2.11.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0.dev0,>=3.19.5, but you have protobuf 3.20.1 which is incompatible.
google-cloud-bigquery 3.12.0 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5, but you have protobuf 3.20.1 which is incompatible.
google-cloud-bigquery-connection 1.12.1 requires protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.

In [11]:
%cd /content/TensorRT/samples/python/efficientdet

/content/TensorRT/samples/python/efficientdet


## Export ONNX Model

In [12]:
input_shape = {
    "efficientdet-lite0":"320,320",
    "efficientdet-lite1":"384,384",
    "efficientdet-lite2":"448,448",
    "efficientdet-lite3":"512,512",
    "efficientdet-lite3x":"640,640",
    "efficientdet-lite4":"640,640",
}

In [13]:
# Avoid the names `input` and `output`: `input` shadows a Python built-in.
input_size = input_shape[checkpoints]
onnx_path = os.path.join("/", "content", "efficientdet_onnx", checkpoints + ".onnx")

In [14]:
!python3 create_onnx.py \
    --input_size $input_size \
    --saved_model $saved_model_dir \
    --onnx $onnx_path

2023-11-10 23:47:29.492033: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-10 23:47:29.492080: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-10 23:47:29.492118: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-10 23:47:29.503165: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-10 23:47:35.102508: I tensorflow/compiler/

Now download the ONNX model.
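
One way to fetch the file from a Colab session is `google.colab.files.download`, which triggers a browser download. The sketch below wraps that call so it also behaves sensibly outside Colab; the wrapper name `download_from_colab` is hypothetical, and the fallback message is an illustrative choice.

```python
import os


def download_from_colab(path):
    """Start a browser download of `path` when running inside Colab.

    Returns True if the download was started, False when the
    google.colab package is not available (for example, on a local machine).
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        print(f"Not running in Colab; the model is at {path}")
        return False
    files.download(path)
    return True
```

Pass the path of the exported `.onnx` file under `/content/efficientdet_onnx`. Alternatively, the file can be downloaded from the Files panel in the Colab sidebar.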