TensorRT backend for ONNX

Parses ONNX models for execution with TensorRT.

ONNX Python backend usage

The TensorRT backend for ONNX can be used in Python as follows:

import onnx
import onnx_tensorrt.backend as backend
import numpy as np

model = onnx.load("/path/to/model.onnx")
engine = backend.prepare(model, device='CUDA:1')
input_data = np.random.random(size=(32, 3, 224, 224)).astype(np.float32)
output_data = engine.run(input_data)[0]
print(output_data)
print(output_data.shape)

Executable usage

ONNX models can be converted to serialized TensorRT engines using the onnx2trt executable:

onnx2trt my_model.onnx -o my_engine.trt

ONNX models can also be converted to human-readable text:

onnx2trt my_model.onnx -t my_model.onnx.txt

See more usage information by running:

onnx2trt -h
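
When conversion is part of a larger script, the commands above can be driven from Python. The following is a minimal sketch, not part of this repository: the helper names `onnx2trt_command` and `convert` are hypothetical, only the `-o` and `-t` flags shown above are used, and it assumes `onnx2trt` is installed on your PATH and signals failure with a non-zero exit status.

```python
import shutil
import subprocess


def onnx2trt_command(model_path, engine_path=None, text_path=None):
    """Build an onnx2trt argument list (hypothetical helper).

    Only the -o and -t flags documented above are used.
    """
    cmd = ["onnx2trt", model_path]
    if engine_path is not None:
        cmd += ["-o", engine_path]  # write a serialized TensorRT engine
    if text_path is not None:
        cmd += ["-t", text_path]  # write the model as human-readable text
    return cmd


def convert(model_path, engine_path):
    """Run the conversion (hypothetical helper).

    Assumes onnx2trt exits with a non-zero status on failure, in which
    case subprocess raises CalledProcessError.
    """
    if shutil.which("onnx2trt") is None:
        raise FileNotFoundError("onnx2trt not found on PATH; build and install it first")
    subprocess.run(onnx2trt_command(model_path, engine_path=engine_path), check=True)


# Show the command that would be run, without executing it.
print(onnx2trt_command("my_model.onnx", engine_path="my_engine.trt"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.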

C++ library usage

The model parser library, libnvonnxparser.so, has a C++ API declared in this header:

NvOnnxParser.h

TensorRT engines built using this parser must use the plugin factory provided in libnvonnxparser_runtime.so, which has a C++ API declared in this header:

NvOnnxParserRuntime.h

Installation

Dependencies

Download the code

Clone the code from GitHub.

git clone --recursive https://github.com/onnx/onnx-tensorrt.git

Executable and libraries

Suppose your TensorRT library is located at /opt/tensorrt. Build the onnx2trt executable and the libnvonnxparser* libraries using CMake. Note that by default onnx-tensorrt tells the CUDA compiler to generate code for the latest SM version. If your GPU has a lower SM version, specify which SM versions to build for with the optional -DGPU_ARCHS flag. For example, if you are running TensorRT on an older Pascal card such as a GTX 1080, specify -DGPU_ARCHS="61" to generate CUDA code specifically for that card.

See NVIDIA's list of CUDA GPUs to find the maximum compute capability your GPU supports.

mkdir build
cd build
cmake .. -DTENSORRT_ROOT=/opt/tensorrt
# or, to target a specific SM version (here 6.1):
cmake .. -DTENSORRT_ROOT=/opt/tensorrt -DGPU_ARCHS="61"
make -j8
sudo make install
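
The -DGPU_ARCHS value is the compute capability with the dot removed (6.1 becomes "61"). As a rough sketch of that mapping, the hypothetical helpers below turn a compute capability string into the flag and assemble the CMake configure command shown above. The function names are illustrative only, and the code covers just the single-architecture case from the example.

```python
def gpu_archs_flag(compute_capability):
    """Turn a compute capability such as "6.1" into -DGPU_ARCHS=61.

    Hypothetical helper; expects exactly one dot separating two integers.
    """
    parts = compute_capability.split(".")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a value like '6.1', got {compute_capability!r}")
    major, minor = parts
    return f"-DGPU_ARCHS={major}{minor}"


def cmake_command(tensorrt_root, compute_capability=None):
    """Assemble the configure command, to be run from inside the build directory."""
    cmd = ["cmake", "..", f"-DTENSORRT_ROOT={tensorrt_root}"]
    if compute_capability is not None:
        cmd.append(gpu_archs_flag(compute_capability))
    return cmd


# Print the configure command for a Pascal card with compute capability 6.1.
print(" ".join(cmake_command("/opt/tensorrt", "6.1")))
```

The shell quotes around "61" in the command above are consumed by the shell, so the argument list form carries the bare value.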

Python modules

Python bindings for the ONNX-TensorRT parser in TensorRT versions >= 5.0 are packaged in the shipped .whl files. No extra install is necessary.

For earlier versions of TensorRT, the Python wrappers are built using SWIG. Build the Python wrappers and modules by running:

python setup.py build
sudo python setup.py install

Docker image

Build the onnx_tensorrt Docker image by running:

cp /path/to/TensorRT-3.0.*.tar.gz .
docker build -t onnx_tensorrt .

Tests

After installation (or inside the Docker container), ONNX backend tests can be run as follows:

Real model tests only:

python onnx_backend_test.py OnnxBackendRealModelTest

All tests:

python onnx_backend_test.py

You can use the -v flag to make the output more verbose.
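
For scripted runs, for example in CI, the test invocations above can be assembled programmatically. This is only a sketch: `backend_test_command` is a hypothetical helper, and it assumes the script is launched from the directory containing onnx_backend_test.py with the same interpreter that has the backend installed.

```python
import sys


def backend_test_command(test_class=None, verbose=False):
    """Build the argument list for onnx_backend_test.py (hypothetical helper)."""
    cmd = [sys.executable, "onnx_backend_test.py"]
    if test_class is not None:
        cmd.append(test_class)  # e.g. "OnnxBackendRealModelTest"
    if verbose:
        cmd.append("-v")  # more verbose output
    return cmd


# Real model tests only, with verbose output.
print(backend_test_command("OnnxBackendRealModelTest", verbose=True))
```

The resulting list can be passed to `subprocess.run(..., check=True)` so that a failing test run stops the calling script.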

Pre-trained models

Pre-trained models in ONNX format can be found at the ONNX Model Zoo.
