- Automatically generates high-performance TensorRT plugins for unsupported operators, or to replace inefficient kernels.
- End-to-end command-line tool that requires no CUDA programming knowledge. Users only need to provide the ONNX model and specify the node names or node types for which TensorRT plugins should be auto-generated.
- The performance of auto-generated TensorRT plugins in real cases:
- LLVM >= 9.0.1 (LLVM==9.0.1 recommended)
- GCC >= 7.3.0 (GCC==7.4.0 recommended)
- TensorRT
- numpy pycuda onnx onnxruntime onnx_graphsurgeon xgboost jinja2 ctypes tornado cloudpickle psutil
NOTE: these required packages are listed in requirements.txt
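Before building, it can help to confirm that the required Python packages are visible to the interpreter. The following is a minimal sketch, not part of TPAT: the helper name `find_missing` is hypothetical, and the module names are assumed to match the package names listed above.

```python
import importlib.util

# Import names taken from the package list above; adjust them if your
# environment exposes a package under a different module name.
REQUIRED_MODULES = [
    "numpy", "pycuda", "onnx", "onnxruntime", "onnx_graphsurgeon",
    "xgboost", "jinja2", "ctypes", "tornado", "cloudpickle", "psutil",
]


def find_missing(modules):
    """Return the module names that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

The check only locates modules and does not import them, so it does not verify versions or GPU availability.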
- tensorflow-gpu==1.15
- tf2onnx
- torch
- pytest
NOTE: these optional packages are only needed to run the Example and UnitTest
git clone -b master https://github.com/nvidia/TensorRT TPAT
cd TPAT
git submodule update --init --recursive
mkdir build && cp cmake/config.cmake build
#Edit build/config.cmake to customize the compilation options
set(USE_LLVM /usr/local/llvm/bin/llvm-config)
set(USE_CUDA ON)
#gcc compiler is required to support C++14
cd build && cmake ..
make -j
#TVM Python package
export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
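To double-check the two exports above, a short script along these lines could be used. It is only a sketch: the function name `tvm_env_report` and the result keys are hypothetical, and it assumes the TVM Python package lives in `$TVM_HOME/python` as the exports suggest.

```python
import importlib.util
import os


def tvm_env_report(environ=None):
    """Summarise whether the TVM-related environment looks consistent."""
    environ = os.environ if environ is None else environ
    tvm_home = environ.get("TVM_HOME")
    # Directory that the export above adds to PYTHONPATH.
    expected = os.path.normpath(os.path.join(tvm_home, "python")) if tvm_home else None
    entries = [
        os.path.normpath(p)
        for p in environ.get("PYTHONPATH", "").split(os.pathsep)
        if p
    ]
    return {
        "tvm_home_set": tvm_home is not None,
        "python_dir_on_pythonpath": expected is not None and expected in entries,
        # Checks the running interpreter, independent of the mapping passed in.
        "tvm_importable": importlib.util.find_spec("tvm") is not None,
    }


if __name__ == "__main__":
    for key, ok in tvm_env_report().items():
        print(f"{key}: {'ok' if ok else 'NOT ok'}")
```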
Modify python/trt_plugin/Makefile according to your environment setup.
CUDA_PATH: local CUDA installation path
TRT_LIB_PATH: local TensorRT installation path
TPAT can be used either through a Python function or from the command line.
onnx2plugin(
input_model_path,
output_model_path,
node_names=None,
node_types=None,
plugin_name_dict=None
)
- input_model_path[required] : input ONNX model containing the nodes that require TensorRT plugins
- output_model_path[required] : output ONNX model in which the corresponding node types are replaced by plugin names. The output ONNX model can be converted to TensorRT directly using the ONNX parser and the built plugin dynamic library.
- node_names : list of node names for autogen
- node_types : list of node types for autogen
- plugin_name_dict : dict of {node_name: plugin_name} for autogen
NOTE: at least one of node_names, node_types, and plugin_name_dict must be provided.
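The "at least one" rule in the NOTE above can be checked before calling `onnx2plugin`. The sketch below is not part of TPAT: `check_autogen_args` is a hypothetical helper, and it assumes that an empty list or dict counts as "not provided".

```python
def check_autogen_args(node_names=None, node_types=None, plugin_name_dict=None):
    """Hypothetical pre-flight check: return only the selectors that were given."""
    selectors = {
        "node_names": node_names,
        "node_types": node_types,
        "plugin_name_dict": plugin_name_dict,
    }
    # Treat None and empty containers alike: neither selects any node.
    provided = {key: value for key, value in selectors.items() if value}
    if not provided:
        raise ValueError(
            "Provide at least one of node_names, node_types, or plugin_name_dict."
        )
    return provided


kwargs = check_autogen_args(node_types=["op_type1", "op_type2"])
print(kwargs)

# The real call would then look like this; the import path of onnx2plugin
# depends on your checkout and is not shown here:
# onnx2plugin("input.onnx", "output.onnx", **kwargs)
```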
python3 Onnx2Plugin.py -i input.onnx -o output.onnx -n op_name1 op_name2
python3 Onnx2Plugin.py -i input.onnx -o output.onnx -t op_type1 op_type2
python3 Onnx2Plugin.py -i input.onnx -o output.onnx -p '{"op_name1": "plugin_name1", "op_name2": "plugin_name2"}'
- -i[required]: input_model_path
- -o[required]: output_model_path
- -n: node_names
- -t: node_types
- -p: plugin_name_dict
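When the tool is driven from another script, the command lines above can be assembled programmatically. This is a sketch under assumptions: `build_onnx2plugin_cmd` is a hypothetical helper, the script is assumed to be reachable as `Onnx2Plugin.py` from the working directory, and the `-p` value is serialised as JSON to match the example shown above.

```python
import json
import shlex


def build_onnx2plugin_cmd(input_model, output_model, node_names=None,
                          node_types=None, plugin_name_dict=None,
                          script="Onnx2Plugin.py"):
    """Build an argv list using the flags documented above (-i, -o, -n, -t, -p)."""
    cmd = ["python3", script, "-i", input_model, "-o", output_model]
    if node_names:
        cmd += ["-n", *node_names]
    if node_types:
        cmd += ["-t", *node_types]
    if plugin_name_dict:
        # Passed as a single JSON string, as in the -p example.
        cmd += ["-p", json.dumps(plugin_name_dict)]
    return cmd


cmd = build_onnx2plugin_cmd("input.onnx", "output.onnx",
                            node_names=["op_name1", "op_name2"])
print(shlex.join(cmd))
# prints: python3 Onnx2Plugin.py -i input.onnx -o output.onnx -n op_name1 op_name2

# To actually run it:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues with the JSON argument.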
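- trt_plugin/src contains {plugin_name}.cu and {plugin_name}.h (when plugin names are assigned through plugin_name_dict)
- trt_plugin/lib contains {plugin_name}.so (when plugin names are assigned through plugin_name_dict)
- trt_plugin/src contains tpat_{node_name}.cu and tpat_{node_name}.h (when nodes are selected by node names or node types)
- trt_plugin/lib contains tpat_{node_name}.so (when nodes are selected by node names or node types)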
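The two naming patterns listed above can be turned into a small checker for the generated files. This is a sketch, not TPAT code: `expected_artifacts` is a hypothetical helper, it takes the node or plugin name literally (any sanitisation TPAT may apply to node names is not modelled), and it assumes the `trt_plugin` directory is relative to the current working directory.

```python
import os


def expected_artifacts(name, auto_named=True, root="trt_plugin"):
    """List the files expected for one plugin.

    auto_named=True  -> tpat_{node_name} pattern
    auto_named=False -> {plugin_name} pattern
    """
    stem = f"tpat_{name}" if auto_named else name
    return [
        f"{root}/src/{stem}.cu",
        f"{root}/src/{stem}.h",
        f"{root}/lib/{stem}.so",
    ]


if __name__ == "__main__":
    for path in expected_artifacts("op_name1"):
        status = "found" if os.path.exists(path) else "missing"
        print(f"{path}: {status}")
```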
- Example : example_tensorflow.py
- UnitTest : test_tpat.py
- Support multiple nodes for autogen
- Support boolean inputs/outputs
- Able to reuse plugins
- Dynamic shapes are not supported
- Operators with int8/float16/double inputs/outputs are not supported
- Support ONNX subgraph for autogen
- Support direct conversion from TensorFlow and PyTorch