Description
I want to deploy my transformer-like detection model with TensorRT 8.6 (I can only use TensorRT 8.6 because of its flash attention support):
(i) First, I generated engines from an ONNX opset 16 export and evaluated them, with the results below:
| model | onnx-op16-fp32 | onnx-op16-fp16 | trt-op16-fp32 | trt-op16-fp16 | trt-op16-fp16-int8 |
| --- | --- | --- | --- | --- | --- |
| mAP | 43.8 | 43.8 | 42.3 | 23.4 | 23.7 |
trt-op16-fp32 drops slightly, but trt-op16-fp16 barely works!
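One possible contributor to an FP16 drop like the one in the table above is overflow in the variance step when LayerNorm is decomposed into primitive ops. This is an assumption, not a confirmed diagnosis for my model. The sketch below uses only the Python standard library to round every intermediate to half precision; the activation values and the helper names `to_fp16`, `variance_fp16`, and `variance_fp32` are hypothetical and do not come from TensorRT or from my network.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE half-precision value (inf on overflow)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the FP16 range; model that as inf.
        return float("inf") if x > 0 else float("-inf")


def variance_fp16(values):
    """Variance with every intermediate rounded to FP16.

    Mimics LayerNorm split into primitive ops (Sub, Pow, ReduceMean)
    where each op stores its result in half precision.
    """
    n = len(values)
    mean = to_fp16(sum(values) / n)
    squares = [to_fp16(to_fp16(v - mean) ** 2) for v in values]
    total = 0.0
    for s in squares:
        total = to_fp16(total + s)
    return to_fp16(total / n)


def variance_fp32(values):
    """Reference variance computed in Python floats (wider than FP16)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


# Hypothetical activations: 300 squared is 90000, above the FP16 maximum of 65504.
activations = [300.0, -300.0, 1.0, -1.0]
print("wide precision variance:", variance_fp32(activations))
print("FP16 variance:", variance_fp16(activations))
```

If something like this happens inside the engine, the normalized output becomes zero or NaN after the division, which would be consistent with a large accuracy loss, although a real engine may keep some of these intermediates in higher precision.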
(ii) Second, I tried ONNX opset 17, as the TensorRT 8.6 release notes recommend: "For networks containing normalization layers, particularly if deploying with mixed precision, target the latest ONNX opset that contains the corresponding function ops, for example: opset 17 for LayerNormalization or opset 18 GroupNormalization. Numerical accuracy using function ops is superior to corresponding implementation with primitive ops for normalization layers."
But I found that TensorRT 8.6 cannot parse LayerNormalization. Some evaluation results and the log are below:
| model | onnx-op17-fp32 | onnx-op17-fp16 |
| --- | --- | --- |
| mAP | 43.8 | 4.5 |
onnx-op17-fp16 barely works!
TensorRT log:
[02/07/2024-07:37:01] [I] [TRT] [MemUsageChange] Init CUDA: CPU +11, GPU +0, now: CPU 35, GPU 7970 (MiB)
[02/07/2024-07:37:01] [V] [TRT] Trying to load shared library libnvinfer_builder_resource.so.8.6.2
[02/07/2024-07:37:01] [V] [TRT] Loaded shared library libnvinfer_builder_resource.so.8.6.2
[02/07/2024-07:37:07] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1154, GPU +1310, now: CPU 1225, GPU 9325 (MiB)
[02/07/2024-07:37:07] [V] [TRT] CUDA lazy loading is enabled.
[02/07/2024-07:37:08] [I] [TRT] ----------------------------------------------------------------
[02/07/2024-07:37:08] [I] [TRT] Input filename: ./dinov2det/dinov2-small-rtdetr-966-546-op17-ep351-sim.onnx
[02/07/2024-07:37:08] [I] [TRT] ONNX IR version: 0.0.8
[02/07/2024-07:37:08] [I] [TRT] Opset version: 17
[02/07/2024-07:37:08] [I] [TRT] Producer name: pytorch
[02/07/2024-07:37:08] [I] [TRT] Producer version: 2.0.0
[02/07/2024-07:37:08] [I] [TRT] Domain:
[02/07/2024-07:37:08] [I] [TRT] Model version: 0
[02/07/2024-07:37:08] [I] [TRT] Doc string:
................
...............
[02/07/2024-07:39:32] [V] [TRT] /model/backbone/Add [Add] outputs: [/model/backbone/Add_output_0 -> (1, 2692, 384)[FLOAT]],
[02/07/2024-07:39:32] [V] [TRT] Parsing node: /model/backbone/blocks.0/norm1/LayerNormalization [LayerNormalization]
[02/07/2024-07:39:32] [V] [TRT] Searching for input: /model/backbone/Add_output_0
[02/07/2024-07:39:32] [V] [TRT] Searching for input: model.backbone.blocks.0.norm1.weight
[02/07/2024-07:39:32] [V] [TRT] Searching for input: model.backbone.blocks.0.norm1.bias
[02/07/2024-07:39:32] [V] [TRT] /model/backbone/blocks.0/norm1/LayerNormalization [LayerNormalization] inputs: [/model/backbone/Add_output_0 -> (1, 2692, 384)[FLOAT]], [model.backbone.blocks.0.norm1.weight -> (384)[FLOAT]], [model.backbone.blocks.0.norm1.bias -> (384)[FLOAT]],
[02/07/2024-07:39:32] [I] [TRT] No importer registered for op: LayerNormalization. Attempting to import as plugin.
[02/07/2024-07:39:32] [I] [TRT] Searching for plugin: LayerNormalization, plugin_version: 1, plugin_namespace:
[02/07/2024-07:39:32] [V] [TRT] Global registry did not find LayerNormalization creator. Will try parent registry if enabled.
[02/07/2024-07:39:32] [E] [TRT] 3: getPluginCreator could not find plugin: LayerNormalization version: 1
[02/07/2024-07:39:32] [E] [TRT] ModelImporter.cpp:757: While parsing node number 5 [LayerNormalization -> "/model/backbone/blocks.0/norm1/LayerNormalization_output_0"]:
[02/07/2024-07:39:32] [E] [TRT] ModelImporter.cpp:758: --- Begin node ---
[02/07/2024-07:39:32] [E] [TRT] ModelImporter.cpp:759: input: "/model/backbone/Add_output_0"
input: "model.backbone.blocks.0.norm1.weight"
input: "model.backbone.blocks.0.norm1.bias"
output: "/model/backbone/blocks.0/norm1/LayerNormalization_output_0"
name: "/model/backbone/blocks.0/norm1/LayerNormalization"
op_type: "LayerNormalization"
attribute {
name: "axis"
i: -1
type: INT
}
attribute {
name: "epsilon"
f: 1e-06
type: FLOAT
}
doc_string: "/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/functional.py(2515): layer_norm\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/normalization.py(190): forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1488): _slow_forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1501): _call_impl\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/../src/zoo/rtdetr/layers/block.py(91): attn_residual_func\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/../src/zoo/rtdetr/layers/block.py(112): forward\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/../src/zoo/rtdetr/layers/block.py(254): forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1488): _slow_forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1501): _call_impl\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/../src/zoo/rtdetr/vision_transformer.py(224): forward_features\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/../src/zoo/rtdetr/vision_transformer.py(315): forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1488): _slow_forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1501): _call_impl\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/../src/zoo/rtdetr/rtdetr.py(169): forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1488): _slow_forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1501): _call_impl\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/predict.py(126): forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1488): 
_slow_forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1501): _call_impl\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/jit/_trace.py(118): wrapper\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/jit/_trace.py(127): forward\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/nn/modules/module.py(1501): _call_impl\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/jit/_trace.py(1268): _get_trace_graph\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/onnx/utils.py(893): _trace_and_get_graph_from_model\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/onnx/utils.py(989): _create_jit_graph\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/onnx/utils.py(1113): _model_to_graph\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/onnx/utils.py(1548): _export\n/home/qxit02/.conda/envs/dinov2/lib/python3.9/site-packages/torch/onnx/utils.py(506): export\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/predict.py(195): main\n/home/qxit02/cyr/proj/10.transformer/wrr/rtdetr_pytorch/tools/predict.py(264): \n"
[02/07/2024-07:39:32] [E] [TRT] ModelImporter.cpp:760: --- End node ---
[02/07/2024-07:39:32] [E] [TRT] ModelImporter.cpp:762: ERROR: builtin_op_importers.cpp:5435 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
parse onnx file fail ...
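To make long verbose logs like the one above easier to triage, the following is a minimal sketch that pulls out the ops the ONNX parser could not import and the nodes it failed on. The regular expressions are written against the two message formats visible in this log only; the function name `summarize_parser_failures` is hypothetical, and other TensorRT versions may word these messages differently.

```python
import re

# Patterns copied from the message formats that appear in the log above.
UNSUPPORTED_OP = re.compile(r"No importer registered for op: (\w+)")
FAILED_NODE = re.compile(r'While parsing node number (\d+) \[(\w+) -> "([^"]+)"\]')


def summarize_parser_failures(log_text):
    """Return (unsupported op names, failed nodes) found in a parser log."""
    ops = sorted(set(UNSUPPORTED_OP.findall(log_text)))
    nodes = [(int(index), op, output)
             for index, op, output in FAILED_NODE.findall(log_text)]
    return ops, nodes


# Two lines taken from the log in this report.
sample_log = (
    "[02/07/2024-07:39:32] [I] [TRT] No importer registered for op: "
    "LayerNormalization. Attempting to import as plugin.\n"
    "[02/07/2024-07:39:32] [E] [TRT] ModelImporter.cpp:757: While parsing "
    'node number 5 [LayerNormalization -> '
    '"/model/backbone/blocks.0/norm1/LayerNormalization_output_0"]:\n'
)

ops, nodes = summarize_parser_failures(sample_log)
print("unsupported ops:", ops)
for index, op, output in nodes:
    print(f"node {index}: {op} -> {output}")
```

For this log it reports a single unsupported op, LayerNormalization, failing at node 5, which is the first LayerNorm in the backbone.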
Environment
Hardware: Orin NX 16G
TensorRT Version: 8.6
CUDA Version: 12.2
Docker: dustynv/l4t-pytorch:r36.2.0
Operating System: Ubuntu