
ONNX to TRT conversion of SSD MobileNet V2 FPNLite (TensorFlow Object Detection API 2) #1205

Closed
VeeranjaneyuluToka opened this issue Apr 20, 2021 · 30 comments
Labels
Export: tf2onnx (https://github.com/onnx/tensorflow-onnx), ONNX, triaged (Issue has been triaged by maintainers)

Comments

@VeeranjaneyuluToka

VeeranjaneyuluToka commented Apr 20, 2021

Description

Trying to convert from ONNX to TensorRT and getting "[1] 23371 segmentation fault". Verbose log excerpt:

adVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_4/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1312
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1307
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1402
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1354
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1324
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__757
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__758
[04/20/2021-15:38:38] [V] [TRT] onnx2trt_utils.cpp:236: Weight at index 0: 9223372036854775807 is out of range. Clamping to: 2147483647
[04/20/2021-15:38:38] [V] [TRT] onnx2trt_utils.cpp:236: Weight at index 1: 9223372036854775807 is out of range. Clamping to: 2147483647
[04/20/2021-15:38:38] [V] [TRT] onnx2trt_utils.cpp:236: Weight at index 2: 9223372036854775807 is out of range. Clamping to: 2147483647
[04/20/2021-15:38:38] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/truediv_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/truediv:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: ConstantFolding/StatefulPartitionedCall/Postprocessor/Decode/truediv_2_recip:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/add_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/add:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__760
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__761
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__763
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__764
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__766
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__767
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__769
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__770
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__772
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__773
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/Pad__1118
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/sub:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/sub_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: ConstantFolding/StatefulPartitionedCall/Postprocessor/Decode/truediv_7_recip:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1323
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/Minimum_5/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1406
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/mul:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/Reshape_3:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/Concatenate/concat_5:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/mul_5/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/range_6/delta:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/Select/e:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const__1218
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: largest_int_val__1219
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/zeros_10:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/zeros:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/sub_17/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/sub_3/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1301
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1350
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:103: Parsing node: __inference_map_while_cond_8144_19055_map/while/Less_1 [Less]
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: const_fold_opt__1415
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/add/y:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:125: __inference_map_while_cond_8144_19055_map/while/Less_1 [Less] inputs: [const_fold_opt__1415 -> ()], [StatefulPartitionedCall/add/y:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:141: Registering layer: __inference_map_while_cond_8144_19055_map/while/Less_1 for ONNX node: __inference_map_while_cond_8144_19055_map/while/Less_1
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: __inference_map_while_cond_8144_19055_map/while/Less_1:0 for ONNX tensor: __inference_map_while_cond_8144_19055_map/while/Less_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:179: __inference_map_while_cond_8144_19055_map/while/Less_1 [Less] outputs: [__inference_map_while_cond_8144_19055_map/while/Less_1:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:103: Parsing node: __inference_map_while_cond_8144_19055_map/while/LogicalAnd [And]
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: __inference_map_while_cond_8144_19055_map/while/Less_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: __inference_map_while_cond_8144_19055_map/while/Less_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:125: __inference_map_while_cond_8144_19055_map/while/LogicalAnd [And] inputs: [__inference_map_while_cond_8144_19055_map/while/Less_1:0 -> ()], [__inference_map_while_cond_8144_19055_map/while/Less_1:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:141: Registering layer: __inference_map_while_cond_8144_19055_map/while/LogicalAnd for ONNX node: __inference_map_while_cond_8144_19055_map/while/LogicalAnd
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: __inference_map_while_cond_8144_19055_map/while/LogicalAnd:0 for ONNX tensor: __inference_map_while_cond_8144_19055_map/while/LogicalAnd:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:179: __inference_map_while_cond_8144_19055_map/while/LogicalAnd [And] outputs: [__inference_map_while_cond_8144_19055_map/while/LogicalAnd:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/map/while_loop [Loop]
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/map/while/maximum_iterations:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: __inference_map_while_cond_8144_19055_map/while/LogicalAnd:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/range_5/start:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/map/while_loop [Loop] inputs: [StatefulPartitionedCall/map/while/maximum_iterations:0 -> ()], [__inference_map_while_cond_8144_19055_map/while/LogicalAnd:0 -> ()], [StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/range_5/start:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: map_while_map_while_loop_counter:0 for ONNX tensor: map_while_map_while_loop_counter:0
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: map_while_placeholder:0 for ONNX tensor: map_while_placeholder:0
[1] 23371 segmentation fault (core dumped) ./trtexec --verbose

Environment

TensorRT Version: 7.1.3.0
NVIDIA GPU: Jetson AGX Xavier (16 GB)
NVIDIA Driver Version: JetPack 4.4.1
CUDA Version: 10.2
CUDNN Version: 8
Operating System: Ubuntu 18.04
Python Version (if applicable): 3.6.9
Tensorflow Version (if applicable): 2.4.0
PyTorch Version (if applicable):
Baremetal or Container (if so, version):

Relevant Files

Steps To Reproduce

./trtexec --onnx=/data/Veeru/models/regression/120421-ed1-onnx/ssd-mobnetv2_fpnlite_model_no_nms.onnx --saveEngine=/data/Veeru/models/regression/120421-ed1-onnx/ssd-mobnetv2_fpnlite_model_no_nms.trt --verbose

See also: #795

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Could you try reducing the topK value in your NMS to <= 4096?

See #795 (comment)
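
For reference, that attribute can be edited offline with ONNX GraphSurgeon. A minimal sketch, assuming the NMS was inserted as a BatchedNMS_TRT plugin node (as it is later in this thread); file names are placeholders:

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))
for node in graph.nodes:
    if node.op == "BatchedNMS_TRT":
        node.attrs["topK"] = 4096  # clamp to the suggested maximum
onnx.save(gs.export_onnx(graph), "model_topk4096.onnx")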

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented Apr 23, 2021

[Screenshot attached: 2021-04-23 15-46-37]
@pranavm-nvidia I have tried with <= 4096, but I am still getting a segmentation fault, as shown in the log above. I am clueless from here. I have explored a couple of things suggested by @qraleq, but they did not help much. It would be a great help if you could at least point me in some direction. Could you check my Netron output and let me know if anything is missing? I learned from @qraleq that input_tensor needs to be connected, but I have not been able to get that working.

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka How did you generate the ONNX model? Can you share the original model and the scripts you used to modify it?

Also, can you get a backtrace from gdb so we can see where it's segfaulting?
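
For reference, one typical way to capture that backtrace (paths illustrative):

gdb --args ./trtexec --onnx=model.onnx --verbose
(gdb) run
# ... reproduce the crash ...
(gdb) bt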

@VeeranjaneyuluToka (Author)

@pranavm-nvidia I have a SavedModel from TensorFlow Object Detection API 2, trained on my own dataset. I then used the command below to convert the SavedModel to an ONNX model:

python -m tf2onnx.convert --saved-model /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/saved_model --outputs 'Identity_6:0','Identity_7:0' --output /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/ssd-mobnetv2_fpnlite_model.onnx --opset 12 --verbose

I have sent an email with all the ONNX models and the script I used to modify them.

And I am using the command below to convert from ONNX to TRT:

./trtexec --onnx=/tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/ssd-mobnetv2_fpnlite_model_no_nms_latest.onnx --saveEngine=tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/ssd_mobilenet_v2_fpnlite.trt --verbose

I will check with gdb and let you know if I find anything related to the segfault.

@VeeranjaneyuluToka (Author)

@pranavm-nvidia, I have picked up the latest tf2onnx branch changes, in particular the fix for onnx/tensorflow-onnx#1443 that avoids TensorListStack ops, then converted from SavedModel to ONNX again, changed the input-tensor datatype to float, and tried to convert from ONNX to TRT, but now I am getting a different error. Log attached for your reference: log.txt. Could you please have a look at the log file? Any suggestions to fix the error would be a great help.

@ttyio added the Export: tf2onnx, ONNX, Plugin, and triaged labels Apr 26, 2021
@VeeranjaneyuluToka (Author)

@pranavm-nvidia Thanks for all your help and the prompt responses. I could successfully convert from ONNX to a TRT engine, but the results are quite different. I am wondering how I can debug this; any suggestions or references for debugging further would be a great help.

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka A good first step would be to try Polygraphy to figure out which layer is the issue. I'd start with the ONNX model without the NMS node:

polygraphy run </path/to/model.onnx> \
    --trt --trt-outputs mark all \
    --onnxrt --onnx-outputs mark all

If that passes, then the problem must be in the NMS.

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented Apr 30, 2021

polygraphy run /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/280421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx --trt --trt-outputs mark all --onnxrt --onnx-outputs mark all
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Runner: trt-runner-N0-04/30/21-16:20:17 | Activating and starting inference
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[I] Building engine with configuration: max_workspace_size=16777216 bytes (16.00 MB) | tf32=False, fp16=False, int8=False, strict_types=False | 1 profiles
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[I] Runner: trt-runner-N0-04/30/21-16:20:17 | Completed 1 iterations.
[I] Runner: onnxrt-runner-N0-04/30/21-16:20:17 | Activating and starting inference
[W] ONNX Checker exited with an error:
Field 'type' of value_info is required but missing.
2021-04-30 15:20:55.445659872 [W:onnxruntime:, graph.cc:84 MergeShapeInfo] Error merging shape info for output. 'StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/bn_Conv1/FusedBatchNormV3:0' source:{1,32,320,320} target:{0,32,320,320}. Falling back to lenient merge.
Traceback (most recent call last):
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/bin/polygraphy", line 47, in <module>
main()
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/bin/polygraphy", line 43, in main
sys.exit(args.subcommand(args))
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/tools/run/run.py", line 264, in __call__
exec(script)
File "<string>", line 32, in <module>
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/comparator/comparator.py", line 170, in run
run_results[runner.name] = execute_runner(runner, loader_cache)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/comparator/comparator.py", line 79, in execute_runner
with runner as active_runner:
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/base/runner.py", line 67, in __enter__
self.activate()
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/base/runner.py", line 93, in activate
self.activate_impl()
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/onnxrt/runner.py", line 40, in activate_impl
self.sess, _ = misc.try_call(self._sess)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/util/misc.py", line 274, in try_call
ret = func(*args, **kwargs)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/onnxrt/loader.py", line 41, in __call__
return onnxruntime.InferenceSession(model_bytes)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 280, in __init__
self._create_inference_session(providers, provider_options)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 309, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (StatefulPartitionedCall/Postprocessor/Decode/mul_1) Op (Mul) [ShapeInferenceError] Incompatible dimensions

This is how the log looks, @pranavm-nvidia. Any inputs on this?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Seems like the ONNX model isn't valid. It looks to me like it could be a bug in tf2onnx, in which case you may want to file an issue on the tf2onnx repo (https://github.com/onnx/tensorflow-onnx).

@VeeranjaneyuluToka (Author)

@pranavm-nvidia, I have one doubt: when I use the command below to remove the loop and attach an input shape, the model starts carrying shape information, and I can visualize it in Netron:

polygraphy surgeon extract /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2_fpnlite_model.onnx --inputs StatefulPartitionedCall/map/while_loop:2,1x3x640x640,float32 -o /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx

But if you look at the attached graph, it shows 1x3x640x640 at the first node, and after that it shows 0x32x320x320. Why is the first dimension 0?
[Screenshot attached: 2021-04-30 21-21-18]

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Sounds like it might be a bug with ONNX shape inference. Is it causing any issues? TRT does its own shape inference, so it should ignore the intermediate shapes from the model.
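
If you do want to clear out the stale intermediate shapes, a minimal sketch using the standard onnx API (file names are placeholders):

import onnx
from onnx import shape_inference

model = onnx.load("model.onnx")
del model.graph.value_info[:]                # drop possibly-stale intermediate shapes
model = shape_inference.infer_shapes(model)  # re-infer them from the graph inputs
onnx.save(model, "model_reinferred.onnx")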

@VeeranjaneyuluToka (Author)

@pranavm-nvidia, the TRT conversion goes fine. But when I run inference using the TRT engine, the results are very bad, so I am not sure where it is going wrong. Is there anything I can do to track down or debug the TRT engine?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka You can use Polygraphy to debug. I would start by checking if TRT results match ONNX-Runtime:

polygraphy run \
     /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx \
    --trt --onnxrt
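
If the outputs differ only slightly, the comparison tolerances can also be loosened, e.g. (flag names as in recent Polygraphy versions; check polygraphy run --help):

polygraphy run /path/to/model.onnx --trt --onnxrt --atol 1e-3 --rtol 1e-3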

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented May 4, 2021

@pranavm-nvidia, thanks for the quick reply! I have done something like this: I ran inference on the TF SavedModel and converted the SavedModel to ONNX with the command below:

python -m tf2onnx.convert --saved-model /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/saved_model --output /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/030521_convert/ssd-mobnetv2_fpnlite_model.onnx --opset 12

Then I ran inference using ONNX-Runtime; the results are exactly the same. But the ONNX model converted by the above command contains all of the preprocessing, the network, and the postprocessing (including NMS), and its input tensor datatype is uint8. So I need to modify the input tensor datatype to float32, remove the loop nodes, keep only the raw bounding boxes, and strip the NMS part from the saved model. I used the command below to get the raw bounding-box outputs:

python -m tf2onnx.convert --saved-model /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/saved_model --output /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/030521_convert/ssd-mobnetv2_fpnlite_model_no_NMS.onnx --opset 12 --inputs "input_tensor:0" --outputs "Identity_6:0","Identity_7:0"

I cannot run inference with ONNX-Runtime on the model created by the above command.

And then I used Polygraphy to remove the loop node, as you suggested:

polygraphy surgeon extract /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2_fpnlite_model.onnx --inputs StatefulPartitionedCall/map/while_loop:2,1x3x640x640,float32 -o /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx

And then I used ONNX GraphSurgeon to modify the input tensor datatype and to add the NMS plugin.
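
For completeness, the input dtype change can be done with ONNX GraphSurgeon along these lines (a sketch; downstream nodes may additionally need a Cast, and file names are placeholders):

import onnx
import onnx_graphsurgeon as gs
import numpy as np

graph = gs.import_onnx(onnx.load("model.onnx"))
graph.inputs[0].dtype = np.float32  # the exported model declared uint8 here
onnx.save(gs.export_onnx(graph), "model_fp32_input.onnx")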

By the way, the Polygraphy command you suggested above gives the error below:

Runner: trt-runner-N0-05/04/21-15:50:36 | Activating and starting inference
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[I] Building engine with configuration: max_workspace_size=16777216 bytes (16.00 MB) | tf32=False, fp16=False, int8=False, strict_types=False | 1 profiles
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[I] Runner: trt-runner-N0-05/04/21-15:50:36 | Completed 1 iterations.
[I] Runner: onnxrt-runner-N0-05/04/21-15:50:36 | Activating and starting inference
2021-05-04 14:50:55.194512363 [W:onnxruntime:, graph.cc:84 MergeShapeInfo] Error merging shape info for output. 'StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/bn_Conv1/FusedBatchNormV3:0' source:{1,32,320,320} target:{0,32,320,320}. Falling back to lenient merge.
Traceback (most recent call last):
File "/home/veeru/anaconda3/envs/tf_fish_2.4/bin/polygraphy", line 47, in <module>
main()
File "/home/veeru/anaconda3/envs/tf_fish_2.4/bin/polygraphy", line 43, in main
sys.exit(args.subcommand(args))
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/tools/run/run.py", line 264, in __call__
exec(script)
File "<string>", line 26, in <module>
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/comparator/comparator.py", line 170, in run
run_results[runner.name] = execute_runner(runner, loader_cache)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/comparator/comparator.py", line 79, in execute_runner
with runner as active_runner:
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/base/runner.py", line 67, in __enter__
self.activate()
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/base/runner.py", line 93, in activate
self.activate_impl()
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/onnxrt/runner.py", line 40, in activate_impl
self.sess, _ = misc.try_call(self._sess)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/util/misc.py", line 274, in try_call
ret = func(*args, **kwargs)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/onnxrt/loader.py", line 41, in __call__
return onnxruntime.InferenceSession(model_bytes)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 280, in __init__
self._create_inference_session(providers, provider_options)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 309, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (StatefulPartitionedCall/Postprocessor/Decode/mul_1) Op (Mul) [ShapeInferenceError] Incompatible dimensions

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Could you try saving layerwise outputs from the first model (the one that works with ONNX-RT) and then comparing?

1. Save layerwise outputs:
polygraphy run \
    /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/030521_convert/ssd-mobnetv2_fpnlite_model.onnx \
    --onnxrt --onnx-outputs mark all \
    --save-outputs golden.pkl
2. Then you can compare the TRT outputs against those:
polygraphy run \ 
    /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx \
    --trt --trt-outputs mark all \
    --load-outputs golden.pkl

@VeeranjaneyuluToka (Author)

It looks like there is no --save-outputs option in my Polygraphy for the first command?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Which version of Polygraphy are you using? Could you try installing from source?

@VeeranjaneyuluToka (Author)

@pranavm-nvidia After I built it manually with the commands below (the version is 0.21.1, I believe):

python3 setup.py bdist_wheel
python3 -m pip install polygraphy/dist/polygraphy-0.21.1-py2.py3-none-any.whl --user

I am getting the error below from this command:

polygraphy run /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/280421_convert/ssd-mobnetv2_fpnlite_model.onnx --onnxrt --onnx-outputs mark all --save-outputs /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/280421_convert/ssd-mobnetv2_fpnlite_model.pkl
Traceback (most recent call last):
File "/home/veeru/anaconda3/envs/tf_fish_2.4/bin/polygraphy", line 22, in <module>
from polygraphy.tools.util import args as args_util
ImportError: cannot import name 'args' from 'polygraphy.tools.util' (/home/veeru/.local/lib/python3.8/site-packages/polygraphy/tools/util/__init__.py)

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Did you install Polygraphy in your conda environment? It looks like there's a mismatch between the polygraphy binary and the Python module it's trying to load. Can you try using ~/.local/bin/polygraphy instead?
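
A quick sanity check for such a mismatch (illustrative):

which polygraphy
python3 -c "import polygraphy; print(polygraphy.__file__)"
~/.local/bin/polygraphy run model.onnx --onnxrt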

@VeeranjaneyuluToka (Author)

@pranavm-nvidia Thanks for the quick reply. If you remember, earlier I was getting a 1x51150x4 shape after the Squeeze operation (you can see it in the attached graph):
[Screenshot attached: 2021-04-27 20-39-56]

After I removed the Squeeze node, I was still getting only a 1x51150x4 shape, as shown in the attached graph below:
[Screenshot attached: 2021-04-27 21-23-12]

And then I was able to convert from ONNX to TRT, though the TRT results are quite different. Am I getting different results because of this?

Today I managed to get a 1x51150x1x4 shape, as shown in the graph below:
[Screenshot attached: 2021-05-05 14-39-00]

But I still think I need to get 1x51150x6x4, as I have 6 classes. Do you have any suggestions on this?

@pranavm-nvidia
Copy link
Collaborator

@VeeranjaneyuluToka You could insert a Tile node with repeats=[1, 1, 6, 1] (see https://github.com/onnx/onnx/blob/master/docs/Operators.md#Tile).

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented May 5, 2021

[Screenshot attached: 2021-05-05 15-27-32]
@pranavm-nvidia You mentioned that TRT does not consider the ONNX shape info; could that cause the accuracy difference? And would you mind sharing a sample of using this Tile operator? The docs give an example, but I am thinking my output Identity_6 should be repeated with the repeats you mentioned?

I modified it using onnxconverter_common.onnx2py and got the attached graph; is it correct?

@pranavm-nvidia (Collaborator)

TRT does not consider the ONNX shape info; could that cause the accuracy difference?

No, it shouldn't cause any difference in the output.

To insert the Tile, you could do something like:

import onnx
import onnx_graphsurgeon as gs
import numpy as np

graph = gs.import_onnx(onnx.load("/path/to/model.onnx"))
tmap = graph.tensors()

# Find boxes input and NMS node 
boxes = tmap["Identity_6:0"]
nms = boxes.outputs[0]

# Make Tile layer and connect to NMS
tile_out = graph.layer(op="Tile",
                       inputs=[boxes, np.array([1, 1, 6, 1], dtype=np.int64)],  # ONNX requires int64 repeats
                       outputs=["tile_out"])[0]
nms.inputs[0] = tile_out

# Re-export graph
onnx.save(gs.export_onnx(graph), "/path/to/new_model.onnx")

@VeeranjaneyuluToka (Author)

@pranavm-nvidia Thanks for sharing the sample code snippet. Here is the modified graph; would you mind having a look and letting me know whether this is how it should be?
[Screenshot attached: 2021-05-05 15-41-46]

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Yup, looks right. Could you double check that the repeats input of the Tile is [1, 1, 6, 1]? Other than that, should be good to go.

@VeeranjaneyuluToka (Author)

@pranavm-nvidia

# Make Tile layer and connect to NMS
tile_out = graph.layer(op="Tile",
                       inputs=[boxes, np.array([1, 1, 6, 1])],
                       outputs=["tile_out"])[0]

This is what I used, as you suggested above, and I think the repeats are the shape you mentioned. But the ONNX-to-TRT conversion fails with it:
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_4_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_4_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_medium_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_5_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_5_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_medium_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_6_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_6_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_medium_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_7_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_7_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_8_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_8_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_9_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_9_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_10_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_10_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_11_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_11_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_12_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_12_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_13_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_13_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_14_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_14_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_15_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_15_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_16_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_16_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/Conv_1/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/out_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/projection_1/BiasAdd + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/add_1 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_depthwise_conv/separable_conv2d + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [F] [TRT] Assertion failed: inputDims[0].d[1] == numLocClasses
batchedNMSPlugin.cpp:297
Aborting...

Let me know if you want to have a look at the complete log.

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Actually, you probably don't want to tile in this case (though if you do, you can set the shareLocation attribute to False). If shareLocation is True, then your boxes input can be 1x51150x1x4 and the boxes will be shared for all classes. The accuracy issue is probably not due to this.
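
If you did keep the Tile, the attribute could be flipped with the same ONNX GraphSurgeon pattern as in the topK sketch earlier (the node op name follows the batchedNMSPlugin assertion above; file names are placeholders):

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))
for node in graph.nodes:
    if node.op == "BatchedNMS_TRT":
        node.attrs["shareLocation"] = False  # boxes are then expected per-class, e.g. 1x51150x6x4
onnx.save(gs.export_onnx(graph), "model_per_class_boxes.onnx")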

@VeeranjaneyuluToka (Author)

@pranavm-nvidia OK, I will take the Tile out. Do you have any suggestions for investigating my accuracy issue further?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Comparing layer-wise outputs would be a good next step; see #1205 (comment).

This could be pretty tricky to debug, though, given the number of modifications made to the ONNX model.

@ttyio (Collaborator)

ttyio commented Jul 2, 2021

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

@ttyio closed this as completed Jul 2, 2021