
ONNX to TRT conversion of SSD MobileNet V2 FPNLite (TensorFlow Object Detection API 2) #1205

Closed
VeeranjaneyuluToka opened this issue Apr 20, 2021 · 30 comments
Labels
Export: tf2onnx (https://github.com/onnx/tensorflow-onnx), ONNX, triaged (Issue has been triaged by maintainers)

Comments

@VeeranjaneyuluToka

VeeranjaneyuluToka commented Apr 20, 2021

Description

Trying to convert from ONNX to TensorRT and getting "[1] 23371 segmentation fault". Verbose log excerpt:

adVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_4/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_batchnorm/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1312
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_1/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1307
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1402
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/BatchNorm/feature_0/FusedBatchNormV3/ReadVariableOp_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1354
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1324
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__757
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__758
[04/20/2021-15:38:38] [V] [TRT] onnx2trt_utils.cpp:236: Weight at index 0: 9223372036854775807 is out of range. Clamping to: 2147483647
[04/20/2021-15:38:38] [V] [TRT] onnx2trt_utils.cpp:236: Weight at index 1: 9223372036854775807 is out of range. Clamping to: 2147483647
[04/20/2021-15:38:38] [V] [TRT] onnx2trt_utils.cpp:236: Weight at index 2: 9223372036854775807 is out of range. Clamping to: 2147483647
[04/20/2021-15:38:38] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/truediv_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/truediv:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: ConstantFolding/StatefulPartitionedCall/Postprocessor/Decode/truediv_2_recip:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/add_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/add:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__760
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__761
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__763
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__764
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__766
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__767
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__769
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__770
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_starts__772
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_ends__773
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/Pad__1118
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/sub:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/Decode/get_center_coordinates_and_sizes/sub_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: ConstantFolding/StatefulPartitionedCall/Postprocessor/Decode/truediv_7_recip:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1323
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/Minimum_5/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1406
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/mul:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/Reshape_3:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/Concatenate/concat_5:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/mul_5/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/range_6/delta:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/Select/e:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const__1218
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: largest_int_val__1219
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/zeros_10:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/zeros:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/sub_17/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/PadOrClipBoxList/sub_3/x:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1301
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:90: Importing initializer: const_fold_opt__1350
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:103: Parsing node: __inference_map_while_cond_8144_19055_map/while/Less_1 [Less]
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: const_fold_opt__1415
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/add/y:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:125: __inference_map_while_cond_8144_19055_map/while/Less_1 [Less] inputs: [const_fold_opt__1415 -> ()], [StatefulPartitionedCall/add/y:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:141: Registering layer: __inference_map_while_cond_8144_19055_map/while/Less_1 for ONNX node: __inference_map_while_cond_8144_19055_map/while/Less_1
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: __inference_map_while_cond_8144_19055_map/while/Less_1:0 for ONNX tensor: __inference_map_while_cond_8144_19055_map/while/Less_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:179: __inference_map_while_cond_8144_19055_map/while/Less_1 [Less] outputs: [__inference_map_while_cond_8144_19055_map/while/Less_1:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:103: Parsing node: __inference_map_while_cond_8144_19055_map/while/LogicalAnd [And]
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: __inference_map_while_cond_8144_19055_map/while/Less_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: __inference_map_while_cond_8144_19055_map/while/Less_1:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:125: __inference_map_while_cond_8144_19055_map/while/LogicalAnd [And] inputs: [__inference_map_while_cond_8144_19055_map/while/Less_1:0 -> ()], [__inference_map_while_cond_8144_19055_map/while/Less_1:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:141: Registering layer: __inference_map_while_cond_8144_19055_map/while/LogicalAnd for ONNX node: __inference_map_while_cond_8144_19055_map/while/LogicalAnd
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: __inference_map_while_cond_8144_19055_map/while/LogicalAnd:0 for ONNX tensor: __inference_map_while_cond_8144_19055_map/while/LogicalAnd:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:179: __inference_map_while_cond_8144_19055_map/while/LogicalAnd [And] outputs: [__inference_map_while_cond_8144_19055_map/while/LogicalAnd:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:103: Parsing node: StatefulPartitionedCall/map/while_loop [Loop]
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/map/while/maximum_iterations:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: __inference_map_while_cond_8144_19055_map/while/LogicalAnd:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:119: Searching for input: StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/range_5/start:0
[04/20/2021-15:38:38] [V] [TRT] ModelImporter.cpp:125: StatefulPartitionedCall/map/while_loop [Loop] inputs: [StatefulPartitionedCall/map/while/maximum_iterations:0 -> ()], [__inference_map_while_cond_8144_19055_map/while/LogicalAnd:0 -> ()], [StatefulPartitionedCall/Postprocessor/BatchMultiClassNonMaxSuppression/MultiClassNonMaxSuppression/range_5/start:0 -> ()],
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: map_while_map_while_loop_counter:0 for ONNX tensor: map_while_map_while_loop_counter:0
[04/20/2021-15:38:38] [V] [TRT] ImporterContext.hpp:116: Registering tensor: map_while_placeholder:0 for ONNX tensor: map_while_placeholder:0
[1] 23371 segmentation fault (core dumped) ./trtexec --verbose

Environment

TensorRT Version: 7.1.3.0
NVIDIA GPU: Jetson AGX Xavier (16 GB)
NVIDIA Driver Version: JetPack 4.4.1
CUDA Version: 10.2
CUDNN Version: 8
Operating System: Ubuntu 18.04
Python Version (if applicable): 3.6.9
Tensorflow Version (if applicable): 2.4.0
PyTorch Version (if applicable):
Baremetal or Container (if so, version):

Relevant Files

Steps To Reproduce

./trtexec --onnx=/data/Veeru/models/regression/120421-ed1-onnx/ssd-mobnetv2_fpnlite_model_no_nms.onnx --saveEngine=/data/Veeru/models/regression/120421-ed1-onnx/ssd-mobnetv2_fpnlite_model_no_nms.trt --verbose

See also: #795

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Could you try reducing the topK value in your NMS to <= 4096?

See #795 (comment)
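
For reference, that attribute can be edited offline with ONNX GraphSurgeon. A minimal sketch, assuming the NMS was inserted as a BatchedNMS_TRT plugin node (as it is later in this thread); file names are placeholders:

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))
for node in graph.nodes:
    if node.op == "BatchedNMS_TRT":
        node.attrs["topK"] = 4096  # clamp to the suggested maximum
onnx.save(gs.export_onnx(graph), "model_topk4096.onnx")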

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented Apr 23, 2021

[Screenshot attached: 2021-04-23 15-46-37]
@pranavm-nvidia I have tried with <= 4096, but I am still getting a segmentation fault, as shown in the log above. I am clueless from here. I have explored a couple of things suggested by @qraleq, but they did not help much. It would be a great help if you could at least point me in some direction. Could you check my Netron output and let me know if anything is missing? I learned from @qraleq that input_tensor needs to be connected, but I have not been able to get that working.

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka How did you generate the ONNX model? Can you share the original model and the scripts you used to modify it?

Also, can you get a backtrace from gdb so we can see where it's segfaulting?
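
For reference, one typical way to capture that backtrace (paths illustrative):

gdb --args ./trtexec --onnx=model.onnx --verbose
(gdb) run
# ... reproduce the crash ...
(gdb) bt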

@VeeranjaneyuluToka (Author)

@pranavm-nvidia I have a SavedModel from TensorFlow Object Detection API 2, trained on my own dataset. I then used the command below to convert the SavedModel to an ONNX model:

python -m tf2onnx.convert --saved-model /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/saved_model --outputs 'Identity_6:0','Identity_7:0' --output /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/ssd-mobnetv2_fpnlite_model.onnx --opset 12 --verbose

I have sent an email with all the ONNX models and the script I used to modify them.

And I am using the command below to convert from ONNX to TRT:

./trtexec --onnx=/tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/ssd-mobnetv2_fpnlite_model_no_nms_latest.onnx --saveEngine=tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/ssd_mobilenet_v2_fpnlite.trt --verbose

I will check with gdb and let you know if I find anything related to the segfault.

@VeeranjaneyuluToka (Author)

@pranavm-nvidia, I have picked up the latest tf2onnx branch changes, in particular the fix for onnx/tensorflow-onnx#1443 that avoids TensorListStack ops, then converted from SavedModel to ONNX again, changed the input-tensor datatype to float, and tried to convert from ONNX to TRT, but now I am getting a different error. Log attached for your reference: log.txt. Could you please have a look at the log file? Any suggestions to fix the error would be a great help.

@ttyio added the Export: tf2onnx, ONNX, Plugin, and triaged labels Apr 26, 2021
@VeeranjaneyuluToka (Author)

@pranavm-nvidia Thanks for all your help and the prompt responses. I could successfully convert from ONNX to a TRT engine, but the results are quite different. I am wondering how I can debug this; any suggestions or references for debugging further would be a great help.

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka A good first step would be to try Polygraphy to figure out which layer is the issue. I'd start with the ONNX model without the NMS node:

polygraphy run </path/to/model.onnx> \
    --trt --trt-outputs mark all \
    --onnxrt --onnx-outputs mark all

If that passes, then the problem must be in the NMS.

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented Apr 30, 2021

polygraphy run /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/280421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx --trt --trt-outputs mark all --onnxrt --onnx-outputs mark all
[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Runner: trt-runner-N0-04/30/21-16:20:17 | Activating and starting inference
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[I] Building engine with configuration: max_workspace_size=16777216 bytes (16.00 MB) | tf32=False, fp16=False, int8=False, strict_types=False | 1 profiles
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[I] Runner: trt-runner-N0-04/30/21-16:20:17 | Completed 1 iterations.
[I] Runner: onnxrt-runner-N0-04/30/21-16:20:17 | Activating and starting inference
[W] ONNX Checker exited with an error:
Field 'type' of value_info is required but missing.
2021-04-30 15:20:55.445659872 [W:onnxruntime:, graph.cc:84 MergeShapeInfo] Error merging shape info for output. 'StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/bn_Conv1/FusedBatchNormV3:0' source:{1,32,320,320} target:{0,32,320,320}. Falling back to lenient merge.
Traceback (most recent call last):
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/bin/polygraphy", line 47, in <module>
main()
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/bin/polygraphy", line 43, in main
sys.exit(args.subcommand(args))
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/tools/run/run.py", line 264, in __call__
exec(script)
File "<string>", line 32, in <module>
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/comparator/comparator.py", line 170, in run
run_results[runner.name] = execute_runner(runner, loader_cache)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/comparator/comparator.py", line 79, in execute_runner
with runner as active_runner:
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/base/runner.py", line 67, in __enter__
self.activate()
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/base/runner.py", line 93, in activate
self.activate_impl()
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/onnxrt/runner.py", line 40, in activate_impl
self.sess, _ = misc.try_call(self._sess)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/util/misc.py", line 274, in try_call
ret = func(*args, **kwargs)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/polygraphy/backend/onnxrt/loader.py", line 41, in __call__
return onnxruntime.InferenceSession(model_bytes)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 280, in __init__
self._create_inference_session(providers, provider_options)
File "/home/veeru/anaconda3/envs/tf_fish_2.2.0_dev/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 309, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (StatefulPartitionedCall/Postprocessor/Decode/mul_1) Op (Mul) [ShapeInferenceError] Incompatible dimensions

This is how the log looks, @pranavm-nvidia. Any inputs on this?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Seems like the ONNX model isn't valid. It looks to me like it could be a bug in tf2onnx, in which case you may want to file an issue on the tf2onnx repo (https://github.com/onnx/tensorflow-onnx).

@VeeranjaneyuluToka (Author)

@pranavm-nvidia, I have one doubt: when I use the command below to remove the loop and attach an input shape, the model starts carrying shape information, and I can visualize it in Netron:

polygraphy surgeon extract /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2_fpnlite_model.onnx --inputs StatefulPartitionedCall/map/while_loop:2,1x3x640x640,float32 -o /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx

But if you look at the attached graph, it shows 1x3x640x640 at the first node, and after that it shows 0x32x320x320. Why is the first dimension 0?
[Screenshot attached: 2021-04-30 21-21-18]

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Sounds like it might be a bug with ONNX shape inference. Is it causing any issues? TRT does its own shape inference, so it should ignore the intermediate shapes from the model.
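
If you do want to clear out the stale intermediate shapes, a minimal sketch using the standard onnx API (file names are placeholders):

import onnx
from onnx import shape_inference

model = onnx.load("model.onnx")
del model.graph.value_info[:]                # drop possibly-stale intermediate shapes
model = shape_inference.infer_shapes(model)  # re-infer them from the graph inputs
onnx.save(model, "model_reinferred.onnx")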

@VeeranjaneyuluToka (Author)

@pranavm-nvidia, the TRT conversion goes fine. But when I run inference using the TRT engine, the results are very bad, so I am not sure where it is going wrong. Is there anything I can do to track down or debug the TRT engine?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka You can use Polygraphy to debug. I would start by checking if TRT results match ONNX-Runtime:

polygraphy run \
     /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx \
    --trt --onnxrt
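
If the outputs differ only slightly, the comparison tolerances can also be loosened, e.g. (flag names as in recent Polygraphy versions; check polygraphy run --help):

polygraphy run /path/to/model.onnx --trt --onnxrt --atol 1e-3 --rtol 1e-3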

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented May 4, 2021

@pranavm-nvidia, thanks for the quick reply! I have done something like this: I ran inference on the TF SavedModel and converted the SavedModel to ONNX with the command below:

python -m tf2onnx.convert --saved-model /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/saved_model --output /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/030521_convert/ssd-mobnetv2_fpnlite_model.onnx --opset 12

Then I ran inference using ONNX-Runtime; the results are exactly the same. But the ONNX model converted by the above command contains all of the preprocessing, the network, and the postprocessing (including NMS), and its input tensor datatype is uint8. So I need to modify the input tensor datatype to float32, remove the loop nodes, keep only the raw bounding boxes, and strip the NMS part from the saved model. I used the command below to get the raw bounding-box outputs:

python -m tf2onnx.convert --saved-model /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/saved_model --output /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/030521_convert/ssd-mobnetv2_fpnlite_model_no_NMS.onnx --opset 12 --inputs "input_tensor:0" --outputs "Identity_6:0","Identity_7:0"

I cannot run inference with ONNX-Runtime on the model created by the above command.

And then I used Polygraphy to remove the loop node, as you suggested:

polygraphy surgeon extract /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2_fpnlite_model.onnx --inputs StatefulPartitionedCall/map/while_loop:2,1x3x640x640,float32 -o /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx

And then I used ONNX GraphSurgeon to modify the input tensor datatype and to add the NMS plugin.
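
For completeness, the input dtype change can be done with ONNX GraphSurgeon along these lines (a sketch; downstream nodes may additionally need a Cast, and file names are placeholders):

import onnx
import onnx_graphsurgeon as gs
import numpy as np

graph = gs.import_onnx(onnx.load("model.onnx"))
graph.inputs[0].dtype = np.float32  # the exported model declared uint8 here
onnx.save(gs.export_onnx(graph), "model_fp32_input.onnx")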

By the way, the Polygraphy command you suggested above gives the error below:

Runner: trt-runner-N0-05/04/21-15:50:36 | Activating and starting inference
[TensorRT] WARNING: onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[I] Building engine with configuration: max_workspace_size=16777216 bytes (16.00 MB) | tf32=False, fp16=False, int8=False, strict_types=False | 1 profiles
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[TensorRT] WARNING: TensorRT was linked against cuDNN 8.1.0 but loaded cuDNN 8.0.4
[I] Runner: trt-runner-N0-05/04/21-15:50:36 | Completed 1 iterations.
[I] Runner: onnxrt-runner-N0-05/04/21-15:50:36 | Activating and starting inference
2021-05-04 14:50:55.194512363 [W:onnxruntime:, graph.cc:84 MergeShapeInfo] Error merging shape info for output. 'StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/bn_Conv1/FusedBatchNormV3:0' source:{1,32,320,320} target:{0,32,320,320}. Falling back to lenient merge.
Traceback (most recent call last):
File "/home/veeru/anaconda3/envs/tf_fish_2.4/bin/polygraphy", line 47, in <module>
main()
File "/home/veeru/anaconda3/envs/tf_fish_2.4/bin/polygraphy", line 43, in main
sys.exit(args.subcommand(args))
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/tools/run/run.py", line 264, in __call__
exec(script)
File "<string>", line 26, in <module>
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/comparator/comparator.py", line 170, in run
run_results[runner.name] = execute_runner(runner, loader_cache)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/comparator/comparator.py", line 79, in execute_runner
with runner as active_runner:
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/base/runner.py", line 67, in __enter__
self.activate()
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/base/runner.py", line 93, in activate
self.activate_impl()
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/onnxrt/runner.py", line 40, in activate_impl
self.sess, _ = misc.try_call(self._sess)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/util/misc.py", line 274, in try_call
ret = func(*args, **kwargs)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/polygraphy/backend/onnxrt/loader.py", line 41, in __call__
return onnxruntime.InferenceSession(model_bytes)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 280, in __init__
self._create_inference_session(providers, provider_options)
File "/home/veeru/anaconda3/envs/tf_fish_2.4/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 309, in _create_inference_session
sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Node (StatefulPartitionedCall/Postprocessor/Decode/mul_1) Op (Mul) [ShapeInferenceError] Incompatible dimensions

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Could you try saving layerwise outputs from the first model (the one that works with ONNX-RT) and then comparing?

1. Save layerwise outputs:
polygraphy run \
    /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/030521_convert/ssd-mobnetv2_fpnlite_model.onnx \
    --onnxrt --onnx-outputs mark all \
    --save-outputs golden.pkl
2. Then you can compare the TRT outputs against those:
polygraphy run \ 
    /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/290421_convert/ssd-mobnetv2-fpnlite-model-inploop-removed.onnx \
    --trt --trt-outputs mark all \
    --load-outputs golden.pkl

@VeeranjaneyuluToka (Author)

It looks like there is no --save-outputs option in my Polygraphy for the first command?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Which version of Polygraphy are you using? Could you try installing from source?

@VeeranjaneyuluToka (Author)

@pranavm-nvidia After I built it manually with the commands below (the version is 0.21.1, I believe):

python3 setup.py bdist_wheel
python3 -m pip install polygraphy/dist/polygraphy-0.21.1-py2.py3-none-any.whl --user

I am getting the error below from this command:

polygraphy run /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/280421_convert/ssd-mobnetv2_fpnlite_model.onnx --onnxrt --onnx-outputs mark all --save-outputs /tf_git_hubs/tensorflow/workspace/training_demo/exported-models/my_ssd_mobnetv2_fpnlite_model/280421_convert/ssd-mobnetv2_fpnlite_model.pkl
Traceback (most recent call last):
File "/home/veeru/anaconda3/envs/tf_fish_2.4/bin/polygraphy", line 22, in <module>
from polygraphy.tools.util import args as args_util
ImportError: cannot import name 'args' from 'polygraphy.tools.util' (/home/veeru/.local/lib/python3.8/site-packages/polygraphy/tools/util/__init__.py)

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Did you install Polygraphy in your conda environment? It looks like there's a mismatch between the polygraphy binary and the Python module it's trying to load. Can you try using ~/.local/bin/polygraphy instead?
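
A quick sanity check for such a mismatch (illustrative):

which polygraphy
python3 -c "import polygraphy; print(polygraphy.__file__)"
~/.local/bin/polygraphy run model.onnx --onnxrt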

@VeeranjaneyuluToka (Author)

@pranavm-nvidia Thanks for the quick reply. If you remember, earlier I was getting a 1x51150x4 shape after the Squeeze operation (you can see it in the attached graph):
[Screenshot attached: 2021-04-27 20-39-56]

After I removed the Squeeze node, I was still getting only a 1x51150x4 shape, as shown in the attached graph below:
[Screenshot attached: 2021-04-27 21-23-12]

And then I was able to convert from ONNX to TRT, though the TRT results are quite different. Am I getting different results because of this?

Today I managed to get a 1x51150x1x4 shape, as shown in the graph below:
[Screenshot attached: 2021-05-05 14-39-00]

But I still think I need to get 1x51150x6x4, as I have 6 classes. Do you have any suggestions on this?

@pranavm-nvidia
Copy link
Collaborator

@VeeranjaneyuluToka You could insert a Tile node with repeats=[1, 1, 6, 1] (see https://github.com/onnx/onnx/blob/master/docs/Operators.md#Tile).

@VeeranjaneyuluToka (Author)

VeeranjaneyuluToka commented May 5, 2021

[Screenshot attached: 2021-05-05 15-27-32]
@pranavm-nvidia You mentioned that TRT does not consider the ONNX shape info; could that cause the accuracy difference? And would you mind sharing a sample of using this Tile operator? The docs give an example, but I am thinking my output Identity_6 should be repeated with the repeats you mentioned?

I modified it using onnxconverter_common.onnx2py and got the attached graph; is it correct?

@pranavm-nvidia (Collaborator)

TRT does not consider the ONNX shape info; could that cause the accuracy difference?

No, it shouldn't cause any difference in the output.

To insert the Tile, you could do something like:

import onnx
import onnx_graphsurgeon as gs
import numpy as np

graph = gs.import_onnx(onnx.load("/path/to/model.onnx"))
tmap = graph.tensors()

# Find boxes input and NMS node 
boxes = tmap["Identity_6:0"]
nms = boxes.outputs[0]

# Make Tile layer and connect to NMS
tile_out = graph.layer(op="Tile",
                       inputs=[boxes, np.array([1, 1, 6, 1], dtype=np.int64)],  # ONNX requires int64 repeats
                       outputs=["tile_out"])[0]
nms.inputs[0] = tile_out

# Re-export graph
onnx.save(gs.export_onnx(graph), "/path/to/new_model.onnx")

@VeeranjaneyuluToka (Author)

@pranavm-nvidia Thanks for sharing the sample code snippet. Here is the modified graph; would you mind having a look and letting me know whether this is how it should be?
[Screenshot attached: 2021-05-05 15-41-46]

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Yup, looks right. Could you double check that the repeats input of the Tile is [1, 1, 6, 1]? Other than that, should be good to go.

@VeeranjaneyuluToka (Author)

@pranavm-nvidia

# Make Tile layer and connect to NMS
tile_out = graph.layer(op="Tile",
                       inputs=[boxes, np.array([1, 1, 6, 1])],
                       outputs=["tile_out"])[0]

This is what I used, as you suggested above, and I think the repeats are the shape you mentioned. But the ONNX-to-TRT conversion fails with it:
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_4_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_4_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_medium_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_5_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_5_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_medium_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_6_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_6_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_medium_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_7_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_7_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_8_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_8_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_9_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_9_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_10_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_10_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_11_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_11_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_12_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_12_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_13_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_13_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_14_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_14_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_15_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_15_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_16_expand/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/block_16_expand_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/Conv_1/Conv2D + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/model/out_relu/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/projection_1/BiasAdd + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/add_1 (scudnn) Set Tactic Name: ampere_scudnn_128x64_relu_interior_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1_depthwise_conv/separable_conv2d + StatefulPartitionedCall/ssd_mobile_net_v2fpn_keras_feature_extractor/FeatureMaps/top_down/smoothing_1/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_0/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_1/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_2/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [V] [TRT] StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/separable_conv2d + StatefulPartitionedCall/WeightSharedConvolutionalBoxPredictor/PredictionTower/conv2d_3/activation_0/Relu6 (scudnn) Set Tactic Name: ampere_scudnn_128x128_relu_small_nn_v1
[05/05/2021-15:58:57] [F] [TRT] Assertion failed: inputDims[0].d[1] == numLocClasses
batchedNMSPlugin.cpp:297
Aborting...

Let me know if you want to have a look at the complete log.

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Actually, you probably don't want to tile in this case (though if you do, you can set the shareLocation attribute to False). If shareLocation is True, then your boxes input can be 1x51150x1x4 and the boxes will be shared for all classes. The accuracy issue is probably not due to this.
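
If you did keep the Tile, the attribute could be flipped with the same ONNX GraphSurgeon pattern as in the topK sketch earlier (the node op name follows the batchedNMSPlugin assertion above; file names are placeholders):

import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))
for node in graph.nodes:
    if node.op == "BatchedNMS_TRT":
        node.attrs["shareLocation"] = False  # boxes are then expected per-class, e.g. 1x51150x6x4
onnx.save(gs.export_onnx(graph), "model_per_class_boxes.onnx")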

@VeeranjaneyuluToka (Author)

@pranavm-nvidia OK, I will take the Tile out. Do you have any suggestions for investigating my accuracy issue further?

@pranavm-nvidia (Collaborator)

@VeeranjaneyuluToka Comparing layer-wise outputs would be a good next step; see #1205 (comment).

This could be pretty tricky to debug, though, given the number of modifications made to the ONNX model.

@ttyio (Collaborator)

ttyio commented Jul 2, 2021

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

@ttyio closed this as completed Jul 2, 2021