[nanodet-plus] Conv layer shape wrong #15

Closed
vuthithao opened this issue Nov 15, 2022 · 6 comments
Labels
5D channel shuffle, no-issue-activity, OP:Reshape, OP:Transpose, opset < 11, Parameter replacement

Comments


vuthithao commented Nov 15, 2022

Issue Type

Others

onnx2tf version number

1.1.22

Download URL for ONNX

https://drive.google.com/file/d/1HyUDTYSTgS7_Gs29SxUSoxs3irqvIsXM/view?usp=sharing

Parameter Replacement JSON

{}

Description

  1. Purpose: Personal development
  2. What:
    Running script: onnx2tf -i nanodet-plus-m_416.onnx -oh5 -k "data"
    Issue log:
INFO: onnx_op_type: Conv onnx_op_name: /backbone/stage2/stage2.1/branch2/branch2.0/Conv
INFO: input_name.1: /backbone/stage2/stage2.1/Split_output_1 shape: [1, 58, 52, 52] dtype: float32
INFO: input_name.2: onnx::Conv_1392 shape: [58, 58, 1, 1] dtype: <class 'numpy.float32'>
INFO: input_name.3: onnx::Conv_1393 shape: [58] dtype: <class 'numpy.float32'>
INFO: output_name.1: /backbone/stage2/stage2.1/branch2/branch2.0/Conv_output_0 shape: [1, 58, 52, 52] dtype: float32
ERROR: The trace log is below.
Traceback (most recent call last):
  File "/mnt/hdd10tb/Users/thaovu/.conda/envs/onnx2tf/lib/python3.8/site-packages/onnx2tf/utils/common_functions.py", line 261, in print_wrapper_func
    result = func(*args, **kwargs)
  File "/mnt/hdd10tb/Users/thaovu/.conda/envs/onnx2tf/lib/python3.8/site-packages/onnx2tf/utils/common_functions.py", line 323, in inverted_operation_enable_disable_wrapper_func
    result = func(*args, **kwargs)
  File "/mnt/hdd10tb/Users/thaovu/.conda/envs/onnx2tf/lib/python3.8/site-packages/onnx2tf/ops/Conv.py", line 153, in make_node
    tf.nn.convolution(
  File "/mnt/hdd10tb/Users/thaovu/.conda/envs/onnx2tf/lib/python3.8/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/mnt/hdd10tb/Users/thaovu/.conda/envs/onnx2tf/lib/python3.8/site-packages/keras/layers/core/tf_op_layer.py", line 119, in handle
    return TFOpLambda(op)(*args, **kwargs)
  File "/mnt/hdd10tb/Users/thaovu/.conda/envs/onnx2tf/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.nn.convolution_4" (type TFOpLambda).

Depth of input (52) is not a multiple of input depth of filter (58) for '{{node tf.nn.convolution_4/convolution}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](Placeholder, tf.nn.convolution_4/convolution_internal/filters)' with input shapes: [1,58,52,52], [1,1,58,58].

Question 1: The weight onnx::Conv_1392 has shape [58, 58, 1, 1] in ONNX, but here the filter appears as [1, 1, 58, 58]. Is this caused by data_format?
Question 2: I set -k "data", but the data_format of this layer is still NHWC. Correct me if I am wrong.
3. How:
First, I ran without the -k option and got this error. I then tried -k and still got the same error.
I also checked parameter replacement but could not find a suitable solution.
4. Why: I want to convert the NanoDet-plus-m model to Keras (h5 format).
5. Resources: PyTorch model from the NanoDet model zoo https://github.com/RangiLyu/nanodet
Thank you for your hard work. Your projects are very useful <3
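
For reference, the arithmetic behind the error message and Question 1 can be sketched with plain shape bookkeeping. This is a minimal illustration, not onnx2tf code: the helper names `permute` and `nhwc_depth_matches` are hypothetical. It assumes the standard layouts, where ONNX Conv weights are [out, in, kH, kW] and TensorFlow Conv2D filters are [kH, kW, in, out], so a filter of [1, 1, 58, 58] is expected and the real mismatch is the input still being in NCHW order.

```python
def permute(shape, perm):
    """Return the shape produced by transposing `shape` with `perm`."""
    return [shape[axis] for axis in perm]


def nhwc_depth_matches(input_shape, filter_hwio):
    """Mimic the depth check in the error message for an NHWC Conv2D.

    The input depth is the LAST input dimension, and the filter's input
    depth is its third dimension.
    """
    input_depth = input_shape[-1]
    filter_in_depth = filter_hwio[2]
    return input_depth % filter_in_depth == 0


onnx_weight_oihw = [58, 58, 1, 1]
tf_filter_hwio = permute(onnx_weight_oihw, [2, 3, 1, 0])  # [1, 1, 58, 58]

nchw_input = [1, 58, 52, 52]                    # what the failing Conv received
nhwc_input = permute(nchw_input, [0, 2, 3, 1])  # [1, 52, 52, 58]

print(nhwc_depth_matches(nchw_input, tf_filter_hwio))  # False: 52 vs 58
print(nhwc_depth_matches(nhwc_input, tf_filter_hwio))  # True: 58 vs 58
```

Read this way, the filter shape in the log is consistent with a correct weight conversion, and the value 52 in the message is the width of an untransposed NCHW tensor being read as the channel count.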

Owner

PINTO0309 commented Nov 15, 2022

The most likely cause is a Reshape or Transpose, as described in the README. The tool tries to adjust layouts automatically during conversion to avoid breakage, but there are limits to automatic shape changes and transpositions. The combination of Reshape with a 5-dimensional Transpose is particularly error-prone, and this applies to most ShuffleNet-style models.

Therefore, when a conversion error occurs, check the Reshape and Transpose OPs near the failure to see whether the converted shapes differ from what is expected. Below is a step-by-step procedure for identifying and resolving the problem.

  1. onnx2tf -i nanodet-plus-m_416.onnx -oh5

  2. Identify the name of the OP where the error finally occurred.
    /backbone/stage2/stage2.1/branch2/branch2.0/Conv
    image
    image
    image

  3. Trace back a short distance from the OP where the error finally occurred to the previous OP to find where the Reshape or Transpose shape change is broken.
    image

  4. Check the conversion logs for the three OPs in the figure above.

    INFO: onnx_op_type: Reshape onnx_op_name: /backbone/stage2/stage2.0/Reshape
    INFO: input_name.1: /backbone/stage2/stage2.0/Concat_output_0 shape: [1, 116, 52, 52] dtype: float32
    INFO: input_name.2: /backbone/stage2/stage2.0/Constant_output_0 shape: [5] dtype: <class 'numpy.int64'>
    INFO: output_name.1: /backbone/stage2/stage2.0/Reshape_output_0 shape: [1, 2, 58, 52, 52] dtype: float32
    INFO: tf_op_type: reshape
    INFO: input.1.tensor: name: tf.compat.v1.transpose/transpose:0 shape: (1, 116, 52, 52) dtype: <dtype: 'float32'> 
    INFO: input.2.shape: val: [1, 2, 58, 52, 52] 
    INFO: output.1.output: name: tf.reshape/Reshape:0 shape: (1, 2, 58, 52, 52) dtype: <dtype: 'float32'> 
    
    INFO: onnx_op_type: Transpose onnx_op_name: /backbone/stage2/stage2.0/Transpose
    INFO: input_name.1: /backbone/stage2/stage2.0/Reshape_output_0 shape: [1, 2, 58, 52, 52] dtype: float32
    INFO: output_name.1: /backbone/stage2/stage2.0/Transpose_output_0 shape: [1, 58, 2, 52, 52] dtype: float32
    INFO: tf_op_type: transpose_v2
    INFO: input.1.a: name: tf.reshape/Reshape:0 shape: (1, 2, 58, 52, 52) dtype: <dtype: 'float32'> 
    INFO: input.2.perm: val: [0, 2, 1, 3, 4] 
    INFO: output.1.output: name: tf.compat.v1.transpose_1/transpose:0 shape: (1, 58, 2, 52, 52) dtype: <dtype: 'float32'> 
    
    INFO: onnx_op_type: Reshape onnx_op_name: /backbone/stage2/stage2.0/Reshape_1
    INFO: input_name.1: /backbone/stage2/stage2.0/Transpose_output_0 shape: [1, 58, 2, 52, 52] dtype: float32
    INFO: input_name.2: /backbone/stage2/stage2.0/Constant_1_output_0 shape: [4] dtype: <class 'numpy.int64'>
    INFO: output_name.1: /backbone/stage2/stage2.0/Reshape_1_output_0 shape: [1, 116, 52, 52] dtype: float32
    INFO: tf_op_type: reshape
    INFO: input.1.tensor: name: tf.compat.v1.transpose_2/transpose:0 shape: (1, 58, 2, 52, 52) dtype: <dtype: 'float32'> 
    INFO: input.2.shape: val: [1, -1, 52, 52] 
    INFO: output.1.output: name: tf.reshape_1/Reshape:0 shape: (1, 116, 52, 52) dtype: <dtype: 'float32'> 
    
    OPs to investigate:
    /backbone/stage2/stage2.0/Reshape Correct, because the output shape is (1, 2, 58, 52, 52)
    /backbone/stage2/stage2.0/Transpose Correct, because the output shape is (1, 58, 2, 52, 52)
    /backbone/stage2/stage2.0/Reshape_1 Incorrect: the output shape is (1, 116, 52, 52), which is NCHW format. It needs to be transposed to NHWC (1, 52, 52, 116) before being input to the next Split.

    image

  5. The output shape of Reshape_1 was found to be incorrect, so correct its behavior by transposing its output to NHWC format.

    • replace.json
      {
        "format_version": 1,
        "operations": [
          {
            "op_name": "/backbone/stage2/stage2.0/Reshape_1",
            "param_target": "outputs",
            "param_name": "/backbone/stage2/stage2.0/Reshape_1_output_0",
            "post_process_transpose_perm": [0,2,3,1]
          }
        ]
      }
  6. Run the conversion again, passing the created JSON file to the tool with the -prf option:

    onnx2tf -i nanodet-plus-m_416.onnx -oh5 -prf replace.json
    

    The log shows that the output shape of /backbone/stage2/stage2.0/Reshape_1 is corrected to NHWC. At the same time, the input shape of /backbone/stage2/stage2.1/branch2/branch2.0/Conv is corrected to (1, 52, 52, 58).

    INFO: onnx_op_type: Reshape onnx_op_name: /backbone/stage2/stage2.0/Reshape
    INFO: input_name.1: /backbone/stage2/stage2.0/Concat_output_0 shape: [1, 116, 52, 52] dtype: float32
    INFO: input_name.2: /backbone/stage2/stage2.0/Constant_output_0 shape: [5] dtype: <class 'numpy.int64'>
    INFO: output_name.1: /backbone/stage2/stage2.0/Reshape_output_0 shape: [1, 2, 58, 52, 52] dtype: float32
    INFO: tf_op_type: reshape
    INFO: input.1.tensor: name: tf.compat.v1.transpose/transpose:0 shape: (1, 116, 52, 52) dtype: <dtype: 'float32'> 
    INFO: input.2.shape: val: [1, 2, 58, 52, 52] 
    INFO: output.1.output: name: tf.reshape/Reshape:0 shape: (1, 2, 58, 52, 52) dtype: <dtype: 'float32'> 
    
    INFO: onnx_op_type: Transpose onnx_op_name: /backbone/stage2/stage2.0/Transpose
    INFO: input_name.1: /backbone/stage2/stage2.0/Reshape_output_0 shape: [1, 2, 58, 52, 52] dtype: float32
    INFO: output_name.1: /backbone/stage2/stage2.0/Transpose_output_0 shape: [1, 58, 2, 52, 52] dtype: float32
    INFO: tf_op_type: transpose_v2
    INFO: input.1.a: name: tf.reshape/Reshape:0 shape: (1, 2, 58, 52, 52) dtype: <dtype: 'float32'> 
    INFO: input.2.perm: val: [0, 2, 1, 3, 4] 
    INFO: output.1.output: name: tf.compat.v1.transpose_1/transpose:0 shape: (1, 58, 2, 52, 52) dtype: <dtype: 'float32'> 
    
    INFO: onnx_op_type: Reshape onnx_op_name: /backbone/stage2/stage2.0/Reshape_1
    INFO: input_name.1: /backbone/stage2/stage2.0/Transpose_output_0 shape: [1, 58, 2, 52, 52] dtype: float32
    INFO: input_name.2: /backbone/stage2/stage2.0/Constant_1_output_0 shape: [4] dtype: <class 'numpy.int64'>
    INFO: output_name.1: /backbone/stage2/stage2.0/Reshape_1_output_0 shape: [1, 116, 52, 52] dtype: float32
    INFO: tf_op_type: reshape
    INFO: input.1.tensor: name: tf.compat.v1.transpose_2/transpose:0 shape: (1, 58, 2, 52, 52) dtype: <dtype: 'float32'> 
    INFO: input.2.shape: val: [1, -1, 52, 52] 
    INFO: output.1.output: name: tf.compat.v1.transpose_3/transpose:0 shape: (1, 52, 52, 116) dtype: <dtype: 'float32'> 
    
    INFO: onnx_op_type: Split onnx_op_name: /backbone/stage2/stage2.1/Split
    INFO: input_name.1: /backbone/stage2/stage2.0/Reshape_1_output_0 shape: [1, 116, 52, 52] dtype: float32
    INFO: output_name.1: /backbone/stage2/stage2.1/Split_output_0 shape: [1, 58, 52, 52] dtype: float32
    INFO: output_name.2: /backbone/stage2/stage2.1/Split_output_1 shape: [1, 58, 52, 52] dtype: float32
    INFO: tf_op_type: split
    INFO: input.1.value: name: tf.compat.v1.transpose_3/transpose:0 shape: (1, 52, 52, 116) dtype: <dtype: 'float32'> 
    INFO: input.2.num_or_size_splits: val: [58, 58] 
    INFO: input.3.axis: val: 3 
    INFO: input.4.num: 
    INFO: output.1.output0: name: tf.split/split:0 shape: (1, 52, 52, 58) dtype: <dtype: 'float32'> 
    INFO: output.2.output1: name: tf.split/split:1 shape: (1, 52, 52, 58) dtype: <dtype: 'float32'> 
    
    INFO: onnx_op_type: Conv onnx_op_name: /backbone/stage2/stage2.1/branch2/branch2.0/Conv
    INFO: input_name.1: /backbone/stage2/stage2.1/Split_output_1 shape: [1, 58, 52, 52] dtype: float32
    INFO: input_name.2: onnx::Conv_1392 shape: [58, 58, 1, 1] dtype: <class 'numpy.float32'>
    INFO: input_name.3: onnx::Conv_1393 shape: [58] dtype: <class 'numpy.float32'>
    INFO: output_name.1: /backbone/stage2/stage2.1/branch2/branch2.0/Conv_output_0 shape: [1, 58, 52, 52] dtype: float32
    INFO: tf_op_type: convolution_v2
    INFO: input.1.input: name: tf.split/split:1 shape: (1, 52, 52, 58) dtype: <dtype: 'float32'> 
    INFO: input.2.weights: shape: (1, 1, 58, 58) dtype: float32 
    INFO: input.3.bias: shape: (58,) dtype: float32 
    INFO: input.4.strides: val: [1, 1] 
    INFO: input.5.dilations: val: [1, 1] 
    INFO: input.6.padding: val: SAME 
    INFO: input.7.group: val: 1 
    INFO: output.1.output: name: tf.math.add_6/Add:0 shape: (1, 52, 52, 58) dtype: <dtype: 'float32'>
    
  7. The error occurs again, but only because the model contains several similar Reshape and Transpose combinations. Note that this time the error occurs in a different OP than before. This means that everything from the first failing Conv up to the next failing Conv was converted successfully.

    /backbone/stage2/stage2.2/branch2/branch2.0/Conv
    image

  8. Repeat steps 1 through 5 until no errors remain. The model you shared seems to contain 16 identical
    5D Reshape -> 5D Transpose -> 4D Reshape combinations, so a similar entry would need to be written 16 times in the JSON file.
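
The shape bookkeeping in steps 4 to 6 can be traced with a short sketch. It only manipulates shape lists and channel indices (no tensors), and the helper names `permute` and `channel_shuffle_order` are hypothetical, not part of onnx2tf. It assumes the Reshape -> Transpose -> Reshape chain is the usual channel shuffle with 2 groups, as the logged shapes suggest.

```python
def permute(shape, perm):
    """Return the shape produced by transposing `shape` with `perm`."""
    return [shape[axis] for axis in perm]


def channel_shuffle_order(channels, groups):
    """Channel order after reshape (groups, c/groups) -> swap -> flatten."""
    per_group = channels // groups
    return [g * per_group + i for i in range(per_group) for g in range(groups)]


# Shapes as they appear in the conversion log (ONNX / NCHW view).
reshaped = [1, 2, 58, 52, 52]                                       # Reshape
transposed = permute(reshaped, [0, 2, 1, 3, 4])                     # Transpose
merged = [transposed[0], transposed[1] * transposed[2], *transposed[3:]]  # Reshape_1

# Effect of "post_process_transpose_perm": [0, 2, 3, 1] on Reshape_1's output.
fixed = permute(merged, [0, 2, 3, 1])

print(transposed)  # [1, 58, 2, 52, 52]
print(merged)      # [1, 116, 52, 52]  (still NCHW)
print(fixed)       # [1, 52, 52, 116]  (NHWC, what Split and Conv expect)
print(channel_shuffle_order(116, 2)[:4])  # [0, 58, 1, 59]
```

The last line shows why the three OPs cannot simply be skipped: they interleave the two channel groups, so only the final layout needs correcting, not the shuffle itself.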

Since you have provided a sample for testing, I intend to resolve this issue in a future update.
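
Writing 16 near-identical entries by hand is tedious, so the JSON could be generated instead. The sketch below is one possible way to do that. The function name `build_replacements` is hypothetical, and it assumes each output tensor is named `<op_name>_output_0`, which matches the single entry shown above but should be verified against the conversion log for every OP. The list of OP names must come from your own inspection of the model; only the one name confirmed in this thread is filled in.

```python
import json


def build_replacements(op_names, perm=(0, 2, 3, 1)):
    """Build a parameter replacement dict with one entry per Reshape OP."""
    operations = []
    for op_name in op_names:
        operations.append({
            "op_name": op_name,
            "param_target": "outputs",
            # Assumption: the output tensor is "<op_name>_output_0".
            "param_name": f"{op_name}_output_0",
            "post_process_transpose_perm": list(perm),
        })
    return {"format_version": 1, "operations": operations}


# Add the remaining Reshape_1-style OP names found in your own logs.
op_names = ["/backbone/stage2/stage2.0/Reshape_1"]
config = build_replacements(op_names)
print(json.dumps(config, indent=2))
```

Saving the printed text as replace.json and passing it with -prf, as in step 6, reproduces the hand-written file for the OPs listed.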

PINTO0309 added the Parameter replacement, OP:Reshape, OP:Transpose, and 5D channel shuffle labels on Nov 15, 2022
@vuthithao
Author

I applied the parameter replacement as you described, using the replace.json here: https://drive.google.com/drive/folders/1bDJw2mJVGirdpO33POvDxrEQx0cTD2HY?usp=sharing
I then got an error at the Resize layer.
image
With len(graph_node.inputs)=2, this case falls into the "else" branch of

if sizes is not None:

I simply added an "else" case:

# TensorFlow requires the shape of "size" in "tf.image.resize" to be known at
# graph creation time. However, in the dynamic shape situation, the shape of "new_size"
# will be "None", and the actual shape can only be determined at runtime. But we know
# "new_size" should always contain [h, w], therefore the shape must be 2.
else:
    # No "sizes" input: fall back to the static ONNX output shape if H and W are known.
    if hasattr(graph_node.outputs[0], 'shape') \
        and graph_node.outputs[0].shape is not None \
        and isinstance(graph_node.outputs[0].shape[-2], int) \
        and isinstance(graph_node.outputs[0].shape[-1], int):
        new_size = graph_node.outputs[0].shape[-2:]  # Estimated from ONNX output shape
if hasattr(new_size, 'set_shape'):
    new_size.set_shape([2])

It works.
Thank you for your support.
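
The idea behind the added "else" case can be isolated into a small standalone function. This is only a sketch of the fallback logic, not the code in onnx2tf: the name `estimate_resize_size` is hypothetical, it assumes the ONNX output shape is in NCHW order so that the last two entries are height and width, and the example shapes are illustrative.

```python
def estimate_resize_size(output_shape):
    """Return [h, w] from a static ONNX output shape, or None if unknown.

    Dynamic dimensions usually show up as strings or None, so both of the
    last two entries must be plain ints for the estimate to be usable.
    """
    if output_shape is None or len(output_shape) < 2:
        return None
    height, width = output_shape[-2], output_shape[-1]
    if isinstance(height, int) and isinstance(width, int):
        return [height, width]
    return None


print(estimate_resize_size([1, 96, 26, 26]))         # [26, 26]
print(estimate_resize_size([1, 96, "h", "w"]))       # None
print(estimate_resize_size(None))                    # None
```

Returning None for dynamic shapes makes it explicit that the fallback only helps when the model was exported with static spatial dimensions.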

PINTO0309 added the opset < 11 label on Nov 16, 2022
Owner

PINTO0309 commented Nov 16, 2022

Thanks for pointing that out. I found the cause of the error in the Resize OP.

Only models with opset <= 10 seem to be affected, because the number and type of inputs to Resize are completely different there. The tool was originally tested mainly with models of opset >= 11, and its support for opset <= 10 was inadequate.

Your model has opset=10.
image

Therefore, the modification you suggested would have worked, but I have made a more generic fix.
commit: b275740
release: https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.23

I have released 1.1.23. Upgrading to the latest package should solve the problem.
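
To find out in advance whether a model falls into the opset <= 10 case, the opset can be read from the model's `opset_import` field. The sketch below uses hypothetical helper names (`default_opset`, `uses_legacy_resize`) and a stand-in object so that it runs without the `onnx` package; with a real file, the object returned by `onnx.load("nanodet-plus-m_416.onnx")` could be passed instead.

```python
from types import SimpleNamespace


def default_opset(model):
    """Return the opset version of the default ONNX domain, or None."""
    for entry in model.opset_import:
        # The default operator set is registered under "" or "ai.onnx".
        if entry.domain in ("", "ai.onnx"):
            return entry.version
    return None


def uses_legacy_resize(opset):
    """True for the opset range this thread reports as problematic (<= 10)."""
    return opset is not None and opset <= 10


# Stand-in for a loaded model; mirrors the opset=10 reported for this model.
fake_model = SimpleNamespace(
    opset_import=[SimpleNamespace(domain="", version=10)]
)

opset = default_opset(fake_model)
print(opset)                      # 10
print(uses_legacy_resize(opset))  # True
```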

PINTO0309 changed the title from "Conv layer shape wrong" to "[nanodet-plus] Conv layer shape wrong" on Nov 17, 2022
@github-actions

If there is no activity within the next two days, this issue will be closed automatically.

Owner

PINTO0309 commented Nov 23, 2022

Fixes: https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.29

onnx2tf -i nanodet-plus-m_416.onnx

[images: ONNX graph and converted TFLite graph]
