
Question about channel_transpose in common_functions.py #18

Closed
Hyunseok-Kim0 opened this issue Nov 17, 2022 · 12 comments
Labels
discussion Specification Discussion

Comments

@Hyunseok-Kim0
Collaborator

Issue Type

Others

onnx2tf version number

1.1.25

Download URL for ONNX

gist for reproduction: https://colab.research.google.com/gist/Hyunseok-Kim0/d0aaf6e9ac6fbe461c5f2364db4bc0b2/onnx2tf_20221117.ipynb

Parameter Replacement JSON

N/A

Description

  1. Purpose: Personal development
  2. What: channel_transpose in common_functions.py is used in arithmetic operations like Add, Sub, Mul, etc. What is the main purpose of this function? When second input has more dimension, channel_transpose adds additional squeeze layer and changes the output shape, not vice versa.
    Please see the gist (https://colab.research.google.com/gist/Hyunseok-Kim0/d0aaf6e9ac6fbe461c5f2364db4bc0b2/onnx2tf_20221117.ipynb). When the output of network is x = x + y, onnx and tflite has same output shape. Converted tflite has wrong shape for x = y + x.
(Screenshots: ONNX model, correct tflite output shape for x = x + y, wrong tflite output shape for x = y + x.)
@PINTO0309
Owner

PINTO0309 commented Nov 17, 2022

Thanks. Please bear with me for a little discussion.

I am aware of that problem. However, I am struggling to come up with a realistic way to handle it.

This problem is especially common with Mul, Add, Sub, and Div. For example, the following Add operation patterns can occur for 4D NHWC / 5D NDHWC inputs. My implementation is still rough, so I have kept it simple with the idea of broadcasting the Y side or compressing its dimensions.

All of the following Y patterns on ONNX need to be converted to NHWC for processing. Also, the examples below are very simple patterns that only need a broadcast, where all dimensions except one are 1.

e.g.

  • pattern.1: X = [1,128,128,3] (TF input format), Y = [3] (3 channel constant of onnx)
  • pattern.2: X = [1,128,128,3] (TF input format), Y = [1,3,1,1] (NCHW constant of onnx)
  • pattern.3: X = [1,128,128,3] (TF input format), Y = [3,1,1] (CHW constant of onnx)
  • pattern.4: X = [1,128,128,3] (TF input format), Y = [1,3] (NC constant of onnx)
  • pattern.5: X = [1,128,128,3] (TF input format), Y = [3,128] (CH constant of onnx)
  • pattern.6: X = [1,128,128,3] (TF input format), Y = [1,3,128] (NCH constant of onnx)
  • pattern.7: X = [1,128,128,3] (TF input format), Y = [128,128] (HW constant of onnx)
  • pattern.8: X = [1,64,128,128,3] (TF input format), Y = [128,128] (HW constant of onnx)
  • pattern.9: X = [1,64,128,128,3] (TF input format), Y = [3] (3 channel constant of onnx)
  • pattern.10: X = [1,64,128,128,3] (TF input format), Y = [1,128] (CH constant of onnx)
  • pattern.11: X = [1,64,128,128,3] (TF input format), Y = [1,1,128,1] (DCHW constant of onnx)
  • pattern.12: X = [1,64,128,128,3] (TF input format), Y = [64] (D constant of onnx)
  • pattern.13: X = [1,128,128,128,64] (TF input format), Y = [128] (? constant of onnx)
  • etc...

I have no idea how to successfully implement constant broadcasts for arbitrary dimensions (xD), not just 2D to 5D. Right now I am forced to deal with only a limited set of patterns. Realistically, I believe we need not only NumPy-style broadcasting, but also an implementation that mechanically reinterprets constants that assume NCHW format as NHWC format and then broadcasts them.

It is very easy to implement a simple broadcast.
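Just to illustrate the idea with a minimal NumPy sketch (not onnx2tf code): when the constant is already in full NCHW order, "reading it as NHWC" simply means moving the channel axis to the end before broadcasting. The rank-deficient patterns above ([3], [1,3], [3,128], ...) are exactly the cases where this cannot be done mechanically.

import numpy as np

# X in TF layout (NHWC), Y as an NCHW constant from ONNX (pattern 2 above).
x_nhwc = np.random.rand(1, 128, 128, 3).astype(np.float32)
y_nchw = np.random.rand(1, 3, 1, 1).astype(np.float32)

# Reinterpret the NCHW constant in NHWC order before broadcasting:
# move the channel axis (axis 1) to the last position.
y_nhwc = np.transpose(y_nchw, (0, 2, 3, 1))   # -> (1, 1, 1, 3)

out = x_nhwc + y_nhwc                         # broadcasts to (1, 128, 128, 3)
print(out.shape)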

@Hyunseok-Kim0
Collaborator Author

Hyunseok-Kim0 commented Nov 17, 2022

Now I understand how difficult this problem is to solve, since there is no information to infer the axis order of a tensor during conversion. However, is comparing the input and output shapes between ONNX and TensorFlow not enough? The ONNX model follows the NumPy broadcasting rule anyway, and in that case the tensor shapes are compared from the trailing dimensions. It looks like patterns 4, 5, and 12 cannot exist in ONNX.

Considering that only the channel dimension should move in the ONNX-to-TensorFlow conversion, I think the procedure below can work for broadcasting in an arbitrary number of dimensions (a rough sketch follows after the list).

  1. If y has more dimensions, swap x and y.
  2. unsqueeze(0) y until x and y have the same number of dimensions.
  3. Compare the shape of x between ONNX and TensorFlow to figure out where the channel dimension moved.
  4. Transpose the channel dimension of y to match the order of x if needed.

As a result, one reshape layer and one transpose layer will be added.
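A minimal NumPy sketch of this procedure (the function name align_y_to_x and its arguments are hypothetical, not onnx2tf code; the swap in step 1 is assumed to have already happened):

import numpy as np

def align_y_to_x(x_tf, y, x_onnx_shape):
    # x_tf:         the TF-side tensor (e.g. NHWC)
    # y:            the other operand, possibly with fewer dims (ONNX order)
    # x_onnx_shape: the shape X had in the ONNX graph (e.g. NCHW)

    # Step 2: unsqueeze(0) y until x and y have the same rank.
    while y.ndim < x_tf.ndim:
        y = np.expand_dims(y, 0)

    # Step 3: compare X's ONNX shape with its TF shape to find where each
    # axis moved (only works when the dimension sizes are distinct enough).
    onnx = list(x_onnx_shape)
    perm = []
    for d in x_tf.shape:
        idx = onnx.index(d)
        perm.append(idx)
        onnx[idx] = None  # consume, so duplicated sizes are not reused

    # Step 4: transpose y with the same permutation so it lines up with x_tf.
    return np.transpose(y, perm)

# e.g. X: ONNX (1,3,128,128) -> TF (1,128,128,3), Y: ONNX constant (1,3,1,1)
x_tf = np.zeros((1, 128, 128, 3))
y = np.zeros((1, 3, 1, 1))
print(align_y_to_x(x_tf, y, (1, 3, 128, 128)).shape)  # (1, 1, 1, 3)

As noted in the next comment, step 3 breaks down when several axes of x have the same size.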

@PINTO0309
Owner

PINTO0309 commented Nov 17, 2022

Compare the shape of x between ONNX and TensorFlow to figure out where the channel dimension moved.

Unfortunately, there is a problem in shufflenet-based models that prevents locating the channel dimension when all dimensions except batch size are the same. There are quite a few models where the very operation of comparing ONNX shapes to TensorFlow shapes breaks down.

e.g. onnx x: [1,80,80,80], [40,40,40], etc...

It might be possible to handle this in a very limited way... 🤔
I may still be misunderstanding something.

@Hyunseok-Kim0
Collaborator Author

What about comparing intermediate outputs using a dummy input? For the cases you mentioned, brute force looks like the only solution.

@PINTO0309
Owner

brute-force

Thanks.
I see. This idea had never occurred to me before. I will give it some thought.

@PINTO0309 PINTO0309 added the discussion Specification Discussion label Nov 17, 2022
@PINTO0309
Owner

PINTO0309 commented Nov 18, 2022

Notes on implementation ideas.

  1. Get the inference result of the corresponding OP alone using a dummy tensor.
  2. Flatten the output tensor to 1D and sort it in ascending order.
  3. Either use numpy.ndarray.all to determine an exact match, or check element by element whether the values are approximately equal (a simple equality comparison is likely to cause problems, since small arithmetic errors can occur).

The approximate-match logic needs to be separated for integer and floating-point cases.

https://numpy.org/doc/stable/reference/generated/numpy.isclose.html
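A minimal NumPy sketch of these notes (function name and tolerances are assumptions, not onnx2tf code):

import numpy as np

def same_values(onnx_out: np.ndarray, tf_out: np.ndarray,
                rtol: float = 1e-4, atol: float = 1e-5) -> bool:
    # Compare a single OP's outputs computed from the same dummy tensor,
    # ignoring axis order.
    if onnx_out.size != tf_out.size:
        return False
    # Step 2: flatten to 1D and sort ascending so the comparison is layout-agnostic.
    a = np.sort(onnx_out.reshape(-1))
    b = np.sort(tf_out.reshape(-1))
    # Step 3: exact match for integer tensors, approximate match for floats.
    if np.issubdtype(a.dtype, np.integer) and np.issubdtype(b.dtype, np.integer):
        return bool((a == b).all())
    return bool(np.isclose(a, b, rtol=rtol, atol=atol).all())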

@PINTO0309
Owner

PINTO0309 commented Nov 19, 2022

@PINTO0309
Owner

PINTO0309 commented Nov 20, 2022

https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.26
https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.27

@Hyunseok-Kim0
Collaborator Author

Hyunseok-Kim0 commented Nov 21, 2022

The current explicit_broadcast has a bug. It returns swapped operands if const_or_var_2.shape is all 1's. This causes a wrong calculation when a constant is one of the operands. For example, (1 - x) is calculated as (x - 1) in the current version.

# Swap: len(const_or_var_1.shape) > len(const_or_var_2.shape)
if len(const_or_var_1.shape) < len(const_or_var_2.shape):
    const_or_var_1, const_or_var_2 = const_or_var_2, const_or_var_1
    graph_node_input_name1, graph_node_input_name2 = graph_node_input_name2, graph_node_input_name1

# If const_or_var_2.shape is all 1's, do not broadcast and return as is
shape_for_judging_skip_processing = [
    i if i is not None else INF_INDEX_VALUE for i in const_or_var_2.shape
]
if np.prod(shape_for_judging_skip_processing) == 1:
    return const_or_var_1, const_or_var_2

Also, arithmetic operations between tensors of the same shape cannot be done due to a wrong transpose_perm.
If x is (1, 384, 384, 3) and y has the same shape (1, 384, 384, 3), transpose_perm in the current version has the value (0, 2, 3, 1), so y is transposed to (1, 384, 3, 384).

I misunderstood, sorry.

For now, I have mostly fixed these bugs. Do you mind if I open a PR after checking some patterns to make sure the bugs are fixed?

@PINTO0309
Owner

I see. Thanks.

The current explicit_broadcast has a bug. It returns swapped operands if const_or_var_2.shape is all 1's. This causes a wrong calculation when a constant is one of the operands. For example, (1 - x) is calculated as (x - 1) in the current version.

Consider switching the order of the checks as follows.

# If const_or_var_2.shape is all 1's, do not broadcast and return as is
shape_for_judging_skip_processing = [
    i if i is not None else INF_INDEX_VALUE for i in const_or_var_2.shape
]
if np.prod(shape_for_judging_skip_processing) == 1:
    return const_or_var_1, const_or_var_2

# Swap: len(const_or_var_1.shape) > len(const_or_var_2.shape)
if len(const_or_var_1.shape) < len(const_or_var_2.shape):
    const_or_var_1, const_or_var_2 = const_or_var_2, const_or_var_1
    graph_node_input_name1, graph_node_input_name2 = graph_node_input_name2, graph_node_input_name1

Also, arithmetic operations between tensors of the same shape cannot be done due to a wrong transpose_perm.
If x is (1, 384, 384, 3) and y has the same shape (1, 384, 384, 3), transpose_perm in the current version has the value (0, 2, 3, 1), so y is transposed to (1, 384, 3, 384).

Consider adding logic to check whether all dimensions already match before calculating transpose_perm (a rough sketch below). However, I believe there are very few situations in the first place where a constant with the same shape as the tensor appears on the NCHW side of ONNX.
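A rough sketch of such a guard, reusing the variable names from the snippet above (assumption: placed inside explicit_broadcast before transpose_perm is calculated):

# If both operands already have identical, fully-known shapes,
# an element-wise op needs no broadcast and no transpose.
shape_1 = list(const_or_var_1.shape)
shape_2 = list(const_or_var_2.shape)
if None not in shape_1 and shape_1 == shape_2:
    return const_or_var_1, const_or_var_2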

@PINTO0309
Owner

PINTO0309 commented Nov 21, 2022

For now, I have mostly fixed these bugs. Do you mind if I open a PR after checking some patterns to make sure the bugs are fixed?

Sorry. I was so focused on the text pointing out the bug that I missed this last sentence.

Of course. You are welcome. :)
