Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spec] DepthToSpace mode attribute is counter-intuitive #6069

Open
zhenhuaw-me opened this issue Apr 8, 2024 · 1 comment
Open

[Spec] DepthToSpace mode attribute is counter-intuitive #6069

zhenhuaw-me opened this issue Apr 8, 2024 · 1 comment
Assignees
Labels
contributions welcome documentation Issues related to ONNX documentation operator Issues related to ONNX operators question Questions about ONNX spec clarification Clarification of the ONNX spec needed spec
Milestone

Comments

@zhenhuaw-me
Copy link
Member

I want to discuss our spec of the DepthToSpace operator. Please help to current me if any misunderstanding. Thanks! (I don't want to mark this as a bug as it is not.)

For this operator, the mode attribute defaults to DCR is counter-intuitive. IMO, there are 2 problems.

The two problems

1. We should use CRD as the default

DepthToSpace default mode should be compatible with the framework that defines it.

In practice, if a framework default tensor layout is NCHW, the DepthToSpace operator should assume to rearrange the C dimension to HW dimensions in an NCHW approach (i.e. CRD).

TensorFlow is an example of default to NHWC tensor layout. So, for ONNX, we should use CRD as the default.

tf.nn.depth_to_space(
    input, block_size, data_format='NHWC', name=None
)

2. Our mode spec is confusing

In our spec:

By default, mode = DCR. In the DCR mode, elements along the depth dimension from the input tensor are rearranged in the following order: depth, column, and then row.

It is not wrong, but DepthToSpace and SpaceToDepth are manipulating the C dimension with HW dimensions. Using "depth, column, and then row" is confusing as it doesn't make sense if 3D image. We can argue that 3D is unsupported, but that doesn't solve the problem.

Using something like NCHW and NHWC is more easy to understand.

What to do?

I suggest:

  1. Rename the attribute to NCHW and NHWC. We can provide compatibility implicitly, and deprecate/remove legacy ones in the long term.
  2. Make NCHW as the default. I am not sure if we have done this before, it's kind of tricky as it would impact the default API behavior...
@zhenhuaw-me zhenhuaw-me added question Questions about ONNX operator Issues related to ONNX operators ir documentation Issues related to ONNX documentation spec spec clarification Clarification of the ONNX spec needed labels Apr 8, 2024
@zhenhuaw-me zhenhuaw-me self-assigned this Apr 8, 2024
@zhenhuaw-me
Copy link
Member Author

@onnx/sig-operators for viz.

@justinchuby justinchuby removed the ir label Apr 9, 2024
@justinchuby justinchuby added this to the 1.17 milestone May 7, 2024
@justinchuby justinchuby modified the milestones: 1.17, 1.16.1, 1.18 May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contributions welcome documentation Issues related to ONNX documentation operator Issues related to ONNX operators question Questions about ONNX spec clarification Clarification of the ONNX spec needed spec
Projects
None yet
Development

No branches or pull requests

2 participants