[Spec] DepthToSpace `mode` attribute is counter-intuitive #6069

zhenhuaw-me · 2024-04-08T03:10:35Z

I want to discuss our spec of the DepthToSpace operator. Please correct me if I have misunderstood anything. Thanks! (I am not marking this as a bug because it is not one.)

For this operator, having the `mode` attribute default to `DCR` is counter-intuitive. IMO, there are two problems.

The two problems

1. We should use `CRD` as the default

The default mode of DepthToSpace should match the default tensor layout of the framework that defines the operator.

In practice, if a framework's default tensor layout is NCHW, the DepthToSpace operator should be assumed to rearrange the C dimension into the H and W dimensions in an NCHW manner (i.e. CRD).

TensorFlow is an example of a framework that defaults to the NHWC tensor layout, as its signature shows. ONNX, by contrast, takes NCHW input for this operator, so we should use CRD as the default.

tf.nn.depth_to_space(
    input, block_size, data_format='NHWC', name=None
)
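To make the layout argument concrete, here is a minimal NumPy sketch. `depth_to_space_nhwc` is a hypothetical helper that emulates the NHWC rearrangement as I understand TensorFlow's documented behavior (it does not call TensorFlow), and `depth_to_space_nchw` follows the reshape/transpose recipe given in the ONNX operator documentation. Under those assumptions, the NHWC result transposed to NCHW lines up with DCR, not CRD:

```python
import numpy as np

def depth_to_space_nhwc(x, bs):
    # Assumed NHWC semantics: the channel axis is split as (block_row, block_col, C').
    n, h, w, c = x.shape
    t = x.reshape(n, h, w, bs, bs, c // (bs * bs))
    t = t.transpose(0, 1, 3, 2, 4, 5)  # n, h, block_row, w, block_col, C'
    return t.reshape(n, h * bs, w * bs, c // (bs * bs))

def depth_to_space_nchw(x, bs, mode):
    # NCHW rearrangement for the two ONNX modes.
    n, c, h, w = x.shape
    if mode == "DCR":
        t = x.reshape(n, bs, bs, c // (bs * bs), h, w)
        t = t.transpose(0, 3, 4, 1, 5, 2)
    else:  # "CRD"
        t = x.reshape(n, c // (bs * bs), bs, bs, h, w)
        t = t.transpose(0, 1, 4, 2, 5, 3)
    return t.reshape(n, c // (bs * bs), h * bs, w * bs)

x_nchw = np.arange(1 * 8 * 2 * 3).reshape(1, 8, 2, 3)
x_nhwc = x_nchw.transpose(0, 2, 3, 1)

# Run the NHWC version, then bring the result back to NCHW for comparison.
ref = depth_to_space_nhwc(x_nhwc, 2).transpose(0, 3, 1, 2)
matches_dcr = np.array_equal(ref, depth_to_space_nchw(x_nchw, 2, "DCR"))
matches_crd = np.array_equal(ref, depth_to_space_nchw(x_nchw, 2, "CRD"))
print(matches_dcr, matches_crd)  # True False
```

The two modes only differ when the output has more than one channel, which is why the example uses 8 input channels with a block size of 2.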

2. Our `mode` spec is confusing

In our spec:

By default, mode = DCR. In the DCR mode, elements along the depth dimension from the input tensor are rearranged in the following order: depth, column, and then row.

It is not wrong, but DepthToSpace and SpaceToDepth rearrange data between the C dimension and the H and W dimensions. Describing this as "depth, column, and then row" is confusing, since the wording no longer makes sense for 3D images. We can argue that 3D is unsupported, but that doesn't solve the problem.

Something like NCHW and NHWC would be easier to understand.
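For reference, the two modes can be compared on a tiny input. This is a sketch using the reshape/transpose recipe from the ONNX operator documentation; the helper name `depth_to_space` and the 8-channel, 1x1 input are chosen only for illustration:

```python
import numpy as np

def depth_to_space(x, bs, mode="DCR"):
    # x is NCHW; C must be divisible by bs * bs.
    n, c, h, w = x.shape
    if mode == "DCR":
        # Channel index is read as (block_row, block_col, out_channel).
        t = x.reshape(n, bs, bs, c // (bs * bs), h, w)
        t = t.transpose(0, 3, 4, 1, 5, 2)
    elif mode == "CRD":
        # Channel index is read as (out_channel, block_row, block_col).
        t = x.reshape(n, c // (bs * bs), bs, bs, h, w)
        t = t.transpose(0, 1, 4, 2, 5, 3)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return t.reshape(n, c // (bs * bs), h * bs, w * bs)

x = np.arange(8).reshape(1, 8, 1, 1)  # channels 0..7, a single pixel
dcr = depth_to_space(x, 2, "DCR")
crd = depth_to_space(x, 2, "CRD")

print(dcr[0].tolist())  # [[[0, 2], [4, 6]], [[1, 3], [5, 7]]]
print(crd[0].tolist())  # [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
```

In CRD, consecutive input channels fill one output channel's block, which is what an NCHW reader would expect; in DCR, the output-channel index varies fastest.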

What to do?

I suggest:

Rename the attribute values to NCHW and NHWC. We can keep accepting the legacy values for compatibility, and deprecate/remove them in the long term.
Make NCHW the default. I am not sure whether we have changed a default like this before; it is tricky because it would change the default API behavior.
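As a rough sketch of what the compatibility path could look like: the alias table and the `resolve_mode` helper below are hypothetical and not part of ONNX, and mapping NHWC to DCR and NCHW to CRD is my reading of the proposal rather than anything the spec states.

```python
# Hypothetical mapping from the proposed layout-style names to the existing values.
LAYOUT_TO_MODE = {"NHWC": "DCR", "NCHW": "CRD"}

def resolve_mode(mode="NCHW"):
    """Accept both the legacy values and the proposed layout-style names."""
    if mode in ("DCR", "CRD"):
        return mode  # legacy values keep working unchanged
    if mode in LAYOUT_TO_MODE:
        return LAYOUT_TO_MODE[mode]
    raise ValueError(f"unsupported DepthToSpace mode: {mode!r}")

print(resolve_mode())        # CRD
print(resolve_mode("NHWC"))  # DCR
print(resolve_mode("DCR"))   # DCR
```

Changing the default in this way would alter the behavior of models that omit `mode`, so it would presumably need a new operator version.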


zhenhuaw-me · 2024-04-08T03:11:46Z

@onnx/sig-operators for viz.

zhenhuaw-me added the question, operator, ir, documentation, spec, and spec clarification labels Apr 8, 2024

zhenhuaw-me self-assigned this Apr 8, 2024

justinchuby removed the ir label Apr 9, 2024

justinchuby added this to the 1.17 milestone May 7, 2024

justinchuby added the contributions welcome label May 7, 2024

justinchuby modified the milestones: 1.17, 1.16.1, 1.18 May 7, 2024
