
[Tracking] Spec clarifications #3651

Open
jcwchen opened this issue Aug 13, 2021 · 22 comments
Labels
contributions welcome, documentation (Issues related to ONNX documentation), enhancement (Request for new feature or operator), spec clarification (Clarification of the ONNX spec needed), tracking (Tracking issues)

Comments

@jcwchen (Member) commented Aug 13, 2021

Creating this issue to track work items for spec clarifications. There are already multiple GitHub issues about op specs being unclear or having errors in expected outputs. I will collect them in this issue. Feel free to add more work items here. Thanks!

@jcwchen added the documentation and enhancement labels on Aug 13, 2021
@jcwchen pinned this issue on Aug 13, 2021
@jcwchen (Member Author) commented Aug 13, 2021

Old issues which still look valid:

Pooling related

Conv/ConvTranspose related

RNN/LSTM related

Other operators

IR related

@AlexandreEichenberger (Contributor)

The Conv operation does not specify whether M (channels out) must be a multiple of the number of groups. TF Keras and PyTorch have this restriction. #3641
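A minimal sketch of why the restriction arises (the shapes here are hypothetical): each group maps C/group input channels to M/group output channels, so both C and M need to divide evenly by group.

```python
import numpy as np

# Hypothetical shapes for a grouped Conv: each group convolves C/group input
# channels and produces M/group output channels.
N, C, H, W = 1, 8, 16, 16     # input:  (N, C, H, W)
M, group = 6, 4               # weight: (M, C/group, kH, kW)

# A simple divisibility check mirroring the restriction TF Keras / PyTorch enforce.
if M % group != 0 or C % group != 0:
    raise ValueError(f"M={M} and C={C} must both be multiples of group={group}")
```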

@kraiskil

The difference between Add and Sum could be more explicit, and it would be nice to have that difference mentioned in both operators' documentation.
#339 says there were some differences in broadcasting support, but that no longer seems to be the case.
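For reference, a small numpy sketch of the practical difference as I understand it: Add is strictly binary, Sum is variadic, and both now use Numpy-style broadcasting.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([10.0, 20.0])   # broadcastable to a's shape
c = np.array(100.0)          # scalar, also broadcastable

add_result = a + b           # Add: exactly two inputs
sum_result = a + b + c       # Sum: any number of inputs (variadic)

print(add_result)
print(sum_result)
```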

@tungld commented Oct 5, 2021

Expected outputs for Round look incorrect: #3755.

This one has been resolved. (Edited by @jcwchen)

@gramalingam (Contributor)

The description of the Cast operator could be improved (especially for conversions involving bool or string).

@jcwchen (Member Author) commented Dec 3, 2021

[Spec issue] Rounding or truncation behavior for Cast operator is not specified in the spec #3876
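A small numpy illustration of why this matters: truncation and rounding give different integers for the same float input, so whichever behavior the spec picks is observable.

```python
import numpy as np

x = np.array([1.5, 2.5, -1.5, 2.7], dtype=np.float32)

truncated = x.astype(np.int32)         # C-style truncation toward zero
rounded = np.rint(x).astype(np.int32)  # round half-to-even, then convert

print(truncated)  # [ 1  2 -1  2]
print(rounded)    # [ 2  2 -2  3]
```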

@gramalingam (Contributor)

DequantizeLinear/QuantizeLinear: Spec should be more explicit about whether the inputs must be 2D or can be N-D, and explain the behavior in the N-D case better.
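A hedged sketch of what per-axis dequantization on an N-D input could look like; the helper name and the axis handling are my reading, not necessarily the spec's intent.

```python
import numpy as np

def dequantize_per_axis(x, scale, zero_point, axis=1):
    """Sketch: broadcast 1-D scale/zero_point along `axis` of an N-D tensor."""
    shape = [1] * x.ndim
    shape[axis] = -1
    s = scale.reshape(shape)
    z = zero_point.reshape(shape)
    return (x.astype(np.float32) - z.astype(np.float32)) * s

x = np.random.randint(0, 256, size=(2, 3, 4, 4), dtype=np.uint8)
scale = np.array([0.1, 0.2, 0.3], dtype=np.float32)
zero_point = np.array([128, 128, 128], dtype=np.uint8)
print(dequantize_per_axis(x, scale, zero_point, axis=1).shape)  # (2, 3, 4, 4)
```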

@gramalingam (Contributor)

QuantizeLinear: Spec should be clearer about the precision used for the different operators when using mixed-precision inputs (in computing x / y_scale + y_zero_point).
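To make the ambiguity concrete, here is one plausible reading (float32 intermediate, round half-to-even, then saturate); the spec could equally intend a different intermediate precision or rounding mode, which is exactly the open question.

```python
import numpy as np

def quantize_linear(x, y_scale, y_zero_point):
    # Assumption: divide in float32, round half-to-even, add the zero point,
    # then saturate to the zero-point's dtype (uint8 here).
    v = np.rint(x.astype(np.float32) / np.float32(y_scale)) + np.int32(y_zero_point)
    info = np.iinfo(y_zero_point.dtype)
    return np.clip(v, info.min, info.max).astype(y_zero_point.dtype)

x = np.array([-1.0, 0.0, 0.5, 254.0], dtype=np.float32)
print(quantize_linear(x, y_scale=1.0, y_zero_point=np.uint8(2)))  # [  1   2   2 255]
```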

@jcwchen added the tracking label on Jan 20, 2022
@volcacius

LSTM: when input_forget=True, the spec doesn't specify which set of weights and biases ends up being used. It should be clarified that it's implementation dependent (e.g. ONNX Runtime uses the input ones and discards the forget ones: https://github.com/microsoft/onnxruntime/blob/ef7b4dc05cae1084546ff7caba048a2f908ac1d8/onnxruntime/test/providers/cpu/rnn/LSTM.py#L208).

@garymm (Contributor) commented Feb 10, 2022

Prelu: The constraint "rank(slope) == rank(X) OR slope is unidirectionally broadcastable to X" makes sense, but the spec doesn't explicitly say that. That is, it seems to allow rank(slope) > rank(X), which doesn't seem like it makes sense (if it does, could someone please explain?).

https://github.com/onnx/onnx/blob/main/docs/Operators.md#Prelu
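For what it's worth, a numpy sketch of the case that does make sense (a lower-rank slope, unidirectionally broadcastable to X); the prelu helper here is my own illustration, not the spec's reference implementation.

```python
import numpy as np

def prelu(x, slope):
    # slope is expected to be unidirectionally broadcastable to x
    return np.where(x >= 0, x, slope * x)

x = np.random.randn(2, 3, 4, 5).astype(np.float32)
slope = np.full((3, 1, 1), 0.25, dtype=np.float32)  # per-channel slope, rank 3 < rank 4

print(prelu(x, slope).shape)  # (2, 3, 4, 5)
```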

@jcwchen (Member Author) commented Mar 25, 2022

I have the same question as #2303: "Do subgraph initializer names and input names shadow outer-scope names?" It would be better to clarify this in the IR spec.

@gramalingam (Contributor)

The specification of the "Pad" operator does not describe the intended behavior when there is an interaction between the use of negative pad-values (to truncate/slice) and the use of reflection (with a positive pad-value for the opposite side). Please see microsoft/onnxruntime#11828 for a detailed description.
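A numpy sketch of why the interaction is ambiguous: applying the negative pad (as a slice) before or after the reflection pad can produce different results, and the spec doesn't say which order is intended. numpy.pad itself rejects negative pads, so the slice is emulated explicitly here.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])
neg_begin, pos_end = 3, 4  # pads = [-3, 4] with mode='reflect'

# Interpretation 1: slice first, then reflect-pad the remainder.
out1 = np.pad(data[neg_begin:], (0, pos_end), mode="reflect")

# Interpretation 2: reflect-pad first, then slice off the beginning.
out2 = np.pad(data, (0, pos_end), mode="reflect")[neg_begin:]

print(out1)
print(out2)  # can differ from out1
```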

@gramalingam (Contributor)

The storage_order attribute of the MaxPool op seems out of place in the ONNX spec. Does the op need to be updated to eliminate/deprecate this attribute?

@AlexandreEichenberger (Contributor)

Minor point, but the naming of inputs matters in our onnx-mlir code base, and we had to write extra templates because A & B for MatMul, MatMulInteger, and QLinearMatMul are capitalized differently.

Use A & B:

Use a and b:

I know this may not matter to most, and it is not a big deal, but nevertheless a name discrepancy between related ops whose shape inference, for example, could otherwise be "commoned" creates minor impediments due to naming conventions.

If this can be fixed without triggering incompatibilities between before/after renaming, then I am for it. If it would trigger incompatibilities, then let us just remember to use names as consistently as possible for future ops.

Thanks

@gramalingam (Contributor)

> If this can be fixed without triggering incompatibilities between before/after renaming, then I am for it. If it would trigger incompatibilities, then let us just remember to use names as consistently as possible for future ops.

My own guess would be that the names should not matter. But, paradoxically, it looks like they do matter for onnx-mlir.

I am in favor of using uniform naming conventions. If anyone has any concerns about changing parameter names of ops (without changing their version-number), please let us know.

@AlexandreEichenberger (Contributor)

@gramalingam thanks for your answers

Found another case: Conv/ConvTranspose use X and W, but ConvInteger/QLinearConv use x and w.

@gramalingam (Contributor)

Another couple of issues relating to Reduce* ops:

(a) I assume that the Reduce* ops should support the special-case of zero-rank tensors (with a single element). It would be good if the spec explicitly clarifies this.

(b) The spec does not indicate what happens if the input is a tensor with zero elements: e.g., a tensor of shape (100, 0) with reduction along the axis with size 0 (axis 1 in above example).
(i) The ideal answer for ops like Sum is 0, and Prod is 1. (The identity element for the op.)
(ii) However, for other ops like Max or Min, it can be complicated. An ideal answer for Max() is minus infinity, but that is valid only for float-like types. For integral types, it could be min-int.

In any case, some clarification whether this is allowed would be useful.
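numpy's behavior for the easy cases, for comparison (Max/Min simply raise, which is why the identity-element question is the tricky part):

```python
import numpy as np

empty = np.zeros((100, 0), dtype=np.float32)

print(np.sum(empty, axis=1)[:3])   # zeros: identity of addition
print(np.prod(empty, axis=1)[:3])  # ones: identity of multiplication
try:
    np.max(empty, axis=1)
except ValueError as e:
    print("max over an empty axis raises:", e)
```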

@justinchuby (Contributor)

Max and Min can both take rank-0 tensors as inputs, but that was not clear from the spec.
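A tiny sketch of the rank-0 case with numpy semantics, which is what I would expect the ops to follow:

```python
import numpy as np

x = np.array(3.0, dtype=np.float32)  # rank-0 tensor
y = np.array(5.0, dtype=np.float32)  # rank-0 tensor

print(np.maximum(x, y))  # 5.0, also rank-0
print(np.minimum(x, y))  # 3.0, also rank-0
```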

@edgchen1 (Contributor)

Regarding the Pad op spec (onnx/docs/Operators.md, lines 17129 to 17142 at 0e9deba):

### <a name="Pad"></a><a name="pad">**Pad**</a>
Given a tensor containing the data to be padded (`data`), a tensor containing the number of start and end pad values for axis (`pads`), (optionally) a `mode`, and (optionally) `constant_value`,
a padded tensor (`output`) is generated.
The three supported `modes` are (similar to corresponding modes supported by `numpy.pad`):
1) `constant`(default) - pads with a given constant value as specified by `constant_value` (which defaults to 0, empty string, or False)
2) `reflect` - pads with the reflection of the vector mirrored on the first and last values of the vector along each axis
3) `edge` - pads with the edge values of array
4) `wrap` - wrap-around padding as if the data tensor forms a torus

  • The first sentence should probably also mention the optional axes input.
  • "The three supported modes are..." is stated, but then four are given (all four map onto numpy.pad modes; see the sketch below).
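All four listed modes correspond directly to numpy.pad modes, which makes the count mismatch easy to see:

```python
import numpy as np

data = np.array([1, 2, 3])
for mode in ("constant", "reflect", "edge", "wrap"):
    print(mode, np.pad(data, (2, 2), mode=mode))
```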

@edgchen1 (Contributor) commented Jun 13, 2023

Regarding the MeanVarianceNormalization op spec (onnx/docs/Operators.md, lines 15221 to 15226 at 0f53636):

#### Attributes
<dl>
<dt><tt>axes</tt> : list of ints (default is ['0', '2', '3'])</dt>
<dd>A list of integers, along which to reduce. The default is to caculate along axes [0,2,3] for calculating mean and variance along each channel. Two variables with the same C-coordinate are associated with the same mean and variance.</dd>
</dl>

  • How should the default axes value be interpreted for inputs with fewer than four dimensions? The "C" reference seems specific to NCHW. This op does not otherwise seem limited to 4D input, though (see the sketch after this list).
  • Are negative axis values supported?
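A rough numpy sketch of what the default axes=[0, 2, 3] computes for a 4D NCHW input, which is what makes the lower-rank question unclear; the normalization formula and the eps term here are my paraphrase for the sketch, not a verified copy of the spec.

```python
import numpy as np

def mvn(x, axes=(0, 2, 3), eps=1e-9):
    # Per-channel mean/variance for NCHW when axes=(0, 2, 3); with fewer than
    # 4 dims these default axes would be out of range, hence the question above.
    # eps is added only for numerical safety in this sketch.
    mean = np.mean(x, axis=axes, keepdims=True)
    var = np.var(x, axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(2, 3, 4, 4).astype(np.float32)
print(mvn(x).shape)  # (2, 3, 4, 4)
```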

@justinchuby (Contributor)
github-merge-queue bot pushed a commit that referenced this issue Sep 23, 2023
Clarify the behavior of the reduction-ops when reducing an empty set of
values, by updating the test-cases and documentation.

It is useful in various edge-cases. For example, ReduceProd should
return 1 for an empty tensor, and ReduceSum should return 0 for an empty
tensor.

(See #3651 (comment))

### Summary

ReduceSum ({}) = 0
ReduceProd ({}) = 1
ReduceMin ({}) = Max. value of datatype
ReduceMax ({}) = Min. value of datatype
ReduceLogSum ({}) = minus infinity, or undefined for datatypes without minus infinity
ReduceLogSumExp ({}) = minus infinity, or undefined for datatypes without minus infinity
ReduceMean ({}) = Undefined

---------

Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: gramalingam <gramalingam@users.noreply.github.com>
@justinchuby unpinned this issue on Apr 10, 2024