Type of some parameters should match the input data type #442

Honry · 2023-07-14T06:16:05Z

pad(): MLPadOptions::value is a float in the current spec. It is used as the padding value when MLPadOptions::mode is "constant", so its type should exactly match the input data type; otherwise precision may be lost.
(ONNX Pad-18 requires the input data and the constant value (a single scalar input tensor) to be the same type T.)
clamp(): MLClampOptions::minValue and MLClampOptions::maxValue are specified as float scalars in the current spec, and have the same problem.

And for the v2 ops, we will also have to consider these: (thanks @fdwr for providing the list)

struct DML_FILL_VALUE_CONSTANT_OPERATOR_DESC
{
    const DML_TENSOR_DESC* OutputTensor;
    DML_TENSOR_DATA_TYPE ValueDataType;
    DML_SCALAR_UNION Value;
};

struct DML_FILL_VALUE_SEQUENCE_OPERATOR_DESC
{
    const DML_TENSOR_DESC* OutputTensor;
    DML_TENSOR_DATA_TYPE ValueDataType;
    DML_SCALAR_UNION ValueStart;
    DML_SCALAR_UNION ValueDelta;
};

struct DML_DIAGONAL_MATRIX1_OPERATOR_DESC
{
    _Maybenull_ const DML_TENSOR_DESC* InputTensor;
    const DML_TENSOR_DESC* OutputTensor;
    DML_TENSOR_DATA_TYPE ValueDataType;
    DML_SCALAR_UNION Value;
    INT DiagonalFillBegin;
    INT DiagonalFillEnd;
};


fdwr · 2023-07-14T21:26:45Z

Agreed. This may seem like an inconsequential matter because float32 has such a wide range, but float32 cannot exactly represent every int32 value above 2^24 (about 16 million; odd numbers get rounded to a neighboring even value), and the problem gets worse if we extend to larger types like int64. There have already been cases in ONNX models where special sentinel values such as INT32_MAX were used, and these lose their exact value when converted to float32.
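The rounding described above can be observed directly in JavaScript. This is a small illustration rather than anything from the spec: Math.fround rounds a JS Number to the nearest float32 value, and the constant names are only for the example.

```javascript
// Math.fround rounds a JS Number (float64) to the nearest float32 value.
const F32_EXACT_LIMIT = 2 ** 24; // 16777216: above this, float32 spacing is 2 or more

// An odd integer just above the limit cannot be stored exactly in float32.
const oddAboveLimit = F32_EXACT_LIMIT + 1;
const roundedOdd = Math.fround(oddAboveLimit);

// INT32_MAX used as a sentinel value rounds up to 2^31, which no longer fits in int32.
const INT32_MAX = 2147483647;
const roundedMax = Math.fround(INT32_MAX);

console.log(oddAboveLimit, "->", roundedOdd); // 16777217 -> 16777216
console.log(INT32_MAX, "->", roundedMax);     // 2147483647 -> 2147483648
```

So a sentinel such as INT32_MAX passed through a float-typed option would arrive as a different integer, and one that is out of range for the original type.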

🤔 I wonder what the IDL would look like though? (I'm not that familiar with the IDL syntax in its murkier corners)

dictionary MLPadOptions {
  MLPaddingMode mode = "constant";
  ???? value = 0; // What goes here? Is there any sort of multi-valued scalar in JS IDL, like C++ union?
};

dictionary MLFillSequenceOptions {
  ??? start = 0;
  ??? delta = 1;
};

...

Should it use an or with multiple types? I see there is precedent for an or description in split: split(..., (unsigned long or sequence<unsigned long>) splits, ...);

inexorabletash · 2023-10-13T23:17:36Z

Re: "Is there any sort of multi-valued scalar in JS IDL, like C++ union?"

WebIDL has union types (a or b or c) which can be given a name via typedef, but the types must be distinguishable. Fundamentally, this part of WebIDL is about mapping incoming JavaScript types into distinct IDL types and applying the appropriate conversion logic. So you can have a union of (DOMString or float) because IDL rules allow distinguishing a JS Number and a JS String into one of those two types. But a union of (float or double) is not permitted, since there's no way to tell which type at the IDL layer to convert a JS Number to.

For now, my best suggestion is to accept unrestricted double which is equivalent to any JS Number (64-bit IEEE 754 FP), and write an algorithm in prose for narrowing, which should reference the same logic as WebIDL e.g. https://webidl.spec.whatwg.org/#es-integer-types and https://webidl.spec.whatwg.org/#es-float etc.
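As a rough sketch of what such a narrowing step could look like in script terms (the helper name narrowNumber and the handful of data types are assumptions for illustration, not spec text): Math.fround rounds the way the WebIDL unrestricted float conversion does, and the `| 0` and `>>> 0` operators wrap modulo 2^32 like the default WebIDL long and unsigned long conversions, i.e. without [Clamp] or [EnforceRange].

```javascript
// Hypothetical helper, not part of WebNN or WebIDL: narrow a JS Number
// (already accepted as an unrestricted double) to a target data type.
function narrowNumber(value, dataType) {
  switch (dataType) {
    case "float32":
      // Round to nearest float32; out-of-range values become +/-Infinity.
      return Math.fround(value);
    case "int32":
      // Truncate toward zero and wrap modulo 2^32; NaN and infinities become 0.
      return value | 0;
    case "uint32":
      // Same wrapping, interpreted as unsigned.
      return value >>> 0;
    default:
      throw new TypeError(`Unsupported data type: ${dataType}`);
  }
}

console.log(narrowNumber(3.7, "int32"));        // 3
console.log(narrowNumber(2147483648, "int32")); // -2147483648 (wrapped)
console.log(narrowNumber(-1, "uint32"));        // 4294967295
```

Whether out-of-range values should wrap, clamp, or throw is exactly the kind of decision the prose algorithm would have to make; wrapping is used here only because it is the WebIDL default.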

In the future, you can do a union of (bigint or unrestricted double) following the guidance at https://webidl.spec.whatwg.org/#limit-bigint-numeric-unions

Other options include using any and writing prose for what to accept/reject following the conversions in https://webidl.spec.whatwg.org/#es-any but I think that ends up being a superset of the above.

inexorabletash · 2024-02-16T23:20:56Z

Re: unrestricted double above - that may have been bad guidance. Are infinities valid? Is NaN valid? If not, just use double.

fdwr · 2024-02-16T23:40:52Z

Are infinities valid? Is NaN valid?

Josh: Definite yes for infinity. NaN is more arguable, but they have their use (and many libraries have dedicated NaN testing operators - tf.math.is_nan, torch.isnan, ONNX IsNaN).
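For reference, the practical difference between the two IDL types can be emulated in script. This is a simplified sketch with made-up helper names: WebIDL's double conversion throws a TypeError for NaN and the infinities, while unrestricted double passes them through.

```javascript
// Hypothetical emulation of the two WebIDL conversions (simplified).
// Unary plus applies ToNumber, which throws a TypeError for bigint inputs.
function toUnrestrictedDouble(value) {
  return +value;
}

function toDouble(value) {
  const x = +value;
  if (!Number.isFinite(x)) {
    throw new TypeError("The provided double value is non-finite.");
  }
  return x;
}

console.log(toUnrestrictedDouble(Infinity)); // Infinity
try {
  toDouble(Infinity);
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```

With plain double, a call such as clamp() with an infinite bound would be rejected before the operator ever saw the value.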

fdwr · 2024-02-16T23:42:47Z

And quite related to this is the "fill sequence" constant overload: #492
We want to be consistent between the two.

inexorabletash · 2024-04-10T19:09:27Z

FYI, I have a local change for this, but will wait for the PR queue to drain. I went with this definition:

typedef (bigint or unrestricted double) MLNumber;

And then use MLNumber for constant(value, type), constant(start, end, step, type) (see #571 and #492), MLClampOptions and MLPadOptions.

unrestricted - because Infinity should be allowed, per above
double - per #325 ("Clarify the usage of 32 bit floating point type and consider using double"), it's unclear that limiting to float in the IDL is useful unless we really really want the semantics of https://webidl.spec.whatwg.org/#js-float

Bikeshedding on the name is welcome. :)
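To illustrate why the union works where (float or double) would not: a JS bigint and a JS number are distinguishable with typeof, so an implementation can pick the conversion from the operand data type. The sketch below is hypothetical; castMLNumber, the two data types shown, and the choice to reject non-finite numbers for int64 are assumptions for illustration, not the spec's algorithm.

```javascript
// Hypothetical sketch of converting an MLNumber (bigint or number)
// according to an explicit or implied operand data type.
function castMLNumber(value, dataType) {
  if (typeof value !== "bigint" && typeof value !== "number") {
    throw new TypeError("MLNumber must be a bigint or a number");
  }
  switch (dataType) {
    case "float32":
      // bigint inputs are converted to Number first, then rounded to float32.
      return Math.fround(Number(value));
    case "int64":
      if (typeof value === "bigint") {
        return BigInt.asIntN(64, value); // wrap to the signed 64-bit range
      }
      if (!Number.isFinite(value)) {
        // Assumption: NaN and infinities are rejected for integer types.
        throw new RangeError("Non-finite value for an integer data type");
      }
      return BigInt.asIntN(64, BigInt(Math.trunc(value)));
    default:
      throw new TypeError(`Unsupported data type: ${dataType}`);
  }
}

const big = 9007199254740993n;                 // 2^53 + 1
console.log(Number(big));                      // 9007199254740992 (precision lost)
console.log(castMLNumber(big, "int64"));       // 9007199254740993n (exact)
console.log(castMLNumber(Infinity, "float32")); // Infinity
```

Passing the bigint through unchanged is what keeps int64 values above 2^53 exact, while numbers still cover infinities and fractional values for the floating-point types.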

fdwr · 2024-04-10T22:46:51Z

FYI, I have a local change for this, but will wait for the PR queue to drain. I went with this definition:
typedef (bigint or unrestricted double) MLNumber;
...
Bikeshedding on the name is welcome. :)

@inexorabletash Yeah, I suppose it makes sense to just have a common numeric scalar type (rather than repeated definitions in each of the function prototypes) if we're going to be sharing this across multiple operators - clamp, pad, fill constant, fill sequence, diagonal matrix... I originally thought including "scalar" in the name would make sense, but then JS has a "Number" type, and so MLNumber makes sense. So, 👍.

huningxin · 2024-04-11T01:57:20Z

typedef (bigint or unrestricted double) MLNumber;

looks good!

For some MLGraphBuilder methods the type of a numeric input can vary - e.g. for constant() an explicit MLOperandDataType is provided; for clamp() and pad() the data type is implied by input operands. In these cases, specifying the numeric value as either a float/double or int64 type runs into accuracy or range issues - you can't accurately represent all int64 values as a double, and you can't represent the full range of floats as int64. (You also can't represent all int64 values as a long long either - over 2^53 things get weird. But that's a digression.)

Per discussion in whatwg/webidl#1388 this change introduces a union between a bigint type and unrestricted double called MLNumber. Callers can pass a JS number (1234, 1.1234e38) or a JS bigint (9007199254740993n), and the implementation will treat it properly based on the explicit or implicit MLOperandDataType. Usage of this type should be limited to only those cases.

Fixes webmachinelearning#442

Note that webmachinelearning#492 proposes changes to the constant sequential filling operation; this just adds IDL to match the current spec prose.

Some of the concerns raised in webmachinelearning#325 are addressed (e.g. clamp()'s options). However, several other options are still specified as "float", and should maybe be "double" - but MLNumber is likely not appropriate for those, so they are not updated here.

inexorabletash added the operator specific label Feb 15, 2024

This was referenced Feb 17, 2024

Intent to use bigint/numeric union whatwg/webidl#959

Closed

Intent to use BigInt/numeric union in WebNN whatwg/webidl#1388

Open

inexorabletash self-assigned this Apr 10, 2024

inexorabletash mentioned this issue Apr 18, 2024

Introduce MLNumber for specifying numeric inputs of any type #647

Merged

anssiko added the feature request label Apr 25, 2024

inexorabletash mentioned this issue Jun 27, 2024

WebML WG Teleconference – 27 June 2024 - Open issues and PRs webmachinelearning/meetings#24

Open

huningxin closed this as completed in #647 Jul 5, 2024

huningxin closed this as completed in 9f88ebf Jul 5, 2024
