Allow 0 size dimensions #391
Comments
References
All the major Python ML APIs handle 0-size dimensions robustly and do not consider them degenerate:
NumPy
import numpy
x = numpy.ones(shape=(2,0,2), dtype=numpy.float32)
y = numpy.add(x, x)
print("NumPy:")
print("value:", y)
print("shape:", y.shape)
# Prints:
# value: []
# shape: (2, 0, 2)
TensorFlow
import tensorflow as tf
x = tf.ones(shape=(2,0,2), dtype=tf.float32)
y = tf.add(x, x)
print("TensorFlow:")
print("value:", y)
print("shape:", y.shape)
# Prints:
# value: tf.Tensor([], shape=(2, 0, 2), dtype=float32)
# shape: (2, 0, 2)
PyTorch
import torch
x = torch.ones(size=(2,0,2), dtype=torch.float)
y = torch.add(x, x)
print("PyTorch:")
print("value:", y)
print("shape:", y.shape)
# Prints:
# value: tensor([], size=(2, 0, 2))
# shape: torch.Size([2, 0, 2])
ONNX / ONNX Runtime
import onnx
# 0-size tensor via a 0 in dims.
x = onnx.helper.make_tensor(
name="value", data_type=onnx.TensorProto.FLOAT, dims=[2,0,2], vals=[]
)
print(x)
# Prints:
# dims: 2
# dims: 0
# dims: 2
# data_type: 1
# name: "value"
In ONNX Runtime, these cases are handled as no-ops, either directly by the EP (execution provider) backend, if it handles them gracefully, or by the lower-level code just before it reaches the backend API call. DirectML is an example of the latter: it currently rejects 0's in the dimensions, so the EP skips operator creation while still leaving the overall graph connectivity intact.
XNNPack
Allows them. // see Bin Miao's code below
SafeTensors
The SafeTensors file format (commonly used with Stable Diffusion models for custom weights) explicitly allows 0D scalars and 0-size tensors - "Empty tensors (tensors with 1 dimension being 0) are allowed" and "0-rank Tensors (tensors with shape []) are allowed, they are merely a scalar".
CoreML / MPS / BNNS
? Not evident from documentation:
DirectML
Disallows 0 for DML_BUFFER_TENSOR_DESC::Sizes. The backend must skip the operation. |
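The difference between a 0D scalar and a 0-size tensor, which the SafeTensors quote above spells out, comes down to element count: the product of an empty shape is 1, while any shape containing a 0 has zero elements. A minimal sketch in plain Python follows; `element_count` and `is_empty` are hypothetical helper names used only for illustration, not part of any of the APIs listed above.

```python
import math

def element_count(shape):
    """Number of elements implied by a shape; the empty product is 1."""
    return math.prod(shape)

def is_empty(shape):
    """True when the shape holds no elements, i.e. some dimension is 0."""
    return element_count(shape) == 0

print(element_count(()))         # 0D scalar: exactly one element
print(element_count((2, 0, 2)))  # 0-size tensor: no elements
print(is_empty((2, 0, 2)))
```

A backend that rejects 0 in its sizes could use a check like `is_empty` to decide which operations to skip, while still passing scalars through.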
@miaobin would volunteer to help investigate XNNPACK's support. Thanks! |
After deleting and modifying the errant validation statements in ml_graph_builder.cc and graph_validation_utils.cc, I verified that XNNPack supports both 0D scalars and 0-size tensors through the following two test cases:
Test for 0D scalars:
Test for 0-size tensors:
Both tests passed. |
@miaobin: Great - thanks for investigating and adding the test cases. It will be more interesting for the DirectML backend because the current API rejects zero size tensors, and even if we were to update the API to accept them (and add test cases for all 100+ operators...), the older version would still be on the operating system. So we'll have to do the same thing as was done in ONNX Runtime, where operator creation is bypassed for such operators (the node is left null as a placeholder, and it's not added to the graph later). |
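The bypass described above could look roughly like the following. This is a sketch in plain Python under stated assumptions, not ONNX Runtime's or Chromium's actual code: `Node`, `create_backend_operator`, and `build_graph` are hypothetical names, the backend is assumed to reject any shape containing 0, and only nodes whose output is empty are bypassed (a node with an empty input but a non-empty output would need separate handling).

```python
import math
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    output_shape: tuple

def produces_empty_output(node):
    # A shape with any 0 dimension has zero elements, so there is nothing to compute.
    return math.prod(node.output_shape) == 0

def create_backend_operator(node):
    # Stand-in for a backend call that rejects 0-size dimensions.
    if produces_empty_output(node):
        raise ValueError(f"{node.name}: 0-size dimension rejected by backend")
    return f"op:{node.name}"

def build_graph(nodes):
    # One slot per node; None is the placeholder for a bypassed node.
    slots = [
        None if produces_empty_output(node) else create_backend_operator(node)
        for node in nodes
    ]
    # Only real operators are added to the backend graph.
    compiled = [op for op in slots if op is not None]
    return slots, compiled

nodes = [
    Node("add", (2, 3)),
    Node("add_empty", (2, 0, 2)),
    Node("relu", (2, 3)),
]
slots, compiled = build_graph(nodes)
print(slots)     # the empty node keeps its slot as None
print(compiled)  # the backend never sees the empty node
```

Keeping one slot per node preserves the node indexing of the original graph, so later passes can still look up neighbours even though the backend graph is smaller.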
Evidently LLaMA is another model that can encounter legal 0 size tensors during |
WebNN's slice requires "the size must not be 0". Would this prevent the ONNX model you mentioned from slicing a tensor down to emptiness? |
@huningxin: 🤔 It could, as TF and ONNX support 0 size slices (see below). Granted, it's unlikely a TF or ONNX model would typically contain a 0-slice window (ends - starts = 0), but it could occur indirectly as a result of a model generation process and manipulating some other variable:
TF
import tensorflow as tf
values = tf.constant([0, 1, 2, 3, 4, 5], dtype=tf.uint8)
result = tf.slice(values, [1], [0])  # begin=1, size=0: a 0-size slice
print("value:", result)
print("shape:", result.shape)
ONNX
|
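To make the slice question concrete, here is a sketch of slice shape validation that accepts a 0-size window instead of rejecting it. It is plain Python written for illustration only: `slice_output_shape` is a hypothetical helper, not WebNN's, TensorFlow's, or ONNX's validation code, and it takes begin/size parameters in the style of `tf.slice`.

```python
def slice_output_shape(input_shape, starts, sizes):
    """Validate a (starts, sizes) slice and return the output shape.

    A size of 0 is accepted and yields a 0-size output dimension.
    """
    if not (len(input_shape) == len(starts) == len(sizes)):
        raise ValueError("starts and sizes must match the input rank")
    for dim, start, size in zip(input_shape, starts, sizes):
        if start < 0 or size < 0:
            raise ValueError("starts and sizes must be non-negative")
        if start + size > dim:
            raise ValueError("slice window exceeds the input dimension")
    return tuple(sizes)

print(slice_output_shape((6,), [1], [0]))          # 0-size window
print(slice_output_shape((6,), [1], [1]))          # ordinary window
print(slice_output_shape((4, 6), [0, 2], [4, 0]))  # empty along one axis only
```

The only change relative to a "size must not be 0" rule is that the lower bound on each size is 0 rather than 1; the bounds check against the input dimension stays the same.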
According to my test, XNNPACK |
Bin Miao showed that XNNPack's |
Opened: google/XNNPACK#5807
If frameworks can handle that, it would help simplify WebNN implementation. |
Another case that we should consider is WebNN input operands that have 0-size dimensions. This is not allowed today. |
And it sounds like from @guschmue today that this affects yolov9 too. |
Would this mean the shape of the key/value tensors keeps changing for each round of inference? WebNN only supports static shapes. Might this require recompiling the WebNN graph for each round? We met a similar issue with Whisper model inference. A static key/value cache seems to be useful: huggingface/transformers#27931 |
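The shape concern can be illustrated with shapes alone. The sketch below is plain Python and assumes a (batch, heads, sequence, head_dim) cache layout with made-up sizes; `growing_cache_shapes` and `static_cache_shapes` are hypothetical helpers, not code from any framework mentioned in this thread. A concatenation-based cache starts with a 0-size sequence dimension and changes shape on every step, whereas a preallocated cache keeps one shape and tracks a write position.

```python
def growing_cache_shapes(batch, heads, head_dim, steps):
    """Cache shape before each decoding step when new keys are concatenated."""
    return [(batch, heads, past_len, head_dim) for past_len in range(steps)]

def static_cache_shapes(batch, heads, head_dim, steps, max_len):
    """(shape, write position) before each step for a preallocated cache."""
    if steps > max_len:
        raise ValueError("steps exceed the preallocated cache length")
    return [((batch, heads, max_len, head_dim), pos) for pos in range(steps)]

growing = growing_cache_shapes(batch=1, heads=8, head_dim=64, steps=3)
static = static_cache_shapes(batch=1, heads=8, head_dim=64, steps=3, max_len=16)

print(growing)                              # first entry has a 0-size sequence dimension
print(len(set(growing)))                    # number of distinct shapes when growing
print(len({shape for shape, _ in static}))  # number of distinct shapes when static
```

With a static-shape API, each distinct shape in the first list would presumably need its own compiled graph, which is the recompilation cost raised above; the static layout avoids both that and the initial 0-size tensor, at the price of fixing a maximum length up front.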
In Chromium CL review, @fdwr mentioned (Thanks Dwayne!)
From native ML API perspective, Dwayne also mentioned
We may want to investigate more native ML APIs to understand the status of their support.
I am opening this issue to start tracking, e.g. adding TODO in the implementation. @fdwr, feel free to share more details. Thanks!