[ONNX] Add dtype check in onnx verification #79263
Conversation
✅ No Failures (0 Pending) as of commit ab98bce (more details on the Dr. CI page). 💚 Looks good so far! There are no failures yet. 💚 This comment was automatically generated by Dr. CI.
cc @BowenBao |
@qqaatw This is awesome! Thank you so much for the contribution! A suggestion would be to add a new skip decorator to disable dtype checks for these tests, then remove the skips in later follow-up PRs that fix individual tests. This allows CI to tell us whether each issue is properly fixed. If you are interested, you can use ghstack to submit follow-up individual fixes on top of this PR.
Part of #79263 This PR fixes the following matters: 1. Before this fix, the reduced output had `[1]` shape when `dim = None` and `keepdim = False`. Now the output is reduced to `[]` shape, which matches PyTorch's behavior. 2. Before this fix, the output was always cast to `Long`. Now the output is cast to the input's dtype. Pull Request resolved: #79506 Approved by: https://github.com/justinchuby, https://github.com/BowenBao
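A minimal sketch of the PyTorch behavior the exported graph should now match (a full reduction produces a 0-d tensor, and the result keeps the input's floating dtype):

```python
import torch

x = torch.rand(2, 3)
out = torch.linalg.norm(x)  # dim=None, keepdim=False: fully reduced
assert out.shape == torch.Size([])  # scalar shape, not [1]
assert out.dtype == x.dtype         # dtype follows the input, not Long
```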
Part of #79263 Before: When `dim` == `None` and `keepdim` == `0` (`False`), the reduced output had `[1]` shape. After: The output is squeezed so that its shape is `[]`, matching PyTorch's behavior. Pull Request resolved: #79371 Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao
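For reference, the PyTorch behavior being matched here — `any`/`all` with no `dim` reduce to a 0-d tensor:

```python
import torch

x = torch.tensor([[True, False], [True, True]])
# With dim=None and keepdim=False, the result is a 0-d (scalar) tensor
assert torch.any(x).shape == torch.Size([])
assert torch.all(x).shape == torch.Size([])
assert bool(torch.any(x)) is True   # at least one element is True
assert bool(torch.all(x)) is False  # not every element is True
```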
Part of #79263 Before: The output had `[1]` shape when the input was a scalar. After: The output has `[]` shape, matching PyTorch's behavior. The original comment alongside the code states `torch allows scalar self, and ONNX is ambiguous about whether this is allowed`. In fact, ONNX never clearly indicates whether scalar inputs are allowed for all ONNX operators; at least in this case, a scalar input seems to be allowed. Pull Request resolved: #79846 Approved by: https://github.com/BowenBao
Thank you, good suggestion indeed! ghstack requires write access to the repository, which I don't currently have.
Part of #79263 Before: When the input shape is `[]`, the reduced outputs of the two functions had `[1]` shape. After: The outputs of the two functions now have `[]` shape, matching PyTorch's behavior. Pull Request resolved: #79695 Approved by: https://github.com/justinchuby, https://github.com/BowenBao
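The PyTorch behavior for `hardshrink`/`softshrink` on a 0-d input, for comparison (default threshold `lambd=0.5`):

```python
import torch
import torch.nn.functional as F

x = torch.tensor(2.0)  # 0-d input
assert F.hardshrink(x).shape == torch.Size([])  # scalar output, not [1]
assert F.softshrink(x).shape == torch.Size([])
assert F.hardshrink(x).item() == 2.0  # |2.0| > 0.5, passes through unchanged
assert F.softshrink(x).item() == 1.5  # shrunk toward zero by lambd=0.5
```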
Summary: Part of #79263 Before: the boolean output was cast back to the input's dtype. After: it is no longer cast back. Pull Request resolved: #79339 Approved by: https://github.com/BowenBao Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/7a8d6c9b1dffdd01c9a95392c7ba1102a1566192 Reviewed By: b0noI Differential Revision: D37509641 fbshipit-source-id: 89419ce977d48c01b4109d4e3075e603ccc8fc14
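The PyTorch behavior the logical ops export now matches — the result stays `bool` even for non-bool inputs:

```python
import torch

a = torch.tensor([1, 0, 1])  # int64 inputs
b = torch.tensor([1, 1, 0])
out = torch.logical_and(a, b)
assert out.dtype == torch.bool  # stays bool, not cast back to int64
assert out.tolist() == [True, False, False]
```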
Summary: Part of #79263 Before: When the input shape is `[]`, the reduced outputs of the two functions had `[1]` shape. After: The outputs now have `[]` shape, matching PyTorch's behavior. Pull Request resolved: #79695 Approved by: https://github.com/justinchuby, https://github.com/BowenBao Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/3dec9fd09fbe5229e5737f9b54037454dc40802e Reviewed By: b0noI Differential Revision: D37509863 fbshipit-source-id: d3148da4b8bdb7ff8e28d623192d6dba97075dcb
Could you rebase? |
@pytorchbot rebase |
@pytorchbot successfully started a rebase job. Check the current status here |
Rebase failed due to Command
Raised by https://github.com/pytorch/pytorch/actions/runs/2621356399 |
Maybe after the currently open PRs are merged? There are two remaining.
Part of #79263 The `keepdim` argument is theoretically ignored when `dim` is not specified (see the [docs](https://pytorch.org/docs/stable/generated/torch.argmin.html)). Unfortunately, the PyTorch implementation seems to still take it into account, resulting in a non-fully-reduced tensor, which is undefined behavior. Thus, I added the `dim` argument to the tests to make the outputs of PyTorch and ONNX Runtime consistent. Pull Request resolved: #79503 Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/BowenBao
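With an explicit `dim`, the reduced shapes of `argmin`/`argmax` are well defined, which is why the tests pass one:

```python
import torch

x = torch.rand(2, 3)
assert torch.argmin(x).shape == torch.Size([])                    # dim=None: fully reduced
assert torch.argmin(x, dim=1).shape == torch.Size([2])            # reduce over dim 1
assert torch.argmin(x, dim=1, keepdim=True).shape == torch.Size([2, 1])
assert torch.argmax(x, dim=0).shape == torch.Size([3])            # reduce over dim 0
```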
Summary: Part of #79263 The `keepdim` argument is theoretically ignored when `dim` is not specified (see the [docs](https://pytorch.org/docs/stable/generated/torch.argmin.html)). Unfortunately, the PyTorch implementation seems to still take it into account, resulting in a non-fully-reduced tensor, which is undefined behavior. Thus, I added the `dim` argument to the tests to make the outputs of PyTorch and ONNX Runtime consistent. Pull Request resolved: #79503 Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/BowenBao Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/6bdf89b0c71a827be2a0969afa33f151e87a32f3 Reviewed By: kit1980 Differential Revision: D38478785 fbshipit-source-id: cd9493d6d088c4e32eb11bd19924efc0b6e6335c
Part of #79263 Previously, all quantized PyTorch tensors were cast to dtypes that comply with ONNX's definition, i.e. `scale` was cast to `double` and `zero_point` to `int64`. These casts led to inconsistent dtypes when comparing PyTorch's outputs with ONNX Runtime's outputs. Now, a `cast_onnx_accepted` argument is added to the `unpack_quantized_tensor` function. When making example inputs for ONNX, we cast them to the ONNX-compliant dtypes; otherwise, they are cast to PyTorch's default types for quantization. Pull Request resolved: #79690 Approved by: https://github.com/justinchuby, https://github.com/BowenBao
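A hypothetical sketch of what such a helper could look like — the function name matches the PR description, but this body and signature are illustrative, not the actual implementation:

```python
import torch

def unpack_quantized_tensor(q, cast_onnx_accepted=True):
    """Sketch: split a quantized tensor into (values, scale, zero_point)."""
    values = q.dequantize()
    scale = torch.tensor(q.q_scale())          # default: float32
    zero_point = torch.tensor(q.q_zero_point())
    if cast_onnx_accepted:
        # ONNX QuantizeLinear/DequantizeLinear expect double scale, int64 zero_point
        scale = scale.to(torch.double)
        zero_point = zero_point.to(torch.int64)
    return values, scale, zero_point

q = torch.quantize_per_tensor(torch.rand(4), scale=0.1, zero_point=0, dtype=torch.quint8)
_, s, zp = unpack_quantized_tensor(q)
assert s.dtype == torch.double and zp.dtype == torch.int64
```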
@pytorchbot merge -g |
@pytorchbot successfully started a merge job. Check the current status here |
Summary: Part of #79263 Previously, all quantized PyTorch tensors were cast to dtypes that comply with ONNX's definition, i.e. `scale` was cast to `double` and `zero_point` to `int64`. These casts led to inconsistent dtypes when comparing PyTorch's outputs with ONNX Runtime's outputs. Now, a `cast_onnx_accepted` argument is added to the `unpack_quantized_tensor` function. When making example inputs for ONNX, we cast them to the ONNX-compliant dtypes; otherwise, they are cast to PyTorch's default types for quantization. Pull Request resolved: #79690 Approved by: https://github.com/justinchuby, https://github.com/BowenBao Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/9b4dc56c83ca28df3d34b650f4c0249de45d7601 Reviewed By: seemethere Differential Revision: D38585476 fbshipit-source-id: a2015092f3f034e30c6134aee6230c57e959636f
Summary: Currently we don't have a dtype check when verifying the consistency between PyTorch and ONNX outputs. As a result, some dtype inconsistencies were found and reported: #77842 #77845

This is a POC. Failed workflows:

- linux-xenial-py3.7-clang7-onnx / test (default, 2, 2, linux.2xlarge)
  - inconsistent shape
    - TestONNXRuntime_opset10.test_all (#79371)
    - TestONNXRuntime_opset10.test_any (#79371)
    - TestONNXRuntime_opset10.test_argmin_argmax (#79503)
    - TestONNXRuntime_opset10.test_hardshrink (#79695)
    - TestONNXRuntime_opset10.test_linalg_norm (#79506)
    - TestONNXRuntime_opset10.test_linalg_vector_norm (#79506)
    - TestONNXRuntime_opset10.test_prelu_scalar (#79846)
    - TestONNXRuntime_opset10.test_softshrink (#79695)
    - TestONNXRuntime_opset10.test_sum_empty_tensor (skipped)
    - TestONNXRuntime_opset10.test_tolist (skipped)
  - inconsistent dtype
    - test_arithmetic_prim_bool (skipped)
    - test_arithmeticOps_with_low_precision (skipped)
    - test_arithmetic_prim_float (skipped)
    - test_logical_and (#79339)
    - test_logical_or (#79339)
    - test_logical_xor (#79339)
    - test_pow (skipped)
    - test_primitive_input_floating (skipped)
    - test_quantize_per_tensor (#79690)
    - test_quantized_adaptive_avg_pool2d (#79690)
    - test_quantized_arithmetic (#79690)
    - test_quantized_arithmetic_qfunctional (#79690)
    - test_quantized_conv2d (#79690)
    - test_quantized_conv2d_relu (#79690)
    - test_quantized_flatten (#79690)
    - test_quantized_hardsigmoid (#79690)
    - test_quantized_hardswish (#79690)
    - test_quantized_linear (#79690)
    - test_quantized_sigmoid (#79690)
    - test_item (skipped)
    - test_full_like_value (skipped)
    - TestONNXRuntime_opset7.test_div_rounding_mode (skipped)
    - TestONNXRuntime_opset8.test_div_rounding_mode (skipped)
    - TestONNXRuntime_opset9.test_div_rounding_mode (skipped)
    - TestONNXRuntime_opset9_IRv4.test_div_rounding_mode (skipped)
    - test_outer (skipped)
    - test_symbolic_shape_inference_arange_2 (skipped)

Pull Request resolved: #79263 Approved by: https://github.com/justinchuby, https://github.com/BowenBao Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/d9a7e93aaf3166e639ea413123bd6c38b9144adc Reviewed By: seemethere Differential Revision: D38585848 fbshipit-source-id: 9da98581ceec51142ae31d3f8a06f9f296a16b23
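A hypothetical sketch of the kind of dtype-aware comparison this PR adds to verification — the function name and tolerances here are illustrative, not the actual implementation:

```python
import numpy as np
import torch

def assert_outputs_match(torch_out, ort_out, check_dtype=True):
    """Sketch: compare a PyTorch output against an ONNX Runtime output."""
    expected = torch_out.detach().cpu().numpy()
    np.testing.assert_allclose(expected, ort_out, rtol=1e-3, atol=1e-5)
    if check_dtype:
        # the new check: values may match while dtypes silently diverge
        assert expected.dtype == ort_out.dtype, (expected.dtype, ort_out.dtype)

# matching values and dtypes pass
assert_outputs_match(torch.tensor([1.0, 2.0]), np.array([1.0, 2.0], dtype=np.float32))
```

With `check_dtype=True`, an int64 PyTorch output compared against a float32 runtime output now fails loudly even though the values are numerically equal.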
Currently we don't have a dtype check when verifying the consistency between PyTorch and ONNX outputs. As a result, some dtype inconsistencies were found and reported: #77842 #77845
This is a POC.
Failed workflows:
- `any` and `all` outputs' shape (#79371)
- `argmin` and `argmax` test cases (#79503)
- `hardshrink` and `softshrink` output's shape (#79695)
- `prelu` output's shape (#79846)