[ONNX] Add dtype check in onnx verification #79263

Closed
wants to merge 15 commits

Conversation

@qqaatw (Collaborator) commented Jun 10, 2022

Currently we don't have a dtype check when verifying the consistency between PyTorch and ONNX outputs. As a result, several dtype inconsistencies were found and reported: #77842 #77845

This is a POC.
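For illustration, a minimal sketch of the kind of check being added to the verification path (the helper name and signature here are hypothetical, not the actual diff):

```python
# Hypothetical sketch of a dtype-aware output comparison (illustrative only).
import numpy as np
import torch

def assert_outputs_match(torch_out: torch.Tensor, ort_out: np.ndarray,
                         rtol: float = 1e-3, atol: float = 1e-7) -> None:
    expected = torch_out.detach().cpu().numpy()
    # The new behavior: fail fast on a dtype mismatch instead of letting
    # the numeric comparison silently coerce across dtypes.
    if expected.dtype != ort_out.dtype:
        raise AssertionError(
            f"dtype mismatch: PyTorch {expected.dtype} vs ONNX Runtime {ort_out.dtype}")
    np.testing.assert_allclose(expected, ort_out, rtol=rtol, atol=atol)
```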

Failed workflows: see the full list in the merge summary at the end of this thread.

@facebook-github-bot (Contributor) commented Jun 10, 2022

✅ No Failures (0 Pending)

As of commit ab98bce (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@gchanan added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Jun 10, 2022
@justinchuby added the module: onnx label (Related to torch.onnx) Jun 10, 2022
@justinchuby (Collaborator)

cc @BowenBao

@justinchuby added the module: tests (Issues related to tests (not the torch.testing module)), topic: bug fixes (topic category), and release notes: onnx (torch.onnx related changes that should show up in the release notes) labels Jun 16, 2022
@BowenBao (Collaborator)

@qqaatw This is awesome! Thank you so much for the contribution!

A suggestion would be to add a new skip decorator that disables dtype checks for these tests, then remove the skips in later follow-up PRs that fix the individual tests. This lets CI tell us whether each issue is properly fixed.

If you are interested, you can use ghstack to submit the follow-up individual fixes on top of this PR.
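For reference, such a skip decorator might look roughly like this (a sketch with assumed names; the flag consumed by the verification helper is hypothetical):

```python
import functools

def skip_dtype_checking(func):
    """Illustrative sketch: run the test with dtype verification disabled."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        self.check_dtype = False  # assumed attribute read by the ONNX verification step
        return func(self, *args, **kwargs)
    return wrapper
```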

pytorchmergebot pushed a commit that referenced this pull request Jun 22, 2022
Part of #79263

This PR fixes the following matters:

1. Before this fix, the reduced output had `[1]` shape when `dim = None` and `keepdim = False`. Now the output is reduced to `[]` shape, which matches PyTorch's behavior.
2. Before this fix, the output was always cast to `Long`. Now the output is cast to the input's dtype.

Pull Request resolved: #79506
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
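For reference, the eager-mode behavior the export now matches (a quick illustration):

```python
import torch

x = torch.randn(3, 4)
out = torch.linalg.norm(x)   # dim=None, keepdim=False
print(out.shape, out.dtype)  # torch.Size([]) torch.float32 -- scalar, input's dtype
```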
pytorchmergebot pushed a commit that referenced this pull request Jun 22, 2022
Part of #79263

Before: when `dim == None` and `keepdim == False`, the reduced output has `[1]` shape.
After: the output is squeezed so that its shape is `[]`, matching PyTorch's behavior.

Pull Request resolved: #79371
Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao
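Eager-mode reference for the behavior being matched:

```python
import torch

x = torch.zeros(2, 3, dtype=torch.bool)
print(torch.all(x).shape)  # torch.Size([]) -- fully reduced when dim is omitted
print(torch.any(x).shape)  # torch.Size([])
```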
pytorchmergebot pushed a commit that referenced this pull request Jun 22, 2022
Part of #79263

Before: The output has `[1]` shape when the input is a scalar.
After: The output has `[]` shape, matching PyTorch's behavior.

The original comment in the code states `torch allows scalar self, and ONNX is ambiguous about whether this is allowed`. In fact, ONNX never clearly specifies whether scalar inputs are allowed for each operator; at least in this case, a scalar input appears to be allowed.

Pull Request resolved: #79846
Approved by: https://github.com/BowenBao
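A quick eager-mode illustration of the scalar case the export now preserves:

```python
import torch

m = torch.nn.PReLU()
y = m(torch.tensor(-2.0))  # scalar ("self") input is allowed in eager mode
print(y.shape)             # torch.Size([]) -- scalar in, scalar out
```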
@qqaatw (Collaborator, Author) commented Jun 22, 2022

> @qqaatw This is awesome! Thank you so much for the contribution!
>
> A suggestion would be to add a new skip decorator that disables dtype checks for these tests, then remove the skips in later follow-up PRs that fix the individual tests. This lets CI tell us whether each issue is properly fixed.
>
> If you are interested, you can use ghstack to submit the follow-up individual fixes on top of this PR.

Thank you, good suggestion indeed!

ghstack requires write access to the repository, which I don't currently have.

@justinchuby linked an issue Jun 28, 2022 that may be closed by this pull request
pytorchmergebot pushed a commit that referenced this pull request Jun 28, 2022
Part of #79263

Before: when the input to the two functions (`hardshrink`/`softshrink`) has `[]` shape, the output has `[1]` shape.
After: the output shape is now `[]`, matching PyTorch's behavior.

Pull Request resolved: #79695
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
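Eager-mode reference for the two functions (illustrative):

```python
import torch
import torch.nn.functional as F

print(torch.hardshrink(torch.tensor(0.2)).shape)  # torch.Size([])
print(F.softshrink(torch.tensor(0.2)).shape)      # torch.Size([]) -- shape preserved
```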
facebook-github-bot pushed a commit that referenced this pull request Jun 30, 2022
Summary:
Part of #79263

Before: the boolean output of the logical ops was cast back to the input's dtype.
After: it is no longer cast back (see the eager-mode illustration after this commit message).

Pull Request resolved: #79339
Approved by: https://github.com/BowenBao

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/7a8d6c9b1dffdd01c9a95392c7ba1102a1566192

Reviewed By: b0noI

Differential Revision: D37509641

fbshipit-source-id: 89419ce977d48c01b4109d4e3075e603ccc8fc14
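A quick eager-mode illustration of the dtype behavior referenced above (the logical ops return `bool` regardless of input dtype):

```python
import torch

a = torch.tensor([1, 0, 1], dtype=torch.int32)
b = torch.tensor([1, 1, 0], dtype=torch.int32)
print(torch.logical_and(a, b).dtype)  # torch.bool -- no longer cast back to int32
print(torch.logical_xor(a, b).dtype)  # torch.bool
```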
facebook-github-bot pushed a commit that referenced this pull request Jun 30, 2022

Summary:
Part of #79263

Before: when the input to the two functions (`hardshrink`/`softshrink`) has `[]` shape, the output has `[1]` shape.
After: the output shape is now `[]`, matching PyTorch's behavior.

Pull Request resolved: #79695
Approved by: https://github.com/justinchuby, https://github.com/BowenBao

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/3dec9fd09fbe5229e5737f9b54037454dc40802e

Reviewed By: b0noI

Differential Revision: D37509863

fbshipit-source-id: d3148da4b8bdb7ff8e28d623192d6dba97075dcb
@justinchuby (Collaborator)

Could you rebase?

@qqaatw (Collaborator, Author) commented Jul 6, 2022

@pytorchbot rebase

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a rebase job. Check the current status here

@pytorchmergebot (Collaborator)

Rebase failed: `git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/master pull/79263/head` returned non-zero exit code 1:

Rebasing (1/8)
Rebasing (2/8)
Rebasing (3/8)
Auto-merging test/onnx/test_pytorch_common.py
Auto-merging test/onnx/test_pytorch_onnx_onnxruntime.py
CONFLICT (content): Merge conflict in test/onnx/test_pytorch_onnx_onnxruntime.py
Auto-merging torch/onnx/verification.py
error: could not apply 8ce6eac7a1... Add decorators to skip checks
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply 8ce6eac7a1... Add decorators to skip checks

Raised by https://github.com/pytorch/pytorch/actions/runs/2621356399

@qqaatw (Collaborator, Author) commented Jul 6, 2022

> Could you rebase?

Maybe after the currently open PRs are merged? There are two remaining.

justinchuby pushed a commit to justinchuby/pytorch that referenced this pull request Jul 27, 2022
Part of pytorch#79263

This PR fixes the following matters:

1. Before this fix, the reduced output had `[1]` shape when `dim = None` and `keepdim = False`. Now the output is reduced to `[]` shape, which matches PyTorch's behavior.
2. Before this fix, the output was always cast to `Long`. Now the output is cast to the input's dtype.

Pull Request resolved: pytorch#79506
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
justinchuby pushed a commit to justinchuby/pytorch that referenced this pull request Jul 27, 2022
Part of pytorch#79263

Before: when `dim == None` and `keepdim == False`, the reduced output has `[1]` shape.
After: the output is squeezed so that its shape is `[]`, matching PyTorch's behavior.

Pull Request resolved: pytorch#79371
Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao
justinchuby pushed a commit to justinchuby/pytorch that referenced this pull request Jul 27, 2022
Part of pytorch#79263

Before: The output has `[1]` shape when the input is a scalar.
After: The output has `[]` shape, matching PyTorch's behavior.

The original comment in the code states `torch allows scalar self, and ONNX is ambiguous about whether this is allowed`. In fact, ONNX never clearly specifies whether scalar inputs are allowed for each operator; at least in this case, a scalar input appears to be allowed.

Pull Request resolved: pytorch#79846
Approved by: https://github.com/BowenBao
justinchuby pushed a commit to justinchuby/pytorch that referenced this pull request Jul 27, 2022
Part of pytorch#79263

Before: the boolean output of the logical ops was cast back to the input's dtype.
After: it is no longer cast back.
Pull Request resolved: pytorch#79339
Approved by: https://github.com/BowenBao
justinchuby pushed a commit to justinchuby/pytorch that referenced this pull request Jul 27, 2022
Part of pytorch#79263

Before: when the input to the two functions (`hardshrink`/`softshrink`) has `[]` shape, the output has `[1]` shape.
After: the output shape is now `[]`, matching PyTorch's behavior.

Pull Request resolved: pytorch#79695
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
pytorchmergebot pushed a commit that referenced this pull request Aug 5, 2022
Part of #79263

The `keepdim` argument is theoretically ignored when `dim` is not specified (see the [docs](https://pytorch.org/docs/stable/generated/torch.argmin.html)).

Unfortunately, the PyTorch implementation still seems to take it into account, resulting in a not-fully-reduced tensor, which is undefined behavior. Thus, I added a `dim` argument to the tests to make the outputs of PyTorch and ONNX Runtime consistent.

Pull Request resolved: #79503
Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/BowenBao
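Illustration of why passing an explicit `dim` makes the semantics well-defined:

```python
import torch

x = torch.randn(2, 3)
print(torch.argmin(x).shape)                       # torch.Size([]) -- fully reduced
print(torch.argmin(x, dim=1, keepdim=True).shape)  # torch.Size([2, 1]) -- well-defined
```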
facebook-github-bot pushed a commit that referenced this pull request Aug 7, 2022
Summary:
Part of #79263

The `keepdim` argument is theoretically ignored when `dim` is not specified (see the [docs](https://pytorch.org/docs/stable/generated/torch.argmin.html)).

Unfortunately, the PyTorch implementation still seems to take it into account, resulting in a not-fully-reduced tensor, which is undefined behavior. Thus, I added a `dim` argument to the tests to make the outputs of PyTorch and ONNX Runtime consistent.

Pull Request resolved: #79503
Approved by: https://github.com/justinchuby, https://github.com/AllenTiTaiWang, https://github.com/BowenBao

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/6bdf89b0c71a827be2a0969afa33f151e87a32f3

Reviewed By: kit1980

Differential Revision: D38478785

fbshipit-source-id: cd9493d6d088c4e32eb11bd19924efc0b6e6335c
pytorchmergebot pushed a commit that referenced this pull request Aug 9, 2022
Part of #79263

Previously, quantized PyTorch tensors were all cast to the dtypes that comply with ONNX's definition, i.e. `scale` was cast to `double` and `zero_point` to `int64`. These casts led to inconsistent dtypes when comparing PyTorch's outputs with ONNX Runtime's outputs.

Now, a `cast_onnx_accepted` argument is added to the `unpack_quantized_tensor` function. When making example inputs for ONNX, we cast them to the ONNX-compliant dtypes; otherwise, they are cast to PyTorch's default dtypes for quantization.

Pull Request resolved: #79690
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
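A rough sketch of the idea (a hypothetical simplification, not the actual helper in PyTorch's ONNX test utilities):

```python
import torch

def unpack_quantized_tensor(tensor, cast_onnx_accepted=True):
    # Illustrative sketch: split a quantized tensor into its components.
    values = tensor.int_repr()
    scale = torch.tensor(tensor.q_scale())            # q_scale() returns a Python float
    zero_point = torch.tensor(tensor.q_zero_point())
    if cast_onnx_accepted:
        # dtypes that comply with ONNX's QuantizeLinear/DequantizeLinear definition
        scale = scale.to(torch.double)
        zero_point = zero_point.to(torch.int64)
    return values, scale, zero_point

q = torch.quantize_per_tensor(torch.randn(2, 2), scale=0.1, zero_point=0,
                              dtype=torch.quint8)
values, scale, zp = unpack_quantized_tensor(q)
print(scale.dtype, zp.dtype)  # torch.float64 torch.int64
```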
@qqaatw (Collaborator, Author) commented Aug 10, 2022

@pytorchbot merge -g

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a merge job. Check the current status here

facebook-github-bot pushed a commit that referenced this pull request Aug 10, 2022
Summary:
Part of #79263

Previously, quantized PyTorch tensors were all cast to the dtypes that comply with ONNX's definition, i.e. `scale` was cast to `double` and `zero_point` to `int64`. These casts led to inconsistent dtypes when comparing PyTorch's outputs with ONNX Runtime's outputs.

Now, a `cast_onnx_accepted` argument is added to the `unpack_quantized_tensor` function. When making example inputs for ONNX, we cast them to the ONNX-compliant dtypes; otherwise, they are cast to PyTorch's default dtypes for quantization.

Pull Request resolved: #79690
Approved by: https://github.com/justinchuby, https://github.com/BowenBao

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/9b4dc56c83ca28df3d34b650f4c0249de45d7601

Reviewed By: seemethere

Differential Revision: D38585476

fbshipit-source-id: a2015092f3f034e30c6134aee6230c57e959636f
facebook-github-bot pushed a commit that referenced this pull request Aug 10, 2022
Summary:
Currently we don't have a dtype check when verifying the consistency between PyTorch and ONNX outputs. As a result, several dtype inconsistencies were found and reported: #77842 #77845

This is a POC.

Failed workflows:
- linux-xenial-py3.7-clang7-onnx / test (default, 2, 2, linux.2xlarge)
  - inconsistent shape
    - TestONNXRuntime_opset10.test_all (#79371)
    - TestONNXRuntime_opset10.test_any (#79371)
    - TestONNXRuntime_opset10.test_argmin_argmax (#79503)
    - TestONNXRuntime_opset10.test_hardshrink (#79695)
    - TestONNXRuntime_opset10.test_linalg_norm (#79506)
    - TestONNXRuntime_opset10.test_linalg_vector_norm (#79506)
    - TestONNXRuntime_opset10.test_prelu_scalar (#79846)
    - TestONNXRuntime_opset10.test_softshrink (#79695)
    - TestONNXRuntime_opset10.test_sum_empty_tensor (skipped)
    - TestONNXRuntime_opset10.test_tolist (skipped)
  - inconsistent dtype
    - test_arithmetic_prim_bool (skipped)
    - test_arithmeticOps_with_low_precision (skipped)
    - test_arithmetic_prim_float (skipped)
    - test_logical_and (#79339)
    - test_logical_or (#79339)
    - test_logical_xor (#79339)
    - test_pow (skipped)
    - test_primitive_input_floating (skipped)
    - test_quantize_per_tensor (#79690)
    - test_quantized_adaptive_avg_pool2d (#79690)
    - test_quantized_arithmetic (#79690)
    - test_quantized_arithmetic_qfunctional (#79690)
    - test_quantized_conv2d (#79690)
    - test_quantized_conv2d_relu (#79690)
    - test_quantized_flatten (#79690)
    - test_quantized_hardsigmoid (#79690)
    - test_quantized_hardswish (#79690)
    - test_quantized_linear (#79690)
    - test_quantized_sigmoid (#79690)
    - test_item (skipped)
    - test_full_like_value (skipped)
    - TestONNXRuntime_opset7.test_div_rounding_mode (skipped)
    - TestONNXRuntime_opset8.test_div_rounding_mode (skipped)
    - TestONNXRuntime_opset9.test_div_rounding_mode (skipped)
    - TestONNXRuntime_opset9_IRv4.test_div_rounding_mode (skipped)
    - test_outer (skipped)
    - test_symbolic_shape_inference_arange_2 (skipped)

Pull Request resolved: #79263
Approved by: https://github.com/justinchuby, https://github.com/BowenBao

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/d9a7e93aaf3166e639ea413123bd6c38b9144adc

Reviewed By: seemethere

Differential Revision: D38585848

fbshipit-source-id: 9da98581ceec51142ae31d3f8a06f9f296a16b23
Labels
- cla signed
- Merged
- module: onnx (Related to torch.onnx)
- module: tests (Issues related to tests (not the torch.testing module))
- open source
- release notes: onnx (torch.onnx related changes that should show up in the release notes)
- topic: bug fixes (topic category)
- triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
Development

Successfully merging this pull request may close these issues.

[ONNX] Some functions do not preserve shape for scalars
9 participants