[ONNX] update default opset_version to 13 #73898

Closed
wants to merge 16 commits

Conversation

@garymm (Collaborator) commented Mar 8, 2022

Also add a new tool to update it in the future, which follows the policy of using "latest as of 18 months ago". This policy is meant to balance two goals:

  • recent enough to increase the odds of being able to successfully export
  • old enough to increase the odds of the exported model being runnable by different ONNX implementations
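
A minimal sketch of the "latest as of 18 months ago" selection logic (illustrative only; the dates, table, and function name below are assumptions, not the contents of the new tool):

```python
from datetime import date, timedelta

# Hypothetical opset -> release date table, for illustration only; the real
# tool would derive this information from ONNX release metadata.
OPSET_RELEASE_DATES = {
    11: date(2019, 9, 28),
    12: date(2020, 5, 8),
    13: date(2020, 11, 4),
    14: date(2021, 3, 25),
}

def pick_default_opset(today: date, months: int = 18) -> int:
    """Return the newest opset whose release is at least `months` months old."""
    cutoff = today - timedelta(days=months * 30)
    eligible = [v for v, released in OPSET_RELEASE_DATES.items() if released <= cutoff]
    return max(eligible)
```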

Related changes:

  • test_models.py: explicitly fix opset_version to 9 rather than relying on the default, since Caffe2 doesn't support newer versions.
  • symbolic_helper.py:
    • Remove a misleading comment
    • Remove unnecessary check in _set_opset_version
    • Use a range to define _onnx_stable_opsets (see the sketch after this list)
  • test_pytorch_common.py:
    • Rename a variable from min -> max. I think it was a copy-paste error.
    • Make skip test messages more informative.
    • Remove unused skipIfONNXShapeInference. More on that below.
  • test_pytorch_onnx_onnxruntime.py:
    • Make all the TestCase classes explicitly specify opset version.
    • Make test_unsupported_pad respect opset_version by using run_test
    • Unrelated simplification: make it obvious that all tests run with onnx_shape_inference=True. AFAICT this was already the case.
    • There was one test (test_tolist) that was effectively disabled because it asked to be skipped whenever onnx_shape_inference=True, and that flag was always True. I changed the model being tested so as to preserve the intended test coverage while still having the test actually pass.
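
For the symbolic_helper.py bullet above, a rough sketch of what a range-based definition could look like (the names follow the file, but the exact bounds here are assumptions rather than the PR's literal diff):

```python
# Hypothetical sketch of the symbolic_helper.py constants after the change.
_onnx_main_opset = 15                                     # assumed newest opset supported by the exporter
_onnx_stable_opsets = tuple(range(7, _onnx_main_opset))   # 7..14, instead of a hand-maintained list
_default_onnx_opset_version = 13                          # the new default introduced by this PR
```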

@pytorch-bot (bot) commented Mar 8, 2022

CI Flow Status

⚛️ CI Flow

Ruleset - Version: v1
Ruleset - File: https://github.com/garymm/pytorch/blob/605fc6684cafc4ef8492853ea7037e114278e23a/.github/generated-ciflow-ruleset.json
PR ciflow labels: ciflow/default
Add ciflow labels to this PR to trigger more builds:

Workflows Labels (bold enabled) Status
Triggered Workflows
linux-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
linux-binary-libtorch-cxx11-abi ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-libtorch-pre-cxx11 ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
linux-binary-manywheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
linux-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/noarch, ciflow/trunk ✅ triggered
linux-bionic-rocm4.5-py3.7 ciflow/all, ciflow/default, ciflow/linux, ciflow/rocm, ciflow/trunk ✅ triggered
linux-docs ciflow/all, ciflow/cpu, ciflow/default, ciflow/docs, ciflow/linux, ciflow/trunk ✅ triggered
linux-vulkan-bionic-py3.7-clang9 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk, ciflow/vulkan ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-cuda11.3-py3.7-gcc7-bazel-test ciflow/all, ciflow/bazel, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-build ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3-clang5-mobile-custom-build-static ciflow/all, ciflow/default, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-asan ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/sanitizers, ciflow/trunk ✅ triggered
linux-xenial-py3.7-clang7-onnx ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/onnx, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc5.4-mobile-lightweight-dispatch-build ciflow/all, ciflow/cpu, ciflow/default, ciflow/libtorch, ciflow/linux, ciflow/mobile, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7 ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
linux-xenial-py3.7-gcc7-no-ops ciflow/all, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
macos-arm64-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-arm64-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
macos-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
macos-binary-libtorch-cxx11-abi ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-libtorch-pre-cxx11 ciflow/binaries, ciflow/binaries_libtorch, ciflow/default ✅ triggered
macos-binary-wheel ciflow/binaries, ciflow/binaries_wheel, ciflow/default ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-gradle-custom-build-single-full-jit ciflow/all, ciflow/android, ciflow/cpu, ciflow/default, ciflow/linux, ciflow/trunk ✅ triggered
win-vs2019-cpu-py3 ciflow/all, ciflow/cpu, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
win-vs2019-cuda11.3-py3 ciflow/all, ciflow/cuda, ciflow/default, ciflow/trunk, ciflow/win ✅ triggered
windows-binary-conda ciflow/binaries, ciflow/binaries_conda, ciflow/default ✅ triggered
windows-binary-libtorch-debug ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-libtorch-release ciflow/all, ciflow/binaries, ciflow/binaries_libtorch, ciflow/default, ciflow/trunk ✅ triggered
windows-binary-wheel ciflow/all, ciflow/binaries, ciflow/binaries_wheel, ciflow/default, ciflow/trunk ✅ triggered
Skipped Workflows
caffe2-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
docker-builds ciflow/all, ciflow/trunk 🚫 skipped
ios-12-5-1-arm64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-custom-ops ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-arm64-metal ciflow/all, ciflow/ios, ciflow/macos, ciflow/scheduled 🚫 skipped
ios-12-5-1-x86-64 ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
ios-12-5-1-x86-64-coreml ciflow/all, ciflow/ios, ciflow/macos, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda10.2-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
libtorch-linux-xenial-cuda11.3-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/trunk 🚫 skipped
linux-bionic-cuda10.2-py3.9-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/slow, ciflow/trunk 🚫 skipped
linux-docs-push ciflow/all, ciflow/cpu, ciflow/linux, ciflow/scheduled 🚫 skipped
linux-xenial-cuda11.3-py3.7-gcc7-no-ops ciflow/all, ciflow/cuda, ciflow/linux, ciflow/trunk 🚫 skipped
macos-10-15-py3-arm64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-10-15-py3-lite-interpreter-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
macos-11-py3-x86-64 ciflow/all, ciflow/macos, ciflow/trunk 🚫 skipped
parallelnative-linux-xenial-py3.7-gcc5.4 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
periodic-libtorch-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/libtorch, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-bionic-cuda11.5-py3.7-gcc7 ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-linux-xenial-cuda10.2-py3-gcc7-slow-gradcheck ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled, ciflow/slow, ciflow/slow-gradcheck 🚫 skipped
periodic-linux-xenial-cuda11.3-py3.7-gcc7-debug ciflow/all, ciflow/cuda, ciflow/linux, ciflow/scheduled 🚫 skipped
periodic-win-vs2019-cuda11.5-py3 ciflow/all, ciflow/cuda, ciflow/scheduled, ciflow/win 🚫 skipped
pytorch-linux-xenial-py3-clang5-android-ndk-r19c-build ciflow/all, ciflow/android, ciflow/cpu, ciflow/linux, ciflow/trunk 🚫 skipped
pytorch-xla-linux-bionic-py3.7-clang8 ciflow/all, ciflow/cpu, ciflow/linux, ciflow/trunk, ciflow/xla 🚫 skipped

@facebook-github-bot (Contributor) commented Mar 8, 2022

💊 CI failures summary and remediations

As of commit 74ec262 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@garymm garymm marked this pull request as draft March 8, 2022 20:21
@garymm garymm marked this pull request as ready for review March 9, 2022 19:33
@garymm garymm marked this pull request as draft March 9, 2022 19:33
@garymm garymm marked this pull request as ready for review March 10, 2022 03:13
@BowenBao (Collaborator):

What's your opinion on test_operators.py? I feel it creates a lot of noise for PRs like this, and it is impossible to validate whether the updated files are correct, since there are more than 100 of them. That being said, it is useful for catching certain issues, such as shape/type inference or constant-folding problems, that are not detectable by runtime tests.

@garymm (Collaborator, Author) commented Mar 11, 2022

> What's your opinion on test_operators.py?

I think we should restrict the use of diff tests to only those cases where we can't come up with meaningful explicit assertions about the expected output.
Having this many diff tests is not very useful, especially without a tool that compresses similar diffs for easier code review. Google internally has tools to help analyze diffs of structured data (including protocol buffers), but I don't know of anything available for us to use.
So, long term, I think we should get rid of most, if not all, of the diff tests.
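
For contrast with the .expect-file diff tests, a sketch of the "explicit assertion" style (toy model and assertions, not a test taken from the suite): export a model and assert on properties of the resulting graph instead of diffing against a golden file.

```python
import io

import onnx
import torch

class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y

# Export a tiny model to an in-memory buffer at a fixed opset.
buf = io.BytesIO()
torch.onnx.export(Add(), (torch.randn(2), torch.randn(2)), buf, opset_version=13)
model = onnx.load_from_string(buf.getvalue())

# Explicit, reviewable assertions instead of a ~100-file golden diff.
assert model.opset_import[0].version == 13
assert [node.op_type for node in model.graph.node] == ["Add"]
```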

@ezyang ezyang added the "triaged" label Mar 12, 2022
@thiagocrepaldi (Collaborator) left a review comment

We have always overwritten tests with newer opsets, which is great for newer versions, but we also lose coverage of older versions.

Have you considered a way of having tests for different opset versions, as opposed to replacing them? Does that make sense? Maybe a certain op has one behavior in opset 7 and a slightly different one in another opset. By keeping both, we could make sure both are guarded.

@garymm (Collaborator, Author) commented Mar 14, 2022

> We have always overwritten tests with newer opsets

I don't know what you mean by this.

The operator diff tests (the ones that diff the output against the .expect files) always use the default opset version, which I think has never changed before.

Most of the other tests (e.g. test_pytorch_onnx_onnxruntime.py) are automatically run against all versions.

The operator diff tests are pretty annoying for many reasons, and I think we should reduce their scope rather than (or before) expanding them to all opset versions. See the discussion with Bowen above.

I'm not sure I understood the concern; does this comment address it?
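
As a rough illustration of "run against all versions" (a sketch of the pattern, not the actual harness in test_pytorch_onnx_onnxruntime.py): one base class holds the tests, and a concrete TestCase is generated per opset so every test runs at every version.

```python
import unittest

class _TestONNXRuntimeBase(unittest.TestCase):
    opset_version = 13  # overridden in each generated subclass

    def test_example(self):
        # A real test would export a model with self.opset_version and
        # compare its outputs against ONNX Runtime.
        self.assertGreaterEqual(self.opset_version, 7)

# Generate one concrete TestCase per opset version (the range here is arbitrary).
for _opset in range(9, 16):
    name = f"TestONNXRuntime_opset{_opset}"
    globals()[name] = type(name, (_TestONNXRuntimeBase,), {"opset_version": _opset})

if __name__ == "__main__":
    unittest.main()
```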

@thiagocrepaldi (Collaborator) commented Mar 14, 2022

> We have always overwritten tests with newer opsets
>
> I don't know what you mean by this.
>
> The operator diff tests (the ones that diff the output against the .expect files) always use the default opset version, which I think has never changed before.
>
> Most of the other tests (e.g. test_pytorch_onnx_onnxruntime.py) are automatically run against all versions.
>
> The operator diff tests are pretty annoying for many reasons, and I think we should reduce their scope rather than (or before) expanding them to all opset versions. See the discussion with Bowen above.
>
> I'm not sure I understood the concern; does this comment address it?

Yes, it does. My point was about the expect-file operator diff tests: should we 1) only test the default opset, or 2) create a mechanism that keeps testing old opsets while adding new expect files for newer opsets?

Review threads (resolved): tools/onnx/update_default_opset_version.py, torch/onnx/symbolic_helper.py
And add a new tool to update it in the future, which follows the policy
of using "latest as of 18 months ago". This policy is meant to balance:
* recent enough to increase the odds of being able to successfully
  export
* old enough to increase the odds of the exported model being runnable by
  different ONNX implementations

Also minor clean-up of related code in symbolic_helper:

* Remove a misleading comment
* Remove unnecessary check in _set_opset_version
* Use a range to define _onnx_stable_opsets
@garymm garymm added the "release notes: onnx" and "onnx-needs-import" labels Mar 21, 2022
@garymm (Collaborator, Author) commented Mar 22, 2022

@malfet @msaroufim this needs import for the one-line change in test/quantization/eager/test_quantize_eager_ptq.py.
I'm somewhat worried it will cause internal test failures, though. They should be fixable with a similar one-line change, but if there are many such failures, please let me know.

@thiagocrepaldi (Collaborator) commented Mar 28, 2022

Closed and reopened to trigger GitHub pipelines after resolving conflicts in test/onnx/test_pytorch_onnx_onnxruntime.py.
A new test had been merged into this file in between these changes.

@garymm garymm requested a review from malfet March 29, 2022 20:10
@facebook-github-bot (Contributor):

@msaroufim has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor):

@malfet has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit to facebookresearch/pytext that referenced this pull request Apr 6, 2022
Summary: same as the PR description above.

X-link: pytorch/pytorch#73898

Reviewed By: msaroufim

Differential Revision: D35264615

Pulled By: malfet

fbshipit-source-id: cda8fbdffe4cc8210d8d96e659e3a9adf1b5f1d2
facebook-github-bot pushed a commit that referenced this pull request Apr 7, 2022
Summary: same as the PR description above.

Pull Request resolved: #73898

Reviewed By: msaroufim

Differential Revision: D35264615

Pulled By: malfet

fbshipit-source-id: cda8fbdffe4cc8210d8d96e659e3a9adf1b5f1d2
@garymm garymm deleted the op-set-version branch April 7, 2022 20:27
facebook-github-bot pushed a commit to facebookresearch/pytext that referenced this pull request Apr 11, 2022
Summary: As a follow-up to D35264615 (b2ef1f3) (pytorch/pytorch#73898)

Differential Revision: D35472745

fbshipit-source-id: 3ed9501088f22301a91c8d8e585557368ec225fa
@stonelazy commented Jun 29, 2022

I'm sorry if I'm asking in the wrong thread. Signal-processing operators were recently added to ONNX and are available in opset 17. Is there any estimate of when this will become available in torch so that I can invoke the export API? According to the policy, do I have to wait 18 months for this?

@garymm (Collaborator, Author) commented Jun 29, 2022

@stonelazy please open an issue requesting support for the specific ATen / torch op that you need, along with an example model.
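
For context, the default discussed in this PR only matters when no opset_version is passed; a caller can always request a specific opset explicitly, provided the exporter already supports it. A minimal example:

```python
import torch

model = torch.nn.Linear(4, 2)
dummy_input = torch.randn(1, 4)

# Ask for a specific opset instead of relying on the default.
torch.onnx.export(model, dummy_input, "linear.onnx", opset_version=13)
```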

Labels
  • cla signed
  • module: onnx (Related to torch.onnx)
  • onnx-needs-import (This PR is related to ONNX, but touches files outside of merge rule patterns, and hence needs import)
  • open source
  • release notes: onnx (torch.onnx related changes that should show up in the release notes)
  • triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)