Arm backend: Add 16A8W linear ops support and test #13754
Conversation
Pull Request resolved: #13641

This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecuTorch Arm backend, following the feedback from D79746479.

## Key Changes

**1. New Quantization Configuration Function**
- Add `get_16a8w_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py`
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Keeps weights at 8 bits with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- Supported by TOSA through the [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)

## Benefits

- **Better precision**: 16-bit activations provide higher precision than 8-bit ones, which is useful for carrying precision through recurrent neural networks.

ghstack-source-id: 305891620
@exported-using-ghexport

Differential Revision: [D79763381](https://our.internmc.facebook.com/intern/diff/D79763381/)
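For reference, a minimal sketch of what such a helper can look like, assuming the torch.ao PT2E quantizer primitives (`QuantizationSpec`/`QuantizationConfig`) that the Arm quantizer builds on; the observer choices mirror the bullets above, but the exact ranges and arguments in `arm_quantizer.py` may differ:

```python
# Hedged sketch, not the PR's code: a 16A8W config built from the
# torch.ao PT2E quantizer primitives the Arm quantizer is based on.
import torch
from torch.ao.quantization.observer import (
    HistogramObserver,
    MinMaxObserver,
    PerChannelMinMaxObserver,
)
from torch.ao.quantization.quantizer import QuantizationSpec
from torch.ao.quantization.quantizer.xnnpack_quantizer_utils import (
    QuantizationConfig,
)


def get_16a8w_quantization_config(is_per_channel: bool = True) -> QuantizationConfig:
    # 16-bit activations: HistogramObserver estimates ranges more
    # accurately than MinMax, which matters for the wider int16 grid.
    act_spec = QuantizationSpec(
        dtype=torch.int16,
        quant_min=-32768,
        quant_max=32767,
        qscheme=torch.per_tensor_affine,
        observer_or_fake_quant_ctr=HistogramObserver.with_args(eps=2**-12),
    )
    # Weights stay at 8 bits, so the memory footprint matches 8A8W.
    weight_observer = PerChannelMinMaxObserver if is_per_channel else MinMaxObserver
    weight_spec = QuantizationSpec(
        dtype=torch.int8,
        quant_min=-127,
        quant_max=127,
        qscheme=torch.per_channel_symmetric
        if is_per_channel
        else torch.per_tensor_symmetric,
        ch_axis=0 if is_per_channel else None,
        observer_or_fake_quant_ctr=weight_observer.with_args(eps=2**-12),
    )
    # Same spec for input and output activations; bias left unquantized.
    return QuantizationConfig(act_spec, act_spec, weight_spec, None)
```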
Pull Request resolved: #13658

- Adds a linear ops test using the 16A8W config in the INT16 profile.
- Adds support for the INT16 dtype in view ops validation.
- Validated with the TOSA pipeline test.
- Checked that tests previously marked flaky are no longer flaky and removed the markers.

Note: not verified with a TOSA reference model run.

ghstack-source-id: 305897251
Differential Revision: [D80308822](https://our.internmc.facebook.com/intern/diff/D80308822/)
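Roughly how the config plugs into a linear test, as a sketch only: this assumes the backend's `TOSAQuantizer` entry point and the standard PT2E prepare/calibrate/convert flow, whereas the actual test goes through the Arm TOSA test-pipeline helpers:

```python
# Hedged sketch, not the actual test: wiring the 16A8W config into a
# PT2E quantization flow for a single nn.Linear. TOSAQuantizer's real
# constructor may require a TOSA specification argument.
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

from executorch.backends.arm.quantizer.arm_quantizer import (
    TOSAQuantizer,  # assumed quantizer entry point
    get_16a8w_quantization_config,
)


class LinearModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(32, 16)

    def forward(self, x):
        return self.fc(x)


model = LinearModel().eval()
example_inputs = (torch.randn(1, 32),)

# Capture the graph for PT2E quantization.
graph_module = torch.export.export(model, example_inputs).module()

quantizer = TOSAQuantizer()
quantizer.set_global(get_16a8w_quantization_config())

prepared = prepare_pt2e(graph_module, quantizer)
prepared(*example_inputs)  # calibration pass
quantized = convert_pt2e(prepared)
```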
🔗 Helpful links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13754.

❌ 9 new failures, 3 unrelated failures as of commit ee58c9b with merge base 9053089 (the lists of failed jobs and of jobs flaky on trunk are omitted here).

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@Ninja91 Nice change! The changes in op_transpose.py and op_view.py result in some test failures because we partition a few ops incorrectly. @per and @agrima1304 have patches that fix these failures, but their patches are blocked by the Vela pin update in #13282. If you move the changes in op_transpose.py and op_view.py to a separate PR, I believe we should be able to merge this one.
@Ninja91 Arm tests started failing after this PR, see this dashboard.
Reverting here: #13895
…13895) This reverts commit f8156fb.
@per @mergennachin @oscarandersson8218 The PR was reverted, and I am pushing it again here: #13899. Validated that no Arm tests are failing.
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #13899

- Adds a linear ops test using the 16A8W config in the INT16 profile.
- Adds support for the INT16 dtype in view ops validation (see the sketch after this description).
- Validated with the TOSA pipeline test.
- Checked that tests previously marked flaky are no longer flaky and removed the markers.

Note: not verified with a TOSA reference model run.

Differential Revision: [D81550511](https://our.internmc.facebook.com/intern/diff/D81550511/)

Reattempt to land #13754
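As context for the view-ops bullet, an illustrative sketch of what accepting INT16 in view validation could look like; the function name and dtype set are assumptions, and the real check lives in the backend's op_view visitor:

```python
# Illustrative only: the actual check lives in the Arm backend's view-op
# visitor (op_view.py); the dtype set and function name are assumptions.
import torch

_SUPPORTED_VIEW_DTYPES = {
    torch.int8,
    torch.int16,  # newly accepted for the INT16 (16A8W) profile
    torch.int32,
    torch.float32,
}


def validate_view_dtype(dtype: torch.dtype) -> None:
    if dtype not in _SUPPORTED_VIEW_DTYPES:
        raise ValueError(f"view: unsupported dtype {dtype}")
```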
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #13658 by @Ninja91
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/1/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/orig
@diff-train-skip-merge
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218