pytorchbot
Collaborator

@pytorchbot pytorchbot commented Aug 28, 2025

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #13658 by @Ninja91
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/1/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/orig
@diff-train-skip-merge

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

Pull Request resolved: #13641

This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecuTorch Arm backend, following the feedback from D79746479.

## Key Changes

**1. New Quantization Configuration Function**
- Add `get_16a8w_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py`
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Maintains 8-bit weights with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- **Technically supported by TOSA through [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)**
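
For readers unfamiliar with the scheme, here is a minimal plain-Python sketch of what "16-bit activations, 8-bit per-channel weights" means numerically. It is not the code in `arm_quantizer.py`: the helper names (`qrange`, `symmetric_scale`, `per_channel_weight_scales`) are hypothetical, and activations use a simple min-max rule here as a stand-in for the HistogramObserver that the real config uses.

```python
# Illustrative only: symmetric quantization parameters for a 16A8W scheme.
# All names below are hypothetical and not part of the ExecuTorch API.

def qrange(bits):
    """Signed integer range for the given bit width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def symmetric_scale(values, bits):
    """Min-max style symmetric scale: map max |value| to the largest positive integer."""
    _, qmax = qrange(bits)
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def per_channel_weight_scales(weight_rows, bits=8):
    """One scale per output channel (row), in the spirit of a per-channel min-max observer."""
    return [symmetric_scale(row, bits) for row in weight_rows]


# Two output channels with very different magnitudes get separate 8-bit scales.
weights = [[0.5, -1.27, 0.1], [0.02, 0.01, -0.0254]]
weight_scales = per_channel_weight_scales(weights, bits=8)

# Activations share a single 16-bit scale (min-max here is a simplifying assumption).
activation_scale = symmetric_scale([-3.0, 0.25, 2.0], bits=16)
```

Per-channel weight scales keep small-magnitude channels from being crushed by a large one, while the wider activation range is what distinguishes this setup from 8A8W.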

## Benefits
- **Better Precision**: 16-bit activations provide higher precision than 8-bit activations. This is useful for preserving precision across time steps in recurrent neural networks.
ghstack-source-id: 305891620
@exported-using-ghexport

Differential Revision: [D79763381](https://our.internmc.facebook.com/intern/diff/D79763381/)
Pull Request resolved: #13658

- Adds a linear ops test using the 16A8W config in the INT16 profile.
- Adds INT16 dtype support to view ops validation.
- Validated with the TOSA pipeline test.
- Verified that tests previously marked as flaky are no longer flaky, and removed the markers.

Note: Not verified with a TOSA reference model run.
ghstack-source-id: 305897251

Differential Revision: [D80308822](https://our.internmc.facebook.com/intern/diff/D80308822/)

pytorch-bot bot commented Aug 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13754

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 3 Unrelated Failures

As of commit ee58c9b with merge base 9053089 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Aug 28, 2025
@zingo zingo changed the title from "Add 16A8W linear ops support and test" to "Arm backend: Add 16A8W linear ops support and test" Aug 28, 2025
@zingo zingo added the ciflow/trunk, module: arm, and partner: arm labels Aug 28, 2025
@oscarandersson8218
Collaborator

@Ninja91 Nice change! The changes in op_transpose.py and op_view.py result in some test failures because we partition a few ops incorrectly. @per and @agrima1304 have patches that fix these failures, but their patches are blocked by the vela pin update in #13282.

If you move the changes in op_transpose.py and op_view.py to a separate PR, I believe we should be able to merge this PR.

Base automatically changed from gh/Ninja91/1/orig to main September 2, 2025 21:56
@lucylq lucylq merged commit f8156fb into main Sep 2, 2025
247 of 266 checks passed
@lucylq lucylq deleted the gh/Ninja91/3/orig branch September 2, 2025 22:01
@mergennachin
Contributor

@Ninja91 Arm tests started failing after this PR; see this dashboard:

https://hud.pytorch.org/hud/pytorch/executorch/main/1?per_page=50&name_filter=arm-&mergeEphemeralLF=true

@mergennachin
Contributor

mergennachin commented Sep 3, 2025

cc @lucylq @per @digantdesai @shoumikhin

lucylq added a commit that referenced this pull request Sep 3, 2025
@lucylq
Contributor

lucylq commented Sep 3, 2025

reverting here: #13895

lucylq added a commit that referenced this pull request Sep 3, 2025
…13895)

This reverts commit f8156fb.

@Ninja91
Contributor

Ninja91 commented Sep 3, 2025

@per @mergennachin @oscarandersson8218 the PR was reverted, and I am now re-landing it here: #13899.

Validated that no arm tests are failing.

Ninja91 added a commit that referenced this pull request Sep 3, 2025
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at
bottom):
* __->__ #13899



- Adds a linear ops test using the 16A8W config in the INT16 profile.
- Adds INT16 dtype support to view ops validation.
- Validated with the TOSA pipeline test.
- Verified that tests previously marked as flaky are no longer flaky, and removed the markers.

Note: Not verified with a TOSA reference model run.

Differential Revision:
[D81550511](https://our.internmc.facebook.com/intern/diff/D81550511/)

Reattempt to land #13754