Arm backend: Add INT16 support to rescale operation #13802

Ninja91 · 2025-08-29T06:43:07Z

Stack from ghstack (oldest at bottom):

Add INT16 support for RequantizeNode rescale operations in the ExecuTorch Arm backend.

This follows the pattern established for linear, mul, sigmoid, tanh, slice, view/transpose, cat, and FCNode operations, extending INT16 support to RequantizeNode rescale operations.

Changes:

- Add INT16 dtype validation support in op_rescale.py
- Enable rescale operations for 16A8W quantization configuration
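
For readers unfamiliar with this kind of change, the dtype validation could look roughly like the sketch below. This is an illustration only, not the code in op_rescale.py: the `DType` enum, the `SUPPORTED_RESCALE_DTYPES` allow-list, and `validate_rescale_dtypes` are hypothetical names invented for this example, and the real set of accepted dtypes may differ.

```python
from enum import Enum


class DType(Enum):
    """Hypothetical stand-in for the backend's dtype enum."""

    INT8 = "int8"
    INT16 = "int16"
    INT32 = "int32"
    FP32 = "fp32"


# Hypothetical allow-list. The idea of the change is that INT16 joins the
# integer dtypes a rescale node is allowed to consume or produce.
SUPPORTED_RESCALE_DTYPES = frozenset({DType.INT8, DType.INT16, DType.INT32})


def validate_rescale_dtypes(input_dtype: DType, output_dtype: DType) -> None:
    """Raise ValueError if either side of the rescale uses an unsupported dtype."""
    for role, dtype in (("input", input_dtype), ("output", output_dtype)):
        if dtype not in SUPPORTED_RESCALE_DTYPES:
            supported = ", ".join(sorted(d.value for d in SUPPORTED_RESCALE_DTYPES))
            raise ValueError(
                f"Unsupported {role} dtype {dtype.value!r} for rescale; "
                f"expected one of: {supported}"
            )


# An INT32 accumulator rescaled to INT16 activations passes validation.
validate_rescale_dtypes(DType.INT32, DType.INT16)
```

Keeping the accepted dtypes in one set means a new dtype is a one-line change, and the error message always lists what is currently supported.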

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency. RequantizeNode rescale operations are essential for proper quantization scaling in the 16A8W pipeline.
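
As a rough illustration of why 16-bit activations help, the sketch below rescales the same INT32 accumulator value to INT16 and to INT8 and compares the dequantized results. It is a simplified floating-point model written for this description, not the backend's implementation: TOSA-style rescale uses an integer multiplier and shift, and the function names, scales, and the [-16, 16) value range here are assumptions chosen so the arithmetic is exact.

```python
def dtype_range(bits: int) -> tuple[int, int]:
    """Return (min, max) of a signed integer type with the given bit width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def rescale(q_in: int, in_scale: float, out_scale: float, out_bits: int,
            in_zp: int = 0, out_zp: int = 0) -> int:
    """Requantize q_in from (in_scale, in_zp) to (out_scale, out_zp).

    Floating-point reference only. Python's round() rounds halves to even,
    which may differ from the rounding mode used by a real rescale kernel.
    """
    qmin, qmax = dtype_range(out_bits)
    scaled = round((q_in - in_zp) * in_scale / out_scale) + out_zp
    return max(qmin, min(qmax, scaled))  # saturate to the output range


# Assumed setup: accumulator scale 2**-16, outputs covering roughly [-16, 16).
acc = 200_000                 # INT32 accumulator value
acc_scale = 2.0 ** -16        # real value = 200000 / 65536 = 3.0517578125
int16_scale = 2.0 ** -11      # 16 / 32768
int8_scale = 2.0 ** -3        # 16 / 128

q16 = rescale(acc, acc_scale, int16_scale, out_bits=16)  # 6250
q8 = rescale(acc, acc_scale, int8_scale, out_bits=8)     # 24

print(q16 * int16_scale)  # 3.0517578125 (no error for this value)
print(q8 * int8_scale)    # 3.0 (error of about 0.05)
```

With the same real-valued range, the INT16 output has 256 times finer steps than the INT8 output, which is the precision benefit the 16A8W configuration is after.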

Differential Revision: D80513725

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

pytorch-bot · 2025-08-29T06:43:11Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13802

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 107 Cancelled Jobs

As of commit 4f3502b with merge base 1d37845:

NEW FAILURES - The following jobs have failed:

- Apple / build-benchmark-app / macos-job (gh)
- Build Windows Wheels / pytorch/executorch / upload / upload-wheel-py3_10-cpu (gh)
  Unable to download artifact(s): Artifact not found for name: pytorch_executorch__3.10_cpu_x64
- Propose to merge ghstack orig PRs to main / Try to create a PR with ghstack /orig branch (gh)
  Process completed with exit code 1.
- trunk / test-llama-runner-mac (fp32, coreml) / macos-job (gh)
  RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1
- trunk / test-qnn-model (fp32, dl3) / linux-job (gh)
  RuntimeError: Command docker exec -t 63fe1aeb24fce19a6bce3695ab8e3fb6c320b89d9960ec4cb867889dc7910bd6 /exec failed with exit code 92

CANCELLED JOBS - The following jobs were cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-29T06:44:14Z

This pull request was exported from Phabricator. Differential Revision: D80513725

backends/arm/operators/op_rescale.py

digantdesai

I didn't see tests for the TOSA dialect ops like the ones we have for torch ops; perhaps we should add them here. @per what do you think?

facebook-github-bot · 2025-09-04T14:50:43Z

This pull request was exported from Phabricator. Differential Revision: D80513725

digantdesai

Review automatically exported from Phabricator review in Meta.

facebook-github-bot · 2025-09-06T17:31:49Z

This pull request was exported from Phabricator. Differential Revision: D80513725

facebook-github-bot · 2025-09-06T17:39:33Z

This pull request was exported from Phabricator. Differential Revision: D80513725

facebook-github-bot · 2025-09-10T15:32:16Z

This pull request was exported from Phabricator. Differential Revision: D80513725

facebook-github-bot · 2025-09-10T20:30:31Z

This pull request was exported from Phabricator. Differential Revision: D80513725

facebook-github-bot · 2025-09-10T23:29:07Z

This pull request was exported from Phabricator. Differential Revision: D80513725

facebook-github-bot · 2025-09-11T05:08:31Z

This pull request was exported from Phabricator. Differential Revision: D80513725

mergennachin · 2025-09-15T15:29:15Z

@Ninja91 You need to land this in main

cc @digantdesai (reviewer) @lucylq (oncall)

Ninja91 requested a review from digantdesai as a code owner August 29, 2025 06:43

This was referenced Aug 29, 2025

Arm backend: Add 16A8W support and test for add operation #13789

Merged

Arm backend: Add 16A8W support and test for mul operation #13795

Merged

meta-cla bot added the CLA Signed label Aug 29, 2025

facebook-github-bot added the fb-exported label Aug 29, 2025

zingo added the ciflow/trunk, module: arm, and partner: arm labels Aug 29, 2025

zingo changed the title ~~Add INT16 support to rescale operation~~ Arm backend: Add INT16 support to rescale operation Aug 29, 2025

digantdesai reviewed Aug 29, 2025

backends/arm/operators/op_rescale.py (resolved)

digantdesai requested changes Aug 29, 2025

Ninja91 mentioned this pull request Sep 4, 2025

Add 16A8W quantization configuration utility for ARM backend #13898

Open

Ninja91 mentioned this pull request Sep 4, 2025

[Arm] Support INT16 rescale ops with TOSA reference model run #13980

Open

digantdesai approved these changes Sep 5, 2025

Ninja91 added the release notes: arm label Sep 8, 2025

facebook-github-bot merged commit 8c3c565 into gh/Ninja91/15/base Sep 13, 2025
291 of 303 checks passed

facebook-github-bot deleted the gh/Ninja91/15/head branch September 13, 2025 08:41

facebook-github-bot had a problem deploying to cherry-pick-bot September 13, 2025 08:41 — with GitHub Actions Failure

facebook-github-bot had a problem deploying to cherry-pick-bot September 13, 2025 21:22 — with GitHub Actions Failure

lucylq mentioned this pull request Sep 15, 2025

Retake: Merged Arm backend: Add INT16 support to rescale operation #13802 #14300

Closed

lucylq pushed a commit that referenced this pull request Sep 15, 2025

Arm backend: Add INT16 support to rescale operation

544f4f6

Differential Revision: D80513725 Pull Request resolved: #13802

lucylq mentioned this pull request Sep 15, 2025

Arm backend: Add INT16 support to rescale operation #14301

Merged

lucylq added a commit that referenced this pull request Sep 15, 2025

Arm backend: Add INT16 support to rescale operation (#14301)

eaad1c2

Differential Revision: D80513725 Pull Request resolved: #13802 #13802 (comment) failed to cp to main Co-authored-by: Nitin Jain <jainnitin@meta.com>
