Conversation

@Ninja91 Ninja91 commented Aug 29, 2025

Stack from ghstack (oldest at bottom):

Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.

This follows the pattern established for linear, mul, sigmoid, tanh, slice, view/transpose, cat, and FCNode operations, extending int16 support to RequantizeNode rescale operations.

Changes:

  • Add INT16 dtype validation support in op_rescale.py (see the sketch after this list)
  • Enable rescale operations for 16A8W quantization configuration
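
For illustration, here is a minimal sketch of what INT16-aware dtype validation could look like; the function and constant names are assumptions for this sketch, not the actual op_rescale.py API:

```python
# Hypothetical sketch: INT16-aware dtype validation for a rescale node.
# SUPPORTED_RESCALE_DTYPES and validate_rescale_dtypes are illustrative
# names, not the real op_rescale.py API.
import torch

SUPPORTED_RESCALE_DTYPES = (torch.int8, torch.int16, torch.int32)

def validate_rescale_dtypes(input_dtype: torch.dtype,
                            output_dtype: torch.dtype) -> None:
    # Reject any dtype the rescale lowering cannot represent.
    for role, dtype in (("input", input_dtype), ("output", output_dtype)):
        if dtype not in SUPPORTED_RESCALE_DTYPES:
            raise ValueError(
                f"Rescale: unsupported {role} dtype {dtype}; expected one of "
                f"{SUPPORTED_RESCALE_DTYPES}"
            )
```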

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency. RequantizeNode rescale operations are essential for proper quantization scaling in the 16A8W pipeline.
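
To make the role of rescale concrete: requantization maps integer values from one (scale, zero-point) pair to another. Below is a minimal numpy sketch of that arithmetic for int16; it uses float math for readability, whereas a real TOSA RESCALE lowering would use a fixed-point multiplier and shift:

```python
# Illustrative requantization ("rescale") arithmetic for int16 tensors.
# Float-based sketch for clarity; not the backend's actual lowering.
import numpy as np

def rescale_int16(q_in: np.ndarray, s_in: float, zp_in: int,
                  s_out: float, zp_out: int) -> np.ndarray:
    real = (q_in.astype(np.int64) - zp_in) * s_in         # dequantize
    q = np.rint(real / s_out).astype(np.int64) + zp_out   # requantize
    return np.clip(q, -32768, 32767).astype(np.int16)     # clamp to int16
```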

Differential Revision: D80513725

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218

pytorch-bot bot commented Aug 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13802

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures, 107 Cancelled Jobs

As of commit 4f3502b with merge base 1d37845:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Ninja91 added a commit that referenced this pull request Aug 29, 2025
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 304555411
Pull Request resolved: #13802

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 29, 2025
@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

@zingo zingo added ciflow/trunk module: arm Issues related to arm backend partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm labels Aug 29, 2025
@zingo zingo changed the title from "Add INT16 support to rescale operation" to "Arm backend: Add INT16 support to rescale operation" Aug 29, 2025
@digantdesai digantdesai left a comment

I didn't see tests for the TOSA dialect ops like the ones we have for torch ops; perhaps we should add them here. @per what do you think?
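
As a sketch of the kind of coverage being suggested, a direct numeric check on int16 rescale behavior might look as follows; this reuses the hypothetical rescale_int16 helper sketched in the description above and is not the backend's actual test harness:

```python
# Hypothetical pytest-style check for int16 rescale behavior.
# Assumes the rescale_int16 sketch above is in scope.
import numpy as np

def test_rescale_int16_identity_and_clamp():
    q = np.array([-32768, -1, 0, 1, 32767], dtype=np.int16)
    # Identical quantization parameters: rescale is the identity.
    assert np.array_equal(rescale_int16(q, 0.02, 0, 0.02, 0), q)
    # Halving the output scale doubles magnitudes, clamped to int16.
    out = rescale_int16(q, 0.02, 0, 0.01, 0)
    assert out[0] == -32768 and out[-1] == 32767  # saturated endpoints
```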

Ninja91 added a commit that referenced this pull request Sep 4, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 304555411

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

@digantdesai digantdesai left a comment

Review automatically exported from Phabricator review in Meta.

Ninja91 added a commit that referenced this pull request Sep 6, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 308021436

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

Ninja91 added a commit that referenced this pull request Sep 6, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 308024305

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

@Ninja91 Ninja91 added the release notes: arm Changes to the ARM backend delegate label Sep 8, 2025
Ninja91 added a commit that referenced this pull request Sep 10, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 308024305

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

Ninja91 added a commit that referenced this pull request Sep 10, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 308860606

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

Ninja91 added a commit that referenced this pull request Sep 10, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 308928949

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

Ninja91 added a commit that referenced this pull request Sep 11, 2025
Pull Request resolved: #13802
Add INT16 support for RequantizeNode rescale operations in ExecutorTorch ARM backend.
ghstack-source-id: 308986675

@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D80513725

@facebook-github-bot facebook-github-bot merged commit 8c3c565 into gh/Ninja91/15/base Sep 13, 2025
291 of 303 checks passed
@facebook-github-bot facebook-github-bot deleted the gh/Ninja91/15/head branch September 13, 2025 08:41
@mergennachin
@Ninja91 You need to land this in main

cc @digantdesai (reviewer) @lucylq (oncall)

lucylq pushed a commit that referenced this pull request Sep 15, 2025
Differential Revision: D80513725

Pull Request resolved: #13802
lucylq added a commit that referenced this pull request Sep 15, 2025
Differential Revision: D80513725

Pull Request resolved: #13802


#13802 (comment)
failed to cp to main

Co-authored-by: Nitin Jain <jainnitin@meta.com>
StrycekSimon pushed a commit to nxp-upstream/executorch that referenced this pull request Sep 23, 2025
Differential Revision: D80513725

Pull Request resolved: pytorch#13802


pytorch#13802 (comment)
failed to cp to main

Co-authored-by: Nitin Jain <jainnitin@meta.com>