[ET-VK][DCE] Remove redundant quantized linear implementations #14151

SS-JIA · 2025-09-10T15:56:32Z

Stack from ghstack (oldest at bottom):

As title. With the recently added support for dynamically quantized and weight-only quantized int4 and int8 linear, consolidated under `QuantizedLinear.cpp`, the operators implemented in `QuantizedLinearQGANW.cpp` and `QuantizedLinear_QTA8A_QGA4W.cpp` are no longer required.

This diff removes all code related to the operators implemented in those files, namely:

* `et_vk.linear_weight_int4.default`
* `et_vk.linear_qta8a_qga4w.default`

The AOT export logic needed to support those ops is also removed.

Differential Revision: D82120824

pytorch-bot · 2025-09-10T15:56:35Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14151

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 42 Unrelated Failures

As of commit a63a9b4 with merge base 598ba46:

NEW FAILURE - The following job has failed:

trunk / test-arm-backend (test_pytest_ops_ethosu_fvp) / linux-job (gh)
RuntimeError: Command docker exec -t 5d20dd0cfbd016c3a258b32a23b410085b8473fd65c8b34a92a71e6a5cb87fa8 /exec failed with exit code 1

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-binary-size-linux-gcc / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / test-moshi-linux / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / test-openvino-linux / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / test-setup-linux-gcc / linux-job (gh) (trunk failure)

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-09-10T15:56:52Z

This pull request was exported from Phabricator. Differential Revision: D82120824

github-actions · 2025-09-10T15:57:30Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

facebook-github-bot · 2025-09-10T16:14:04Z

This pull request was exported from Phabricator. Differential Revision: D82120824

facebook-github-bot · 2025-09-11T00:09:41Z

This pull request was exported from Phabricator. Differential Revision: D82120824

SS-JIA requested review from jackzhxng, larryliu0820, lucylq, mergennachin and swolchok as code owners September 10, 2025 15:56

SS-JIA mentioned this pull request Sep 10, 2025

[ET-VK] Misc code cleanup to recent Quantized Linear + SDPA implementations #14150

Merged

meta-cla bot added the CLA Signed label Sep 10, 2025

facebook-github-bot added the fb-exported label Sep 10, 2025

manuelcandales approved these changes Sep 10, 2025

facebook-github-bot merged commit b304076 into gh/SS-JIA/326/base Sep 11, 2025
244 of 290 checks passed

facebook-github-bot deleted the gh/SS-JIA/326/head branch September 11, 2025 04:51

facebook-github-bot temporarily deployed to cherry-pick-bot September 11, 2025 04:51 — with GitHub Actions Inactive

pytorchbot mentioned this pull request Sep 11, 2025

[ET-VK][DCE] Remove redundant quantized linear implementations #14198

Merged
