[ET-VK] Conv2d quantize/dequantize ops for conv2d activations #14330

SS-JIA · 2025-09-16T14:20:08Z

Stack from ghstack (oldest at bottom):

Context

As title. Add shaders to quantize a floating-point conv2d input tensor to a packed int8 memory layout, and to dequantize an int8 conv2d output tensor back to a floating-point representation.

Hooking these ops up to the export logic will be handled in a follow-up diff.

Differential Revision: D82542335
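
For readers unfamiliar with the terms, here is a minimal CPU-side sketch of quantizing floats to int8 and dequantizing them again. It is not the shader code from this PR. It assumes per-tensor affine quantization with a single `scale` and `zero_point`, and it assumes four int8 values share one 32-bit word, as the `kInt8x4` name in the related PR #14329 suggests. The function names, the byte order within a word, the rounding mode, and the tail padding are all hypothetical choices made for illustration.

```python
# Illustrative sketch only. Names, byte order, rounding mode, and padding are
# assumptions; none of this is taken from the shaders added in the PR.

def quantize(x, scale, zero_point):
    """Map a float to a signed 8-bit value, clamped to [-128, 127]."""
    # Python's round() rounds halves to the nearest even integer.
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q, scale, zero_point):
    """Map a signed 8-bit value back to an approximate float."""
    return (q - zero_point) * scale


def pack_int8x4(values):
    """Pack four int8 values into one 32-bit word, first value in the low byte."""
    assert len(values) == 4
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)  # keep the two's-complement byte
    return word


def unpack_int8x4(word):
    """Recover four signed int8 values from one 32-bit word."""
    out = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        out.append(byte - 256 if byte >= 128 else byte)  # sign-extend
    return out


def quantize_and_pack(xs, scale, zero_point):
    """Quantize a flat list of floats and pack it four values per word."""
    qs = [quantize(x, scale, zero_point) for x in xs]
    # Pad the tail with the quantized value of 0.0 so it dequantizes to zero.
    qs += [quantize(0.0, scale, zero_point)] * ((-len(qs)) % 4)
    return [pack_int8x4(qs[i:i + 4]) for i in range(0, len(qs), 4)]


def unpack_and_dequantize(words, scale, zero_point):
    """Inverse of quantize_and_pack, including any padded tail values."""
    return [dequantize(q, scale, zero_point)
            for w in words for q in unpack_int8x4(w)]


if __name__ == "__main__":
    packed = quantize_and_pack([-3.0, 0.0, 0.3, 100.0, 0.75], 0.5, 3)
    print([hex(w) for w in packed])
    print(unpack_and_dequantize(packed, 0.5, 3))
```

Values outside the representable range saturate at the int8 limits, so the round trip is lossy for large inputs as well as for values that fall between quantization steps.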


pytorch-bot · 2025-09-16T14:20:13Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14330

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job, 3 Unrelated Failures

As of commit ffec699 with merge base c18abc8:

NEW FAILURES - The following jobs have failed:

pull / test-qnn-wheel-packages-linux (3.10) / linux-job (gh)
RuntimeError: Command docker exec -t c7ac157a850b9d14317d9c7d34a7122ee88b500c1fd35016df39837e8404892b /exec failed with exit code 1
pull / test-qnn-wheel-packages-linux (3.11) / linux-job (gh)
RuntimeError: Command docker exec -t 6b061ed11d5d6ef5b1ca4decb3b806d354591579de1c3181107f08ccef50f830 /exec failed with exit code 1
pull / test-qnn-wheel-packages-linux (3.12) / linux-job (gh)
RuntimeError: Command docker exec -t c9f3bb6e335f6aaa9377f91b5ee2647cf8c3641ad9c0a61b9a34e4a809a9849a /exec failed with exit code 1

CANCELLED JOB - The following job was cancelled. Please retry:

pull / test-setup-linux-gcc / linux-job (gh)

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / unittest-arm-backend-with-no-fvp (test_pytest_models) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-binary-size-linux-gcc / linux-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / unittest / macos / macos-job (gh) (trunk failure)
exir/tests/test_quant_fusion_pass.py::TestQuantFusionPass::test_embedding_torchao

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/) ghstack-source-id: 309992179 Pull Request resolved: #14330

facebook-github-bot · 2025-09-16T14:20:25Z

@SS-JIA has exported this pull request. If you are a Meta employee, you can view the originating diff in D82542335.

github-actions · 2025-09-16T14:21:00Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Pull Request resolved: #14330 Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/) ghstack-source-id: 310013406

facebook-github-bot · 2025-09-16T16:05:52Z

@SS-JIA has exported this pull request. If you are a Meta employee, you can view the originating diff in D82542335.

Pull Request resolved: #14330 ghstack-source-id: 310277511 Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/)

facebook-github-bot · 2025-09-17T14:15:06Z

@SS-JIA has exported this pull request. If you are a Meta employee, you can view the originating diff in D82542335.

Pull Request resolved: #14330 ghstack-source-id: 310286204 Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/)

facebook-github-bot · 2025-09-17T14:58:23Z

@SS-JIA has exported this pull request. If you are a Meta employee, you can view the originating diff in D82542335.

Pull Request resolved: #14330 ghstack-source-id: 312106550 Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/)

facebook-github-bot · 2025-09-25T15:53:19Z

@SS-JIA has exported this pull request. If you are a Meta employee, you can view the originating diff in D82542335.

Pull Request resolved: #14330 ghstack-source-id: 312106550 Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/)

@SS-JIA

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #14330 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/330/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/330/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/331/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/330/orig
Differential Revision: [D82542335](https://our.internmc.facebook.com/intern/diff/D82542335/)
@diff-train-skip-merge
Co-authored-by: ssjia <ssjia@devvm1479.ncg0.facebook.com>

SS-JIA mentioned this pull request Sep 16, 2025

[ET-VK] Add kInt8x4 dtype and GPUMemoryLayouts for packed quantized tensors #14329

Merged

meta-cla bot added the CLA Signed label Sep 16, 2025

facebook-github-bot added fb-exported meta-exported labels Sep 16, 2025

SS-JIA requested review from larryliu0820 and kirklandsign as code owners September 17, 2025 14:57

SS-JIA mentioned this pull request Sep 25, 2025

[ET-VK] Improve q8 matmul by increasing TILE_N4 #14597

Merged

manuelcandales approved these changes Sep 25, 2025


facebook-github-bot merged commit 55b45d8 into gh/SS-JIA/330/base Sep 25, 2025
123 of 132 checks passed

facebook-github-bot deleted the gh/SS-JIA/330/head branch September 25, 2025 20:04

facebook-github-bot temporarily deployed to cherry-pick-bot September 25, 2025 20:05 — with GitHub Actions Inactive

pytorchbot mentioned this pull request Sep 25, 2025

[ET-VK] Conv2d quantize/dequantize ops for conv2d activations #14611

Merged
