[ET-VK][qconv] Add q8ta_conv2d_transposed operator#18034

Merged
SS-JIA merged 2 commits into gh/SS-JIA/462/orig from gh/SS-JIA/463/orig
Mar 10, 2026

Conversation

@pytorchbot (Collaborator)

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #18016 by @SS-JIA
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/463/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/463/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/SS-JIA/462/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/SS-JIA/463/orig
Differential Revision: D95807070
@diff-train-skip-merge

Pull Request resolved: #18016

Implement quantized transposed 2D convolution for the Vulkan backend,
enabling int8 transposed convolutions used in decoder/upsampling networks.

The GLSL shader iterates over all kernel positions and derives valid input
positions via (output + padding - kernel) / stride. Invalid positions use
input_zp_packed so the precomputed weight_sums zero-point correction
remains consistent. Reuses the existing q8ta_conv2d weight packing and
workgroup size selection since, after the pattern matcher reshapes
transposed weights from (IC, OC, KH, KW) to (OC, KH*KW*IC), the layout
is identical to regular conv2d.
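
The output-to-input mapping described above can be sketched as a small helper. This is an illustrative Python model of the shader's index math, not the GLSL itself; the function name and signature are hypothetical.

```python
def gather_input_positions(out_pos, padding, stride, kernel_size, in_size):
    """For one output coordinate of a transposed convolution, list the
    (kernel_index, input_index) pairs that contribute to it.

    Mirrors the shader's derivation input = (output + padding - kernel) / stride:
    a position is valid only when the division is exact and the resulting
    input index is in bounds. (In the shader, invalid positions instead read
    input_zp_packed so the weight_sums zero-point correction stays consistent.)
    """
    pairs = []
    for k in range(kernel_size):
        numerator = out_pos + padding - k
        if numerator < 0 or numerator % stride != 0:
            continue  # stride does not divide evenly: no input taps here
        in_pos = numerator // stride
        if in_pos < in_size:
            pairs.append((k, in_pos))
    return pairs
```

For example, with stride=2, padding=1, and a 3-wide kernel, output position 0 is fed by exactly one tap (kernel index 1 reading input index 0), while output position 1 is fed by kernel indices 0 and 2.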

Supports hardware int8 dot product with a software fallback, grouped
convolutions, and optional bias and ReLU activation. Only dilation=1 is
supported, matching the ATen conv_transpose2d constraint.

This diff was authored with Claude.
ghstack-source-id: 349646651
@exported-using-ghexport

Differential Revision: [D95807070](https://our.internmc.facebook.com/intern/diff/D95807070/)
@pytorchbot requested a review from SS-JIA as a code owner, Mar 10, 2026 08:53
@pytorch-bot

pytorch-bot bot commented Mar 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18034

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the "CLA Signed" label (managed by the Facebook bot; authors must sign the CLA before a PR can be reviewed) on Mar 10, 2026
Pull Request resolved: #18017

Some quantized linear projections (e.g. in EdgeTAM's SpatialPerceiver / mask
decoder) decompose as aten.bmm instead of aten.mm. Add aten.bmm.default as an
anchor node in the quantized linear pattern detector so these nodes can be fused
into custom quantized linear ops. Reject bmm nodes with batch dim > 1 since the
custom ops assume a single batch.
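
The guard described above can be sketched as follows. This is a simplified model, not the actual ExecuTorch pattern-detector code: the `Node` dataclass and `is_linear_anchor` name are hypothetical stand-ins for the real FX graph nodes and matcher.

```python
from dataclasses import dataclass

@dataclass
class Node:
    target: str          # e.g. "aten.mm.default" or "aten.bmm.default"
    input_shape: tuple   # shape of the first operand

# Anchor ops recognized by the quantized linear pattern; bmm is the new addition.
ANCHOR_TARGETS = {"aten.mm.default", "aten.bmm.default"}

def is_linear_anchor(node: Node) -> bool:
    """Accept a node as a quantized-linear anchor. bmm decompositions are
    fusable only when the batch dimension is 1, since the custom quantized
    linear ops assume a single batch."""
    if node.target not in ANCHOR_TARGETS:
        return False
    if node.target == "aten.bmm.default":
        return node.input_shape[0] == 1  # reject batched bmm
    return True
```

Under this sketch, a bmm with input shape (1, M, K) is accepted while (4, M, K) is rejected, matching the single-batch constraint stated above.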
ghstack-source-id: 349646654
@exported-using-ghexport

Differential Revision: [D95807072](https://our.internmc.facebook.com/intern/diff/D95807072/)
@SS-JIA SS-JIA merged commit a63a916 into gh/SS-JIA/462/orig Mar 10, 2026
174 of 179 checks passed
@SS-JIA SS-JIA deleted the gh/SS-JIA/463/orig branch March 10, 2026 14:37