Fused quant bmm kernel (#19489) by DrJessop · Pull Request #19489 · pytorch/executorch

DrJessop · 2026-05-11T23:04:29Z

Summary:

Adds a fused quantized batch matrix multiply (bmm) kernel with optional dequantize and quantize steps. It is a binary op on 3D tensors: [B,M,K] x [B,K,N] -> [B,M,N]. It supports per-tensor and per-channel quantization.

Reviewed By: mvartani-meta

Differential Revision: D103754815
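
For readers unfamiliar with the pattern, here is a minimal pure-Python sketch of what "bmm with optional dequantize/quantize" could mean semantically. It is not the kernel from this PR: the function names, the `(scale, zero_point)` tuple layout, the int8 clamp range, and the rounding mode are all assumptions made for illustration, and only the per-tensor case is shown.

```python
from typing import List, Optional, Tuple

Tensor3D = List[List[List[float]]]
QParams = Tuple[float, int]  # hypothetical layout: (scale, zero_point), per-tensor only


def dequantize(q: Tensor3D, qparams: QParams) -> Tensor3D:
    """Map quantized integers back to real values: (q - zero_point) * scale."""
    scale, zero_point = qparams
    return [[[(v - zero_point) * scale for v in row] for row in mat] for mat in q]


def quantize(x: Tensor3D, qparams: QParams, qmin: int = -128, qmax: int = 127) -> Tensor3D:
    """Map real values to clamped integers: round(x / scale) + zero_point."""
    scale, zero_point = qparams
    # Python's round() rounds halves to even; a real kernel may round differently.
    return [
        [[max(qmin, min(qmax, round(v / scale) + zero_point)) for v in row] for row in mat]
        for mat in x
    ]


def bmm(a: Tensor3D, b: Tensor3D) -> Tensor3D:
    """Batch matrix multiply: [B,M,K] x [B,K,N] -> [B,M,N]."""
    assert len(a) == len(b), "batch sizes must match"
    out = []
    for mat_a, mat_b in zip(a, b):
        k, n = len(mat_b), len(mat_b[0])
        assert all(len(row) == k for row in mat_a), "inner dimensions must match"
        out.append(
            [[sum(row[i] * mat_b[i][j] for i in range(k)) for j in range(n)] for row in mat_a]
        )
    return out


def quant_bmm_reference(
    a: Tensor3D,
    b: Tensor3D,
    a_qparams: Optional[QParams] = None,
    b_qparams: Optional[QParams] = None,
    out_qparams: Optional[QParams] = None,
) -> Tensor3D:
    """Unfused reference: dequantize inputs if asked, multiply, quantize output if asked."""
    if a_qparams is not None:
        a = dequantize(a, a_qparams)
    if b_qparams is not None:
        b = dequantize(b, b_qparams)
    result = bmm(a, b)
    if out_qparams is not None:
        result = quantize(result, out_qparams)
    return result


# One batch, 2x2 matrices; input A uses scale 0.5, B uses scale 1.0, output uses scale 0.5.
a_q = [[[1, 2], [3, 4]]]
b_q = [[[2, 0], [0, 2]]]
print(quant_bmm_reference(a_q, b_q, (0.5, 0), (1.0, 0), (0.5, 0)))  # -> [[[2, 4], [6, 8]]]
```

A fused kernel would presumably produce the same result without materializing the intermediate float tensors; the per-channel case would replace the single scale and zero point with one pair per channel.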

pytorch-bot · 2026-05-11T23:04:32Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19489

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

Run pull jobs on OSDC in pull requests shadow mode

❌ 4 New Failures, 3 Unrelated Failures

As of commit 8392933 with merge base 8020fe0:

NEW FAILURES - The following jobs have failed:

pull / unittest / windows / windows-job (gh)
exir/tests/test_joint_graph.py::TestJointGraph::test_joint_graph
pull / unittest-editable / linux / linux-job (gh)
exir/tests/test_joint_graph.py::TestJointGraph::test_joint_graph
pull / unittest-editable / macos / macos-job (gh)
exir/tests/test_joint_graph.py::TestJointGraph::test_joint_graph
pull / unittest-editable / windows / windows-job (gh)
exir/tests/test_joint_graph.py::TestJointGraph::test_joint_graph

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

pull / test-qnn-testsuite-linux / test-backend-linux (qnn, models) / linux-job (gh) (detected as infra flaky with no log or failing log classifier)
pull / unittest / linux / linux-job (gh) (similar failure)
exir/tests/test_memory_planning.py::TestMisc::test_multiple_pools_1

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / macos / macos-job (gh) (trunk failure)
exir/tests/test_memory_planning.py::TestMisc::test_multiple_pools_1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-05-11T23:04:52Z

@DrJessop has exported this pull request. If you are a Meta employee, you can view the originating Diff in D103754815.

github-actions · 2026-05-11T23:05:47Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


meta-cla Bot added the CLA Signed label May 11, 2026

meta-codesync Bot added the fb-exported and meta-exported labels May 11, 2026

aliafzal self-requested a review May 11, 2026 23:14

aliafzal approved these changes May 11, 2026


meta-codesync Bot changed the title ~~Fused quant bmm kernel~~ Fused quant bmm kernel (#19489) May 12, 2026

DrJessop force-pushed the export-D103754815 branch from 762a9cb to e5f8f21 Compare May 12, 2026 04:23

Andrew Grebenisan added 2 commits May 12, 2026 10:09

Fused quant hardswish kernel (pytorch#19488)

d32b87e

Summary: Fused quant hardswish kernel with optional dequantize/quantize. Unary op that applies x * min(max(x+3, 0), 6) / 6. Supports per-tensor and per-channel quantization.

Reviewed By: mvartani-meta

Differential Revision: D103754780
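
As a rough illustration of the hardswish formula quoted in this commit message, the sketch below applies x * min(max(x+3, 0), 6) / 6 between an optional dequantize step and an optional quantize step. It is not the kernel from the commit: the function names, the `(scale, zero_point)` tuple layout, the int8 clamp range, and the rounding mode are assumptions, and only the per-tensor case is covered.

```python
from typing import List, Optional, Tuple

QParams = Tuple[float, int]  # hypothetical layout: (scale, zero_point), per-tensor only


def hardswish(x: float) -> float:
    """Elementwise hardswish: x * min(max(x + 3, 0), 6) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def quant_hardswish_reference(
    values: List[float],
    in_qparams: Optional[QParams] = None,
    out_qparams: Optional[QParams] = None,
    qmin: int = -128,
    qmax: int = 127,
) -> List[float]:
    """Unfused reference: dequantize if asked, apply hardswish, quantize if asked."""
    if in_qparams is not None:
        scale, zero_point = in_qparams
        values = [(v - zero_point) * scale for v in values]
    result = [hardswish(v) for v in values]
    if out_qparams is not None:
        scale, zero_point = out_qparams
        # Python's round() rounds halves to even; a real kernel may round differently.
        result = [max(qmin, min(qmax, round(r / scale) + zero_point)) for r in result]
    return result


# Quantized inputs with scale 0.5 represent the real values -3.0, 0.0, 1.0, 3.0.
print(quant_hardswish_reference([-6, 0, 2, 6], (0.5, 0), (0.25, 0)))  # -> [0, 0, 3, 12]
```

Passing `None` for either set of quantization parameters skips that step, which is one plausible reading of "optional dequantize/quantize" in the summary.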

DrJessop force-pushed the export-D103754815 branch from e5f8f21 to 8392933 Compare May 12, 2026 17:09

DrJessop wants to merge 2 commits into pytorch:main from DrJessop:export-D103754815
