Add fuse() to remaining QuantizationPatterns (#19727) by ethansfng · Pull Request #19727 · pytorch/executorch

ethansfng · 2026-05-21T20:29:51Z

Summary:

Add fuse() implementations to the remaining Cadence QuantizationPattern subclasses:

- `MaxPool2dPattern`, `MaxPool2dWithoutIndicesPattern` — order-preserving pool on quantized values
- `ReluBasePattern` (inherited by `ReluPattern0`/`1`) — relu with requantization
- `ConvReluBasePattern` (inherited by `Conv1d`/`2dReluPattern0`/`1`) — conv+relu fusion with `anchor_ops()` override to match only the conv op
- `SoftmaxPattern` — softmax with dummy mask/pos tensors and fake_mode metadata
- `MixedW8A32LinearPattern` — weight-only quantized linear (no input/output quant)
- `MixedW8A32ConvPattern` — weight-only quantized conv1d with NCL→NLC permutation
- `MixedW8A32GruPattern` — weight-only quantized GRU with 4 dequantized params
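
The "order-preserving" remark on the max-pool patterns can be illustrated with a small sketch. It is not taken from the PR: `quantize` and `max_pool_1d` are hypothetical helpers, the pool is 1-D and non-overlapping for brevity, and int8 affine quantization with a positive scale is assumed. Because rounding and clamping are monotonic, taking the max before or after quantizing should give the same integers, which is why the pool can run directly on quantized values.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Affine-quantize one float to the int8 range (round, shift, clamp)."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def max_pool_1d(values, kernel):
    """Non-overlapping 1-D max pool over a flat list (illustration only)."""
    return [max(values[i:i + kernel]) for i in range(0, len(values), kernel)]


scale, zero_point = 0.05, 3  # assumed quantization parameters
x = [-1.2, 0.4, 0.41, 2.0, -0.3, -0.31, 7.5, 7.0]

# Float path: pool first, then quantize the pooled values.
pool_then_quant = [quantize(v, scale, zero_point) for v in max_pool_1d(x, 2)]

# Fused path: quantize every input, then pool on the integers.
quant_then_pool = max_pool_1d([quantize(v, scale, zero_point) for v in x], 2)

# Monotonic quantization commutes with max, so both paths agree.
assert pool_then_quant == quant_then_pool
```

The last window saturates at `qmax` on both paths, so clamping does not break the equivalence.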

Differential Revision: D105728177

pytorch-bot · 2026-05-21T20:30:10Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19727

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 82a1dd9 with merge base ec76470:

NEW FAILURE - The following job has failed:

pull / test-parakeet-xnnpack-linux / linux-job (gh)
RuntimeError: Command docker exec -t 44fb81d3d25dfd5a4e4ff87f52031f2d6e62869f96af130153f724341f9cfdad /exec failed with exit code 1

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.
pull / unittest-editable / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-05-21T20:30:11Z

@ethansfng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D105728177.

github-actions · 2026-05-21T20:35:54Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

…19743)

Summary: torchao's `convert_pt2e` adds `out_dtype` kwargs to dequant nodes for bf16 models. `cadence::dequantize_per_tensor` doesn't support this kwarg (it hardcodes float32 output), so `ReplacePT2DequantWithCadenceDequantPass` crashes when it forwards kwargs blindly to the cadence op.

Strip `out_dtype` from kwargs before creating the cadence dequant node, and insert an `aten.to.dtype` cast after it to preserve the original output dtype semantics.

Differential Revision: D105630451
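
The kwarg handling described in this commit can be sketched without any graph machinery. The following is a minimal illustration, not the pass itself: `split_out_dtype` and `plan_replacement` are hypothetical names, dtypes are plain strings rather than `torch.dtype` values, and the "plan" is just a list of (op name, kwargs) pairs standing in for the nodes the real pass would create. It also assumes a cast is emitted whenever `out_dtype` was present.

```python
def split_out_dtype(kwargs):
    """Return (kwargs without 'out_dtype', the removed value or None)."""
    cleaned = {k: v for k, v in kwargs.items() if k != "out_dtype"}
    return cleaned, kwargs.get("out_dtype")


def plan_replacement(dequant_kwargs):
    """List the ops to emit: the cadence dequant, then an optional cast."""
    cleaned, out_dtype = split_out_dtype(dequant_kwargs)
    # The cadence op never sees out_dtype; it produces float32.
    plan = [("cadence::dequantize_per_tensor", cleaned)]
    if out_dtype is not None:
        # Assumption: restore the requested dtype with a trailing cast.
        plan.append(("aten.to.dtype", {"dtype": out_dtype}))
    return plan


bf16_plan = plan_replacement({"out_dtype": "bfloat16"})
fp32_plan = plan_replacement({})
```

Here `bf16_plan` holds two entries (dequant, then cast) and `fp32_plan` holds only the dequant, so graphs that never set `out_dtype` are left unchanged.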

Summary: Add infrastructure for per-pattern `fuse()` methods on Cadence `QuantizationPattern`:

- Add `anchor_ops()` (default: `tuple(partition_types())`) and `fuse()` (default: `None`) to `QuantizationPattern` base class
- Add shared fusion helpers: `_get_dequant`, `_find_quant_user`, `_insert_fused_op`, `_maybe_route_depthwise_conv1d`, `_fuse_conv`, `_fuse_linear`, `_fuse_matmul`
- Add `QuantFusionPass` to `compiler_funcs.py` — shared executor that iterates patterns, matches `anchor_ops()`, calls `fuse()` with debug logging and dead code elimination

Differential Revision: D105728137
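
The shape of this infrastructure can be sketched with toy classes. Everything below is a simplified stand-in written for illustration: the real `QuantizationPattern` and `QuantFusionPass` operate on FX graph nodes, whereas here a "graph" is a list of op-name strings, `run_fusion` is a hypothetical executor, and the `Toy*` patterns are invented. Only the two defaults (`anchor_ops()` returning `tuple(partition_types())`, `fuse()` returning `None`) come from the summary above.

```python
class QuantizationPattern:
    """Toy base class mirroring the two defaults named in the summary."""

    def partition_types(self):
        raise NotImplementedError

    def anchor_ops(self):
        # Default: anchor on every op the pattern partitions.
        return tuple(self.partition_types())

    def fuse(self, node):
        # Default: the pattern does not implement fusion.
        return None


class ToyReluPattern(QuantizationPattern):
    def partition_types(self):
        return ["relu"]

    def fuse(self, node):
        return "quantized_" + node


class ToyUnfusedPattern(QuantizationPattern):
    def partition_types(self):
        return ["add"]  # inherits fuse() -> None


class ToyConvReluPattern(QuantizationPattern):
    def partition_types(self):
        return ["conv", "relu"]

    def anchor_ops(self):
        # Override so only the conv op starts a match.
        return ("conv",)


def run_fusion(ops, patterns):
    """Try each pattern on each op; keep the op if nothing fuses it."""
    result = []
    for op in ops:
        replacement = None
        for pattern in patterns:
            if op in pattern.anchor_ops():
                replacement = pattern.fuse(op)
                if replacement is not None:
                    break
        result.append(op if replacement is None else replacement)
    return result


fused = run_fusion(["relu", "add", "mul"], [ToyReluPattern(), ToyUnfusedPattern()])
```

In this toy run only `relu` is rewritten; `add` matches a pattern whose `fuse()` returns `None`, and `mul` matches nothing, so both pass through untouched.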

Summary: Add `fuse()` implementations to the first batch of Cadence `QuantizationPattern` subclasses — the standard fully-quantized patterns that use the shared `_fuse_conv`, `_fuse_linear`, and `_fuse_matmul` helpers:

- `AddmmPattern` — transpose weight + linear fusion
- `AddPattern` — two-input quantized add
- `AddReluBasePattern` — add+relu fusion with `anchor_ops()` override
- `BmmPattern`, `MatmulPattern` — matmul fusion via `_fuse_matmul`
- `CatPattern` — cat passthrough on quantized inputs
- `Conv1dPattern`, `Conv2dPattern` — conv fusion via `_fuse_conv` with depthwise routing
- `LayerNormPattern` — layer norm with default weight/bias creation
- `LinearPattern` — linear fusion via `_fuse_linear`

Differential Revision: D105728156
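
The "transpose weight + linear fusion" note on `AddmmPattern` rests on a simple identity: `addmm(bias, x, w)` computes `bias + x @ w`, while `linear(x, weight, bias)` computes `x @ weight.T + bias`, so passing the transposed weight to linear reproduces addmm. The sketch below checks this with plain nested lists; `matmul`, `transpose`, `addmm`, and `linear` are local illustrative helpers that follow the PyTorch semantics with default scaling, not calls into the library.

```python
def matmul(a, b):
    """Naive matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def addmm(bias, x, w):
    """bias + x @ w, with w laid out as (in_features, out_features)."""
    return [[v + b for v, b in zip(row, bias)] for row in matmul(x, w)]


def linear(x, weight, bias):
    """x @ weight.T + bias, with weight laid out as (out_features, in_features)."""
    return [[v + b for v, b in zip(row, bias)] for row in matmul(x, transpose(weight))]


x = [[1, 2, 3], [4, 5, 6]]     # (batch=2, in=3)
w = [[1, 0], [0, 1], [2, -1]]  # (in=3, out=2), addmm layout
bias = [10, 20]

addmm_out = addmm(bias, x, w)
linear_out = linear(x, transpose(w), bias)  # weight transposed for linear
assert addmm_out == linear_out
```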

Summary: Add `fuse()` implementations to the remaining Cadence `QuantizationPattern` subclasses:

- `MaxPool2dPattern`, `MaxPool2dWithoutIndicesPattern` — order-preserving pool on quantized values
- `ReluBasePattern` (inherited by `ReluPattern0`/`1`) — relu with requantization
- `ConvReluBasePattern` (inherited by `Conv1d`/`2dReluPattern0`/`1`) — conv+relu fusion with `anchor_ops()` override to match only the conv op
- `SoftmaxPattern` — softmax with dummy mask/pos tensors and fake_mode metadata
- `MixedW8A32LinearPattern` — weight-only quantized linear (no input/output quant)
- `MixedW8A32ConvPattern` — weight-only quantized conv1d with NCL→NLC permutation
- `MixedW8A32GruPattern` — weight-only quantized GRU with 4 dequantized params

Differential Revision: D105728177
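
For the relu entry, a rough sketch of why relu can be evaluated on quantized integers: with affine quantization, real zero maps to the zero point, so clamping below at the zero point matches dequantize, relu, quantize. The sketch assumes the input and output share the same scale and zero point, which is a simplification; the summary says the actual pattern requantizes, meaning the output parameters may differ. All function names are illustrative and not from the PR.

```python
def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    return max(qmin, min(qmax, round(x / scale) + zero_point))


def relu_reference(q, scale, zero_point):
    """Float path: dequantize, apply relu, quantize with the same params."""
    x = max(0.0, dequantize(q, scale, zero_point))
    return quantize(x, scale, zero_point)


def relu_quantized(q, zero_point):
    """Integer path: real 0.0 corresponds to zero_point, so clamp there."""
    return max(q, zero_point)


scale, zero_point = 0.25, -5  # assumed shared input/output parameters

# Exhaustive check over the int8 range.
mismatches = [
    q for q in range(-128, 128)
    if relu_reference(q, scale, zero_point) != relu_quantized(q, zero_point)
]
assert mismatches == []
```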

meta-cla Bot added the CLA Signed label May 21, 2026

meta-codesync Bot added fb-exported meta-exported labels May 21, 2026

meta-codesync Bot changed the title ~~Add fuse() to remaining QuantizationPatterns~~ Add fuse() to remaining QuantizationPatterns (#19727) May 21, 2026

ethansfng force-pushed the export-D105728177 branch from 7eb7834 to 25e52f7 Compare May 21, 2026 21:00

ethansfng force-pushed the export-D105728177 branch from 25e52f7 to 7c183db Compare May 21, 2026 21:10

ethansfng force-pushed the export-D105728177 branch from 7c183db to 008c464 Compare May 21, 2026 21:18

ethansfng force-pushed the export-D105728177 branch from 008c464 to e65a3a3 Compare May 22, 2026 18:49

ethansfng added 4 commits May 22, 2026 17:36

ethansfng force-pushed the export-D105728177 branch from e65a3a3 to 82a1dd9 Compare May 23, 2026 00:36

ethansfng wants to merge 4 commits into pytorch:main from ethansfng:export-D105728177