
Cortex-M: Fuse relu activation into quantized_add #18462

Merged
rascani merged 5 commits into pytorch:main from rascani:cortex-m-fuse-add-relu
Mar 27, 2026

Conversation

@rascani (Contributor) commented Mar 24, 2026

Summary

ResNet8 has skip connections with relu(add(conv(x), skip(x))). The ActivationFusionPass only fused relu into conv/linear, leaving 3 unfused relu ops that fell through to portable aten::relu.out which incorrectly clamps int8 tensors to literal 0 instead of the quantized zero_point, causing numerical mismatches on the FVP.
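
A worked example of the mismatch (values hypothetical): with asymmetric int8 quantization, float 0.0 is represented by the zero_point, so a quantized relu must clamp at zero_point, not at the integer 0.

```python
import torch

# Hypothetical int8 qparams: scale 0.05, zero_point -20,
# so the int8 value -20 represents float 0.0.
scale, zero_point = 0.05, -20

x_int8 = torch.tensor([-40, -20, 10], dtype=torch.int8)
x_float = (x_int8.int() - zero_point) * scale                     # [-1.0, 0.0, 1.5]

# Correct quantized relu: clamp at zero_point.
ok = (x_int8.clamp(min=zero_point).int() - zero_point) * scale    # [0.0, 0.0, 1.5]

# Clamping at literal 0, as the portable relu kernel does, shifts
# everything below zero_point up to float 1.0 instead of 0.0.
bad = (x_int8.clamp(min=0).int() - zero_point) * scale            # [1.0, 1.0, 1.5]
```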

Add fused activation patterns (relu, hardtanh, clamp) for add/add_ to quantizer_support.py BINARY_OP_PATTERNS so the quantizer produces activation-aware quantization bounds. Add aten.add.Tensor to ActivationFusionPass FUSE_OPS. Update QuantizedOpFusionPass to read activation bounds from output_qparams and pass them to quantized_add. Update the quantized_add operator (schema, meta, impl, C++) to accept activation_min/activation_max parameters.
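
A rough Python reference of the updated operator semantics (a sketch, not the actual schema/meta/impl code; only the activation_min/activation_max parameter names come from the description above): the activation is fused by clamping the requantized sum to bounds expressed in the output's quantized domain.

```python
import torch

def quantized_add_ref(x, x_scale, x_zp, y, y_scale, y_zp,
                      out_scale, out_zp,
                      activation_min=-128, activation_max=127):
    # Dequantize both int8 operands and add in float.
    acc = (x.int() - x_zp) * x_scale + (y.int() - y_zp) * y_scale
    # Requantize to the output qparams.
    q = torch.round(acc / out_scale).int() + out_zp
    # Fused activation: for relu the quantizer would set
    # activation_min = out_zp (the quantized 0.0); for
    # hardtanh/clamp it quantizes the clamp bounds.
    return q.clamp(activation_min, activation_max).to(torch.int8)
```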

Co-authored-by: Claude <noreply@anthropic.com>
@rascani rascani requested review from AdrianLundell and psiddh March 24, 2026 20:44
pytorch-bot commented Mar 24, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18462

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job, 3 Unrelated Failures

As of commit e527a04 with merge base 7c79395:


👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Mar 24, 2026
@rascani (Contributor, Author) commented Mar 24, 2026

@AdrianLundell - I still need to add tests for this, but I wanted to make sure this is the right approach for the Quantizer.

github-actions bot commented

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Add add_relu, add_relu_channels_last, add_hardtanh, and
add_hardtanh_channels_last test cases to test_add.py verifying that
relu/hardtanh activations are fused into quantized_add. Remove the
conv_add_relu xfail from test_nn_modules.py since the fusion now works.

Co-authored-by: Claude <noreply@anthropic.com>
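
The added cases exercise patterns along these lines (a loose sketch; the real test_add.py harness and assertions differ): quantize a module whose forward is an add followed by an activation, run the Cortex-M passes, and check that no standalone relu/hardtanh op survives next to quantized_add.

```python
import torch

# Hypothetical modules mirroring the fused patterns under test.
class AddRelu(torch.nn.Module):
    def forward(self, a, b):
        return torch.relu(a + b)

class AddHardtanh(torch.nn.Module):
    def forward(self, a, b):
        return torch.nn.functional.hardtanh(a + b, min_val=-1.0, max_val=1.0)
```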
Review threads on backends/cortex_m/quantizer/quantizer_support.py (one now outdated).

@AdrianLundell (Collaborator) left a comment:
Looks correct to me, nice! Just two comments

Remove add_.Tensor + activation fused patterns from BINARY_OP_PATTERNS.
Functionalization converts inplace ops to out-of-place before the
quantizer runs, so these patterns are never matched.

Co-authored-by: Claude <noreply@anthropic.com>
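
To see why the add_.Tensor patterns were dead code, a minimal sketch (module name hypothetical): torch.export functionalizes the graph, so the in-place add is already aten.add.Tensor by the time the quantizer runs.

```python
import torch

class InplaceAdd(torch.nn.Module):
    def forward(self, a, b):
        out = a.clone()
        out.add_(b)            # aten.add_.Tensor in eager mode
        return torch.relu(out)

ep = torch.export.export(InplaceAdd(), (torch.randn(4), torch.randn(4)))
# The exported, functionalized graph contains aten.add.Tensor,
# not aten.add_.Tensor, so quantizer patterns keyed on the
# in-place variant never match.
print(ep.graph_module.code)
```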
@rascani (Contributor, Author) commented Mar 27, 2026

Failures unrelated.

@rascani rascani merged commit 6fccd5a into pytorch:main Mar 27, 2026
411 of 420 checks passed

Labels

ciflow/trunk, CLA Signed
