Cortex-M: Fuse relu activation into quantized_add #18462
rascani merged 5 commits into pytorch:main
Conversation
ResNet8 has skip connections with relu(add(conv(x), skip(x))). The ActivationFusionPass only fused relu into conv/linear, leaving 3 unfused relu ops that fell through to the portable aten::relu.out kernel, which incorrectly clamps int8 tensors at literal 0 instead of at the quantized zero_point, causing numerical mismatches on the FVP.

Add fused activation patterns (relu, hardtanh, clamp) for add/add_ to quantizer_support.py BINARY_OP_PATTERNS so the quantizer produces activation-aware quantization bounds. Add aten.add.Tensor to ActivationFusionPass FUSE_OPS. Update QuantizedOpFusionPass to read activation bounds from output_qparams and pass them to quantized_add. Update the quantized_add operator (schema, meta, impl, C++) to accept activation_min/activation_max parameters.

Co-authored-by: Claude <noreply@anthropic.com>
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18462
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 1 Cancelled Job, 3 Unrelated Failures as of commit e527a04 with merge base 7c79395:
NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@AdrianLundell - I still need to add tests for this, but I wanted to make sure this is the right approach for the Quantizer.
Add add_relu, add_relu_channels_last, add_hardtanh, and add_hardtanh_channels_last test cases to test_add.py verifying that relu/hardtanh activations are fused into quantized_add. Remove the conv_add_relu xfail from test_nn_modules.py since the fusion now works. Co-authored-by: Claude <noreply@anthropic.com>
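For orientation, a minimal sketch of the shape of such a test module (the class name and details are illustrative, not the actual test_add.py code):

```python
import torch

class AddRelu(torch.nn.Module):
    # relu(x + y) should lower to a single cortex_m quantized_add with fused
    # activation bounds instead of a standalone relu op.
    def forward(self, x, y):
        return torch.relu(x + y)

# A channels_last variant would feed the same module 4D inputs converted with
# .to(memory_format=torch.channels_last).
```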
Remove add_.Tensor + activation fused patterns from BINARY_OP_PATTERNS. Functionalization converts inplace ops to out-of-place before the quantizer runs, so these patterns are never matched. Co-authored-by: Claude <noreply@anthropic.com>
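As a standalone illustration of that point (not code from this PR): after torch.export, functionalization rewrites in-place ops into their out-of-place equivalents, so the quantizer only ever sees aten.add.Tensor.

```python
import torch

class InplaceAdd(torch.nn.Module):
    def forward(self, x, y):
        z = x.clone()
        z.add_(y)          # aten.add_.Tensor in eager mode
        return torch.relu(z)

ep = torch.export.export(InplaceAdd(), (torch.randn(4), torch.randn(4)))
# The exported graph contains the functionalized aten.add.Tensor rather than
# aten.add_.Tensor, so an add_-based quantizer pattern would never match.
print(ep.graph_module.code)
```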
Failures unrelated.
Summary
ResNet8 has skip connections with relu(add(conv(x), skip(x))). The ActivationFusionPass only fused relu into conv/linear, leaving 3 unfused relu ops that fell through to the portable aten::relu.out kernel, which incorrectly clamps int8 tensors at literal 0 instead of at the quantized zero_point, causing numerical mismatches on the FVP.
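To see why clamping at literal 0 diverges from clamping at the zero_point, here is a minimal standalone sketch (the scale/zero_point values are made up for illustration, and this is not the portable kernel's code):

```python
import torch

scale, zero_point = 0.05, -10  # illustrative asymmetric int8 qparams, zero_point != 0

x_fp = torch.tensor([-0.8, -0.2, 0.0, 0.3, 1.2])
x_q = torch.clamp(torch.round(x_fp / scale) + zero_point, -128, 127).to(torch.int8)

relu_correct = torch.clamp(x_q, min=zero_point)  # clamp at the int8 encoding of 0.0
relu_wrong = torch.clamp(x_q, min=0)             # clamp at literal 0 (the bug above)

dequant = lambda q: (q.to(torch.int32) - zero_point).float() * scale
print(dequant(relu_correct))  # [0.0, 0.0, 0.0, 0.3, 1.2], matches torch.relu(x_fp)
print(dequant(relu_wrong))    # [0.5, 0.5, 0.5, 0.5, 1.2], values below 0.5 are clamped up
```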
- Add fused activation patterns (relu, hardtanh, clamp) for add/add_ to BINARY_OP_PATTERNS in quantizer_support.py so the quantizer produces activation-aware quantization bounds.
- Add aten.add.Tensor to the ActivationFusionPass FUSE_OPS.
- Update QuantizedOpFusionPass to read activation bounds from output_qparams and pass them to quantized_add (see the sketch after this list).
- Update the quantized_add operator (schema, meta, impl, C++) to accept activation_min/activation_max parameters.
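For the activation-bound plumbing, a rough sketch of the idea (the function name and hardtanh handling here are assumptions for illustration, not the actual QuantizedOpFusionPass code):

```python
def fused_activation_bounds(activation, scale, zero_point, qmin=-128, qmax=127):
    """Map a fused activation onto (activation_min, activation_max) in the int8 domain."""
    if activation == "relu":
        # relu clamps real values at 0.0, which quantizes to the zero_point.
        return max(qmin, zero_point), qmax
    if activation == "hardtanh":
        # hardtanh with default bounds (-1.0, 1.0); the real bounds would come
        # from the hardtanh node's min_val/max_val arguments.
        lo = round(-1.0 / scale) + zero_point
        hi = round(1.0 / scale) + zero_point
        return max(qmin, lo), min(qmax, hi)
    # No fused activation: use the full quantized range.
    return qmin, qmax
```

The quantized_add kernel can then clamp its requantized int8 output to [activation_min, activation_max], which makes the separate relu node unnecessary.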