Conversation

@Tessil (Collaborator) commented Jan 8, 2025

Annotate conv1d/conv2d/linear followed by relu/relu6 patterns as one block and fuse the activation into its parent.

The activation will then be done implicitly in the tosa.rescale node, which will have a -128 zero-point.
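
For illustration, a schematic sketch of the fusion side of this idea (this is not the actual ExecuTorch pass; the op sets and function name are assumptions): walk the FX graph, and when a conv1d/conv2d/linear node feeds directly into a relu/relu6, drop the standalone activation and let the parent's rescale clamp at the zero-point.

```python
import torch

# Assumed op sets for illustration; the real pass may match more variants.
FUSEABLE_PARENTS = {
    torch.ops.aten.conv1d.default,
    torch.ops.aten.conv2d.default,
    torch.ops.aten.linear.default,
}
ACTIVATIONS = {torch.ops.aten.relu.default, torch.ops.aten.relu6.default}


def fuse_quantized_activations(gm: torch.fx.GraphModule) -> None:
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target not in ACTIVATIONS:
            continue
        parent = node.args[0]
        if (
            isinstance(parent, torch.fx.Node)
            and parent.op == "call_function"
            and parent.target in FUSEABLE_PARENTS
        ):
            # The activation becomes implicit in the parent's tosa.rescale
            # (zero-point == qmin), so users can read the parent directly.
            node.replace_all_uses_with(parent)
            gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
```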

@Tessil added the partner: arm label Jan 8, 2025
@pytorch-bot bot commented Jan 8, 2025

🔗 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7555

✅ No failures as of commit b60972b with merge base 08770b7.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jan 8, 2025
@freddan80 merged commit a059981 into pytorch:main Jan 10, 2025
165 of 173 checks passed
@swolchok (Contributor):

This PR caused a type-checking failure:

executorch/backends/arm/_passes/fuse_quantized_activation_pass.py:27:48 Uninitialized local [61]: Local variable `zp` is undefined, or not always defined.
executorch/backends/arm/_passes/fuse_quantized_activation_pass.py:27:54 Uninitialized local [61]: Local variable `qmin` is undefined, or not always defined.

@freddan80 (Collaborator):

@Tessil will you have a look?

@Tessil (Collaborator, Author) commented Jan 14, 2025

@swolchok Thanks for the info. What would be the best way to fix it as the PR has already gone through? Raise a new one with a fix?

Note that `zp` and `qmin` should theoretically never be read while uninitialized: with the short-circuiting `and`, they are evaluated only when `is_quantized` is true, and in that case they are always initialized. Still, it is best to fix the type-checking issue.
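
For reference, a minimal self-contained sketch of the pattern being discussed (hypothetical code, not the contents of fuse_quantized_activation_pass.py): the locals are bound only on the quantized path, the final `and` short-circuits so they are never read uninitialized at runtime, but the type checker cannot prove that.

```python
from typing import Optional, Tuple


def is_fuseable(qparams: Optional[Tuple[int, int]]) -> bool:
    is_quantized = qparams is not None
    if is_quantized:
        zp, qmin = qparams  # only bound on this path
    # Runtime-safe thanks to short-circuiting, but Pyre reports
    # "Uninitialized local [61]" for `zp` and `qmin` here:
    return is_quantized and zp == qmin


def is_fuseable_fixed(qparams: Optional[Tuple[int, int]]) -> bool:
    # Unconditional defaults keep both locals always defined,
    # satisfying the type checker without changing behaviour.
    zp, qmin = 0, 1
    is_quantized = qparams is not None
    if is_quantized:
        zp, qmin = qparams
    return is_quantized and zp == qmin
```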

@swolchok (Contributor):

> Raise a new one with a fix?

yep!

@Tessil (Collaborator, Author) commented Jan 15, 2025

@swolchok Fix: #7671
Thanks

YIWENX14 pushed a commit that referenced this pull request Jan 28, 2025
@3l1 (Contributor) commented Jul 16, 2025

Hey @Tessil, a bit late to this PR, but can you please clarify why we are specializing on the zp == qmin check here? Shouldn't there be allowance for variations of quantization observers that may not satisfy this equality?

cc @freddan80 @digantdesai @limintang @gggekov

@digantdesai (Contributor) commented Jul 18, 2025

Also FYI @Ninja91

@Ninja91 (Contributor) commented Jul 22, 2025

@Tessil Following up on @3l1's question: why is this check needed?

@Tessil (Collaborator, Author) commented Jul 23, 2025

@3l1 @Ninja91 Sorry for the late reply. The FuseQuantizedActivationPass requires zp == qmin (which should always be the case for asymmetric quantization of activations with a [0, inf) range), as the activation's lower-bound clamp is handled by the tosa.rescale.
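
A small worked example (with assumed int8 parameters, not taken from the pass) of why this holds: for asymmetric quantization of a range starting at 0, the zero-point lands exactly on qmin.

```python
# Asymmetric int8 quantization of a [0, rmax] activation range,
# e.g. the output of a ReLU6. Illustrative values only.
qmin, qmax = -128, 127
rmin, rmax = 0.0, 6.0

scale = (rmax - rmin) / (qmax - qmin)  # 6 / 255
zp = round(qmin - rmin / scale)        # rmin == 0  =>  zp == qmin
assert zp == qmin == -128
```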

@3l1 (Contributor) commented Jul 24, 2025

> @3l1 @Ninja91 Sorry for the late reply. The FuseQuantizedActivationPass requires zp == qmin (which should always be the case for asymmetric quantization of activations with a [0, inf) range), as the activation's lower-bound clamp is handled by the tosa.rescale.

Thanks, yes, this makes sense. I just saw that you were checking for a ReLU, so it makes sense for that activation. Although I think expecting exactly qmin may be a bit strict: depending on the quantization config, the zero-point could easily be qmin + epsilon.

@digantdesai (Contributor) commented Jul 24, 2025

> as the activation's lower-bound clamp is handled by the tosa.rescale.

Can you say more? The rescale op should be able to handle qmin != zp. Just trying to understand better: if the quantization flow ensures the zp is "valid" for a [0, inf) range, why do we have this constraint here?

@Tessil (Collaborator, Author) commented Jul 28, 2025

The tosa.rescale op is able to handle qmin != zp, but with qmin == zp we are sure that any rescaled output will be >= 0 when dequantized, so the rescale performs an implicit ReLU (hence why the ReLU is optimised out).
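
A minimal numeric sketch of that argument (illustrative values, not the pass internals): once the rescale clamps its int8 output to [qmin, qmax] and zp == qmin, dequantization can never produce a negative value, which is exactly a ReLU.

```python
def dequantize(q: int, scale: float, zp: int) -> float:
    return scale * (q - zp)

qmin, qmax, zp, scale = -128, 127, -128, 0.05

for raw in (-300, -129, -128, 0, 127, 200):
    clamped = max(qmin, min(qmax, raw))  # clamp applied by the rescale
    assert dequantize(clamped, scale, zp) >= 0.0  # implicit ReLU
```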

It might be possible to relax the constraint, but it would be best to check with the other members of the team as I'm now working on a different project.

@digantdesai (Contributor):

Thanks @Tessil

> the rescale performs an implicit ReLU (hence why the ReLU is optimised out)

Makes sense to remove the ReLU through the rescale's qmin. That said, @Ninja91 might also have found a bug where the quantizer generates a ReLU q/dq with a qmin that is != zp when used in some patterns.

> It might be possible to relax the constraint, but it would be best to check with the other members of the team as I'm now working on a different project.

Who would be the right PoC?

@Ninja91 (Contributor) commented Jul 29, 2025

@Tessil Please loop in the right POCs on the linked bug.

@Tessil (Collaborator, Author) commented Jul 29, 2025

> That said, @Ninja91 might also have found a bug where the quantizer generates a ReLU q/dq with a qmin that is != zp when used in some patterns.

Yeah, that's quite strange, as the output of a ReLU should only observe values >= 0, hence a zp equal to qmin for asymmetric quantization.

> Who would be the right PoC?

@digantdesai @Ninja91 Best check with @freddan80

@digantdesai (Contributor):

FYI - #12959
