Conversation

@Tessil (Collaborator) commented Jan 8, 2025

Annotate conv1d/conv2d/linear followed by relu/relu6 patterns as one block and fuse the activation into its parent.

The activation will then be done implicitly in the tosa.rescale node, which will have a -128 zero-point.
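
For illustration, a schematic sketch of the fusion side of this idea (this is not the actual ExecuTorch pass; the op sets and function name are assumptions): walk the FX graph, and when a conv1d/conv2d/linear node feeds directly into a relu/relu6, drop the standalone activation and let the parent's rescale clamp at the zero-point.

```python
import torch

# Assumed op sets for illustration; the real pass may match more variants.
FUSEABLE_PARENTS = {
    torch.ops.aten.conv1d.default,
    torch.ops.aten.conv2d.default,
    torch.ops.aten.linear.default,
}
ACTIVATIONS = {torch.ops.aten.relu.default, torch.ops.aten.relu6.default}


def fuse_quantized_activations(gm: torch.fx.GraphModule) -> None:
    for node in list(gm.graph.nodes):
        if node.op != "call_function" or node.target not in ACTIVATIONS:
            continue
        parent = node.args[0]
        if (
            isinstance(parent, torch.fx.Node)
            and parent.op == "call_function"
            and parent.target in FUSEABLE_PARENTS
        ):
            # The activation becomes implicit in the parent's tosa.rescale
            # (zero-point == qmin), so users can read the parent directly.
            node.replace_all_uses_with(parent)
            gm.graph.erase_node(node)
    gm.graph.lint()
    gm.recompile()
```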

@Tessil added the partner: arm label Jan 8, 2025
@pytorch-bot bot commented Jan 8, 2025

🔗 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7555

✅ No failures as of commit b60972b with merge base 08770b7.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Jan 8, 2025
@freddan80 merged commit a059981 into pytorch:main Jan 10, 2025
165 of 173 checks passed
@swolchok (Contributor):

This PR caused a type-checking failure:

executorch/backends/arm/_passes/fuse_quantized_activation_pass.py:27:48 Uninitialized local [61]: Local variable `zp` is undefined, or not always defined.
executorch/backends/arm/_passes/fuse_quantized_activation_pass.py:27:54 Uninitialized local [61]: Local variable `qmin` is undefined, or not always defined.

@freddan80 (Collaborator):

@Tessil will you have a look?

@Tessil (Collaborator, Author) commented Jan 14, 2025

@swolchok Thanks for the info. What would be the best way to fix it as the PR has already gone through? Raise a new one with a fix?

Note that `zp` and `qmin` should theoretically never be read while uninitialized: with the short-circuiting `and`, they are evaluated only when `is_quantized` is true, and in that case they are always initialized. Still, it is best to fix the type-checking issue.
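
For reference, a minimal self-contained sketch of the pattern being discussed (hypothetical code, not the contents of fuse_quantized_activation_pass.py): the locals are bound only on the quantized path, the final `and` short-circuits so they are never read uninitialized at runtime, but the type checker cannot prove that.

```python
from typing import Optional, Tuple


def is_fuseable(qparams: Optional[Tuple[int, int]]) -> bool:
    is_quantized = qparams is not None
    if is_quantized:
        zp, qmin = qparams  # only bound on this path
    # Runtime-safe thanks to short-circuiting, but Pyre reports
    # "Uninitialized local [61]" for `zp` and `qmin` here:
    return is_quantized and zp == qmin


def is_fuseable_fixed(qparams: Optional[Tuple[int, int]]) -> bool:
    # Unconditional defaults keep both locals always defined,
    # satisfying the type checker without changing behaviour.
    zp, qmin = 0, 1
    is_quantized = qparams is not None
    if is_quantized:
        zp, qmin = qparams
    return is_quantized and zp == qmin
```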

@swolchok (Contributor):

> Raise a new one with a fix?

yep!

@Tessil (Collaborator, Author) commented Jan 15, 2025

@swolchok Fix: #7671
Thanks

YIWENX14 pushed a commit that referenced this pull request Jan 28, 2025
@3l1 (Contributor) commented Jul 16, 2025

Hey @Tessil, a bit late to this PR, but can you please clarify why we are specializing on the zp == qmin check here? Shouldn't there be allowance for variations of quantization observers that may not satisfy this equality?

cc @freddan80 @digantdesai @limintang @gggekov

@digantdesai (Contributor) commented Jul 18, 2025

Also FYI @Ninja91

@Ninja91 (Contributor) commented Jul 22, 2025

@Tessil Following up on @3l1's question: why is this check needed?

@Tessil (Collaborator, Author) commented Jul 23, 2025

@3l1 @Ninja91 Sorry for the late reply. The FuseQuantizedActivationPass requires zp == qmin (which should always be the case for asymmetric quantization of activations with a [0, inf) range), as the activation's lower-bound clamp is handled by the tosa.rescale.
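
A small worked example (with assumed int8 parameters, not taken from the pass) of why this holds: for asymmetric quantization of a range starting at 0, the zero-point lands exactly on qmin.

```python
# Asymmetric int8 quantization of a [0, rmax] activation range,
# e.g. the output of a ReLU6. Illustrative values only.
qmin, qmax = -128, 127
rmin, rmax = 0.0, 6.0

scale = (rmax - rmin) / (qmax - qmin)  # 6 / 255
zp = round(qmin - rmin / scale)        # rmin == 0  =>  zp == qmin
assert zp == qmin == -128
```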

@3l1 (Contributor) commented Jul 24, 2025

> @3l1 @Ninja91 Sorry for the late reply. The FuseQuantizedActivationPass requires zp == qmin (which should always be the case for asymmetric quantization of activations with a [0, inf) range), as the activation's lower-bound clamp is handled by the tosa.rescale.

Thanks, yes, this makes sense. I just saw that you were checking for a ReLU, so it makes sense for that activation. Although I think expecting exactly qmin may be a bit strict: depending on the quantization config, the zero-point could easily be qmin + epsilon.

@digantdesai (Contributor) commented Jul 24, 2025

> as the activation's lower-bound clamp is handled by the tosa.rescale.

Can you say more? The rescale op should be able to handle qmin != zp. Just trying to understand better: if the quantization flow ensures the zp is "valid" for a [0, inf) range, why do we have this constraint here?

@Tessil (Collaborator, Author) commented Jul 28, 2025

The tosa.rescale op is able to handle qmin != zp, but with qmin == zp we are sure that any rescaled output will be >= 0 when dequantized, so the rescale performs an implicit ReLU (hence why the ReLU is optimised out).
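
A minimal numeric sketch of that argument (illustrative values, not the pass internals): once the rescale clamps its int8 output to [qmin, qmax] and zp == qmin, dequantization can never produce a negative value, which is exactly a ReLU.

```python
def dequantize(q: int, scale: float, zp: int) -> float:
    return scale * (q - zp)

qmin, qmax, zp, scale = -128, 127, -128, 0.05

for raw in (-300, -129, -128, 0, 127, 200):
    clamped = max(qmin, min(qmax, raw))  # clamp applied by the rescale
    assert dequantize(clamped, scale, zp) >= 0.0  # implicit ReLU
```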

It might be possible to relax the constraint, but it would be best to check with the other members of the team as I'm now working on a different project.

@digantdesai (Contributor):

Thanks @Tessil

> the rescale performs an implicit ReLU (hence why the ReLU is optimised out)

Makes sense to remove the ReLU through the rescale's qmin. That said, @Ninja91 might also have found a bug where the quantizer generates a ReLU q/dq with a qmin that is != zp when used in some patterns.

> It might be possible to relax the constraint, but it would be best to check with the other members of the team as I'm now working on a different project.

Who would be the right PoC?

@Ninja91 (Contributor) commented Jul 29, 2025

@Tessil Please loop in the right POCs on the linked bug.

@Tessil (Collaborator, Author) commented Jul 29, 2025

> That said, @Ninja91 might also have found a bug where the quantizer generates a ReLU q/dq with a qmin that is != zp when used in some patterns.

Yeah, that's quite strange, as the output of a ReLU should only observe values >= 0, hence a zp equal to qmin for asymmetric quantization.

> Who would be the right PoC?

@digantdesai @Ninja91 Best check with @freddan80

@digantdesai (Contributor):

FYI - #12959
