Quantize compatible node + activation patterns as one block #7555
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7555
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit b60972b with merge base 08770b7.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Annotate conv1d/conv2d/linear followed by relu/relu6 patterns as one block and fuse the activation into its parent. The activation is then performed implicitly by the tosa.rescale node, which will have a -128 zero-point. Change-Id: I5bf1e2c91be21ab842012fbc20d159af7fe2222d
Force-pushed from 8df7f54 to b60972b.
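For context on the fusion described in the commit message, here is a minimal numeric sketch (plain NumPy with a made-up scale, not the ExecuTorch/TOSA implementation) of why a requantization whose zero-point equals qmin (-128 for int8) performs the ReLU clamp implicitly:

```python
import numpy as np

def requantize(x_fp, scale, zp, qmin=-128, qmax=127):
    # Quantize a float tensor to int8 and saturate at the type range,
    # mirroring what a tosa.rescale-style requantization does.
    q = np.round(x_fp / scale) + zp
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize(x_q, scale, zp):
    return (x_q.astype(np.float32) - zp) * scale

scale, zp = 0.05, -128          # zp == qmin, as in the fused pattern
x = np.array([-1.0, -0.2, 0.0, 0.3, 1.5], dtype=np.float32)

fused = dequantize(requantize(x, scale, zp), scale, zp)
reference = dequantize(requantize(np.maximum(x, 0.0), scale, zp), scale, zp)

# With zp == qmin, negative inputs saturate at qmin, which dequantizes
# to exactly 0.0, so the rescale alone reproduces relu + requantize.
assert np.allclose(fused, reference)
```

Because every negative value saturates at qmin and qmin dequantizes to 0.0 when zp == qmin, the separate relu node becomes redundant once the pattern is quantized as one block.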
This PR caused a type-checking failure:
@Tessil will you have a look?
@swolchok Thanks for the info. What would be the best way to fix it, as the PR has already gone through? Raise a new one with a fix? Note that the …
yep!
hey @Tessil, a bit late to this PR, but ... can you please clarify why we are specializing on the check for …?
Also FYI @Ninja91
thanks. yes this makes sense -- i just saw that you were checking for a relu, so this makes sense for that activation. although i think expecting qmin may be a bit strict: depending on the quantization config, it could easily be qmin + epsilon.
Can you say more? The rescale OP should be able to handle qmin != zp. Just trying to better understand: if the quantization flow is ensuring the zp is "valid" for the [0, inf) range, why do we have this constraint here?
The … It might be possible to relax the constraint, but that would be best to check with the other members of the team, as I'm working on a different project.
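To make the constraint being discussed concrete, here is a small hypothetical check (not code from the PR) of why the fusion relies on zp == qmin: the rescale can only saturate at the int8 range boundary, so if zp were above qmin, quantized values between qmin and zp would still represent negative real values and would survive the fused op:

```python
def dequantize(q, scale, zp):
    return (q - zp) * scale

scale = 0.05

# Case 1: zp == qmin. Everything at or below zp saturates at qmin = -128,
# which dequantizes to exactly 0.0, matching ReLU.
print(dequantize(-128, scale, zp=-128))   # 0.0

# Case 2: zp > qmin (e.g. zp = -120). Saturating at qmin = -128 still keeps
# values like -125, which dequantize to a negative number, so the fused
# rescale would let negative activations through.
print(dequantize(-125, scale, zp=-120))   # -0.25, not clamped to 0
```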
Thanks @Tessil
Makes sense to remove the relu through the rescale qmin. That said, @Ninja91 might also have found a bug where the quantizer generates a ReLU q/dq with a qmin that is != zp when used in some patterns.
Who would be the right PoC?
@Tessil Please loop in the right POCs on the linked bug.
Yeah, that's quite strange, as the output of a ReLU should only observe >= 0 values, hence a zero-point equal to qmin for asymmetric quantization.
@digantdesai @Ninja91 Best to check with @freddan80
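For reference, the usual asymmetric (affine) qparam computation from an observed min/max range (a generic sketch, not the specific observer used by the Arm quantizer) yields exactly zp == qmin when the observed minimum is 0, which is what a ReLU output should produce:

```python
def asymmetric_qparams(min_val, max_val, qmin=-128, qmax=127):
    # Standard affine quantization parameters from an observed range.
    # The range is extended to include 0 so that 0.0 is exactly representable.
    min_val, max_val = min(min_val, 0.0), max(max_val, 0.0)
    scale = (max_val - min_val) / (qmax - qmin)
    zero_point = int(round(qmin - min_val / scale))
    return scale, max(qmin, min(qmax, zero_point))

# A ReLU output only observes values >= 0, so min_val == 0 and zp == qmin.
print(asymmetric_qparams(0.0, 6.0))   # (~0.0235, -128)
```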
FYI - #12959