Add add-relu fusion in the quantizer #19077
Conversation
Summary:
Fold ReLU into add in the quantizer, the same way we already fold it into convolutions. The `quantized_add` op already performs requantization, so when an add is followed by a ReLU, we can use the output quant params directly (with the zero point at the minimum of the range) instead of emitting a separate `quantized_relu`. This saves one op in the quantized graph.

Changes:
- Add `AddReluBasePattern`, `AddReluPattern0`, and `AddReluPattern1` in `patterns.py`
- Handle AddRelu patterns in `QuantFusion.call()` in `fusion_pass.py`, reusing `get_args_and_kwargs_add`
- Register the patterns in `CadenceFusedConvReluQuantizer` before the standalone `AddPattern`, so the fused pattern matches first
- Add an annotation test for the fused add+relu pattern

Reviewed By: khazaei

Differential Revision: D102189156
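The core trick in the summary can be demonstrated numerically: if the quantized output's zero point sits at the minimum of the integer range, the clamp performed during requantization maps every negative sum to zero after dequantization, which is exactly ReLU. This is a minimal sketch with illustrative scale/zero-point values, not code from the PR:

```python
import numpy as np

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # Standard affine quantization: scale, shift, then clamp to the int8 range.
    return np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

# Reference: float add followed by ReLU.
a = np.array([-1.5, 0.25, 2.0], dtype=np.float32)
b = np.array([0.5, -1.0, 1.0], dtype=np.float32)
ref = np.maximum(a + b, 0.0)

# Fused version: requantize the sum with zero_point = qmin. Every negative
# sum falls below qmin and is clamped there, so it dequantizes to exactly 0;
# the requantization step performs the ReLU for free.
scale = 3.0 / 255   # illustrative output range [0, 3] over int8
zp = -128           # zero point at the minimum of the quantized range
q_sum = quantize(a + b, scale, zp)
fused = dequantize(q_sum, scale, zp)

assert np.allclose(fused, ref, atol=scale)  # agrees up to one quantization step
```

This is why no separate `quantized_relu` node is needed once the output quant params are chosen this way.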
See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19077.

CI status as of commit e598914 (merge base e7b38a3): ❌ 1 new failure, 2 cancelled jobs.
@mcremon-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D102189156.
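The summary notes that the fused patterns are registered before the standalone `AddPattern` so the fused pattern matches first. The reason ordering matters can be shown with a toy first-match-wins matcher; the names here are illustrative, not the actual ExecuTorch quantizer API:

```python
def match(ops, patterns):
    """Return the first registered pattern whose op sequence prefixes `ops`."""
    for name, seq in patterns:
        if ops[: len(seq)] == seq:
            return name
    return None

# Fused pattern registered first: an add followed by relu fuses.
patterns = [("add_relu", ["add", "relu"]), ("add", ["add"])]
assert match(["add", "relu"], patterns) == "add_relu"
assert match(["add"], patterns) == "add"

# With the standalone pattern first, the fused pattern would never fire,
# because the bare "add" pattern consumes the add node before "add_relu"
# is ever considered.
swapped = [("add", ["add"]), ("add_relu", ["add", "relu"])]
assert match(["add", "relu"], swapped) == "add"
```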