Adding Tests for CadenceFusedConvReluQuantizer #16358
Conversation
🔗 Helpful Links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16358. Note: links to docs will display an error until the docs builds have been completed.
❌ As of commit fd92112 with merge base 050e2ee: 1 new failure (a job has failed) and 1 unrelated failure (a job marked as unstable, possibly due to flakiness on trunk).
Pull request overview
This PR adds comprehensive test coverage for the CadenceFusedConvReluQuantizer and several other previously untested Cadence quantizers. The key innovation is extending the test framework to support fused quantization patterns, where multiple operations (e.g., conv2d + relu) are quantized as a single unit, requiring annotations to be split across different nodes in the computation graph.
- Updated the graph builder function signature to optionally return a third element (input source node) for fused patterns
- Modified the test assertion logic to check output annotations on the output node and input annotations on the input source node (see the sketch after this list)
- Added 13 new test cases covering 6 previously untested quantizer classes
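A minimal sketch of that split check, assuming it runs as a helper method on the unittest.TestCase; the helper name `_check_annotations` and the direct read of `node.meta["quantization_annotation"]` are illustrative assumptions, not the PR's exact code:

```python
def _check_annotations(self, output_node, input_source_node=None) -> None:
    """Sketch of a TestCase helper: output_qspec is expected on output_node
    (e.g. relu), input_qspec_map on input_source_node (e.g. conv)."""
    if input_source_node is None:
        # Non-fused pattern: both annotations live on the same node.
        input_source_node = output_node

    # pt2e quantizers store their annotations under this node.meta key.
    out_ann = output_node.meta.get("quantization_annotation")
    self.assertIsNotNone(out_ann)
    self.assertIsNotNone(out_ann.output_qspec)

    in_ann = input_source_node.meta.get("quantization_annotation")
    self.assertIsNotNone(in_ann)
    self.assertGreater(len(in_ann.input_qspec_map), 0)
```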
Summary: Add annotation tests for CadenceWith16BitConvActivationsQuantizer covering both conv1d and conv2d operations. Differential Revision: D88895865
Differential Revision: D88896712
Differential Revision: D88898823
Differential Revision: D88898933
Differential Revision: D88899457
Differential Revision: D88955761
Summary:
Pull Request resolved: pytorch#16358

A fused pattern is when the quantizer recognizes a sequence of operations and treats it as a single unit for quantization purposes. For example, for a Conv2D + ReLU fusion, rather than having something like this:

```
input → [quantize] → conv2d → [dequantize] → [quantize] → relu → [dequantize] → output
```

a fused pattern quantizes them together like so:

```
input → [quantize] → conv2d → relu → [dequantize] → output
```

We need to make a few changes in our framework to test this.

# Change 1: Allow graph builders to return a 3rd element for fused patterns

For fused patterns like conv + relu, the quantization annotations are split across two nodes:
- The output annotation is on the relu node (the final output of the fused pattern).
- The input annotations are on the conv node (where the quantized inputs enter).

The existing graph builders return (gm, target_node), which works for single-op patterns where both annotations are on the same node. For fused patterns, we need to know both nodes, so graph builders can now optionally return (gm, output_node, input_source_node).

# Change 2: Check annotations on the correct nodes for fused patterns

The test previously assumed output_qspec and input_qspec_map were both on the same node. For fused patterns, they're on different nodes:
- output_qspec is checked on the output node (relu).
- input_qspec_map is checked on the input source node (conv).

This change is backwards-compatible: for non-fused patterns, both nodes are the same.

Reviewed By: hsharma35

Differential Revision: D89630759
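To make the builder change concrete, here is a rough, self-contained illustration (not the PR's actual builders, which go through the Cadence graph-building helpers and aten ops): a fused conv + relu builder returning the 3-tuple, followed by the backwards-compatible unpacking a test harness could use.

```python
import torch


class ConvRelu(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.conv(x))


def build_conv_relu_graph():
    """Fused-pattern builder sketch: returns (gm, output_node, input_source_node)."""
    gm = torch.fx.symbolic_trace(ConvRelu())
    conv_node = next(
        n for n in gm.graph.nodes if n.op == "call_module" and n.target == "conv"
    )
    relu_node = next(
        n for n in gm.graph.nodes if n.op == "call_function" and n.target == torch.relu
    )
    # Output annotation is expected on relu, input annotations on conv.
    return gm, relu_node, conv_node


# Backwards-compatible unpacking: builders may return 2 or 3 elements.
result = build_conv_relu_graph()
if len(result) == 3:
    gm, output_node, input_source_node = result
else:
    gm, output_node = result
    input_source_node = output_node  # non-fused: both annotations on one node
```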
Force-pushed from af22ffe to ccdb8e8.
Force-pushed from ccdb8e8 to fd92112.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
```python
def _build_layer_norm_graph(self) -> tuple[torch.fx.GraphModule, torch.fx.Node]:
    """Build a simple graph with a layer_norm operation."""
    # Input shape: (batch, features)
    x = torch.randn(1, 10)
    # normalized_shape must match the last dimension(s) of input
    normalized_shape = [10]
    gm = single_op_builder(
        placeholders=(x,),
        op=torch.ops.aten.layer_norm.default,
        args=(x, normalized_shape),
    )

    layer_norm_nodes = gm.graph.find_nodes(
        op="call_function",
        target=torch.ops.aten.layer_norm.default,
    )
    self.assertEqual(
        len(layer_norm_nodes), 1, "Should find exactly one layer_norm node"
    )
    # Add source_fn_stack metadata required by quantizer pattern matching
    layer_norm_nodes[0].meta["source_fn_stack"] = [
        ("layer_norm", torch.ops.aten.layer_norm.default)
    ]
    return gm, layer_norm_nodes[0]
```
Copilot AI (Dec 22, 2025):
This builder is inconsistent with the others. Most builders use GraphBuilder and include NodeMetadata with source_fn_stack at creation time. This builder uses single_op_builder and then manually adds source_fn_stack metadata after the fact. Consider either using GraphBuilder for consistency or documenting why single_op_builder is necessary for layer_norm.
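For reference, a sketch of what the GraphBuilder-based variant suggested here could look like, assuming the GraphBuilder / NodeMetadata API the other builders in this file appear to use (placeholder, call_operator, output, get_graph_module); this is an illustration under those assumptions, not code from the PR:

```python
import torch
from executorch.backends.cadence.aot.graph_builder import GraphBuilder
from executorch.exir.pass_base import NodeMetadata


def _build_layer_norm_graph(self) -> tuple[torch.fx.GraphModule, torch.fx.Node]:
    """GraphBuilder-based sketch: attach source_fn_stack at node-creation time."""
    builder = GraphBuilder()
    x = builder.placeholder("x", torch.randn(1, 10))
    layer_norm = builder.call_operator(
        op=torch.ops.aten.layer_norm.default,
        args=(x, [10]),
        meta=NodeMetadata(
            {"source_fn_stack": [("layer_norm", torch.ops.aten.layer_norm.default)]}
        ),
    )
    builder.output([layer_norm])
    gm = builder.get_graph_module()
    layer_norm_nodes = gm.graph.find_nodes(
        op="call_function", target=torch.ops.aten.layer_norm.default
    )
    self.assertEqual(len(layer_norm_nodes), 1, "Should find exactly one layer_norm node")
    return gm, layer_norm_nodes[0]
```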
```python
# Find the index of this input node in the input source node's args
arg_index = None
args = input_source_node.args
assert isinstance(args, tuple)
```
Copilot AI (Dec 22, 2025):
Using Python's assert statement in test code is not ideal because it can be disabled with optimization flags. Consider using self.assertIsInstance(args, tuple) instead to ensure the check always runs.
Suggested change:

```
- assert isinstance(args, tuple)
+ self.assertIsInstance(args, tuple)
```