Conversation
Helpful links: see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16607
#### Summary

This diff fixes the Conv1d w8a32 operator by adding a transformation to the `val` attribute of the `other_inputs[0].meta` dictionary. Specifically, the `permute` operation is applied to the `original_val` tensor under the `fake_mode` context, and the resulting `transposed_val` is assigned to `transposed_inputs.meta["val"]`.

Reviewed By: mcremon-meta

Differential Revision: D89863750
Pull request overview
This PR fixes the Conv1d w8a32 operator by adding metadata propagation for transposed tensors and adding input validation to prevent unsupported configurations.
Changes:
- Added metadata propagation for the `val` attribute when creating transposed inputs and weights in the Conv1d w8a32 operator
- Added validation in patterns.py to bail out early when the input length does not equal the kernel size (marked as "not yet supported")
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backends/cadence/aot/quantizer/fusion_pass.py | Adds proper fake_mode-aware metadata propagation for transposed_inputs and transposed_weights when transforming Conv1d tensors from NCL to NLC format |
| backends/cadence/aot/quantizer/patterns.py | Adds validation to reject Conv1d operations where input length doesn't equal kernel size (3), marking this configuration as not yet supported |
```python
# Bail if length != kernel size - Not yet supported
if inputs_shape[-1] != cnn_weights_shape[2]:
    return (
        PartitionAnchors(
            empty=True,
        ),
        conv_layer,
    )
```
This check restricts the w8a32_conv pattern to only match when the input length equals the kernel size (3). While the comment indicates this is intentionally not yet supported, this is quite restrictive. Standard convolution operations typically support input lengths greater than or equal to the kernel size. The reference implementation in ref_implementations.py (lines 926-970) and the test in test_ref_implementations.py (lines 1156-1166 show length=5 with kernel=3) both support arbitrary input lengths. Consider whether this restriction is necessary, or if it should be relaxed to allow input_length >= kernel_size to enable the optimization in more cases.
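To make the suggested relaxation concrete, here is a minimal sketch contrasting the current guard with the `input_length >= kernel_size` variant the reviewer proposes. The variable names follow the diff above; the example shapes are illustrative assumptions, not values from the PR.

```python
# Illustrative shapes (assumptions): NCL input of length 5, kernel size 3.
inputs_shape = (1, 4, 5)       # (batch, channels, length)
cnn_weights_shape = (8, 4, 3)  # (out_channels, in_channels, kernel)

# Current guard: bail unless length == kernel size exactly.
bail_strict = inputs_shape[-1] != cnn_weights_shape[2]

# Suggested relaxation: bail only when length < kernel size,
# allowing the fusion for any valid convolution input.
bail_relaxed = inputs_shape[-1] < cnn_weights_shape[2]
```

With length 5 and kernel 3, the strict guard rejects the pattern while the relaxed one admits it, which matches the reference implementation's behavior described above.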
```python
# Propagate val metadata for transposed_inputs
if "val" in other_inputs[0].meta:
    original_val = other_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [0, 2, 1]
            )
        transposed_inputs.meta["val"] = transposed_val
    else:
        transposed_inputs.meta["val"] = torch.ops.aten.permute.default(
            original_val, [0, 2, 1]
        )
    copy_node_metadata(transposed_inputs, other_inputs[0])
```
```python
transposed_weights = graph_module.graph.call_function(
    torch.ops.aten.permute.default,
    (weights_inputs[0], [2, 0, 1]),  # NCL -> LNC
)
# Propagate val metadata for transposed_weights
if "val" in weights_inputs[0].meta:
    original_val = weights_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [2, 0, 1]
            )
        transposed_weights.meta["val"] = transposed_val
    else:
        transposed_weights.meta["val"] = torch.ops.aten.permute.default(
            original_val, [2, 0, 1]
        )
```
The metadata propagation logic for transposed_inputs (lines 435-448) and transposed_weights (lines 455-468) is duplicated with only minor variations. This pattern also appears elsewhere in the codebase (e.g., lines 164-176, 189-200, 376-385, 653-671). Consider extracting this into a helper function to reduce code duplication and improve maintainability. The helper function could take parameters like the node, transformation operation, and transformation arguments.
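A minimal sketch of the helper the review suggests, assuming the codebase's fx-style nodes with a `meta` dict. The name `propagate_permuted_val` and its signature are hypothetical, not an existing API:

```python
import torch


def propagate_permuted_val(new_node, source_node, dims):
    """Hypothetical helper: copy source_node.meta["val"] to new_node,
    permuted by `dims`, re-entering the FakeTensorMode when one is
    attached to the original fake tensor."""
    if "val" not in source_node.meta:
        return
    original_val = source_node.meta["val"]
    # FakeTensors carry a fake_mode; plain tensors do not.
    fake_mode = getattr(original_val, "fake_mode", None)
    if fake_mode is not None:
        with fake_mode:
            new_node.meta["val"] = torch.ops.aten.permute.default(
                original_val, dims
            )
    else:
        new_node.meta["val"] = torch.ops.aten.permute.default(
            original_val, dims
        )
```

The two duplicated blocks in the diff would then collapse to `propagate_permuted_val(transposed_inputs, other_inputs[0], [0, 2, 1])` and `propagate_permuted_val(transposed_weights, weights_inputs[0], [2, 0, 1])`.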