[export] Temporarily bypass torch_fn in partitioner #134292

yushangdi · 2024-08-22T23:16:45Z

Summary:
The "torch_fn" metadata is not correct for the decomposed add_ node that comes from batch norm. As a temporary workaround, this PR bypasses "torch_fn" in the partitioner.

For example, for the graph below (test_qat_conv2d_unary graph):

graph():
    %conv_weight : [num_users=1] = get_attr[target=conv.weight]
    %bn_weight : [num_users=1] = get_attr[target=bn.weight]
    %bn_bias : [num_users=1] = get_attr[target=bn.bias]
    %bn_running_mean : [num_users=1] = get_attr[target=bn.running_mean]
    %bn_running_var : [num_users=1] = get_attr[target=bn.running_var]
    %bn_num_batches_tracked : [num_users=1] = get_attr[target=bn.num_batches_tracked]
    %x : [num_users=1] = placeholder[target=x]
    %conv2d : [num_users=1] = call_function[target=torch.ops.aten.conv2d.default](args = (%x, %conv_weight, None, [1, 1], [1, 1]), kwargs = {})
    %add_ : [num_users=0] = call_function[target=torch.ops.aten.add_.Tensor](args = (%bn_num_batches_tracked, 1), kwargs = {})
    %batch_norm : [num_users=1] = call_function[target=torch.ops.aten.batch_norm.default](args = (%conv2d, %bn_weight, %bn_bias, %bn_running_mean, %bn_running_var, True, 0.1, 1e-05, True), kwargs = {})
    %relu : [num_users=1] = call_function[target=torch.ops.aten.relu.default](args = (%batch_norm,), kwargs = {})
    %max_pool2d : [num_users=1] = call_function[target=torch.ops.aten.max_pool2d.default](args = (%relu, [3, 3], [3, 3]), kwargs = {})
    return (max_pool2d,)

The add_ node has 'torch_fn': ('add__1', 'method_descriptor.add_') in its meta.
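To make the double tagging concrete, here is a minimal sketch in plain Python. It is not the partitioner's actual code: `Node`, `group_by_source`, the `module_source` key, and the `use_torch_fn` flag are hypothetical names introduced only for illustration. It assumes a node can carry a module-level source tag and, independently, a `torch_fn` tag shaped like the one quoted above, and that the matcher groups nodes under both.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for an FX node: a name plus a metadata dict."""
    name: str
    meta: dict = field(default_factory=dict)


def group_by_source(nodes, use_torch_fn=True):
    """Group node names by the sources recorded in their metadata.

    Assumption: "module_source" is an illustrative key for the module a node
    was traced from; "torch_fn" is a pair like ('add__1', 'method_descriptor.add_').
    """
    groups = {}
    for node in nodes:
        module_source = node.meta.get("module_source")
        if module_source is not None:
            groups.setdefault(module_source, []).append(node.name)
        torch_fn = node.meta.get("torch_fn")
        if use_torch_fn and torch_fn is not None:
            # Key on the trailing function name of the second element, e.g. "add_".
            fn_name = torch_fn[1].split(".")[-1]
            groups.setdefault(fn_name, []).append(node.name)
    return groups


nodes = [
    Node("conv2d", {"module_source": "Conv2d"}),
    Node("add_", {"module_source": "BatchNorm2d",
                  "torch_fn": ("add__1", "method_descriptor.add_")}),
    Node("batch_norm", {"module_source": "BatchNorm2d"}),
]

# With torch_fn honored, add_ lands in two groups; with it bypassed, only one.
with_torch_fn = group_by_source(nodes, use_torch_fn=True)
without_torch_fn = group_by_source(nodes, use_torch_fn=False)
```

Under these assumptions, bypassing `torch_fn` leaves add_ only in the BatchNorm2d group, which is the behavior this workaround is after.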

If we run the line below in _annotate_qat_conv2d_bn_binary_unary, we'll have a partition without output nodes.

find_sequential_partitions(
    gm, [torch.nn.Conv2d, torch.nn.BatchNorm2d, operator.add, torch.nn.ReLU]
)

partition_list
[
SourcePartition(nodes=[conv_weight, conv2d], source=<class 'torch.nn.modules.conv.Conv2d'>, input_nodes=[x], output_nodes=[conv2d], params=[conv_weight]),

SourcePartition(nodes=[bn_weight, bn_bias, bn_running_mean, bn_running_var, bn_num_batches_tracked, add_, batch_norm], source=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, input_nodes=[conv2d], output_nodes=[batch_norm], params=[bn_num_batches_tracked, bn_running_var, bn_bias, bn_weight, bn_running_mean]),

SourcePartition(nodes=[add_], source='add_', input_nodes=[bn_num_batches_tracked], output_nodes=[], params=[])
]

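We should not have the last partition: add_ already belongs to the BatchNorm2d partition, and the standalone 'add_' partition has no output nodes.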
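The empty output_nodes list follows from add_ having num_users=0. As a rough sketch, assume a partition's output nodes are the members with at least one user outside the partition. The helper `output_nodes` and the `users` mapping below are hypothetical and written in plain Python, not taken from the partitioner:

```python
def output_nodes(partition, users):
    """Return the members of `partition` that have a user outside of it."""
    members = set(partition)
    return [
        name for name in partition
        if any(user not in members for user in users.get(name, []))
    ]


# User edges copied from the graph dump above (only the nodes needed here).
users = {
    "bn_num_batches_tracked": ["add_"],
    "add_": [],                  # num_users=0
    "conv2d": ["batch_norm"],
    "batch_norm": ["relu"],
}

bn_partition = ["bn_num_batches_tracked", "add_", "batch_norm"]
add_partition = ["add_"]

bn_outputs = output_nodes(bn_partition, users)    # batch_norm feeds relu
add_outputs = output_nodes(add_partition, users)  # nothing consumes add_
```

Any caller that expects each partition to expose an output node would have nothing to work with for the standalone 'add_' partition.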

Test Plan:

buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r test_qat_conv2d

Differential Revision: D61569049

pytorch-bot · 2024-08-22T23:16:49Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/134292


✅ You can merge normally! (1 Unrelated Failure)

As of commit c1b4102 with merge base 78d69bf:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

trunk / linux-focal-rocm6.1-py3.8 / test (default, 1, 2, linux.rocm.gpu) (gh) (trunk failure)
inductor/test_torchinductor.py::CpuTests::test_mutable_custom_op_fixed_layout2_cpu

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-08-22T23:17:07Z

This pull request was exported from Phabricator. Differential Revision: D61569049

facebook-github-bot · 2024-08-23T18:08:40Z

This pull request was exported from Phabricator. Differential Revision: D61569049

facebook-github-bot · 2024-08-23T20:15:15Z

This pull request was exported from Phabricator. Differential Revision: D61569049

Summary:
Pull Request resolved: pytorch#134292

"torch_fn" is not correct for the decomposed add node from batch norm. This is a temporary workaround to bypass torch fn.

For example, for the graph below (test_qat_conv2d_unary graph):

```
graph():
    %conv_weight : [num_users=1] = get_attr[target=conv.weight]
    %bn_weight : [num_users=1] = get_attr[target=bn.weight]
    %bn_bias : [num_users=1] = get_attr[target=bn.bias]
    %bn_running_mean : [num_users=1] = get_attr[target=bn.running_mean]
    %bn_running_var : [num_users=1] = get_attr[target=bn.running_var]
    %bn_num_batches_tracked : [num_users=1] = get_attr[target=bn.num_batches_tracked]
    %x : [num_users=1] = placeholder[target=x]
    %conv2d : [num_users=1] = call_function[target=torch.ops.aten.conv2d.default](args = (%x, %conv_weight, None, [1, 1], [1, 1]), kwargs = {})
    %add_ : [num_users=0] = call_function[target=torch.ops.aten.add_.Tensor](args = (%bn_num_batches_tracked, 1), kwargs = {})
    %batch_norm : [num_users=1] = call_function[target=torch.ops.aten.batch_norm.default](args = (%conv2d, %bn_weight, %bn_bias, %bn_running_mean, %bn_running_var, True, 0.1, 1e-05, True), kwargs = {})
    %relu : [num_users=1] = call_function[target=torch.ops.aten.relu.default](args = (%batch_norm,), kwargs = {})
    %max_pool2d : [num_users=1] = call_function[target=torch.ops.aten.max_pool2d.default](args = (%relu, [3, 3], [3, 3]), kwargs = {})
    return (max_pool2d,)
```

the add_ node has `'torch_fn': ('add__1', 'method_descriptor.add_'),` in its meta.

If we run the line below in `_annotate_qat_conv2d_bn_binary_unary`, we'll have a partition without output nodes.

```
find_sequential_partitions(
    gm, [torch.nn.Conv2d, torch.nn.BatchNorm2d, operator.add, torch.nn.ReLU]
)
```

```
partition_list
[
SourcePartition(nodes=[conv_weight, conv2d], source=<class 'torch.nn.modules.conv.Conv2d'>, input_nodes=[x], output_nodes=[conv2d], params=[conv_weight]),

SourcePartition(nodes=[bn_weight, bn_bias, bn_running_mean, bn_running_var, bn_num_batches_tracked, add_, batch_norm], source=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, input_nodes=[conv2d], output_nodes=[batch_norm], params=[bn_num_batches_tracked, bn_running_var, bn_bias, bn_weight, bn_running_mean]),

SourcePartition(nodes=[add_], source='add_', input_nodes=[bn_num_batches_tracked], output_nodes=[], params=[])
]
```

We should not have the last partition.

Test Plan:

```
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test/quantization:test_quantization -- -r test_qat_conv2d
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx -- -r TestSourceMatcher
```

Reviewed By: angelayi

Differential Revision: D61569049

facebook-github-bot · 2024-08-23T20:23:44Z

This pull request was exported from Phabricator. Differential Revision: D61569049

facebook-github-bot · 2024-08-24T05:48:29Z

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

pytorchmergebot · 2024-08-24T05:50:05Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

pytorch-bot bot added the release notes: fx release notes category label Aug 22, 2024

facebook-github-bot added the fb-exported label Aug 22, 2024

yushangdi requested a review from angelayi August 22, 2024 23:19

angelayi approved these changes Aug 23, 2024

View reviewed changes

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Aug 23, 2024

yushangdi force-pushed the export-D61569049 branch from 5b2b612 to e28380d Compare August 23, 2024 18:08

yushangdi force-pushed the export-D61569049 branch from e28380d to b703779 Compare August 23, 2024 20:15

yushangdi force-pushed the export-D61569049 branch from b703779 to c1b4102 Compare August 23, 2024 20:23

pytorchmergebot added the merging label Aug 24, 2024

pytorchmergebot added the Merged label Aug 24, 2024

pytorchmergebot closed this in 0694918 Aug 24, 2024

pytorchmergebot removed the merging label Aug 24, 2024


4 participants
