[dynamo] Relax guard introduced when tracing `__call__` on user defined object #152395
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152395
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 unrelated failure.) As of commit 06a2da1 with merge base 6e5e9dc, one job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
```python
if method is torch.nn.Module.__init__:
    method = unpatched_nn_module_init
if source:
    install_guard(source.make_guard(GuardBuilder.FUNCTION_MATCH))
```
I am concerned that this will add a lot of new guards (even though this might be the right thing to do). Maybe skip this one in this PR, and then follow up with another PR with some more benchmarking?
Lemme localize the fix to that one site instead; that should cover us in practice without introducing too many new guards.
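For context, here is a minimal repro sketch of the recompilation pattern this fix targets; the class name and tensor shapes are illustrative, not taken from the PR. Before this change, Dynamo guarded on the identity of the callable object itself, so every fresh instance invalidated the cache; guarding on the class's `__call__` function instead lets instances share a compiled graph.

```python
# Hypothetical repro, assuming PyTorch 2.x; AttnProcessor stands in for the
# per-block attention processor objects mentioned in the PR description.
import torch

torch._logging.set_logs(recompiles=True)  # print guard-failure reasons

class AttnProcessor:
    def __call__(self, x):
        return torch.nn.functional.softmax(x, dim=-1)

@torch.compile
def run(processor, x):
    return processor(x)

x = torch.randn(4, 4)
run(AttnProcessor(), x)  # compiles once
# Before this patch: the id()-based guard on `processor` fails here, so the
# frame recompiles. After it: the guard matches AttnProcessor.__call__, and
# the cached graph is reused for the new instance.
run(AttnProcessor(), x)
```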
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.

Merge failed. Reason: 1 mandatory check(s) failed. Dig deeper by viewing the failures on hud.

@pytorchbot merge -f "unrelated failure"

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):
- -> [dynamo] Relax guard introduced when tracing `__call__` on user defined object #152395
- … `torch.Tensor` and a non-torch op from einops #152369

This relaxes the guard introduced in #100444, which aggressively guards on the object id even though Dynamo is just tracing its `__call__` method.

This allows users to bypass the high compilation time issue in #150706 by compiling transformer blocks only. Without this patch, we'd get lots of unnecessary recompilation, as the blocks have different attention processor instances.

Compiling only the blocks significantly speeds up the compilation process (from ~310s to ~32s), and even improves e2e performance for some reason (7.83s to 7.67s).
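A minimal sketch of the block-level compilation workaround described above; `Block` and `Model` are illustrative stand-ins (real models expose their blocks under model-specific attribute names), and `nn.Module.compile()` assumes PyTorch >= 2.2.

```python
# Sketch: compile repeated submodules in place instead of the whole model.
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return torch.relu(self.proj(x))

class Model(nn.Module):
    def __init__(self, dim=16, n_blocks=4):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(n_blocks))

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

model = Model()
# Compile each block rather than torch.compile(model): the top-level Python
# stays eager, and with the relaxed guard the identical blocks can share
# compiled code even though each holds its own helper-object instances.
for block in model.blocks:
    block.compile()

out = model(torch.randn(2, 16))
```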
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames