Skip to content

Conversation

StrongerXi
Copy link
Contributor

@StrongerXi StrongerXi commented Apr 29, 2025

Stack from ghstack (oldest at bottom):

This relaxes the guard introduced in #100444 (which aggressively guard
on the object id, despite Dynamo is just tracing its __call__ method.

This allows users to bypass the high compilation time issue in #150706
by compiling transformer blocks only. Without this patch, we'd get lots
of unnecessary recompilation, as the block has difference attention
processor instances.

Compiling blocks only significantly speeds up compilation process
(from ~310s to ~32s), and even speeds up e2e performance for some reason
(7.83s to 7.67s).

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames

…ed object

This relaxes the guard introduced in #100444 (which aggressively guard
on the object id, despite Dynamo is just tracing its `__call__` method.

This allows users to bypass the high compilation time issue in #150706
by compiling transformer blocks only. Without this patch, we'd get lots
of unnecessary recompilation, as the block has difference attention
processor instances.

Compiling blocks only _significantly_ speeds up compilation process
(from ~310s to ~32s), and even speeds up e2e performance for some reason
(7.83s to 7.67s).

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Apr 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152395

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 06a2da1 with merge base 6e5e9dc (image):

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

StrongerXi added a commit that referenced this pull request Apr 29, 2025
…ed object

This relaxes the guard introduced in #100444 (which aggressively guard
on the object id, despite Dynamo is just tracing its `__call__` method.

This allows users to bypass the high compilation time issue in #150706
by compiling transformer blocks only. Without this patch, we'd get lots
of unnecessary recompilation, as the block has difference attention
processor instances.

Compiling blocks only _significantly_ speeds up compilation process
(from ~310s to ~32s), and even speeds up e2e performance for some reason
(7.83s to 7.67s).

ghstack-source-id: d3208ca
Pull Request resolved: #152395
@StrongerXi StrongerXi requested a review from anijain2305 April 29, 2025 00:53
@StrongerXi StrongerXi added the topic: not user facing topic category label Apr 29, 2025
if method is torch.nn.Module.__init__:
method = unpatched_nn_module_init
if source:
install_guard(source.make_guard(GuardBuilder.FUNCTION_MATCH))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am concerned that this will add a lot of new guards (even though this might be the right thing to do). Maybe skip this one in this PR, and then follow up with another PR with some more benchmarking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lemme localize the fix to that one site instead, that should cover us in practice without introducing too many new guards.

… user defined object"

This relaxes the guard introduced in #100444 (which aggressively guard
on the object id, despite Dynamo is just tracing its `__call__` method.

This allows users to bypass the high compilation time issue in #150706
by compiling transformer blocks only. Without this patch, we'd get lots
of unnecessary recompilation, as the block has difference attention
processor instances.

Compiling blocks only _significantly_ speeds up compilation process
(from ~310s to ~32s), and even speeds up e2e performance for some reason
(7.83s to 7.67s).

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

[ghstack-poisoned]
StrongerXi added a commit that referenced this pull request Apr 29, 2025
…ed object

This relaxes the guard introduced in #100444 (which aggressively guard
on the object id, despite Dynamo is just tracing its `__call__` method.

This allows users to bypass the high compilation time issue in #150706
by compiling transformer blocks only. Without this patch, we'd get lots
of unnecessary recompilation, as the block has difference attention
processor instances.

Compiling blocks only _significantly_ speeds up compilation process
(from ~310s to ~32s), and even speeds up e2e performance for some reason
(7.83s to 7.67s).

ghstack-source-id: 1e98603
Pull Request resolved: #152395
@StrongerXi StrongerXi requested a review from anijain2305 April 29, 2025 17:16
@StrongerXi
Copy link
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 29, 2025
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@StrongerXi
Copy link
Contributor Author

@pytorchbot merge -f "unrelated failure"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@github-actions github-actions bot deleted the gh/StrongerXi/107/head branch June 14, 2025 02:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants