[DTensor] Computed DTensorSpec hash lazily #114322
Conversation
[ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/114322

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure as of commit 0f5e76b with merge base 140c54e.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: 11dcb0e9dd8505fb88c37e37bfbab6b46bb99e6c Pull Request resolved: #114322
sounds good, thanks for the quick fix!
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed; the first few are: periodic / macos-12-py3-x86-64 / build. Details for Dev Infra team: raised by workflow job.
Failure unrelated: periodic / macos-12-py3-x86-64 / build (gh)
@pytorchbot merge -i

Merge started. Your change will be merged while ignoring the following 1 check: periodic / macos-12-py3-x86-64 / build. Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
This is weird, as I still see the test failing in trunk on the multigpu job https://hud.pytorch.org/pytorch/pytorch/commit/e7326ec295559c16795088e79a5631e784bb4d61. Any thoughts?

Looking into it!
Stack from ghstack (oldest at bottom):

- [DTensor] Computed DTensorSpec hash lazily #114322
- … isinstance call in is_shard #114140
- … DTensorSpec #113915
- … distribute_tensor #113930
- … grad_placements was tuple #113925
- … from_local #114134
- … redistribute #113924
- … _Partial, Replicate frozen dataclasses #113919

This is a forward fix for #113781.
We lazily compute the hash so that we do not try to compute the hash on `SymInt`s (for the stride) during Dynamo tracing.

Tested via:
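For readers unfamiliar with the pattern: a minimal sketch of what "computing the hash lazily" on a frozen dataclass can look like. This is not the actual `DTensorSpec` code; the `Spec` class, its fields, and the `_hash` cache slot here are hypothetical stand-ins, showing only the general technique of deferring `hash()` of the fields until `__hash__` is first called.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

# Hypothetical sketch (not the real DTensorSpec): a frozen dataclass whose
# hash is computed on first use and cached, so merely constructing the
# object never hashes its fields (e.g. strides that may be SymInts under
# tracing).
@dataclass(frozen=True)
class Spec:
    placements: Tuple[str, ...]
    shape: Tuple[int, ...]
    stride: Tuple[int, ...]
    # Cache slot, excluded from equality and repr so two specs with the
    # same fields compare equal whether or not they have been hashed yet.
    _hash: Optional[int] = field(default=None, compare=False, repr=False)

    def __hash__(self) -> int:
        # Lazily fill the cache on first call. Frozen dataclasses forbid
        # normal attribute assignment, so bypass it via object.__setattr__.
        if self._hash is None:
            object.__setattr__(
                self, "_hash", hash((self.placements, self.shape, self.stride))
            )
        return self._hash

s = Spec(("Shard(0)",), (4, 4), (4, 1))
assert s._hash is None   # nothing was hashed at construction time
h = hash(s)
assert s._hash == h      # cached after the first hash() call
```

Note that because the class body defines `__hash__` explicitly, `@dataclass(frozen=True)` leaves it in place rather than substituting its generated implementation.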