[PT-D] Use process group of the partial tensor so sub pg comm will be enabled during reshard #79357
Dr. CI: ✅ No failures (0 pending) as of commit 0e7e424.
@fduwjj has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@pytorchbot merge
@pytorchbot successfully started a merge job.
Hey @fduwjj.
… enabled during reshard (#79357)

Summary:
Pull Request resolved: #79357
Approved by: https://github.com/wanchaol

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/f4edbaa62fcc1dc60066bb95926f8a578d8c351e

Reviewed By: wanchaol

Differential Revision: D37093468

Pulled By: fduwjj

fbshipit-source-id: 1eff1b1c727b406308029c7ea22b2903d6d61b23
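While debugging tensor parallelism (TP) enablement for a Transformer model, we found that a partial tensor does not use its own process group (pg) during resharding; it uses the default pg instead. This fix switches resharding to use the partial tensor's own pg, so that communication within a sub pg works during reshard.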
Differential Revision: D37093468
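
To illustrate why the choice of process group matters, here is a toy, pure-Python simulation. It is not the code from this PR and does not use `torch.distributed`; the helper `reduce_scatter_sum`, the 4-rank world, and the 2-rank sub-group are all assumptions made up for the example. It mimics a sum-reduce followed by a scatter, which is roughly what resharding a partial tensor has to do, and shows that running it over the default (world) group mixes in ranks that do not belong to the tensor's own group.

```python
from typing import Dict, List


def reduce_scatter_sum(
    local: Dict[int, List[float]], group: List[int]
) -> Dict[int, List[float]]:
    """Toy reduce-scatter (hypothetical helper, not a PyTorch API).

    Sums the local lists of every rank in `group` element-wise, then
    hands each rank in `group` one equal-sized chunk of the sum.
    """
    size = len(group)
    length = len(local[group[0]])
    assert length % size == 0, "length must divide evenly across the group"
    total = [sum(local[r][i] for r in group) for i in range(length)]
    chunk = length // size
    return {r: total[k * chunk:(k + 1) * chunk] for k, r in enumerate(group)}


# Assumed setup: a world of 4 ranks, with ranks 0 and 1 forming the
# sub process group that owns the partial tensor.
world = [0, 1, 2, 3]
sub_pg = [0, 1]

# Each rank holds a partial result of length 4 (rank r holds r + 1).
local = {r: [float(r + 1)] * 4 for r in world}

# Using the default group: ranks 2 and 3 contribute to the sum and
# each rank receives only a quarter of the result.
wrong = reduce_scatter_sum(local, world)

# Using the partial tensor's own group: only ranks 0 and 1 take part.
right = reduce_scatter_sum(local, sub_pg)

print("default pg:", wrong)
print("sub pg:    ", right)
```

In real code, collectives such as `torch.distributed.reduce_scatter` take a `group` argument; passing the tensor's own group there, rather than relying on the default, is the kind of change the description above refers to.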