Question about NCCL backend in torch.distributed documentation #65525

Closed
jasperzhong opened this issue Sep 23, 2021 · 1 comment
Labels
oncall: distributed, triaged

jasperzhong commented Sep 23, 2021

❓ Questions and Help

https://pytorch.org/docs/stable/distributed.html

The documentation says that send and recv are not supported for the NCCL backend on GPUs. However, the implementation code for both already exists, and it works. Is this part of the documentation outdated?

[image]

```cpp
c10::intrusive_ptr<ProcessGroup::Work> send(
    std::vector<at::Tensor>& tensors,
    int dstRank,
    int tag) override;

c10::intrusive_ptr<ProcessGroup::Work> recv(
    std::vector<at::Tensor>& tensors,
    int srcRank,
    int tag) override;
```
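
For anyone who wants to check this from Python, here is a minimal sketch of a two-rank point-to-point exchange with `torch.distributed.send` and `torch.distributed.recv`. It is not taken from the issue. The helper names `pick_backend`, `free_port`, `worker`, and `launch` are hypothetical, and the sketch assumes a PyTorch build whose NCCL supports point-to-point operations plus at least two visible GPUs. When those assumptions do not hold it falls back to Gloo on CPU, so the same script still runs.

```python
import os
import socket

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def pick_backend() -> str:
    """Hypothetical helper: use NCCL only if two GPUs and NCCL are present."""
    if (
        torch.cuda.is_available()
        and torch.cuda.device_count() >= 2
        and dist.is_nccl_available()
    ):
        return "nccl"
    return "gloo"  # CPU fallback so the sketch runs without GPUs


def free_port() -> int:
    """Hypothetical helper: ask the OS for an unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def worker(rank: int, world_size: int, backend: str, port: int) -> None:
    # Rendezvous settings for the default env:// init method.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = str(port)
    dist.init_process_group(backend, rank=rank, world_size=world_size)

    # NCCL needs CUDA tensors, one device per rank; Gloo works on CPU.
    if backend == "nccl":
        device = torch.device(f"cuda:{rank}")
        torch.cuda.set_device(device)
    else:
        device = torch.device("cpu")

    if rank == 0:
        payload = torch.arange(4, dtype=torch.float32, device=device)
        dist.send(payload, dst=1)  # blocking send to rank 1
    else:
        buf = torch.empty(4, dtype=torch.float32, device=device)
        dist.recv(buf, src=0)  # blocking receive from rank 0
        assert torch.equal(buf.cpu(), torch.arange(4, dtype=torch.float32))

    dist.destroy_process_group()


def launch() -> None:
    """Spawn two ranks; call this under an `if __name__ == "__main__":` guard."""
    mp.spawn(worker, args=(2, pick_backend(), free_port()), nprocs=2)
```

Calling `launch()` from a script's main guard starts both ranks. If rank 1 finishes without an assertion error on a machine where `pick_backend()` returns `"nccl"`, that suggests send and recv do work over NCCL on that build.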

cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @SciPioneer @H-Huang @gcramer23

facebook-github-bot added the oncall: distributed label Sep 23, 2021
wayi1 added the triaged label Sep 24, 2021
wayi1 pushed a commit that referenced this issue Sep 24, 2021
#Closes: #65525

Differential Revision: [D31163535](https://our.internmc.facebook.com/intern/diff/D31163535/)

ghstack-source-id: 138918961
Pull Request resolved: #65601

wayi1 commented Sep 24, 2021

Good observation! I believe the doc is outdated, and I have created a PR to update the doc. Thanks!
