Why does setting NCCL_NET_GDR_READ to 0 perform better than setting NCCL_NET_GDR_READ to 1 on a PCI-E platform with multiple nodes? #1181

Open
shanleo2024 opened this issue Feb 19, 2024 · 0 comments

shanleo2024 commented Feb 19, 2024

Hi @sjeaugey, I have a question, as stated in the title.

I tested NCCL_NET_GDR_READ set to 0 and set to 1 on my multi-node PCI-E platform, and rccl_test performs better with NCCL_NET_GDR_READ set to 0.
The official documentation also indicates this: https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#nccl-net-gdr-read
I wonder why. As far as I know, rccl_test should perform better with GDR than without it; that is the advantage of GDR.
On a PCI-E platform, the send side does not enable GDR when NCCL_NET_GDR_READ is set to 0, while the recv side enables GDR when NCCL_NET_GDR_READ is set to 1. Is this how it is implemented?
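
A minimal sketch of how such a comparison could be scripted, so that both settings run with identical arguments. The binary path `./build/all_reduce_perf` and the size arguments are assumptions for illustration, not the exact command used here. The sketch launches a single local process; a multi-node run would also need the job launcher to forward the variable to every rank.

```python
import os
import subprocess

# Hypothetical path to the test binary; adjust to the local build.
TEST_BINARY = "./build/all_reduce_perf"

# Assumed message-size sweep: 8 bytes to 128 MB, doubling each step, 1 GPU.
DEFAULT_ARGS = ("-b", "8", "-e", "128M", "-f", "2", "-g", "1")


def build_env(gdr_read, base_env=None):
    """Return a copy of the environment with NCCL_NET_GDR_READ set."""
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_NET_GDR_READ"] = str(gdr_read)
    return env


def run_once(gdr_read, args=DEFAULT_ARGS):
    """Run the test once with the given setting and return its stdout."""
    result = subprocess.run(
        [TEST_BINARY, *args],
        env=build_env(gdr_read),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Same binary and arguments for both runs; only the variable changes.
    for value in (0, 1):
        print(f"=== NCCL_NET_GDR_READ={value} ===")
        print(run_once(value))
```

Changing only the one variable between runs makes it easier to attribute any bandwidth difference to the GDR read setting.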

I look forward to your response to help me resolve this.
Thank you very much.
