Rework the P2P/IPC/SHM container fix #248
Fixes: #155
Now we use /proc/sys/kernel/random/boot_id instead of the hostname and the uts,mnt namespace info.
A new env var, NCCL_HOSTID, can be used to override this string.
@alsrgv Here is the latest P2P/IPC/SHM fix that we are intending to add to the next NCCL release.
@AddyLaddy, @sjeaugey, I'm a bit confused by this change. Before, /uts and /mnt were used to differentiate between containers, which helped to properly disable P2P/SHM/IPC as necessary:
How will P2P/IPC/SHM be auto-disabled between unconnected containers (e.g. just two
Fixes: #155
Now we use /proc/sys/kernel/random/boot_id instead of the hostname and the uts,mnt namespace info.
A new env var, NCCL_HOSTID, can be used to override this string.
Also checks the MAJOR:MINOR of the /dev/shm device to verify that it can be used between containers.
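As a rough illustration of the host-identity idea described above, here is a minimal Python sketch. It is not NCCL's implementation (NCCL is written in C++); the function names `get_host_id` and `host_hash`, the choice of SHA-256, and treating an empty NCCL_HOSTID as unset are all assumptions made for the example. Only the boot_id path and the NCCL_HOSTID variable name come from the PR description.

```python
import hashlib
import os

BOOT_ID_PATH = "/proc/sys/kernel/random/boot_id"


def get_host_id(env=None, boot_id_path=BOOT_ID_PATH):
    """Return a string identifying the machine a rank runs on.

    NCCL_HOSTID, when set, takes precedence over the kernel boot_id.
    (Hypothetical helper; an empty value is treated as unset here.)
    """
    env = os.environ if env is None else env
    override = env.get("NCCL_HOSTID")
    if override:
        return override
    # boot_id is generated once per kernel boot, so containers that share
    # a kernel read the same value regardless of hostname or namespaces.
    with open(boot_id_path) as f:
        return f.read().strip()


def host_hash(host_id):
    """Reduce the host string to a 64-bit integer for cheap comparison.

    The hash function is an arbitrary choice for this sketch.
    """
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")
```

Two ranks would then be considered to be on the same node when their `host_hash(get_host_id())` values match, whatever hostname each container reports.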
Sorry, I missed a component of that fix, which should also check the MAJOR:MINOR of the /dev/shm device before attempting to use it in the SHM transport. I'll try to update the PR. It would be good to know if this is sufficient for your use case.
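The /dev/shm check mentioned here can be sketched as follows. This is an assumption-laden illustration in Python, not the code from the PR: `shm_device` and `can_share_shm` are hypothetical names, and the sketch assumes the MAJOR:MINOR pair is taken from the device number (`st_dev`) that `stat` reports for the mount.

```python
import os


def shm_device(path="/dev/shm"):
    """Return the (major, minor) device numbers of the filesystem at path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


def can_share_shm(dev_a, dev_b):
    """Two ranks can exchange shared-memory segments only if both see the
    same /dev/shm mount, i.e. the same (major, minor) pair."""
    return dev_a == dev_b
```

Each rank would compute its own pair and exchange it with its peers; containers started with a private IPC namespace typically get their own tmpfs at /dev/shm and therefore report a different pair.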
What about CUDA IPC? It does not work across containers either.
The motivation for this patch is to enable P2P/IPC/SHM between containers, as I am told that it is typical for HPC apps to launch a container per rank. @3XX0 is our container expert and perhaps he can comment.
I think the rationale for IPCs is that if we are sure we are on the same node, then we can work on NVML devices and see if they can use P2P.
Yes. Next check is the major:minor of […]. Last check is […]. This should handle things like:
# No P2P, no SHM
nvidia-docker run -e NVIDIA_VISIBLE_DEVICES=none horovod
nvidia-docker run -e NVIDIA_VISIBLE_DEVICES=none horovod
# No P2P, SHM possible
nvidia-docker run --name=A -e NVIDIA_VISIBLE_DEVICES=0 horovod
nvidia-docker run --name=B -e NVIDIA_VISIBLE_DEVICES=1 --ipc=container:A horovod
# P2P
nvidia-docker run -e NVIDIA_VISIBLE_DEVICES=0,1 --ipc=host horovod
nvidia-docker run -e NVIDIA_VISIBLE_DEVICES=0,1 --ipc=host horovod
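The three scenarios above can be modelled with a small decision function. This Python sketch is only an illustration under stated assumptions: the exact checks NCCL performs are not fully spelled out in this thread, and `PeerInfo`, `usable_transports`, the field names, and the rule "P2P requires each rank to see the other rank's GPU" are hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class PeerInfo:
    host_id: str                   # e.g. boot_id or NCCL_HOSTID
    shm_dev: Tuple[int, int]       # (major, minor) of /dev/shm
    gpu: Optional[str]             # GPU this rank uses, if any
    visible_gpus: FrozenSet[str]   # GPUs exposed inside the container


def usable_transports(a, b):
    """Return the intra-node transports two ranks could use (sketch)."""
    if a.host_id != b.host_id:
        return set()  # different machines: neither SHM nor P2P
    transports = set()
    if a.shm_dev == b.shm_dev:
        transports.add("SHM")
    # Assumption: P2P needs each side to see the other side's GPU.
    if a.gpu in b.visible_gpus and b.gpu in a.visible_gpus:
        transports.add("P2P")
    return transports


# Scenario 1: isolated containers, no GPUs exposed, private /dev/shm.
iso_a = PeerInfo("boot-1", (0, 50), None, frozenset())
iso_b = PeerInfo("boot-1", (0, 51), None, frozenset())

# Scenario 2: shared IPC namespace, but each container sees one GPU only.
ipc_a = PeerInfo("boot-1", (0, 60), "0", frozenset({"0"}))
ipc_b = PeerInfo("boot-1", (0, 60), "1", frozenset({"1"}))

# Scenario 3: host IPC namespace and both GPUs visible on both sides.
host_a = PeerInfo("boot-1", (0, 22), "0", frozenset({"0", "1"}))
host_b = PeerInfo("boot-1", (0, 22), "1", frozenset({"0", "1"}))
```

The device numbers and GPU identifiers in the scenarios are made-up values chosen only to mirror the three `nvidia-docker` cases.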
Do we check
We should check it here: https://github.com/NVIDIA/nccl/blob/master/src/transport/p2p.cc#L98
Probably a driver issue. IIRC, at one point there was an issue where you also needed to share the PID namespace for IPCs to work, but that's definitely not the case anymore.
I've tested this PR and it looks good. Thanks, this makes it much more convenient to use NCCL with various container environments!
Great, thanks for the feedback, Alex!
Add LL128 Protocol. Rewrite the topology detection and tree/ring creation (#179). Improve tree performance by sending/receiving from different GPUs. Add model-based tuning to switch between the different algorithms and protocols. Rework P2P/SHM detection in containers (#155, #248). Detect duplicated devices and return an error (#231).
Add LL128 Protocol. Rewrite the topology detection and tree/ring creation (#179). Improve tree performance by sending/receiving from different GPUs. Add model-based tuning to switch between the different algorithms and protocols. Rework P2P/SHM detection in containers (#155, #248). Detect duplicated devices and return an error (#231). Add tuning for GCP
Merged in 2.5.6. Closing.