Enhance new_group doc to mention using NCCL concurrently.
Using NCCL communicators concurrently is not safe, and this is documented in the
NCCL docs.

However, this is not documented in PyTorch, so we should add documentation for
ProcessGroupNCCL to make users aware of this limitation.

Differential Revision: [D25351778](https://our.internmc.facebook.com/intern/diff/D25351778/)

ghstack-source-id: 117932333
Pull Request resolved: #48872
pritamdamania committed Dec 5, 2020
1 parent 0e4f9a7 commit c0ea183
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions torch/distributed/distributed_c10d.py
@@ -2213,6 +2213,12 @@ def new_group(ranks=None, timeout=default_pg_timeout, backend=None):
if they are not going to be members of the group. Additionally, groups
should be created in the same order in all processes.

.. warning::
    Using multiple process groups with the ``NCCL`` backend concurrently
    is not safe and the user should perform explicit synchronization in
    their application to ensure only one process group is used at a time.
    See `Using multiple NCCL communicators concurrently <https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/communicators.html#using-multiple-nccl-communicators-concurrently>`_ for more details.

Arguments:
ranks (list[int]): List of ranks of group members. If ``None``, will be
set to all ranks. Default is ``None``.
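For context on the new warning, here is a minimal sketch (not part of this commit) of the serialization pattern it asks for. It assumes a 4-process job where `MASTER_ADDR`/`MASTER_PORT` are already set in the environment and each rank owns one GPU; the group sizes and ranks are illustrative.

```python
# Hypothetical example, not from the patch: two NCCL process groups are
# created, but their collectives are serialized so that only one group's
# work is in flight on the device at any time.
import torch
import torch.distributed as dist


def run(rank, world_size=4):
    # Assumes MASTER_ADDR / MASTER_PORT are set in the environment.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    # All ranks must call new_group(), even ranks that are not members.
    group_a = dist.new_group(ranks=[0, 1])
    group_b = dist.new_group(ranks=[2, 3])

    t = torch.ones(1, device="cuda")

    if rank in (0, 1):
        dist.all_reduce(t, group=group_a)
    # Explicit synchronization: ensure group_a's collective has finished
    # executing on the device before any collective on another group starts.
    torch.cuda.synchronize()
    dist.barrier()  # collective on the default group, also used serially

    if rank in (2, 3):
        dist.all_reduce(t, group=group_b)
    torch.cuda.synchronize()

    dist.destroy_process_group()
```

The point matches the NCCL guidance linked in the warning: collectives from different process groups are never enqueued concurrently, and device-side completion is enforced before switching groups.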
