fix batch_isend_irecv example incorrect usage (#110408)

The docstring example sent an int64 tensor (`torch.arange(2)`) into a float32 receive buffer (`torch.randn(2)`). Mismatched dtypes silently lead to wrong outputs in NCCL:

```
1:recv_tensor=tensor([0., 0.], device='cuda:1')
0:recv_tensor=tensor([2.8026e-45, 0.0000e+00], device='cuda:0')
```

Pull Request resolved: #110408
Approved by: https://github.com/awgu, https://github.com/Neilblaze
pytorch · Oct 4, 2023 · 0949d97
1 parent 8672d64
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/torch/distributed/distributed_c10d.py b/torch/distributed/distributed_c10d.py
@@ -1779,8 +1779,8 @@ def batch_isend_irecv(p2p_op_list):
 
     Examples:
         >>> # xdoctest: +SKIP("no rank")
-        >>> send_tensor = torch.arange(2) + 2 * rank
-        >>> recv_tensor = torch.randn(2)
+        >>> send_tensor = torch.arange(2, dtype=torch.float32) + 2 * rank
+        >>> recv_tensor = torch.randn(2, dtype=torch.float32)
         >>> send_op = dist.P2POp(dist.isend, send_tensor, (rank + 1)%world_size)
         >>> recv_op = dist.P2POp(dist.irecv, recv_tensor, (rank - 1 + world_size)%world_size)
         >>> reqs = batch_isend_irecv([send_op, recv_op])
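
The wrong values quoted in the commit message are consistent with the receiver reading the sender's int64 bytes as float32. The sketch below reproduces them with the standard library only, for a two-rank run. It assumes little-endian memory and that the transfer copies only as many raw bytes as the float32 receive buffer holds; that is an assumption about the mechanism, not something the commit states. The helper `reinterpret_int64_as_float32` is hypothetical and exists only for illustration.

```python
import struct


def reinterpret_int64_as_float32(values, n_floats):
    """Pack `values` as little-endian int64, then read the first
    `n_floats` float32 values from the same bytes (hypothetical helper)."""
    raw = struct.pack(f"<{len(values)}q", *values)
    return list(struct.unpack(f"<{n_floats}f", raw[: 4 * n_floats]))


# With world_size = 2, rank r sends torch.arange(2) + 2 * r as int64.
seen_by_rank0 = reinterpret_int64_as_float32([2, 3], 2)  # sent by rank 1
seen_by_rank1 = reinterpret_int64_as_float32([0, 1], 2)  # sent by rank 0

print("0:recv_tensor =", seen_by_rank0)
print("1:recv_tensor =", seen_by_rank1)
```

Under these assumptions, rank 0 sees a subnormal float (bit pattern `0x00000002`, about 2.8026e-45) followed by zero, and rank 1 sees two zeros, which matches the reported output. Giving both tensors `dtype=torch.float32`, as the diff does, removes the mismatch.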