[Gradient Compression] Explicitly restrict the scope of torch.cuda.synchronize to the current device (#49711)

Summary:
Pull Request resolved: #49711

`torch.cuda.synchronize` uses the current device by default. Explicitly specify this device for better readability.
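
As a minimal sketch (not part of this commit): the two forms are equivalent when `device` refers to the current CUDA device; passing it explicitly just makes the target device visible at the call site. The `device` variable below is an assumption for illustration, standing in for whatever device the hook operates on.

import torch

if torch.cuda.is_available():
    # Assumption for illustration: `device` is the device the hook works on,
    # e.g. the current device index.
    device = torch.cuda.current_device()

    # Implicit form: synchronizes the *current* device.
    torch.cuda.synchronize()

    # Explicit form: same behavior when `device` is the current device,
    # but the intended device is stated in the code.
    torch.cuda.synchronize(device)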

Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202
ghstack-source-id: 119017654

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_powerSGD_ddp_comm_hook_nccl

buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_DistributedDataParallel_powerSGD_ddp_comm_hook

Reviewed By: rohan-varma

Differential Revision: D25672267

fbshipit-source-id: 62a2266727a2ea76175f3c438daf20951091c771
Yi Wang authored and facebook-github-bot committed Dec 23, 2020
1 parent ee27104 commit 88c33ff
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions torch/distributed/algorithms/ddp_comm_hooks/powerSGD_hook.py
@@ -304,7 +304,7 @@ def decompress(fut):
         for p, q, tensor in zip(ps, qs, high_rank_tensors):
             torch.matmul(p, q.t(), out=tensor)
         if torch.cuda.is_available():
-            torch.cuda.synchronize()
+            torch.cuda.synchronize(device)
 
         if state.use_error_feedback:
             # Memorize the local errors.
@@ -494,7 +494,7 @@ def decompress(fut):
             # Memorize the local errors.
             state.error_dict[bucket_index] = input_tensor_cp - input_tensor
         if torch.cuda.is_available():
-            torch.cuda.synchronize()
+            torch.cuda.synchronize(device)
         if not state.warm_start:
             state.p_memory_dict.clear()
             state.q_memory_dict.clear()
