Closed
Labels
feature: A request for a proper, new feature.
oncall: distributed: Add this issue/PR to distributed oncall triage queue
todo: Not as important as medium or high priority tasks, but we will work on these.
triaged: This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Description
distributed_c10d.py provides APIs to reduce a single tensor per process, or multiple tensors per process where each tensor must reside on a different device. It would be useful (e.g., for model averaging) to also support reducing a list of tensors per process where all of them are on the same device. The implementation should apply bucketing under the hood, combining several tensors into one reduction call, to get better performance.