This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

support using RoundRobin ProcessGroup in Distributed training #1213

Conversation

chenyangyu1988
Contributor

Summary:
Support using a RoundRobin ProcessGroup in distributed training. The RoundRobin ProcessGroup uses multiple streams for gradient synchronization.

It typically gives a 15%+ speedup when gradient accumulation is not used.

Reviewed By: hudeven

Differential Revision: D19138726

fbshipit-source-id: bc52df1ebeb9bd69a5239507dab866533a56f6b5
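
For context, a minimal sketch of how a round-robin process group can be wired into `DistributedDataParallel`. This is not the PyText code from this PR; it assumes the private `_round_robin_process_groups` helper in `torch.distributed.distributed_c10d` that shipped in PyTorch releases of this era, and the function name `build_round_robin_pg` and group count are illustrative only.

```python
# Hypothetical sketch: multiple process groups dispatched round-robin for gradient sync.
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def build_round_robin_pg(num_groups=2):
    # Create several NCCL process groups over all ranks; the round-robin wrapper
    # dispatches successive collectives (e.g. gradient all-reduces) across them,
    # each backed by its own stream.
    world_ranks = list(range(dist.get_world_size()))
    groups = [dist.new_group(ranks=world_ranks, backend="nccl") for _ in range(num_groups)]
    # Private helper; available around PyTorch 1.4, removed in later releases.
    return dist.distributed_c10d._round_robin_process_groups(groups)

# Usage, after dist.init_process_group(backend="nccl", ...) on every rank:
# pg = build_round_robin_pg(num_groups=2)
# model = DDP(model.cuda(), device_ids=[local_rank], process_group=pg)
```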
@facebook-github-bot added the CLA Signed and fb-exported labels on Dec 20, 2019
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D19138726

@facebook-github-bot
Contributor

This pull request has been merged in 151be72.
