Gradient collection in stage 1 #7197

Answered by tjruwase
ghadialhajj asked this question in Q&A
@ghadialhajj, yes, reduce-scatter is the better option for gradient reduction.

We implemented reduce-scatter manually because torch did not support the collective at that time. We have not had the bandwidth to upgrade the code since then:

if not self.reduce_scatter:
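
To illustrate why reduce-scatter suits gradient reduction when each rank owns only one partition, here is a minimal pure-Python simulation of the two collectives. It is a sketch of the semantics only, not DeepSpeed's code: the helper names `all_reduce` and `reduce_scatter` and the toy gradient values are made up for illustration, and no torch calls are involved. It assumes the gradient length divides evenly by the number of ranks.

```python
from typing import List

Grad = List[float]


def all_reduce(grads_per_rank: List[Grad]) -> List[Grad]:
    """Simulated all-reduce: every rank receives the full element-wise sum."""
    total = [sum(values) for values in zip(*grads_per_rank)]
    return [list(total) for _ in grads_per_rank]


def reduce_scatter(grads_per_rank: List[Grad]) -> List[Grad]:
    """Simulated reduce-scatter: rank r receives only shard r of the sum."""
    world_size = len(grads_per_rank)
    numel = len(grads_per_rank[0])
    # Assumption for this sketch: the gradient splits evenly across ranks.
    assert numel % world_size == 0
    shard = numel // world_size
    result = []
    for rank in range(world_size):
        lo, hi = rank * shard, (rank + 1) * shard
        result.append([sum(g[i] for g in grads_per_rank) for i in range(lo, hi)])
    return result


# Two hypothetical ranks, each holding a local gradient of four elements.
grads = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
full = all_reduce(grads)
shards = reduce_scatter(grads)
```

Each rank's reduce-scatter output equals its own slice of the all-reduce output, so a rank that only updates its own partition gets everything it needs while receiving `1 / world_size` of the reduced data.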

Answer selected by ghadialhajj
Category
Q&A
2 participants