
DC-ASGD (Delay Compensated Asynchronous Stochastic Gradient Descent)? #8744

Closed
liufei1656 opened this issue Mar 27, 2017 · 8 comments
Labels: stat:contribution welcome, type:feature

Comments

liufei1656 commented Mar 27, 2017

DC-ASGD is a useful algorithm from Microsoft for distributed asynchronous training. Compared with the ordinary ASGD algorithm, DC-ASGD has no significant loss in speed, but achieves almost the same accuracy as sequential SGD. As far as I know, the other mainstream open-source deep learning tools have implemented this algorithm, e.g. CNTK, MXNet, and Paddle. But I have not found a similar module in TensorFlow.

What related GitHub issues or StackOverflow threads have you found by searching the web for your problem?

microsoft/CNTK#1295
PaddlePaddle/Paddle#185
apache/mxnet#3614

What other attempted solutions have you tried?

I tried to implement this algorithm in TensorFlow myself, but I do not currently have the expertise to do so.

Link to the paper:

Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning
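
For reference, my understanding of the paper's delay-compensated update rule (where w_bak is the snapshot of the parameters the worker pulled before computing its gradient, η the learning rate, λ the variance control parameter, and ⊙ element-wise multiplication):

```latex
w_{t+1} = w_t - \eta \left( g(w_{\mathrm{bak}})
        + \lambda \, g(w_{\mathrm{bak}}) \odot g(w_{\mathrm{bak}})
          \odot \left( w_t - w_{\mathrm{bak}} \right) \right)
```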

@renyi533

I am also very interested in this. Any update?

@aselle added the stat:contribution welcome and type:feature labels on Mar 28, 2017
@just-a-jazz

I can do this

@just-a-jazz

@aselle TensorFlow's documentation on distributed computing leads me to believe that the end user is responsible for assigning workers and parameter servers to different devices. Should I then implement only Gradient Descent with Delay Compensation, leaving the user to run it asynchronously?

aselle (Contributor) commented Apr 4, 2017

@mrry might be able to give you more direction in this vein (also @tfboyd).

mrry (Contributor) commented Apr 8, 2017

I haven't read the paper in depth, but from a quick skim it looks like you could implement the update rule as a tf.train.Optimizer subclass, and the asynchronous execution would be provided by whatever training loop the user used. (It would presumably "work" in a synchronous case as well, because w_t − w_bak would be zero, and hence it would devolve into classic SGD.)
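
Something like the following minimal sketch, against the TF 1.x `tf.train.Optimizer` internals (`_create_slots`/`_prepare`/`_apply_dense`); the class name, the `variance_parameter` argument, and the `"shadow"` slot name are all illustrative, not anything that exists in TensorFlow:

```python
import tensorflow as tf

class DCASGDOptimizer(tf.train.Optimizer):
    """Sketch of the DC-ASGD update rule as an Optimizer subclass.

    Applies w <- w - lr * (g + lambda * g * g * (w - w_bak)), where
    w_bak is a per-variable "shadow" copy of the last value this
    optimizer wrote, standing in for the (possibly stale) parameters
    the worker computed its gradient against.
    """

    def __init__(self, learning_rate, variance_parameter=2.0,
                 use_locking=False, name="DCASGD"):
        super(DCASGDOptimizer, self).__init__(use_locking, name)
        self._lr = learning_rate
        self._lambda = variance_parameter

    def _create_slots(self, var_list):
        # Initialize each shadow slot to the variable's starting value,
        # so the first compensation term is exactly zero.
        for v in var_list:
            self._get_or_make_slot(v, v.initialized_value(),
                                   "shadow", self._name)

    def _prepare(self):
        self._lr_t = tf.convert_to_tensor(self._lr, name="learning_rate")

    def _apply_dense(self, grad, var):
        lr = tf.cast(self._lr_t, var.dtype.base_dtype)
        shadow = self.get_slot(var, "shadow")
        # First-order delay compensation: g + lambda * g * g * (w_t - w_bak).
        compensated = grad + self._lambda * grad * grad * (var - shadow)
        var_update = tf.assign_sub(var, lr * compensated,
                                   use_locking=self._use_locking)
        with tf.control_dependencies([var_update]):
            # Record the freshly written value as the next w_bak.
            shadow_update = tf.assign(shadow, tf.identity(var),
                                      use_locking=self._use_locking)
        return tf.group(var_update, shadow_update)
```

It would be used like any other optimizer (`opt = DCASGDOptimizer(0.1)`, then `opt.minimize(loss)`). One caveat: in the paper the parameter server keeps one backup copy per worker, keyed by which worker sent the gradient; the single shadow slot above is a simplification of that.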

YirenBj commented Apr 19, 2017

I have already implemented this. Currently I am verifying its impact on metrics such as AUC on our ads data; I will send a pull request if it proves genuinely useful.

@MorganGellert

Is there a plan to add support for the adaptive-variance-parameter version of DC-ASGD? The paper found that it outperformed the constant version, which is what is currently implemented, in all cases.
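
If I'm reading the paper correctly, the adaptive variant replaces the constant λ with λ divided by the square root of a running average of the squared gradients. Against the sketch earlier in this thread, that would amount to something like the following in `_apply_dense` (the `mean_square` slot name and the decay/epsilon constants are illustrative):

```python
# Assumes a "mean_square" slot created in _create_slots alongside
# "shadow"; the 0.95 decay and 1e-7 epsilon are illustrative.
mean_square = self.get_slot(var, "mean_square")
ms_new = tf.assign(mean_square,
                   0.95 * mean_square + 0.05 * grad * grad,
                   use_locking=self._use_locking)
adaptive_lambda = self._lambda / (tf.sqrt(ms_new) + 1e-7)
compensated = grad + adaptive_lambda * grad * grad * (var - shadow)
```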

@dynamicwebpaige (Contributor)
Closing this issue, as the PR for DC-ASGD was merged (#9551). Thank you! 👍
