
fluid support asynchronous training #9941

Closed · 12 of 13 tasks
jacquesqiao opened this issue Apr 16, 2018 · 3 comments

Comments

@jacquesqiao (Member) commented Apr 16, 2018:

Project

https://github.com/PaddlePaddle/Paddle/projects/61

Design

Operators

Transpiler #9997

  • Dist-transpile the async trainer program: in async mode there is no need to add the .trainer_n suffix to gradient blocks.
  • Dist-transpile the async pserver program: gradient blocks do not need to be aggregated.
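The two transpiler tasks above can be sketched in plain Python. This is not Paddle API code; `grad_var_name` and `pserver_apply` are hypothetical names illustrating the naming and aggregation difference between sync and async mode.

```python
def grad_var_name(grad, trainer_id, sync_mode):
    """In sync mode each trainer sends its own suffixed gradient copy
    (e.g. "w@GRAD.trainer_0") so the pserver can tell them apart and
    aggregate; in async mode every trainer sends the unsuffixed block."""
    return f"{grad}.trainer_{trainer_id}" if sync_mode else grad

def pserver_apply(param, grads, lr, sync_mode):
    """Sync: aggregate (average) all trainer copies, one update per step.
    Async: apply each incoming gradient immediately, no aggregation."""
    if sync_mode:
        avg = sum(grads) / len(grads)
        return param - lr * avg
    for g in grads:  # each gradient applied as it arrives
        param = param - lr * g
    return param

print(grad_var_name("w@GRAD", 0, sync_mode=True))   # w@GRAD.trainer_0
print(grad_var_name("w@GRAD", 0, sync_mode=False))  # w@GRAD
print(pserver_apply(1.0, [0.2, 0.4], lr=0.1, sync_mode=True))
print(pserver_apply(1.0, [0.2, 0.4], lr=0.1, sync_mode=False))
```

Note that the async path makes two smaller updates instead of one averaged update, which is exactly why no suffixing or aggregation machinery is needed.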

Consider

  • Consider how to apply learning rate decay in asynchronous training. Do we still need lr_decay at all?
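One possible answer to the open question above, sketched in plain Python (not Paddle code): let the pserver keep a single global step, incremented once per received gradient, so that async trainers running at different speeds still share one monotonically decaying rate. `PServerClock` is a hypothetical name.

```python
def exponential_decay(lr0, global_step, decay_steps, decay_rate):
    """Standard exponential decay: lr0 * decay_rate ** (step / decay_steps)."""
    return lr0 * decay_rate ** (global_step / decay_steps)

class PServerClock:
    """Hypothetical scheme for lr decay in async mode: the pserver owns
    the step counter, so decay progress tracks total updates applied,
    not any single trainer's local iteration count."""
    def __init__(self, lr0=0.1, decay_steps=100, decay_rate=0.96):
        self.lr0, self.decay_steps, self.decay_rate = lr0, decay_steps, decay_rate
        self.step = 0

    def on_gradient(self):
        # called once per incoming gradient; returns the rate to use
        self.step += 1
        return exponential_decay(self.lr0, self.step,
                                 self.decay_steps, self.decay_rate)

clock = PServerClock()
lrs = [clock.on_gradient() for _ in range(200)]
print(lrs[0] > lrs[99] > lrs[199])  # True: rate decays as updates accumulate
```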

Benchmark

@typhoonzero (Contributor) commented:
Add an async_listen_and_serv_op (no barrier and no lock).

I think that if we can decouple the barrier operations from grpc_server/client, it would be easy to pass an Attr to the op to toggle async mode on or off.
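A minimal sketch of that idea (plain Python, the RPC layer faked): a single listen_and_serv-style class where a boolean attribute toggles the barrier, instead of a separate async op. The class and method names here are hypothetical illustrations, not Paddle's API.

```python
class ListenAndServ:
    """Sketch of one op serving both modes via an attribute."""
    def __init__(self, num_trainers, sync_mode=True):
        self.num_trainers = num_trainers
        self.sync_mode = sync_mode  # the proposed Attr
        self.pending = []           # barrier buffer (sync mode only)
        self.updates = 0

    def recv_gradient(self, grad):
        if self.sync_mode:
            # barrier: buffer until every trainer has reported,
            # then perform one aggregated update
            self.pending.append(grad)
            if len(self.pending) == self.num_trainers:
                self._apply(sum(self.pending) / len(self.pending))
                self.pending.clear()
        else:
            # async: no barrier, no cross-trainer lock; apply at once
            self._apply(grad)

    def _apply(self, grad):
        self.updates += 1  # stand-in for the real optimizer step

sync_serv = ListenAndServ(num_trainers=2, sync_mode=True)
sync_serv.recv_gradient(0.1); sync_serv.recv_gradient(0.2)
print(sync_serv.updates)  # 1: one aggregated update after the barrier

async_serv = ListenAndServ(num_trainers=2, sync_mode=False)
async_serv.recv_gradient(0.1); async_serv.recv_gradient(0.2)
print(async_serv.updates)  # 2: each gradient applied immediately
```

Since only the `recv_gradient` control flow differs, a single op with an Attr avoids duplicating the server machinery across two ops.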

@jacquesqiao (Member, Author) commented:

@typhoonzero OK, I will write some demo code to see whether we need an independent async_listen_and_serv_op.

@shanyi15 (Collaborator) commented:
Hello, this issue has not been updated in the past month, so we will close it today. If you still need to follow up after it is closed, feel free to reopen it and we will reply within 24 hours. We apologize for any inconvenience the closure causes, and thank you for your support of PaddlePaddle!

The "fluid support async training" project automation moved this issue from In progress to Done on Aug 15, 2018.