MPI-enabled PaddlePaddle #9405

Closed
seiriosPlus opened this issue Mar 27, 2018 · 3 comments


@seiriosPlus
Collaborator

By using the MPI API, we enable PaddlePaddle to take advantage of high-performance, low-latency networks such as InfiniBand.
There are two benefits:

  1. Enable RDMA in PaddlePaddle, which brings high-performance, low-latency networking.
  2. Enable GPUDirect in PaddlePaddle, which brings the highest-throughput, lowest-latency GPU reads and writes.
@wangkuiyi
Collaborator

wangkuiyi commented Mar 28, 2018

It looks to me that in order to utilize InfiniBand and GPUDirect via MPI, we need to call MPI_Allreduce?

MPI_Allreduce is mutually exclusive with the parameter server, fault tolerance, and elastic scheduling. It would be important to draft a design doc to make sure that Fluid supports both modes -- AllReduce and ParameterServer.
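For reference, here is a minimal sketch of the gradient-averaging step that MPI_Allreduce enables, written as a standalone C program against the MPI API. It is independent of PaddlePaddle's code; the buffer size and gradient values are toy placeholders:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Toy stand-in for a trainer's local gradient tensor. */
    float grads[4] = {1.0f * rank, 2.0f, 3.0f, 4.0f};
    float summed[4];

    /* Sum gradients across all trainers; every rank receives the
       result, so no parameter server is involved. */
    MPI_Allreduce(grads, summed, 4, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

    /* Average so every trainer applies the identical update locally. */
    for (int i = 0; i < 4; ++i)
        summed[i] /= (float)nranks;

    if (rank == 0)
        printf("averaged grad[0] = %f\n", summed[0]);

    MPI_Finalize();
    return 0;
}
```

Because every rank ends up with the same averaged gradients, each trainer updates its own copy of the parameters, which is exactly why this mode has no central parameter server to make fault-tolerant or scale elastically.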

@wangkuiyi
Collaborator

Also, I noticed that the current distributed computing solution includes a transpiler that generates the ProgramDesc messages for trainers and parameter servers. If we are going to use MPI_Allreduce to replace parameter servers, do we need a new transpiler that generates the ProgramDesc for trainers in the AllReduce mode?

@seiriosPlus
Collaborator Author

Our current target is to speed up PaddlePaddle with distributed training. For that, we need to call the MPI_Isend and MPI_Irecv APIs to send and receive data between nodes in the MPI cluster (see the sketch after the list below).
Introducing the Open MPI API to PaddlePaddle can bring two benefits:

  1. Enable RDMA in PaddlePaddle, which brings high-performance, low-latency networking.
  2. Enable GPUDirect in PaddlePaddle, which brings the highest-throughput, lowest-latency GPU reads and writes.
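
To illustrate, here is a minimal sketch of a non-blocking exchange with MPI_Isend/MPI_Irecv, again as a standalone C program against the MPI API. The buffer size and the ring topology are assumptions for illustration, not PaddlePaddle's actual communication pattern:

```c
#include <mpi.h>
#include <stdio.h>

#define N 1024  /* toy tensor size */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    float sendbuf[N], recvbuf[N];
    for (int i = 0; i < N; ++i)
        sendbuf[i] = (float)rank;

    /* Non-blocking send to the next rank and receive from the previous
       rank in a ring; both calls return immediately. */
    int next = (rank + 1) % nranks;
    int prev = (rank + nranks - 1) % nranks;
    MPI_Request reqs[2];
    MPI_Isend(sendbuf, N, MPI_FLOAT, next, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(recvbuf, N, MPI_FLOAT, prev, 0, MPI_COMM_WORLD, &reqs[1]);

    /* Computation can overlap with the transfers here. */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    if (rank == 0)
        printf("rank 0 received %f from rank %d\n", recvbuf[0], prev);

    MPI_Finalize();
    return 0;
}
```

With an RDMA-capable transport such as Open MPI over InfiniBand, these transfers can bypass the kernel network stack, and a CUDA-aware MPI build can use GPUDirect to read and write GPU memory directly.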
