I've been dealing with an issue that...well, I was convinced shouldn't be an issue, so I never said anything about it until dask/dask-blog#5. But after a discussion with @guillaumeeb, he convinced me that maybe I'm not as crazy (or as ill-informed) as I thought I was. So, here's the issue...
I've been trying to figure out a way of launching the Dask Scheduler, Workers, and the Client script in the same MPI environment. Currently, the way `dask-mpi` works is that the Scheduler and the Workers are started, and you separately connect your client (in your separate script) to the Scheduler via, for example, the `scheduler.json` file.
I discussed with @guillaumeeb one approach that should work, something like the following:
```shell
# [PBS header info requesting N MPI processes]
mpirun -np N dask-mpi [dask-mpi options] &
python my_dask_script.py
```
However, this launches Scheduler/Worker processes on all N allocated MPI processes, and then the `python my_dask_script.py` process could, potentially, run on the same process as the Scheduler, for example. If you have a compute-intensive client script, this could be problematic.
What I was originally hoping for was a solution that allowed something more like this:
```shell
# [PBS header info requesting N MPI processes]
mpirun -np N dask-mpi [dask-mpi options] --script my_dask_script
```
But after thinking about it for a while, I found that what I really wanted was something that worked like this:
```shell
# [PBS header info requesting N MPI processes]
mpirun -np N python my_dask_mpi_script.py
```
where `my_dask_mpi_script.py` has something like an `import dask_mpi` line that does the following:
- lets MPI rank 0 pass through,
- launches the Scheduler on MPI rank 1 and runs an `IOLoop` until the rank 0 process is complete, and
- launches the Workers on MPI ranks >1, which also run until the rank 0 process is complete.
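To make the idea concrete, here is a rough sketch of the dispatch I'm imagining. This is purely illustrative: `role_for_rank` and `serve_until_done` are hypothetical names, and a `threading.Event` stands in for the real MPI shutdown signal (which would presumably come from rank 0 via `mpi4py`'s `MPI.COMM_WORLD`, with the Scheduler/Worker event loop actually running where the sleep is):

```python
import threading
import time


def role_for_rank(rank):
    """Map an MPI rank to its role in the proposed layout (hypothetical)."""
    if rank == 0:
        return "client"      # rank 0 passes through and runs the user script
    if rank == 1:
        return "scheduler"   # rank 1 hosts the Dask Scheduler
    return "worker"          # ranks > 1 host Dask Workers


def serve_until_done(done, poll_interval=0.01):
    """Stand-in for "run the IOLoop until the rank 0 process is complete".

    In real code `done` would be some signal from rank 0 (e.g. an MPI
    message checked periodically from inside the event loop); here a
    threading.Event plays that role so the sketch is self-contained.
    """
    while not done.is_set():
        time.sleep(poll_interval)  # the Scheduler/Worker IOLoop would run here
    return "stopped"
```

The intent is that `import dask_mpi` (or an `initialize()` call) would look up the process's MPI rank, return immediately on the client rank so the rest of the script runs, and park every other rank in something like `serve_until_done` until rank 0 signals that it has finished.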
At this point, I feel like I could write this myself...except that I don't know how to implement the "run the `IOLoop` until the rank 0 process is complete" part.
Any thoughts? Are there different solutions? Would you recommend something different?