
Add MPI-capable tasks #9

Merged
merged 61 commits into from May 29, 2019

Conversation

ocaisa
Contributor

@ocaisa ocaisa commented May 20, 2019

(in the same namespace)
Fixes #8

I'm not 100% sure that the way I approach the serialization is the right one. Do you know what the cost of serialization is? Here I effectively do it three times: I serialise the function, dask serialises the result, and MPI serialises the result of that (to do the broadcast).
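The triple-serialization concern above can be illustrated without any of the PR's actual code. The sketch below (purely hypothetical names; plain `pickle` stands in for whatever dask and MPI actually use) nests three rounds of pickling and shows that the outer layers only add framing overhead on top of the first, dominant pass:

```python
import pickle

def task():
    # stand-in for an MPI-capable task function shipped to workers
    return list(range(100))

# Layer 1: serialise the function itself so it can be sent to a worker
payload = pickle.dumps(task)

# Layer 2: dask would serialise that payload for transport; emulate with pickle
layer2 = pickle.dumps(payload)

# Layer 3: the MPI broadcast serialises once more; emulate again
layer3 = pickle.dumps(layer2)

print(len(payload), len(layer2), len(layer3))
```

Pickling an already-pickled `bytes` object is cheap (each extra layer adds only a few dozen bytes of framing), so the real cost is the first serialization of large objects, not the nesting itself.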

The restriction here is that only the root MPI process can return something back to dask (so any management related to that has to be handled by the task itself).
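The root-only return restriction can be sketched as a pattern: every rank computes, the result is reduced onto the root, and only rank 0 hands anything back. This is an illustrative sketch, not the PR's API; `FakeComm` is a hypothetical stand-in for mpi4py's `COMM_WORLD` (with the reduce hardcoded) so it runs without an MPI runtime:

```python
class FakeComm:
    """Hypothetical stand-in for an MPI communicator, for illustration only."""
    def __init__(self, rank, size):
        self._rank, self._size = rank, size

    def Get_rank(self):
        return self._rank

    def reduce(self, value, root=0):
        # Emulates a sum-reduce of each rank's rank**2 contribution;
        # a real MPI reduce would combine the values sent by all ranks.
        return sum(r * r for r in range(self._size)) if self._rank == root else None

def mpi_task(comm):
    rank = comm.Get_rank()
    local = rank * rank                  # each rank's partial result
    total = comm.reduce(local, root=0)   # combined on the root rank only
    if rank == 0:
        return total                     # only root returns something to dask
    return None                          # non-root ranks manage their own cleanup

results = [mpi_task(FakeComm(r, 4)) for r in range(4)]
print(results)  # only the rank-0 "task" produced a value
```

Any bookkeeping that depends on the full result therefore has to happen inside the task on rank 0, exactly as the comment above says.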

@ocaisa ocaisa changed the title Add proof of concept for tasks with MPI capabilities WIP: Add proof of concept for tasks with MPI capabilities May 20, 2019
@ocaisa ocaisa mentioned this pull request May 20, 2019
@AdamWlodarczyk
Collaborator

Overall, I really like this approach. If this solution only needs to work for MPI tasks that are essentially primitive, it should work flawlessly (I am pointing at problems that @dwhswenson and I had some time ago with serialization of OPS storage objects).
Since some validation is needed in (de)serialization, I would go for creating a class that handles most of that. May I suggest a class-based solution?

@ocaisa
Contributor Author

ocaisa commented May 21, 2019

@AdamWlodarczyk The serialization stuff is my weak point; if you can suggest something, that would be great.

mpi_pickling.py — review comment (outdated, resolved)
@ocaisa
Contributor Author

ocaisa commented May 21, 2019

What's left is to swap out the `_command_template` of https://github.com/dask/dask-jobqueue/blob/master/dask_jobqueue/core.py#L284, replacing `distributed.cli.dask_worker` with `jobqueue_features.mpi_dask_worker`, and then wrap the whole command using `mpi_wrap`.
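The swap described above amounts to a string substitution on the worker launch command followed by an MPI launcher prefix. A minimal sketch, assuming a plausible dask-jobqueue worker command and an `mpirun -np 4` prefix standing in for whatever `mpi_wrap` actually emits (both are assumptions, not the PR's real output):

```python
# Plausible example of dask-jobqueue's worker launch command (illustrative)
command_template = (
    "python -m distributed.cli.dask_worker tcp://scheduler:8786 --nthreads 1"
)

# Step 1: swap the worker entry point for the MPI-aware one
mpi_template = command_template.replace(
    "distributed.cli.dask_worker", "jobqueue_features.mpi_dask_worker"
)

# Step 2: wrap the whole command in an MPI launcher
# (what mpi_wrap would conceptually produce; flags here are assumed)
wrapped = f"mpirun -np 4 {mpi_template}"
print(wrapped)
```

The point is that the scheduler-facing interface stays untouched; only the command the batch script executes changes.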

@ocaisa ocaisa changed the title WIP: Add proof of concept for tasks with MPI capabilities Add MPI-capable tasks May 24, 2019
@ocaisa
Contributor Author

ocaisa commented May 24, 2019

@AdamWlodarczyk This is actually working now. Can you review? I'd like to cut a release of this before our workshop. I'll add tests and maybe refactor a bit next week.

Copy link
Collaborator

@AdamWlodarczyk AdamWlodarczyk left a comment


It looks fine for me.

jobqueue_features/clusters.py — review comment (outdated, resolved)
@ocaisa
Contributor Author

ocaisa commented May 28, 2019

It looks fine for me.

Thanks for taking a look @AdamWlodarczyk. I'll address that comment, add some additional testing, and then merge this so I can cut a release to install on JURECA.

I plan on reworking the whole test process to emulate what dask-jobqueue does so that we can actually test all these features in the CI (but that is for a separate PR).

@ocaisa ocaisa merged commit 696f815 into master May 29, 2019
@ocaisa ocaisa deleted the mpi_enabled_tasks branch June 25, 2019 11:49
Development

Successfully merging this pull request may close these issues.

Implementing tasks with MPI capabilities
2 participants