Should dask-mpi section be marked as not necessary in the docs? #3889

Open
guillaumeeb opened this issue Aug 20, 2018 · 2 comments
Labels
documentation Improve or add to documentation

Comments

@guillaumeeb
Member

See http://dask.pydata.org/en/latest/setup/hpc.html#using-mpi.

While working on and with dask-jobqueue, in particular on doc issues (see #118), I stumbled on the question of using dask-jobqueue for batch processing. I believe this is OK and will work in many cases (I've already used it this way), but it won't in some others, where dask-mpi would be more appropriate. See the part of the doc I'm working on:

While dask-jobqueue can perfectly well be used for batch processing, it is
better suited to interactive processing, using tools like IPython or Jupyter
notebooks. Batch processing with dask-jobqueue can be tricky in some cases,
depending on how your cluster is configured and which resources and queues you
have access to: the scheduler might wait a long time before any workers
connect, and you could end up with less computing power than you expected.
Another good solution for batch processing on HPC systems using dask is the
dask-mpi <http://dask.pydata.org/en/latest/setup/hpc.html#using-mpi>_
command.
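For concreteness, here is a minimal sketch of the two patterns being contrasted. The cluster parameters, queue name, and scheduler-file path are made up for illustration:

```python
# Interactive pattern with dask-jobqueue: the Python process acts as the
# scheduler and submits one batch job per worker (or group of workers).
from dask.distributed import Client
from dask_jobqueue import PBSCluster

cluster = PBSCluster(queue="regular", cores=24, memory="100GB",
                     walltime="01:00:00")
cluster.scale(10)         # submits worker jobs; they may sit in the queue for a while
client = Client(cluster)  # computations start with however many workers have connected

# Batch pattern with dask-mpi: one multi-node job starts the scheduler and all
# workers together through MPI, so the resources become available at the same
# time, e.g.
#
#   mpirun -np 25 dask-mpi --scheduler-file scheduler.json
#
# and the user script then connects with:
#
#   client = Client(scheduler_file="scheduler.json")
```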

Maybe we should clarify this between us and also in the Dask docs?

cc @jhamman

@jhamman
Member

jhamman commented Aug 20, 2018

I generally agree with the comments above. Though dask/distributed#2138, if implemented, would alleviate many of the concerns raised here.

@guillaumeeb
Member Author

At first I thought so too, but I'm not so sure anymore. Depending on how the cluster is used, resolving that issue could lead to the scale method never returning, or returning at a resource optimum that decays right afterwards, depending on job start times and walltimes. A different outcome, but largely the same concerns as described above.

The only way to solve this with dask-jobqueue would be to make it able to launch multi-node jobs with workers on them, but we've currently agreed that this is out of scope for dask-jobqueue.
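To illustrate the failure mode, here is a hypothetical "scale and wait" pattern (cluster parameters invented, and wait_for_workers used only to sketch the blocking behaviour a waiting scale would have):

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(cores=36, memory="120GB", walltime="00:30:00")
cluster.scale(100)            # asks the queue for 100 workers
client = Client(cluster)

# May block forever if the queue never grants that many nodes, and workers
# hitting their walltime can drop the count back below 100 right after it
# returns.
client.wait_for_workers(100)
```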

@GenevieveBuckley GenevieveBuckley added the documentation Improve or add to documentation label Oct 12, 2021