Support Dask
Dask is a flexible library for parallel computing in Python, and it is gaining popularity in the Python ecosystem. Because libyt does in-situ analysis by running Python scripts, it is important to support Dask as well.
Current libyt structure
Each MPI rank initializes a Python interpreter, and they work together through mpi4py.
| MPI 0 ~ (N-1) |
| --- |
| Python |
| libyt Python Module |
| libyt C/C++ library |
| Simulation |
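For comparison, here is a minimal sketch of what the in-situ script on each rank can already do under this structure, assuming only plain mpi4py (no libyt-specific API is shown): every rank drives its own embedded interpreter, and the ranks coordinate through MPI collectives.

```python
# Minimal sketch (assumption: plain mpi4py only, no libyt-specific calls).
# Each MPI rank runs this inside its own embedded Python interpreter.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# e.g. every rank computes something from its local data,
# then rank 0 gathers the per-rank results.
local_result = {"rank": rank}
all_results = comm.gather(local_result, root=0)
if rank == 0:
    print(f"gathered results from {len(all_results)} ranks")
```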
How should dask be set up inside the embedded Python?
We can dedicate two additional ranks specifically to the scheduler and the client (not necessarily MPI 0 and 1), and use the rest of the MPI ranks as workers. The simulation also runs inside each worker rank. By following how dask-mpi's initialize() sets up the scheduler, client, and workers, it should be possible to wrap this inside libyt (see the sketch after the table below).
| MPI 0 | MPI 1 | MPI 2 | ... | MPI (N-1) |
| --- | --- | --- | --- | --- |
| Scheduler | Client | Worker | Worker | Worker |
| libyt Python Module | libyt Python Module | libyt Python Module | libyt Python Module | libyt Python Module |
| libyt C/C++ library | libyt C/C++ library | libyt C/C++ library | libyt C/C++ library | libyt C/C++ library |
| Empty | Empty | Simulation | Simulation | Simulation |
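Below is a minimal sketch of that bootstrapping, assuming the documented behavior of dask-mpi's initialize() (scheduler on one rank, client on another, workers on the rest). In libyt the worker ranks must also keep running the simulation, so the way initialize() blocks on worker ranks would have to be adapted rather than reused as-is.

```python
# Sketch only: this follows dask-mpi's documented pattern, not libyt's
# actual implementation. Run with e.g. `mpirun -n N python script.py`.
from dask_mpi import initialize
from dask.distributed import Client

# Rank 0 becomes the scheduler, rank 1 continues past this call as the
# client, and the remaining ranks block here and serve as Dask workers.
initialize(nthreads=1, nanny=False)

# Only the client rank reaches this point; it connects to the scheduler
# that initialize() started and can drive the in-situ analysis from here.
client = Client()
print(client)
```

dask-mpi pins the scheduler and client to MPI 0 and 1 by default; as noted above, libyt would not have to make the same choice.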
Solve the data exchange problem
Because we use Remote Memory Access (one-sided MPI) with settings that require every rank to participate in the procedure (#26), libyt suffers during the data exchange process between MPI ranks: every time yt reads data, all ranks have to wait for each other and synchronize.
However, if we move this data exchange process from the C/C++ side to the Python side, we can exchange data more flexibly and asynchronously using dask. By encoding what each MPI rank should get into a Dask graph, asking each worker to prepare its local grid data, and letting the workers exchange data among themselves, the whole process becomes much easier. (At least much easier than doing it in C/C++. 😅) A rough sketch of this idea follows.
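The sketch below assumes the scheduler/client/worker layout proposed above; the grid ids, the round-robin ownership map, and the libyt.get_local_grid() accessor are placeholders rather than libyt's actual API.

```python
# Hedged sketch: grid ids, the ownership map, and libyt.get_local_grid()
# are all placeholders, not libyt's real interface.
from dask.distributed import Client, wait

client = Client()  # reuse the client connected to the scheduler set up above


def load_local_grid(grid_id):
    # Runs on the worker that owns the grid; `get_local_grid` is a
    # hypothetical stand-in for whatever the libyt Python module exposes.
    import libyt
    return libyt.get_local_grid(grid_id)


# Which worker holds which grid would come from libyt's hierarchy info;
# here we simply spread some example grid ids round-robin over the workers.
workers = list(client.scheduler_info()["workers"])
grid_ids = [0, 1, 2, 3]
grid_to_worker = {gid: workers[gid % len(workers)] for gid in grid_ids}

# Encode the reads as a graph of tasks pinned to the owning workers;
# Dask then moves the results to whoever needs them, asynchronously.
futures = {
    gid: client.submit(load_local_grid, gid, workers=[addr])
    for gid, addr in grid_to_worker.items()
}
wait(list(futures.values()))
data = client.gather(futures)  # gather only the grids this rank asked for
```

The point of the sketch is the shape of the graph: each read becomes a task pinned to the worker that owns the data, and Dask's scheduler handles the asynchronous movement instead of a collective RMA epoch on the C/C++ side.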