# Introduction to dask-mpi

Before proceeding with this notebook, we suggest reading ["Introduction to dask"](Dask.ipynb).

## Initialization
### Interactive jobs

When dask is used interactively (e.g. in a notebook, as here), dask-mpi needs to run in the background as a server, started with a command such as
```bash
mpirun -n $((N+1)) dask-mpi --no-nanny --scheduler-file scheduler.json --nthreads 1
```
where `N+1` is the total number of processes: one scheduler and `N` workers.
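
As a small illustration of the process count, the following sketch assembles the same command line in Python. The helper name `dask_mpi_command` is hypothetical and not part of dask-mpi; it only uses the flags shown in the command above.

```python
def dask_mpi_command(n_workers, nthreads=1, scheduler_file="scheduler.json"):
    """Return the argv list for launching one scheduler plus n_workers workers."""
    if n_workers < 1:
        raise ValueError("at least one worker is required")
    # One extra MPI process is reserved for the scheduler.
    n_processes = n_workers + 1
    return [
        "mpirun", "-n", str(n_processes),
        "dask-mpi", "--no-nanny",
        "--scheduler-file", scheduler_file,
        "--nthreads", str(nthreads),
    ]


# With 8 workers, mpirun is asked for 9 processes.
print(" ".join(dask_mpi_command(8)))
```

Returning a list rather than a single string keeps the arguments ready for `subprocess.run` or similar tools without shell quoting.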

Then, in the notebook, we connect to the server with
```python
from dask.distributed import Client
client = Client(scheduler_file="scheduler.json")
```

### Batch jobs

When dask is used in a script, the script needs to be executed in parallel with a command such as
```bash
mpirun -n $((N+1)) python script.py
```
and the first lines of `script.py` should be
```python
from dask_mpi import initialize
initialize(nthreads=1, nanny=False)

from dask.distributed import Client
client = Client()
```

For more details about dask-mpi refer to its [documentation](https://mpi.dask.org/en/latest/index.html).

## Example
In the following, we start the server and connect to it.
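
Because the server is launched in the background, the scheduler file may not exist yet at the moment we try to connect. The sketch below polls for a file before proceeding. `wait_for_file` is a hypothetical helper, not part of dask or dask-mpi, and the timeout values are arbitrary. Depending on the dask version, `Client` may already wait for the scheduler file on its own, in which case the helper is only a way to fail early.

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll_interval=0.5):
    """Block until `path` exists; return True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)
    # One last check in case the file appeared during the final sleep.
    return os.path.exists(path)
```

In this notebook it could be called with the path of `scheduler.json` inside the temporary directory, just before creating the `Client`.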

In [None]:
import sh
import tempfile

# dask-mpi produces several files, so we work in a temporary directory
tmppath = tempfile.mkdtemp()
sh.cd(tmppath)

# Here we set the number of workers
workers = 8
threads_per_worker = 1

# The command runs in the background (_bg=True); stdout and stderr are
# written to log.out and log.err inside tmppath
server = sh.mpirun("-n", workers+1, "dask-mpi", "--no-nanny", "--nthreads", threads_per_worker,
                   "--scheduler-file", "scheduler.json", _bg=True, _out="log.out", _err="log.err")


In [None]:
from dask.distributed import Client
client = Client(scheduler_file=tmppath+"/scheduler.json")
client

## Workers

Information about the workers can be obtained with
```python
client.scheduler_info()["workers"]
```
which returns a dictionary keyed by worker address, whose values hold the most recent information the scheduler has about each worker.
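
To make the shape of this dictionary concrete, the sketch below condenses it to one short entry per worker. `summarize_workers` is a hypothetical helper, and the `example` dictionary is a hand-written stand-in with made-up addresses; real entries carry more fields, and the `host` and `nthreads` keys are assumed to be present (missing keys yield `None`).

```python
def summarize_workers(workers_info):
    """Map each worker key to (host, nthreads) from a scheduler_info-like dict."""
    return {
        address: (info.get("host"), info.get("nthreads"))
        for address, info in workers_info.items()
    }


# Hand-written stand-in for client.scheduler_info()["workers"].
example = {
    "tcp://10.0.0.1:40001": {"host": "10.0.0.1", "nthreads": 1},
    "tcp://10.0.0.2:40002": {"host": "10.0.0.2", "nthreads": 1},
}
print(summarize_workers(example))
```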


In [None]:
workers = list(client.scheduler_info()["workers"].keys())
workers

In [None]:
# For example, the information known about the first worker
client.scheduler_info()["workers"][workers[0]]

## Distributed operations
We can prepare a group of workers for a task by distributing data to them with
```python
client.scatter(data, workers=workers, broadcast=False, hash=False)
```
where each element of the list `data` is sent to one worker, assigned in round-robin order. The `workers` argument can be `None` (use all workers) or a list selected from the available workers.

The elements of the list should contain the information each worker needs to proceed.
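
The round-robin assignment described above can be pictured with a plain-Python sketch. This only illustrates the pattern and is not dask's implementation; `round_robin` and the worker names are hypothetical, and the real `scatter` may start from a different worker.

```python
from itertools import cycle


def round_robin(items, workers):
    """Pair each item with a worker, cycling through the workers in order."""
    return list(zip(items, cycle(workers)))


assignment = round_robin(range(5), ["worker-a", "worker-b", "worker-c"])
print(assignment)
# [(0, 'worker-a'), (1, 'worker-b'), (2, 'worker-c'), (3, 'worker-a'), (4, 'worker-b')]
```

When the list has exactly as many elements as there are workers, as in the dummy example of this notebook, each worker receives one element.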

Here is a dummy example.

In [None]:
dummy = range(len(workers))
group = client.scatter(dummy, workers=workers, broadcast=False, hash=False)
group

In [None]:
client.who_has(group)

In [None]:
[g.result() for g in group]

To check that the elements are actually distributed, we get the MPI rank of the process holding each one.

In [None]:
def get_rank(*args, comm=None):
    # Runs on a worker and reports the MPI rank of the worker's process
    if comm is None:
        from mpi4py.MPI import COMM_WORLD as comm
    return comm.rank

ranks = client.map(get_rank, group)
ranks = [rank.result() for rank in ranks]
ranks

Note that rank 0 is not in the list, because it hosts the scheduler rather than a worker.

Thus, any MPI operation needs to run on a communicator that involves only the workers and not the scheduler.

In [None]:
def create_comm(*args, ranks=None, comm=None):
    assert ranks
    if comm is None:
        from mpi4py.MPI import COMM_WORLD as comm
    return comm.Create_group(comm.group.Incl(ranks))

comms = client.map(create_comm, group, workers=workers, ranks=ranks, actor=True)
comms

In [None]:
comms = [comm.result() for comm in comms]
comms

In [None]:
client.scheduler.workers()

In [None]:
[comm.rank for comm in comms]

In [None]:
reductions = [comm.allreduce(1) for comm in comms]
[r.result() for r in reductions]