# (truly) Parallel Python
Unfortunately, not all problems are of the "embarrassingly parallel" type. For example, simulations of dynamical systems (nervous/weather/quantum systems) can easily become too computationally heavy for a single machine, both in terms of memory and compute. One solution is to *distribute* such simulations across multiple machines. In particular, this implies that we are using multiple processes, each of which works on a part of the simulation and hence needs to communicate with the others. The de facto standard protocol for inter-process communication (in academia) is the Message Passing Interface (MPI). This protocol defines a standard way for processes to send data to and receive data from each other. Compared to the applications we've considered so far, using MPI *effectively* requires significant cognitive and development overhead, so you should very carefully evaluate whether you need to get your hands this dirty before reimplementing your simulation (or do it for fun as a challenge while your supervisor is on holidays ;) ). In the following we will focus on the `mpi4py` package.

## Starting multiple processes: who am I?
Using MPI requires you to change the way you start your Python program. First, we cannot (easily) run it from a Jupyter notebook. Second, instead of calling it like `python <script>` from the command line, we need to run it via the `mpirun` executable, i.e., `mpirun -np <number of processes> python <script>`, where `-np` specifies how many processes we'd like to start. In MPI each process is assigned a rank, a unique integer starting at 0. This helps you to organize work ("rank X does Y") and communication ("rank X sends Z to rank Y"). We first implement (one of) the simplest possible MPI program(s): report your rank and exit.

In [1]:
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

print(f'hello world from rank {rank}')

hello world from rank 0


As mentioned, using the rank, we can let different processes do different work, for example, drawing random numbers from normal distributions with different means:

In [2]:
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

x = np.random.normal(loc=rank)

print(f'hello world from rank {rank}. my mean is {np.mean(x)}.')

hello world from rank 0. my mean is 0.7738655887422622.


- simple example of mpi program (report rank and exit)
- explain different way of calling it (with mpirun executable)
- discuss issue of unordered execution

Now, let's consider a (super simplified) dynamical systems simulation. We have particles moving in a one-dimensional "box" between 0 and 1. Assuming lots and lots of particles, we may want to split the work of propagating the particles, i.e., computing their new positions, across different processes. Here we decide that each process should propagate the particles within a certain "volume" of the box. For example, using two processes, one of them propagates all particles between 0 and 0.5, the other all particles between 0.5 and 1. Of course, particles can cross the boundary at 0.5 in either direction, and hence we need to communicate positions between the processes. Here, we go for a simple implementation: after each propagation step, information about the new positions is shared across all ranks. Each rank afterwards determines which particles it should propagate in the next step.

- send/recv example
- discuss issue of telling each rank how to behave
- discuss issue of blocking
- discuss issue of what can be sent between ranks (picklable?)

# TODO
- caveat `send` vs `Send`