## MPI: Message Passing Interface

MPI is the de facto standard programming interface for distributed-memory high-performance computing (HPC), i.e., supercomputers and clusters.

* Message passing parallelism
* Cluster computing (no shared memory)
* Process oriented (not thread oriented)
* Parallelism model
  * SPMD: every process runs the same program
* MPI environment
  * Application programming interface
  * Implemented in libraries
  * Support for C/C++ and Fortran


There is a reasonable tutorial at [https://hpc-tutorials.llnl.gov/mpi/abstract/](https://hpc-tutorials.llnl.gov/mpi/abstract/), part of a high-performance computing tutorial series.

MPI routines that are most useful for new MPI programmers include:

* MPI Environment Management 
* Point-to-Point Communications
* Collective Communications

The tutorial does not cover advanced topics such as:

* Derived Data Types
* Group and Communicator Management Routines
* Virtual Topologies

### Why teach MPI?

The communication paradigms, particularly collective communication patterns, are widely used in distributed AI training.

### SPMD: Single program multiple data

From Wikipedia: “Tasks are split up and run simultaneously on multiple processors with different input in order to obtain results faster. SPMD is the most common style of parallel programming.”

* Asynchronous execution of the same program
  
<img src="https://www.sharcnet.ca/help/images/8/8a/SPMD_model.png" width=512 title="SPMD" />

_SPMD_ is not part of Flynn's taxonomy. It is a software programming model. In contrast, SIMD is an architectural classification.

### A first MPI program

* Configure the MPI environment
* Discover yourself
* Take some differentiated activity

All demos are in `./mpi`. Start with `mpimsg.c`.

* Idioms
  * SPMD: all processes run the same program
    * Rank (from `MPI_Comm_rank`): tell yourself apart from the other processes and customize the local process's behavior
    * Find neighbors, select data region, etc.
 
The next demo `passitforward.c` builds a simple ring topology and passes the message around the ring.

<img src="https://upload.wikimedia.org/wikipedia/commons/3/36/MPI_Ring_topology.png" width=256 title="MPI Ring" />


### The MPI Toolchain

The toolchain consists of build and launch scripts. The build script wraps a compiler.
    
<img src="./images/mpiscip.png" width=400 /> 

* To compile an MPI program, call the compiler wrapper (`mpicc` for C).
* To run an MPI program:
  * **debug**: use `mpirun` to launch an MPI job on the local machine or cluster
  * **deploy**: launch through the scheduler on HPC clusters (do not run on the login node)

    
```
mpicc mpimsg.c -o mpimsg
mpirun mpimsg
mpirun -np 16 --oversubscribe mpimsg
```

### MPI Runtime

MPI programs are ordinary C/C++ or Fortran programs that call message-passing library routines.
One designs an SPMD program whose processes collaborate to solve a problem. The program includes:
  * Calls to the MPI library
  * Interactions with the MPI runtime
  
Some calls query or manipulate the runtime:  
* Initialize the environment
  * `MPI_Init ( &argc, &argv )`
* Acquire the information each process needs to differentiate its behavior in SPMD
  * `MPI_Comm_size ( MPI_COMM_WORLD, &num_procs )`
  * `MPI_Comm_rank ( MPI_COMM_WORLD, &ID )`
* And cleanup
  * `MPI_Finalize()`

### MPI Communicators and Groups

The MPI runtime knows the configuration of the job. All processes in the job belong to the global communicator `MPI_COMM_WORLD`. Calling `MPI_Comm_size` on it returns the number of processes, which need not equal the number of nodes.

It is possible to create application- or task-specific scopes with narrower communicators and groups. For example, you may split the global communicator into processes on nodes with GPUs and processes on nodes without.

<img src="https://cvw.cac.cornell.edu/mpiadvtopics/communicators-groups/communicators.gif?v=pAsH8Kcxy00ZJiis72tMMunpeqgvBx0cw65cxeq95hw" width=368 />

Most MPI programs and all our examples will use only the global scope.
  
## MPI Design Ethos
* MPI is just messaging.
    * And synchronization constructs, which are built on messaging
    * And library calls for discovery and configuration
* Computation is done in a C/C++/Fortran SPMD program
* MPI is sometimes called the “assembly language” of supercomputing
    * Simple primitives
    * Build your own communication protocols, application topologies, parallel execution