## MPI: Message Passing Interface

* Message passing parallelism
* Cluster computing (no shared memory)
* Process oriented (not thread oriented)
* Parallelism model
  * SPMD: by definition
* MPI environment
  * Application programming interface
  * Implemented in libraries
  * Multi-language support (C/C++ and Fortran)


### SPMD: Single program multiple data

From Wikipedia: “Tasks are split up and run simultaneously on multiple processors with different input in order to obtain results faster. SPMD is the most common style of parallel programming.”
  * Asynchronous execution of the same program (unlike SIMD)
  
<img src="https://www.sharcnet.ca/help/images/8/8a/SPMD_model.png" width=512 title="SPMD" />



### A first MPI program

* Configure the MPI environment
* Discover yourself
* Take some differentiated activity

All demos are in `./mpi`. Start with `mpimsg.c`.

* Idioms
  * SPMD: all processes run the same program
    * `MPI_Comm_rank`: tell yourself apart from the other processes and customize the local process's behavior
    * Find neighbors, select data region, etc.
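
The "find neighbors, select data region" idiom usually reduces to a little arithmetic on the rank and the process count. The sketch below is one common way to do it, not the approach taken in `mpimsg.c`. The names `block_range`, `block_for_rank`, `ring_left`, and `ring_right` are hypothetical helpers introduced for illustration; they are not part of MPI.

```c
#include <stddef.h>

/* Half-open index range [lo, hi) owned by one process. */
typedef struct {
    size_t lo;
    size_t hi;
} block_range;

/* Split n items as evenly as possible across `size` processes.
   The first (n % size) ranks each get one extra item. */
block_range block_for_rank(int rank, int size, size_t n) {
    size_t base = n / (size_t)size;
    size_t extra = n % (size_t)size;
    size_t r = (size_t)rank;
    block_range b;
    b.lo = r * base + (r < extra ? r : extra);
    b.hi = b.lo + base + (r < extra ? 1 : 0);
    return b;
}

/* Neighbors on a ring of `size` processes, wrapping at both ends. */
int ring_left(int rank, int size)  { return (rank - 1 + size) % size; }
int ring_right(int rank, int size) { return (rank + 1) % size; }
```

In an MPI program, `rank` and `size` would come from `MPI_Comm_rank` and `MPI_Comm_size`. Every process runs the same code but ends up with a different slice of the data and different neighbors.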


### MPI Vision circa 1996 (Poster at Supercomputing)

<img src="https://www.netlib.org/mpi/mpi.gif" width=512 />

The goal of the MPI standardization process was to unify message passing, which was previously spread across many incompatible libraries that were often machine dependent. The standard was meant to be:
  * portable (code reuse across different hardware, software)
  * multiple vendors
  * extensible (value added libraries/tools/applications)

### The MPI Toolchain

Build and launch scripts: `mpicc` wraps a compiler, and `mpirun` starts the processes.
    
<img src="./images/mpiscip.png" width=400 />

* To compile an MPI program, you call the associated wrapper.
* To run an MPI program:
  * **debug** `mpirun` to launch MPI job on the local machine/cluster
  * **deploy** launch through scheduler on HPC clusters (do not run on the login node)

    
```
mpicc mpimsg.c -o mpimsg
mpirun mpimsg
mpirun -np 16 --oversubscribe mpimsg
```

### HPC Scheduler

Schedule many parallel jobs onto a supercomputer based on size, resources needed, priority.
* Maui/Torque
* SLURM
* OGE

Each scheduler has its own submission scripts. You submit jobs through them rather than calling `mpirun` yourself on the login node.
    
<img src="http://docs.alces-flight.com/en/stable/_images/tetris.jpg" width=512 />
    
HPC systems have login nodes that you `ssh` into. **Do not call `mpirun` on login nodes.**
  * Doing so runs the parallel job on the login node itself, which is shared by all users.
 
<img src="https://portal.tacc.utexas.edu/documents/10157/1181317/Login+and+compute+nodes/dd6fa98c-1695-4e62-8b7b-66f0c83ceba3?t=1436213020000" width=512 />  


### MPI Runtime

MPI programs are ordinary C/C++/Fortran programs that make message-passing library calls.
One designs an SPMD program whose processes collaborate to solve a problem. The program includes:
  * Calls to the MPI library
  * Interactions with the MPI runtime
  
Some calls query or manipulate the runtime:  
* Initialize the environment
  * `MPI_Init ( &argc, &argv )`
* Acquire the information each process needs to differentiate its behavior in SPMD
  * `MPI_Comm_size ( MPI_COMM_WORLD, &num_procs )`
  * `MPI_Comm_rank ( MPI_COMM_WORLD, &ID )`
* And cleanup
  * `MPI_Finalize()`
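
Put together, the four runtime calls above give the skeleton of nearly every MPI program. The following is a minimal sketch, not the course's `mpimsg.c`. The file name `hello_mpi.c` and the "coordinating"/"working" roles are assumptions made for illustration.

```c
/* hello_mpi.c (hypothetical file name): minimal SPMD skeleton. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int num_procs, rank;

    MPI_Init(&argc, &argv);                     /* start the MPI runtime */
    MPI_Comm_size(MPI_COMM_WORLD, &num_procs);  /* how many processes in total */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);       /* which one this process is */

    /* Same program everywhere; behavior differs only by rank. */
    if (rank == 0) {
        printf("rank 0 of %d: coordinating\n", num_procs);
    } else {
        printf("rank %d of %d: working\n", rank, num_procs);
    }

    MPI_Finalize();                             /* shut the runtime down */
    return 0;
}
```

Assuming the toolchain described earlier, this would be built with `mpicc hello_mpi.c -o hello_mpi` and launched with `mpirun -np 4 hello_mpi`. The processes run asynchronously, so the order of the printed lines can change from run to run.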
  
## MPI Design Ethos

* MPI is just messaging.
  * And synchronization constructs, which are built on messaging
  * And library calls for discovery and configuration
* Computation is done in the C/C++/Fortran SPMD program

* I’ve heard MPI called the “assembly language” of supercomputing
  * Simple primitives
  * Build your own communication protocols, application topologies, parallel execution
  * The opposite end of the design space from MapReduce (MR) and Spark
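
As a small illustration of building a communication protocol from simple primitives, the sketch below computes the schedule for a tree-shaped broadcast from rank 0: in each round, every process that already holds the data forwards it to one process that does not. MPI already provides `MPI_Bcast` for this; the hand-rolled version is shown only to make the "assembly language" point concrete. The function names `bcast_send_to` and `bcast_rounds` are hypothetical, and the code computes only who sends to whom. In a real program the transfers themselves would be done with point-to-point calls such as `MPI_Send` and `MPI_Recv`.

```c
/* Schedule for a tree broadcast rooted at rank 0.
   In round k, every rank below 2^k already holds the data
   and forwards it to the rank 2^k positions above itself. */

/* Rank that `rank` sends to in `round`, or -1 if it does not send. */
int bcast_send_to(int rank, int size, int round) {
    int stride = 1 << round;
    if (rank < stride && rank + stride < size) {
        return rank + stride;
    }
    return -1;
}

/* Number of rounds needed to reach all `size` processes. */
int bcast_rounds(int size) {
    int rounds = 0;
    while ((1 << rounds) < size) {
        rounds++;
    }
    return rounds;
}
```

With 6 processes this takes 3 rounds: rank 0 sends to rank 1, then ranks 0 and 1 send to ranks 2 and 3, then ranks 0 and 1 send to ranks 4 and 5.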

