# How does it work?

## Useful Linux commands

**pwd** &emsp; prints the current working directory path.

**ls** &emsp; &emsp; lists the files and subdirectories in a directory.

**cd** &emsp; &emsp; stands for *change directory*. This command moves you from one directory to another.


## How do I run MPI?

To install MPI on a local machine, open *Anaconda3 / Anaconda Prompt* and install the mpi4py module, for example with `conda install mpi4py`. On JupyterHub, open a Terminal instead.

## Simple Python script

Usually, when you run a Python program, you will use something like `python script.py`, where `script.py` is a placeholder for your own script.

Similarly, when you run an MPI-enabled program, you will use a launcher such as `mpirun` to start multiple processes of your program, for example `mpirun python script.py`.

When using Open MPI, the default configuration on most machines is to start as many processes as there are cores available.

It is also possible to specify exactly how many processes you want, for example if you are memory constrained. A command such as `mpirun -n 2 python script.py` would launch your program using 2 processes.

If you run a program that is not MPI-enabled with the mpirun launcher and 2 processes, you get 2 instances doing exactly the same thing, without communicating with each other. When doing parallel computation, you instead want to split your input data and distribute it among all participating processes.
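To make the idea of splitting input data concrete, here is a minimal sketch in plain Python, with no MPI calls. The helper name `chunk_bounds` and the sample data are hypothetical, and the loop only simulates what each process would compute on its own. In a real MPI program, each process would pass in its own process index and the total number of processes.

```python
def chunk_bounds(n_items, size, rank):
    """Return (start, stop) of the slice of n_items owned by process `rank`.

    The items are divided as evenly as possible among `size` processes;
    the first `n_items % size` processes receive one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


data = list(range(10))  # hypothetical input data
size = 4                # stands in for the number of launched processes

# Simulate the processes one after another; under MPI each process
# would run this body once, with its own value of `rank`.
for rank in range(size):
    start, stop = chunk_bounds(len(data), size, rank)
    print(f"process {rank} handles {data[start:stop]}")
```

Every item ends up in exactly one chunk, and chunk sizes differ by at most one, so no process is left with a much larger share of the work.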

## MPI-enabled Python script

The most basic MPI-enabled program would look like this:

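```python
%%writefile exercises/myMPI.py

from mpi4py import MPI

comm = MPI.COMM_WORLD   # communicator that contains every launched process
rank = comm.Get_rank()  # number of this process within the communicator
size = comm.Get_size()  # total number of processes in the communicator

print("I am rank", rank, "of", size)
```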


We first import the MPI library from the **mpi4py** package. The **comm** variable stands for *communicator*, which is a fundamental part of the MPI programming paradigm: all communication between processes or groups of processes is sent using a communicator. You can think of it as a pool of processes participating in a meeting. One communicator, the world communicator (`MPI.COMM_WORLD`), always exists and includes all processes.

The other interesting concept is the *rank*. Each process is assigned a unique number, ranging from 0 to *n*-1, where *n* is the total number of processes. This unique number is called a rank and is used to communicate with one process in particular.

Let’s try our first run of our MPI program:

```
! mpiexec -np 4 python exercises/myMPI.py
```

```
I am rank 2 of 4
I am rank 0 of 4
I am rank 1 of 4
I am rank 3 of 4
```


As expected, we have 4 ranks, numbered 0 to 3. The interesting thing to notice is that the output is not in rank order. This is the first thing to remember: these are truly independent processes. Each one does its own thing, whenever it is ready, unless we synchronize them in some way, either explicitly or by adding communication between them.
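One way to get the lines printed in rank order is to add communication: every process sends its message to rank 0, which prints them all. The sketch below assumes mpi4py's `comm.gather`, which returns a list ordered by rank on the root process and `None` everywhere else. The helper name `report_in_rank_order` is hypothetical, and the import is guarded only so that the script fails gently where mpi4py is not installed.

```python
def report_in_rank_order(comm):
    """Collect one message per process on rank 0 and print them in rank order."""
    rank = comm.Get_rank()
    size = comm.Get_size()
    message = f"I am rank {rank} of {size}"

    # gather is a collective call: every process in the communicator
    # must call it. Rank 0 receives a list ordered by rank, the other
    # ranks receive None.
    messages = comm.gather(message, root=0)

    if rank == 0:
        for line in messages:
            print(line)
    return messages


if __name__ == "__main__":
    try:
        from mpi4py import MPI
    except ImportError:
        print("mpi4py is not installed; skipping the MPI demo")
    else:
        report_in_rank_order(MPI.COMM_WORLD)
```

Saved under a name of your choice and started with the same launcher as before (`mpiexec -np 4 python` followed by the file name), only rank 0 prints, so the four lines appear in order from rank 0 to rank 3.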