# <center>Introduction to MPI</center>

## MPI (Message Passing Interface)

### Overview
The Message Passing Interface (MPI) is a standardized and portable message-passing system designed to function on parallel computing architectures. MPI is widely used for parallel programming in high-performance computing (HPC) environments.

MPI addresses the message-passing parallel programming model: data is **moved from the address space** of one process to that of another process through cooperative operations on each process.

### MPI Standard
The MPI standard defines the syntax and semantics of library routines that can be used to write portable message-passing programs in C, C++, and Fortran. The standard has gone through a number of revisions: MPI-3.1 is still widely supported by current implementations, and the most recent versions belong to the MPI-4.x series.

### MPI Implementations
There are several implementations of the MPI standard. Three of the most widely used implementations are:
- **MPICH**: A high-performance and widely portable implementation of MPI.
- **Intel MPI**: Intel's implementation of MPI.
- **OpenMPI**: An open-source MPI implementation that is developed and maintained by a consortium of academic, research, and industry partners.

OpenMPI provides the following MPI build scripts (compiler wrappers) for Linux clusters:


| Implementation | Language | Script name | Underlying compiler |
| --- | --- | --- | --- |
| Open MPI | C | `mpicc` | C compiler for loaded compiler package |
| | C++ | `mpiCC`<br/>`mpic++`<br/>`mpicxx` | C++ compiler for loaded compiler package |
| | Fortran | `mpif77`<br/>`mpif90` | Fortran77 compiler for loaded compiler package<br/>Fortran90 compiler for loaded compiler package. Points to `mpifort`. |




## Setting Up the Environment
To start programming with MPI in C or C++, you need to have an MPI library installed. For this tutorial, we'll use OpenMPI. Below are the steps to install and compile MPI programs using GCC and OpenMPI.

### Installation of OpenMPI
You can install OpenMPI on a Unix-based system using a package manager. For example, on Ubuntu, you can use:
```bash
sudo apt-get update
sudo apt-get install openmpi-bin openmpi-common libopenmpi-dev
```
***Note:*** All libraries and wrappers are already installed on the cluster, so please do not run these commands there.

## Compiling MPI Programs
MPI programs are compiled using the `mpicc` or `mpiCC` compiler wrappers, which are part of the OpenMPI package. These wrappers call the underlying compiler (e.g., GCC) with the correct flags and libraries.


`hello_world.c`
```C
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    printf("Hello world from rank %d out of %d processes\n", world_rank, world_size);

    MPI_Finalize();
    return 0;
}

```


Let's compile the program using the MPI compiler wrapper:

In [None]:
!srun mpicc Code/hello_world.c -o Code/hello_world.o

This should create an executable file called `hello_world.o` in the `Code` directory. Now run the program, either directly with the `mpirun` command or by submitting it to the current cluster through the Slurm job manager with the `srun` command and its parameters:

In [None]:
!srun -N 3 ./Code/hello_world.o  # -N 3 requests 3 nodes; without -n, Slurm starts one task per node by default

Note that the execution block is enclosed in a function called `main()`, which returns the value 0 if it is completed successfully. The declaration of `main()` is mandatory in C/C++.


## MPI Basics

### MPI Initialization and Finalization
- **MPI_Init**: Initializes the MPI execution environment.
- **MPI_Finalize**: Terminates the MPI execution environment.

#### Example: Initialization and Finalization
```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    printf("MPI environment initialized.\n");

    MPI_Finalize();
    printf("MPI environment finalized.\n");
    return 0;
}
```

### Point-to-Point Communication
#### Explanation
- **MPI_Send**: Sends a message to another process.
- **MPI_Recv**: Receives a message from another process.

#### Example: Send and Receive
```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    if (world_rank == 0) {
        int data = 100;
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("Process 0 sent data %d to process 1\n", data);
    } else if (world_rank == 1) {
        int data;
        MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Process 1 received data %d from process 0\n", data);
    }

    MPI_Finalize();
    return 0;
}
```


### MPI Communicators
#### Explanation
- **MPI_COMM_WORLD**: Default communicator including all processes.
- **MPI_Comm_size**: Determines the size of the group associated with a communicator.
- **MPI_Comm_rank**: Determines the rank of the calling process in the communicator.

In [None]:
!srun mpic++ Code/mpi_send_recv.c -o Code/mpi_send_recv.o

In [None]:
!srun -N 2 ./Code/mpi_send_recv.o

### MPI Data Types

To exchange data between processes, the sender and the receiver must describe each message with the same data type, so that the values can be interpreted correctly on both sides. MPI predefines the following primitive data types:


| C Data Types | C Data Types (continued) |
| --- | --- |
|MPI_CHAR<br/>MPI_WCHAR<br/>MPI_SHORT<br/>MPI_INT<br/>MPI_LONG<br/>MPI_LONG_LONG_INT<br/>MPI_LONG_LONG<br/>MPI_SIGNED_CHAR<br/>MPI_UNSIGNED_CHAR<br/>MPI_UNSIGNED_SHORT<br/>MPI_UNSIGNED_LONG<br/>MPI_UNSIGNED<br/>MPI_FLOAT<br/>MPI_DOUBLE<br/>MPI_LONG_DOUBLE|MPI_C_COMPLEX<br/>MPI_C_FLOAT_COMPLEX<br/>MPI_C_DOUBLE_COMPLEX<br/>MPI_C_LONG_DOUBLE_COMPLEX<br/>MPI_C_BOOL<br/>MPI_LOGICAL<br/>MPI_INT8_T<br/>MPI_INT16_T<br/>MPI_INT32_T<br/>MPI_INT64_T<br/>MPI_UINT8_T<br/>MPI_UINT16_T<br/>MPI_UINT32_T<br/>MPI_UINT64_T<br/>MPI_BYTE<br/>MPI_PACKED|


## MPI Collective Communication

The types of collective communication in MPI are:

![Collective](./img/collective_comm.gif)

### Broadcast
#### Explanation
- **MPI_Bcast**: Broadcasts a message from the process with rank "root" to all other processes in the communicator.

#### Example: Broadcast
```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    int data = 0;
    if (world_rank == 0) {
        data = 100;
    }
    MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Process %d received data %d\n", world_rank, data);

    MPI_Finalize();
    return 0;
}
```



MPI predefines the following reduction operations for use with `MPI_Reduce`:

| MPI Reduction Operation | Meaning | C Data Types |
| --- | --- | --- |
| MPI_MAX | maximum | integer, float |
| MPI_MIN | minimum | integer, float |
| MPI_SUM | sum | integer, float |
| MPI_PROD | product | integer, float |
| MPI_LAND | logical AND | integer |
| MPI_BAND | bit-wise AND | integer, MPI_BYTE |
| MPI_LOR | logical OR | integer |
| MPI_BOR | bit-wise OR | integer, MPI_BYTE |
| MPI_LXOR | logical XOR | integer |
| MPI_BXOR | bit-wise XOR | integer, MPI_BYTE |
| MPI_MAXLOC | max value and location | float, double and long double |
| MPI_MINLOC | min value and location | float, double and long double |


In [None]:
!srun mpic++ Code/mpi_broadcast.c -o Code/mpi_broadcast.o

In [None]:
!srun -N 3 ./Code/mpi_broadcast.o

You can experiment with the code in `Code/mpi_broadcast.c`.

In MPI, gather and scatter are collective communication operations used for data distribution and collection among processes in a parallel computing environment.

### MPI Scatter:
The MPI_Scatter operation distributes distinct chunks of data from the root process to all processes in the communicator, including the root itself. Each process receives a unique portion of the data. It is commonly used to divide large datasets so each process can work on its part independently.

### MPI Gather:
The MPI_Gather operation is the reverse of scatter. It collects data from all processes and gathers them into one array in the root process. Each process contributes a portion of data, and the root process stores these in order.

In [None]:
!srun mpic++ Code/mpi_scatter_gather.c -o Code/mpi_scatter_gather.o

In [None]:
!srun -N 2 ./Code/mpi_scatter_gather.o

### Exercise: MPI Reductions with a Fixed Dataset
Objective: In this exercise, you will learn how to use MPI reduction operations (MPI_Reduce) to combine data across multiple processes. The exercise will involve summing values across processes, where each process contributes a fixed value, and the final sum is collected and printed by the root process.

#### Task:
- Initialize MPI and obtain the rank of the process.
- Each process will have a fixed integer value (use the process rank as the value).
- Use MPI_Reduce to compute the sum of all the values from each process.
- The root process (rank 0) will print the result of the sum.
- Add code comments to explain the purpose of the various parts of the program.

#### Steps:
- Set up MPI.
- Use MPI_Reduce to sum the values of all processes.
- Only the root process should print the final sum.

In [None]:
!srun mpicc Exercises/mpi_reductions.c -o Exercises/mpi_reductions.o

In [None]:
!srun -N 2 -n 4 --ntasks-per-node=2 ./Exercises/mpi_reductions.o

See a possible solution to Exercise 1.

In [None]:
!srun mpicc Solutions/mpi_reductions.c -o Solutions/mpi_reductions.o

In [None]:
!srun -N 2 -n 4 --ntasks-per-node=2 ./Solutions/mpi_reductions.o