In [None]:
%%sh

# reset all programs
rm -rf debug*

# MPI Introduction

An introduction to the basic concepts of the **Message Passing Interface** (MPI).

# MPI Communications

Communications with the **Message Passing Interface** (MPI).

## Point-to-point communications

The basic communication method provided by the MPI library: communication between two processes.

* Source process `A` sends a **message** to destination process `B`, which then receives it;
* Communication takes place within a **communicator**;
* The processes are identified by their **rank** in the communicator.

<img src="./Images/send.png" width=40% style="margin-left:auto; margin-right:auto">

### Send and Recv

Calls used to send and receive a simple message.

* `MPI_Send`: see https://www-lb.open-mpi.org/doc/v4.1/man3/MPI_Send.3.php
* `MPI_Recv`: see https://www-lb.open-mpi.org/doc/v4.1/man3/MPI_Recv.3.php

In [None]:
%%writefile main_send_recv.cpp

#include <iostream>
#include <mpi.h>

int main(int argc, char **argv) 
{
    MPI_Init(&argc, &argv);
    
    int process_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &process_rank);
    
    int tag = 10;
    float a[2] = { 1.2f, 3.4f };
    float b[2] = { 0.0f, 0.0f };
    
    if (process_rank == 0)
    {
        // rank 0 sends the two floats stored in a to rank 1
        MPI_Send(a, 2, MPI_FLOAT, 1, tag, MPI_COMM_WORLD);
    }
    else if (process_rank == 1)
    {
        // rank 1 receives them into b; only MPI_Recv fills in the status,
        // so it is declared and printed on the receiving rank only
        MPI_Status status = {};
        MPI_Recv(b, 2, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, &status);

        std::cout<< "Process "<< process_rank<< " ";
        std::cout<< "Status SOURCE: "<< status.MPI_SOURCE<< " ";
        std::cout<< "TAG: "<< status.MPI_TAG<< " ";
        std::cout<< "ERROR: "<< status.MPI_ERROR<< std::endl;
    }
    
    std::cout<< "Process "<< process_rank<< " ";
    std::cout<< "a "<< a[0]<< ", "<< a[1]<< " ";
    std::cout<< "b "<< b[0]<< ", "<< b[1]<< std::endl; 
    
    MPI_Finalize();
    return 0;
}

In [None]:
%%sh

# compile program
mkdir -p ./debug_send_recv
cd debug_send_recv
cmake -DSOURCES="main_send_recv.cpp" ..
make

In [None]:
%%sh

# run program
cd debug_send_recv
mpirun -np 2 2_MPI_Communications

<div class="alert alert-block alert-warning"> <b>NOTE</b>: the <code>if</code> statements are essential. Every process runs exactly the same program, so the rank decides which branch each one executes. </div>

## Message

A message is composed of a **buffer** and an **envelope**.
 
* Data is exchanged in the buffer, an array of `count` elements of some particular **MPI data type**;
* The envelope identifies the message through its source, destination, tag and communicator. A message can be exchanged **only if** the sender and the receiver specify matching envelopes.

<img src="./Images/message.png" width=60% style="margin-left:auto; margin-right:auto">
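
The matching rule can be illustrated without any MPI call. The following is a minimal sketch in plain C++, not MPI's actual implementation: the `Envelope` and `RecvRequest` structs, the `matches` function, and the integer communicator id are all hypothetical names introduced for illustration. The constants `ANY_SOURCE` and `ANY_TAG` stand in for the real wildcards `MPI_ANY_SOURCE` and `MPI_ANY_TAG`, which a receiver may use in place of a specific rank or tag.

```cpp
// Hypothetical stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG (illustration only).
constexpr int ANY_SOURCE = -1;
constexpr int ANY_TAG = -1;

// The envelope attached to a sent message.
struct Envelope
{
    int source;       // rank of the sender
    int destination;  // rank of the receiver
    int tag;          // user-chosen message label
    int communicator; // hypothetical integer id standing in for an MPI_Comm
};

// What a receive call posted on rank `receiver` asks for.
struct RecvRequest
{
    int receiver;     // rank that posted the receive
    int source;       // a specific rank, or ANY_SOURCE
    int tag;          // a specific tag, or ANY_TAG
    int communicator; // must be the same communicator as the send
};

// A message is delivered to a receive only if every envelope field agrees
// (wildcards on the receiving side accept any source or any tag).
bool matches(const Envelope &msg, const RecvRequest &recv)
{
    return msg.communicator == recv.communicator
        && msg.destination == recv.receiver
        && (recv.source == ANY_SOURCE || recv.source == msg.source)
        && (recv.tag == ANY_TAG || recv.tag == msg.tag);
}
```

With the values from `main_send_recv.cpp` (source 0, destination 1, tag 10), a receive posted on rank 1 for source 0 and tag 10 matches, while the same receive with tag 11 does not and would keep waiting.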

## DataTypes

MPI Data types can be:
* Basic types
* Derived types (`MPI_Type_xxx` functions)

<div class="alert alert-block alert-info"> <b>NOTE</b>: a derived type can be built up from basic types. </div>

MPI defines **handles** that allow programmers to refer to data types and structures.

<div class="alert alert-block alert-warning"> In C/C++ the handles are macros (<code>#define MPI_INT</code> …); in Open MPI they expand to pointers to predefined structs. </div>

### C/C++ MPI Data Types

<img src="./Images/types.png" width=90% style="margin-left:auto; margin-right:auto">

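Structured data can be described or packed with different MPI routines, for example (`MPI_Pack` copies data into a contiguous buffer, `MPI_Type_create_struct` creates a new DataType):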

* `MPI_Pack`: https://www.open-mpi.org/doc/v4.1/man3/MPI_Pack.3.php
* `MPI_Type_create_struct`: https://www.open-mpi.org/doc/v4.1/man3/MPI_Type_create_struct.3.php
* ...

<div class="alert alert-block alert-success"> Before using a new DataType in a communication, we must <b>commit</b> it. </div>

* `MPI_Type_commit`: https://www.open-mpi.org/doc/v4.1/man3/MPI_Type_commit.3.php

<div class="alert alert-block alert-danger"> <b>REMARK</b>: A data type we created should be <b>freed</b> once it is no longer needed, and in any case before closing the application:</div>

* `MPI_Type_free`: https://www.open-mpi.org/doc/v4.1/man3/MPI_Type_free.3.php

<div class="alert alert-block alert-info"> <b>NOTE</b>: We will look at data types in more detail when we cover <b>advanced</b> data types. </div>

In [None]:
%%writefile main_struct.cpp

#include <cstddef> // offsetof
#include <iostream>
#include <mpi.h>

struct Car 
{
    int Model;
    int Color;
};

int main(int argc, char **argv) 
{
    MPI_Init(&argc, &argv);

    const int tag = 13;
    int size, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // create an MPI type for struct Car
    const int nitems = 2; // number of struct fields
    int blocklengths[nitems] = { 1, 1 }; // number of elements in each field
    MPI_Datatype types[nitems] = { MPI_INT, MPI_INT }; // MPI type of each struct field
    MPI_Datatype mpi_car_type; // the new MPI dataType
    MPI_Aint offsets[nitems]; // byte offset of each field within the struct

    offsets[0] = offsetof(Car, Model);
    offsets[1] = offsetof(Car, Color);

    // create the new dataType
    MPI_Type_create_struct(nitems, blocklengths, offsets, types, &mpi_car_type);
    MPI_Type_commit(&mpi_car_type); // commit operation

    Car car = { 0, 0 };
    
    if (rank == 0) 
    {
        car.Model = 4;
        car.Color = 100;

        MPI_Send(&car, 1, mpi_car_type, 1, tag, MPI_COMM_WORLD);

        std::cout<< "Process "<< rank<< ": sent structure car"<< std::endl;
    }
    else if (rank == 1) 
    {
        MPI_Status status;
        MPI_Recv(&car, 1, mpi_car_type, 0, tag, MPI_COMM_WORLD, &status);
        std::cout<< "Process "<< rank<< ": recv structure car"<< std::endl;
    }
    
    std::cout<< "Process "<< rank<< ": car.Model "<< car.Model<< " car.Color "<< car.Color<< std::endl;

    MPI_Type_free(&mpi_car_type); // destroy operation

    MPI_Finalize();
    return 0;
}

In [None]:
%%sh

# compile program
mkdir -p ./debug_struct
cd debug_struct
cmake -DSOURCES="main_struct.cpp" ..
make

In [None]:
%%sh

# run program
cd debug_struct
mpirun -np 3 2_MPI_Communications
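
The struct example computes its offsets with `offsetof` instead of adding up field sizes by hand. The reason is that the compiler may insert padding between fields, which matters as soon as the fields have different types. The sketch below shows this in plain C++ (no MPI needed); the `Particle` struct and the helper functions are hypothetical names introduced for illustration, and the exact sizes are platform dependent.

```cpp
#include <cstddef> // offsetof, std::size_t

// Hypothetical struct with mixed field types (not part of the Car example).
struct Particle
{
    char   Kind;     // 1 byte
    double Position; // typically 8 bytes, usually aligned to 8
    int    Id;       // typically 4 bytes
};

// Size the struct would have if its fields were stored back to back.
constexpr std::size_t packed_size()
{
    return sizeof(char) + sizeof(double) + sizeof(int);
}

// True when the compiler added padding between or after the fields.
constexpr bool has_padding()
{
    return sizeof(Particle) > packed_size();
}

// Offsets as the compiler actually laid the fields out; these are the
// values that belong in the MPI_Aint offsets array.
constexpr std::size_t kind_offset()     { return offsetof(Particle, Kind); }
constexpr std::size_t position_offset() { return offsetof(Particle, Position); }
constexpr std::size_t id_offset()       { return offsetof(Particle, Id); }
```

On many 64-bit platforms `position_offset()` is 8 rather than 1, so offsets computed by summing field sizes would describe the wrong memory locations to `MPI_Type_create_struct`.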

## More about communications...

For a communication to succeed:

1. Always specify a **valid** source/destination rank in the communicator;
2. The communicator must be **the same**;
3. Tags must **match**;
4. Buffers must be **large enough**!

<div class="alert alert-block alert-danger">Check all the arguments very carefully: a call with mismatched arguments may still succeed but deliver wrong data. </div>

In a perfect world, every send operation would be perfectly synchronized with its matching receive. 
This is **rarely** the case.

<div class="alert alert-block alert-success">The MPI implementation can store (buffer) the data when the two tasks are out of sync. </div>

## Blocking and Non-blocking communications

MPI point-to-point routines can be used in either **blocking** or **non-blocking** mode.

<div class="alert alert-block alert-info"> Non-blocking communications are identified by the prefix <code>I</code> in the routine name, for example <code>MPI_Isend</code>.</div>

<img src="./Images/non_block.png" width=90% style="margin-left:auto; margin-right:auto">

<div class="alert alert-block alert-warning"><b>NOTE</b>: Overlapping communication with computation is not always possible, but it is worth trying. The benefit depends on how much
calculation can be done that does not require the transferred data.</div>

### Isend and Irecv

Calls used to start a non-blocking send and a non-blocking receive.

* `MPI_Isend`: see https://www.open-mpi.org/doc/v4.1/man3/MPI_Isend.3.php
* `MPI_Irecv`: see https://www.open-mpi.org/doc/v4.1/man3/MPI_Irecv.3.php

<div class="alert alert-block alert-danger"> <b>REMARK</b>: we must <b>wait</b> for each non-blocking operation to complete before reusing or reading its buffer:</div>

* `MPI_Wait`: https://www.open-mpi.org/doc/v4.1/man3/MPI_Wait.3.php
* `MPI_Waitall`: https://www.open-mpi.org/doc/v4.1/man3/MPI_Waitall.3.php
* ...

In [None]:
%%writefile main_isend_irecv.cpp

#include <iostream>
#include <mpi.h>

int main(int argc, char **argv) 
{
    MPI_Init(&argc, &argv);
    
    int process_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &process_rank);
    
    int tag = 10;
    float a[2] = { 0.0, 0.0 };
    float b[2] = { 0.0, 0.0 };
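    MPI_Status status[2];
    // start from null requests so that MPI_Wait returns immediately on ranks that post nothing
    MPI_Request request[2] = { MPI_REQUEST_NULL, MPI_REQUEST_NULL };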
    
    if (process_rank == 0)
    {
        a[0] = 1.0; a[1] = 2.0;
        MPI_Irecv(&b, 2, MPI_FLOAT, 1, tag + 1, MPI_COMM_WORLD, &request[1]);
        MPI_Isend(&a, 2, MPI_FLOAT, 1, tag, MPI_COMM_WORLD, &request[0]);
    }
    else if (process_rank == 1)
    {
        b[0] = 3.0; b[1] = 4.0;
        MPI_Irecv(&a, 2, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, &request[1]);
        MPI_Isend(&b, 2, MPI_FLOAT, 0, tag + 1, MPI_COMM_WORLD, &request[0]);
    }

    std::cout<< "Process "<< process_rank<< " waiting..."<< std::endl;
    
    MPI_Wait(&request[0], &status[0]);
    MPI_Wait(&request[1], &status[1]);
    //MPI_Waitall(2, request, status); // alternative
    
    std::cout<< "Process "<< process_rank<< " ";
    std::cout<< "Status[0] SOURCE: "<< status[0].MPI_SOURCE<< " ";
    std::cout<< "TAG: "<< status[0].MPI_TAG<< " ";
    std::cout<< "ERROR: "<< status[0].MPI_ERROR<< std::endl;
    
    std::cout<< "Process "<< process_rank<< " ";
    std::cout<< "Status[1] SOURCE: "<< status[1].MPI_SOURCE<< " ";
    std::cout<< "TAG: "<< status[1].MPI_TAG<< " ";
    std::cout<< "ERROR: "<< status[1].MPI_ERROR<< std::endl;
    
    std::cout<< "Process "<< process_rank<< " ";
    std::cout<< "a "<< a[0]<< ", "<< a[1]<< " ";
    std::cout<< "b "<< b[0]<< ", "<< b[1]<< std::endl; 
    
    MPI_Finalize();
    return 0;
}

In [None]:
%%sh

# compile program
mkdir -p ./debug_isend_irecv
cd debug_isend_irecv
cmake -DSOURCES="main_isend_irecv.cpp" ..
make

In [None]:
%%sh

# run program
cd debug_isend_irecv
mpirun -np 2 2_MPI_Communications

## Deadlock

A **deadlock** occurs when $2$ (or more) processes are **blocked**, each waiting
for the other to make progress.

<img src="./Images/deadlock.png" width=60% style="margin-left:auto; margin-right:auto">

<div class="alert alert-block alert-warning"><b>NOTE</b>: a deadlocked program hangs until the allocated time (and budget) expires, and <b>no work</b> is done.</div>

In [None]:
%%writefile main_deadlock.cpp

#include <iostream>
#include <mpi.h>

int main(int argc, char **argv) 
{
    MPI_Init(&argc, &argv);
    
    int process_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &process_rank);
    
    int tag = 10;
    float a[2] = { 0.0, 0.0 };
    float b[2] = { 0.0, 0.0 };
    MPI_Status status;
    
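    if (process_rank == 0)
    {
        a[0] = 1.0; a[1] = 2.0;
        // blocks until rank 1 sends, but rank 1 is itself blocked in MPI_Recv
        MPI_Recv(&b, 2, MPI_FLOAT, 1, tag + 1, MPI_COMM_WORLD, &status);
        MPI_Send(&a, 2, MPI_FLOAT, 1, tag, MPI_COMM_WORLD); // never reached
    }
    else if (process_rank == 1)
    {
        b[0] = 3.0; b[1] = 4.0;
        // blocks until rank 0 sends, but rank 0 is itself blocked in MPI_Recv
        MPI_Recv(&a, 2, MPI_FLOAT, 0, tag, MPI_COMM_WORLD, &status);
        MPI_Send(&b, 2, MPI_FLOAT, 0, tag + 1, MPI_COMM_WORLD); // never reached
    }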
        
    std::cout<< "Process "<< process_rank<< " ";
    std::cout<< "Status SOURCE: "<< status.MPI_SOURCE<< " ";
    std::cout<< "TAG: "<< status.MPI_TAG<< " ";
    std::cout<< "ERROR: "<< status.MPI_ERROR<< std::endl;
        
    std::cout<< "Process "<< process_rank<< " ";
    std::cout<< "a "<< a[0]<< ", "<< a[1]<< " ";
    std::cout<< "b "<< b[0]<< ", "<< b[1]<< std::endl; 
    
    MPI_Finalize();
    return 0;
}

In [None]:
%%sh

# compile program
mkdir -p ./debug_deadlock
cd debug_deadlock
cmake -DSOURCES="main_deadlock.cpp" ..
make

In [None]:
%%sh

# run program
cd debug_deadlock
mpirun -np 2 2_MPI_Communications