# MPI Tutorial

MPI stands for Message Passing Interface. It is a standard specification for the developers and users of message-passing libraries, often referred to as MPI libraries. MPI libraries allow processes to send messages to one another, even when they run on different compute nodes. This lets us distribute a single program across several computers.

Let's begin by reviewing some of the concepts of MPI in [this slide deck](slides/Tapia2022IntroMPI.pdf). You can use these slides as a reference for the examples that follow.
 

## Example 1 - Hello World

Let's run our first MPI program. Go to [mpi_hello.c](c/mpi_hello.c) and read the code. 

How many times will each `printf` execute?

In [None]:
!cd c && make -f Makefile mpi_hello && srun mpirun --oversubscribe -N 10 ./mpi_hello

## Example 2 - Dot Product

MPI Example - Dot (scalar) product of two vectors - C Version

This program demonstrates a simple data decomposition. The master task
first initializes two arrays and then distributes an equal portion of each
array to the other tasks. After the other tasks receive their portions
of the arrays, they perform a dot product operation on their array elements.
They also maintain a sum for their portion of the arrays. The master task
does likewise with its portion of the arrays. As each of the non-master
tasks finishes, it sends its updated portion of the array to the master.
An MPI collective communication call is used to collect the sums
maintained by each task.

Notice the use of `MASTER` to select the region of code to execute. Likewise, notice the use
of `MPI_Send` and `MPI_Recv` in pairs to send and receive information between processes.

Go to [mpi_DotProd.c](c/mpi_DotProd.c) to take a look at the code. 
 
Vary the value passed to `-N` in the command below and see whether the execution time changes.

In [None]:
!cd c && make -f Makefile mpi_DotProd && srun mpirun --oversubscribe -N 12 ./mpi_DotProd

## Example 3 - Array

This program demonstrates a simple data decomposition. The master task
first initializes an array and then distributes an equal portion of that
array to the other tasks. After the other tasks receive their portion
of the array, they perform an addition operation to each array element.
They also maintain a sum for their portion of the array. The master task
does likewise with its portion of the array. As each of the non-master
tasks finishes, it sends its updated portion of the array to the master.
An MPI collective communication call is used to collect the sums
maintained by each task.  Finally, the master task displays selected
parts of the final array and the global sum of all array elements.

`NOTE: the number of MPI tasks must be evenly divisible by 4.`

Go to [mpi_array.c](c/mpi_array.c) to check out the code.

Notice the `MPI_Reduce` call with the `MPI_SUM` operation. This is a collective operation.
Can you describe this operation? 

In [None]:
!cd c && make -f Makefile mpi_array && srun mpirun --oversubscribe -N 12 ./mpi_array

## Example 4 - Matrix Multiplication

MPI Matrix Multiply - C Version

In this code, the master task distributes a matrix multiply
operation to `numtasks-1` worker tasks.

This example distributes the work across rows. Can you rewrite this
example to distribute across columns? What are the complications?

Go to [mpi_mm.c](c/mpi_mm.c) and check out the code.

Hint: Remember the reduction operation in the previous example.


In [None]:
!cd c && make -f Makefile mpi_mm && srun mpirun --oversubscribe -N 12 ./mpi_mm

## Example 5 - Calculating PI

MPI pi Calculation Example - C Version

Point-to-point communications example.
This program calculates pi using a "dartboard" algorithm. See
Fox et al. (1988), Solving Problems on Concurrent Processors, vol. 1,
page 207. All processes contribute to the calculation, with the
master averaging the values for pi. This version uses low-level
sends and receives to collect results.

Go to [mpi_pi_send.c](c/mpi_pi_send.c) to take a look at the code.

This example uses a real application to demonstrate the use of MPI.

In [None]:
!cd c && make -f Makefile mpi_pi_send && srun mpirun --oversubscribe -N 12 ./mpi_pi_send

## Final exercise

Can you write a program that calculates the average of an array of elements in parallel?

Hint: Remember the reduction operation.