# Hands-On 3: Parallelization with MPI

Team: Joao Vitor Mendes, Roberto Santana Santos

Welcome to the Hands-on _Parallelization with MPI_. This Hands-on comprises 3 sessions. The table below lists the code and the file used in each session's exercise.

| Session   | Code               | File            |
| --------- | ------------------ | --------------- |
| Session 1 | Basic Operations   | `operations.c`  |
| Session 2 | Algebraic Function | `function.c`    |
| Session 3 | Tridiagonal Matrix | `tridiagonal.c` |


## `Basic Operations`

The algorithm below computes the sum, the subtraction and the product of the elements of a vector of integers. The variable `array` is the vector on which the operations are performed. Modify the program to run in parallel using MPI and present the primitives used. The idea is to build an MPI version that runs with exactly $4$ processes. Three of them are workers, each performing one operation: $1$ adds, $1$ subtracts and $1$ multiplies. The remaining process is the master: it tells each of the other $3$ its operation and, when they finish, prints the results.

### Sequential Execution

In [1]:
%%writefile operations.c
#include <stdio.h> 
#define SIZE 12

int main (int argc, char **argv)
{
 int i, sum = 0, subtraction = 0, mult = 1; 
 int array[SIZE];

 for(i = 0; i < SIZE; i++) 
  array[i] = i + 1;

 for(i = 0; i < SIZE; i++)
   printf("array[%d] = %d\n", i, array[i]);

 for(i = 0; i < SIZE; i++) 
 {
   sum = sum + array[i];
   subtraction = subtraction - array[i]; 
   mult = mult * array[i];
 }

 printf("Sum = %d\n", sum); 
 printf("Subtraction = %d\n", subtraction); 
 printf("Multiply = %d\n", mult);

 return 0;

}

Writing operations.c


### Run the Code

In [2]:
!gcc operations.c -o operations 

In [3]:
!./operations

array[0] = 1
array[1] = 2
array[2] = 3
array[3] = 4
array[4] = 5
array[5] = 6
array[6] = 7
array[7] = 8
array[8] = 9
array[9] = 10
array[10] = 11
array[11] = 12
Sum = 78
Subtraction = -78
Multiply = 479001600


### Master-Workers MPI Execution

In [16]:
%%writefile operations.c
#include <stdio.h> 
#include <mpi.h>
#define SIZE 12

int main (int argc, char **argv) {
    int i, sum = 0, subtraction = 0, mult = 1; 
    int array[SIZE];
    char operations[] = {'+','-','*'};
    char operationsRec;
    int numberOfProcessors, id, to, from, tag = 1000;
    int result, values;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numberOfProcessors);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Status status;

    switch(id){
        case 0:
            for(i=0;i<SIZE;i++){
                array[i] = i + 1;
                printf("array[%d] = %d\n", i, array[i]);
            }
            printf("\n");

            for(to = 1; to < numberOfProcessors; to++){
                MPI_Send(&array, SIZE, MPI_INT, to, tag, MPI_COMM_WORLD);
                MPI_Send(&operations[to-1], 1, MPI_CHAR, to, tag, MPI_COMM_WORLD);
            }

            for(to = 1; to < numberOfProcessors; to++){
                MPI_Recv(&result, 1, MPI_INT, to, tag, MPI_COMM_WORLD, &status);
                MPI_Recv(&operationsRec, 1, MPI_CHAR, to, tag, MPI_COMM_WORLD, &status);
                printf ("(%c) = %d\n", operationsRec, result);
            }

        break;

        default:
            MPI_Recv(&array, SIZE, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
            MPI_Recv(&operationsRec, 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);

            switch(operationsRec) {
                case '+':
                    values = 0;
                    for(i = 0; i < SIZE; i++)
                        values += array[i];
                    break;
                case '-':
                    values = 0;
                    for(i = 0; i < SIZE; i++)
                        values -= array[i];
                    break;
                case '*':
                    values = 1; /* multiplicative identity: starting at 0 would keep the product at 0 */
                    for(i = 0; i < SIZE; i++)
                        values = values * array[i];
                    break;
            }

            MPI_Send(&values, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
            MPI_Send(&operationsRec, 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD);

    }
    MPI_Finalize();
    return 0;

}

Overwriting operations.c


### Run the Code

In [17]:
!mpicc operations.c -o operations-mpi

In [18]:
!mpirun -np 4 ./operations-mpi

array[0] = 1
array[1] = 2
array[2] = 3
array[3] = 4
array[4] = 5
array[5] = 6
array[6] = 7
array[7] = 8
array[8] = 9
array[9] = 10
array[10] = 11
array[11] = 12

(+) = 78
(-) = -78
(*) = 479001600


### Results Analysis

Include the execution-time tables for both versions (10 runs each), report the speed-up value, and add any plots needed.

## `Algebraic Function`

The idea of this Hands-on is to write an algorithm that uses the
`MPI_Send` and `MPI_Recv` routines in the Master-Worker paradigm,
starting from the sequential code below:

### Sequential Execution

In [1]:
%%writefile function.c
#include <stdio.h>

int main (int argc, char **argv)
{
  double coef[4], total, x;   
  char c;

  printf ("\nf(x) = a*x^3 + b*x^2 + c*x + d\n");          
            
  coef[0] = 1;
  coef[1] = 2;  
  coef[2] = 3;
  coef[3] = 4;
  
  printf("\nf(x)=%lf*x^3+%lf*x^2+%lf*x+%lf\n", coef[0], coef[1], coef[2], coef[3]);

  x = 10;

  total = (coef[0]* x * x * x) + (coef[1]* x * x) + (coef[2]* x + coef[3]); 
    
  printf("\nf(%lf) = %lf*x^3 + %lf*x^2 + %lf*x + %lf = %lf\n", x, coef[0], coef[1], coef[2], coef[3], total);
    
  return 0;
    
}

Writing function.c


### Run the Code

In [2]:
!gcc function.c -o function

In [3]:
!./function


f(x) = a*x^3 + b*x^2 + c*x + d

f(x)=1.000000*x^3+2.000000*x^2+3.000000*x+4.000000

f(10.000000) = 1.000000*x^3 + 2.000000*x^2 + 3.000000*x + 4.000000 = 1234.000000


### Master-Workers Scheme

1. Master
   * Creates the processes
   * Shows the format of the function
   * Asks for the value of x
   * Sends the values of a, b, c and x to the workers
   * Shows the result of the function
2. Workers
   * Calculate the function and return the value to the master
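One way to split the polynomial across $3$ processes, and the split used by the MPI code in this section, is to give each worker one of the higher-order terms while the master keeps the linear part:

$$
f(x) = \underbrace{a\,x^3}_{\text{worker 1}} + \underbrace{b\,x^2}_{\text{worker 2}} + \underbrace{c\,x + d}_{\text{master}}
$$

With $a = 1$, $b = 2$, $c = 3$, $d = 4$ and $x = 10$, the partial results are $1000$, $200$ and $34$, and the master adds them to obtain $f(10) = 1234$.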

### Master-Workers MPI Execution

In [12]:
%%writefile function.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    double coef[4], total, x;
    double coef_rec, partial;
    int operations_rec, operations_order[] = {1,2};
    char c;

    int numberOfProcessors, id, to, from, tag = 1000;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, & numberOfProcessors);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Status status;

    switch(id){
        case 0:
            // Master
            printf ("\nf(x) = a*x^3 + b*x^2 + c*x + d\n");

            coef[0] = 1;
            coef[1] = 2;  
            coef[2] = 3;
            coef[3] = 4;
  
            printf("\nf(x)=%lf*x^3+%lf*x^2+%lf*x+%lf\n", coef[0], coef[1], coef[2], coef[3]);

            x = 10;

            for(to = 1; to < numberOfProcessors; to++){
                MPI_Send(&coef[to-1], 1, MPI_DOUBLE, to, tag, MPI_COMM_WORLD);
                MPI_Send(&x, 1, MPI_DOUBLE, to, tag, MPI_COMM_WORLD);
                MPI_Send(&operations_order[to-1], 1, MPI_INT, to, tag, MPI_COMM_WORLD);
            }

            total = (coef[2]* x + coef[3]);

            for(to = 1; to < numberOfProcessors; to++){
                MPI_Recv(&partial, 1, MPI_DOUBLE, to, tag, MPI_COMM_WORLD, &status);
                total += partial;
            }

            printf("\nf(%lf) = %lf*x^3 + %lf*x^2 + %lf*x + %lf = %lf\n", x, coef[0], coef[1], coef[2], coef[3], total);
            
            break;

        default:
            //Workers

            MPI_Recv(&coef_rec, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD, &status);
            MPI_Recv(&x, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD, &status);
            MPI_Recv(&operations_rec, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);

            switch(operations_rec){
                case 1:
                    partial = (coef_rec * x * x * x);
                    break;
                case 2:
                    partial = (coef_rec * x * x);
                    break;
            }

            MPI_Send(&partial, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD);
        
    }
    
    MPI_Finalize();
    return 0;
}

Overwriting function.c


### Run the Code

In [13]:
!mpicc function.c -o functions-mpi

In [15]:
!mpirun -np 3 ./functions-mpi


f(x) = a*x^3 + b*x^2 + c*x + d

f(x)=1.000000*x^3+2.000000*x^2+3.000000*x+4.000000

f(10.000000) = 1.000000*x^3 + 2.000000*x^2 + 3.000000*x + 4.000000 = 1234.000000


### Results Analysis

## `Tridiagonal Matrix`

### Sequential Execution

In [5]:
%%writefile tridiagonal.c
#include <stdio.h>
#define ORDER 4

void printMatrix (int m[][ORDER]) 
{
  int i, j;
  for(i = 0; i < ORDER; i++) {
    printf ("| ");
    for (j = 0; j < ORDER; j++) {
      printf ("%3d ", m[i][j]);
    }
    printf ("|\n");
  }
  printf ("\n");
}

int main (int argc, char **argv)
{
  int k[3] = {100, 200, 300};
  int matrix[ORDER][ORDER], i, j;

  for(i = 0; i < ORDER; i++) 
  {
    for(j = 0; j < ORDER; j++) 
    {
      if( i == j )
        matrix[i][j] = i + j +1;
      else if(i == (j + 1)) 
      {
        matrix[i][j] = i +  j + 1;
        matrix[j][i] = matrix[i][j];
      } else
           matrix[i][j] = 0;
     }
  }
  printMatrix(matrix);

  for(i = 0; i < ORDER; i++)
  {
    matrix[i][i] += k[0];          //main diagonal
    if(i < ORDER - 1)              //the off-diagonals have only ORDER - 1 entries
    {
      matrix[i + 1][i] += k[1];    //subdiagonal
      matrix[i][i + 1] += k[2];    //superdiagonal
    }
  }
  
   printMatrix(matrix);

  return 0;
}

Writing tridiagonal.c


### Run the Code

In [6]:
!gcc tridiagonal.c -o tridiagonal 

In [7]:
!./tridiagonal

|   1   2   0   0 |
|   2   3   4   0 |
|   0   4   5   6 |
|   0   0   6   7 |

| 101 302   0   0 |
| 202 103 304   0 |
|   0 204 105 306 |
|   0   0 206 107 |


### Master-Workers Scheme

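1. Master
   * Creates the matrix and prints it
   * Sends the matrix, one constant from `k` and an operation code to each worker
   * Adds `k[2]` to the superdiagonal itself
   * Receives the workers' matrices, merges them and prints the result
2. Workers
   * Add the received constant to the assigned diagonal (main diagonal or subdiagonal) and return the matrix to the master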

### Master-Workers MPI Execution

In [9]:
%%writefile tridiagonal.c
#include <stdio.h>
#include <mpi.h>
#define ORDER 4

void printMatrix (int m[][ORDER]) 
{
  int i, j;
  for(i = 0; i < ORDER; i++) {
    printf ("| ");
    for (j = 0; j < ORDER; j++) {
      printf ("%3d ", m[i][j]);
    }
    printf ("|\n");
  }
  printf ("\n");
}

int main(int argc, char **argv) {
    int k[3] = {100,200,300};
    int matrix[ORDER][ORDER], h, i, j;

    int numberOfProcessors, id, to, from, tag = 1000;
    int matrix_rec[ORDER][ORDER], k_rec,operations_rec, operations[] = {1,2};

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, & numberOfProcessors);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Status status;

    switch(id) {
      case 0: // MASTER
        // CREATE MATRIX
        for(i = 0; i < ORDER; i++) {
          for(j = 0; j < ORDER; j++) {
            if(i == j)
              matrix[i][j] = i + j + 1;
            else if (i == (j + 1)) {
              matrix[i][j] = i + j + 1;
              matrix[j][i] = matrix[i][j];
            } else
              matrix[i][j] = 0;
          }
        }

        printMatrix(matrix);

        // expects 3 processes: rank 1 updates the main diagonal, rank 2 the subdiagonal
        for(to = 1; to < numberOfProcessors; to++){
          MPI_Send(&matrix, (ORDER * ORDER), MPI_INT, to, tag, MPI_COMM_WORLD);
          MPI_Send(&k[to-1], 1, MPI_INT, to, tag, MPI_COMM_WORLD);
          MPI_Send(&operations[to-1], 1, MPI_INT, to, tag, MPI_COMM_WORLD);
        }

        // superdiagonal (ORDER - 1 entries), updated by the master
        for(i = 0; i < ORDER - 1; i++)
          matrix[i][i + 1] += k[2];

        for(h = 1; h < numberOfProcessors; h++){
          MPI_Recv(&matrix_rec, (ORDER * ORDER), MPI_INT, h, tag, MPI_COMM_WORLD, &status);
          // copy back only the diagonal that worker h updated
          for(i = 0; i < ORDER; i++){
            if(operations[h-1] == 1)
              matrix[i][i] = matrix_rec[i][i];
            else if(i < ORDER - 1)
              matrix[i + 1][i] = matrix_rec[i + 1][i];
          }
        }

        printMatrix(matrix);
        break;
      default: // WORKERS
        MPI_Recv(&matrix_rec, (ORDER * ORDER), MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
        MPI_Recv(&k_rec, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
        MPI_Recv(&operations_rec, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);

        switch(operations_rec){
          case 1: // main diagonal
            for(i = 0; i < ORDER; i++) {
              matrix_rec[i][i] += k_rec;
            }
            break;
          case 2: // subdiagonal (ORDER - 1 entries)
            for(i = 0; i < ORDER - 1; i++) {
              matrix_rec[i + 1][i] += k_rec;
            }
            break;
        }

        MPI_Send(&matrix_rec, (ORDER*ORDER), MPI_INT, 0, tag, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}

Overwriting tridiagonal.c


### Run the Code

In [10]:
!mpicc tridiagonal.c -o tridiagonal-mpi

In [13]:
!mpirun -np 3 ./tridiagonal-mpi

|   1   2   0   0 |
|   2   3   4   0 |
|   0   4   5   6 |
|   0   0   6   7 |

| 101 302   0   0 |
| 202 103 304   0 |
|   0 204 105 306 |
|   0   0 206 107 |



### Results Analysis
