## Exercise 1 Hello World

1. Write an MPI program displaying the number of processes used for the execution and the rank of each process.
2. Test the program with different numbers of processes.

**Output Example**
```shell
Hello from the rank 2 process
Hello from the rank 0 process
Hello from the rank 3 process
Hello from the rank 1 process
Parallel execution of hello_world with 4 process
```
*Note that the output order may be different.*

In [1]:
%%file hello.py
from mpi4py import MPI
COMM = MPI.COMM_WORLD
SIZE = COMM.Get_size()
RANK = COMM.Get_rank()
print("Hello from the rank ", RANK," process", "\n")
# Only rank 0 reports the total, so the line is also printed when running with a single process
if RANK == 0:
    print("Parallel execution of hello_world with ",SIZE," process")

Overwriting hello.py


In [2]:
# command to run the program
!mpirun -n 2 python3 hello.py 

Hello from the rank  0  process 

Hello from the rank  1  process 

Parallel execution of hello_world with  2  process


## Exercise 2 Sharing Data 

A common need is for one process to get data from the user, either by reading from the terminal or from command line arguments, and then to distribute this information to all other processes.

Write a program that reads an integer value from the terminal and distributes the value to all of the MPI processes. Each process should print out its rank and the value it received. Values should be read until a negative integer is given as input.

You may want to use these MPI routines in your solution:
`Get_rank` `Bcast` 

**Output Example**
```shell
10
Process 0 got 10
Process 1 got 10
```

In [3]:
%%file sharing.py
from mpi4py import MPI
COMM = MPI.COMM_WORLD
SIZE = COMM.Get_size()
RANK = COMM.Get_rank()

while True:
    if RANK == 0:
        # Only process 0 reads from the terminal
        send = int(input('Enter a number : '))
        print(send)
    else:
        send = None

    # Every process calls bcast; all of them get the value held by process 0
    recv = COMM.bcast(send, root=0)
    # A negative value ends the loop on every process
    if recv < 0:
        break
    print("Process ",RANK, " got ", recv)


Overwriting sharing.py


In [4]:
# command to run the program
!mpirun -n 2 python3 sharing.py 

Enter a number : ^C


In [5]:
send = int(input("Enter the number to share :"))
send

Enter the number to share :3


3

## Exercise 3 Sending in a ring (broadcast by ring)

Write a program that takes data from process zero and sends it to all of the other processes by sending it in a ring. That is, process i should receive the data and send it to process i+1, until the last process is reached.
Assume that the data consists of a single integer. Process zero reads the data from the user.
![](../data/ring.gif)

You may want to use these MPI routines in your solution:
`Send` `Recv` 
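
The ring pattern described above can be checked without MPI before writing the real program. The sketch below is only an illustration: `ring_neighbours` and `simulate_ring` are hypothetical helper names, and a plain list assignment stands in for the `send`/`recv` pair. It shows which partner each rank talks to, and that rank 0 has no source while the last rank has no destination.

```python
def ring_neighbours(rank, size):
    """Return (source, dest) for a one-way ring pass that starts at rank 0.

    None means "no partner": rank 0 has nobody to receive from and the
    last rank has nobody to send to.
    """
    source = rank - 1 if rank > 0 else None
    dest = rank + 1 if rank < size - 1 else None
    return source, dest


def simulate_ring(value, size):
    """Pass `value` along the ring without MPI; return what each rank holds."""
    received = [None] * size
    received[0] = value  # rank 0 owns the data
    for rank in range(size):
        _, dest = ring_neighbours(rank, size)
        if dest is not None:
            received[dest] = received[rank]  # stands in for send/recv
    return received


print(simulate_ring(10, 4))
```

In an MPI version, each rank would compute its own `source` and `dest` from its rank and the communicator size, rather than looping over all ranks.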

In [18]:
%%file sending.py
from mpi4py import MPI
COMM = MPI.COMM_WORLD
SIZE = COMM.Get_size()
RANK = COMM.Get_rank()

tag = 8
if RANK == 0:
    # Process 0 owns the data (hard-coded here in place of user input)
    recv = 10
else:
    # Every other process waits for the value from its left neighbour
    recv = COMM.recv(source=RANK-1, tag=tag)

# Forward the value to the right neighbour, except on the last process
if RANK < SIZE-1:
    COMM.send(recv, dest=RANK+1, tag=tag)

print("Process",RANK, " got ", recv)

Overwriting sending.py


In [17]:
# command to run the program
!mpirun -n 2 python3 sending.py 

^C


## Exercise 4 Matrix vector product

1. Use the `MatrixVectorMult.py` file to implement the MPI version of matrix vector multiplication.
2. Process 0 compares the result with the `dot` product.
3. Plot the scalability of your implementation. 
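
Steps 1 and 2 can be prototyped with NumPy alone before adding MPI. The sketch below is an assumption-laden illustration, not the reference solution: `block_row_matvec` is a hypothetical helper, each row block plays the role of one process, and the maximum absolute difference is just one possible choice of error measure.

```python
import numpy as np


def block_row_matvec(A, b, nb_blocks):
    """Multiply A by b one row block at a time, as each MPI rank would."""
    # np.array_split also handles a row count that is not divisible by nb_blocks.
    blocks = np.array_split(A, nb_blocks, axis=0)
    local_results = [block @ b for block in blocks]  # one entry per simulated rank
    return np.concatenate(local_results)  # stands in for the gather on process 0


rng = np.random.default_rng(42)
A = rng.random((8, 8))
b = rng.random(8)

x_blocks = block_row_matvec(A, b, nb_blocks=2)
# Compare with the dot product, here using the maximum absolute difference.
error = np.max(np.abs(x_blocks - A.dot(b)))
print("The error comparing to the dot product is :", error)
```

Because every row of the result depends only on the matching row of `A` and on the whole of `b`, the blocks can be computed independently, which is why `b` is broadcast and `A` is scattered by rows.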

**Output Example**
```shell
CPU time of parallel multiplication using 2 processes is  174.923446
The error comparing to the dot product is : 1.4210854715202004e-14
```
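
For the scalability plot, timings such as the one printed above are usually converted into speedup and efficiency. The helper below is a minimal sketch with a hypothetical name, `speedup_and_efficiency`; it assumes you have measured the elapsed time yourself for several process counts and stored the values in a dictionary.

```python
def speedup_and_efficiency(timings):
    """Map {nb_processes: elapsed_time} to {nb_processes: (speedup, efficiency)}.

    The run with the fewest processes is used as the reference, so include a
    1-process run to get the usual definitions.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    results = {}
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Efficiency is the speedup per process relative to the reference run.
        efficiency = speedup * base_procs / procs
        results[procs] = (speedup, efficiency)
    return results
```

Plotting the speedup against the number of processes, next to the ideal line where speedup equals the process count, gives the scalability curve asked for in step 3.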

In [64]:
%%file scatter.py
from mpi4py import MPI
import numpy as np
from scipy.sparse import lil_matrix
from numpy.random import rand, seed
from numba import njit

COMM = MPI.COMM_WORLD
nb_proc = COMM.Get_size()
RANK = COMM.Get_rank()

SIZE = 1000
if RANK == 0 :
    A = lil_matrix((SIZE, SIZE))
    A[0, :100] = rand(100)
    A[1, 100:200] = A[0, :100]
    A.setdiag(rand(SIZE))
    A = A.toarray()
else :
    A = None

# Each process receives an equal block of rows (assumes SIZE is divisible by nb_proc)
Local_size = SIZE // nb_proc
LocalMatrix = np.zeros((Local_size, SIZE))
# Scatter the matrix A

COMM.Scatter(A,LocalMatrix,root=0)

#assert recvbuf == (RANK+1)**2
print("I, process ",RANK, " I received data ",LocalMatrix, " from the process", 0)

X=np.zeros((SIZE,SIZE))
COMM.Gather(LocalMatrix,X ,root=0)
print(X)
print('shape : ', X.shape)

Overwriting scatter.py


In [132]:
A

array([[0.8160129 , 0.63788461, 0.52494404, ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.98211563, 0.        , ..., 0.        , 0.        ,
        0.        ],
       [0.        , 0.        , 0.22995861, ..., 0.        , 0.        ,
        0.        ],
       ...,
       [0.        , 0.        , 0.        , ..., 0.09043132, 0.        ,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.08127379,
        0.        ],
       [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
        0.78361308]])

In [41]:
rand(10)

array([0.77356369, 0.51917503, 0.07834313, 0.26339076, 0.79805991,
       0.97991976, 0.54519487, 0.69086101, 0.00357394, 0.5188742 ])

In [68]:
%%file MatrixVectorMult_V0.py
import numpy as np
from scipy.sparse import lil_matrix
from numpy.random import rand, seed
from numba import njit
from mpi4py import MPI
# write your program here
''' This program computes a parallel matrix-vector multiplication using MPI '''

COMM = MPI.COMM_WORLD
nbOfproc = COMM.Get_size()
RANK = COMM.Get_rank()

seed(42)

def matrixVectorMult(A, b, x):
    
    row, col = A.shape
    for i in range(row):
        a = A[i]
        for j in range(col):
            x[i] += a[j] * b[j]

    return 0

########################initialize matrix A and vector b ######################
#matrix sizes
SIZE = 1000
# Number of rows handled by each process (assumes SIZE is divisible by nbOfproc)
Local_size = SIZE // nbOfproc

# counts = block of each proc
#counts = 

if RANK == 0:
    A = lil_matrix((SIZE, SIZE))
    A[0, :100] = rand(100)
    A[1, 100:200] = A[0, :100]
    A.setdiag(rand(SIZE))
    A = A.toarray()
    b = rand(SIZE)
else :
    A = None
    b = None
    
    
######### Send b to all procs and scatter A (each proc has its own local matrix) #####
LocalMatrix = np.zeros((Local_size ,SIZE))
# Scatter the matrix A

b = COMM.bcast(b, root=0)

COMM.Scatter(A,LocalMatrix,root=0)

##################### Compute A*b locally #######################################
LocalX = np.zeros(Local_size)
#LocalMatrix = A
start = MPI.Wtime()
matrixVectorMult(LocalMatrix, b, LocalX)
stop = MPI.Wtime()
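if RANK == 0:
    # MPI.Wtime() measures wall-clock time; the value printed is in milliseconds
    print("CPU time of parallel multiplication using", nbOfproc, "processes is ", (stop - start)*1000)

################## Gather the results ###########################################
# sendcounts = local size of result
#sendcounts = 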
if RANK == 0: 
    X = np.zeros(SIZE)
else :
    X = None

#Gather the result into X
sendbuf = LocalX
COMM.Gather(sendbuf,X,root=0)

################## Print the results ###########################################

if RANK == 0 :
    X_ = A.dot(b)
    print("The result of A*b using dot is :", X_)
    print("\n")
    print("\n")
    print("The result of A*b using parallel version is :", X)

Overwriting MatrixVectorMult_V0.py


In [87]:
# command to run the program
!mpirun -n 2 python3 MatrixVectorMult_V0.py 

## Exercise 5 Calculation of π (Monte Carlo)

1. Use the `PiMonteCarlo.py` file to implement the calculation of π using the Monte Carlo method.
2. Process 0 prints the result.
3. Plot the scalability of your implementation. 
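
The Monte Carlo estimate relies on the area ratio between the unit circle and the square [-1, 1] x [-1, 1], which is π/4, so π ≈ 4 × (points inside the circle) / (total points). The sketch below simulates the parallel split without MPI. It is only an illustration under stated assumptions: `count_inside` and `split_points` are hypothetical helpers, the list of per-rank counts stands in for a reduction onto process 0, and giving each simulated rank its own seed is one simple way to avoid every rank drawing the same points.

```python
import random


def count_inside(nb_points, seed_value):
    """Count uniform random points of [-1, 1] x [-1, 1] inside the unit circle."""
    rng = random.Random(seed_value)  # independent generator per simulated rank
    inside = 0
    for _ in range(nb_points):
        x = rng.uniform(-1, 1)
        y = rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            inside += 1
    return inside


def split_points(total_points, nb_procs):
    """Share total_points between ranks; the first ranks take the remainder."""
    base, extra = divmod(total_points, nb_procs)
    return [base + (1 if rank < extra else 0) for rank in range(nb_procs)]


total_points = 200_000
nb_procs = 4  # simulated process count
shares = split_points(total_points, nb_procs)
counts = [count_inside(n, 42 + rank) for rank, n in enumerate(shares)]
# Summing the per-rank counts plays the role of the reduction on process 0.
pi_estimate = 4 * sum(counts) / total_points
print("Estimate of pi:", pi_estimate)
```

If every rank used the same seed and the full number of points, each one would repeat identical work and adding processes would bring no speedup.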

In [48]:
%%file PiMonteCarlo_V0.py
# write your program here
import random
import timeit
from mpi4py import MPI
''' This program estimates pi in parallel with the Monte Carlo method using MPI '''

COMM = MPI.COMM_WORLD
nbOfproc = COMM.Get_size()
RANK = COMM.Get_rank()

INTERVAL= 1000

def compute_points(nb_points, seed_value):
    # Count how many of nb_points random points fall inside the unit circle.
    # Each process gets its own seed so that the processes do not all
    # sample the same points.
    random.seed(seed_value)

    circle_points= 0

    for i in range(nb_points):

        # Randomly generated x and y values from a
        # uniform distribution
        # Range of x and y values is -1 to 1
        rand_x= random.uniform(-1, 1)
        rand_y= random.uniform(-1, 1)

        # Squared distance of (x, y) from the origin
        origin_dist= rand_x**2 + rand_y**2

        # Checking if (x, y) lies inside the circle
        if origin_dist<= 1:
            circle_points+= 1

    return circle_points

# Split the INTERVAL**2 points between the processes; the first
# (INTERVAL**2 % nbOfproc) processes take one extra point
total_points = INTERVAL**2
local_points = total_points // nbOfproc
if RANK < total_points % nbOfproc:
    local_points += 1

start = timeit.default_timer()
local_circle_points = compute_points(local_points, 42 + RANK)
# Sum the local counts of all processes on process 0
circle_points = COMM.reduce(local_circle_points, op=MPI.SUM, root=0)
end = timeit.default_timer()

if RANK==0:
    # Estimating value of pi,
    # pi= 4*(no. of points generated inside the
    # circle)/ (no. of points generated inside the square)
    pi = 4* circle_points/ total_points
    print('Process 0 \n')
    print("Circle points number :",circle_points)
    print("Final Estimation of Pi=", pi, "cpu time :",end-start) 

Writing PiMonteCarlo_V0.py


In [None]:
# command to run the program
!mpirun -n 2 python3 PiMonteCarlo_V0.py