Your Name:  Gregory Roberts

Submission Instruction:
* Instruction: make a copy of this CoLab file and share it with me (Yifeng.Zhu@maine.edu).
* Deadline: Midnight, Sunday, April 2


In [10]:
!pip install mpi4py



#Question 1: Point-to-point communication

Is there any problem in the following code? If so, how to fix it?

In [None]:
%%file q1.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank==1:
    data_send= "a"
    destination_process = 2
    source_process = 2

    comm.send(data_send, dest=destination_process)
    data_received = comm.recv(source=source_process)
    
    print ("sending data %s " %data_send + \
           "to process %d" %destination_process)
    print ("data received = %s" %data_received)

if rank==2:
    data_send= "b"
    destination_process = 1
    source_process = 1

    comm.send(data_send, dest=destination_process)
    data_received = comm.recv(source=source_process)
    
    print ("sending data %s :" %data_send + \
           "to process %d" %destination_process)
    print ("data received = %s" %data_received)


Writing q1.py


In [None]:
!mpirun --allow-run-as-root -n 4 python q1.py

sending data a to process 2
data received = b
sending data b :to process 1
data received = a


**Your answer:**  Changed the code so that the comm.send is before the comm.recv for both rank processes.


# Question 2: Fix the bug of the following code

The following code is to implement a parallel matrix vector product. However, there are a large different between the parallel implementation and the result produced by numpy.dot. What is wrong with the code and how to fix it?

In [8]:
%%matvec.py%%
import numpy as np
from mpi4py import MPI
import os

# Parallel matrix-vector product
def matvec(comm, A, x):
    size = comm.Get_size()
    print('size ', size)
    rank = comm.Get_rank()
    m = A.shape[0] // size # local rows
    # every process gets a part of the data
    y_part = np.dot(A[rank * m:(rank+1)*m, :], x) 
    # container for the result    
    y = np.zeros_like(x, dtype='double')     
    # collect results from the pool, write them to container y      
    comm.Allgather([y_part,  MPI.DOUBLE], [y, MPI.DOUBLE])    
    return y

n = 400
comm = MPI.COMM_WORLD


rank = comm.Get_rank()


# GR - initialize vector and matrix to empty values
x = np.empty(n, dtype='d')
A = np.empty([n,n], dtype='d')

# GR - generate random vector and matrix
if rank == 0:
  x = np.random.rand(n)     # Generate a vector     
  A = np.random.rand(n, n)  # Generate a nxn matrix

# GR - broadcast the vector and matrix defined in rank 0
comm.Bcast([x, MPI.DOUBLE], root=0)
comm.Bcast([A, MPI.DOUBLE], root=0)

y_mpi = matvec(comm, A, x) # y_mpi = A * x

if rank == 0: # check 
  y = np.dot(A, x)      

  # compare the local and MPI results   
  # The output should be a very small value 
  print("sum(y - y_mpi) = ", (y - y_mpi).sum()) 


1
size  1
sum(y - y_mpi) =  0.0


In [None]:
!mpirun --allow-run-as-root -n 4 python3 matvec.py

sum(y - y_mpi) =  8.526512829121202e-14


**Your answer:**  I tried a number of different processes. I tried to use numpy.matmul, @, *, etc. I tried to split the numpy.dot into separate processes, similar to what was being done in matvec.

I found that for each rank that the A and x arrays were loaded with new numbers for the matvec function. The y dot production would only run once when the rank is 0 (zero).

By using static values stored in files, and then read in. The values stay the same each time, and y_mpi is not loaded with new random values each time the process iterates through the rank. Which produces a smaller value between y and y_mpi.

