<a href="https://colab.research.google.com/github/jonclindaniel/LargeScaleComputing_A21/blob/main/in-class-activities/02_Midway_MPI/mpi4py_on_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

First, install mpi4py in the Colab notebook environment. Installed packages do not persist between sessions, so you will need to rerun this cell every time you start a new Colab session.

In [1]:
! pip install mpi4py

Collecting mpi4py
  Downloading mpi4py-3.1.1.tar.gz (2.4 MB)
     |████████████████████████████████| 2.4 MB 5.5 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: mpi4py
  Building wheel for mpi4py (PEP 517) ... done
  Created wheel for mpi4py: filename=mpi4py-3.1.1-cp37-cp37m-linux_x86_64.whl size=2180608 sha256=8d7dd7fabdf107a8fee20e0194d0358631fbe9dc18302d0d903ad5bbd9d18dc0
  Stored in directory: /root/.cache/pip/wheels/91/be/c0/2b0347be1de5cd8ca9fe67da7ec8c3fe8930fcb6b0df6f2255
Successfully built mpi4py
Installing collected packages: mpi4py
Successfully installed mpi4py-3.1.1


Then, we can use the `%%writefile` cell magic to write the contents of a cell to a Python mpi4py program file, which we can run below using `mpirun`. Because Colab notebooks run as the root user, you need to pass the `--allow-run-as-root` flag to `mpirun` in order for your code to run.

In [2]:
%%writefile hello_world.py 
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
name = MPI.Get_processor_name()

print("Hello, World! I am process %d of %d on %s." % (rank, size, name))

Writing hello_world.py


In [3]:
! mpirun --allow-run-as-root -n 4 python hello_world.py

Hello, World! I am process 3 of 4 on 861fbdc3fdd7.
Hello, World! I am process 1 of 4 on 861fbdc3fdd7.
Hello, World! I am process 2 of 4 on 861fbdc3fdd7.
Hello, World! I am process 0 of 4 on 861fbdc3fdd7.


In [None]:
! mpirun --allow-run-as-root --oversubscribe -n 4 python hello_world.py

Note that the program runs as 4 separate MPI processes (not threads), all on the same Colab virtual machine. `MPI.Get_processor_name()` typically returns the name of the host rather than of an individual CPU core, so the same name is listed for every process. Because Colab gives us only a small machine, we're unlikely to get much of a speed-up if we parallelize in this way, but it can be a nice interactive spot to debug our code before we run it on the Midway Cluster.
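
To see how many CPU cores the current session actually exposes before choosing a value for `-n`, a quick check along these lines can help. This is a minimal sketch using only the standard library; `available_cores` is a hypothetical helper name, not part of mpi4py, and the count it reports depends on the machine Colab happens to assign.

```python
import os

def available_cores():
    """Return the number of logical CPU cores visible to this Python process."""
    # os.cpu_count() can return None if the count cannot be determined,
    # so fall back to 1 in that case.
    return os.cpu_count() or 1

n_cores = available_cores()
print("Logical CPU cores available: %d" % n_cores)
```

If the number printed is smaller than the value passed to `-n`, the MPI processes have to share cores. With Open MPI, the `--oversubscribe` flag used in the cell above permits launching more processes than there are available slots.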

Below is the parallel random walk simulation from the `in-class-activities/02_Midway_MPI` directory on GitHub. After running the program yourself, you can view the plot it produces by clicking the file folder icon in the sidebar on the left-hand side of this screen and opening the `r_walk*.png` image file.

In [4]:
%%writefile mpi_rand_walk.py 
from mpi4py import MPI
import matplotlib.pyplot as plt
import numpy as np
import time

def sim_rand_walks_parallel(n_runs):
    # Get rank of process and overall size of communicator:
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Start time:
    t0 = time.time()

    # Evenly distribute number of simulation runs across processes
    # (integer division: any remainder is dropped if n_runs % size != 0)
    N = n_runs // size

    # Simulate N random walks of 100 steps each, then stack them into a NumPy array
    r_walks = []
    for i in range(N):
        steps = np.random.normal(loc=0, scale=1, size=100)
        steps[0] = 0
        r_walks.append(100 + np.cumsum(steps))
    r_walks_array = np.array(r_walks)

    # Gather all simulation arrays to buffer of expected size/dtype on rank 0
    r_walks_all = None
    if rank == 0:
        r_walks_all = np.empty([N * size, 100], dtype='float')
    comm.Gather(sendbuf=r_walks_array, recvbuf=r_walks_all, root=0)

    # Print/plot simulation results on rank 0
    if rank == 0:
        # Calculate time elapsed after computing mean and std
        average_finish = np.mean(r_walks_all[:,-1])
        std_finish = np.std(r_walks_all[:,-1])
        time_elapsed = time.time() - t0

        # Print time elapsed + simulation results
        print("Simulated %d Random Walks in: %f seconds on %d MPI processes"
                % (n_runs, time_elapsed, size))
        print("Average final position: %f, Standard Deviation: %f"
                % (average_finish, std_finish))

        # Plot Simulations and save to file
        plt.plot(r_walks_all.transpose())
        plt.savefig("r_walk_nprocs%d_nruns%d.png" % (size, n_runs))

    return

def main():
    sim_rand_walks_parallel(n_runs=10000)

if __name__ == '__main__':
    main()

Writing mpi_rand_walk.py
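
The script above gives every process the same number of runs, so it covers all of `n_runs` only when `n_runs` is a multiple of the number of processes. One way to account for every run is sketched below in plain Python; `runs_per_rank` is a hypothetical helper, not part of the original script. With uneven counts, the fixed-size `comm.Gather` call would no longer apply as written (mpi4py's `Gatherv` handles variable-sized contributions).

```python
def runs_per_rank(n_runs, size):
    """Split n_runs across `size` processes.

    The first (n_runs % size) ranks each receive one extra run,
    so the counts always sum to n_runs.
    """
    base, extra = divmod(n_runs, size)
    return [base + 1 if rank < extra else base for rank in range(size)]

counts = runs_per_rank(10000, 4)
print(counts)   # [2500, 2500, 2500, 2500]

uneven = runs_per_rank(10, 4)
print(uneven)   # [3, 3, 2, 2]
```

For the settings used in this notebook (10000 runs on 4 processes) the split is exact, which is why the simpler even division works here.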


In [5]:
! mpirun --allow-run-as-root -n 4 python mpi_rand_walk.py

Simulated 10000 Random Walks in: 0.198135 seconds on 4 MPI processes
Average final position: 100.111029, Standard Deviation: 9.894592
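
As a sanity check on the output above: each walk starts at 100, its first step is set to 0, and the remaining 99 steps are independent standard normal draws. The final position should therefore have mean 100 and standard deviation sqrt(99). The sketch below computes these expected values with the standard library only; the variable names are illustrative and not taken from the script.

```python
import math

n_steps = 100            # length of each walk in the script
n_random = n_steps - 1   # the first step is forced to 0, leaving 99 random steps
n_runs = 10000           # number of simulated walks

expected_mean = 100.0                                  # starting position; steps have mean 0
expected_std = math.sqrt(n_random)                     # variances of independent steps add
std_error_of_mean = expected_std / math.sqrt(n_runs)   # uncertainty of the sample mean

print("Expected final position: %.1f" % expected_mean)
print("Expected standard deviation: %.4f" % expected_std)       # 9.9499
print("Standard error of the mean: %.4f" % std_error_of_mean)   # 0.0995
```

The reported standard deviation of 9.894592 is close to the expected value, and the reported mean of 100.111029 lies roughly one standard error from 100, so the gathered results look consistent with the simulation's design.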
