
In [126]:
!pip install mpi4py
!apt-get install -y openmpi-bin


Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
openmpi-bin is already the newest version (4.1.2-2ubuntu1).
0 upgraded, 0 newly installed, 0 to remove and 29 not upgraded.


In [127]:
!mpiexec --version

mpiexec (OpenRTE) 4.1.2

Report bugs to http://www.open-mpi.org/community/help/


In [128]:
%%writefile hello.py

from mpi4py import MPI

comm = MPI.COMM_WORLD
procs = comm.Get_size()
rank = comm.Get_rank()
if rank == 0:
  print(f"Number of processes: {procs}")  # Printed once, by rank 0 only.

print(f"Message from processes: {procs} Rank : {rank}")


Writing hello.py


In [129]:
!sudo mpirun  --allow-run-as-root --use-hwthread-cpus -np 2 python hello.py

Number of processes: 2
Message from processes: 2 Rank : 1
Message from processes: 2 Rank : 0


In [130]:
!lscpu | grep 'CPU(s)'


CPU(s):                               2
On-line CPU(s) list:                  0,1
NUMA node0 CPU(s):                    0,1


In [131]:
!echo "localhost slots=4" > my_hostfile


### Single send-recv

In [132]:
%%writefile send_recv.py

from mpi4py import MPI

comm = MPI.COMM_WORLD  # Get the communicator
rank = comm.Get_rank()  # Get the rank of the current process

if rank == 0:
    data = "Hello from Process 0"
    comm.send(data, dest=1)  # Send data to process 1
    print(f"Process {rank} sent: {data}")

elif rank == 1:
    received_data = comm.recv(source=0)  # Receive data from process 0
    print(f"Process {rank} received: {received_data}")


Overwriting send_recv.py


In [133]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python send_recv.py

Process 0 sent: Hello from Process 0
Process 1 received: Hello from Process 0


### Bidirectional send-receive

In [134]:
%%writefile send_recv_bidirectional.py
from mpi4py import MPI

comm = MPI.COMM_WORLD  # Get the communicator
rank = comm.Get_rank()  # Get the rank of the current process

if rank == 0:
    data_to_send = "Hello from Process 0"
    comm.send(data_to_send, dest=1)  # Send to Process 1
    received_data = comm.recv(source=1)  # Receive from Process 1
    print(f"Process {rank} sent: {data_to_send}")
    print(f"Process {rank} received: {received_data}")

elif rank == 1:
    data_to_send = "Hello from Process 1"
    received_data = comm.recv(source=0)  # Receive from Process 0
    comm.send(data_to_send, dest=0)  # Send to Process 0
    print(f"Process {rank} received: {received_data}")
    print(f"Process {rank} sent: {data_to_send}")


Overwriting send_recv_bidirectional.py


In [135]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python send_recv_bidirectional.py

Process 1 received: Hello from Process 0
Process 1 sent: Hello from Process 1
Process 0 sent: Hello from Process 0
Process 0 received: Hello from Process 1


In [136]:
%%writefile bidirec_square_numbers.py
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

num_iterations = 5  # Number of times we send/receive numbers

if rank == 0:  # Sender process
    for i in range(1, num_iterations + 1):
        print(f"Process {rank} sending: {i}")
        comm.send(i, dest=1)  # Send number to Process 1
        squared_value = comm.recv(source=1)  # Receive squared value
        print(f"Process {rank} received squared: {squared_value}")
        time.sleep(1)  # Just to simulate processing delay

elif rank == 1:  # Receiver process
    for _ in range(num_iterations):
        received_number = comm.recv(source=0)  # Receive number from Process 0
        squared_number = received_number ** 2  # Square the number
        comm.send(squared_number, dest=0)  # Send squared number back



Overwriting bidirec_square_numbers.py


In [137]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python bidirec_square_numbers.py

Process 0 sending: 1
Process 0 received squared: 1
Process 0 sending: 2
Process 0 received squared: 4
Process 0 sending: 3
Process 0 received squared: 9
Process 0 sending: 4
Process 0 received squared: 16
Process 0 sending: 5
Process 0 received squared: 25


### Process sync with shared memory

In [138]:
%%writefile bidirec_shared_memory.py

from mpi4py import MPI
import numpy as np
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

N = 5  # Number of iterations
array_size = 4  # Size of shared array

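# Create a shared memory window. Allocate_shared requires that all ranks in
# the communicator can share memory, which holds here because every rank
# runs on the same node.
win = MPI.Win.Allocate_shared(array_size * np.dtype('i').itemsize, np.dtype('i').itemsize, comm=comm)

# Every rank maps rank 0's segment, so all ranks view the same array
buf, itemsize = win.Shared_query(0)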
shared_array = np.ndarray(buffer=buf, dtype='i', shape=(array_size,))

# Create a flag in shared memory
flag_win = MPI.Win.Allocate_shared(np.dtype('i').itemsize, np.dtype('i').itemsize, comm=comm)
flag_buf, _ = flag_win.Shared_query(0)
flag = np.ndarray(buffer=flag_buf, dtype='i', shape=(1,))

if rank == 0:
    for iteration in range(N):  # Avoid shadowing the built-in iter()
        # Initialize array
        shared_array[:] = np.arange(1, array_size + 1) * (iteration + 1)
        print(f"Process 0: Initialized array: {shared_array}")

        # Signal process 1 to start processing
        flag[0] = 1
        flag_win.Sync()

        # Wait for process 1 to finish
        while flag[0] != 2:
            flag_win.Sync()
            time.sleep(0.01)

        # Sum the modified array
        total = np.sum(shared_array)
        print(f"Process 0: Sum after modification: {total}")

        # Reset flag
        flag[0] = 0
        flag_win.Sync()

elif rank == 1:
    for _ in range(N):
        # Wait for signal from process 0
        while flag[0] != 1:
            flag_win.Sync()
            time.sleep(0.01)

        # Multiply values by 2
        shared_array[:] *= 2
        print(f"Process 1: Modified array: {shared_array}")

        # Signal process 0 that modification is done
        flag[0] = 2
        flag_win.Sync()


Overwriting bidirec_shared_memory.py


In [139]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python bidirec_shared_memory.py

Process 0: Initialized array: [1 2 3 4]
Process 1: Modified array: [2 4 6 8]
Process 0: Sum after modification: 20
Process 0: Initialized array: [2 4 6 8]
Process 1: Modified array: [ 4  8 12 16]
Process 0: Sum after modification: 40
Process 0: Initialized array: [ 3  6  9 12]
Process 1: Modified array: [ 6 12 18 24]
Process 0: Sum after modification: 60
Process 0: Initialized array: [ 4  8 12 16]
Process 1: Modified array: [ 8 16 24 32]
Process 0: Sum after modification: 80
Process 0: Initialized array: [ 5 10 15 20]
Process 1: Modified array: [10 20 30 40]
Process 0: Sum after modification: 100


### isend and irecv with tagging
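
A receive only matches a message whose source and tag agree with what it asked for, which is how the script below keeps requests (`tag=11`) and replies (`tag=22`) apart. The following is a toy, pure-Python model of that matching rule, not mpi4py code; `ToyMailbox`, `deliver`, and its `recv` are hypothetical names introduced only for illustration.

```python
class ToyMailbox:
    """Toy model of one rank's incoming messages (not how MPI is implemented)."""

    def __init__(self):
        self._pending = []  # (source, tag, payload) tuples in arrival order

    def deliver(self, source, tag, payload):
        """Record an incoming message."""
        self._pending.append((source, tag, payload))

    def recv(self, source, tag):
        """Return the oldest pending payload whose source and tag both match."""
        for i, (src, t, payload) in enumerate(self._pending):
            if src == source and t == tag:
                self._pending.pop(i)
                return payload
        # A real MPI receive would block here; the toy model just raises.
        raise LookupError(f"no pending message from {source} with tag {tag}")


box = ToyMailbox()
box.deliver(source=1, tag=22, payload="reply")
box.deliver(source=1, tag=11, payload="request")

first = box.recv(source=1, tag=11)   # Skips the earlier tag-22 message
second = box.recv(source=1, tag=22)
print(first, second)  # request reply
```

Messages that share the same source and tag are returned in arrival order in this model.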

In [140]:
%%writefile non-blocking-bidir-sendrecv.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

n_iterations = 10  # Number of iterations

if rank == 0:
    for i in range(n_iterations):
        value = i
        print(f"Iteration {i+1}: Process {rank} sending {value} to Process 1")
        req = comm.isend(value, dest=1, tag=11)
        req.wait()

        # Receive the computed value from Process 1
        req = comm.irecv(source=1, tag=22)
        value = req.wait()
        print(f"Iteration {i+1}: Process {rank} received {value} from Process 1")

elif rank == 1:
    for i in range(n_iterations):
        # Receive value from Process 0
        req = comm.irecv(source=0, tag=11)
        value = req.wait()
        print(f"Iteration {i+1}: Process {rank} received {value} from Process 0")

        # Perform computation (e.g., square the value and multiply by 3.14)
        computed_value = 3.14*(value ** 2)
        print(f"Iteration {i+1}: Process {rank} computed {computed_value}")

        # Send computed value back to Process 0
        req = comm.isend(computed_value, dest=0, tag=22)
        req.wait()


Overwriting non-blocking-bidir-sendrecv.py


In [141]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python non-blocking-bidir-sendrecv.py

Iteration 1: Process 0 sending 0 to Process 1
Iteration 1: Process 1 received 0 from Process 0
Iteration 1: Process 1 computed 0.0
Iteration 1: Process 0 received 0.0 from Process 1
Iteration 2: Process 0 sending 1 to Process 1
Iteration 2: Process 1 received 1 from Process 0
Iteration 2: Process 1 computed 3.14
Iteration 2: Process 0 received 3.14 from Process 1
Iteration 3: Process 0 sending 2 to Process 1
Iteration 3: Process 1 received 2 from Process 0
Iteration 3: Process 1 computed 12.56
Iteration 3: Process 0 received 12.56 from Process 1
Iteration 4: Process 0 sending 3 to Process 1
Iteration 4: Process 1 received 3 from Process 0
Iteration 4: Process 1 computed 28.26
Iteration 4: Process 0 received 28.26 from Process 1
Iteration 5: Process 0 sending 4 to Process 1
Iteration 5: Process 1 received 4 from Process 0
Iteration 5: Process 1 computed 50.24
Iteration 5: Process 0 received 50.24 from Process 1
Iteration 6: Process 0 sending 5 to Process 1
Iteration 6: Process 1 receive

### Broadcasting

In [142]:
%%writefile non-blocking-broadcast-sendrecv.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

n_iterations = 10  # Number of iterations

for i in range(n_iterations):
    if rank == 0:
        value = i
        print(f"Iteration {i+1}: Process {rank} broadcasting {value}")
    else:
        value = None  # Other processes start with empty value

    # Rank 0 broadcasts the value to all ranks
    value = comm.bcast(value, root=0)

    if rank == 1:
        print(f"Iteration {i+1}: Process {rank} received {value} from Broadcast")

        # Perform computation (e.g., square the value and multiply by 3.14)
        computed_value = 3.14 * (value ** 2)
        print(f"Iteration {i+1}: Process {rank} computed {computed_value}")

        # Send computed value back to Process 0
        req = comm.isend(computed_value, dest=0, tag=22)
        req.wait()

    elif rank == 0:
        # Receive the computed value from Process 1
        req = comm.irecv(source=1, tag=22)
        result = req.wait()
        print(f"Iteration {i+1}: Process {rank} received {result} from Process 1")



Overwriting non-blocking-broadcast-sendrecv.py


In [143]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python non-blocking-broadcast-sendrecv.py

Iteration 1: Process 0 broadcasting 0
Iteration 1: Process 1 received 0 from Broadcast
Iteration 1: Process 1 computed 0.0
Iteration 1: Process 0 received 0.0 from Process 1
Iteration 2: Process 0 broadcasting 1
Iteration 2: Process 1 received 1 from Broadcast
Iteration 2: Process 1 computed 3.14
Iteration 2: Process 0 received 3.14 from Process 1
Iteration 3: Process 0 broadcasting 2
Iteration 3: Process 1 received 2 from Broadcast
Iteration 3: Process 1 computed 12.56
Iteration 3: Process 0 received 12.56 from Process 1
Iteration 4: Process 0 broadcasting 3
Iteration 4: Process 1 received 3 from Broadcast
Iteration 4: Process 1 computed 28.26
Iteration 4: Process 0 received 28.26 from Process 1
Iteration 5: Process 0 broadcasting 4
Iteration 5: Process 1 received 4 from Broadcast
Iteration 5: Process 1 computed 50.24
Iteration 5: Process 0 received 50.24 from Process 1
Iteration 6: Process 0 broadcasting 5
Iteration 6: Process 1 received 5 from Broadcast
Iteration 6: Process 1 comput

### Scatter an array between processes

In [144]:
%%writefile non-blocking-scatter-sendrecv.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()  # Get number of processes

# Root process prepares data
if rank == 0:
    array_to_scatter = np.arange(size * 25)  # size * 25 elements, so each rank gets exactly 25
    chunks = np.array_split(array_to_scatter, size)  # Split data into `size` parts
else:
    chunks = None

# Each process receives its chunk
recv_data = comm.scatter(chunks, root=0)

print(f"Process {rank} received: {recv_data}")



Overwriting non-blocking-scatter-sendrecv.py


In [145]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python non-blocking-scatter-sendrecv.py

Process 1 received: [25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49]
Process 2 received: [50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
 74]
Process 3 received: [75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98
 99]
Process 0 received: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24]


### Scatter unevenly sized data
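
`Scatterv` needs two numbers per rank: how many elements it receives (its count) and where its slice starts in the flat send buffer (its displacement). As a minimal sketch of that arithmetic, assuming the same splitting rule as `np.array_split` (the first `n % size` chunks get one extra element), the hypothetical helper `counts_and_displacements` below computes both in plain Python.

```python
def counts_and_displacements(n, size):
    """Split n items over `size` ranks; the first n % size ranks get one extra."""
    base, extra = divmod(n, size)
    counts = [base + 1 if r < extra else base for r in range(size)]
    # Each displacement is the total of all counts that come before it.
    displacements = [sum(counts[:r]) for r in range(size)]
    return counts, displacements


counts, displacements = counts_and_displacements(103, 4)
print(counts)         # [26, 26, 26, 25]
print(displacements)  # [0, 26, 52, 78]
```

Because the result depends only on `n` and `size`, each rank could in principle compute it locally rather than receive it through a broadcast.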

In [146]:
%%writefile uneven-scatter.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Root process prepares the data
if rank == 0:
    array_to_scatter = np.arange(103, dtype=np.int32)  # Explicit int32
    chunks = np.array_split(array_to_scatter, size)  # Nearly equal chunks, one per rank
    counts = np.array([len(chunk) for chunk in chunks], dtype=np.int32)
    displacements = np.insert(np.cumsum(counts), 0, 0)[:-1].astype(np.int32)
    flat_data = np.concatenate(chunks)
    print(f"Rank 0: Counts = {counts}, Displacements = {displacements}")
    print(f"Rank 0: Flat data = {flat_data}")
else:
    counts = None
    displacements = None
    flat_data = None

# Broadcast counts and displacements
counts = comm.bcast(counts, root=0)
displacements = comm.bcast(displacements, root=0)

# Allocate receive buffer with explicit type
recv_data = np.zeros(counts[rank], dtype=np.int32)

# Explicit send buffer
if rank == 0:
    sendbuf = [flat_data, counts, displacements, MPI.INT]
else:
    sendbuf = None

# Scatter the data
comm.Scatterv(sendbuf, recv_data, root=0)

print(f"Process {rank} expected {counts[rank]} elements, received: {recv_data}")

Overwriting uneven-scatter.py


In [147]:
!sudo mpirun  --allow-run-as-root --hostfile my_hostfile -np 4 python uneven-scatter.py

Rank 0: Counts = [26 26 26 25], Displacements = [ 0 26 52 78]
Rank 0: Flat data = [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
  90  91  92  93  94  95  96  97  98  99 100 101 102]
Process 0 expected 26 elements, received: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25]
Process 2 expected 26 elements, received: [52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
 76 77]
Process 1 expected 26 elements, received: [26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
 50 51]
Process 3 expected 25 elements, received: [ 78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95
  96  97  98  9