# Par-Session-1 Solution: Introduction to MPI (Message Passing Interface)

This notebook provides complete solutions for the Par-session-1 exercises.
It demonstrates parallel programming concepts using mpi4py.

## 1. Setup and Library Imports

In [4]:
import numpy as np
import sys
import os

# Note: mpi4py requires an MPI implementation (e.g. Open MPI or MPICH) installed on the system.
# The MPI programs below are stored as strings and printed, not executed inside the notebook.

print("Parallel Computing Environment Setup")
print(f"Python version: {sys.version}")
print(f"NumPy version: {np.__version__}")

Parallel Computing Environment Setup
Python version: 3.12.12 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 20:07:49) [Clang 20.1.8 ]
NumPy version: 2.4.0


## 2. MPI Fundamentals

In [5]:
# SOLUTION: Basic MPI concepts
print("\n=== MPI FUNDAMENTALS ===")
print("""
MPI (Message Passing Interface) is a standardized, portable message-passing API specification for parallel computing.

Key Concepts:
1. Communicator: A group of processes that can communicate
2. Rank: Unique identifier for each process (0 to size-1)
3. Size: Total number of processes
4. Point-to-Point Communication: Send/Receive between two processes
5. Collective Communication: Operations involving all processes

Common MPI Operations:
- Send/Recv: Point-to-point communication
- Bcast: Copy the same data from one process to all processes
- Scatter: Split data on one process into chunks and send one chunk to each process
- Gather: Collect data from all processes to one
- Reduce: Combine data from all processes
""")


=== MPI FUNDAMENTALS ===

MPI (Message Passing Interface) is a standardized, portable message-passing API specification for parallel computing.

Key Concepts:
1. Communicator: A group of processes that can communicate
2. Rank: Unique identifier for each process (0 to size-1)
3. Size: Total number of processes
4. Point-to-Point Communication: Send/Receive between two processes
5. Collective Communication: Operations involving all processes

Common MPI Operations:
- Send/Recv: Point-to-point communication
- Bcast: Copy the same data from one process to all processes
- Scatter: Split data on one process into chunks and send one chunk to each process
- Gather: Collect data from all processes to one
- Reduce: Combine data from all processes
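
The collectives listed above differ mainly in where the data ends up. The following sketch models that data movement in plain Python, with no MPI involved: list index `i` stands in for rank `i`, and the helper names (`bcast`, `scatter`, `gather`, `reduce_to_root`) and the constant `SIZE` are illustrative assumptions, not mpi4py calls.

```python
from functools import reduce
import operator

SIZE = 4  # hypothetical number of processes; index i stands in for rank i


def bcast(root_value, size):
    # After Bcast, every rank holds a copy of the root's value.
    return [root_value] * size


def scatter(root_data, size):
    # After Scatter, rank i holds the i-th equal-sized chunk of the root's data.
    chunk = len(root_data) // size
    return [root_data[i * chunk:(i + 1) * chunk] for i in range(size)]


def gather(per_rank):
    # After Gather, the root holds every rank's contribution, ordered by rank.
    return [item for part in per_rank for item in part]


def reduce_to_root(per_rank, op=operator.add):
    # After Reduce, the root holds all contributions combined with op.
    return reduce(op, per_rank)


data = list(range(8))
chunks = scatter(data, SIZE)            # one chunk per simulated rank
local_sums = [sum(c) for c in chunks]   # each rank works on its own chunk
total = reduce_to_root(local_sums)      # combined result on the root

print(chunks)      # [[0, 1], [2, 3], [4, 5], [6, 7]]
print(local_sums)  # [1, 5, 9, 13]
print(total)       # 28
```

In a real program each rank would hold only its own entry of these lists; the model simply places them side by side so the effect of each operation is visible.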



## 3. MPI4Py Examples

In [6]:
# SOLUTION: Example MPI program structure
mpi_hello_world = """
from mpi4py import MPI

# Get communicator
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

print(f"Hello from process {rank} of {size}")
"""

print("\n=== HELLO WORLD MPI PROGRAM ===")
print(mpi_hello_world)
print("\nTo run: mpirun -np 4 python hello_world.py")


=== HELLO WORLD MPI PROGRAM ===

from mpi4py import MPI

# Get communicator
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

print(f"Hello from process {rank} of {size}")


To run: mpirun -np 4 python hello_world.py


In [7]:
# SOLUTION: Point-to-point communication example
point_to_point = """
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Process 0 sends data (launch with at least 2 processes so rank 1 exists)
    data = np.array([1, 2, 3, 4, 5])
    comm.Send(data, dest=1)
    print(f"Process {rank} sent data: {data}")
elif rank == 1:
    # Process 1 receives data
    data = np.empty(5, dtype=int)
    comm.Recv(data, source=0)
    print(f"Process {rank} received data: {data}")
"""

print("\n=== POINT-TO-POINT COMMUNICATION ===")
print(point_to_point)


=== POINT-TO-POINT COMMUNICATION ===

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Process 0 sends data (launch with at least 2 processes so rank 1 exists)
    data = np.array([1, 2, 3, 4, 5])
    comm.Send(data, dest=1)
    print(f"Process {rank} sent data: {data}")
elif rank == 1:
    # Process 1 receives data
    data = np.empty(5, dtype=int)
    comm.Recv(data, source=0)
    print(f"Process {rank} received data: {data}")



In [8]:
# SOLUTION: Collective communication example
collective_comm = """
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Broadcast: Send from rank 0 to all
if rank == 0:
    data = np.array([1, 2, 3, 4, 5])
else:
    data = np.empty(5, dtype=int)

comm.Bcast(data, root=0)
print(f"Process {rank} received: {data}")

# Gather: Collect from all to rank 0
local_data = np.array([rank, rank+1, rank+2])
if rank == 0:
    gathered = np.empty((size, 3), dtype=int)
else:
    gathered = None

comm.Gather(local_data, gathered, root=0)
if rank == 0:
    print(f"Gathered data:\\n{gathered}")
"""

print("\n=== COLLECTIVE COMMUNICATION ===")
print(collective_comm)


=== COLLECTIVE COMMUNICATION ===

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Broadcast: Send from rank 0 to all
if rank == 0:
    data = np.array([1, 2, 3, 4, 5])
else:
    data = np.empty(5, dtype=int)

comm.Bcast(data, root=0)
print(f"Process {rank} received: {data}")

# Gather: Collect from all to rank 0
local_data = np.array([rank, rank+1, rank+2])
if rank == 0:
    gathered = np.empty((size, 3), dtype=int)
else:
    gathered = None

comm.Gather(local_data, gathered, root=0)
if rank == 0:
    print(f"Gathered data:\n{gathered}")



## 4. Parallel Computing Patterns

In [9]:
# SOLUTION: Master-Worker pattern
master_worker = """
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Master process
    print("Master process starting...")
    for i in range(1, size):
        # Send work to worker processes
        work = np.array([i*10, i*20, i*30])
        comm.Send(work, dest=i)
        print(f"Sent work to process {i}")
    
    # Receive results
    for i in range(1, size):
        result = np.empty(3, dtype=int)
        comm.Recv(result, source=i)
        print(f"Received result from process {i}: {result}")
else:
    # Worker process
    work = np.empty(3, dtype=int)
    comm.Recv(work, source=0)
    print(f"Process {rank} received work: {work}")
    
    # Process work
    result = work * 2
    comm.Send(result, dest=0)
"""

print("\n=== MASTER-WORKER PATTERN ===")
print(master_worker)


=== MASTER-WORKER PATTERN ===

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Master process
    print("Master process starting...")
    for i in range(1, size):
        # Send work to worker processes
        work = np.array([i*10, i*20, i*30])
        comm.Send(work, dest=i)
        print(f"Sent work to process {i}")

    # Receive results
    for i in range(1, size):
        result = np.empty(3, dtype=int)
        comm.Recv(result, source=i)
        print(f"Received result from process {i}: {result}")
else:
    # Worker process
    work = np.empty(3, dtype=int)
    comm.Recv(work, source=0)
    print(f"Process {rank} received work: {work}")

    # Process work
    result = work * 2
    comm.Send(result, dest=0)
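
The master-worker program above hands each worker exactly one task, decided up front. When there are more tasks than workers and their costs vary, the master can instead give the next task to whichever worker finishes first. The sketch below simulates both policies without MPI to compare total completion time; the function names and the task durations are hypothetical.

```python
import heapq


def static_makespan(task_costs, n_workers):
    # Round-robin: task i goes to worker i % n_workers, decided before any work starts.
    loads = [0.0] * n_workers
    for i, cost in enumerate(task_costs):
        loads[i % n_workers] += cost
    return max(loads)


def dynamic_makespan(task_costs, n_workers):
    # The master hands the next task to whichever worker becomes idle first.
    free_at = [0.0] * n_workers  # time at which each worker is next idle
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)       # earliest-idle worker
        heapq.heappush(free_at, start + cost)
    return max(free_at)


costs = [8, 1, 1, 1, 8, 1, 1, 1]  # hypothetical task durations
print(static_makespan(costs, 4))   # 16.0
print(dynamic_makespan(costs, 4))  # 9.0
```

With the static assignment, both expensive tasks land on the same worker while the others sit idle. The dynamic policy spreads them out, at the cost of one extra message exchange per task in a real MPI program.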



In [10]:
# SOLUTION: Data parallelism pattern
data_parallelism = """
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Create data on rank 0
if rank == 0:
    data = np.arange(100)
else:
    data = None

# Scatter equal-sized chunks to all processes.
# Scatter needs identical chunk sizes, so this assumes size divides 100 evenly.
local_data = np.empty(100 // size, dtype=int)
comm.Scatter(data, local_data, root=0)

# Each process computes on its portion
local_result = local_data ** 2

# Gather results back to rank 0
if rank == 0:
    result = np.empty(100, dtype=int)
else:
    result = None

comm.Gather(local_result, result, root=0)

if rank == 0:
    print(f"Final result: {result}")
"""

print("\n=== DATA PARALLELISM PATTERN ===")
print(data_parallelism)


=== DATA PARALLELISM PATTERN ===

from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Create data on rank 0
if rank == 0:
    data = np.arange(100)
else:
    data = None

# Scatter equal-sized chunks to all processes.
# Scatter needs identical chunk sizes, so this assumes size divides 100 evenly.
local_data = np.empty(100 // size, dtype=int)
comm.Scatter(data, local_data, root=0)

# Each process computes on its portion
local_result = local_data ** 2

# Gather results back to rank 0
if rank == 0:
    result = np.empty(100, dtype=int)
else:
    result = None

comm.Gather(local_result, result, root=0)

if rank == 0:
    print(f"Final result: {result}")
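
The program above only works when the number of processes divides the array length, because `100 // size` silently drops the remainder. For uneven splits, mpi4py provides `Scatterv` and `Gatherv`, which take per-rank counts and displacements. A minimal sketch of how those two lists could be computed follows; the helper name `block_counts` is hypothetical and no MPI call is made.

```python
def block_counts(n_items, size):
    """Split n_items over size ranks as evenly as possible.

    Returns (counts, displs): counts[r] is the number of items for rank r,
    displs[r] is the offset of rank r's first item in the full array.
    """
    base, extra = divmod(n_items, size)
    # The first `extra` ranks take one additional item each.
    counts = [base + 1 if r < extra else base for r in range(size)]
    displs = [sum(counts[:r]) for r in range(size)]
    return counts, displs


counts, displs = block_counts(100, 3)
print(counts)  # [34, 33, 33]
print(displs)  # [0, 34, 67]
```

Each rank would then allocate a receive buffer of length `counts[rank]` instead of `100 // size`.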



## 5. Performance Considerations

In [11]:
# SOLUTION: Performance analysis
print("\n=== PERFORMANCE CONSIDERATIONS ===")
print("""
1. Communication Overhead:
   - Network latency and bandwidth
   - Message size affects performance
   - Collective operations can be optimized

2. Load Balancing:
   - Distribute work evenly across processes
   - Avoid idle processes
   - Consider dynamic load balancing

3. Scalability:
   - Strong scaling: Fixed problem, increase processes
   - Weak scaling: Increase problem and processes proportionally
   - Amdahl's Law: Speedup limited by serial portion

4. Optimization Strategies:
   - Minimize communication
   - Use non-blocking operations
   - Overlap computation and communication
   - Use efficient collective operations

5. Debugging:
   - Use MPI profiling tools
   - Check for deadlocks
   - Verify message ordering
   - Use debugging flags
""")


=== PERFORMANCE CONSIDERATIONS ===

1. Communication Overhead:
   - Network latency and bandwidth
   - Message size affects performance
   - Collective operations can be optimized

2. Load Balancing:
   - Distribute work evenly across processes
   - Avoid idle processes
   - Consider dynamic load balancing

3. Scalability:
   - Strong scaling: Fixed problem, increase processes
   - Weak scaling: Increase problem and processes proportionally
   - Amdahl's Law: Speedup limited by serial portion

4. Optimization Strategies:
   - Minimize communication
   - Use non-blocking operations
   - Overlap computation and communication
   - Use efficient collective operations

5. Debugging:
   - Use MPI profiling tools
   - Check for deadlocks
   - Verify message ordering
   - Use debugging flags
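
Amdahl's Law from item 3 can be made concrete. If a fraction `s` of the runtime is serial, the speedup on `p` processes is bounded by `1 / (s + (1 - s) / p)`. The sketch below evaluates that bound; the function name and the 10% serial fraction are illustrative assumptions.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Upper bound on speedup when serial_fraction of the runtime cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


# Hypothetical program in which 10% of the runtime is serial.
for p in (1, 2, 4, 8, 16, 1024):
    bound = amdahl_speedup(0.1, p)
    efficiency = bound / p  # fraction of ideal linear speedup achieved
    print(f"{p:5d} processes: speedup <= {bound:.2f}, efficiency <= {efficiency:.2f}")
```

However many processes are added, the bound never exceeds `1 / s`, which is 10 for this example. This is why reducing the serial portion often matters more than adding processes.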



## 6. Key Concepts Summary

### MPI Basics:
- **Communicator**: Group of processes
- **Rank**: Process identifier
- **Size**: Number of processes

### Communication Types:
- **Point-to-Point**: Send/Recv between two processes
- **Collective**: Operations involving all processes
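- **Blocking**: Call returns only once its buffer is safe to reuse (e.g. `Send`, `Recv`)
- **Non-blocking**: Call returns immediately and completion is checked later (e.g. `Isend`, `Irecv` followed by `Wait`)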

### Common Operations:
- Send/Recv: Direct communication
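- Bcast: Copy the same data from one process to all
- Scatter: Split data on one process into chunks, one per process
- Gather: Collect data from all processes onto one
- Reduce: Combine data from all processes onto one (e.g. sum, max)
- Allreduce: Reduce, then deliver the result to every process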

### Parallel Patterns:
- Master-Worker: Centralized control
- Data Parallelism: Distribute data
- Pipeline: Sequential stages
- Hybrid: Combine multiple patterns