****

# <center> <b> <span style="color:orange;"> African Institute for Mathematical Sciences RWANDA  </span> </b></center>

### <center> <b> <span style="color:green;">PYTHON PROGRAMMING : Individual Project </span> </b></center>

### <left> <b> <span style="color:blue;">Presented by : </span> </b></left>  Manuella Kristeva NAKAM YOPDUP



### <center> <b> <span style="color:brown;">Multiprocessing — Process-based parallelism </span> </b></center>

Python's multiprocessing module is a library that allows you to run tasks concurrently using multiple processes.

It is used to:

- Reduce execution time for large calculations

- Run parallel tasks, thereby improving program performance, especially for tasks that require a lot of computation.
- Take advantage of multicore architectures by allowing multiple processes to run simultaneously.
- Manage independent tasks that can be run in parallel without interfering with each other.

## How to use it?

The multiprocessing module is included in the Python standard library. To use it, simply import the module as follows:

In [9]:
from multiprocessing import *

The multiprocessing module also provides APIs that have no analogues in the threading module. Pool is one of these APIs: it provides a convenient way to parallelize the execution of a function on multiple input values, by distributing the input data across multiple processes (data parallelism).


Before distributing work across processes, it is useful to know how many CPUs the machine has.

In [16]:
import multiprocessing

print("Number of cpu : ", multiprocessing.cpu_count())

Number of cpu :  4
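The value returned by cpu_count() covers every logical CPU in the machine, which can be more than the current process is allowed to use when a CPU affinity mask restricts it. The following is a minimal sketch of how both numbers could be compared. The helper name `usable_cpus` is hypothetical, and it relies on `os.sched_getaffinity`, which exists only on some Unix systems, so it falls back to cpu_count() elsewhere.

```python
import multiprocessing
import os


def usable_cpus():
    """Return the number of CPUs this process may run on (hypothetical helper)."""
    if hasattr(os, "sched_getaffinity"):
        # Available on some Unix systems only: the set of CPUs this process may use
        return len(os.sched_getaffinity(0))
    # Fallback: every logical CPU in the machine
    return multiprocessing.cpu_count()


if __name__ == "__main__":
    print("Logical CPUs :", multiprocessing.cpu_count())
    print("Usable CPUs  :", usable_cpus())
```

A value like this can then be used as the size of a Pool, instead of a hard-coded number.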


The multiprocessing module provides several classes and functions, each with a specific role.

## Pool class

- Pool() : creates a pool of worker processes. Its map() method splits the input values among the workers, applies the function to each value, and returns the results in the same order as the inputs.

As we can see from the following example:

In [19]:

def f(x):
    return x*x -2*x

if __name__ == '__main__':

    
    with Pool(3) as p:
        print(p.map(f, [1, 2, 3, 8,10]))


[-1, 0, 3, 48, 80]


In this code, a pool of 3 worker processes computes f(x) for the inputs [1, 2, 3, 8, 10], and map() returns the results in the same order as the inputs.
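To see that the work really is spread over several processes, we can return the identifier of the worker together with each result. This is only a sketch: the names `f_with_pid` and `run_pool` are hypothetical and not part of the original example. How the inputs are assigned to the workers is decided by the pool, so the identifiers change from one run to the next.

```python
import os
from multiprocessing import Pool


def f_with_pid(x):
    """Return f(x) together with the id of the worker process that computed it."""
    return x * x - 2 * x, os.getpid()


def run_pool(inputs, workers=3):
    """Apply f_with_pid to every input using a pool of worker processes."""
    with Pool(workers) as pool:
        return pool.map(f_with_pid, inputs)


if __name__ == "__main__":
    for value, pid in run_pool([1, 2, 3, 8, 10]):
        print(f"result {value:>3} computed by worker {pid}")
```

The values come back in input order even though different workers may have produced them.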

## Measuring the execution time

We have seen that one of the benefits of multiprocessing is the reduction of execution time. So we will try to verify this by calculating the execution time of our function sequentially and in parallel.

In [47]:
# sequential code

import time

def f(x):
    return x*x - 2*x

if __name__ == '__main__':
    # Start timing
    start_time = time.time()

    # Run the computation without parallelization
    results = [f(x) for x in [1, 2, 3, 8, 10]]

    # Stop timing
    end_time = time.time()

    # Print the results and the execution time
    print(results)  # prints: [-1, 0, 3, 48, 80]
    print(f'Execution time: {end_time - start_time} seconds')

[-1, 0, 3, 48, 80]
Execution time: 7.104873657226562e-05 seconds


In [48]:
# parallel code, 2 worker processes

import time
from multiprocessing import Pool

def f(x):
    return x*x -2*x

if __name__ == '__main__':

    start_time = time.time()
    
    with Pool(2) as p:
        print(p.map(f, [1, 2, 3, 8,10]))

    end_time = time.time()

    print(f'Execution time: {end_time - start_time} seconds')


[-1, 0, 3, 48, 80]
Execution time: 0.03745007514953613 seconds


In [49]:
# parallel code, 4 worker processes

import time
from multiprocessing import Pool

def f(x):
    return x*x -2*x

if __name__ == '__main__':

    start_time = time.time()
    
    with Pool(4) as p:
        print(p.map(f, [1, 2, 3, 8,10]))

    end_time = time.time()

    print(f'Execution time: {end_time - start_time} seconds')


[-1, 0, 3, 48, 80]
Execution time: 0.09272241592407227 seconds


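We find that, for such a small input, parallelization increases the execution time: creating the worker processes and sending data to and from them costs far more than the computation itself. Parallelization only pays off when each task involves enough computation to outweigh this overhead.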
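To observe the opposite situation, the same comparison can be repeated with a task that takes much longer per input. The sketch below is an assumption for illustration, not part of the measurements above: `slow_sum` is a hypothetical CPU-bound stand-in and `compare` is a hypothetical helper. The timings depend on the machine and on the number of cores, so no expected values are shown.

```python
import time
from multiprocessing import Pool


def slow_sum(n):
    """CPU-bound stand-in: sum of the squares below n, computed in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def compare(inputs, workers=2):
    """Return (sequential results, parallel results, sequential time, parallel time)."""
    start = time.perf_counter()
    sequential = [slow_sum(n) for n in inputs]
    seq_time = time.perf_counter() - start

    start = time.perf_counter()
    with Pool(workers) as pool:
        parallel = pool.map(slow_sum, inputs)
    par_time = time.perf_counter() - start
    return sequential, parallel, seq_time, par_time


if __name__ == "__main__":
    seq, par, t_seq, t_par = compare([1_000_000] * 8, workers=4)
    print("same results:", seq == par)
    print(f"sequential: {t_seq:.3f} s, parallel: {t_par:.3f} s")
```

Here time.perf_counter() is used instead of time.time() because it is intended for measuring short durations.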

## Start and join functions

start() and join() are generally used together:
- start() : launches the process, which then runs the target function.
- join() : blocks the calling process until the started process has finished.

In [50]:
from multiprocessing import Process

def f(name, firstname):
    print('Hello', name, firstname)

if __name__ == '__main__':
    p = Process(target=f, args=('bob','sponge',))
    p.start()
    p.join()


Hello bob sponge


This code shows how to create and run a process in Python using the Process class:

- A function f is defined to greet a first and last name.
- A process is created to run this function with specific arguments.
- The process is started and the main script waits for it to finish.
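The effect of start() and join() can be checked with the is_alive() method and the exitcode attribute of a Process object. The following sketch assumes a hypothetical task `work` that only sleeps, and a hypothetical helper `run_and_inspect` that records the state of the process before it is started and after it has been joined.

```python
import time
from multiprocessing import Process


def work(seconds):
    """Stand-in task: sleep for the given number of seconds."""
    time.sleep(seconds)


def run_and_inspect(seconds=0.2):
    """Start one process and record its state before start() and after join()."""
    p = Process(target=work, args=(seconds,))
    alive_before_start = p.is_alive()  # the process has not been launched yet
    p.start()                          # launch the child process
    p.join()                           # block until the child has finished
    alive_after_join = p.is_alive()
    return alive_before_start, alive_after_join, p.exitcode


if __name__ == "__main__":
    print(run_and_inspect())  # prints: (False, False, 0)
```

An exit code of 0 means that the target function returned normally.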

## Process identifier

Every process (except process 0) has one parent process, but can have many child processes. Each process is identified by a process ID (PID) assigned by the operating system kernel.

The following code prints the ID of a process and the ID of its parent.

In [1]:
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

main line
module name: __main__
parent process: 4539
process id: 4773
function f
module name: __main__
parent process: 4773
process id: 4971
hello bob


This code shows how to use the multiprocessing module to create a child process and get information about processes: 
- The info function provides information about the current process and its parent.
- The f function is executed in a new process, which prints information and a greeting message.

- The main process waits for the secondary process to finish before terminating itself.
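The parent can also read the identifier of a child directly from the Process object, through its pid attribute, and give each process a readable name. A minimal sketch, assuming a hypothetical empty task `idle` and a hypothetical helper `start_named`:

```python
import os
from multiprocessing import Process


def idle():
    """Stand-in task that returns immediately."""
    pass


def start_named(count=3):
    """Start `count` named processes and return their (name, pid) pairs."""
    procs = [Process(target=idle, name=f"worker-{i}") for i in range(count)]
    for p in procs:
        p.start()
    info = [(p.name, p.pid) for p in procs]  # pid is set once start() has run
    for p in procs:
        p.join()
    return info


if __name__ == "__main__":
    print("main process id:", os.getpid())
    for name, pid in start_named():
        print(name, "has process id", pid)
```

The identifiers are chosen by the operating system, so they differ on every run.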

## Types of communication channels between processes

Multiprocessing supports two types of communication channels between processes:

- Queue: a first-in, first-out queue that is safe to use from several processes at once. It lets processes exchange objects while keeping their memory spaces isolated.
- Pipe(): returns a pair of connection objects, one for each end of the channel. By default the pipe is duplex (two-way), so each end can both send and receive.

### Queue class

In [4]:
from multiprocessing import Process, Queue

def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())    # prints "[42, None, 'hello']"
    p.join()

[42, None, 'hello']


This code demonstrates how to use Queue to send data from a child process to the main process. The child process executes the function f, which places a list on the queue. The main process then retrieves this list and displays it. Because a Queue is safe to use from several processes at once, it is a convenient way to pass data between them without extra locking.
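A queue can carry more than one item. A common pattern is for the sending process to finish with a special marker so that the receiver knows when to stop reading. The sketch below assumes that `None` is used as this marker. The names `SENTINEL`, `producer` and `collect_squares` are hypothetical.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # assumed end-of-stream marker, chosen for this sketch


def producer(q, count):
    """Put the first `count` squares on the queue, then the end marker."""
    for i in range(count):
        q.put(i * i)
    q.put(SENTINEL)


def collect_squares(count=5):
    """Read items sent by a child process until the end marker arrives."""
    q = Queue()
    p = Process(target=producer, args=(q, count))
    p.start()
    results = []
    while True:
        item = q.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        results.append(item)
    p.join()  # join only after the queue has been emptied
    return results


if __name__ == "__main__":
    print(collect_squares())  # prints: [0, 1, 4, 9, 16]
```

With a single producer, the items arrive in the order in which they were put on the queue.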

### Pipe function

In [5]:
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())   # prints "[42, None, 'hello']"
    p.join()

[42, None, 'hello']


This code shows how to use Pipe to send data from a child process to the main process. The child process executes the function f, which sends a list through the pipe. The main process receives this list and displays it. A pipe connects exactly two endpoints, which makes it a simple choice when only two processes need to communicate.
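Since a pipe created with Pipe() is two-way by default, the same connection can carry a request and its answer. The following sketch assumes a hypothetical worker `doubler` that replies with twice each number it receives and stops when it receives `None`, and a hypothetical helper `double_all` for the parent side.

```python
from multiprocessing import Process, Pipe


def doubler(conn):
    """Receive numbers and send back their double until None arrives."""
    while True:
        value = conn.recv()
        if value is None:  # assumed stop signal for this sketch
            break
        conn.send(value * 2)
    conn.close()


def double_all(values):
    """Send each value to the child process and collect the replies."""
    parent_conn, child_conn = Pipe()  # duplex by default
    p = Process(target=doubler, args=(child_conn,))
    p.start()
    results = []
    for v in values:
        parent_conn.send(v)
        results.append(parent_conn.recv())  # wait for the reply
    parent_conn.send(None)  # tell the child to stop
    p.join()
    parent_conn.close()
    return results


if __name__ == "__main__":
    print(double_all([1, 2, 3]))  # prints: [2, 4, 6]
```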

## Lock class

A lock is a synchronization mechanism used in concurrent programming to control access to a shared resource between multiple threads or processes. Using a lock helps prevent race conditions, where multiple threads or processes attempt to access or modify a resource at the same time, which can lead to undefined results or errors.

In [6]:
from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    try:
        print('hello world', i)
    finally:
        l.release()

if __name__ == '__main__':
    lock = Lock()

    for num in range(10):
        Process(target=f, args=(lock, num)).start()

hello world 0
hello world 1
hello world 2
hello world 3
hello world 4
hello world 5
hello world 6
hello world 7
hello world 8
hello world 9


This code demonstrates the use of a lock to synchronize access to a shared resource (the terminal, in this case) between multiple processes. With this lock, only one process at a time can print its message, so the lines cannot be mixed together. The lock does not control the order in which the processes run, so the numbers may appear in a different order from one run to the next.
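A lock can also be used in a `with` statement, which acquires it on entry and releases it on exit, even if an error occurs. The sketch below applies this to another shared resource, a file: each process appends one line while holding the lock. The names `append_line` and `write_lines` are hypothetical, and the temporary file is only an assumption made for this illustration.

```python
import os
import tempfile
from multiprocessing import Process, Lock


def append_line(lock, path, i):
    """Append one line to the file while holding the lock."""
    with lock:  # acquired here, released when the block ends
        with open(path, "a") as fh:
            fh.write(f"hello world {i}\n")


def write_lines(count=5):
    """Let `count` processes append to one temporary file, then return its lines."""
    lock = Lock()
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    procs = [Process(target=append_line, args=(lock, path, i)) for i in range(count)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    with open(path) as fh:
        lines = fh.read().splitlines()
    os.remove(path)
    return lines


if __name__ == "__main__":
    for line in write_lines():
        print(line)
```

Every line is written whole, but the order of the lines can change between runs.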

## Process, Value, Array

- The Process class allows you to create and manage parallel processes. Each process runs in its own memory space, making it isolated from the others.
- The Value class allows you to create a single variable stored in shared memory, so that several processes can read and modify it. Its type is given by a type code, such as 'd' for a double-precision float or 'i' for a signed integer.
- The Array class allows you to create a shared array whose elements all have the same type, given in the same way. This is useful for sharing a fixed-size sequence of numbers between multiple processes.

In [52]:
from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])

3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]


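This code shows how to use Value and Array to share data between a main process and a child process. The child process modifies both objects, and the main process sees the changes after join() because the data is stored in shared memory.
By default, Value and Array are each created with a lock that can be used to synchronize access when several processes modify the data.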
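When several processes update the same Value, an operation such as `+= 1` is a read followed by a write, so it should be protected. A Value created with the default settings carries its own lock, available through get_lock(). A minimal sketch, assuming the hypothetical names `add_many` and `count_in_parallel`:

```python
from multiprocessing import Process, Value


def add_many(counter, times):
    """Increment the shared counter `times` times, holding its lock each time."""
    for _ in range(times):
        with counter.get_lock():  # += is a read followed by a write
            counter.value += 1


def count_in_parallel(workers=4, times=1000):
    """Let several processes increment one shared integer and return the total."""
    counter = Value("i", 0)  # shared signed integer, created with a lock by default
    procs = [Process(target=add_many, args=(counter, times)) for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(count_in_parallel())  # prints: 4000
```

Without the lock, some increments could be lost when two processes read the same old value.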

## Process, Manager

- The Process class allows you to create and manage parallel processes. Each process runs in a separate memory space, making it isolated from the others.

- Manager() returns a manager object that controls a server process. This server process holds Python objects such as dictionaries and lists, and the other processes manipulate them through proxies. This makes it easier to communicate and share data between processes.

In [8]:
from multiprocessing import Process, Manager

def f(d, l):
    d[1] = '1'
    d['2'] = 2
    d[0.25] = None
    l.reverse()

if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
        l = manager.list(range(10))

        p = Process(target=f, args=(d, l))
        p.start()
        p.join()

        print(d)
        print(l)

{1: '1', '2': 2, 0.25: None}
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]


This code illustrates how to use a Manager to create shared objects between processes in Python. It shows how to use shared data structures to facilitate communication and coordination between multiple processes.
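A shared dictionary is convenient for collecting one result per process. In the sketch below, each process stores the square of its own number under its own key. The names `record_square` and `squares_by_worker` are hypothetical. The content is copied into a plain dict before the manager is shut down, because the proxy can no longer be used afterwards.

```python
from multiprocessing import Process, Manager


def record_square(results, i):
    """Store the square of i in the shared dictionary under the key i."""
    results[i] = i * i


def squares_by_worker(count=4):
    """Start one process per number and gather the results in a plain dict."""
    with Manager() as manager:
        results = manager.dict()
        procs = [Process(target=record_square, args=(results, i)) for i in range(count)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return dict(results)  # copy before the manager shuts down


if __name__ == "__main__":
    print(squares_by_worker())
```

Each process writes to a different key, so no additional lock is needed in this case.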

## Running multiple processes with Process

In [46]:
from multiprocessing import Process


def print_func(foods='Rice'):
    print(' The name of food is : ', foods)

if __name__ == "__main__":  # run only when executed as the main script
    names = ['Tomato', 'Groundnuts', 'cocoyam', 'plantain']
    procs = []
    proc = Process(target=print_func)  # no argument, so the default 'Rice' is used
    procs.append(proc)
    proc.start()

    # create one process per name, passing the name as argument
    for name in names:
        proc = Process(target=print_func, args=(name,))
        procs.append(proc)
        proc.start()

    # wait for all processes to finish
    for proc in procs:
        proc.join()

 The name of food is :  The name of food is :   The name of food is :   The name of food is : TomatoRice 
 The name of food is : 
 cocoyam plantain

Groundnuts


This code demonstrates how to create and manage multiple processes in Python:

- A process is created to print the default food ('Rice').
- Other processes are created to print the foods specified in the list ('Tomato', 'Groundnuts', 'cocoyam', 'plantain').
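- All processes are started and the script waits for them to complete. Because the processes print at the same time without a lock, their messages are interleaved in the output above.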

# Conclusion 

The multiprocessing module in Python allows you to create programs that use multiple processes to execute tasks in parallel. This is particularly useful for taking advantage of multi-core architectures and improving the performance of CPU-bound applications.

# References

- https://docs.python.org/3/library/multiprocessing.html
- https://www.digitalocean.com/community/tutorials/python-multiprocessing-example