# Multi-Processing, Assignment no. 14
## 15th Feb 2023 Assignment

#### Q1. What is multiprocessing in python? Why is it useful?

Multiprocessing refers to the ability of a system to support more than one processor at the same time. Applications in a multiprocessing system are broken into smaller routines that run independently, and the operating system allocates these routines to the processors, which improves the performance of the system.

In Python, multiprocessing is a standard-library package that lets a program run multiple processes simultaneously. It enables an application to be broken into smaller tasks that run independently, each in its own process with its own Python interpreter and memory space. The operating system can then schedule these processes on different processors or cores so that they run in parallel, which improves overall performance and efficiency, particularly for CPU-bound work.

### Multiprocessing is useful because:
Performing many operations on a single processor becomes challenging. As the number of processes increases, the processor has to halt the current process and switch to the next one to keep them all going. Each task is interrupted repeatedly, which hampers performance.

You can think of it as an employee in an organization who is given jobs in multiple departments. If the employee has to manage sales, accounts, and even the backend, they have to stop working on sales whenever they turn to accounts, and vice versa.

Now suppose there are different employees, each performing a specific task. It becomes simpler, right? That’s why multiprocessing in Python is so useful. The separate processes act like different employees, making it easier to handle and manage various tasks. A multiprocessing system can take either of these forms:

- A system with more than a single central processor
- A multi-core processor, i.e., a single computing unit with multiple independent core processing units

In multiprocessing, the system can divide and assign tasks to different processors.
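
As a rough illustration of dividing work among processors, the sketch below spreads a CPU-bound job over a pool of worker processes. The function count_primes and the input limits are made up for this example; they are assumptions, not part of the assignment.

```python
import multiprocessing
import os


def count_primes(limit):
    """Count the primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]  # hypothetical workloads

    # One worker per available core; each worker is a separate process.
    with multiprocessing.Pool(processes=os.cpu_count()) as pool:
        results = pool.map(count_primes, limits)

    for limit, primes in zip(limits, results):
        print(f"primes below {limit}: {primes}")
```

On a multi-core machine the four calls can run at the same time, one per worker, instead of one after another.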

#### Q2. What are the differences between multiprocessing and multithreading?

Both multiprocessing and multithreading are used to increase the computing power of a system.

Multiprocessing: Multiprocessing is a system that has two or more processors. CPUs are added to increase the computing speed of the system, so that many processes are executed simultaneously. Multiprocessing is classified into two categories:

1. Symmetric Multiprocessing
2. Asymmetric Multiprocessing 

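Multithreading: Multithreading is a system in which multiple threads of a single process are created to increase the computing speed of the system. Many threads of one process are executed concurrently, and creating a thread is economical compared with creating a process. (In the default CPython build, the global interpreter lock lets only one thread execute Python bytecode at a time, so threads mainly help with I/O-bound work.)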
 

1. In multiprocessing, CPUs are added to increase computing power, whereas in multithreading, many threads of a single process are created to increase computing power.

2. In multiprocessing, many processes are executed simultaneously, whereas in multithreading, many threads of a single process are executed simultaneously.

3. Multiprocessing is classified into symmetric and asymmetric, whereas multithreading is not classified into any categories.

4. In multiprocessing, process creation is time-consuming, whereas in multithreading, thread creation is economical.

5. In multiprocessing, every process owns a separate address space, whereas in multithreading, a common address space is shared by all the threads.
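
Difference 5 can be observed directly. In the sketch below (the helper names add_item, run_with_thread and run_with_process are made up for illustration), a thread and a process each append to a list created by the caller. The thread shares the caller's memory, so its change is visible; the process works on its own copy, so the caller's list stays empty.

```python
import multiprocessing
import threading


def add_item(items, item):
    """Append item to whatever list this thread or process can see."""
    items.append(item)


def run_with_thread():
    items = []
    worker = threading.Thread(target=add_item, args=(items, "from thread"))
    worker.start()
    worker.join()
    return items  # the thread appended to this very list


def run_with_process():
    items = []
    worker = multiprocessing.Process(target=add_item, args=(items, "from process"))
    worker.start()
    worker.join()
    return items  # the child process changed only its own copy


if __name__ == "__main__":
    print(run_with_thread())   # ['from thread']
    print(run_with_process())  # []
```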

#### Q3. Write a python code to create a process using the multiprocessing module.

In [28]:
import multiprocessing
import time

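def square(n):
    # Sleep first so that this process finishes after cube().
    time.sleep(1)
    print(n**2)


def cube(n):
    print(n**3)


if __name__ == "__main__":
    # Each Process object runs its target function in a separate child process.
    m1 = multiprocessing.Process(target=square, args=(10,))
    m2 = multiprocessing.Process(target=cube, args=(15,))

    # start() launches the child processes, which then run concurrently.
    m1.start()
    m2.start()

    # join() blocks until the corresponding child process has finished.
    m1.join()
    m2.join()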

3375
100
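
Note that the return value of a Process target is discarded, which is why the functions above print their results instead of returning them. One common way to send a result back to the parent is a multiprocessing.Queue. The following is a minimal sketch of that idea; the function names square_to_queue and collect_squares are hypothetical.

```python
import multiprocessing


def square_to_queue(n, results):
    """Put (input, square) on the queue instead of returning it."""
    results.put((n, n ** 2))


def collect_squares(numbers):
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=square_to_queue, args=(n, results))
        for n in numbers
    ]
    for worker in workers:
        worker.start()

    # Read one result per worker before joining, so that no child is left
    # blocked while flushing its data to the queue.
    pairs = [results.get() for _ in workers]

    for worker in workers:
        worker.join()

    # Results arrive in completion order, so key them by their input.
    return dict(pairs)


if __name__ == "__main__":
    squares = collect_squares([10, 15])
    print(squares[10], squares[15])  # 100 225
```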


#### Q4. What is a multiprocessing pool in python? Why is it used?

Python multiprocessing Pool can be used for parallel execution of a function across multiple input values, distributing the input data across processes (data parallelism).

Pool and Process both execute jobs in parallel, but they go about it differently:
- The Process class creates one process per object. You start and join each process yourself, every process you create is held in memory at the same time, and the operating system schedules them on the available processors.

- The Pool class keeps a fixed number of worker processes and hands queued tasks to them in FIFO order. It works like a map-reduce design: it maps the inputs across the worker processes and gathers the outputs from all of them. A call such as map() waits for all the jobs to finish and then returns the output as a list, in the same order as the inputs. Only the worker processes are held in memory; tasks that are not yet executing wait in a queue instead of occupying a process of their own.



One can create a pool of processes which will carry out tasks submitted to it with the Pool class.

### class multiprocessing.pool.Pool([processes[, initializer[, initargs[, maxtasksperchild[, context]]]]])

A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and callbacks and has a parallel map implementation.

- processes is the number of worker processes to use. If processes is None then the number returned by os.cpu_count() is used.

- If initializer is not None then each worker process will call initializer(*initargs) when it starts.

- maxtasksperchild is the number of tasks a worker process can complete before it will exit and be replaced with a fresh worker process, to enable unused resources to be freed. The default maxtasksperchild is None, which means worker processes will live as long as the pool.

- context can be used to specify the context used for starting the worker processes. Usually a pool is created using the function multiprocessing.Pool() or the Pool() method of a context object. In both cases context is set appropriately.

Note that the methods of the pool object should only be called by the process which created the pool.

Creating a pool object:
>* p = multiprocessing.Pool()

Mapping a list to the target function:
>* result = p.map(square, mylist)
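
The constructor arguments described above can be combined as in the following sketch. The names init_worker, label and label_all, and the prefix value, are hypothetical and chosen only for illustration.

```python
import multiprocessing

_prefix = None  # set once in every worker process by init_worker()


def init_worker(prefix):
    """Runs in each worker when it starts (and again in any replacement worker)."""
    global _prefix
    _prefix = prefix


def label(n):
    return f"{_prefix}-{n}"


def label_all(numbers, prefix="job"):
    with multiprocessing.Pool(
        processes=2,              # two worker processes
        initializer=init_worker,  # called as init_worker(*initargs)
        initargs=(prefix,),
        maxtasksperchild=10,      # replace a worker after it completes 10 tasks
    ) as pool:
        return pool.map(label, numbers)


if __name__ == "__main__":
    print(label_all([1, 2, 3]))  # ['job-1', 'job-2', 'job-3']
```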

#### Warning 
multiprocessing.pool objects have internal resources that need to be properly managed (like any other resource), either by using the pool as a context manager or by calling close() and terminate() manually. Failure to do this can lead to the process hanging on finalization.

Note that it is not correct to rely on the garbage collector to destroy the pool, as CPython does not assure that the finalizer of the pool will be called (see object.__del__() for more information).
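
As a sketch of manual clean-up combined with the asynchronous results and timeouts mentioned earlier, the hypothetical helper squares_async below submits tasks with apply_async() and always releases the pool in a finally block. Here close() followed by join() is used for an orderly shutdown; terminate() would stop the workers immediately instead.

```python
import multiprocessing


def square(n):
    return n * n


def squares_async(numbers, timeout=10):
    pool = multiprocessing.Pool(processes=2)
    try:
        # apply_async() returns immediately with an AsyncResult handle.
        pending = [pool.apply_async(square, (n,)) for n in numbers]
        # get() waits up to `timeout` seconds and raises
        # multiprocessing.TimeoutError if the result is not ready by then.
        return [handle.get(timeout=timeout) for handle in pending]
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit


if __name__ == "__main__":
    print(squares_async([2, 3, 4]))  # [4, 9, 16]
```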

#### Q5. How can we create a pool of worker processes in python using the multiprocessing module?

To make use of all the cores, the multiprocessing module provides a Pool class, which represents a pool of worker processes. It has methods that allow tasks to be offloaded to the worker processes in a few different ways.

The Pool object distributes the tasks among the worker processes automatically, so the user does not need to create processes explicitly.

In [1]:
import multiprocessing
import os
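def square(n):
    # Report which worker process handles this input.
    print(f"\n Worker process id for {n} : {os.getpid()}")
    return n*n

if __name__ == "__main__":

    # Leaving the with block terminates the pool, so no explicit po.terminate() is needed.
    with multiprocessing.Pool(processes=5) as po:
        # map() blocks until every input has been processed.
        result2 = po.map(square, [2,3,4,5,6,7,8])

    print(result2)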


 Worker process id for 2 : 4036
 Worker process id for 5 : 4039
 Worker process id for 4 : 4038
 Worker process id for 6 : 4040
 Worker process id for 3 : 4037





 Worker process id for 7 : 4036
 Worker process id for 8 : 4039

[4, 9, 16, 25, 36, 49, 64]


### Let us try to understand the above code step by step:

We create a Pool object with five worker processes using:
>* with multiprocessing.Pool(processes=5) as po:

There are a few arguments for gaining more control over the offloading of tasks. These are:
- processes: the number of worker processes.
- maxtasksperchild: the number of tasks a worker process completes before it is replaced with a fresh one.

All the processes in a pool can be made to perform some initialization using these arguments:
- initializer: an initialization function for the worker processes.
- initargs: the arguments to be passed to initializer.

Now, in order to perform some task, we have to map the inputs to some function. In the example above, we map the list [2,3,4,5,6,7,8] to the square function. As a result, the items of the list are distributed among the worker processes, and each worker calls square on the items it receives. This is why the output shows five distinct process ids, two of which handle a second input.
>* result2 = po.map(square, [2,3,4,5,6,7,8])

- Once all the worker processes finish their tasks, a list is returned with the final results, in the same order as the inputs.
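
Besides map(), a pool offers a few other ways of offloading work. The sketch below shows starmap(), imap_unordered() and the chunksize argument of map(); the helper power and the sample inputs are assumptions made for this illustration.

```python
import multiprocessing


def power(base, exponent):
    return base ** exponent


def square(n):
    return n * n


if __name__ == "__main__":
    with multiprocessing.Pool(processes=3) as pool:
        # starmap() unpacks each tuple into the function's arguments.
        print(pool.starmap(power, [(2, 3), (3, 2), (10, 4)]))  # [8, 9, 10000]

        # imap_unordered() yields results as they complete, so sort them
        # if a stable order matters.
        print(sorted(pool.imap_unordered(square, [2, 3, 4, 5])))  # [4, 9, 16, 25]

        # chunksize sets how many inputs are sent to a worker at a time.
        print(pool.map(square, range(6), chunksize=2))  # [0, 1, 4, 9, 16, 25]
```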

#### Q6. Write a python program to create 4 processes, each process should print a different number using the multiprocessing module in python.

In [4]:
import multiprocessing
import os


def func(s):
    # Print the label passed in, together with the id of the process running this function.
    print(f"{s} process pid : {os.getpid()}")

    
if __name__ == "__main__":
    
    m1 = multiprocessing.Process(target=func, args=('m1',))
    m2 = multiprocessing.Process(target=func, args=('m2',))
    m3 = multiprocessing.Process(target=func, args=('m3',))
    m4 = multiprocessing.Process(target=func, args=('m4',))
    
    
    m1.start()
    m2.start()
    m3.start()
    m4.start()
    
    m1.join()
    m2.join()
    m3.join()
    m4.join()
    
    #print(f'm1 pid :{m1.pid}, m2 pid : {m2.pid}, m3 pid :{m3.pid}, m4 pid :{m4.pid}') # pid is Process id

m1 process pid : 4175
m2 process pid : 4178
m3 process pid : 4185
m4 process pid : 4190
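
The question asks for a different number per process, whereas the program above passes labels. A loop-based variant that passes the numbers 1 to 4 might look like the following sketch; print_number and run_four are hypothetical names.

```python
import multiprocessing


def print_number(number):
    name = multiprocessing.current_process().name
    print(f"process {name} got number {number}")


def run_four(numbers=(1, 2, 3, 4)):
    # Build one Process per number instead of writing m1..m4 by hand.
    workers = [
        multiprocessing.Process(target=print_number, args=(n,), name=f"worker-{n}")
        for n in numbers
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # An exit code of 0 means the child process finished without error.
    return [worker.exitcode for worker in workers]


if __name__ == "__main__":
    print(run_four())
```

The four lines printed by the children can appear in any order, because the processes run concurrently.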
