## The multiprocessing module provides two main classes: `Process` and `Pool`
* `Pool` lets each worker process handle multiple jobs
* When there is a large number of tasks, launching a separate process for each task is impractical. The `Pool` class instead maps the tasks onto a set of worker processes (typically as many as the number of available cores) and collects the return values in a list

https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
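
As a minimal sketch of the idea above, the following maps a function over a list with a small pool of workers. The names `cube` and `parallel_cubes` and the worker count of 2 are illustrative choices, not part of the original notebook. The `if __name__ == "__main__":` guard matters on platforms that start workers by launching a fresh interpreter.

```python
from multiprocessing import Pool


def cube(n):
    # Hypothetical example task, defined at module level so it can be pickled by name.
    return n * n * n


def parallel_cubes(numbers, processes=2):
    # map() blocks until every result is back; the with-block then shuts the pool down.
    with Pool(processes=processes) as pool:
        return pool.map(cube, numbers)


if __name__ == "__main__":
    print(parallel_cubes([1, 2, 3, 4]))
```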

### Making a simple function and appending the results to a list

In [1]:
def cal_square(n):
    return n * n

In [2]:
num_list = [1, 2, 3, 4, 5]
result = []

for num in num_list:
    result.append(cal_square(num))

In [3]:
result

[1, 4, 9, 16, 25]

<b> `output`: Regardless of how many processing units are available, this loop runs in a single process and so uses only one processing unit at a time </b>
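
The same serial computation can also be written with the built-in `map()`. It is worth seeing because `Pool.map()` is documented as a parallel equivalent of this built-in. The snippet below only restates the loop above and still runs in one process.

```python
def cal_square(n):
    return n * n


num_list = [1, 2, 3, 4, 5]

# Built-in map() applies the function to each element, one after another.
result = list(map(cal_square, num_list))
print(result)  # [1, 4, 9, 16, 25]
```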

### We want to know how many cores participated in executing the operation

### Importing the os module

In [4]:
import os

### Let us check how many cores are available on the current computer

In [5]:
os.cpu_count()

4
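
`os.cpu_count()` reports the logical CPUs of the machine. That can be more than the current process is allowed to use under a CPU affinity mask, and the call can return `None` when the count cannot be determined. A defensive sketch follows; the helper name `usable_cpu_count` is hypothetical, and `os.sched_getaffinity` exists only on some Unix platforms, hence the `hasattr` check.

```python
import os


def usable_cpu_count():
    # Prefer the set of CPUs this process may run on, where the platform exposes it.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fall back to the machine-wide count; cpu_count() may return None.
    return os.cpu_count() or 1


print(usable_cpu_count())
```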

### Now using `os.getpid()` to get the ID of the process that executes the code above

In [6]:
def cal_square(n):
    print(n, os.getpid())
    return n * n

result = []

In [7]:
for num in num_list:
    result.append(cal_square(num))

result

1 18544
2 18544
3 18544
4 18544
5 18544


[1, 4, 9, 16, 25]

<b> `output`: Even though 4 cores are available, without pooling the code runs in a single process, which is why the same process ID appears for every element </b>

### To make the best use of the available cores and get the fastest response, the `Pool` class of multiprocessing is used
https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

In [4]:
from multiprocessing import Pool

`map()` maps the function over the input, i.e. it splits the input into chunks and hands them to the worker processes so that the work runs on multiple cores

https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map
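
`Pool.map()` accepts an optional `chunksize` argument that controls roughly how many input items are sent to a worker at a time. The pure-Python helper below only illustrates what splitting an input into chunks of a given size looks like. `split_into_chunks` is a hypothetical name and this is not the library's internal code.

```python
def split_into_chunks(items, chunksize):
    # Materialise the input so it can be sliced, then cut it into consecutive pieces.
    items = list(items)
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


print(split_into_chunks([1, 2, 3, 4, 5], 2))  # [[1, 2], [3, 4], [5]]
```

With a real pool the chunk size would be requested as `p.map(cal_square, num_list, chunksize=2)`.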

In [12]:
result = []

In [13]:
p = Pool()

result = p.map(cal_square, num_list)

p.close()

1 18575
3 18577
2 18576
4 18578
5 18577


In [14]:
result

[1, 4, 9, 16, 25]
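
Note that the printed lines can appear in any order, while the list returned by `map()` follows the input order. A small sketch of this, assuming a hypothetical variant `square_with_pid` that returns the process ID instead of printing it:

```python
import os
from multiprocessing import Pool


def square_with_pid(n):
    # Return the worker's process ID alongside the result instead of printing it.
    return n * n, os.getpid()


def squares_in_order(numbers):
    with Pool() as pool:
        pairs = pool.map(square_with_pid, numbers)
    # map() returns results in input order, whichever worker finished first.
    return [square for square, _pid in pairs]


if __name__ == "__main__":
    print(squares_in_order([1, 2, 3, 4, 5]))
```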

### `Pool` has an argument called `processes`, which is the number of worker processes to use

In [46]:
result = []

In [47]:
p = Pool(processes=2)

result = p.map(cal_square, num_list)

p.close()

2 18621
1 18620
3 18620
4 18621
5 18620


In [48]:
result

[1, 4, 9, 16, 25]

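<b> `output`: With `processes=2` only two worker processes exist, so the five elements are shared between two process IDs (18620 and 18621). The system I'm working on right now has four cores, so in the earlier run with the default `Pool()` the first four elements went to four different workers and the last element went to one of those workers again, which is why that run shows four unique process IDs </b>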
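
The `processes` argument caps how many worker processes exist, so the number of distinct process IDs can never exceed it. A sketch that counts the IDs instead of reading them off the printed output; the helper names `worker_pid` and `distinct_worker_pids` are hypothetical.

```python
import os
from multiprocessing import Pool


def worker_pid(_):
    # Ignore the input and report which process handled it.
    return os.getpid()


def distinct_worker_pids(processes, n_tasks):
    with Pool(processes=processes) as pool:
        return set(pool.map(worker_pid, range(n_tasks)))


if __name__ == "__main__":
    pids = distinct_worker_pids(processes=2, n_tasks=5)
    # At most 2; it can be fewer if one worker happens to take every task.
    print(len(pids))
```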

In [2]:
import time

In [3]:
def sum_square(number):
    s = 0
    for i in range(number):
        s += i * i
    return s

In [5]:
def sum_square_with_mp(numbers):
    
    start_time = time.time()
    p = Pool()
    result = p.map(sum_square, numbers)
    
    p.close()
    p.join()
    
    total_time = time.time() - start_time
    
    print("Processing %d numbers took %.2f seconds using multiprocessing" % (len(numbers), total_time))

In [6]:
def sum_square_without_mp(numbers):

    start_time = time.time()
    result = []

    for i in numbers:
        result.append(sum_square(i))
        
    total_time = time.time() - start_time

    print("Processing %d numbers took %.2f seconds with serial processing" % (len(numbers), total_time))
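
Both functions above repeat the same timing boilerplate. One way to factor it out is a small helper built on `time.perf_counter()`, which is the clock the standard library recommends for measuring short durations. The helper name `timed` is hypothetical, and `sum_square` is repeated here only to keep the sketch self-contained.

```python
import time


def timed(func, *args):
    # Run func(*args) and return its result together with the elapsed seconds.
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


def sum_square(number):
    s = 0
    for i in range(number):
        s += i * i
    return s


result, elapsed = timed(sum_square, 1000)
print("sum_square(1000) = %d, took %.6f seconds" % (result, elapsed))
```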

In [37]:
numbers = range(1000)

sum_square_with_mp(numbers)
sum_square_without_mp(numbers)

Processing 1000 numbers took 0.14 seconds using multiprocessing
Processing 1000 numbers took 0.05 seconds with serial processing


In [38]:
numbers = range(10000)

sum_square_with_mp(numbers)
sum_square_without_mp(numbers)

Processing 10000 numbers took 3.34 seconds using multiprocessing
Processing 10000 numbers took 5.17 seconds with serial processing


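<b> `output`: In the output we can see that for 1000 numbers multiprocessing is slower than the serial loop (0.14 s against 0.05 s), because starting the worker processes and passing data to them costs more than the computation itself. For 10000 numbers the parallel version is faster (3.34 s against 5.17 s) </b>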

In [9]:
numbers = range(20000)

sum_square_with_mp(numbers)
sum_square_without_mp(numbers)

Processing 20000 numbers took 23.52 seconds using multiprocessing
Processing 20000 numbers took 44.05 seconds with serial processing
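
To compare the runs, the speedup can be computed as serial time divided by parallel time, using the timings reported in the outputs above. Values below 1 mean multiprocessing was slower. The `timings` dictionary and the `speedup` helper are just an illustrative way to do the arithmetic.

```python
# (parallel seconds, serial seconds) copied from the outputs above.
timings = {
    1000: (0.14, 0.05),
    10000: (3.34, 5.17),
    20000: (23.52, 44.05),
}


def speedup(parallel_seconds, serial_seconds):
    return serial_seconds / parallel_seconds


for count, (parallel_s, serial_s) in timings.items():
    print("%d numbers: speedup %.2fx" % (count, speedup(parallel_s, serial_s)))
# 1000 numbers: speedup 0.36x
# 10000 numbers: speedup 1.55x
# 20000 numbers: speedup 1.87x
```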
