Based on **Francesco Pierfederici: Distributed Computing with Python, Chapter 3**

# Multiple processes
Traditionally, the way Python programmers have worked around the GIL and its 
effect on CPU-bound threads has been to **use multiple processes instead of multiple 
threads**. 

This approach (multiprocessing) has some **disadvantages**: 
having to launch multiple instances of the Python interpreter with all the 
**startup time and memory usage penalties** that this implies

In [1]:
#We start by importing the modules we need from the Standard Library (that is, threading, queue, and urllib.request).

from time import time

In [2]:
import psutil
psutil.cpu_count(logical=False)

4

In [3]:
psutil.cpu_count(logical=True)

4

Using **multiple processes to execute tasks in parallel has some nice properties.** 

Multiple processes have their **own memory space** and they also allow us to **(more) easily transition from a single-machine 
architecture to a distributed application**, where one would have to use multiple 
processes (on different machines) anyway.

There are two main modules in the Python Standard Library that we can use to 
implement process-based parallelism, and both of them are truly excellent. One is 
called **multiprocessing** and the other is **concurrent.futures**. 

The concurrent.futures module is built on top of multiprocessing and the threading module and 
provides a powerful high-level interface to them.

### Example for concurrent.futures multiprocessing

In [4]:
import concurrent.futures as cf

In [5]:
def fib(n):
    if n <= 2:
        return 1
    elif n == 0:
        return 0
    elif n < 0:
        raise Exception('fib(n) is undefined for n < 0')
        return fib(n - 1) + fib(n - 2)

In [6]:
fibnum=34
workernum=4
[fibnum] * workernum

[34, 34, 34, 34]

In [7]:
def runprocesses(workernum,fibnum):
    t0 = time();
    
    with cf.ProcessPoolExecutor(max_workers=workernum) as pool:
            results = pool.map(fib, [fibnum] * workernum) #run the fib function on each element of [34,34,34,34] in a parallel way
    
    dt = time() - t0; 
    print(dt)

We used the **ProcessPoolExecutor** class exported by concurrent.futures. 

This is one of the two main classes exported by 
the module, the other being **ThreadPoolExecutor**, which is used to create a **pool of 
threads**, instead of a **pool of processes**.

In [8]:
runprocesses(1,34)
runprocesses(2,34)
runprocesses(3,34)
runprocesses(4,34)
print('***************')
runprocesses(8,34)
runprocesses(16,34)
runprocesses(32,34)


0.009010553359985352
0.008515596389770508
0.010818719863891602
0.013442516326904297
***************
0.02607274055480957
0.05191349983215332
0.10563778877258301


Both, **ProcessPoolExecutor and ThreadPoolExecutor have the same API**: they have three main 
methods, which are as follows:

• **submit(f, *args, **kwargs)**: This is used to schedule an **asynchronous 
call** to f(*args, **kwargs) and return a **Future instance as a result** 
placeholder.

• **map(f, *arglist, timeout=None)**: This is the equivalent 
to the built-in map(f, *arglist) method. It returns a **list of Future objects** 
rather than a list of actual results, as map would do.

The third method, **shutdown(wait=True)** is used to **free the resources** used by the Executor object as soon as all currently scheduled functions are done. 

It waits (if wait=True) until that happens. 

A **Future instance** is a **placeholder** for the **result of an asynchronous call**. We can check 
whether the call is still running, whether or not it raised an exception, and so on.  

We call a Future instance result() method to access (with an optional timeout) its value once it is ready.

In [9]:
from concurrent.futures import ProcessPoolExecutor

In [10]:
pool = ProcessPoolExecutor(max_workers=1)
fut = pool.submit(fib, 38)
fut.running()

True

In [11]:
fut.done()

True

We saw how to use the concurrent.futures package to create a worker pool (using the ProcessPoolExecutor class) and submit 
work to it (pool.submit(fib, 38)). As we expect, submit returns a Future object 
(fut in the preceding code), which is a placeholder for a result that is not yet available.