# Multiprocessing

## How to get the number of CPUs in Python?

The Python standard library provides two ways:

 * `multiprocessing.cpu_count()` function
 * `os.cpu_count()` function.

Let's try one of these.
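
As a minimal sketch, both functions can be called side by side. The extra `os.sched_getaffinity` check is an addition that is not part of the exercise and only exists on some platforms (Linux in particular), so it is guarded with `hasattr`.

```python
import multiprocessing
import os

# Both calls report the number of logical CPUs of the machine.
n_from_multiprocessing = multiprocessing.cpu_count()
n_from_os = os.cpu_count()  # may return None if the count cannot be determined

print(f'multiprocessing.cpu_count() = {n_from_multiprocessing}')
print(f'os.cpu_count() = {n_from_os}')

# Where available, the set of CPUs this process is allowed to run on can be
# smaller, for example when the process is pinned to a subset of CPUs.
if hasattr(os, 'sched_getaffinity'):
    n_usable = len(os.sched_getaffinity(0))
    print(f'CPUs usable by this process = {n_usable}')
```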

In [None]:
# TODO

## How to use multiprocessing.Process in a for-loop?

Let's define this simple task function:

In [None]:
from time import sleep
from random import random

# execute a task
def task(arg):
    # block during 1 second
    sleep(1)
    # report a message
    print(f'Task {arg} done', flush=True)

In [None]:
%%time 
task(1)

**Sequential execution**

In [None]:
%%time
for i in range(20):
    task(i)

# report that all tasks are completed
print('Done', flush=True)

**Parallel execution**

In the loop above, the calls to the `task` function are all independent of each other. This makes it an embarrassingly parallel problem.

Try to use `multiprocessing.Process` to parallelize the loop. To do so, create a `multiprocessing.Process` for each task and call the `start()` method on each process to run it. Do not forget to wait for all the processes to complete by calling `join()` on each of them.
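
One possible shape for the start/join pattern is sketched below. The names `sleepy_task` and `run_in_processes` are hypothetical stand-ins, the delay is shortened, and the sketch assumes the `fork` start method is available (POSIX systems).

```python
import multiprocessing
from time import perf_counter, sleep


def sleepy_task(arg, delay):
    # Hypothetical stand-in for the notebook's task: block, then report.
    sleep(delay)
    print(f'Task {arg} done', flush=True)


def run_in_processes(n_tasks, delay):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    processes = [ctx.Process(target=sleepy_task, args=(i, delay))
                 for i in range(n_tasks)]
    start = perf_counter()
    # Start every process first so that they all run concurrently...
    for process in processes:
        process.start()
    # ...and only then wait for each of them to finish.
    for process in processes:
        process.join()
    elapsed = perf_counter() - start
    return elapsed, [process.exitcode for process in processes]


if __name__ == '__main__':
    elapsed, exit_codes = run_in_processes(n_tasks=4, delay=0.2)
    print(f'Done in {elapsed:.2f} s, exit codes: {exit_codes}', flush=True)
```

Calling `join()` inside the same loop as `start()` would wait for each process before starting the next one, which makes the execution sequential again.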

In [None]:
%%time
import multiprocessing

# TODO

# report that all tasks are completed
print('Done', flush=True)

The loop should run in just over 1 second, even if you don't have 20 CPUs on your machine. This is because the task only sleeps and uses almost no CPU: while a process is blocked, the operating system scheduler lets other processes run. As you can see, the CPU user time is very low, which shows that the loop consumes almost no CPU time.

Let's start again by defining a task that actually consumes CPU. 

In [None]:
from time import sleep, perf_counter
from random import random

# execute a task
def task_cpu(arg):
    x = random()
    t = perf_counter()
    while (perf_counter() - t) < 10.0:  # keep computing for 10 seconds
        x *= x
    # report a message
    print(f'Task {arg} done', flush=True)

Let's run one task. Look at the output of `%%time`. What happened?

In [None]:
%%time
task_cpu(1)

You can see now that the CPU user time is close to 10 seconds. This means that the task actually uses 100% of one CPU for 10 seconds.

Let's try to parallelize the for-loop with the new task function. Try to interpret the wall time of the loop execution depending on the number of CPUs on your machine.

In [None]:
%%time 
import multiprocessing

# TODO

# report that all tasks are completed
print('Done', flush=True)

## How to use multiprocessing.Pool in a for-loop?

In the previous example we created one process for each task, even though we may have far fewer CPU cores in our system. Creating a process takes time. To be more efficient, we can use a **process pool**.

A process pool is a programming pattern for automatically managing a pool of worker processes.

The pool is responsible for a fixed number of processes.
 * It controls when they are created, such as when they are needed.
 * It also controls what they should do when they are not being used, such as making them wait without consuming computational resources.

Try to parallelize the for-loop example with `multiprocessing.Pool`. You can use `map` to execute tasks.
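
The pool pattern described above could look roughly like this. `slow_identity` and `run_with_pool` are hypothetical names, the task is shortened to 0.1 second, and the sketch assumes the `fork` start method is available (POSIX systems).

```python
import multiprocessing
from time import sleep


def slow_identity(arg):
    # Hypothetical stand-in for the notebook's task: block briefly,
    # then hand the argument back so the result order can be checked.
    sleep(0.1)
    return arg


def run_with_pool(n_tasks, processes=None):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    # processes=None lets the pool choose its own number of workers.
    with ctx.Pool(processes=processes) as pool:
        # map() blocks until every task is done and keeps the input order.
        return pool.map(slow_identity, range(n_tasks))


if __name__ == '__main__':
    print(run_with_pool(8))
    print(run_with_pool(8, processes=2))
```

The same worker processes are reused for all tasks, so the cost of creating a process is paid once per worker rather than once per task.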

In [None]:
%%time

# TODO

# report that all tasks are completed
print('Done', flush=True)

Without arguments, the process pool starts one worker process per available CPU. Try to configure the number of worker processes.

In [None]:
%%time

# TODO

# report that all tasks are completed
print('Done', flush=True)

### Use the `concurrent.futures` interface

The `concurrent.futures` module provides a high-level interface for asynchronously executing callables.
The asynchronous execution can be performed with separate processes, using `ProcessPoolExecutor`.

Try to parallelize the for-loop with `concurrent.futures`.
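
A possible sketch with `ProcessPoolExecutor` follows. The names `slow_double` and `run_with_executor` are hypothetical, and the `mp_context` argument is only there to request the `fork` start method, which is assumed to be available (POSIX systems).

```python
import concurrent.futures
import multiprocessing
from time import sleep


def slow_double(arg):
    # Hypothetical stand-in for the notebook's task.
    sleep(0.1)
    return 2 * arg


def run_with_executor(n_tasks, max_workers=None):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    with concurrent.futures.ProcessPoolExecutor(
            max_workers=max_workers, mp_context=ctx) as executor:
        # executor.map() yields the results lazily, in input order.
        return list(executor.map(slow_double, range(n_tasks)))


if __name__ == '__main__':
    print(run_with_executor(8, max_workers=4))
```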

In [None]:
%%time 
import concurrent.futures

# TODO

# report that all tasks are completed
print('Done', flush=True)

## Synchronization between processes

### Merge for-loop result

Let's define another task function. This time the function returns a value.

In [None]:
from time import sleep
from random import random

# execute a task
def task_with_return(arg):
    x = random()
    # block during 1 second
    sleep(1)
    # report a message
    print(f'Task {arg} x = {x}', flush=True)
    # return the value
    return x

In [None]:
task_with_return(1)

**Sequential execution**

In [None]:
%%time 
result = 0

for i in range(24):
    result += task_with_return(i)

print(f'result = {result}')

**Parallel execution**

In [None]:
%%time

# TODO

### Lock

Let's define a task that uses a lock. Only one process can hold the lock at any time. As long as the process holding the lock has not released it, no other process can acquire it.

In [None]:
from time import sleep

def task_with_lock(arg, lock):
    with lock:
        print(f'Task {arg} got the lock')
        sleep(1)
        print(f'Task {arg} is releasing the lock')

Try to parallelize a for-loop. Use `multiprocessing.Manager` to create a lock.
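
As a sketch of how a manager lock can be shared with pool workers: the lock created by the manager is a proxy object, so it can be passed to the tasks as an ordinary argument. `locked_task` and `run_with_lock` are hypothetical names, and the `fork` start method is assumed to be available (POSIX systems).

```python
import concurrent.futures
import multiprocessing
from time import perf_counter, sleep


def locked_task(arg, lock, delay):
    # Only one process at a time can be inside this block.
    with lock:
        sleep(delay)
        return arg


def run_with_lock(n_tasks, delay):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    start = perf_counter()
    with ctx.Manager() as manager:
        lock = manager.Lock()  # proxy to a lock living in the manager process
        with concurrent.futures.ProcessPoolExecutor(
                max_workers=4, mp_context=ctx) as executor:
            futures = [executor.submit(locked_task, i, lock, delay)
                       for i in range(n_tasks)]
            results = [future.result() for future in futures]
    return results, perf_counter() - start


if __name__ == '__main__':
    results, elapsed = run_with_lock(n_tasks=4, delay=0.2)
    print(f'results = {results}, elapsed = {elapsed:.2f} s')
```

Because the whole task body sits inside the lock, the tasks run one after another and the elapsed time is at least `n_tasks * delay`, even with several workers.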

In [None]:
%%time
import multiprocessing
import concurrent.futures

# TODO

## Handle exceptions

Let's define a task that randomly raises an exception.

In [None]:
from time import sleep
from random import random

# execute a task
def task_with_exc(arg):
    # block during 1 second
    sleep(1)
    # Exception
    if random() < 0.5:
        raise Exception(f"Task {arg} error")
    # report a message
    return f'Task {arg} done'

In [None]:
%%time 
result = 0

try:
    for i in range(24):
        task_with_exc(i)
except Exception as e:
    print(e)
print(f'result = {result}')

With `map`, it is not possible to know which task has failed.

In [None]:
%%time
import concurrent.futures

with concurrent.futures.ProcessPoolExecutor(max_workers=8) as executor:
    results = executor.map(task_with_exc,range(24))
    try:
        for result in results:
            print(result)
    except Exception:
        print("Error during execution")


Try to parallelize the loop with `submit()`. You can use `as_completed()` to gather the results.
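
A minimal sketch of the `submit()` / `as_completed()` pattern is shown here. To keep it reproducible, the hypothetical `flaky_task` fails on odd arguments instead of failing randomly. `run_and_collect` is also a hypothetical name, and the `fork` start method is assumed to be available (POSIX systems).

```python
import concurrent.futures
import multiprocessing


def flaky_task(arg):
    # Hypothetical deterministic variant: odd arguments fail.
    if arg % 2 == 1:
        raise ValueError(f'Task {arg} error')
    return f'Task {arg} done'


def run_and_collect(n_tasks):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    done, failed = {}, {}
    with concurrent.futures.ProcessPoolExecutor(
            max_workers=4, mp_context=ctx) as executor:
        # Remember which argument each future belongs to.
        future_to_arg = {executor.submit(flaky_task, i): i
                         for i in range(n_tasks)}
        for future in concurrent.futures.as_completed(future_to_arg):
            arg = future_to_arg[future]
            try:
                # result() re-raises the exception raised in the worker.
                done[arg] = future.result()
            except ValueError as exc:
                failed[arg] = str(exc)
    return done, failed


if __name__ == '__main__':
    done, failed = run_and_collect(6)
    print('succeeded:', sorted(done))
    print('failed:', sorted(failed))
```

Since each future is handled on its own, one failing task does not hide the results of the others, and the mapping from future to argument tells which task failed.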

In [None]:
%%time

import concurrent.futures

# TODO

## Use shared memory

`multiprocessing.Array` lets you define an array in shared memory, so that every process can access the same array.

Try to fill a shared array with the ID of each task.
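
One way this could look is sketched below, with the array passed to each process as an argument rather than read from a global variable. `write_index` and `fill_shared_array` are hypothetical names, and the `fork` start method is assumed to be available (POSIX systems).

```python
import multiprocessing


def write_index(shared, i):
    # Each process writes its own task id into its own slot.
    shared[i] = i


def fill_shared_array(n):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    shared = ctx.Array('i', n)  # n C integers, initialized to zero
    processes = [ctx.Process(target=write_index, args=(shared, i))
                 for i in range(n)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    # Slicing copies the shared content into a regular Python list.
    return shared[:]


if __name__ == '__main__':
    print(fill_shared_array(8))
```

The writes made in the child processes are visible in the parent because the array lives in shared memory, unlike a regular Python list, which each process would copy.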

In [None]:
import multiprocessing
    
def task_array(i):
    a[i] = i

# TODO

### More exercises

#### Fibonacci number

In [None]:
# calculate the nth fibonacci number
def fibonacci(n):
    # check for the start of the sequence
    if n <= 1:
        return n
    return (fibonacci(n-1) + fibonacci(n-2))

In [None]:
for i in range(20):
    print(f'f({i}) = {fibonacci(i)}')

1. Calculate Fibonacci numbers iteratively
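
For reference, an iterative version could be sketched as follows. `fibonacci_iter` is a hypothetical name, and the sketch assumes `n` is a non-negative integer.

```python
def fibonacci_iter(n):
    # Assumption: n is a non-negative integer.
    # a and b hold two consecutive Fibonacci numbers, starting at f(0), f(1).
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == '__main__':
    for i in range(10):
        print(f'f({i}) = {fibonacci_iter(i)}')
```

This needs only `n` additions, whereas the recursive version recomputes the same values many times.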

In [None]:
# TODO

In [None]:
for i in range(20):
    print(f'f({i}) = {fibonacci(i)}')

2. Parallelize 

In [None]:
# TODO

#### Count numbers

Use the cell below to generate 20 files containing numbers.

In [None]:
import os
import numpy as np
for i in range(20):
    with open(os.path.join(os.getcwd(), f'numbers_{i}.txt'), 'w') as f:
        f.write('\n'.join(map(str, [np.random.randint(0, 100) for _ in range(1000)])))

Try to calculate the number of occurrences of each number.
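
A possible approach is to count each file in a separate worker and merge the partial counts in the parent. `count_file` and `count_all` are hypothetical names, the files are assumed to contain one integer per line, and the `fork` start method is assumed to be available (POSIX systems).

```python
import glob
import multiprocessing
from collections import Counter


def count_file(path):
    # Count the numbers of one file, skipping empty lines.
    with open(path) as f:
        return Counter(int(line) for line in f if line.strip())


def count_all(paths, processes=None):
    # Assumption: the 'fork' start method exists on this platform.
    ctx = multiprocessing.get_context('fork')
    total = Counter()
    with ctx.Pool(processes=processes) as pool:
        # Each worker returns the Counter of one file; merge them here.
        for partial in pool.map(count_file, paths):
            total.update(partial)
    return total


if __name__ == '__main__':
    paths = sorted(glob.glob('numbers_*.txt'))
    counts = count_all(paths)
    print(counts.most_common(5))
```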

In [None]:
# TODO