> Process 1: "You could probably do that in four lines."

> Process 2: "I bet we could do that in more."

### This note introduces the **multiprocessing** module in the Python standard library. We will try out the different process controls it provides.

```python
from multiprocessing import Process, Pool
from multiprocessing import Pipe, Queue, Lock
from multiprocessing import Manager
```

Citation: https://docs.python.org/3/library/multiprocessing.html

## Process

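**Process** is the basic class of the multiprocessing module. Each **Process** object represents work that runs in a separate operating-system process, not a thread; its API deliberately mirrors ```threading.Thread```.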

Use **Process()** to create a child process. The **daemon** attribute determines what happens to the child when the parent process exits. The **daemon** flag must be set before calling _start()_ on the process. If ```process.daemon = True```, the child is terminated when the parent exits. If ```process.daemon = False``` (the default for a process created from the main process), the parent waits for the child to finish before it exits.
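As a minimal sketch of the **daemon** flag, the block below passes it through the **Process** constructor, which is one way of setting it before _start()_. The names ```background_task``` and ```build_worker``` are hypothetical helpers introduced only for this illustration, and the ```if __name__ == "__main__":``` guard is there because some platforms re-import the main module in the child.

```python
import time
from multiprocessing import Process


def background_task(seconds):
    # Hypothetical long-running job, used only for illustration.
    time.sleep(seconds)


def build_worker(seconds, daemon):
    # The daemon flag has to be in place before start();
    # the constructor accepts it as a keyword argument.
    return Process(target=background_task, args=(seconds,), daemon=daemon)


if __name__ == "__main__":
    worker = build_worker(5, daemon=True)
    worker.start()
    print("daemon flag:", worker.daemon)
    # The parent finishes here without calling join(). Because the worker
    # is daemonic, it is terminated as the parent exits instead of keeping
    # the program alive for the full 5 seconds.
    print("exit parent process")
```

Changing ```daemon=True``` to ```daemon=False``` should make the script take about five seconds to finish, since the parent then waits for the child at exit.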

> The ```join()``` method makes the parent process wait until the child process has finished.

In [None]:
import time
def individual_process(sec):
    print("start child process")
    print("waiting for", sec, "seconds")
    time.sleep(sec)
    print("exit child process")
    return sec

from multiprocessing import Process
print("start parent process")
p = Process(target=individual_process, args=(2,))
p.start()
print("before join parent process doesn't wait")
p.join() # block the parent until the child process has exited
print("exit parent process")

> Other attributes and methods of the **Process** class: ```.pid, .name, .daemon, .is_alive(), .join()```
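To see how these values change over the life of a process, here is a small sketch that takes a snapshot before _start()_, while the child runs, and after _join()_. The helpers ```noop``` and ```describe``` are hypothetical names, and ```.exitcode``` is an additional **Process** attribute not listed above.

```python
from multiprocessing import Process


def noop():
    # Hypothetical target that returns immediately.
    pass


def describe(proc):
    # Collect the commonly inspected attributes into one dictionary.
    return {
        "name": proc.name,
        "pid": proc.pid,
        "alive": proc.is_alive(),
        "exitcode": proc.exitcode,
        "daemon": proc.daemon,
    }


if __name__ == "__main__":
    p = Process(target=noop, name="demo-worker")
    print("before start:", describe(p))  # no pid yet, not alive
    p.start()
    print("after start: ", describe(p))  # pid assigned
    p.join()
    print("after join:  ", describe(p))  # not alive, exitcode set
```

Before _start()_ the process has no pid and no exit code; after _join()_ an exit code of 0 indicates that the target returned normally.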

A child process can also start child processes of its own, which gives a nested structure of **Processes**.

In [None]:
def complicated_process(sec):
    print("start x child process")
    child_p = Process(target=individual_process, args=(3,))
    child_p.start()
    child_p.join()
    print("exit x child process")

p = Process(target=complicated_process, args=(2,))
p.start()
p.join()
print("exit parent process")

## Pool
**Pool** manages a fixed number of worker processes and distributes tasks among them, which gives basic data parallelism.

> ```Pool(5)``` starts 5 worker processes.

> ```p.map(task, data)``` splits the input into chunks, hands them to the workers, and blocks until every result is back.

> ```p.map_async(task, data)``` submits the same work but returns immediately with a result object instead of blocking.

In [24]:
def process_task(x):
    print("task: running", x, "seconds")
    return time.sleep(x)

from multiprocessing import Pool
p = Pool(5)
# map() blocks: the parent waits until every task has finished
p.map(process_task, [1,1,1])
# map_async() returns immediately; the tasks run in the background
p.map_async(process_task, [1,1,5])
print("finished sync task")
# close() prevents any further tasks from being submitted to the Pool
p.close()
# join() must come after close(); it waits for the worker processes to exit
p.join()
print("finished async task")

task: running 1 seconds
task: running 1 seconds
task: running 1 seconds
task: running 5 seconds
task: running 1 seconds
task: running 1 seconds
finished sync task
finished async task
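The cell above discards the return values. As a hedged sketch of how results come back, the block below uses a hypothetical task ```square``` and the pool as a context manager; ```map()``` returns a list in input order, while ```map_async()``` returns a result object whose ```get()``` blocks until the work is done.

```python
from multiprocessing import Pool


def square(x):
    # Hypothetical task: any picklable top-level function works here.
    return x * x


if __name__ == "__main__":
    # Leaving the with-block shuts the pool down, so results are
    # collected inside it.
    with Pool(processes=3) as pool:
        # map() blocks and returns the results in input order.
        print(pool.map(square, [1, 2, 3]))    # [1, 4, 9]

        # map_async() returns at once; get() waits for the results.
        pending = pool.map_async(square, [4, 5, 6])
        print(pending.get(timeout=10))        # [16, 25, 36]
```

The ```timeout``` argument is optional; it is used here so that a stuck worker raises an error instead of hanging the parent.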


## Communication between Processes

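### Pipe and Queue
```Pipe()``` returns a pair of connection objects that connect **two** processes. By default the pipe is duplex, so either end can send and receive.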

``` python
conn_1, conn_2 = Pipe()
conn_1.send(message)
conn_2.recv() # receive message
```

**Queue** lets multiple processes exchange messages: any number of producers can ```put()``` items and any number of consumers can ```get()``` them.

``` python
queue = Queue()
queue.put(message)
queue.get()
```

In [14]:
from multiprocessing import Pipe, Queue

conn_1, conn_2 = Pipe()
conn_1.send("pipe: [message] from conn_1")
print(conn_2.recv())
# recv() blocks until a message arrives, so with nothing sent to conn_1 yet this call would hang:
# print(conn_1.recv())
conn_2.send("pipe: [message] from conn_2")
print(conn_1.recv())

queue = Queue()
queue.put("queue: [message] (1)")
queue.put("queue: [message] (2)")
print(queue.get())

pipe: [message] from conn_1
pipe: [message] from conn_2
queue: [message] (1)
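The cell above exercises both ends from a single process. A sketch of the same two tools used between real processes might look like the following; ```format_reply```, ```pipe_worker``` and ```queue_worker``` are hypothetical names chosen for this example.

```python
from multiprocessing import Pipe, Process, Queue


def format_reply(text):
    # Hypothetical helper that builds the child's answer.
    return "echo: " + text


def pipe_worker(conn):
    # Child side of the pipe: read one message, answer it, close the end.
    conn.send(format_reply(conn.recv()))
    conn.close()


def queue_worker(queue, n):
    # Each child puts exactly one result on the shared queue.
    queue.put(n * n)


if __name__ == "__main__":
    # Pipe: one parent talking to one child.
    parent_conn, child_conn = Pipe()
    p = Process(target=pipe_worker, args=(child_conn,))
    p.start()
    parent_conn.send("hello")
    print(parent_conn.recv())    # echo: hello
    p.join()

    # Queue: several children reporting to one parent.
    queue = Queue()
    workers = [Process(target=queue_worker, args=(queue, n)) for n in range(3)]
    for w in workers:
        w.start()
    # Drain the queue before joining; arrival order is not guaranteed,
    # so the results are sorted for display.
    results = sorted(queue.get() for _ in workers)
    for w in workers:
        w.join()
    print(results)               # [0, 1, 4]
```

Reading from the queue before calling _join()_ is deliberate: a child that has put data on a queue may not exit until that data has been consumed.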


## Synchronization between Processes

### Lock

Without a lock, output from the different processes is liable to get all mixed up.

**Try commenting out different parts of the script to see how the execution results change.**

In [17]:
from multiprocessing import Lock

def lock_in(lock, time_span):
    # the with-statement acquires the lock and releases it even if the body raises
    with lock:
        print("process:", time_span)
        time.sleep(time_span)

def simple_process(time_span):
    print("process:", time_span)
    time.sleep(time_span)

lock = Lock()

print("results using lock.")
for i in range(0, 5):
    # each child must hold the lock while it prints and sleeps
    p = Process(target = lock_in, args=(lock, i))
    p.start()
    # join() inside the loop already runs the children one at a time;
    # move it out of the loop to see the lock doing the serializing
    p.join()
print("parent process exits")

results using lock.
process: 0
process: 1
process: 2
process: 3
process: 4
parent process exits
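In the cell above each child is joined before the next one starts, so the lock is never actually contended. The sketch below starts all children first and joins them afterwards, which is the situation a lock is meant for. ```report_lines``` and ```report``` are hypothetical helpers; the short sleep is only there to widen the window in which output could interleave.

```python
import time
from multiprocessing import Lock, Process


def report_lines(worker_id):
    # Hypothetical helper: the two lines one worker wants to print together.
    return ["worker %d: begin" % worker_id, "worker %d: end" % worker_id]


def report(lock, worker_id):
    # Holding the lock keeps this worker's lines adjacent in the output.
    with lock:
        for line in report_lines(worker_id):
            print(line, flush=True)
            time.sleep(0.1)


if __name__ == "__main__":
    lock = Lock()
    workers = [Process(target=report, args=(lock, i)) for i in range(3)]
    # Start every child before joining any of them, so they really do
    # run at the same time and compete for the lock.
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

The order in which the workers obtain the lock is not fixed, but each "begin" line should be followed directly by the matching "end" line. Removing the ```with lock:``` line (and un-indenting its body) lets the pairs interleave.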


## Sharing information between Processes

### Manager
To share data among child processes, create a **Manager()** object. It starts a server process that holds the shared objects and hands out proxies that the children use to read and modify them. A Manager() object supports various datatypes including ```dict(), list(), Condition, Event, Queue, Array``` etc.

In [20]:
from multiprocessing import Manager

def shared_data_process(l, i):
    l.append(i)

manager = Manager()
#create a shared list
database = manager.list(range(0, 5))

for i in range(0, 5):
    p = Process(target=shared_data_process, args = (database, i))
    p.start()
    p.join()
    
print("manager returns shared database")

# the children here run one after another because join() is called inside the loop;
# if they ran concurrently and the order mattered, we would use a lock to synchronize them

print(database)

manager returns shared database
[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
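A shared dictionary works the same way as the shared list. The following is a minimal sketch, assuming hypothetical helpers ```word_length``` and ```record_length```; it uses the manager as a context manager so the server process is shut down when the block ends, and it starts all children before joining them.

```python
from multiprocessing import Manager, Process


def word_length(word):
    # Hypothetical piece of work done by each child.
    return len(word)


def record_length(shared, word):
    # Writing through the proxy updates the dictionary held by the manager.
    shared[word] = word_length(word)


if __name__ == "__main__":
    with Manager() as manager:
        lengths = manager.dict()
        words = ["pipe", "queue", "lock"]
        workers = [Process(target=record_length, args=(lengths, w)) for w in words]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # The children finish in no fixed order, so sort before printing.
        print(sorted(lengths.items()))  # [('lock', 4), ('pipe', 4), ('queue', 5)]
```

Each child writes a different key, so no lock is needed here; updates that read and then modify the same entry would need one.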
