# Exploration on Python Multiprocessing

In [1]:
import multiprocessing as mp
import time
import sys

### Test the blocking of processes
- Normal (non-daemon) process
- Daemon process

`sys.stdout.flush()`: 

Python's standard output is buffered: it collects some of the data written to it before passing that data on to the terminal. Calling `sys.stdout.flush()` forces the buffer to be written out immediately, even if Python would normally wait before doing so. The workers below flush after every `print` so that each line shows up as soon as it is printed.

In [2]:
def sleepy_worker():
    p = mp.current_process()
    print('starting :',p.name, p.pid)
    sys.stdout.flush()
    for i in range(3):
        print('Sleep time:',i)
        sys.stdout.flush()
        time.sleep(2)
    print('Exiting :',p.name, p.pid)
    sys.stdout.flush()

In [3]:
def normal_worker():
    p = mp.current_process()
    print('starting :',p.name, p.pid)
    sys.stdout.flush()
    print('Exiting :',p.name, p.pid)
    sys.stdout.flush()

In [4]:
sleepy = mp.Process(name='sleepy', target=sleepy_worker)
normal = mp.Process(name='normal', target=normal_worker)
sleepy.start()
normal.start()

starting : sleepy 28233
Sleep time: 0
starting : normal 28234
Exiting : normal 28234
Sleep time: 1
Sleep time: 2
Exiting : sleepy 28233


**Conclusion:** The child processes do not block one another.

In [8]:
if __name__ == '__main__':
    sleepy = mp.Process(name='sleepy', target=sleepy_worker)
    sleepy.daemon = True
    normal = mp.Process(name='normal', target=normal_worker)
    sleepy.start()
    time.sleep(2)
    normal.start()

starting : sleepy 28640
Sleep time: 0
Sleep time: 1
starting : normal 28644
Exiting : normal 28644
Sleep time: 2
Exiting : sleepy 28640


**Conclusion:** If this block were run from a `.py` file, there would be no output after `normal` exits, because all of the non-daemon processes (including the main program) exit before the daemon process wakes up from its sleep. In the notebook the daemon still finishes, presumably because the kernel process stays alive.

In other words, the main program waits for its non-daemon children before it exits, but it does not wait for daemon children. When the main process exits, any daemon children that are still running are terminated.

`Process.join(timeout)` blocks the calling process until the child finishes or until `timeout` seconds have passed, whichever comes first; with no timeout it waits until the child exits. Because `join()` blocks, it is usually better to start all the processes you want before joining any of them.
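The start-first, join-later pattern can be sketched as follows. This is a minimal illustration, not a definitive recipe: `nap` and `start_then_join` are hypothetical names introduced here, and the sleep durations are arbitrary.

```python
import multiprocessing as mp
import time


def nap(seconds):
    """Hypothetical worker: sleep for the given number of seconds."""
    time.sleep(seconds)


def start_then_join(durations, timeout=None):
    """Start every process first, then join them one by one.

    Returns one boolean per process: True if it was still alive
    when join() returned (only possible when a timeout was given).
    """
    procs = [mp.Process(target=nap, args=(d,)) for d in durations]
    for p in procs:
        p.start()                  # all children run before any join
    still_alive = []
    for p in procs:
        p.join(timeout)            # blocks until exit or until timeout elapses
        still_alive.append(p.is_alive())
    for p in procs:
        p.join()                   # final cleanup: wait for any stragglers
    return still_alive


if __name__ == '__main__':
    t0 = time.time()
    print(start_then_join([0.5, 0.5, 0.5]))
    # The children sleep concurrently, so the total should be close to
    # 0.5 s plus process start-up time rather than 1.5 s.
    print('elapsed: %.1f s' % (time.time() - t0))
```

Calling `start_then_join([0.5], timeout=0.05)` shows the other side of `join(timeout)`: the call returns after the timeout even though the child is still running, so `is_alive()` is needed to tell the two cases apart.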

`Process.terminate()` kills the child process. **Note:** it is important to `join()` the process after terminating it, to give the background machinery time to update the status of the object to reflect the termination.
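A small sketch of terminating and then joining a child is shown below. The worker `sleep_forever` and the helper `terminate_and_join` are hypothetical names used only for illustration.

```python
import multiprocessing as mp
import time


def sleep_forever():
    """Hypothetical worker that never finishes on its own."""
    while True:
        time.sleep(0.1)


def terminate_and_join():
    """Start a child, kill it, and report its state before and after."""
    p = mp.Process(target=sleep_forever)
    p.start()
    alive_before = p.is_alive()
    p.terminate()                  # ask the OS to kill the child
    p.join()                       # wait until the child has really exited
    return alive_before, p.is_alive(), p.exitcode


if __name__ == '__main__':
    print(terminate_and_join())
```

After the `join()`, `is_alive()` reports `False` and `exitcode` is no longer `None`. The exact exit code is platform dependent, so the sketch does not assume a particular value.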

## Communication between processes

Any picklable object can pass through a `Queue`. A common arrangement uses two queues:

- One `Queue` holds the tasks.
- Another `Queue` holds the results produced by the workers.
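The two-queue arrangement might look like the sketch below. The names `square_worker` and `run_squares`, and the use of `None` as a stop sentinel, are assumptions made for this example rather than anything prescribed by the library.

```python
import multiprocessing as mp


def square_worker(tasks, results):
    """Hypothetical worker: square numbers until a None sentinel arrives."""
    while True:
        n = tasks.get()
        if n is None:              # sentinel: no more work for this worker
            break
        results.put((n, n * n))


def run_squares(numbers, n_workers=2):
    """Square distinct numbers in worker processes; return {n: n*n}."""
    numbers = list(numbers)
    tasks, results = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=square_worker, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(None)            # one sentinel per worker
    # Collect every result before joining, so that no worker is left
    # blocked while trying to flush data into the results queue.
    squares = dict(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return squares


if __name__ == '__main__':
    print(run_squares([1, 2, 3, 4]))
```

Results arrive in whatever order the workers finish, which is why each result carries its input and the sketch collects them into a dictionary.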

In [9]:
mp.cpu_count()

4

**Signal Between Processes**
- Using Event
- `event.wait()` blocks until the event is set
- `event.set()`
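As a minimal sketch of signalling with an `Event`, the child below blocks in `event.wait()` until the parent calls `event.set()`. The function names and the result queue used to report back are illustrative assumptions.

```python
import multiprocessing as mp


def wait_for_signal(event, results):
    """Hypothetical worker: block until the event is set, then report."""
    event.wait()                   # blocks until some process calls set()
    results.put('got signal')


def run_event_demo():
    event = mp.Event()
    results = mp.Queue()
    p = mp.Process(target=wait_for_signal, args=(event, results))
    p.start()
    was_set_before = event.is_set()    # nothing has set the event yet
    event.set()                        # release the waiting child
    message = results.get()
    p.join()
    return was_set_before, message


if __name__ == '__main__':
    print(run_event_demo())
```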

**Controlling Access to Resources**
- Using Lock
- `with lock:`
- `lock.acquire()` and `lock.release()`
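The sketch below uses a `Lock` to protect a shared counter. The counter is created with `mp.Value(..., lock=False)` so that the explicit lock is the only protection; the names `add_many` and `run_lock_demo` are hypothetical.

```python
import multiprocessing as mp


def add_many(counter, lock, n):
    """Hypothetical worker: increment the shared counter n times."""
    for _ in range(n):
        with lock:                 # same as lock.acquire() ... lock.release()
            counter.value += 1     # read-modify-write, must not interleave


def run_lock_demo(n_procs=4, n=1000):
    lock = mp.Lock()
    counter = mp.Value('i', 0, lock=False)   # raw shared int, no built-in lock
    procs = [mp.Process(target=add_many, args=(counter, lock, n))
             for _ in range(n_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == '__main__':
    print(run_lock_demo())
```

Without the `with lock:` line, increments from different processes could overlap and some updates might be lost, so the final count would not be guaranteed to equal `n_procs * n`.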

**Synchronizing Operations**
- Using Condition
- `with condition:`
- `condition.wait()`
- `condition.notify_all()`
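One way to use a `Condition` is sketched below: several consumers wait until the parent marks a shared flag as ready and notifies them. The shared `ready` flag and all function names are assumptions for this example. The flag is re-checked in a loop so that a consumer that starts late, after the notification, does not wait forever.

```python
import multiprocessing as mp


def consumer(cond, ready, results):
    """Hypothetical worker: wait until the parent signals readiness."""
    with cond:
        while not ready.value:     # re-check the flag after every wake-up
            cond.wait()            # releases the lock while waiting
    results.put('consumer saw ready')


def run_condition_demo(n_consumers=2):
    cond = mp.Condition()
    ready = mp.Value('b', 0, lock=False)     # shared flag guarded by cond
    results = mp.Queue()
    procs = [mp.Process(target=consumer, args=(cond, ready, results))
             for _ in range(n_consumers)]
    for p in procs:
        p.start()
    with cond:
        ready.value = 1
        cond.notify_all()          # wake every consumer currently waiting
    messages = [results.get() for _ in procs]
    for p in procs:
        p.join()
    return messages


if __name__ == '__main__':
    print(run_condition_demo())
```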

**Semaphore**

**Manager**

**Namespace**

### Process Pools
- Using `multiprocessing.Pool()`
- `pool.map()` is functionally equivalent to the built-in `map()`, except that individual tasks run in parallel and the results come back as a list.
- Since the pool is processing its inputs in parallel, `close()` and `join()` can be used to synchronize the main process with the task processes to ensure proper cleanup.
- By default `Pool` creates a fixed number of worker processes and passes jobs to them until there are no more jobs. Setting the `maxtasksperchild` parameter tells the pool to replace a worker process with a fresh one after it has completed that many tasks. This can be used to keep long-running workers from consuming ever more system resources.
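The points above can be sketched in a few lines. The function `double`, the helper `run_pool_demo`, and the chosen values for `processes` and `maxtasksperchild` are illustrative assumptions, not recommended settings.

```python
import multiprocessing as mp


def double(x):
    """Hypothetical task function applied to each input."""
    return 2 * x


def run_pool_demo(values, processes=2, maxtasksperchild=2):
    # Each worker is replaced after completing `maxtasksperchild` tasks.
    pool = mp.Pool(processes=processes, maxtasksperchild=maxtasksperchild)
    try:
        doubled = pool.map(double, values)   # results keep the input order
    finally:
        pool.close()               # no more tasks will be submitted
        pool.join()                # wait for the worker processes to exit
    return doubled


if __name__ == '__main__':
    print(run_pool_demo(range(5)))
    print(list(map(double, range(5))))       # same values from built-in map()
```

`join()` must come after `close()` (or `terminate()`); the `try`/`finally` here just makes sure the workers are cleaned up even if a task raises.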