# Multiprocessing Demonstration

![multiprocessing](multiprocessing.jpg)

### Learning Objectives:
* Global Interpreter Lock (GIL)
* What is multiprocessing?
* Speed comparison 
* Multiprocessing vs Threading
* Extending to other technologies

### Global Interpreter Lock (GIL)

Global interpreter lock (GIL) is a mechanism used in computer language interpreters to synchronize the execution of threads so that only one native thread can execute at a time. (https://en.wikipedia.org/wiki/Global_interpreter_lock)

* **In other words, CPython executes Python bytecode in only one thread at a time**
* This protects shared/global data structures inside the interpreter from being corrupted
* To get parallelism for CPU-bound work in Python, we must run multiple jobs as separate processes
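A minimal sketch of how the GIL shows up in practice, assuming a standard CPython build with the GIL enabled. The helper names (`count_down`, `timed`, `run_serial`, `run_threaded`) and the loop size are hypothetical, chosen only for illustration. On such a build, two threads doing pure-Python CPU work typically take about as long as running the same work serially; timings vary by machine, so none are shown here.

```python
import threading
import time


def count_down(n):
    """CPU-bound busy loop in pure Python; returns the number of iterations."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def run_serial(n, jobs=2):
    """Run the busy loop `jobs` times, one after another."""
    return [count_down(n) for _ in range(jobs)]


def run_threaded(n, jobs=2):
    """Run the busy loop in `jobs` threads and wait for all of them."""
    results = [None] * jobs

    def worker(i):
        results[i] = count_down(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    n = 2_000_000  # hypothetical workload size
    _, serial_s = timed(lambda: run_serial(n))
    _, threaded_s = timed(lambda: run_threaded(n))
    print("serial:   {:.2f} s".format(serial_s))
    print("threaded: {:.2f} s".format(threaded_s))
```

If the two timings come out roughly equal, the threads took turns rather than running in parallel.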


### What is multiprocessing?

In [1]:
%%bash 
brew install htop # installing htop



In [1]:
import time
def how_to_fry_eggs_on_my_computer(t):
    '''
    INPUT: INT/FLOAT - time (seconds) to waste electricity
    OUTPUT: STR - message confirming that the process completed
    '''
    start = time.time()
    while time.time() - start < t:
        pass
    return 'One process completed'

This is serial processing. Try running it while monitoring `htop` to watch the activity on each of your cores.

In [3]:
how_to_fry_eggs_on_my_computer(10)

'One process completed'

Observations
* What did you observe?
* How many python processes were running?
* Why did two cores show activity?

Now let's try the same task, but in parallel. The lifecycle of a process is:

* Fork
* Execute 
* Exit
* Reaped by parent
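The four steps above can be sketched with a single `multiprocessing.Process`. This is an illustration only: it assumes a POSIX system where the `fork` start method is available, and `child_task` and `run_lifecycle` are hypothetical names.

```python
import multiprocessing as mp
import os


def child_task():
    """Runs in the child process (the 'execute' step)."""
    print("child pid {} running, parent is {}".format(os.getpid(), os.getppid()))


def run_lifecycle():
    # Assumption: POSIX system where the 'fork' start method is available.
    ctx = mp.get_context("fork")
    proc = ctx.Process(target=child_task)
    proc.start()          # fork: the OS creates the child process
    proc.join()           # wait for the child to exit; joining also reaps it
    return proc.exitcode  # 0 means the child exited cleanly


if __name__ == "__main__":
    print("exit code:", run_lifecycle())
```

A child that has exited but has not yet been joined lingers as a zombie process, which is why the parent's `join` matters.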

In [4]:
from multiprocessing import Pool, cpu_count

n_cpus = cpu_count()
pool = Pool(processes=n_cpus)

print('This machine has {} cpu\'s'.format(n_cpus))

This machine has 4 cpu's


In [5]:
import numpy as np

pool.map(how_to_fry_eggs_on_my_computer, np.ones(n_cpus)*10)

['One process completed',
 'One process completed',
 'One process completed',
 'One process completed']
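The pool created above stays open for the rest of the session. As a hedged sketch, the same pattern can be written with the pool as a context manager so that the worker processes are cleaned up when the block ends. It assumes a POSIX system with the `fork` start method; `spin` is a hypothetical stand-in for `how_to_fry_eggs_on_my_computer`, and the short durations are chosen only to keep the run quick.

```python
import multiprocessing as mp
import time


def spin(seconds):
    """Busy-wait for roughly `seconds`, then report completion."""
    start = time.time()
    while time.time() - start < seconds:
        pass
    return "One process completed"


def run_in_pool(durations, workers=2):
    ctx = mp.get_context("fork")  # assumption: POSIX system
    # Leaving the `with` block shuts the worker processes down.
    with ctx.Pool(processes=workers) as pool:
        return pool.map(spin, durations)


if __name__ == "__main__":
    print(run_in_pool([0.5, 0.5]))
```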

### Speed Comparison

In [6]:
def count_prime(n):
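    '''
    INPUT: INT - positive number to check for divisors
    OUTPUT: INT - number of divisors of n other than 1 and n
                  (for n >= 2, a result of 0 means n is prime)
    Note: despite its name, this counts divisors of n, not primes up to n.
    '''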
    counter = 0
    for i in range(2, n-1):
        if n % i == 0:
            counter += 1
    return counter

numbers = [int(i) for i in [7.7E7, 7.4E7, 7.3E7, 7.1E7]]
start = time.time()
[count_prime(n) for n in numbers]
print('Computation took {} seconds with serial processing'.format(time.time()-start))

Computation took 26.50576901435852 seconds with serial processing


In [7]:
pool = Pool(processes=n_cpus)
start = time.time()
pool.map(count_prime, numbers)
print('Computation took {} seconds with multiprocessing'.format(time.time()-start))

Computation took 15.318376064300537 seconds with multiprocessing
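The two timings reported above can be turned into a speedup and a per-worker efficiency. This is plain arithmetic on the numbers printed by the two cells; the variable names are hypothetical.

```python
serial_seconds = 26.50576901435852      # serial timing reported above
parallel_seconds = 15.318376064300537   # multiprocessing timing reported above
n_workers = 4                           # pool size reported earlier

speedup = serial_seconds / parallel_seconds   # how many times faster
efficiency = speedup / n_workers              # fraction of the ideal 4x

print("speedup:    {:.2f}x".format(speedup))      # speedup:    1.73x
print("efficiency: {:.0%}".format(efficiency))    # efficiency: 43%
```

The result is well below the ideal 4x. Possible explanations, none of them measured here, include the cost of starting processes, uneven job sizes, and logical cores that share physical hardware.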


When should you use multiprocessing? As a rule of thumb, if your code runs long enough that you find yourself asking this question, parallelizing it is probably worth considering.

### Multiprocessing vs Threading

Commit this to memory and you should always be able to pick the best tool:

|Approach |Context |Memory Space | Example Use Case |
|:-----:|:-----:|:-----:|:-----:|
|Threading | I/O bound | Shared | Web scraping (where you're waiting on GET requests) |
|Multiprocessing | CPU bound | Separate | Grid search (where you're limited by the computation) |
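The "Memory Space" column can be illustrated with a small sketch: a thread appends to a list that the parent can see, while a process appends to its own copy. It assumes a POSIX system with the `fork` start method, and `shared_log`, `append_marker`, and `demo_memory_spaces` are hypothetical names.

```python
import multiprocessing as mp
import threading

shared_log = []  # module-level list used to observe memory sharing


def append_marker(marker):
    shared_log.append(marker)


def demo_memory_spaces():
    shared_log.clear()

    # Thread: runs in the parent's memory space, so the append is visible.
    t = threading.Thread(target=append_marker, args=("from thread",))
    t.start()
    t.join()

    # Process: works on its own copy of memory, so the parent's list is untouched.
    ctx = mp.get_context("fork")  # assumption: POSIX system
    p = ctx.Process(target=append_marker, args=("from process",))
    p.start()
    p.join()

    return list(shared_log)


if __name__ == "__main__":
    print(demo_memory_spaces())  # ['from thread']
```

Results from a process therefore have to be sent back explicitly, for example as the return value of `pool.map`.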

In [8]:
import _thread # low-level threading module, used here for demonstration purposes.  Prefer the higher-level `threading` module

def print_time(threadName, delay):
    '''
    INPUT: name of thread as a string, delay time in seconds
    OUTPUT: None, prints time 5 times
    '''
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("{}: {}".format(threadName, time.ctime(time.time())))

_thread.start_new_thread(print_time, ("Thread-1", 2))
_thread.start_new_thread(print_time, ("Thread-2", 4))
_thread.start_new_thread(print_time, ("Thread-3", 1))
_thread.start_new_thread(print_time, ("Thread-4", 5))

123145378021376

Thread-3: Mon Feb 13 09:59:30 2017
Thread-1: Mon Feb 13 09:59:31 2017Thread-3: Mon Feb 13 09:59:31 2017

Thread-3: Mon Feb 13 09:59:32 2017
Thread-2: Mon Feb 13 09:59:33 2017
Thread-1: Mon Feb 13 09:59:33 2017
Thread-3: Mon Feb 13 09:59:33 2017
Thread-4: Mon Feb 13 09:59:34 2017
Thread-3: Mon Feb 13 09:59:34 2017
Thread-1: Mon Feb 13 09:59:35 2017
Thread-2: Mon Feb 13 09:59:37 2017
Thread-1: Mon Feb 13 09:59:37 2017
Thread-4: Mon Feb 13 09:59:39 2017
Thread-1: Mon Feb 13 09:59:39 2017
Thread-2: Mon Feb 13 09:59:41 2017
Thread-4: Mon Feb 13 09:59:44 2017
Thread-2: Mon Feb 13 09:59:45 2017
Thread-4: Mon Feb 13 09:59:49 2017
Thread-2: Mon Feb 13 09:59:49 2017
Thread-4: Mon Feb 13 09:59:54 2017
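For comparison, here is a sketch of the same demonstration using the higher-level `threading` module. It is not the code that produced the output above: the delays and repeat count are shortened to keep the run quick, `run_threads` and `print_lock` are hypothetical names, and a lock is added around `print` so that two lines cannot run together the way they do in one place in that output.

```python
import threading
import time

print_lock = threading.Lock()


def print_time(thread_name, delay, repeats=5):
    """Print the current time `repeats` times, sleeping `delay` seconds before each."""
    for _ in range(repeats):
        time.sleep(delay)
        with print_lock:  # keep each line intact when threads print at once
            print("{}: {}".format(thread_name, time.ctime(time.time())))


def run_threads(delays, repeats=5):
    """Start one thread per delay, wait for all of them, and report liveness."""
    threads = [
        threading.Thread(target=print_time,
                         args=("Thread-{}".format(i + 1), d, repeats))
        for i, d in enumerate(delays)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return [t.is_alive() for t in threads]


if __name__ == "__main__":
    run_threads([0.2, 0.4, 0.1, 0.5], repeats=2)
```

Unlike `_thread.start_new_thread`, a `threading.Thread` object can be joined, so a script can wait for its threads before exiting.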


### Extending to other technologies

|Tool |Distribute term|Consolidate term|Application|
|:--------:|:--------:|:--------:|:--------:|
| multiprocessing | `start` | `join` |Distributing work across processors|
| threading | `run`/`start` | `join` |Running work concurrently with shared memory and less overhead |
| ipyparallel | `scatter` (data) `map` (work) | `gather` |Distributing work across processors or nodes|
| Hadoop | map | reduce|Distributing work across nodes|
| Spark | many | many | Builds on the Hadoop model, scheduling work as a directed acyclic graph (DAG) |
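The distribute/consolidate pairing in the table can be sketched with the standard library alone: `Pool.map` distributes the work and `functools.reduce` consolidates the results. This illustrates the pattern only, not the Hadoop or Spark APIs. It assumes a POSIX system with the `fork` start method, and `square` and `sum_of_squares` are hypothetical names.

```python
import multiprocessing as mp
from functools import reduce
from operator import add


def square(x):
    return x * x


def sum_of_squares(values, workers=2):
    ctx = mp.get_context("fork")  # assumption: POSIX system
    with ctx.Pool(processes=workers) as pool:
        mapped = pool.map(square, values)   # distribute: one call per value
    return reduce(add, mapped, 0)           # consolidate: fold into one number


if __name__ == "__main__":
    print(sum_of_squares(range(10)))  # 285
```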

### Concluding Remarks

* Use parallel processing (multiple processes) for computationally intensive, CPU-bound jobs
* Use threading for I/O-bound jobs, where most of the time is spent waiting
* Use parallel processing when the length of the job offsets the cost of launching a process
* As always, make a minimum viable product and then scale.  
  * In this case, that means writing the code in serial first before parallelizing  
  * "Premature optimization is the root of all evil"
* Have fun debugging :-)
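To see the launch cost mentioned above, one option is to time a deliberately tiny job both serially and through a pool. This is a sketch under stated assumptions (a POSIX system with the `fork` start method; `tiny_job`, `time_serial`, and `time_pool` are hypothetical names). Timings vary by machine, so none are shown.

```python
import multiprocessing as mp
import time


def tiny_job(x):
    """A job so small that process overhead is likely to dominate."""
    return x + 1


def time_serial(values):
    start = time.perf_counter()
    result = [tiny_job(v) for v in values]
    return result, time.perf_counter() - start


def time_pool(values, workers=2):
    ctx = mp.get_context("fork")  # assumption: POSIX system
    start = time.perf_counter()   # timing includes pool start-up and teardown
    with ctx.Pool(processes=workers) as pool:
        result = pool.map(tiny_job, values)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    values = list(range(1000))
    _, serial_s = time_serial(values)
    _, pool_s = time_pool(values)
    print("serial: {:.4f} s".format(serial_s))
    print("pool:   {:.4f} s".format(pool_s))
```

For a job this small the pool is usually the slower option; the balance shifts as each unit of work grows.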