Parallel Processing in Python
====

You will almost always start from the question, "How can I break up this problem into smaller pieces that can run concurrently?"

Once you have an answer, several Python tools can help you implement it.
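As a rough illustration of "breaking a problem into pieces", the sketch below splits a list into chunks, works on each chunk independently, and combines the partial results. The names `chunk`, `sum_of_squares`, and `parallel_sum_of_squares` are hypothetical helpers invented for this example. A thread pool is used only to keep the sketch short, so it shows the decomposition rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    pieces, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        pieces.append(items[start:end])
        start = end
    return pieces


def sum_of_squares(piece):
    """Independent work on one piece; needs nothing from the other pieces."""
    return sum(x * x for x in piece)


def parallel_sum_of_squares(items, n_chunks=4):
    pieces = chunk(items, n_chunks)
    # Each piece becomes its own task; the partial results are combined
    # at the end with an ordinary sum.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(sum_of_squares, pieces))
    return sum(partials)


numbers = list(range(1000))
total = parallel_sum_of_squares(numbers)
print(total)
```

The same split, apply, combine shape carries over to the process-based and Dask-based tools below; only the machinery that runs the pieces changes.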

Multiprocessing
----

In [1]:
import multiprocessing
import time

data = (
    ['a', '.2'], ['b', '.4'], ['c', '.6'], ['d', '.8'],
    ['e', '.1'], ['f', '.3'], ['g', '.5'], ['h', '.7']
)

def mp_worker(data):
    # Each item is a [label, seconds] pair.
    inputs, the_time = data
    print(" Process %s\tWaiting %s seconds" % (inputs, the_time))
    time.sleep(float(the_time))
    print(" Process %s\tDONE" % inputs)
    return inputs + inputs

def mp_handler():
    # Looks up the module-level data and mp_worker at call time, so
    # redefining them later changes what runs.
    # Two worker processes; the with block shuts the pool down afterwards.
    with multiprocessing.Pool(2) as p:
        return p.map(mp_worker, data)

In [2]:
mp_handler()

 Process b	Waiting .4 seconds
 Process a	Waiting .2 seconds
 Process a	DONE
 Process c	Waiting .6 seconds
 Process b	DONE
 Process d	Waiting .8 seconds
 Process c	DONE
 Process e	Waiting .1 seconds
 Process e	DONE
 Process f	Waiting .3 seconds
 Process d	DONE
 Process g	Waiting .5 seconds
 Process f	DONE
 Process h	Waiting .7 seconds
 Process g	DONE
 Process h	DONE


['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh']
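The list above comes back in input order even though the workers finished in a different order. The sketch below contrasts `Pool.map`, which preserves input order, with `Pool.imap_unordered`, which yields results as they complete. The worker `slow_double` and its inputs are hypothetical. The `if __name__ == "__main__":` guard is included because platforms that start workers by spawning a fresh interpreter re-import the main module.

```python
import multiprocessing
import time


def slow_double(item):
    """Sleep for the requested time, then return the label doubled."""
    label, delay = item
    time.sleep(delay)
    return label + label


def main():
    # Hypothetical inputs: 'a' takes longest, 'c' the shortest.
    jobs = [("a", 0.3), ("b", 0.2), ("c", 0.1)]
    with multiprocessing.Pool(3) as pool:
        # map() returns results in input order, whatever the finish order.
        ordered = pool.map(slow_double, jobs)
        # imap_unordered() yields each result as soon as it is ready,
        # so the order here depends on timing.
        as_completed = list(pool.imap_unordered(slow_double, jobs))
    print(ordered)
    print(as_completed)


if __name__ == "__main__":
    main()
```

When order does not matter, consuming results as they complete can let downstream work begin before the slowest task finishes.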

In [3]:
import numpy as np

def mp_worker(power):
    # CPU-bound work: draw 100 million normal samples and raise them to a power.
    d = np.random.randn(100000000)**power
    print("Raising random array to the power of {0}".format(power))
    return d.mean()

data = [1, 2, 12, 15]

In [4]:
mp_handler()

Raising random array to the power of 2
Raising random array to the power of 1
Raising random array to the power of 12
Raising random array to the power of 15


[0.00015697883657137359,
 1.000269545919426,
 10444.348963926968,
 8347.1901402059957]

Threading
----

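Threads are lighter-weight than processes: they share one Python interpreter and its memory, so they can share data directly. But mind the GIL! In standard CPython, the global interpreter lock lets only one thread execute Python bytecode at a time, so threads help most with I/O-bound work rather than CPU-bound work.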
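To make the GIL caveat concrete, here is a minimal sketch in which `time.sleep` stands in for an I/O wait. Sleeping releases the GIL, so the four waits overlap instead of running back to back. The helper names `fake_io` and `run_threaded` are hypothetical, and the exact timing will vary by machine.

```python
import threading
import time


def fake_io(task_id, delay, results):
    """Stand-in for an I/O wait; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    results[task_id] = delay


def run_threaded(delays):
    results = {}
    threads = [
        threading.Thread(target=fake_io, args=(i, d, results))
        for i, d in enumerate(delays)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before reading the results
    return time.perf_counter() - start, results


elapsed, results = run_threaded([0.2, 0.2, 0.2, 0.2])
print("4 waits of 0.2 s took %.2f s in total" % elapsed)
```

Replacing the sleep with a pure-Python computation would be expected to remove most of the benefit, since the threads would then compete for the GIL.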

In [5]:
import threading
import queue

In [6]:
q = queue.Queue()

In [7]:
q.put('foo')

In [8]:
q.put(5)

In [9]:
q.put('even more')

In [10]:
q.get(block=False)

'foo'

In [11]:
def work():
    q.put(np.random.randn(1000))

In [12]:
t = threading.Thread(target=work)

In [13]:
t.start()

In [14]:
q.get()

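5

Note that q.get() returned 5, not the array. A Queue is first-in, first-out, and 5 was the oldest item still waiting. The array that the thread put on the queue now sits behind 'even more'.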
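The queue operations above are the building blocks of a common worker pattern, sketched below under a few assumptions: the `worker` function, the squaring task, and the use of `None` as a stop signal are all choices made for this example, not part of the notebook.

```python
import queue
import threading


def worker(tasks, results):
    """Pull numbers off the task queue until the None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        results.put(item * item)


tasks = queue.Queue()
results = queue.Queue()
n_workers = 2

threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(n_workers)]
for t in threads:
    t.start()

for n in range(5):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one sentinel per worker
for t in threads:
    t.join()

# Results arrive in completion order, so sort before comparing.
squares = sorted(results.get() for _ in range(5))
print(squares)  # [0, 1, 4, 9, 16]
```

Because the queue handles its own locking, the workers need no explicit locks to share it.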

Dask
----

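Higher-level abstractions are available! Dask, for example, splits an array into chunks and schedules the work on those chunks in parallel for you.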

In [15]:
import numpy as np
import dask.array as da

In [17]:
Y = da.random.normal(size=(1000, 1000),
                     chunks=(100, 100))

Y

dask.array<da.rand..., shape=(1000, 1000), dtype=float64, chunksize=(100, 100)>

In [18]:
mu = Y.mean(axis=0)
mu

dask.array<mean_ag..., shape=(1000,), dtype=float64, chunksize=(100,)>

Notice that the computation hasn't actually happened yet. So far mu is only a description of the work to be done; nothing runs until compute() is called.

In [19]:
mu[0].compute()

0.00071773141474024626
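To see why chunking makes a column mean parallelizable, the sketch below does the same reduction by hand in plain NumPy. It is a conceptual illustration only and makes no claim about how Dask implements the operation internally. The names `X`, `partial_sums`, and `mu_chunked` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 1000))
chunk = 100  # chunk edge length, mirroring chunks=(100, 100) above

# Step 1: reduce each 100x100 block independently. These reductions do not
# depend on one another, so a scheduler is free to run them in parallel.
partial_sums = np.zeros((X.shape[0] // chunk, X.shape[1]))
for i in range(0, X.shape[0], chunk):
    for j in range(0, X.shape[1], chunk):
        block = X[i:i + chunk, j:j + chunk]
        partial_sums[i // chunk, j:j + chunk] = block.sum(axis=0)

# Step 2: combine the per-block results into the column means.
mu_chunked = partial_sums.sum(axis=0) / X.shape[0]

# The chunked result agrees with the ordinary NumPy mean.
print(np.allclose(mu_chunked, X.mean(axis=0)))
```

Only the small per-block results need to be held together at the combine step, which is also why this approach can work on arrays too large to fit in memory at once.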

In [21]:
from dask.diagnostics import ProgressBar

with ProgressBar():
    mu = Y.mean().compute()

[########################################] | 100% Completed |  0.2s


In [22]:
mu

0.00076207744231838949