# Parallel Computing in Python

Python natively supports parallel processing using the `multiprocessing` module. Though this module can become arbitrarily complex, it works best in cases where you want to repeat the same task over and over again on large amounts of data.

To get started, we need to import the `Pool` class from the `multiprocessing` module:

In [1]:
from multiprocessing import Pool

The `Pool` class lets you create a "pool" of worker processes that execute the tasks they are given. We can create a pool of processes by telling Python how many processes we want it to have.

In [2]:
if __name__ ==  '__main__':
    pool = Pool(processes=3)

The `multiprocessing` package also has a way to determine how many CPUs (logical cores) your computer has:

In [3]:
from multiprocessing import cpu_count

print(cpu_count())

8


How many CPUs does your computer have?

We can use the number of CPUs our computers have to determine how many processes to put into the Pool. For CPU-bound work like this, it rarely helps to have more processes than cores, because only as many jobs can run at the same time as there are cores.
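As a rough illustration of this rule of thumb, the helper below caps the pool size at both the CPU count and the number of tasks. The name `choose_num_processes` is hypothetical and not part of `multiprocessing`; treat it as a sketch rather than a recommended policy.

```python
from multiprocessing import cpu_count


def choose_num_processes(n_tasks):
    """Hypothetical helper: pick a pool size for n_tasks independent jobs."""
    # Never ask for more workers than CPUs or than there are tasks,
    # and always ask for at least one.
    return max(1, min(cpu_count(), n_tasks))


if __name__ == '__main__':
    print(choose_num_processes(10))
```

The result can be passed straight to `Pool(processes=...)`.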

Now, let's try to do something in parallel. We need to create a task for the processes to do:

In [4]:
def task(x):
    return x**2

And some data to run the task on:

In [5]:
data = [0,1,2,3,4,5,6,7,8,9]

We can now use the `Pool.map` method to pass each of the values in the data list to the task function:

In [6]:
if __name__ == '__main__':
    pool = Pool(processes=3)
    output = pool.map(task, data)

    # Print inside the guard so that output is always defined here
    print(output)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
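For reference, here is a minimal sketch of the same steps as a standalone script. It assumes Python 3.3 or later, where `Pool` can be used in a `with` block so that the worker processes are shut down when the block exits; the function name `square_all` is made up for this example. `Pool.map` returns results in the same order as the inputs, so the output should match what the built-in `map` gives.

```python
from multiprocessing import Pool


def task(x):
    return x**2


def square_all(data, processes=3):
    """Hypothetical wrapper: apply task to every value using a pool."""
    # Leaving the with-block shuts the worker processes down
    with Pool(processes=processes) as pool:
        return pool.map(task, data)


if __name__ == '__main__':
    data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    output = square_all(data)
    # Results come back in input order, just like the built-in map
    print(output == list(map(task, data)))
```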


Let's put it all together, and also add in a timer to test the speed.

In [7]:
from timeit import default_timer as timer

start = timer()
output = []
for x in data:
    output.append(task(x))
stop = timer()

print("Time to run in serial = ", stop - start)

if __name__ == '__main__':
    pool = Pool(processes=3)
    # Time only the map call, not the creation of the pool
    start = timer()
    output = pool.map(task, data)
    stop = timer()

    print("Time to run in parallel = ", stop - start)

Time to run in serial =  0.00018611999999862405
Time to run in parallel =  0.0016886580000026186


Did this produce a speedup? Probably not. There is a significant amount of overhead involved in passing data to and from the processes, much more than the time it takes to run the calculation. For the parallel version to pay off, we need a bigger problem. To see this for yourself, try increasing the size of the data list and check whether you can get a speedup.
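One way to explore this is to time both versions for several input sizes. The sketch below is only a starting point: the sizes are arbitrary, the helper names `time_serial` and `time_parallel` are invented here, and the timings depend heavily on your machine. Like the timings above, it measures only the `map` call, not the creation of the pool.

```python
from multiprocessing import Pool
from timeit import default_timer as timer


def task(x):
    return x**2


def time_serial(data):
    """Return (seconds, results) for a plain loop over data."""
    start = timer()
    output = [task(x) for x in data]
    return timer() - start, output


def time_parallel(pool, data):
    """Return (seconds, results) for pool.map over data."""
    start = timer()
    output = pool.map(task, data)
    return timer() - start, output


if __name__ == '__main__':
    with Pool(processes=3) as pool:
        for n in [10, 1000, 100000]:  # arbitrary example sizes
            data = list(range(n))
            t_serial, _ = time_serial(data)
            t_parallel, _ = time_parallel(pool, data)
            print(n, t_serial, t_parallel)
```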

### A more complicated example

Squaring a number just isn't a big enough operation to outweigh the overhead of sending data to and receiving data from the different processes. Let's now try an example that performs a much more intensive operation on the data: sorting. First we need to create a new task function. Here we will create a NumPy array with N randomly selected elements, and then sort it:

In [8]:
import numpy as np

def task(N):
    rand = np.random.RandomState(42)  # Fixed seed so results are reproducible
    a = rand.rand(N)  # Generate an array of N random values
    a.sort()  # Sort in place; ndarray.sort() returns None
    return a  # Return the sorted array
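A detail worth checking when writing a task like this: `ndarray.sort()` sorts the array in place and returns `None`, while `np.sort()` returns a sorted copy and leaves the original untouched. A small check, using a made-up array size of 5:

```python
import numpy as np

rand = np.random.RandomState(42)  # same fixed seed as in the task
a = rand.rand(5)

sorted_copy = np.sort(a)  # returns a new, sorted array; a is unchanged
result = a.sort()         # sorts a in place and returns None

print(result is None)
print(np.array_equal(a, sorted_copy))
```

Returning the value of `a.sort()` from a task would therefore hand back `None` rather than the sorted array.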

Each task takes the number of elements that should be randomly generated in an array. Let's start with 10 arrays, each with 5 elements:

In [9]:
sizes = np.repeat(5, 10)

print(sizes)

[5 5 5 5 5 5 5 5 5 5]


Now, let's try running the task on the list of sizes, first in serial and then again in parallel:

In [10]:
start = timer()
output = []
for n in sizes:
    output.append(task(n))
stop = timer()

print("Time to run in serial = ", stop - start)

if __name__ == '__main__':
    pool = Pool(processes=3)
    # Time only the map call, not the creation of the pool
    start = timer()
    output = pool.map(task, sizes)
    stop = timer()

    print("Time to run in parallel = ", stop - start)

Time to run in serial =  0.0008135669999944639
Time to run in parallel =  0.0032308970000016757


The parallel version is still slower! Try making the task even bigger by increasing the size of the arrays to be sorted. Can you find an array size for which the parallel computation becomes faster? How much faster can you make it?

What happens if you increase the number of processes you are using? Make a plot of the time it takes to run as a function of the number of processes you are using. How does this plot change if you change N?
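As a starting point for these experiments, the sketch below collects timings for a few pool sizes and leaves the plotting to you. The array size, the number of arrays, and the process counts are arbitrary choices, and `time_pool_map` is a hypothetical helper name. It times only the `map` call, so the cost of starting each pool is not included.

```python
from multiprocessing import Pool
from timeit import default_timer as timer

import numpy as np


def task(N):
    rand = np.random.RandomState(42)  # fixed seed, as in the task above
    a = rand.rand(N)
    a.sort()  # in-place sort
    return a


def time_pool_map(n_processes, sizes):
    """Hypothetical helper: seconds taken by pool.map with n_processes workers."""
    with Pool(processes=n_processes) as pool:
        start = timer()
        pool.map(task, sizes)
        return timer() - start


if __name__ == '__main__':
    sizes = [100000] * 10  # arbitrary: 10 arrays of 100,000 elements each
    timings = {}
    for n_processes in [1, 2, 4]:  # arbitrary process counts to compare
        timings[n_processes] = time_pool_map(n_processes, sizes)
    for n_processes, seconds in timings.items():
        print(n_processes, seconds)
```

The `timings` dictionary maps each process count to a time in seconds, which is the data you need for the plot; repeating the loop for different values in `sizes` shows how the curve changes with N.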