# Parallelization

Python will use a single processor by default.  We can make use of multiple cores in several ways.  Here are some common methods, with examples using the multiprocessing module.

## Multiprocessing Process

The multiprocessing module has a Process class that allows us to spawn individual processes, which may do similar or different things.

In [18]:
import os
from multiprocessing import Process

def squareInput( inputVal ):
    # Square the input and report which process did the work
    output = inputVal ** 2
    pid = os.getpid()
    print( '{0} squared to {1} by pid: {2}'.format( inputVal, output, pid ) )

vals = [1, 2, 3, 4, 5]
procsUsed = []

# Start one process per input value
for valHere in vals:
    procThis = Process( target = squareInput, args = ( valHere, ) )
    procsUsed.append( procThis )
    procThis.start( )

# Wait for every process to finish
for procThis in procsUsed:
    procThis.join( )

1 squared to 1 by pid: 4987
2 squared to 4 by pid: 4989
3 squared to 9 by pid: 4991
4 squared to 16 by pid: 4996
5 squared to 25 by pid: 4999
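
A Process does not hand a return value back to the caller, which is why the example above only prints. One common way to collect results is a multiprocessing Queue. The following is a minimal sketch, assuming Python 3; the names square_to_queue and collect_squares are hypothetical and not part of the original notebook.

```python
from multiprocessing import Process, Queue

def square_to_queue(value, results):
    """Square value and put the (value, square) pair on the shared queue."""
    results.put((value, value ** 2))

def collect_squares(values):
    """Run one process per value and gather the results into a dict."""
    results = Queue()
    workers = [Process(target=square_to_queue, args=(v, results)) for v in values]
    for worker in workers:
        worker.start()
    # Read one result per worker before joining, so that no worker is left
    # blocked trying to flush its data into the queue.
    collected = [results.get() for _ in workers]
    for worker in workers:
        worker.join()
    return dict(collected)

if __name__ == "__main__":
    print(collect_squares([1, 2, 3, 4, 5]))
```

Results arrive in whatever order the processes finish, so the sketch keys each result by its input rather than relying on arrival order.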


## Multiprocessing Pool

The multiprocessing module's Pool class lets you map a function over a sequence of inputs, distributing the calls across a pool of worker processes.

Be sure to close the pool of processes when you are done with it - otherwise you may get 'Open File' errors.

In [7]:
from multiprocessing import Pool

def fxnSer( inputVal ):
    # Square a single input value (avoid shadowing the built-in input)
    return inputVal * inputVal

procs = Pool( 2 )
outs = procs.map( fxnSer, range( 10 ) )
print( outs )
procs.close( )

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
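
In Python 3, a Pool can also be used as a context manager, so the cleanup cannot be forgotten. The sketch below assumes Python 3; square and parallel_squares are hypothetical names chosen for illustration.

```python
from multiprocessing import Pool

def square(value):
    """Square a single input value."""
    return value * value

def parallel_squares(values, workers=2):
    """Map square over values using a pool that is cleaned up automatically."""
    # Leaving the with-block calls pool.terminate(). That is safe here because
    # map() has already returned every result by the time the block exits.
    with Pool(workers) as pool:
        return pool.map(square, values)

if __name__ == "__main__":
    print(parallel_squares(range(10)))
```

Note that the context manager terminates the pool rather than calling close(), so it fits blocking calls such as map; work submitted asynchronously should be collected before the block ends.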


In [8]:
%%timeit
procs = Pool( 1 )
outs = procs.map(fxnSer, range(1000))
procs.close()

100 loops, best of 3: 9.77 ms per loop


In [9]:
%%timeit
procs = Pool( 2 )
outs = procs.map( fxnSer, range( 1000 ) )
procs.close( )

10 loops, best of 3: 28.1 ms per loop


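There is overhead associated with creating the parallel processes and passing data to them.  The work done in the function needs to be compute intensive enough to outweigh that overhead: here, squaring 1000 numbers took 9.77 ms with one process and 28.1 ms with two.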
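
To see where the balance tips, we can time a heavier function both ways. This is a rough sketch, assuming Python 3; sum_of_squares, time_call, run_serial and run_parallel are hypothetical helpers, and the task size is an arbitrary choice. The timings will vary from machine to machine.

```python
import time
from multiprocessing import Pool

def sum_of_squares(limit):
    """Deliberately heavy work: sum i*i for every i below limit."""
    total = 0
    for i in range(limit):
        total += i * i
    return total

def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start

def run_serial(tasks):
    return [sum_of_squares(task) for task in tasks]

def run_parallel(tasks, workers):
    with Pool(workers) as pool:
        return pool.map(sum_of_squares, tasks)

if __name__ == "__main__":
    tasks = [500000] * 8  # eight equally sized tasks
    serial, t_serial = time_call(run_serial, tasks)
    parallel, t_parallel = time_call(run_parallel, tasks, 2)
    assert serial == parallel  # both approaches must agree
    print("serial:   {:.3f} s".format(t_serial))
    print("parallel: {:.3f} s".format(t_parallel))
```

Shrinking the task size toward the trivial squaring used above should make the parallel version lose again, since process startup then dominates.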

Additionally, not all compute resources will provide multiple CPUs - know the limitations of the Jupyter environment you are running in.  (Also, just because we have access to the CPUs doesn't mean we can use all of them when other tasks and users are active on the system.)

In [39]:
import multiprocessing as mp
print( mp.cpu_count( ) )

4
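
The CPU count can be used to cap the size of a pool instead of hard-coding it. Below is a minimal sketch, assuming Python 3; usable_cpu_count and pick_worker_count are hypothetical helpers, and the reserve parameter is an illustrative way of leaving cores free for other tasks and users.

```python
import multiprocessing as mp
import os

def usable_cpu_count():
    """Number of CPUs this process may actually run on."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux);
    # fall back to the total CPU count elsewhere.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return mp.cpu_count()

def pick_worker_count(requested, reserve=0):
    """Clamp requested to the usable CPUs, keeping reserve of them free."""
    available = max(1, usable_cpu_count() - reserve)
    return max(1, min(requested, available))

if __name__ == "__main__":
    print("usable CPUs:", usable_cpu_count())
    print("workers to start:", pick_worker_count(8, reserve=1))
```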


We can also use multiprocessing on file input by breaking a file up into sections that are handed to individual processes.  Here, we read in the complete works of Shakespeare (available at: https://ocw.mit.edu/ans7870/6/6.006/s08/lecturenotes/files/t8.shakespeare.txt).  The third argument to pool.map is the chunk size, i.e. the number of lines sent to a worker at a time.

In [40]:
from multiprocessing import Pool

def readLine( line ):
    return "%s" % line

procs = 4
pool = Pool( procs )
with open( 't8.shakespeare.txt' ) as skspFile:
    fullText = pool.map( readLine, skspFile, procs)

pool.close()

In [30]:
%%timeit
procs = 1
pool = Pool( procs )
with open( 't8.shakespeare.txt' ) as skspFile:
    fullText = pool.map( readLine, skspFile, procs)

pool.close()

1 loop, best of 3: 4.19 s per loop


In [31]:
%%timeit
procs = 2
pool = Pool( procs )
with open( 't8.shakespeare.txt' ) as skspFile:
    fullText = pool.map( readLine, skspFile, procs)

pool.close()

1 loop, best of 3: 1.99 s per loop


In [32]:
%%timeit
procs = 3
pool = Pool( procs )
with open( 't8.shakespeare.txt' ) as skspFile:
    fullText = pool.map( readLine, skspFile, procs)

pool.close()

1 loop, best of 3: 1.27 s per loop


In [33]:
%%timeit
procs = 4
pool = Pool( procs )
with open( 't8.shakespeare.txt' ) as skspFile:
    fullText = pool.map( readLine, skspFile, procs)

pool.close()

1 loop, best of 3: 911 ms per loop
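
The cells above set the chunk size equal to the number of processes, so lines travel to the workers a few at a time. The chunk size can be chosen independently of the pool size, and the same pattern applies when each line needs real work. The following is a sketch, assuming Python 3; count_words and total_words are hypothetical names, and a small temporary file stands in for t8.shakespeare.txt so that it runs without the download.

```python
import os
import tempfile
from multiprocessing import Pool

def count_words(line):
    """Count the whitespace-separated words in one line."""
    return len(line.split())

def total_words(path, workers=2, chunksize=1000):
    """Sum the per-line word counts of a text file using a process pool."""
    with open(path) as handle, Pool(workers) as pool:
        # The parent process reads the lines; map() ships them to the
        # workers in batches of chunksize lines.
        return sum(pool.map(count_words, handle, chunksize))

if __name__ == "__main__":
    # Stand-in input file (an assumption for this sketch only).
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("to be or not to be\nthat is the question\n" * 500)
    try:
        print(total_words(tmp.name, workers=2, chunksize=100))
    finally:
        os.remove(tmp.name)
```

Larger chunks reduce the number of messages passed between processes, at the cost of coarser load balancing.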


## Other Parallelization Modules

There are many different parallelization modules for Python.  Here are some of the most commonly used:
* mpi4py: Use MPI commands within Python; must link to existing MPI library
* PyMP: OpenMP for Python
* VecPy: SIMD extensions for vectorization (Python 3 only)

Find more information at: https://wiki.python.org/moin/ParallelProcessing