# Parallel Processing in Python
- Python has a simple threading system (the `threading` module)
- threads run under one process and share memory
- has some concurrent data structures, like queues, locks, semaphores
- roughly similar to Java concurrency facilities
- [doc](https://docs.python.org/3.5/library/threading.html#module-threading)
- generators are a way to do "manually scheduled threads"

In [1]:
import time
import threading

def counter(n):
    for j in range(n):
        print('count is', j)
        time.sleep(1)

t = threading.Thread(target=counter, args=[5])
t.start()

count is 0
count is 1
count is 2
count is 3
count is 4
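
The point about generators being "manually scheduled threads" can be sketched as below. This is only an illustration: the `run_round_robin` helper and the `log` list are names made up for this sketch, not part of any library. Each generator does one step of work and then `yield`s, and a small loop decides which one runs next.

```python
from collections import deque

def counter(name, n, log):
    # Each task does one step of work, then yields to hand control back.
    for j in range(n):
        log.append((name, j))
        print(name, 'count is', j)
        yield

def run_round_robin(tasks):
    # Hypothetical scheduler: run each task for one step, in turn,
    # until every generator is exhausted.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        queue.append(task)  # not finished, put it at the back of the line

log = []
run_round_robin([counter('A', 2, log), counter('B', 3, log)])
```

Unlike real threads, a task here is only ever interrupted at a `yield`, so the interleaving is fully under the program's control.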


# concurrent programming
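- REALLY REALLY hard to get right
- the classic pitfall is a [race condition](https://en.wikipedia.org/wiki/Race_condition): threads read and write shared state without coordination, so the result depends on timing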
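
A minimal sketch of a race condition, assuming a shared counter that several threads increment. The `Counter` class and `run` helper are invented for this illustration. The unsafe version splits the read and the write, so another thread can run in between and an update can be lost. The locked version makes the read-modify-write atomic.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def unsafe_increment(self):
        # Read, then write, with no lock: a thread switch between the
        # two statements can make two threads write the same value.
        current = self.value
        self.value = current + 1

    def safe_increment(self):
        # The lock ensures only one thread does the read-modify-write at a time.
        with self.lock:
            self.value += 1

def run(pick_method, n_threads=4, n_iter=10_000):
    counter = Counter()
    increment = pick_method(counter)

    def worker():
        for _ in range(n_iter):
            increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

expected = 4 * 10_000
unsafe_total = run(lambda c: c.unsafe_increment)
safe_total = run(lambda c: c.safe_increment)

print('expected:', expected)
print('unsafe:  ', unsafe_total, '(may be lower, depends on thread timing)')
print('locked:  ', safe_total)
```

The unsafe total is not guaranteed to be wrong on any given run, which is exactly what makes this kind of bug hard to find.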

# Global Interpreter Lock (GIL)
- CPython solves the concurrency problem by NOT being concurrent!
- The GIL can only be acquired by ONE thread at a time
- No matter how many threads you have, only ONE of them executes Python bytecode at any moment, so in effect only ONE core is used
- Really bad for CPU bound tasks
- GIL is released during I/O, so not so bad for I/O bound tasks
- for CPU bound tasks, you can use separate processes instead of threads
- however, processes are more "heavyweight" than threads, and do not share memory
- can move CPU bound tasks into C - ctypes releases the GIL while a C function called through it is running
- Java and C++ do not have this problem
- Java has excellent concurrency facilities
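
The claim that the GIL is "not so bad" for I/O bound tasks can be checked with a rough timing sketch. Here `time.sleep` stands in for a blocking I/O call (an assumption made for this illustration; like real blocking I/O, it releases the GIL while waiting), and `fake_io` is a made-up name.

```python
import threading
import time

def fake_io(delay):
    # Stand-in for a blocking I/O call; the GIL is released while sleeping.
    time.sleep(delay)

delay = 0.2
n = 5

# Run the waits one after another.
start = time.perf_counter()
for _ in range(n):
    fake_io(delay)
sequential = time.perf_counter() - start

# Run the same waits in n threads at once.
start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=[delay]) for _ in range(n)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print('sequential: %.2fs' % sequential)
print('threaded:   %.2fs' % threaded)
```

The threaded run should take roughly one `delay` instead of `n` of them, since the waits overlap. Replacing `fake_io` with a pure-Python CPU bound loop would not show this kind of speedup under the GIL.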

# multiprocessing module
- run multiple Python processes
- avoids the GIL, but more expensive
- [doc](https://docs.python.org/3.5/library/multiprocessing.html)

In [2]:
from multiprocessing import Pool

def square(x):
    return x*x

# make a pool of 5 worker processes (5 separate Python interpreters)
# the square calls are distributed across the workers, not run in this process

p = Pool(5)
print(p.map(square, [1, 2, 3]))

[1, 4, 9]


In [3]:
p.close()