# Concurrency Issues and Patterns

Once we've sped up our code on a single core, the easiest remaining performance opportunity is to exploit more compute units (additional CPU cores, GPUs, heterogeneous devices, external nodes, etc.).

### Multithreading

Like most modern languages, Python supports multithreading.

Multithreading is easy to access (if not easy to use properly) from an API perspective.

However, there are numerous drawbacks:
* Humans are notoriously bad at managing concurrency and shared state directly
    * Even if you are great at it ... don't you want a new project someday? That means transitioning your thread-based code to someone else (!)
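* Python's famous GIL (or "global interpreter lock") limits the effective concurrency, depending upon what's actually occurring in the threads: in standard CPython only one thread at a time executes Python bytecode, so CPU-bound threads take turns, while threads waiting on I/O (or running native code that releases the GIL) can overlap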
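
To make the shared-state concern concrete, here is a minimal sketch using the standard `threading` module. The names `increment_many` and `run_counter` are hypothetical, invented for this illustration. Several threads update one counter, and the `Lock` is what keeps the read-modify-write step from interleaving; forgetting it is exactly the kind of mistake that is easy to make and hard to debug.

```python
import threading


def increment_many(counter, lock, n):
    """Add 1 to the shared counter n times, holding the lock for each update."""
    for _ in range(n):
        # "counter['value'] += 1" is a read-modify-write sequence, so without
        # the lock two threads could interleave and lose an update.
        with lock:
            counter["value"] += 1


def run_counter(num_threads=4, increments=10_000):
    """Start num_threads threads that all update one shared counter."""
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=increment_many, args=(counter, lock, increments))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish before reading the result
    return counter["value"]


total = run_counter()
print(total)  # 4 threads x 10_000 increments each
```

Note that the lock also serializes the work, so this example gains nothing in speed; it only shows how much bookkeeping the programmer takes on when managing shared state by hand.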

Once again, the concerns depend on the audience and the goal. Python multithreading can be a key tool for infrastructure developers, but is likely not an ideal tool for application developers.

### Multiprocessing

Python also has a high-level interface for managing multiple processes -- running Python or other code.

Multiprocessing sidesteps the GIL bottleneck, since each process runs its own interpreter with its own GIL.

The MP API also provides some helpers for scheduling the various processes as well as sharing data: https://docs.python.org/3/library/multiprocessing.html

At the same time, the MP module doesn't provide distributed scheduling (beyond the local node), sophisticated fault tolerance semantics, or an easy way to exploit heterogeneous compute (specifically GPU cores).
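
As a rough sketch of the local scheduling help the module does offer, the example below uses `multiprocessing.Pool` to spread a CPU-bound function over worker processes. The function names `sum_of_squares` and `parallel_sums` are hypothetical, chosen only for this illustration, and the pool size of 2 is an arbitrary assumption.

```python
from multiprocessing import Pool


def sum_of_squares(n):
    """CPU-bound toy workload: sum of i*i for i in range(n)."""
    return sum(i * i for i in range(n))


def parallel_sums(inputs, processes=2):
    """Run sum_of_squares on each input in a pool of worker processes."""
    # Each worker is a separate Python process with its own GIL, so the
    # calls can run on different cores at the same time.
    with Pool(processes=processes) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing
    # this module, unguarded pool creation would run again in every worker.
    print(parallel_sums([10, 100, 1000]))
```

Arguments and results are pickled and copied between processes, which is part of why this approach stays on one machine and does not reach GPUs or remote nodes.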

### What Would We Like?

Ideally, a library that provides...
* Easy parallelism, with no GIL issues
* Scale to nodes, not just cores
* An event/reactive/future-style API, so that programmers aren't managing locks, semaphores, or shared state
* Some facility for exploiting GPU (or any other compute device addressable in Python)
* Pythonic look & feel, mental model (i.e., not like PySpark)
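
To give a feel for the future-style API on that wish list, here is a small sketch using the standard library's `concurrent.futures`. It is only an illustration of the programming model, not the library this list is describing: it covers neither multiple nodes nor GPUs. The helper names `word_count` and `count_all` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Count whitespace-separated words in one document."""
    return len(text.split())


def count_all(documents, max_workers=4):
    """Submit one task per document and collect results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() returns a Future immediately; no locks or shared state
        # are managed by the caller.
        futures = [executor.submit(word_count, doc) for doc in documents]
        # result() blocks until that particular task has finished.
        return [future.result() for future in futures]


docs = ["a b c", "hello world", ""]
counts = count_all(docs)
print(counts)  # [3, 2, 0]
```

The caller describes what should be computed and asks for results when needed; how the work is scheduled is left to the executor.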
