# Concurrency and Parallelism

**Concurrency is when a computer does many different things seemingly at the same time**. For example, on a computer with one CPU core, the operating system will rapidly change which program is running on the single processor. This interleaves execution of the programs, providing the illusion that the programs are running simultaneously.

**Parallelism is actually doing many different things at the same time**. Computers with multiple CPU cores can execute multiple programs simultaneously. Each CPU core runs the instructions of a separate program, allowing each program to make forward progress during the same instant.

Within a single program, concurrency is a tool that makes it easier for programmers to solve certain types of problems. **Concurrent programs enable many distinct paths of execution to make forward progress** in a way that seems to be both simultaneous and independent.

The key difference between parallelism and concurrency is **speedup**. When two distinct paths of execution in a program make forward progress in parallel, the time it takes to do the total work is cut in half; the speed of execution is faster by a factor of two. In contrast, concurrent programs may run thousands of separate paths of execution seemingly in parallel but provide no speedup for the total work.

**Python makes it easy to write concurrent programs**. Python can also be used to do parallel work through system calls, subprocesses, and C extensions. But **it can be very difficult to make concurrent Python code truly run in parallel**.

## Item 36: Use subprocess to Manage Child Processes



## Item 37: Use Threads for Blocking I/O, Avoid for Parallelism

The standard implementation of Python is called *CPython*. **CPython runs a Python program in two steps**. First, it parses and compiles the source text into bytecode. Then, **it runs the bytecode using a stack-based interpreter**. The bytecode interpreter has state that must be maintained and coherent while the Python program executes. CPython enforces coherence with a mechanism called the **global interpreter lock (GIL)**.
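The two steps can be observed from inside Python itself. The following is a minimal sketch using the built-in `compile` and `exec` functions and the standard `dis` module; the source string and the `namespace` dictionary are made up for illustration, and the exact instruction names printed vary between CPython versions.

```python
import dis

# Hypothetical one-line program used only for illustration.
source = "total = 1 + 2 * x"

# Step 1: parse and compile the source text into a code object holding bytecode.
code = compile(source, "<example>", "exec")

# Step 2: the interpreter runs that bytecode; exec() does so in a given namespace.
namespace = {"x": 10}
exec(code, namespace)
print(namespace["total"])

# List the stack-machine instructions the interpreter executed.
instructions = list(dis.get_instructions(code))
for ins in instructions:
    print(ins.opname, ins.argrepr)
```

The listing typically shows load, arithmetic, and store instructions operating on a value stack, which is the interpreter state the GIL keeps coherent.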

Essentially, the GIL is a mutual-exclusion lock (mutex) that prevents CPython from being affected by preemptive multithreading, where one thread takes control of a program by interrupting another thread.

The GIL has an important negative side effect. With programs written in languages like C++ or Java, having multiple threads of execution means your program could utilize multiple CPU cores at the same time. Although Python supports multiple threads of execution, **the GIL causes only one of them to make forward progress at a time**. This means that when you reach for threads to do parallel computation and speed up your Python programs, you will be sorely disappointed.

In [3]:
from time import time

# computationally intensive
def factorize(number):
    for i in range(1, number+1):
        if number % i == 0:
            yield i
            
numbers = [2139079, 1214759, 1516637, 1852285, 2139079]
start = time()
for n in numbers:
    list(factorize(n))
end = time()
print('Took %.3f seconds' % (end - start))

Took 1.185 seconds


In [6]:
# do the same computation as before, using one thread per number
from threading import Thread

class FactorizeThread(Thread):
    def __init__(self, number):
        super().__init__()
        self.number = number
        
    def run(self):
        self.factors = list(factorize(self.number))
        
numbers = [2139079, 1214759, 1516637, 1852285, 2139079]
start = time()

threads = []
for n in numbers:
    thread = FactorizeThread(n)
    thread.start()
    threads.append(thread)
    
# wait for all threads
for thread in threads:
    thread.join()

end = time()
print('Took %.3f seconds' % (end - start))

Took 1.235 seconds


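What's surprising is that this takes even longer than running `factorize` in serial. The threads take turns holding the GIL, so the work is not divided across CPU cores, and creating and coordinating the threads adds overhead. You may wonder, why does Python support threads at all? There are two good reasons.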

First, multiple threads make it easy for your program to seem like it’s doing multiple things at the same time. Managing the juggling act of simultaneous tasks is difficult to implement yourself (see Item 40: “Consider Coroutines to Run Many Functions Concurrently” for an example). **With threads, you can leave it to Python to run your functions seemingly in parallel**.

The second reason Python supports threads is to **deal with blocking I/O**, which happens when Python does certain types of system calls. System calls are how your Python program asks your computer’s operating system to interact with the external environment on your behalf. Blocking I/O includes things like reading and writing files, interacting with networks, communicating with devices like displays, etc. Threads help you handle blocking I/O by insulating your program from the time it takes for the operating system to respond to your requests.


In [7]:
import select

def slow_syscall():
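    # Blocks in a system call for 0.1 seconds.
    # Passing three empty lists works on Unix but not on Windows.
    select.select([], [], [], 0.1)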
    
start = time()
for _ in range(5):
    slow_syscall()
    
end = time()
print('Took %.3f seconds' % (end - start))

Took 0.531 seconds


In [8]:
# run multiple invocations in separate threads
start = time()
threads = []
for _ in range(5):
    thread = Thread(target=slow_syscall)
    thread.start()
    threads.append(thread)
    
# stand-in for computation done while the system calls are in flight
def compute_loc(i):
    print(i)
    
for i in range(5):
    compute_loc(i)
    
for thread in threads:
    thread.join()
    
end = time()
print('Took %.3f seconds' % (end - start))

0
1
2
3
4
Took 0.107 seconds


The parallel run takes about one-fifth of the serial time (0.107 versus 0.531 seconds). This shows that the system calls will all run in parallel from multiple Python threads even though they’re limited by the GIL. The GIL prevents my Python code from running in parallel, but it has no negative effect on system calls. This works because **Python threads release the GIL just before they make system calls and reacquire the GIL as soon as the system calls are done**.
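To put numbers on the contrast, the speedup of each experiment can be computed as serial time divided by threaded time. This is only a worked calculation on the timings printed above; the `speedup` helper is a hypothetical name introduced here, and the timings will differ from machine to machine.

```python
def speedup(serial_seconds, parallel_seconds):
    """Return how many times faster the threaded run was than the serial run."""
    return serial_seconds / parallel_seconds

# Timings copied from the outputs printed in the cells above.
cpu_bound = speedup(1.185, 1.235)   # factorize: serial vs. threads
io_bound = speedup(0.531, 0.107)    # slow_syscall: serial vs. threads

print('CPU-bound threads:    %.2fx' % cpu_bound)
print('Blocking I/O threads: %.2fx' % io_bound)
```

A value below 1 for the CPU-bound case means the threads made things slower, while the blocking I/O case comes close to the ideal factor of 5 for five threads.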

There are many other ways to deal with blocking I/O besides threads, such as the **asyncio built-in module**, and these alternatives have important benefits. But these options also require extra work in refactoring your code to fit a different model of execution (see Item 40: “Consider Coroutines to Run Many Functions Concurrently”). **Using threads is the simplest way to do blocking I/O in parallel with minimal changes to your program**.
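As a rough sketch of how little code the threaded approach can need, the standard library's `concurrent.futures.ThreadPoolExecutor` manages the start and join steps for you. This is a swapped-in technique, not the `Thread` subclass used earlier; `blocking_call` is a hypothetical stand-in that uses `sleep` (which releases the GIL while waiting) in place of real I/O.

```python
from concurrent.futures import ThreadPoolExecutor
from time import perf_counter, sleep

def blocking_call(delay):
    # Hypothetical stand-in for a blocking system call such as a network read.
    sleep(delay)
    return delay

delays = [0.1] * 5

start = perf_counter()
# One worker per call, so all five waits can overlap.
with ThreadPoolExecutor(max_workers=len(delays)) as pool:
    results = list(pool.map(blocking_call, delays))
elapsed = perf_counter() - start

print('Took %.3f seconds' % elapsed)
```

Because the waits overlap, the total should be close to the longest single delay rather than the sum of all five, though the exact figure depends on the machine.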

Things to Remember
* Python threads can’t run bytecode in parallel on multiple CPU cores because of the global interpreter lock (GIL).
* Python threads are still useful despite the GIL because they provide an easy way to do multiple things at seemingly the same time.
* Use Python threads to make multiple system calls in parallel. This allows you to do blocking I/O at the same time as computation.

## Item 38: Use Lock to Prevent Data Races in Threads



## Item 39: Use Queue to Coordinate Work Between Threads

