# Chapter 20. Concurrency Models in Python

## A bit of jargon

**concurrency**: The ability to handle multiple pending tasks, making progress one at a time or in parallel (if possible) so that each of them eventually succeeds or fails. A single-core CPU is capable of concurrency if it runs an OS scheduler that interleaves the execution of the pending tasks. Also known as multitasking.

**parallelism**: The ability to execute multiple computations at the same time. This requires a multi-core CPU, a GPU, or multiple computers in a cluster.

**process**: An instance of a computer program while it is running, using memory and a slice of the CPU time. Modern operating systems are able to manage multiple processes concurrently, with each process isolated in its own private memory space. Processes communicate via pipes, sockets, or memory-mapped files, all of which can only carry raw bytes, not live Python objects. A process can spawn sub-processes, each called a child process. These are also isolated from each other and from the parent.

**thread**: An execution path within a single process. When a process starts, it uses a single thread: the main thread. Using operating system APIs, a process can create more threads that operate concurrently thanks to the operating system scheduler. Threads share the memory space of the process, which holds live Python objects. This allows easy communication between threads, but can also lead to corrupted data when more than one thread updates the same object concurrently.

**contention**: Dispute over a limited asset. Resource contention happens when multiple processes or threads try to access a shared resource, such as a lock or storage. There is also CPU contention, when compute-intensive processes or threads must wait for their share of CPU time.

**lock**: An object that threads can use to coordinate and synchronize their actions and avoid corrupting data. While updating a shared data structure, a thread should hold an associated lock. This makes other well-behaved threads wait until the lock is released before accessing the same data structure. This simplest type of lock is also known as a mutex (for mutual exclusion).
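
The lock entry above can be illustrated with a minimal sketch. The names `add_many` and `run`, and the dictionary used as a shared counter, are hypothetical and only serve the illustration; the point is that every thread holds the same `threading.Lock` while it updates the shared object.

```python
import threading

def add_many(counter, lock, times):
    """Increment counter['value'] `times` times, holding the lock for each update."""
    for _ in range(times):
        with lock:  # acquired on entry, released on exit, even if an exception is raised
            counter['value'] += 1

def run(n_threads=4, times=10_000):
    counter = {'value': 0}   # shared object living in the process memory
    lock = threading.Lock()  # the mutex associated with `counter`
    threads = [threading.Thread(target=add_many, args=(counter, lock, times))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter['value']

total = run()
print(total)  # 40000
```

Because each update happens while the lock is held, no increment can be lost, whatever the order in which the scheduler interleaves the threads.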

In [2]:
# A concurrent hello world

import itertools
import time

def spin(msg, done):
    for char in itertools.cycle(r'\|/-'):
        status = f'\r{char} {msg}'
        print(status, end='', flush=True)
        if done.wait(.5):
            break
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='')

def slow():
    time.sleep(3)
    return 42

In [3]:
# Supervisor and main functions
from threading import Thread, Event

def supervisor():
    done = Event()
    spinner = Thread(target=spin, args=('thinking', done))
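    print(f'spinner object: {spinner}')

    spinner.start()   # the spinner thread starts animating
    result = slow()   # blocks the main thread for 3 seconds
    done.set()        # tells the spinner loop to exit
    spinner.join()    # waits until the spinner thread finishes
    return result

def main():
    result = supervisor()
    print(f'Answer: {result}')

main()

spinner object: <Thread(Thread-5 (spin), initial)>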
Answer: 42 


### Spinner with multiprocessing

The multiprocessing package supports running concurrent tasks in separate Python processes instead of threads. When you create a `multiprocessing.Process` instance, a whole new Python interpreter is started as a child process in the background.

In [2]:
from multiprocessing import Process, Event
from multiprocessing import synchronize

def process_supervisor():
    done = Event()
    spinner = Process(target=spin, args=('thinking!', done))
    print(f'spinner object: {spinner}')

    spinner.start()
    result = slow()
    done.set()
    spinner.join()
    return result

def process_main():
    result = process_supervisor()
    print(f'Answer: {result}')

process_main()

spinner object: <Process name='Process-1' parent=70264 initial>


Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/spawn.py", line 122, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/spawn.py", line 132, in _main
    self = reduction.pickle.load(from_parent)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: Can't get attribute 'spin' on <module '__main__' (<class '_frozen_importlib.BuiltinImporter'>)>


Answer: 42
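
The traceback above is raised in the child process while it unpickles the `Process` target. A plausible reading, assuming the `spawn` start method (which the `spawn.py` frames suggest): `pickle` stores a function by reference, that is, by module and qualified name, so the child has to look up `spin` in its own `__main__`, and a function defined in a notebook cell is not there. The sketch below uses `math.isqrt` as a stand-in to show what pickling by reference means.

```python
import math
import pickle

payload = pickle.dumps(math.isqrt)

# The payload names the module and the function; it carries no code.
print(b'math' in payload, b'isqrt' in payload)  # True True

# Unpickling imports the module and looks the name up again,
# so the result is the very same function object.
restored = pickle.loads(payload)
print(restored is math.isqrt)  # True
```

In a regular script, defining `spin` at the top level of the module and starting the process under an `if __name__ == '__main__':` guard lets the child re-import the function by name.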


### Spinner with asyncio

It is the job of OS schedulers to allocate CPU time to drive threads and processes. In contrast, coroutines are driven by an application-level event loop that manages a queue of pending coroutines, drives them one by one, monitors events triggered by I/O operations initiated by coroutines, and passes control back to the corresponding coroutine when each event happens.

In [18]:
# Coroutine version of the spinner program
import asyncio

async def spin(msg):
    for char in itertools.cycle(r'\|/-'):
        status = f'\r{char} {msg}'
        print(status, flush=True, end='')
        try:
            await asyncio.sleep(.1)
        except asyncio.CancelledError:
            break
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='')
    
async def slow():
    await asyncio.sleep(3)
    return 42

async def supervisor():
    spinner = asyncio.create_task(spin('thinking!'))
    print(f'spinner object: {spinner}')
    result = await slow()
    spinner.cancel()
    return result

def main():
    result = asyncio.run(supervisor())
    print(f'Answer: {result}')                           

The example above demonstrates the three main ways of running a coroutine:

`asyncio.run(coro())`\
Called from a regular function to drive a coroutine object which usually is the entry point for all the asynchronous code in the program, like the supervisor in this example. This call blocks until the body of `coro` returns. The return value of the `run()` call is whatever the body of `coro` returns.

`asyncio.create_task(coro())`\
Called from a coroutine to schedule another coroutine to execute eventually. This call does not suspend the current coroutine. It returns a `Task` instance, an object that wraps the coroutine object and provides methods to control and query its state.

`await coro()`\
Called from a coroutine to transfer control to the coroutine object returned by `coro()`. This suspends the current coroutine until the body of `coro` returns. The value of the await expression is whatever the body of `coro` returns.
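
The three calls described above can be combined in a short sketch. The coroutines `double` and `entry` are hypothetical names introduced only for this illustration.

```python
import asyncio

async def double(x):
    await asyncio.sleep(0)  # yield to the event loop once
    return x * 2

async def entry():
    task = asyncio.create_task(double(10))  # scheduled; entry() is not suspended here
    direct = await double(1)                # suspends entry() until double(1) returns
    scheduled = await task                  # awaiting a Task yields the coroutine's result
    return direct, scheduled

result = asyncio.run(entry())  # blocks the caller until entry() returns
print(result)  # (2, 20)
```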


### The real impact of the GIL

In [3]:
import math

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True 
    if n % 2 == 0:
        return False
        
    root = math.isqrt(n)
    for i in range(3, root + 1, 2):
        if n % i == 0: 
            return False
    return True

%timeit is_prime(5_000_111_000_222_021)

1.04 s ± 12.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [7]:
"""
Baseline for comparing sequential, multiprocessing and threading code for CPU-intensive work
"""

from time import perf_counter
from typing import NamedTuple

NUMBERS = (3333333333333333,9999999999999917)

class Result(NamedTuple):
    prime: bool
    elapsed: float

def check(n):
    t0 = perf_counter()
    prime = is_prime(n)
    return Result(prime, perf_counter() - t0)

def main():
    print(f'Checking {len(NUMBERS)} numbers sequentially')
    t0 = perf_counter()
    for n in NUMBERS:
        prime, elapsed = check(n)
        label = 'P' if prime else ' '
        print(f'{n:16} {label} {elapsed:9.6f}s')
    elapsed = perf_counter() - t0  # total wall-clock time for all checks
    print(f'Total time {elapsed:.2f}s')

In [8]:
main()

Checking 2 numbers sequentially
3333333333333333    0.000010s
9999999999999917 P  1.507942s
Total time 1.51s


In [3]:
"""
Multiprocess primality check; imports, types and functions
"""

import sys
from time import perf_counter
from typing import NamedTuple
from multiprocessing import Process, SimpleQueue, cpu_count
from multiprocessing import queues

NUMBERS = (3333333333333333,9999999999999917)

class PrimeResult(NamedTuple):
    n: int
    prime: bool
    elapsed: float

JobQueue = queues.SimpleQueue[int]
ResultQueue = queues.SimpleQueue[PrimeResult]

def check(n):
    t0 = perf_counter()
    res = is_prime(n)
    return PrimeResult(n, res, perf_counter() - t0)

def worker(jobs, results):
    # A 0 taken from the jobs queue is the sentinel that tells the worker to stop.
    while n := jobs.get():
        results.put(check(n))

In [None]:
def main():
    workers = cpu_count()
    print(f'Checking {len(NUMBERS)} numbers with {workers} processes:')

    jobs = SimpleQueue()
    results = SimpleQueue()
    t0 = perf_counter()

    for n in NUMBERS:
        jobs.put(n)

    for _ in range(workers):
        proc = Process(target=worker, args=(jobs, results))
        proc.start()
        jobs.put(0)

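    # Read exactly one result per number. Checking jobs.empty() instead
    # races with the workers and can end the loop early or block forever.
    for _ in NUMBERS:
        n, prime, elapsed = results.get()
        label = 'P' if prime else ' '
        print(f'{n:16} {label} {elapsed:9.6f}s')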

    elapsed = perf_counter() - t0 
    print(f'Total time: {elapsed:.2f}s')    

### Chapter Summary

After a bit of theory, this chapter presented the spinner script implemented in each of Python's three native concurrent programming models:
- Threads, using the `threading` package;
- Processes, using `multiprocessing`;
- Asynchronous coroutines with `asyncio`.

We then explored the real impact of the GIL with an experiment: changing the spinner examples to compute the primality of a large integer and observing the resulting behavior.

- This demonstrated graphically that CPU-intensive functions must be avoided in asyncio, as they block the event loop;
- The threaded version of the experiment worked, despite the GIL, because Python periodically interrupts threads, and the example used only two threads;
- The multiprocessing variant worked around the GIL, starting a new process just for the animation while the main process did the primality check.
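
The first point in the list above can be reproduced with a small sketch, assuming a hypothetical `ticker` task that counts its turns on the event loop and a `cpu_bound` function standing in for the primality check. While `cpu_bound()` runs there is no `await`, so the event loop cannot give the ticker a turn.

```python
import asyncio

async def ticker(state):
    """Count how many times the event loop gives this task a turn."""
    while True:
        state['ticks'] += 1
        await asyncio.sleep(0)

def cpu_bound():
    """Stand-in for a CPU-intensive function; there is no await inside."""
    return sum(i * i for i in range(200_000))

async def demo():
    state = {'ticks': 0}
    task = asyncio.create_task(ticker(state))
    await asyncio.sleep(0.01)         # ticker gets turns while demo() is suspended
    before = state['ticks']
    cpu_bound()                       # nothing else can run on the event loop here
    during = state['ticks'] - before
    await asyncio.sleep(0.01)         # ticker resumes
    after = state['ticks'] - before
    task.cancel()
    return before, during, after

before, during, after = asyncio.run(demo())
print(during)  # 0: the ticker made no progress while cpu_bound() ran
```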

The next example, computing several primes, highlighted the difference between multiprocessing and threading, proving that only processes allow Python to benefit from multicore CPUs. Python’s GIL makes threads worse than sequential code for heavy computations.