# 19 Concurrency Models in Python
Some notes, observations, and questions on chapter 19 of Fluent Python.

- concurrency is the broader term; it can involve parallelism

- what makes concurrency difficult:

    - keeping track of processes and threads: in order to get their return value, we
      need to set up some communication channel (messages and queues)

    - starting threads and processes is not free: the usual workaround is to make each
      thread or process a "worker" that stands by and waits for tasks

    - coroutines are cheaper to start, but they are often started by the asynchronous
      framework, making them harder to keep track of
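
The first two points can be sketched with `concurrent.futures.ThreadPoolExecutor`, which keeps a few worker threads standing by and hands their return values back to the caller. This is a minimal illustration and not an example from the book; the `fetch_length` function and its inputs are made up.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_length(word):
    """Hypothetical task: pretend to wait on I/O, then return a value."""
    time.sleep(0.05)  # stands in for a blocking I/O call
    return len(word)

words = ['spam', 'eggs', 'bacon']

# The pool starts at most 3 worker threads and reuses them for every task.
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() yields results in the order of the inputs, not of completion.
    lengths = list(pool.map(fetch_length, words))

print(lengths)  # [4, 4, 5]
```

The pool hides the communication channel: the return values travel back through `Future` objects that `map()` unwraps for us.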

### A Bit of Jargon

##### Concurrency
- ability to handle multiple pending tasks, making progress on them one at a time or in
  parallel; a single-core CPU can do this if an OS scheduler interleaves the tasks

##### Parallelism
- ability to execute multiple computations at the same time; requires multi-core CPU, a
  GPU, or several computers in a cluster

##### Execution unit
- objects that execute code concurrently, each with independent state and call stack;
  Python natively supports three kinds: processes, threads, and coroutines

##### Process
- running instance with its own memory and slice of CPU time
- communicate via pipes, sockets, or memory mapped files
- Python objects must be serialized (converted) into raw bytes to pass from one process
  to another (not every object is serializable)
- can spawn subprocesses
- OS scheduler schedules all the processes periodically
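
The serialization point can be tried out with `pickle`, the format that `multiprocessing` uses by default to move objects between processes. A small sketch with made-up data, only meant to show one object that serializes and one that does not:

```python
import pickle

# A plain dict of built-in types survives the round trip through raw bytes.
payload = {'task': 'check', 'n': 42}
raw = pickle.dumps(payload)
restored = pickle.loads(raw)
print(type(raw).__name__, restored == payload)

# A lambda has no importable name, so pickle refuses to serialize it.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_ok = True
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    lambda_ok = False
    print(f'cannot pickle lambda: {type(exc).__name__}')
```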

##### Thread
- execution unit within a single process
- upon start each process uses one single thread: the main thread
- threads within a process share the same memory space, which holds Python objects

##### Coroutine
- function that can suspend itself and resume later
- *classic coroutines* are built from generator functions, and *native coroutines* are
  defined with `async def`
- run within a single thread under supervision of an *event loop*
- coroutines support cooperative multitasking: each coroutine must explicitly cede
  control with the `yield` or `await` keyword so that another may proceed concurrently
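
Cooperative multitasking can be made visible with two native coroutines that take turns. This is a sketch with hypothetical names (`worker`, `log`); it is meant to run as a script, since `asyncio.run()` needs to start its own event loop.

```python
import asyncio

async def worker(name, log):
    for i in range(2):
        log.append(f'{name}{i}')
        # sleep(0) does not wait; it only hands control back to the event loop.
        await asyncio.sleep(0)

async def main():
    log = []
    # gather() schedules both coroutines as tasks on the same event loop.
    await asyncio.gather(worker('a', log), worker('b', log))
    return log

order = asyncio.run(main())
print(order)  # ['a0', 'b0', 'a1', 'b1']
```

Each `await` is a point where the coroutine cedes control, which is why the two workers alternate instead of running one after the other.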

##### Queue
- data structure that lets us put and get items, usually in FIFO order: first in, first
  out
- allow separate execution units to exchange application data and control messages, such
  as error codes and signals to terminate
    - Copilot adds: Queues are not strictly necessary for communication between threads,
      but highly recommended. While threads share memory and can technically communicate
      via shared variables, this creates serious problems: **race conditions**
      (unpredictable results when multiple threads read/write the same variable), and
      the need for manual lock management around every shared access. Queues solve this
      because they are **thread-safe** — they handle all locking internally. Critically,
      queues are less about exchanging data objects (which threads can already access)
      and more about **coordinating execution and safely signaling state changes**. This
      is why `queue.Queue` exists: following the principle "Don't communicate by sharing
      memory; share memory by communicating."
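
A minimal sketch of both uses of a queue, application data and a control message, with one worker thread. The `None` sentinel as a stop signal is a common convention, chosen here as an assumption; it is not prescribed by `queue.Queue`.

```python
import queue
import threading

def worker(tasks, results):
    """Square numbers from `tasks` until the sentinel None arrives."""
    while True:
        item = tasks.get()
        if item is None:  # control message: time to stop
            break
        results.put(item * item)

tasks = queue.Queue()
results = queue.Queue()
thread = threading.Thread(target=worker, args=(tasks, results))
thread.start()

for n in (1, 2, 3):
    tasks.put(n)
tasks.put(None)  # signal termination
thread.join()

squares = [results.get() for _ in range(3)]
print(squares)  # [1, 4, 9]
```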

##### Lock
- object that execution units can use to synchronize their actions and avoid corrupting
  data
- if several threads want to write to the same memory address, they need to wait until
  they obtain the lock (other threads need to release it first)
- for instance mutex (mutual exclusion lock)
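
As a sketch of the idea, four threads below increment one shared counter, and a `threading.Lock` makes each read-modify-write step exclusive. The names `state` and `add_many` are made up for this illustration.

```python
import threading

state = {'count': 0}
lock = threading.Lock()

def add_many(times):
    for _ in range(times):
        with lock:  # only one thread at a time may run the next line
            state['count'] += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state['count'])  # 40000
```

The `with lock:` block acquires the lock on entry and releases it on exit, even if the body raises an exception.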

##### Contention
- dispute over a limited resource, for example several threads trying to acquire the
  same lock


#### Processes, Threads, and Python's Infamous GIL
1. each instance of the Python interpreter is a process; additional processes can be
   started with the `multiprocessing` and `concurrent.futures` libraries
    - Each spawned process is a completely separate, independent operating
      system process with its own CPython interpreter instance, memory space, and GIL.

      The child processes are not "aligned" with the main process in any way: they are
      genuinely separate OS processes that run concurrently or in parallel. The main
      process communicates with them via inter-process communication (IPC) such as
      `multiprocessing.Queue`, pipes, or sockets. On a multi-core system, these
      processes can run truly in parallel (unlike threads, which are GIL-limited). On a
      single-core system, the OS scheduler time-slices between them. To distribute work
      across machines, one option is `multiprocessing` with remote managers; another is
      a dedicated framework like `dask`.

2. the Python interpreter uses a single thread to run the user's program and the memory
   garbage collector; but we can start additional threads using the `threading` or the
   `concurrent.futures` module
    - remark: Really?! I thought garbage collection was done in a different thread or
      process; but seems it is not.

3. access to object reference counts and other internal interpreter state is controlled
   by a lock, the Global Interpreter Lock (GIL); only one Python thread can hold the GIL
   at any time, meaning only one thread can execute Python code at any time, regardless
   of the number of CPU cores
    - the GIL exists in CPython and in PyPy, but not in Jython or in IronPython (but
      those two are lagging behind anyway)
    - question: Why do they point out write access of different threads to object
      reference counts so prominently here? I would imagine that's trivial because it's
      only a counter. I imagine that write access to living objects is much more
      complex and also needs to be protected by locks.

4. to prevent any thread from holding the GIL too long, the interpreter pauses the
   running Python thread every 5 ms by default, which releases the GIL

5. several built-in functions (for instance those making a syscall) and some numpy and
   scipy code can release the GIL, too

6. extensions that work at the Python/C API level (as numpy does) can launch non-Python
   threads that are not affected by the GIL

7. network programming and other I/O-heavy programs are barely slowed down by the GIL,
   because threads release it while they wait for I/O

8. contention over the GIL, however, slows down CPU-intensive tasks
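
The 5 ms interval from item 4 can be inspected in CPython with `sys.getswitchinterval()`, which reports the value in seconds (0.005 unless it was changed with `sys.setswitchinterval()`). A short check:

```python
import sys

# Seconds a thread may keep running before the interpreter asks it to
# give other threads a turn.
interval = sys.getswitchinterval()
print(f'switch interval: {interval * 1000:.1f} ms')
```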

#### Python's tools to work around the GIL

- `threading` module lets us create multiple threads that can release the GIL during I/O operations (like network requests, file reads)
- `multiprocessing` module avoids the GIL entirely by using separate processes, each with their own GIL

## A Concurrent Hello World

These examples show multithreading, multiprocessing and asynchronous coroutines in a
program that blocks for 3 seconds while displaying a spinner in the terminal to let the
user know that the program is “thinking” and not stalled:

- the spinner shows each character of the string `"\|/-"` in turn at the same screen position
- when the slow computation finishes, the line with the spinner is cleared and the result is shown: Answer: 42

### Spinner with Threads

In [None]:
import itertools
import time
from threading import Thread, Event

def spin(msg, done):
    """Run in the additional thread."""
    # the `done` param is an instance of `threading.Event`
    for char in itertools.cycle(r'\|/-'): # itertools.cycle repeats the characters forever
        status = f'\r{char} {msg}' # '\r' is the ASCII carriage return: moves the cursor back to the start of the line
        print(status, end='', flush=True) # prints status with spinning wheel to standard out
        # this uses the Event.wait(timeout=None) method; done.wait returns `False` when
        # the timeout expires without the event being set and `True` as soon as the
        # event was set with `done.set()` by another thread:
        if done.wait(.1): # the .1s timeout sets the "frame rate" (10 frames per second)
            break # exits the infinite loop
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='') # overwrites the status line with blank spaces in standard out

def slow():
    """Run in the main thread."""
    time.sleep(3) 
    # time.sleep() blocks the calling thread but releases the GIL, allowing other Python threads to run; this
    # is because it is not executing python code, but does a system call

    return 42

def supervisor():
    """Used to coordinate the threads."""
    done = Event()
    spinner = Thread(target=spin, args=('thinking!', done)) # `target` is the function to run, `args` are its arguments
    print(f'spinner object: {spinner}') # shows "initial" as the thread status: it has not started yet
    spinner.start() # the spinner thread can run while the main thread is blocked
    result = slow() # blocks the main thread
    done.set() # called here (instead of in slow), after slow returns; this terminates the loop inside the `spin` function
    spinner.join() # wait until the spinner thread finishes
    print(f'Answer: {result}')

supervisor()

spinner object: <Thread(Thread-20 (spin), initial)>
Answer: 42  


- a `threading.Event` has an internal flag that starts out `False`; calling `Event.set()` sets it to `True`
- while the flag is `False`, a thread that calls `Event.wait()` is blocked until another thread calls `Event.set()`, at which time `Event.wait()` returns `True`
- `Event.wait(timeout)` returns `False` when the timeout expires (without the event being set)
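
These three rules can be checked in a single thread, without a spinner. A minimal sketch:

```python
from threading import Event

done = Event()
initial = done.is_set()      # the flag starts out False
timed_out = done.wait(0.05)  # nobody calls set(), so this returns False after 50 ms
done.set()
after_set = done.wait(0.05)  # the flag is set, so this returns True right away
print(initial, timed_out, after_set)  # False False True
```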

### Spinner with Processes

- emulates the threading API, so easily adaptable

In [19]:
import itertools
import time
from multiprocessing import Process, Event

def spin(msg, done):
    """Run in the child process."""
    # the `done` param is an instance of `multiprocessing.synchronize.Event`
    for char in itertools.cycle(r'\|/-'): # itertools.cycle repeats the characters forever
        status = f'\r{char} {msg}' # '\r' is the ASCII carriage return: moves the cursor back to the start of the line
        print(status, end='', flush=True) # prints status with spinning wheel to standard out
        # this uses the Event.wait(timeout=None) method; done.wait returns `False` when
        # the timeout expires without the event being set and `True` as soon as the
        # event was set with `done.set()` by another process:
        if done.wait(.1): # the .1s timeout is the "frame rate"
            break # exits the infinite loop
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='') # overwrites the status line with blank spaces in standard out

def slow():
    """Run in the main process."""
    time.sleep(3) 

    return 42

def supervisor():
    done = Event()
    spinner = Process(target=spin, args=('thinking!', done))
    print(f'spinner object: {spinner}')
    spinner.start()
    result = slow()
    done.set()
    spinner.join() # blocks until the child process has exited; nothing is merged back into the main process
    return result

supervisor()

spinner object: <Process name='Process-2' parent=250529 initial>
            

42

- when creating a `multiprocessing.Process` instance, a whole new Python interpreter is started as a child
  process in the background
- calling `.join()` on a sub-process does the following
    - blocks the parent process until the child terminates (wow!)
    - collects the exit status of the child process
    - cleans up OS resources
    - it does **NOT** read the child's memory and merge it into the parent's memory
- in the example, memory is not shared; the only data that crosses the process boundary is the Event state

### Spinner with Coroutines
- in contrast to `threading` and `multiprocessing`, in asynchronous programming it is not the job of OS
  schedulers to allocate CPU time to drive threads and processes; instead, coroutines are driven by an
  application-level event loop that manages a queue of pending coroutines
- the event loop, the library coroutines, and the user coroutines all execute in a single thread
    - any time spent in a coroutine slows down the event loop, and with it all other coroutines
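
The last point can be measured with a small experiment: a `ticker` coroutine tries to tick every 10 ms while another coroutine pauses for 0.2 s, once with the blocking `time.sleep` and once with `asyncio.sleep`. This is a sketch with made-up names (`ticker`, `run`), meant to run as a script.

```python
import asyncio
import time

async def ticker(ticks):
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)

async def run(blocking):
    ticks = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)        # let ticker run its first iteration
    if blocking:
        time.sleep(0.2)           # blocks the whole thread, event loop included
    else:
        await asyncio.sleep(0.2)  # suspends only this coroutine
    count = len(ticks)
    task.cancel()
    return count

blocked = asyncio.run(run(blocking=True))
cooperative = asyncio.run(run(blocking=False))
print(blocked, cooperative)
```

With the blocking call the ticker gets exactly one tick in, the one before `time.sleep` starts; with `asyncio.sleep` it ticks many times, with the exact count depending on timing.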

In [None]:
import asyncio
import itertools

async def spin(msg):
    for char in itertools.cycle(r'\|/-'):
        status = f'\r{char} {msg}'
        print(status, flush=True, end='')
        try:
            await asyncio.sleep(.1)
        except asyncio.CancelledError: # for exiting the loop
            break
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='')

async def slow():
    await asyncio.sleep(3) # use `await asyncio.sleep(3)` instead of `time.sleep(3)` to pause without blocking other coroutines
    return 42

def main():
    """The only regular function in this programme. The others are coroutines."""
    # `asyncio.run()` starts the event loop to drive the coroutine that will eventually
    # set the other coroutines in motion. The main function stays blocked until
    # supervisor returns; the return value of supervisor becomes the return value of
    # asyncio.run.
    result = asyncio.run(supervisor())
    print(f'Answer: {result}')

async def supervisor():
    spinner = asyncio.create_task(spin('thinking!')) # schedules the eventual execution of spin
    print(f'spinner object: {spinner}')
    result = await slow() # `await` keyword calls slow, blocking supervisor until slow returns
    spinner.cancel() # Task.cancel method raises a `CancelledError` exception inside the spin coroutine
    return result

main()

# The code doesn't run in jupyter notebook because `asyncio.run()` creates and runs its own event loop, but it
# can't do that if one is already active. Jupyter notebooks have a built-in event loop running in the
# background.

RuntimeError: asyncio.run() cannot be called from a running event loop

In [None]:
import asyncio
import itertools

async def spin(msg):
    for char in itertools.cycle(r'\|/-'):
        status = f'\r{char} {msg}'
        print(status, flush=True, end='')
        try:
            await asyncio.sleep(.1)
        except asyncio.CancelledError: # for exiting the loop
            break
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='')

async def slow():
    await asyncio.sleep(3) # use `await asyncio.sleep(3)` instead of `time.sleep(3)` to pause without blocking other coroutines
    return 42

async def main(): # fix for running this in a jupyter notebook
    """Now a coroutine too, so that the notebook's running event loop can drive it."""
    result = await supervisor() # fix for running this in a jupyter notebook; `result` is never printed or returned

async def supervisor():
    spinner = asyncio.create_task(spin('thinking!')) # schedules the eventual execution of spin
    print(f'spinner object: {spinner}')
    result = await slow() # `await` keyword calls slow, blocking supervisor until slow returns
    spinner.cancel() # Task.cancel method raises a `CancelledError` exception inside the spin coroutine
    return result

await main()

# Runs, but doesn't show the answer to the ultimate question of life, the universe, and everything:
# `main()` stores the result of `supervisor()` but neither prints nor returns it.

spinner object: <Task pending name='Task-6' coro=<spin() running at /tmp/ipykernel_250529/3707794880.py:4>>
            

- skipping a deeper understanding of a few parts of the book
- I choose not to put the same effort into understanding `asyncio` for now as I did for `threading` and
  `multiprocessing` (since this is not my main focus now).

## The Real Impact of the GIL
- system calls or http requests (among others) release the GIL because they delegate work to the OS kernel or external services
- the real impact of the GIL becomes visible when threads have to wait for each other to run CPU-intensive Python code
    - GPU work does not hold the GIL because the computations happen outside of Python on the GPU hardware

In [18]:
import math 

def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False

    root = math.isqrt(n)
    for i in range(3, root + 1, 2):
        if n % i == 0:
            return False
    return True

is_prime(5_000_111_000_222_021)

True

#### Quiz:
What would happen to the spinner animation if we replaced `time.sleep(3)` with `is_prime(n)`, assuming that
n = 5_000_111_000_222_021 and the call takes 3.3 seconds to run?

##### My answer:
- for multiprocessing: the `supervisor` function would take 3.3 seconds to finish, and the spinner spins with a frame interval of 0.1 s as before
- for multithreading: the spinner thread cannot get the GIL back after `Event.wait(.1)`, so the spinner cannot turn until the main thread has finished
- for asyncio I didn't bother thinking about this and I couldn't care less

Let's try out if I was right:

In [28]:
# multiprocessing

import itertools
from multiprocessing import Process, Event

def spin(msg, done):
    for char in itertools.cycle(r'\|/-'):
        status = f'\r{char} {msg}'
        print(status, end='', flush=True)
        if done.wait(.1):
            break
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='')

def slow():
    return is_prime(5_000_111_000_222_021)

def supervisor():
    done = Event()
    spinner = Process(target=spin, args=('thinking!', done))
    print(f'spinner object: {spinner}')
    spinner.start()
    result = slow()
    done.set()
    spinner.join()
    return print(f'Answer: {result}')

supervisor()

spinner object: <Process name='Process-9' parent=250529 initial>
Answer: True


In [29]:
# multithreading:

import itertools
from threading import Thread, Event

def spin(msg, done):
    for char in itertools.cycle(r'\|/-'):
        status = f'\r{char} {msg}'
        print(status, end='', flush=True)
        if done.wait(.1):
            break
    blanks = ' ' * len(status)
    print(f'\r{blanks}\r', end='')

def slow():
    return is_prime(5_000_111_000_222_021)

def supervisor():
    done = Event()
    spinner = Thread(target=spin, args=('thinking!', done))
    print(f'spinner object: {spinner}')
    spinner.start()
    result = slow()
    done.set()
    spinner.join()
    return print(f'Answer: {result}')

supervisor()

spinner object: <Thread(Thread-96 (spin), initial)>
Answer: True


- the spinner keeps spinning with threads, too; surprising and interesting
    - Fluent Python gives the reason: "the spinner keeps spinning because Python suspends the running
      thread every 5ms (by default), making the GIL available to other pending threads"
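
The same effect can be checked without watching the terminal. In this sketch (names `count_ticks` and `ticks` are made up), the main thread runs a pure-Python busy loop for 0.3 s while a second thread counts how often it manages to wake up:

```python
import threading
import time

def count_ticks(done, ticks):
    # Wake up every 10 ms until the main thread sets the event.
    while not done.wait(0.01):
        ticks.append(1)

done = threading.Event()
ticks = []
ticker = threading.Thread(target=count_ticks, args=(done, ticks))
ticker.start()

# Pure-Python busy loop: the main thread never sleeps or does I/O for 0.3 s.
deadline = time.perf_counter() + 0.3
spins = 0
while time.perf_counter() < deadline:
    spins += 1

done.set()
ticker.join()
print(f'ticks recorded while the main thread was busy: {len(ticks)}')
```

The tick count is greater than zero because the interpreter periodically makes the busy thread give up the GIL; the exact number depends on the machine and the switch interval.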