# Chapter 13: Concurrency and Parallelism

Modern applications must handle multiple tasks simultaneously—serving thousands of web clients, processing large datasets, or maintaining responsive UIs during intensive operations. Python offers three distinct concurrency models: **threading** for I/O-bound concurrency, **multiprocessing** for CPU-bound parallelism, and **asyncio** for high-performance asynchronous I/O. Understanding when and how to apply each is essential for building scalable systems.

This chapter examines the Global Interpreter Lock (GIL) that shapes Python's concurrency landscape, thread safety and synchronization primitives, process-based parallelism that bypasses the GIL, and the async/await syntax for cooperative multitasking. We emphasize the critical distinction between I/O-bound and CPU-bound workloads, providing decision frameworks for selecting the appropriate concurrency model.

## 13.1 Understanding Concurrency vs. Parallelism

Before implementing concurrent code, understand the fundamental distinction:

**Concurrency** (Threading/Asyncio): Handling multiple tasks by interleaving execution. Tasks make progress within the same time period but not necessarily simultaneously. Ideal for I/O-bound operations (network requests, disk I/O) where the program waits for external resources.

**Parallelism** (Multiprocessing): Executing multiple tasks simultaneously on multiple CPU cores. Tasks truly run at the same time. Required for CPU-bound operations (heavy computation, data processing) that fully utilize processor cycles.

```python
import requests

def io_bound_task(url: str) -> None:
    """
    I/O-bound: Most time spent waiting for network.
    CPU usage is low during execution.
    """
    response = requests.get(url, timeout=10)  # Bound the wait; never hang forever
    print(f"Fetched {len(response.content)} bytes")

def cpu_bound_task(n: int) -> int:
    """
    CPU-bound: Constant computation, no waiting.
    One CPU core stays fully busy during execution.
    """
    count: int = 0
    for i in range(n):
        count += i ** 2
    return count
```

**The Global Interpreter Lock (GIL):**
CPython's memory management (reference counting) isn't thread-safe on its own. In the default build, the GIL is a mutex that prevents multiple threads in one interpreter process from executing Python bytecode simultaneously. This means:
*   **Threading**: Only one thread executes Python code at a time (but any number of threads can wait for I/O simultaneously)
*   **Multiprocessing**: Each process has its own Python interpreter and GIL, enabling true parallelism
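
The effect of releasing the GIL during waits can be observed directly. The following is an illustrative sketch rather than a benchmark: `simulated_io` is a hypothetical stand-in that uses `time.sleep` in place of a real network call, and the helper names are assumptions introduced for this example.

```python
import threading
import time
from typing import Callable

def simulated_io() -> None:
    """Hypothetical stand-in for a network call; sleeping releases the GIL."""
    time.sleep(0.1)

def run_sequential(task: Callable[[], None], repeats: int) -> float:
    """Run the task `repeats` times back to back; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        task()
    return time.perf_counter() - start

def run_threaded(task: Callable[[], None], repeats: int) -> float:
    """Run the task in `repeats` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=task) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"sequential: {run_sequential(simulated_io, 4):.2f}s")
    print(f"threaded:   {run_threaded(simulated_io, 4):.2f}s")
```

The threaded run should finish in roughly the time of a single wait, because all four threads sleep at once. Swapping in a CPU-bound function such as `cpu_bound_task` would be expected to show little or no speedup from threads on a standard GIL build.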

## 13.2 Threading: Concurrent I/O Operations

The `threading` module provides threads that share memory space but are limited by the GIL. Despite the GIL, threading excels for I/O-bound tasks because threads release the GIL when waiting for I/O operations.

### Basic Thread Creation

```python
import threading
from typing import List
import time

def download_file(file_id: int) -> None:
    """Simulate downloading a file."""
    print(f"[Thread {threading.current_thread().name}] Starting download {file_id}")
    time.sleep(1)  # Simulating network I/O
    print(f"[Thread {threading.current_thread().name}] Finished download {file_id}")

def basic_threading() -> None:
    """Create and manage threads manually."""
    threads: List[threading.Thread] = []
    
    # Create 5 threads
    for i in range(5):
        thread = threading.Thread(
            target=download_file,
            args=(i,),
            name=f"Worker-{i}"
        )
        threads.append(thread)
        thread.start()  # Begin thread execution
    
    # Wait for all threads to complete
    for thread in threads:
        thread.join()  # Block until thread finishes
    
    print("All downloads complete")

# Timing: Sequential would take 5 seconds
# Threading takes ~1 second (limited by slowest download)
```
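
`Thread.join()` waits for a thread to finish but does not hand back the target's return value. One common way to collect results, sketched here under the assumption that each worker produces a single value, is to push them onto a `queue.Queue`, which is safe to use from several threads. The `simulated_download` function and its result format are hypothetical.

```python
import queue
import threading
import time
from typing import List, Tuple

def simulated_download(file_id: int, results: "queue.Queue[Tuple[int, int]]") -> None:
    """Pretend to download a file, then report (file_id, size) on the queue."""
    time.sleep(0.05)                       # stand-in for network I/O
    results.put((file_id, file_id * 100))  # made-up size, for illustration only

def download_all(file_ids: List[int]) -> List[Tuple[int, int]]:
    """Start one thread per file and gather every reported result."""
    results: "queue.Queue[Tuple[int, int]]" = queue.Queue()
    threads = [
        threading.Thread(target=simulated_download, args=(fid, results))
        for fid in file_ids
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All producers have finished, so draining until empty is safe here.
    collected: List[Tuple[int, int]] = []
    while not results.empty():
        collected.append(results.get())
    return sorted(collected)  # completion order is arbitrary, so sort by id

if __name__ == "__main__":
    print(download_all([2, 0, 1]))
```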

### Thread Safety and Race Conditions

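When multiple threads read and modify shared data without synchronization, race conditions can occur: the result depends on how the scheduler happens to interleave the threads.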

```python
import threading
import time
from typing import List

class UnsafeCounter:
    """Demonstrates race condition - not thread-safe."""
    def __init__(self) -> None:
        self.value: int = 0
    
    def increment(self) -> None:
        """Read-modify-write operation is not atomic."""
        current = self.value      # Read
        time.sleep(0.000001)      # Simulate processing (context switch happens here)
        self.value = current + 1  # Write

def demonstrate_race_condition() -> None:
    """Show data corruption without synchronization."""
    counter = UnsafeCounter()
    threads: List[threading.Thread] = []
    
    def worker() -> None:
        for _ in range(1000):
            counter.increment()
    
    # Create 10 threads
    for _ in range(10):
        t = threading.Thread(target=worker)
        threads.append(t)
        t.start()
    
    for t in threads:
        t.join()
    
    # Expected: 10000, Actual: Less (due to lost updates)
    print(f"Final count: {counter.value}")  # Likely < 10000
```
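
Because the outcome of a race depends on scheduling, the example above may occasionally print the expected total. For teaching or testing purposes, the interleaving can be forced. The sketch below (an assumption-laden illustration, not part of the counter class above) uses a `threading.Barrier` so that both threads read the shared value before either writes it, which makes the lost update reproducible.

```python
import threading

def lost_update_demo() -> int:
    """Force two threads to read before either writes, losing one update."""
    shared = {"value": 0}
    barrier = threading.Barrier(2, timeout=5)  # timeout guards against hangs

    def worker() -> None:
        current = shared["value"]      # both threads read 0
        barrier.wait()                 # neither proceeds until both have read
        shared["value"] = current + 1  # both write 1; one increment is lost

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["value"]

if __name__ == "__main__":
    print(lost_update_demo())  # 1, although two increments ran
```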

### Synchronization Primitives

Protect shared data with locks and other synchronization tools:

```python
import threading
import time
from typing import List, Optional

class SafeCounter:
    """Thread-safe counter using Lock."""
    
    def __init__(self) -> None:
        self.value: int = 0
        self._lock: threading.Lock = threading.Lock()
    
    def increment(self) -> None:
        """Atomic increment with explicit lock."""
        with self._lock:  # Acquires lock, releases automatically
            self.value += 1
    
    def get_value(self) -> int:
        with self._lock:
            return self.value

# Alternative: Using RLock (reentrant, same thread can acquire multiple times)
class ReentrantExample:
    def __init__(self) -> None:
        self._lock: threading.RLock = threading.RLock()
        self.data: dict = {}
    
    def process(self) -> None:
        with self._lock:
            self.modify()
    
    def modify(self) -> None:
        with self._lock:  # Would deadlock with regular Lock
            self.data['key'] = 'value'

# Semaphore: Limit concurrent access (e.g., connection pools)
class ConnectionPool:
    def __init__(self, max_connections: int = 5) -> None:
        self._semaphore: threading.Semaphore = threading.Semaphore(max_connections)
        self._connections: List[object] = []
    
    def acquire_connection(self) -> Optional[object]:
        """Block while all slots are taken; give up after 5 seconds."""
        if self._semaphore.acquire(timeout=5):  # Wait max 5 seconds
            # Return connection from pool
            return object()  # Placeholder
        return None
    
    def release_connection(self, connection: object) -> None:
        """Free the slot so a waiting thread can proceed."""
        self._semaphore.release()

# Event: Flag-based signaling between threads (stays set until clear() is called)
def wait_for_event() -> None:
    event: threading.Event = threading.Event()
    
    def waiter() -> None:
        print("Waiting for event...")
        event.wait()  # Blocks until set()
        print("Event received!")
    
    def setter() -> None:
        time.sleep(2)
        print("Setting event")
        event.set()
    
    threading.Thread(target=waiter).start()
    threading.Thread(target=setter).start()

# Condition: Complex coordination (producer-consumer pattern)
class ThreadSafeQueue:
    def __init__(self, max_size: int = 10) -> None:
        self._queue: List[int] = []
        self._condition: threading.Condition = threading.Condition()
        self._max_size: int = max_size
    
    def put(self, item: int) -> None:
        with self._condition:
            while len(self._queue) >= self._max_size:
                self._condition.wait()  # Wait until space available
            self._queue.append(item)
            self._condition.notify_all()  # Notify consumers
    
    def get(self) -> int:
        with self._condition:
            while not self._queue:
                self._condition.wait()  # Wait until items available
            item = self._queue.pop(0)
            self._condition.notify_all()  # Notify producers
            return item
```
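
A semaphore only limits concurrency if every acquire is paired with a release, and using it as a context manager is the simplest way to guarantee that. The following sketch measures how many threads hold the semaphore at once; the function name and the bookkeeping variables are assumptions made for this illustration.

```python
import threading
import time

def peak_concurrency(limit: int, workers: int) -> int:
    """Run `workers` threads through a Semaphore(limit); return the peak overlap."""
    semaphore = threading.Semaphore(limit)
    stats_lock = threading.Lock()  # protects the two counters below
    active = 0
    peak = 0

    def worker() -> None:
        nonlocal active, peak
        with semaphore:            # acquired on entry, released on exit
            with stats_lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.02)       # simulated work while holding a slot
            with stats_lock:
                active -= 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak

if __name__ == "__main__":
    print(peak_concurrency(limit=3, workers=10))  # never more than 3
```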

### Thread-Local Storage

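When each thread needs its own isolated copy of some state (a request ID, a database connection), store it on a `threading.local()` object. Attributes set on it are visible only to the thread that set them: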

```python
import threading
from typing import Any

# Thread-local storage
thread_local: threading.local = threading.local()

def process_request(request_id: int) -> None:
    """Each thread has its own request_id."""
    thread_local.request_id = request_id
    thread_local.db_connection = create_connection()
    
    # Process...
    print(f"Thread {threading.current_thread().name} handling request {thread_local.request_id}")
    
    # Clean up
    thread_local.db_connection.close()

def create_connection() -> Any:
    return object()
```
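
The isolation can be checked with a short experiment. In this sketch (names such as `local_state` and `isolation_demo` are hypothetical), every thread writes its own ID to the same attribute, pauses so the others can do the same, and then reads the attribute back.

```python
import threading
import time
from typing import Dict

local_state = threading.local()

def isolation_demo(n_threads: int = 3) -> Dict[int, int]:
    """Return {worker_id: value that worker read back from local_state}."""
    seen: Dict[int, int] = {}
    seen_lock = threading.Lock()

    def worker(worker_id: int) -> None:
        local_state.value = worker_id  # visible to this thread only
        time.sleep(0.01)               # let other threads set their own value
        with seen_lock:
            seen[worker_id] = local_state.value

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen

if __name__ == "__main__":
    print(isolation_demo())
    # The main thread never assigned the attribute, so it does not exist here.
    print(hasattr(local_state, "value"))
```

Each worker reads back exactly what it wrote, even though all of them assigned to the same attribute name.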

### ThreadPoolExecutor: High-Level Interface

For most threading needs, use the high-level executor API:

```python
from concurrent.futures import Future, ThreadPoolExecutor, as_completed
from typing import Any, Dict, List
import requests

def fetch_url(url: str) -> Dict[str, Any]:
    """Fetch single URL."""
    try:
        response = requests.get(url, timeout=5)
        return {'url': url, 'status': response.status_code, 'size': len(response.content)}
    except Exception as e:
        return {'url': url, 'error': str(e)}

def concurrent_downloads(urls: List[str], max_workers: int = 5) -> List[Dict]:
    """
    Download multiple URLs concurrently using thread pool.
    
    Args:
        urls: List of URLs to fetch
        max_workers: Maximum concurrent threads
        
    Returns:
        Results as they complete
    """
    results: List[Dict] = []
    
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit all tasks
        future_to_url: Dict[Future, str] = {
            executor.submit(fetch_url, url): url 
            for url in urls
        }
        
        # Process results as they complete
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
                results.append(data)
            except Exception as exc:
                print(f"{url} generated exception: {exc}")
    
    return results

# Usage
urls = ['https://api.example.com/data'] * 20
results = concurrent_downloads(urls, max_workers=10)
```
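
When results should come back in input order rather than completion order, `executor.map` is a shorter alternative to `submit` plus `as_completed`. A minimal sketch, assuming a hypothetical `simulated_fetch` that sleeps instead of making a request:

```python
from concurrent.futures import ThreadPoolExecutor
import time
from typing import List

def simulated_fetch(delay: float) -> float:
    """Stand-in for a blocking request that takes `delay` seconds."""
    time.sleep(delay)
    return delay

def fetch_in_order(delays: List[float]) -> List[float]:
    """Run all fetches concurrently; results follow the order of `delays`."""
    with ThreadPoolExecutor(max_workers=max(1, len(delays))) as executor:
        # map yields results in input order, even if later tasks finish first
        return list(executor.map(simulated_fetch, delays))

if __name__ == "__main__":
    print(fetch_in_order([0.05, 0.01, 0.03]))
```

One trade-off to keep in mind: with `map`, an exception raised by a task surfaces when its result is reached during iteration, so per-task error handling is usually easier with `submit`.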

## 13.3 Multiprocessing: True Parallelism

The `multiprocessing` module creates separate processes with their own memory space and Python interpreter, bypassing the GIL for CPU-bound tasks.

### Process Creation

```python
import multiprocessing as mp
from typing import List
import time
import os

def cpu_intensive_task(n: int) -> int:
    """
    Heavy computation that fully utilizes CPU.
    Runs in separate process with its own GIL.
    """
    print(f"Process {os.getpid()} handling task {n}")
    total: int = 0
    for i in range(n):
        total += i ** 2
    return total

def basic_multiprocessing() -> None:
    """Create processes manually."""
    processes: List[mp.Process] = []
    
    for i in range(4):
        # Args must be pickleable (sent to new process)
        p = mp.Process(target=cpu_intensive_task, args=(1000000,))
        processes.append(p)
        p.start()
    
    for p in processes:
        p.join()
    
    print("All processes complete")

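# Guard required under the "spawn" start method (the default on Windows and
# macOS): each child re-imports this module and must not start processes again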
if __name__ == '__main__':
    basic_multiprocessing()
```

### Sharing Data Between Processes

Processes don't share memory by default. Use explicit shared memory or managers:

```python
import multiprocessing as mp
import time
from typing import List

# Worker functions are defined at module level so they can be pickled
# under the "spawn" start method (the default on Windows and macOS).

def worker(counter: mp.Value, arr: mp.Array, index: int) -> None:
    """Modify shared memory."""
    with counter.get_lock():  # Acquire lock
        counter.value += 1
    
    arr[index] = index * 2.0

def shared_memory_example() -> None:
    """Share data using Value and Array (shared memory)."""
    # Shared value (synchronized with lock)
    counter: mp.Value = mp.Value('i', 0)  # 'i' = signed int
    
    # Shared array
    shared_array: mp.Array = mp.Array('d', [0.0, 0.0, 0.0])  # 'd' = double
    
    processes: List[mp.Process] = []
    for i in range(3):
        p = mp.Process(target=worker, args=(counter, shared_array, i))
        processes.append(p)
        p.start()
    
    for p in processes:
        p.join()
    
    print(f"Counter: {counter.value}")  # 3
    print(f"Array: {list(shared_array)}")  # [0.0, 2.0, 4.0]

def producer(queue: mp.Queue) -> None:
    """Send five messages, then a sentinel to signal completion."""
    for i in range(5):
        queue.put(f"Message {i}")
        time.sleep(0.1)
    queue.put(None)  # Sentinel value

def consumer(queue: mp.Queue) -> None:
    """Print messages until the sentinel arrives."""
    while True:
        item = queue.get()
        if item is None:
            break
        print(f"Consumed: {item}")

def queue_communication() -> None:
    """Use Queue for process-safe communication."""
    queue: mp.Queue = mp.Queue()
    
    p1 = mp.Process(target=producer, args=(queue,))
    p2 = mp.Process(target=consumer, args=(queue,))
    
    p1.start()
    p2.start()
    p1.join()
    p2.join()
```
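
The text above mentions managers, which the examples so far do not show. A manager runs a server process that owns ordinary Python objects and hands out proxies to them, which is slower than shared memory but supports richer types such as dicts and lists. The following is a minimal sketch; `record_square` and `manager_example` are hypothetical names.

```python
import multiprocessing as mp
from typing import Dict, List

def record_square(shared: Dict[int, int], n: int) -> None:
    """Store n squared under key n (runs in a child process)."""
    shared[n] = n * n

def manager_example(values: List[int]) -> Dict[int, int]:
    """Fill a managed dict from several processes and return a plain copy."""
    with mp.Manager() as manager:
        shared = manager.dict()  # proxy to a dict held by the manager process
        processes = [
            mp.Process(target=record_square, args=(shared, n)) for n in values
        ]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        return dict(shared)      # copy out before the manager shuts down

if __name__ == "__main__":
    print(manager_example([1, 2, 3]))
```

Every operation on the proxy is a round trip to the manager process, so this approach suits occasional coordination better than tight inner loops.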

### ProcessPoolExecutor

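`ProcessPoolExecutor` offers the same interface as `ThreadPoolExecutor` but runs tasks in worker processes, so the callable and its arguments must be picklable: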

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, List

def parallel_map(data: List[int], func: Callable[[int], int]) -> List[int]:
    """
    Map function over data using process pool.
    
    Automatically distributes work across CPU cores.
    """
    # max_workers defaults to the number of available CPUs
    with ProcessPoolExecutor() as executor:
        # executor.map preserves input order
        results: List[int] = list(executor.map(func, data))
        
        # Alternative: submit individual tasks and collect each result
        # futures = [executor.submit(func, x) for x in data]
        # results = [f.result() for f in futures]
    
    return results

# Usage (cpu_intensive_task is defined in the Process Creation example)
if __name__ == '__main__':
    data = list(range(100))
    results = parallel_map(data, cpu_intensive_task)
```
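
When the work items are small, the cost of sending each one to a worker process can dominate. `executor.map` accepts a `chunksize` argument that batches items per round trip. The sketch below assumes a trivial `square` function purely for illustration; a suitable chunk size depends on the workload and has to be measured.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import List

def square(n: int) -> int:
    """Tiny task: on its own, too cheap to be worth one round trip per item."""
    return n * n

def parallel_squares(values: List[int], chunksize: int = 100) -> List[int]:
    """Square all values in a process pool, sending items in batches."""
    with ProcessPoolExecutor() as executor:
        # Each worker receives `chunksize` items at a time instead of one
        return list(executor.map(square, values, chunksize=chunksize))

if __name__ == "__main__":
    print(parallel_squares(list(range(10)), chunksize=5))
```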

## 13.4 Asyncio: Asynchronous Programming

`asyncio` provides single-threaded concurrent I/O using an event loop and cooperative multitasking. It's ideal for high-concurrency network operations (thousands of simultaneous connections).

### Core Concepts

```python
import asyncio
from typing import Coroutine, List

async def fetch_data(url: str) -> str:
    """
    Coroutine: async function that can suspend execution.
    
    When awaiting, control returns to event loop to run other tasks.
    """
    print(f"Fetching {url}")
    await asyncio.sleep(1)  # Non-blocking sleep (yields control)
    return f"Data from {url}"

async def main() -> None:
    """Entry point for asyncio program."""
    # Await single coroutine
    result = await fetch_data("https://api.example.com")
    
    # Run multiple concurrently
    urls = ["url1", "url2", "url3"]
    tasks: List[Coroutine] = [fetch_data(url) for url in urls]
    
    # asyncio.gather runs them concurrently
    results: List[str] = await asyncio.gather(*tasks)
    print(results)

# Run the event loop
if __name__ == '__main__':
    asyncio.run(main())  # Python 3.7+
```
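
A quick timing check makes the concurrency of `asyncio.gather` visible. In this sketch, `simulated_request` is a hypothetical coroutine that sleeps instead of performing I/O; three 0.1-second waits overlap rather than add up.

```python
import asyncio
import time
from typing import List, Tuple

async def simulated_request(name: str, delay: float) -> str:
    """Stand-in for an awaitable I/O call."""
    await asyncio.sleep(delay)
    return name

async def timed_gather() -> Tuple[List[str], float]:
    """Run three requests concurrently; return results and elapsed seconds."""
    start = time.perf_counter()
    results = await asyncio.gather(
        simulated_request("a", 0.1),
        simulated_request("b", 0.1),
        simulated_request("c", 0.1),
    )
    return list(results), time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = asyncio.run(timed_gather())
    print(results, f"{elapsed:.2f}s")  # results keep argument order
```

The elapsed time should be close to 0.1 seconds rather than 0.3, and the results arrive in the order the coroutines were passed to `gather`.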

### Event Loop Mechanics

```python
import asyncio

def event_loop_example() -> None:
    """Manual event loop control (rarely needed in modern Python)."""
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    
    try:
        result = loop.run_until_complete(fetch_data("test"))
        print(result)
    finally:
        loop.close()
```
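
Because the event loop runs one coroutine at a time, a blocking call inside a coroutine stalls every other task until it returns. The sketch below illustrates this with a background ticker, and shows `asyncio.to_thread` (available from Python 3.9) as one way to move a blocking call off the loop. The helper names are assumptions made for this example.

```python
import asyncio
import time

async def count_ticks(stop: asyncio.Event) -> int:
    """Count how many 10 ms ticks the loop manages until `stop` is set."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.01)
        ticks += 1
    return ticks

async def ticks_during(blocking: bool) -> int:
    """Run a 0.2 s wait either on the loop (blocking) or in a worker thread."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # yield once so the ticker starts
    if blocking:
        time.sleep(0.2)                           # freezes the whole event loop
    else:
        await asyncio.to_thread(time.sleep, 0.2)  # loop stays responsive
    stop.set()
    return await ticker

if __name__ == "__main__":
    print("blocking:    ", asyncio.run(ticks_during(blocking=True)))
    print("non-blocking:", asyncio.run(ticks_during(blocking=False)))
```

With the blocking call, the ticker barely advances; with the call moved to a thread, it keeps ticking throughout the wait.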

### Task Management

```python
import asyncio
from typing import Set

async def task_management() -> None:
    """Create and manage tasks explicitly."""
    # Create tasks (each starts running once the current coroutine yields to the event loop)
    task1 = asyncio.create_task(fetch_data("api1.com"))
    task2 = asyncio.create_task(fetch_data("api2.com"))
    
    # Wait for specific task
    result = await task1
    
    # Wait with timeout
    try:
        result = await asyncio.wait_for(task2, timeout=5.0)
    except asyncio.TimeoutError:
        print("Task timed out")
    
    # Wait for multiple with return_when options
    pending: Set[asyncio.Task] = {
        asyncio.create_task(fetch_data(f"api{i}.com")) 
        for i in range(10)
    }
    
    while pending:
        done, pending = await asyncio.wait(
            pending,
            return_when=asyncio.FIRST_COMPLETED  # Or ALL_COMPLETED
        )
        for task in done:
            print(f"Completed: {task.result()}")

async def cancellation() -> None:
    """Cancel running tasks."""
    task = asyncio.create_task(asyncio.sleep(10))
    await asyncio.sleep(1)
    
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("Task was cancelled")
```
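
By default, `asyncio.gather` propagates the first exception raised by any of its awaitables. Passing `return_exceptions=True` returns exceptions as ordinary results instead, so one failure does not hide the successful outcomes. A minimal sketch with a hypothetical `may_fail` coroutine:

```python
import asyncio
from typing import List, Union

async def may_fail(n: int) -> int:
    """Succeed for even inputs, raise ValueError for odd ones."""
    await asyncio.sleep(0.01)
    if n % 2:
        raise ValueError(f"odd input: {n}")
    return n * 10

async def gather_with_errors(values: List[int]) -> List[Union[int, BaseException]]:
    """Collect a result or an exception object for every input, in order."""
    return await asyncio.gather(
        *(may_fail(v) for v in values),
        return_exceptions=True,
    )

if __name__ == "__main__":
    for outcome in asyncio.run(gather_with_errors([0, 1, 2])):
        if isinstance(outcome, Exception):
            print(f"failed: {outcome}")
        else:
            print(f"ok: {outcome}")
```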

### Real-World Asyncio: aiohttp Example

```python
import aiohttp
import asyncio
from typing import List

async def fetch_session(
    session: aiohttp.ClientSession, 
    url: str
) -> dict:
    """Fetch using aiohttp (async HTTP client)."""
    async with session.get(url) as response:
        return {
            'url': url,
            'status': response.status,
            'content': await response.text()
        }

async def fetch_all(urls: List[str]) -> List[dict]:
    """Fetch all URLs concurrently."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_session(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# Usage: Fetch 100 URLs concurrently in single thread
urls = ['https://example.com'] * 100
results = asyncio.run(fetch_all(urls))
```
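
Launching every request at once can overwhelm a server or exhaust local sockets. A common remedy is to cap the number of in-flight requests with `asyncio.Semaphore`. The sketch below replaces the real HTTP call with `asyncio.sleep` so that it runs without network access; the function names and counters are assumptions for illustration.

```python
import asyncio

async def bounded_fetch_all(n_requests: int, limit: int) -> int:
    """Run `n_requests` simulated fetches, at most `limit` at a time.

    Returns the highest number of fetches observed in flight at once.
    """
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def simulated_fetch() -> None:
        nonlocal active, peak
        async with semaphore:          # waits here once `limit` is reached
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for `session.get(url)`
            active -= 1

    await asyncio.gather(*(simulated_fetch() for _ in range(n_requests)))
    return peak

if __name__ == "__main__":
    print(asyncio.run(bounded_fetch_all(n_requests=20, limit=5)))
```

No lock is needed around the counters: the code between two `await` points runs without interruption on the single event-loop thread.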

### Async Context Managers and Iterators

```python
import asyncio
from typing import AsyncIterator

class AsyncDatabaseConnection:
    """Async context manager."""
    async def __aenter__(self) -> 'AsyncDatabaseConnection':
        await self.connect()
        return self
    
    async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
        await self.disconnect()
    
    async def connect(self) -> None:
        await asyncio.sleep(0.1)  # Simulated async operation
    
    async def disconnect(self) -> None:
        await asyncio.sleep(0.1)

async def use_connection() -> None:
    async with AsyncDatabaseConnection() as conn:
        print("Using connection")

class AsyncRange:
    """Async iterator."""
    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self.end = end
        self.current = start
    
    def __aiter__(self) -> AsyncIterator[int]:
        return self
    
    async def __anext__(self) -> int:
        if self.current >= self.end:
            raise StopAsyncIteration
        await asyncio.sleep(0.01)  # Simulated async work
        value = self.current
        self.current += 1
        return value

async def iterate_async() -> None:
    async for i in AsyncRange(0, 5):
        print(i)
```
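
For simple cases, the class-based protocols above can often be replaced by an async generator and by `contextlib.asynccontextmanager`. The following sketch is one possible shorthand; `async_range`, `open_connection`, and the log format are hypothetical.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, List

async def async_range(start: int, end: int) -> AsyncIterator[int]:
    """Async generator: `yield` inside `async def` builds the iterator for us."""
    for value in range(start, end):
        await asyncio.sleep(0.001)  # simulated async work
        yield value

@asynccontextmanager
async def open_connection(log: List[str]) -> AsyncIterator[str]:
    """Code before `yield` acts as __aenter__, code after it as __aexit__."""
    log.append("connect")
    try:
        yield "connection"
    finally:
        log.append("disconnect")    # runs even if the body raises

async def demo() -> List[str]:
    log: List[str] = []
    async with open_connection(log) as conn:
        async for i in async_range(0, 3):
            log.append(f"{conn}:{i}")
    return log

if __name__ == "__main__":
    print(asyncio.run(demo()))
```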

## 13.5 Choosing the Right Concurrency Model

| Scenario | Solution | Reason |
|----------|----------|--------|
| Many concurrent I/O operations (HTTP, DB) | **Asyncio** | Handles thousands of connections in a single thread |
| Few I/O operations, existing sync code | **Threading** | Simple, works with blocking libraries |
| CPU-intensive (math, data processing) | **Multiprocessing** | Bypasses GIL, uses multiple cores |
| Mixed I/O and CPU | **Asyncio + ProcessPoolExecutor** | Async for I/O, processes for CPU |
| Simple parallel tasks | **concurrent.futures** | High-level, easy to use |

### Decision Flowchart

```
Is it I/O bound (waiting for network/disk)?
├── Yes: Is it many connections (1000s)?
│   ├── Yes: Use Asyncio
│   └── No: Use Threading
└── No (CPU bound): Use Multiprocessing
```
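
The flowchart can also be read as a small function. This is only a restatement of the diagram; the function name and the string labels are chosen for illustration, and real workloads often need measurement before a model is picked.

```python
def choose_model(io_bound: bool, many_connections: bool = False) -> str:
    """Return the concurrency model the flowchart suggests."""
    if not io_bound:
        return "multiprocessing"  # CPU-bound: sidestep the GIL with processes
    if many_connections:
        return "asyncio"          # thousands of connections on one thread
    return "threading"            # modest I/O, works with blocking libraries

if __name__ == "__main__":
    print(choose_model(io_bound=True, many_connections=True))
    print(choose_model(io_bound=True))
    print(choose_model(io_bound=False))
```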

### Combining Approaches

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

# cpu_intensive_task is the module-level function from Section 13.3

async def hybrid_approach() -> None:
    """
    Use asyncio for I/O, process pool for CPU work.
    
    Example: Web server that handles requests (asyncio)
    but offloads image processing to processes.
    """
    loop = asyncio.get_running_loop()
    
    # Run CPU-bound function in process pool
    with ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool,
            cpu_intensive_task,
            1000000
        )
        print(f"Result from process: {result}")
    
    # Continue with async I/O
    await asyncio.sleep(1)
```

## Summary

Concurrency transforms sequential bottlenecks into scalable systems, but choosing the wrong model degrades performance. You understand the **Global Interpreter Lock** and its implications: threading provides concurrent I/O handling but not parallelism, while multiprocessing achieves true parallelism at the cost of memory overhead and serialization complexity.

You have mastered **threading** synchronization primitives—Locks, Semaphores, Conditions, and Events—that prevent race conditions in shared-memory concurrency. **ThreadPoolExecutor** provides a high-level interface for I/O-bound workloads without manual thread lifecycle management.

For CPU-intensive tasks, **multiprocessing** bypasses the GIL through separate processes, using shared memory (Value, Array), Queues, and Managers for inter-process communication. **ProcessPoolExecutor** distributes computational work across available CPU cores.

**Asyncio** represents Python's modern approach to high-concurrency I/O, using coroutines (`async def`), the event loop, and cooperative multitasking to handle thousands of simultaneous connections in a single thread. You understand that `await` yields control to the event loop, enabling efficient resource utilization during I/O waits.

Concurrent code also introduces complexity that demands rigorous testing and debugging. In the next chapter, we explore the web development ecosystem: building APIs and services that use these concurrency models to serve users at scale.

**Next Chapter**: Chapter 14: Web Development and APIs (FastAPI, Flask, and HTTP Fundamentals).