<a href="https://colab.research.google.com/github/kchenTTP/python-series/blob/main/python_multithreading/Multithreading_in_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Multithreading in Python**

In the previous class on asynchronous programming, we had a first look at concurrency, using coroutines and an event loop to optimize I/O-bound tasks. In this class, we dive a little deeper into concurrency in Python and take a look at multithreading.

**Table of Contents**

- [Threads and Processes](#scrollTo=K3MSR-8x31WE)
- [Multithreading](#scrollTo=Fkz86bj94ARo)
- [Considerations When Using Multithreading](#scrollTo=nQ18LxcjJH90)


## **Threads and Processes**

Before diving into multithreading and multiprocessing, let's first understand the basics of threads and processes.

<br>

<figure align="center">
  <img src="https://raw.githubusercontent.com/kchenTTP/python-series/refs/heads/main/python_multithreading/assets/threads-vs-processes.png" alt="threads-vs-processes.png" />
  <figcaption>Threads vs processes</figcaption>
</figure>

<br>

- **Threads**

  - Lightweight units of execution within a process
  - Think of a thread as a "chain of command" within a process
  - A single thread can only execute one command at a time
  - Separate threads share the same memory space and resources of the parent process
  - Ideal for I/O-bound tasks, as the CPU can switch between threads when one is waiting for resources ([context switching](https://en.wikipedia.org/wiki/Context_switch))
  - Faster to create and destroy than processes
  - Easier to share data between threads, but requires careful synchronization to avoid issues like race conditions
  - Limited by the Global Interpreter Lock ([GIL](#scrollTo=K3MSR-8x31WE&line=39&uniqifier=1)) in CPython for CPU-bound tasks

<br>

- **Processes**

  - Heavyweight units of execution that can contain multiple threads
  - Think of a process as an independent “program” running on your CPU
  - Separate processes maintain their own distinct memory space and resources, enabling true [parallelism](https://en.wikipedia.org/wiki/Parallel_computing) across multiple CPU cores
  - Ideal for CPU-bound tasks, as they can utilize multiple CPU cores effectively
  - Processes have higher overhead for creation and require [inter-process communication](https://en.wikipedia.org/wiki/Inter-process_communication) to share data
  - Offer better isolation between tasks, reducing the risk of shared state issues
  - Since each process has its own Python interpreter instance, they are not limited by the [GIL](#scrollTo=K3MSR-8x31WE&line=39&uniqifier=1)
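
The memory-sharing point for threads can be checked with a small sketch. The names `shared` and `worker` are made up for this illustration; the sketch only assumes the standard `threading` and `os` modules.

```python
import os
import threading

shared = []  # lives in the process's memory, so every thread can see it


def worker(name: str) -> None:
  # Each thread records its name and the ID of the process it runs in
  shared.append((name, os.getpid()))


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
  t.start()
for t in threads:
  t.join()

# All three threads wrote into the same list and report the same process ID
print(sorted(name for name, _ in shared))
print({pid for _, pid in shared} == {os.getpid()})
```

Every thread appends to the same list object and reports the parent's process ID, which is what "sharing the memory space of the parent process" means in practice.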

<br>

> 🤓 **Python Global Interpreter Lock (GIL)**
>
> The Global Interpreter Lock (GIL) is a mutex that ensures only one thread can execute Python bytecode at a time. This protects the interpreter's internal state (such as reference counts) and keeps memory management safe. It does not make your own code free of race conditions.
>
> - In single-threaded programs, the GIL doesn't have much of an impact. However, in multithreaded programs, a thread must hold the GIL to run Python code, so one thread has to release it before another can acquire it. This limits parallelism and makes multithreading better suited for I/O-bound tasks, where threads release the GIL while they wait for external resources anyway.
>
> - In contrast, multiprocessing programs use separate Python interpreters for each process, which mitigates the GIL's impact but consumes more resources since each process requires its own memory space and CPU allocation.
>
> [GIL Documentation](https://wiki.python.org/moin/GlobalInterpreterLock)

<br>

<figure align="center">
  <img src="https://raw.githubusercontent.com/kchenTTP/python-series/refs/heads/main/python_multithreading/assets/python-single-process-gil.png" alt="python-single-process-gil.png" />
  <figcaption>Single process, multithreaded Python program</figcaption>
</figure>
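
To get a feel for the GIL's effect, here is a rough sketch that runs the same CPU-bound function twice in a row and then in two threads. The function `count_down` and the constant `N` are invented for this illustration, and the timings depend entirely on your machine and Python build, so no expected numbers are shown.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n: int) -> int:
  # Pure-Python loop: CPU-bound work that needs the GIL to run
  while n > 0:
    n -= 1
  return n


N = 2_000_000

# Run the function twice, one call after the other
start = time.perf_counter()
sequential = [count_down(N), count_down(N)]
sequential_time = time.perf_counter() - start

# Run the same two calls on two threads
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as executor:
  threaded = list(executor.map(count_down, [N, N]))
threaded_time = time.perf_counter() - start

print(f"Sequential:  {sequential_time:.3f} seconds")
print(f"Two threads: {threaded_time:.3f} seconds")
```

On a standard CPython build, the two timings usually come out similar, because the threads take turns holding the GIL instead of running side by side.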

### **Coroutines vs Threads vs Processes**

In the previous class, we learned about coroutines. Since coroutines and threads seem very similar and are both effective for I/O-bound tasks, it's important to understand how each approach handles concurrency and resource management.

<br>

<table>
  <thead>
    <tr>
      <th>Aspect</th>
      <th>Coroutines</th>
      <th>Threads</th>
      <th>Processes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Concurrency Model</strong></td>
      <td>Cooperative multitasking within a single thread</td>
      <td>Preemptive multitasking within a process</td>
      <td>Parallel execution with separate processes</td>
    </tr>
    <tr>
      <td><strong>Ideal for</strong></td>
      <td>I/O-bound tasks (waiting for resources)</td>
      <td>I/O-bound tasks, limited for CPU-bound tasks (GIL)</td>
      <td>CPU-bound tasks (true parallelism)</td>
    </tr>
    <tr>
      <td><strong>Resource Usage</strong></td>
      <td>Lightweight, minimal overhead</td>
      <td>More resource-intensive due to OS-level management</td>
      <td>Heavy resource usage due to process creation and IPC</td>
    </tr>
    <tr>
      <td><strong>Memory Sharing</strong></td>
      <td>Shared memory within a single thread</td>
      <td>Shared memory within the process, needs locks</td>
      <td>Separate memory spaces for each process</td>
    </tr>
    <tr>
      <td><strong>Synchronization</strong></td>
      <td>Rarely needed (runs in a single thread)</td>
      <td>Requires locks and synchronization</td>
      <td>Complex, requires inter-process communication (IPC)</td>
    </tr>
    <tr>
      <td><strong>Parallelism</strong></td>
      <td>No, runs in a single thread</td>
      <td>No true parallelism (limited by GIL in CPython)</td>
      <td>Yes, true parallelism using multiple CPU cores</td>
    </tr>
    <tr>
      <td><strong>Ease of Use</strong></td>
      <td>Simpler, no need for locks or complex management</td>
      <td>More complex due to race conditions and deadlocks</td>
      <td>Most complex, with higher overhead and IPC requirements</td>
    </tr>
    <tr>
      <td><strong>Use Cases</strong></td>
      <td>Handling many network or file I/O tasks simultaneously</td>
      <td>Concurrent tasks within a process, some CPU tasks</td>
      <td>CPU-intensive tasks that need full core utilization</td>
    </tr>
  </tbody>
</table>


## **Multithreading**

Multithreading means using multiple threads to execute tasks concurrently within the same process. While one thread waits (for example, on a network response), another can run, so the program spends less time idle and makes better use of CPU resources.

<br>

In Python, there are several ways to achieve multithreading, each with unique syntax and applications. This section will cover the basics of the following methods:

- `asyncio.to_thread()`
- `concurrent.futures.ThreadPoolExecutor()`

> 🚨 While you can use the `threading` library to manage individual threads, it's often unnecessary for the simpler tasks we usually encounter. Therefore, we only touch on it briefly as additional material later in this section.

<br>

**Imports**

- `time`: Functions for time-based operations
- `asyncio`: Library for writing concurrent code with async I/O
- `requests`: Synchronous HTTP library
- `typing`: Provides type hinting
- `rich`: For enhanced text formatting and display


In [None]:
import time
import asyncio
import requests
from typing import Any, Callable, Coroutine
from rich.pretty import pprint

### **`asyncio`**

Previously, we used `asyncio` to write asynchronous code with Python's coroutine-based event loop. For asynchronous HTTP requests, we used the `httpx` library, as the synchronous `requests` library can't work directly with `asyncio` coroutines.

<br>

But what if there's no async-compatible alternative for a library?

<br>

In such cases, multithreading allows us to run multiple instances of blocking code on separate threads, achieving concurrency through context switching. With `asyncio`, we can use the `asyncio.to_thread()` function to run synchronous code in a separate thread while still keeping the benefits of asynchronous operations.
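
To see that `asyncio.to_thread()` really moves work off the event loop's thread, here is a minimal sketch that needs no network access. The helper names `blocking_task`, `coroutine_task`, and `main` are made up for this illustration. It uses `asyncio.run()`, which works in a script; in a notebook, where an event loop is already running, you would write `await main()` instead.

```python
import asyncio
import threading


def blocking_task() -> str:
  # Regular (synchronous) function: report which thread runs it
  return threading.current_thread().name


async def coroutine_task() -> str:
  # Coroutine: runs on the event loop's own thread
  await asyncio.sleep(0)
  return threading.current_thread().name


async def main() -> tuple[set[str], set[str]]:
  loop_threads = set(await asyncio.gather(*[coroutine_task() for _ in range(3)]))
  worker_threads = set(
    await asyncio.gather(*[asyncio.to_thread(blocking_task) for _ in range(3)])
  )
  return loop_threads, worker_threads


# In a notebook, replace this line with: loop_threads, worker_threads = await main()
loop_threads, worker_threads = asyncio.run(main())
print(f"Coroutines ran on: {loop_threads}")
print(f"to_thread calls ran on: {worker_threads}")
```

All coroutines report the same thread, while the `to_thread` calls report worker threads from the event loop's default thread pool.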




Let's look at an example of visiting multiple web pages. We have a function, `download_page()`, that takes a URL and fetches the page content. Since `requests.get(url)` is a blocking operation, we can create a wrapper function, `async_download_page()`, which uses `asyncio.to_thread()` to run this blocking code in a separate thread, making it non-blocking.


In [None]:
urls = [
  "https://www.example.com",
  "https://www.python.org",
  "https://www.github.com",
  "https://www.stackoverflow.com",
  "https://www.wikipedia.org"
]

In [None]:
def download_page(url: str) -> dict:
  print(f"Downloading: {url}")
  resp = requests.get(url)  # blocking code
  resp.raise_for_status()
  time.sleep(0.5)  # blocking code that simulates downloading page content
  return {
      "url": url,
      "status": resp.status_code,
      "length": len(resp.text)
  }


async def async_download_page(url: str) -> dict:
  return await asyncio.to_thread(download_page, url)


# Sync page download
start = time.perf_counter()
print("Blocking code execution...")
for url in urls:
  pprint(download_page(url))
print("---")
print(f"Time taken: {time.perf_counter() - start:.4f} seconds")

print("=========")
print()

# Multithreading page download
start = time.perf_counter()
print("Multithreading code execution...")
result = await asyncio.gather(*[async_download_page(url) for url in urls])
pprint(result)
print("---")
print(f"Time taken: {time.perf_counter() - start:.4f} seconds")

Blocking code execution...
Downloading: https://www.example.com


Downloading: https://www.python.org


Downloading: https://www.github.com


Downloading: https://www.stackoverflow.com


Downloading: https://www.wikipedia.org


---
Time taken: 3.3968 seconds

Multithreading code execution...
Downloading: https://www.example.com
Downloading: https://www.python.org
Downloading: https://www.github.com
Downloading: https://www.stackoverflow.com
Downloading: https://www.wikipedia.org


---
Time taken: 0.8047 seconds


### **`concurrent.futures.ThreadPoolExecutor()`**

The `concurrent.futures` module provides a high-level interface for asynchronously executing callables using threads or processes. The `ThreadPoolExecutor` class allows you to manage a pool of threads, making it easy to submit tasks and retrieve results without needing to manage the threads manually.

<br>

Here's a quick overview of how it works:

  - **Create a Thread Pool**: Instantiate a `ThreadPoolExecutor` with a specified number of threads (`max_workers`). If you don't specify one, it defaults to `min(32, cpu_count + 4)` on Python 3.8 and later, where `cpu_count` is the number of CPUs Python detects.
  - **Submit Tasks**: Submit tasks using the `submit()` method, which returns a ***`Future`*** object representing the execution of the callable.
  - **Gather Results**: Use the `result()` method on the ***`Future`*** object to obtain the result. The call blocks until the task has completed.

  > ***`Futures`*** represent the eventual result (promise) of an asynchronous computation. They provide methods to check if the computation is done, wait for its completion, and retrieve the result.
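
The three steps above can be sketched without any network access. The functions `slow_add` and `fail` are made up for this illustration; `submit()`, `result()`, `exception()`, and `done()` are the standard `Future` methods.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_add(a: int, b: int) -> int:
  time.sleep(0.2)  # stand-in for a slow, blocking operation
  return a + b


def fail() -> None:
  raise ValueError("boom")


with ThreadPoolExecutor(max_workers=2) as executor:
  ok_future = executor.submit(slow_add, 1, 2)  # returns immediately
  bad_future = executor.submit(fail)

  total = ok_future.result()      # blocks until slow_add finishes
  error = bad_future.exception()  # returns the raised exception instead of re-raising it

print(total)              # → 3
print(repr(error))        # → ValueError('boom')
print(ok_future.done())   # → True
```

Calling `bad_future.result()` instead of `exception()` would re-raise the `ValueError` in the calling thread, which is how errors from worker threads normally surface.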

<br>

> 📒 **Note:** You can use the `as_completed()` function to retrieve results as soon as they become available. It yields futures in the order they finish, not the order in which the tasks were submitted.
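
To make the ordering difference concrete, here is a small sketch in which the first task submitted is the slowest one. The function `slow_square` and the `jobs` list are made up for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(n: int, delay: float) -> int:
  time.sleep(delay)  # stand-in for a blocking operation
  return n * n


# (value, delay) pairs: the first task submitted is the slowest
jobs = [(1, 0.3), (2, 0.2), (3, 0.1)]

with ThreadPoolExecutor(max_workers=3) as executor:
  futures = [executor.submit(slow_square, n, delay) for n, delay in jobs]

  # Completion order: depends on how long each task takes
  completion_order = [future.result() for future in as_completed(futures)]

  # Submission order: read the futures in the order they were created
  submission_order = [future.result() for future in futures]

print(f"as_completed order: {completion_order}")
print(f"submission order:   {submission_order}")  # → [1, 4, 9]
```

If you need results in submission order, iterate over the list of futures yourself (or use `executor.map()`); use `as_completed()` when you want to start processing whichever result arrives first.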


In [None]:
from concurrent.futures import ThreadPoolExecutor, as_completed

In [None]:
start = time.perf_counter()
print("Multithreading code execution...")

results = []
max_workers = len(urls)

# Create a thread pool
with ThreadPoolExecutor(max_workers=max_workers) as executor:
  # Submit all tasks and get future objects
  future_to_url = {
    executor.submit(download_page, url): url
    for url in urls
  }

  # Collect results as they complete; the loop finishes before any code below runs
  for future in as_completed(future_to_url):
    results.append(future.result())

pprint(results)
print("---")
print(f"Time taken: {time.perf_counter() - start:.4f} seconds")

Multithreading code execution...
Downloading: https://www.example.com
Downloading: https://www.python.org
Downloading: https://www.github.com
Downloading: https://www.stackoverflow.com
Downloading: https://www.wikipedia.org


---
Time taken: 0.7961 seconds


Nice, our code is running much faster now that we are using threads!

### **`threading` (Additional Material)**

The `threading` library in Python is a low-level module that allows you to create and manage threads manually. While it gives you more control over thread behavior, such as when threads are started and joined, it also requires more careful management to avoid common pitfalls like race conditions.

<br>

Here's a quick overview of how it works:

  -	**Creating Threads**: You can create a new thread by instantiating the `Thread` class and passing it a target function to execute.
  -	**Starting and Joining Threads**: Once a thread is created, you can start it with the `start()` method. To ensure the main program waits for a thread to complete its execution, you use the `join()` method.
  -	**Synchronization**: Since threads share the same memory space, synchronization mechanisms (like locks, semaphores, and conditions) are essential to prevent conflicts when accessing shared resources.

<br>

> 📒 **Note:** While the `threading` library provides flexibility, for most scenarios, especially I/O-bound tasks, `concurrent.futures.ThreadPoolExecutor()` offers a simpler and more efficient approach to concurrent programming in Python.

In [None]:
import threading


def download_page_t(url: str, results: list[Any]):
  """
  A thread's target function cannot hand its return value back to the caller, so we store the results in a shared list.
  Warning: This code is not explicitly synchronized. For a guarded append, share a single `threading.Lock()` instance between all threads.
  """
  resp = requests.get(url)
  time.sleep(0.5)
  results.append(
    {
      "url": url,
      "status": resp.status_code,
      "length": len(resp.text)
    }
  )


# One lock shared by every thread; a lock created inside the function would protect nothing
results_lock = threading.Lock()


def download_page_t_safe(url: str, results: list[Any]):
  """
  Thread-safe version.
  """
  try:
    resp = requests.get(url)
    time.sleep(0.5)
    # Thread-safe append operation: all threads acquire the same lock
    with results_lock:
      results.append({"url": url, "status": resp.status_code, "length": len(resp.text)})
  except Exception as e:
    with results_lock:
      results.append((url, f"Error: {str(e)}"))


start = time.perf_counter()
print("Multithreading code execution...")

# Create a list to hold results
results = []
threads = []

for url in urls:
  # Create a thread for each URL
  thread = threading.Thread(target=download_page_t_safe, args=(url, results))
  threads.append(thread)
  thread.start()

# Wait for all threads to complete
for thread in threads:
  thread.join()

pprint(results)
print("---")
print(f"Time taken: {time.perf_counter() - start:.4f} seconds")

Multithreading code execution...


---
Time taken: 0.8122 seconds
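
As an alternative to guarding a list with a lock, results can be passed through `queue.Queue` from the standard library, which handles the locking internally. The sketch below replaces the download with a short sleep so it runs offline; the function `work` and the variable `out` are made up for this illustration.

```python
import queue
import threading
import time


def work(n: int, out: queue.Queue) -> None:
  time.sleep(0.1)      # stand-in for a blocking download
  out.put((n, n * n))  # Queue.put is safe to call from multiple threads


out = queue.Queue()
threads = [threading.Thread(target=work, args=(n, out)) for n in range(5)]

for thread in threads:
  thread.start()
for thread in threads:
  thread.join()

# Every thread has finished, so the queue holds exactly one item per thread
results = sorted(out.get() for _ in threads)
print(results)  # → [(0, 0), (1, 1), (2, 4), (3, 9), (4, 16)]
```

The results are sorted here because the order in which threads put items on the queue depends on scheduling.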


### **Example: Counting Machine**

> 💡 Multithreading isn't just about optimizing your code. It's also used when you need your code to do multiple things at once (concurrency).

<br>

Consider a simple but practical application of multithreading: a counter that starts at any integer (or zero by default), displays the current count every second, and allows you to manually adjust the count by entering a number or stop the counter by entering 'stop'.

<br>

<div align="center">
<img src="https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExOGozaHB6d2k2ZGNqZmJnN3U3dG51eTRtZ3hnOTUyYXpqemYwdjU5MiZlcD12MV9naWZzX3NlYXJjaCZjdD1n/08GcL1Wmk88Z8ggOI8/giphy.gif" alt="counting numbers" />
<figcaption>How do you count numbers again?</figcaption>
</div>

To achieve this, we can employ two threads:

1. **Counter Thread:**
   - Displays the current count every second.

2. **Input Thread:**
   - Monitors user input.
   - Adds the entered number to the count.

By using separate threads, we can ensure that the counter keeps ticking while simultaneously allowing for user interaction.


In [None]:
class Counter:
  def __init__(self, count: int = 0):
    self.count = count
    self.running = True

  def counter(self):
    while self.running:
      print(f"Current count: {self.count}")
      time.sleep(1)

  def check_input(self):
    while self.running:
      input_value = input("Enter a positive, negative integer, or 'stop': ")
      if input_value.lower() == "stop":
        self.running = False
        break

      try:
        self.count += int(input_value)
      except ValueError:
        print("Invalid input. Please enter a positive or negative integer.")

  def start(self):
    print(f"Counter - starting count: {self.count}")
    print("Enter 'stop' to stop the counter")
    print("==========")
    time.sleep(3)

    with ThreadPoolExecutor(max_workers=2) as executor:
      executor.submit(self.counter)
      executor.submit(self.check_input)

In [None]:
counter = Counter()
counter.start()

Counter - starting count: 0
Enter 'stop' to stop the counter
Current count: 0
Current count: 0
Current count: 0
Enter a positive, negative integer, or 'stop': 2
Current count: 2
Enter a positive, negative integer, or 'stop': 5
Current count: 2
Current count: 7
Current count: 7
Current count: 7
Current count: 7
Current count: 7
Enter a positive, negative integer, or 'stop': 10
Current count: 17
Current count: 17
Enter a positive, negative integer, or 'stop': 12
Current count: 17
Current count: 29
Current count: 29
Current count: 29
Current count: 29
Current count: 29
Enter a positive, negative integer, or 'stop': -20
Current count: 9
Current count: 9
Enter a positive, negative integer, or 'stop': stop


Pretty cool, right? Our counter app can do two things at once, thanks to the magic of threads!
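
The `running` flag is one way to tell a thread to stop. A common alternative is `threading.Event`, sketched below without user input so it can run non-interactively. The names `stop_event`, `ticks`, and `ticker` are made up for this illustration and are not part of the `Counter` class above.

```python
import threading
import time

stop_event = threading.Event()
ticks = []


def ticker(interval: float) -> None:
  # Event.wait() returns False when the timeout expires and True once the event is set,
  # so the loop ticks every `interval` seconds and exits promptly after set() is called
  while not stop_event.wait(interval):
    ticks.append(time.perf_counter())


thread = threading.Thread(target=ticker, args=(0.05,))
thread.start()

time.sleep(0.3)   # the main thread does something else for a while
stop_event.set()  # ask the ticker to stop
thread.join()

print(f"Ticker ran {len(ticks)} times before stopping")
```

Compared with a plain boolean checked between `time.sleep()` calls, the event wakes the waiting thread as soon as it is set, so the thread does not have to finish its current sleep first.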


## **Considerations When Using Multithreading**

When implementing multithreading, there are several critical factors to keep in mind to ensure optimal performance and avoid common pitfalls:

- **Thread Safety and Synchronization**

  In a multithreaded environment, threads accessing shared data can cause issues such as race conditions if not handled properly. Use locks, semaphores, or barriers to maintain thread safety, but avoid overusing them as they can reduce performance. When data consistency is essential, carefully consider which synchronization method best suits your needs.

- **Overhead and Resource Costs**

  Managing multiple threads introduces system overhead, which can reduce performance gains, especially for lightweight tasks. Multithreading is most effective for long-running or I/O-bound tasks, where the performance gain from concurrency justifies the added overhead.

- **Global Interpreter Lock (GIL) Limitations in Python**

  Python's GIL prevents true parallel execution of threads on CPU-bound tasks, limiting speed improvements. For CPU-intensive tasks, consider multiprocessing instead, as it bypasses the GIL and allows real parallelism.
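
The thread-safety point above can be sketched with a shared counter. `SafeCounter` and `add_many` are hypothetical names for this illustration. The key detail is that all threads use the same `threading.Lock` object.

```python
import threading


class SafeCounter:
  def __init__(self) -> None:
    self.value = 0
    self._lock = threading.Lock()  # one lock, shared by every caller

  def increment(self) -> None:
    # `self.value += 1` is a read-modify-write sequence; the lock makes sure
    # only one thread performs it at a time
    with self._lock:
      self.value += 1


def add_many(counter: SafeCounter, times: int) -> None:
  for _ in range(times):
    counter.increment()


counter = SafeCounter()
threads = [
  threading.Thread(target=add_many, args=(counter, 10_000))
  for _ in range(4)
]

for thread in threads:
  thread.start()
for thread in threads:
  thread.join()

print(counter.value)  # → 40000
```

Without the lock, two threads could read the same old value and both write back the same new one, losing an update. Whether that actually happens in a given run depends on the interpreter version and thread scheduling, which is what makes race conditions hard to reproduce.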

<br>

Evaluating these factors can help you decide if multithreading is the best approach or if alternatives like multiprocessing or asynchronous programming might be more effective.

In the next class, we will cover multiprocessing.