# H - Multi-Threading

Differences between processes and threads

## What is it?
Multi-threading enables us to do more than one thing at the same time.  On multi-core CPUs, threads can occupy more than one core and perform work concurrently.  Even on a single core, threads can take turns running, so that if one thread is waiting for a response (network, user input, disk read, etc.), another thread can still be productive.

**Options:**
* Single thread (a process with a single thread)
* Multiple threads (a process with multiple threads)
* Multiple processes (multiple processes each having one or more threads)
* Single thread using asyncio (library enabling a single thread to manage multiple I/O-bound tasks concurrently)
* Stackless Python - uses tasklets (the greenlet library offers similar micro-threads for standard Python), which act like threads within a thread to manage multiple tasks pseudo-concurrently like asyncio.
* Native Libraries that handle concurrency on their own - numpy, pytorch, etc
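
To make the asyncio option concrete, here is a minimal sketch of one thread juggling three waiting tasks. The `fake_io` coroutine and its delays are hypothetical stand-ins for real network or disk calls.

```python
import asyncio
import time

async def fake_io(name, delay):
    # asyncio.sleep stands in for a network or disk wait (hypothetical task)
    await asyncio.sleep(delay)
    return f"{name} finished after {delay}s"

async def main():
    # gather() runs the three coroutines concurrently on a single thread
    return await asyncio.gather(
        fake_io("task-1", 0.2),
        fake_io("task-2", 0.2),
        fake_io("task-3", 0.2),
    )

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

for line in results:
    print(line)
print(f"Total: {elapsed:.2f}s")
```

The total should be close to the longest single wait rather than the sum of all three. Inside a Jupyter notebook an event loop is already running, so use `results = await main()` in place of `asyncio.run(main())`.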

<img src="https://raw.githubusercontent.com/a8ksh4/python_workshop/refs/heads/main/images/multithreading-comparison.png" width=600>

## Reasons for needing threading/concurrency
* Performance - get things done faster
  * Avoid doing nothing while waiting for a long I/O-bound operation to complete
* Responsiveness - one thread can check for user input while other threads do the work efficiently

## Pitfalls of threading
* The Python GIL limits actual parallelism for CPU-bound code, which limits performance.
  * Use multiprocessing or native extensions (e.g., NumPy, PyTorch) for parallel computation.
* Race conditions - multiple threads try to use the same resources at the same time.
  * Use threading.Lock() or higher-level synchronization primitives.
* Deadlocks - if a thread never releases a lock, or two threads each wait for a lock the other holds, work in other threads is blocked.
  * Prefer context managers (with lock:) over calling lock.acquire() / lock.release() manually, so the lock is released even if an exception is raised.
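
One common way to avoid a deadlock when code needs two locks is to always acquire them in the same order. The sketch below assumes two hypothetical account balances guarded by `lock_a` and `lock_b`; the names and amounts are illustrative only.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
balances = {"a": 5000, "b": 5000}  # hypothetical shared data

def move_a_to_b(amount, repeats):
    for _ in range(repeats):
        # Acquire lock_a first, then lock_b
        with lock_a, lock_b:
            balances["a"] -= amount
            balances["b"] += amount

def move_b_to_a(amount, repeats):
    for _ in range(repeats):
        # Same order here: taking lock_b first could deadlock against move_a_to_b
        with lock_a, lock_b:
            balances["b"] -= amount
            balances["a"] += amount

t1 = threading.Thread(target=move_a_to_b, args=(2, 1000))
t2 = threading.Thread(target=move_b_to_a, args=(1, 1000))
t1.start()
t2.start()
t1.join()
t2.join()

print(balances)  # {'a': 4000, 'b': 6000}
```

If `move_b_to_a` took `lock_b` before `lock_a`, each thread could end up holding one lock while waiting forever for the other.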

## Threading library 

**Create a new thread**
Pass it the function it will execute and the arguments to pass to that function.
```
import threading
import time

def worker(name_of_worker):
    time.sleep(1)  # placeholder for the real work
    print(f'{name_of_worker} is done')

t1 = threading.Thread(target=worker, args=("Thread-A",))
```

**Start the thread**
Threads don't start running until told to.
```
t1.start()
```

**Check if the thread is still running**
```
if t1.is_alive():
    print("Thread is still running...")
```

**Wait for the thread to finish**
`join()` is a blocking call that waits for the thread to finish; it does not stop the thread.  Don't call it while the main thread still has other work to do alongside the worker thread.
```
t1.join()
```
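
If the main thread needs to keep working while it waits, `join()` accepts a `timeout` argument and `is_alive()` reports whether the thread has finished. A minimal sketch, with a hypothetical `slow_worker` standing in for real work:

```python
import threading
import time

def slow_worker():
    time.sleep(0.5)  # stands in for real work

t = threading.Thread(target=slow_worker)
t.start()

checks = 0
while t.is_alive():
    # join(timeout=...) returns after at most 0.1 s, whether or not the thread finished
    t.join(timeout=0.1)
    checks += 1
    print("main thread is free to do other work here")

print(f"worker finished after {checks} checks")
```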

### Locks
Threads share memory, so we need to prevent them from reading and changing the same data at the same time.  A lock (also called a mutex) lets only one thread at a time run the protected block.

```
lock = threading.Lock()
...
with lock:
    do_something_with_shared_resource()
```

## Processes vs threads
Processes are allocated by the operating system and have their own protected memory.  Creating and destroying processes is expensive, and sharing data between processes is difficult.  Each process can start multiple threads that all share the process's memory, so threads can communicate easily, e.g. worker threads can take turns retrieving tasks from a queue in "shared memory" and work on them in parallel.

<img src="https://raw.githubusercontent.com/a8ksh4/python_workshop/refs/heads/main/images/threading_vs_processes.png" width=600>
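
The worker-queue pattern mentioned above can be sketched with the standard library's thread-safe `queue.Queue`. The task (squaring numbers) and the names `tasks`, `results`, and `worker` are illustrative assumptions.

```python
import queue
import threading

tasks = queue.Queue()
for n in range(10):
    tasks.put(n)

results = []
results_lock = threading.Lock()

def worker():
    while True:
        try:
            # get_nowait() raises queue.Empty once every task has been taken
            n = tasks.get_nowait()
        except queue.Empty:
            return
        with results_lock:
            results.append(n * n)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

Because all three threads live in the same process, they can share `tasks` and `results` directly with no copying or serialization.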


## What is the GIL?
The GIL (Global Interpreter Lock) is a mutex that lets only one thread execute Python bytecode at a time.  Python originally adopted the GIL to keep the interpreter simple and maintainable and to favor single-threaded performance.  Many Python programs are slowed more by I/O wait than by CPU work and see little performance penalty from the GIL, since a thread releases it while waiting on I/O.

Work is ongoing to remove the GIL: since Python 3.13, CPython offers an optional free-threaded build without it, if you need truly parallel multi-threading in Python.  Just note that some libraries are not thread safe or have not yet been updated to work without the GIL.

https://py-free-threading.github.io/running-gil-disabled/

<img src="https://raw.githubusercontent.com/a8ksh4/python_workshop/refs/heads/main/images/multithreading-gil.png" width=600>
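
To see which kind of interpreter you are running, the following sketch checks for `sys._is_gil_enabled()`, which is available on Python 3.13 and later. On older versions the function is absent, so the sketch assumes the GIL is on.

```python
import sys

# sys._is_gil_enabled() exists on Python 3.13+; older versions always have the GIL
check = getattr(sys, "_is_gil_enabled", None)
gil_enabled = check() if check is not None else True

print(f"Python {sys.version_info.major}.{sys.version_info.minor}, GIL enabled: {gil_enabled}")
```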


# Simple first example
Let's create two threads that each execute the worker function.  
* threading.Thread returns a handle for a new thread
  * target= specifies the function it will run
  * args= specifies any arguments that should be passed to the function
* t.start() tells the thread to start running.
* t.join() waits for the thread to complete. This is "blocking". 

In [None]:
import threading
import time

def worker(name):
    for i in range(3):
        print(f"{name} is working on step {i}")
        time.sleep(1)

# Create threads
t1 = threading.Thread(target=worker, args=("Thread-A",))
t2 = threading.Thread(target=worker, args=("Thread-B",))

# Start threads
t1.start()
t2.start()

# Wait for threads to finish
t1.join()
t2.join()

print("All threads are done.")

## Working with shared memory
What happens when several threads use the same counter variable/object with no coordination?  In this example, each thread should add 10000 to our counter, so the counter should reach 50000 with 5 threads.

Run it a couple of times, then uncomment the time.sleep line and run it a few more times.  time.sleep(0) releases the GIL and invites a context switch to another thread.  A switch can happen at almost any point in our code, but it may be infrequent and hard to observe without the sleep to make it obvious.

Why does the counter total change from 50000 to 10000 or 10001, 10002, 10003, ... when adding the sleep?

Each thread reads the counter and adds to it in its own local variable.  If a context switch lets another thread run before the first thread writes its change back, the second thread reads the same old value, so both of them add 1 to the same number and one increment is lost.

With more complicated code these interactions can cause all kinds of unintended behavior.
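
Even a one-line `counter += 1` is a separate read, add, and write underneath. As a rough illustration, the sketch below uses the standard `dis` module to list the bytecode instructions for a hypothetical `add_one` function; the exact instruction names vary between Python versions.

```python
import dis

counter = 0

def add_one():
    global counter
    counter += 1

# Collect the instruction names the interpreter runs for add_one()
ops = [ins.opname for ins in dis.get_instructions(add_one)]
print(ops)
```

The list contains a `LOAD_GLOBAL` (read) some steps before a `STORE_GLOBAL` (write), and a thread switch can land between them.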

In [None]:
import threading
import time  # needed once the time.sleep line below is uncommented

counter = 0

def increment():
    global counter
    for _ in range(10000):
        new_value = counter + 1
        # time.sleep(0)
        counter = new_value

threads = [threading.Thread(target=increment) for _ in range(5)]

for t in threads:
    t.start()
for t in threads:
    t.join()

print("Final counter value:", counter)  # Less than 50000 whenever updates are lost!

Let's fix this unexpected behavior with a lock that lets only one thread at a time execute the contentious bit of code.  Even if a task switch happens in the middle, no other thread can read the counter between the moment one thread reads it and the moment it writes the new value back.  The read-and-update block is effectively atomic.

In [None]:
import threading
import time

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(10000):
        with lock:
            # only one thread at a time can execute this code block
            new_value = counter + 1
            time.sleep(0)
            counter = new_value

# Create 5 threads
threads = [threading.Thread(target=increment) for _ in range(5)]

for t in threads:
    t.start()
for t in threads:
    t.join()

print("Final counter value:", counter)  # Always 50000!

Final counter value: 50000


## Problem
The following code downloads several URLs in a single thread using a loop. Rewrite it in the cell below using threads, and measure how long it takes to complete compared to the single-threaded version.

**Bonus tasks**
* Add print statements to the thread function to say what each one is doing.  Add a name argument so they can say which thread they are.
* Have the thread function acquire a lock, then pop a URL off the shared "urls" list, rather than initializing each thread with its own URL.

In [None]:
# Example single threaded code:
import requests
import time

urls = [
    'https://example.com',
    'https://httpbin.org/delay/2',
    'https://httpbin.org/uuid',
    'https://httpbin.org/ip',
]

def save_url(i, url):
    # i is passed in explicitly so each result goes to its own file
    response = requests.get(url, timeout=10)
    with open(f'result_{i}.txt', 'w') as f:
        f.write(response.text)

start_time = time.time()
for i, url in enumerate(urls):
    save_url(i, url)
end_time = time.time()
print(f"Single-threaded execution time: {end_time - start_time:.2f} seconds")

In [None]:
# Multi-threaded version: fill in the ... placeholders
import requests
import time

urls = [
    'https://example.com',
    'https://httpbin.org/delay/2',
    'https://httpbin.org/uuid',
    'https://httpbin.org/ip',
]

def save_url(i, url):
    # i is passed in explicitly; a shared global index is not safe across threads
    response = requests.get(url, timeout=10)
    with open(f'result_{i}.txt', 'w') as f:
        f.write(response.text)

# Create threads for each url
...

start_time = time.time()
# Start the threads and see how long until they are all done
...

# Wait for the threads to finish
...

end_time = time.time()
print(f"Multi-threaded execution time: {end_time - start_time:.2f} seconds")