# H - Multi-Threading

Differences between processes and threads

## What is it?
Multi-threading lets a program do more than one thing at the same time.  On multi-core CPUs, threads can occupy more than one core and perform work in parallel.  Even on a single core, threads can take turns running, so that if one thread is waiting for a response (network, user input, disk read, etc.) another thread can still be productive.  


### Processes vs threads
Processes are created by the operating system and each has its own protected memory.  Creating and destroying processes is expensive, and sharing data between processes is difficult. Each process can start multiple threads that all share the process's memory, so threads can communicate directly, e.g. worker threads can take turns retrieving tasks from a queue in "shared memory" and work on them in parallel.  Because the memory is shared, we need to take precautions, using locks and/or mutexes, so that threads do not read and change the same data at the same time.  
* A lock can be used to protect a critical section of code.  A task switch to another thread can still happen inside the section, but no other thread can enter it until the lock is released.
* A mutex (mutual exclusion) can be used to ensure only one thread interacts with something in memory at a time.  This is sort of like a baton.  Each thread waits to receive the mutex before doing its work and then releases it for another thread.  In Python, `threading.Lock` plays this role.
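
The worker/queue idea above can be sketched as follows. This is a minimal illustration, not a definitive pattern: the function name `run_workers` and the "square each number" task are made up for the example. It uses the standard library's `queue.Queue`, which does its own locking, plus a `threading.Lock` to guard a shared results list.

```python
import queue
import threading

def run_workers(tasks, num_workers=3):
    """Square each number in `tasks` using a few worker threads (hypothetical helper)."""
    task_queue = queue.Queue()        # thread-safe queue living in shared memory
    results = []                      # shared list that every worker appends to
    results_lock = threading.Lock()   # the "baton" guarding `results`

    # Load every task before the workers start, so an empty queue means "done"
    for task in tasks:
        task_queue.put(task)

    def worker():
        while True:
            try:
                n = task_queue.get_nowait()
            except queue.Empty:
                return                # no tasks left, the worker exits
            with results_lock:        # only one thread appends at a time
                results.append(n * n)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)            # sort because completion order is not fixed

print(run_workers([1, 2, 3, 4, 5]))   # [1, 4, 9, 16, 25]
```

The results are sorted before returning because which worker picks up which task, and in what order they finish, can differ from run to run.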

## What for?
Using multiple threads lets us run multiple operations in parallel.  We might want to do this for a few reasons:
* Tasks that have blocking operations can be executed in parallel so whichever are currently unblocked get to run while the waiting ones sit idle. 
  * Web Servers can use a thread for each user session
* Large operations can be broken up into smaller chunks and run in parallel on more than one CPU core to improve performance.  In standard Python the GIL, described below, limits this benefit for pure-Python code.
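
As a rough sketch of the first case, the example below runs several blocking operations in separate threads. The names `fake_request` and `fetch_all` are hypothetical, and `time.sleep` stands in for a real blocking call such as a network request.

```python
import threading
import time

def fake_request(name, results, delay=0.2):
    """Pretend to wait on the network, then record a result (hypothetical helper)."""
    time.sleep(delay)                  # stands in for a blocking I/O call
    results[name] = f"{name} done"     # each thread writes its own key

def fetch_all(names):
    results = {}
    threads = [
        threading.Thread(target=fake_request, args=(name, results))
        for name in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                       # wait until every request has finished
    return results

start = time.perf_counter()
results = fetch_all(["a", "b", "c"])
elapsed = time.perf_counter() - start

print(sorted(results.values()))        # ['a done', 'b done', 'c done']
print(f"elapsed: {elapsed:.2f}s")
```

Because the three waits overlap, the elapsed time should be close to one delay rather than the sum of all three, although the exact figure depends on the machine.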

**Alternatives**
Many libraries are written in C and implement threading on their own to improve performance for their specific tasks. Examples:
* numpy - pandas is built on this. 
* pytorch - used for neural networks
* opencv - for image processing
* ...


## What is the GIL?
The GIL - Global Interpreter Lock - is a sort of mutex that only lets one thread execute Python code at a time. Python was initially designed with the GIL to keep the language simpler and more maintainable and to focus on single-threaded performance.  Many Python programs are slowed more by I/O wait than by CPU work and don't see a performance penalty from the GIL. 

Work is ongoing to remove the GIL, and standard Python can now be built without it if you need truly parallel multi-threading in Python.  Just note that some libraries are not thread safe or have not yet been updated to work without the GIL.  

https://py-free-threading.github.io/running-gil-disabled/
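
To see which kind of interpreter you are running, something like the sketch below may help. It assumes Python 3.13 or later for `sys._is_gil_enabled()` and falls back to "GIL enabled" on older versions where that function does not exist. The helper name `gil_status` is made up for this example.

```python
import sys
import sysconfig

def gil_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds, 0 or None otherwise
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # sys._is_gil_enabled() only exists on Python 3.13+.
    # Assumption: on older versions the GIL is always on.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True

    return free_threaded_build, gil_enabled

free_threaded_build, gil_enabled = gil_status()
print("Free-threaded build:", free_threaded_build)
print("GIL currently enabled:", gil_enabled)
```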

# Simple first example
Let's create two threads that each execute the worker function.  
* threading.Thread returns a handle for a new thread
  * target= specifies the function it will run
  * args= specifies any arguments that should be passed to the function
* t.start() tells the thread to start running.
* t.join() waits for the thread to complete. This call is "blocking": the caller pauses until the thread finishes. 

In [3]:
import threading
import time

def worker(name):
    for i in range(3):
        print(f"{name} is working on step {i}")
        time.sleep(1)

# Create threads
t1 = threading.Thread(target=worker, args=("Thread-A",))
t2 = threading.Thread(target=worker, args=("Thread-B",))

# Start threads
t1.start()
t2.start()

# Wait for threads to finish
t1.join()
t2.join()

print("All threads are done.")

Thread-A is working on step 0
Thread-B is working on step 0
Thread-A is working on step 1
Thread-B is working on step 1
Thread-A is working on step 2
Thread-B is working on step 2
All threads are done.


## Working with shared memory
What happens when several threads each try to use a counter variable/object with no coordination?  In this example, each thread should add 10000 to our counter, so the counter should increment up to 50000 with 5 threads.  

Run it a couple of times, then uncomment the time.sleep line and run it a few more times. time.sleep(0) gives other threads a chance to run, which makes a context switch at that point very likely.  A context switch always has a chance of happening at any point in our code, but it may be infrequent and hard to observe without adding the time.sleep to make it obvious. 

Why does the counter total change from 50000 to 10000 or 10001, 10002, 10003, ... when adding the sleep?  

Each thread reads the counter and adds to it in its own local variable.  If a context switch lets another thread run before the first thread writes its change back to the counter, the second thread reads the same old value and makes the same change, e.g. both of them add 1 to the same number, so one of the updates is lost.

With more complicated code these interactions can cause all kinds of unintended behavior.
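
The lost update can be replayed by hand without any threads. The sketch below walks through one unlucky ordering step by step; the names `a_new_value` and `b_new_value` are invented to stand for each thread's local variable.

```python
def replay_lost_update():
    """Replay one unlucky interleaving of two 'threads', A and B, in a fixed order."""
    counter = 0

    # Thread A reads the counter and computes its new value
    a_new_value = counter + 1    # A sees 0 and plans to write 1

    # Context switch: thread B runs before A has written anything back
    b_new_value = counter + 1    # B also sees 0 and plans to write 1
    counter = b_new_value        # B writes 1

    # Switch back to thread A, which still holds its stale result
    counter = a_new_value        # A overwrites the counter with 1

    return counter

print(replay_lost_update())      # 1, even though two increments ran
```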

In [None]:
import threading
import time  # needed once the time.sleep(0) line below is uncommented

counter = 0

def increment():
    global counter
    for _ in range(10000):
        new_value = counter + 1
        # time.sleep(0)
        counter = new_value

threads = [threading.Thread(target=increment) for _ in range(5)]

for t in threads:
    t.start()
for t in threads:
    t.join()

print("Final counter value:", counter)  # Often not 50000 once the sleep is uncommented!

Final counter value: 50000


Let's fix this unexpected behavior using a lock.

In [None]:
import threading
import time  # time.sleep(0) is used inside the locked block

counter = 0
lock = threading.Lock()

def increment():
    global counter
    for _ in range(10000):
        with lock:
            # only one thread at a time can execute this code block
            new_value = counter + 1
            time.sleep(0)
            counter = new_value

threads = [threading.Thread(target=increment) for _ in range(5)]

for t in threads:
    t.start()
for t in threads:
    t.join()

print("Final counter value:", counter)  # Always 50000!

Final counter value: 50000
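
For reference, `with lock:` is shorthand for acquiring the lock and releasing it again even if the block raises an exception. A minimal sketch of the longer form is below; the function name `locked_total` and the smaller loop count are chosen only for this illustration.

```python
import threading

def locked_total(num_threads=5, per_thread=1000):
    """Count to num_threads * per_thread using explicit acquire/release."""
    counter = 0
    lock = threading.Lock()

    def increment():
        nonlocal counter
        for _ in range(per_thread):
            lock.acquire()         # wait for the lock
            try:
                counter += 1       # critical section: one thread at a time
            finally:
                lock.release()     # always release, even if an error occurs

    threads = [threading.Thread(target=increment) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(locked_total())  # 5000
```

The `with lock:` form is usually preferable because it is harder to forget the release.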
