# GIL, Multithreading, and Multiprocessing

The Python **Global Interpreter Lock** (**GIL**) is, in simple terms, a lock that allows only one thread at a time to control the Python interpreter in CPython. The lock is necessary mainly because CPython's memory management is not thread-safe (this is not the case for IronPython or Jython, the .NET and Java implementations of Python). The GIL is infamous for its effect on multithreaded programming in Python, but it is a myth that Python's threading module is useless or that it only slows execution down. In some circumstances, threading does speed things up. Here we are going to look at how to use threading properly in Python.

### Single Threaded Example

First, let's see how to compute the square and the cube of each number in a list using a single thread. Here I am using the **time.sleep** function to demonstrate the scenario in which multithreading can be useful.
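The `DecoTimer` helper used throughout comes from the author's own `utils.timer` module, whose source is not shown here. As a rough stand-in, a timing context manager could look like the sketch below. The class name `SimpleTimer`, its `elapsed` attribute, and the message format are assumptions for illustration, not the original implementation.

```python
import time
from contextlib import ContextDecorator


class SimpleTimer(ContextDecorator):
    """Hypothetical stand-in for DecoTimer: times a `with` block or a decorated function."""

    def __init__(self, name):
        self.name = name
        self.elapsed = None  # filled in when the timed block exits

    def __enter__(self):
        print(f">>>>Starting Function {self.name}...")
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._start
        print(f"<<Finished function {self.name} in {self.elapsed} seconds")
        return False  # never swallow exceptions raised inside the block
```

Because it inherits from `contextlib.ContextDecorator`, the same class can wrap a `with` block (`with SimpleTimer("label"):`) or decorate a function (`@SimpleTimer("label")`).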


In [31]:
from utils.timer import DecoTimer
import time

def calc_square(arr):
    print("calculate square of numbers")
    for n in arr:
        time.sleep(0.2)
        print(f'square: {n*n}')
        
def calc_cube(arr):
    print("calculate cube of numbers")
    for n in arr:
        time.sleep(0.2)
        print(f'cube: {n*n*n}')

In [32]:
arr = [2, 4, 8, 9]

with DecoTimer("Testing Single Threaded"):
    calc_square(arr)
    calc_cube(arr)

>>>>Starting Function Testing Single Threaded...
calculate square of numbers
square: 4
square: 16
square: 64
square: 81
calculate cube of numbers
cube: 8
cube: 64
cube: 512
cube: 729
<<Finished function Testing Single Threaded in 1.6371750831604004 seconds


The overall runtime is about 1.6 seconds, which matches the eight 0.2-second sleeps run one after another.

### Multithreading Example

Now, let's try running the square and cube computations in separate threads.

In [33]:
from threading import Thread

with DecoTimer("Testing Multithreaded Example"):
    t1 = Thread(target=calc_square, args=(arr,))
    t2 = Thread(target=calc_cube, args=(arr,))
    
    # Run square and cube in separate threads
    t1.start()
    t2.start()
    
    # Wait until both t1 and t2 complete
    t1.join()
    t2.join()
    

>>>>Starting Function Testing Multithreaded Example...
calculate square of numbers
calculate cube of numbers
square: 4
cube: 8
square: 16
cube: 64
square: 64
cube: 512
square: 81
cube: 729
<<Finished function Testing Multithreaded Example in 0.8119800090789795 seconds


Note that the multithreaded version finishes in about 0.8 seconds, almost twice as fast as the single-threaded example. But doesn't the GIL allow only one thread to run at a time? How does multithreading actually speed up the execution? Let's look at another single-threaded vs. multithreaded comparison, but this time we replace the **time.sleep** calls with a long loop.

In [34]:
def calc_square_cpu_bounded(arr):
    print("calculate square of numbers")
    for n in arr:
        for _ in range(8000000):
            pass
        print(f'square: {n*n}')
        
def calc_cube_cpu_bounded(arr):
    print("calculate cube of numbers")
    for n in arr:
        for _ in range(8000000):
            pass
        print(f'cube: {n*n*n}')

In [44]:
with DecoTimer("Testing Single Threaded without timer.sleep"):
    calc_square_cpu_bounded(arr)
    calc_cube_cpu_bounded(arr)

>>>>Starting Function Testing Single Threaded without timer.sleep...
calculate square of numbers
square: 4
square: 16
square: 64
square: 81
calculate cube of numbers
cube: 8
cube: 64
cube: 512
cube: 729
<<Finished function Testing Single Threaded without timer.sleep in 1.8836450576782227 seconds


In [43]:
with DecoTimer("Testing Multithreaded Example without timer.sleep"):
    t1 = Thread(target=calc_square_cpu_bounded, args=(arr,))
    t2 = Thread(target=calc_cube_cpu_bounded, args=(arr,))
    
    # Run square and cube in separate threads
    t1.start()
    t2.start()
    
    # Wait until both t1 and t2 complete
    t1.join()
    t2.join()

>>>>Starting Function Testing Multithreaded Example without timer.sleep...
calculate square of numbers
calculate cube of numbers
square: 4
cube: 8
square: 16
cube: 64
square: 64
cube: 512
square: 81
cube: 729
<<Finished function Testing Multithreaded Example without timer.sleep in 1.9724102020263672 seconds


Notice that the multithreaded example here did not speed up the execution; it actually performed slightly worse than the single-threaded example. This is because the delay loop used here is CPU-bound: each thread must hold the GIL to execute its bytecode, so the two threads take turns instead of running in parallel, and repeatedly acquiring and releasing the lock adds overhead. The **time.sleep** function, on the other hand, releases the GIL for the full delay, enabling the other thread to acquire the lock and continue its computation.
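To see the two behaviours side by side, the sketch below times the same two-thread setup for a sleeping task and for a busy loop, and compares each with running the task twice in a row. The helper names (`sleep_task`, `busy_task`, `time_threads`, `time_sequential`) and the loop size are illustrative assumptions, and the exact timings depend on the machine and the Python build.

```python
import time
from threading import Thread


def sleep_task(seconds=0.3):
    # time.sleep releases the GIL while waiting, so other threads can run.
    time.sleep(seconds)


def busy_task(iterations=2_000_000):
    # A pure-Python loop needs the GIL to execute its bytecode.
    for _ in range(iterations):
        pass


def time_threads(task, n_threads=2):
    """Run `task` in n_threads threads and return the wall-clock time."""
    threads = [Thread(target=task) for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def time_sequential(task, n_calls=2):
    """Call `task` n_calls times in the current thread and return the wall-clock time."""
    start = time.perf_counter()
    for _ in range(n_calls):
        task()
    return time.perf_counter() - start


if __name__ == "__main__":
    for task in (sleep_task, busy_task):
        seq = time_sequential(task)
        thr = time_threads(task)
        print(f"{task.__name__}: sequential {seq:.2f}s, threaded {thr:.2f}s")
```

On a standard CPython build with the GIL, the threaded `sleep_task` run should take roughly half as long as the sequential one, while the two `busy_task` timings should come out close to each other.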

**I/O-bound** tasks, or tasks that wait on external systems, behave much like the functions with time.sleep, so threads can overlap their work efficiently. CPU-intensive tasks, however, get no benefit from CPython threads because of the GIL.
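For I/O-bound work, a thread pool is often a more convenient way to get the same overlap than managing `Thread` objects by hand. The sketch below uses the standard library's `concurrent.futures.ThreadPoolExecutor`. The names `fake_fetch` and `fetch_all` are hypothetical, and the sleep only simulates waiting on an external system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(n, delay=0.2):
    # Hypothetical I/O call: the sleep stands in for a network or disk wait,
    # during which the GIL is released.
    time.sleep(delay)
    return n * n


def fetch_all(numbers, max_workers=4):
    """Run fake_fetch for every number on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(fake_fetch, numbers))


if __name__ == "__main__":
    start = time.perf_counter()
    results = fetch_all([2, 4, 8, 9])
    print(results, f"{time.perf_counter() - start:.2f}s")
```

With four workers the four simulated waits overlap, so the total time should be close to a single delay rather than four of them. Swapping `fake_fetch` for a CPU-bound function would not show the same gain under the GIL.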