# Python Global Interpreter Lock (GIL)

The GIL is a **mutex** (a lock) that allows only one thread at a time to hold control of the Python interpreter.

This means that only one thread can be in a state of execution at any point in time. 

This can be a **performance bottleneck** in code that is both **CPU-bound** and **multi-threaded**.

## What problem did the GIL solve for Python?

Python uses **reference counting for memory management**. It means that objects created in Python have a reference count variable that keeps track of the number of references that point to the object. **When this count reaches zero, the memory occupied by the object is released**.

In [1]:
import sys

a = []
b = a
sys.getrefcount(a)

3

Back to GIL:


The problem was that this reference count variable needed protection from race conditions where two threads increase or decrease its value simultaneously. If this happens, it can cause either leaked memory that is never released or, even worse, memory that is incorrectly released while a reference to the object still exists. This can cause crashes or other “weird” bugs in your Python programs.

**This reference count variable can be kept safe by adding locks to all data structures that are shared across threads so that they are not modified inconsistently.** 
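As a rough illustration of the per-object locking idea, here is a minimal pure-Python sketch. The `Counter` class and `bump` helper are hypothetical names made up for this example; CPython's real reference counts live in C and are not implemented this way.

```python
import threading


class Counter:
    """Toy stand-in for a reference count (hypothetical, not CPython's implementation)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Holding the lock makes the read-modify-write step safe from other threads.
        with self._lock:
            self.value += 1


def bump(counter, times):
    for _ in range(times):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=bump, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000: no increment is lost
```

Every shared object protected this way carries its own lock, and every update pays for an acquire and a release.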

But adding **a lock to each object or groups of objects means multiple locks will exist** which can cause another problem—**Deadlocks** (deadlocks can only happen if there is more than one lock). Another side effect would be decreased performance caused by the repeated acquisition and release of locks.

**Deadlocks**: In an operating system, a deadlock occurs when a process or thread enters a waiting state because a requested system resource is held by another waiting process, which in turn is waiting for another resource held by another waiting process. If a process is unable to change its state indefinitely because the resources it requested are being used by another waiting process, the system is said to be in a deadlock.
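The situation described above can be sketched with two locks taken in opposite order. This is an illustrative example only: the names `lock_a`, `lock_b`, and `worker` are made up here, and a timeout is used so the script reports the deadlock instead of hanging forever.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
barrier = threading.Barrier(2)
results = {}


def worker(name, first, second):
    with first:
        barrier.wait()  # both threads now hold their first lock
        # A plain acquire() would block forever here; the timeout lets us observe it.
        got = second.acquire(timeout=0.2)
        results[name] = got
        if got:
            second.release()
        barrier.wait()  # keep holding `first` until both attempts are over


t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

print(results)  # both values are False: neither thread could get its second lock
```

With only one lock in the program, this circular wait cannot occur.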

The **GIL is a single lock on the interpreter itself which adds a rule that execution of any Python bytecode requires acquiring the interpreter lock**. **This prevents deadlocks** (as there is only one lock) and doesn’t introduce much performance overhead. **But it effectively makes any CPU-bound Python program single-threaded**.
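To make "Python bytecode" more concrete, the standard-library `dis` module can list the instructions CPython compiles a function into. A small sketch follows; the exact instruction names depend on the CPython version, so none are shown as expected output.

```python
import dis


def countdown(n):
    while n > 0:
        n -= 1


# Collect the names of the bytecode instructions for countdown().
opnames = [ins.opname for ins in dis.get_instructions(countdown)]
print(len(opnames), "bytecode instructions")
print(opnames)
```

Each of these instructions is executed by a thread that currently holds the GIL.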

## Why was the GIL chosen as the solution?

Python has been around since the days when operating systems did not have a concept of threads. Python was designed to be easy-to-use in order to make development quicker and more and more developers started using it. 

A lot of extensions were being written for the existing C libraries whose features were needed in Python. To prevent inconsistent changes, these C extensions required thread-safe memory management, which the GIL provided.

As you can see, the GIL was a pragmatic solution to a difficult problem that the CPython developers faced early on in Python’s life.

## The impact on multi-threaded Python programs

When you look at a typical Python program—or any computer program for that matter—there’s a difference between those that are CPU-bound in their performance and those that are I/O-bound.

**CPU-bound programs are those that are pushing the CPU to its limit**. This includes programs that do mathematical computations like matrix multiplications, searching, image processing, etc.

**I/O-bound programs are the ones that spend time waiting for Input/Output which can come from a user, file, database, network, etc**. I/O-bound programs sometimes have to wait for a significant amount of time till they get what they need from the source due to the fact that the source may need to do its own processing before the input/output is ready, for example, a user thinking about what to enter into an input prompt or a database query running in its own process.

Let’s have a look at a simple CPU-bound program that performs a countdown:

In [4]:
# single_threaded.py
import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n>0:
        n -= 1

start = time.time()
countdown(COUNT)
end = time.time()

print('Time taken in seconds -', end - start)

Time taken in seconds - 2.693948984146118


Now I modified the code a bit to do the same countdown using two threads in parallel:

In [5]:
# multi_threaded.py
import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n>0:
        n -= 1

t1 = Thread(target=countdown, args=(COUNT//2,))
t2 = Thread(target=countdown, args=(COUNT//2,))

start = time.time()
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()

print('Time taken in seconds -', end - start)

Time taken in seconds - 3.8794522285461426


As you can see, the multi-threaded version does not finish any faster; in this run it was actually slower. The GIL prevented the CPU-bound threads from executing in parallel.



The GIL does not have much impact on the performance of I/O-bound multi-threaded programs, because a thread releases the lock while it is waiting for I/O, which lets other threads run in the meantime.
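A minimal sketch of this effect, assuming `time.sleep` is an acceptable stand-in for a blocking read or network call (it releases the GIL while waiting). The `fake_io` helper is a hypothetical name used only for this example.

```python
import time
from threading import Thread


def fake_io(seconds):
    # time.sleep releases the GIL, standing in for a blocking I/O call.
    time.sleep(seconds)


start = time.perf_counter()
threads = [Thread(target=fake_io, args=(0.2,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The two waits overlap, so the total is close to 0.2 s rather than 0.4 s.
print(f"Two 0.2 s waits took {elapsed:.2f} s")
```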

But a program whose threads are entirely CPU-bound, e.g., a program that processes an image in parts using threads, would not only become single threaded due to the lock but will also see an increase in execution time, as seen in the above example, in comparison to a scenario where it was written to be entirely single-threaded. 

This increase is the result of acquire and release overheads added by the lock.

## Why hasn’t the GIL been removed yet?

The developers of Python receive a lot of complaints regarding this but a language as popular as Python cannot bring a change as significant as the removal of GIL without causing backward incompatibility issues. 

The GIL can obviously be removed and this has been done multiple times in the past by the developers and researchers but all those attempts broke the existing C extensions which depend heavily on the solution that the GIL provides.

## Why wasn’t it removed in Python 3?

Python 3 did have a chance to start a lot of features from scratch and in the process, broke some of the existing C extensions which then required changes to be updated and ported to work with Python 3. This was the reason why the early versions of Python 3 saw slower adoption by the community. 

Removing the GIL would have made Python 3 slower in comparison to Python 2 in single-threaded performance and you can imagine what that would have resulted in. You can’t argue with the single-threaded performance benefits of the GIL. So the result is that Python 3 still has the GIL.

**But Python 3 did bring a major improvement to the existing GIL**

We discussed the impact of GIL on “only CPU-bound” and “only I/O-bound” multi-threaded programs but what about the programs where some threads are I/O-bound and some are CPU-bound? 

In such programs, Python’s GIL was known to starve the I/O-bound threads by not giving them a chance to acquire the GIL from CPU-bound threads.

**This was because of a mechanism built into Python that forced threads to release the GIL after a fixed interval of continuous use and if nobody else acquired the GIL, the same thread could continue its use.**

In [6]:
import sys
# The interval is set to 100 instructions (legacy API, removed in Python 3.9):
sys.getcheckinterval()

100

**The problem in this mechanism was that most of the time the CPU-bound thread would reacquire the GIL itself before other threads could acquire it**. This was researched by David Beazley, who published visualizations of the behavior.
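On current interpreters the instruction-count check interval no longer exists. Since Python 3.2, CPython uses a time-based switch interval instead, exposed through `sys.getswitchinterval()`. A minimal sketch of inspecting it, assuming a CPython 3.2 or newer interpreter:

```python
import sys

# Time in seconds after which a running thread is asked to give up the GIL.
interval = sys.getswitchinterval()
print(interval)
```

The value is a float in seconds and can be changed with `sys.setswitchinterval()`.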

## How to deal with Python's GIL

If the GIL is causing you problems, here are a few approaches you can try:

**Multi-processing vs multi-threading**: **The most popular way is to use a multi-processing approach where you use multiple processes instead of threads**. Each Python process gets its own Python interpreter and memory space so the GIL won’t be a problem. Python has a multiprocessing module which lets us create processes easily like this:

In [7]:
from multiprocessing import Pool
import time

COUNT = 50000000
def countdown(n):
    while n>0:
        n -= 1

if __name__ == '__main__':
    pool = Pool(processes=2)
    start = time.time()
    r1 = pool.apply_async(countdown, [COUNT//2])
    r2 = pool.apply_async(countdown, [COUNT//2])
    pool.close()
    pool.join()
    end = time.time()
    print('Time taken in seconds -', end - start)

Time taken in seconds - 1.5021004676818848


**A decent performance increase compared to the multi-threaded version, right?**

The time didn’t drop to half of what we saw above because process management has its own overheads. Multiple processes are heavier than multiple threads, so keep in mind that this could become a scaling bottleneck.

# Source


 * https://realpython.com/python-gil/