# GIL

Unfortunately, in CPython, the standard Python interpreter, threads only appear to run in parallel. In practice, only one thread executes Python bytecode at any given moment. This is due to the GIL (Global Interpreter Lock), which allows just one thread at a time to hold control of the interpreter.

The previous module on the garbage collector described how Python counts the references to each object. The problem the GIL solves is that different threads could otherwise increment or decrement an object's reference count at the same time. For example, one thread could drop the last reference to an object, causing Python to delete it, while another thread is still using that object, resulting in an error.
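The reference counting mentioned above can be observed with sys.getrefcount(). The following is a minimal sketch, not a description of the interpreter internals: the function name refcount_changes is made up for illustration, and the absolute numbers are a CPython implementation detail, so only the differences between them are meaningful.

```python
import sys


def refcount_changes():
    """Return the reference count of a list before, during and after aliasing."""
    data = []                           # a fresh object referenced by the name `data`
    before = sys.getrefcount(data)      # absolute value is an implementation detail
    alias = data                        # a second name now refers to the same object
    with_alias = sys.getrefcount(data)
    del alias                           # the second reference is dropped again
    after_del = sys.getrefcount(data)
    return before, with_alias, after_del


if __name__ == "__main__":
    before, with_alias, after_del = refcount_changes()
    print("change after aliasing:", with_alias - before)
    print("change after del:", after_del - before)
```

Each of these increments and decrements is exactly the kind of update that must not be performed by two threads at the same moment.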

In theory, this problem could be solved by adding a lock to each object. Unfortunately, that can lead to deadlocks: situations in which threads wait for resources held by other threads, which are themselves waiting, so that none of them can ever proceed.
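To make the deadlock risk concrete, here is a minimal sketch with two ordinary threading.Lock objects taken in opposite order by two threads. The names lock_a, lock_b, worker and try_opposite_lock_order are hypothetical. Timeouts are used so that the example reports the problem instead of hanging forever, and barriers make sure both threads hold their first lock before either asks for the second.

```python
import threading


def try_opposite_lock_order(timeout=0.2):
    """Two threads take two locks in opposite order; report who got the second lock."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_first = threading.Barrier(2)    # both threads hold their first lock
    both_tried_second = threading.Barrier(2)  # both threads finished their attempt
    results = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # Without a timeout this call would block forever: a deadlock.
            got_second = second.acquire(timeout=timeout)
            results[name] = got_second
            both_tried_second.wait()
            if got_second:
                second.release()

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


if __name__ == "__main__":
    print(try_opposite_lock_order())
```

Neither thread manages to take its second lock, because the other thread is holding it while waiting in the same way.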

The GIL is a lock on the Python interpreter itself. Because it is a single lock that guards the whole interpreter, deadlocks between per-object locks cannot occur. The price is that Python code effectively runs on only one thread at a time.

### So is the GIL good or bad?

If we divide programs into CPU-bound (image processing, matrix multiplication) and I/O-bound (network communication, database access), the picture becomes clearer. For I/O-bound programs, threads and the GIL are not a serious problem: a thread releases the GIL while it waits on I/O, and the time Python spends switching threads is small compared with the I/O time. A CPU-bound multi-threaded Python program, however, cannot take advantage of multiple processor cores, no matter how many are available, and it can even run slower than a single-threaded version because of the overhead of passing the GIL between threads.
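The difference between the two kinds of workload can be measured with a small timing sketch. Everything here is illustrative: io_task uses time.sleep() as a stand-in for a network or database call, cpu_task is an arbitrary busy loop, and the helper names run_sequential and run_threaded are made up. Exact timings depend on the machine and the interpreter build, so no numbers are claimed.

```python
import threading
import time


def io_task():
    time.sleep(0.2)  # stand-in for waiting on the network or a database


def cpu_task():
    total = 0
    for i in range(2_000_000):  # pure Python computation, holds the GIL
        total += i * i
    return total


def run_sequential(task, count):
    """Run the task `count` times one after another and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        task()
    return time.perf_counter() - start


def run_threaded(task, count):
    """Run the task in `count` threads at once and return elapsed seconds."""
    threads = [threading.Thread(target=task) for _ in range(count)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    for task in (io_task, cpu_task):
        print(task.__name__,
              "sequential:", round(run_sequential(task, 4), 2),
              "threaded:", round(run_threaded(task, 4), 2))
```

With io_task the threaded run should take roughly as long as a single call, since the waits overlap. With cpu_task on a standard CPython build, the threaded run is expected to show little or no speedup.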

Having considered the GIL, we can now turn to Python's threading module, which is responsible for creating and working with threads.

In [1]:
import threading
import time


def thread_function(name):
    print(name, " - thread starting")
    time.sleep(2)  # simulate a blocking operation
    print(name, " - after sleep")


if __name__ == "__main__":
    print("before create Thread")
    x = threading.Thread(target=thread_function, args=(1,))
    print("before running Thread")
    x.start()
    print("Wait thread finish")
    # x.join()  # commented out: the main thread does not wait for x here
    print("all done")

before create Thread
before running Thread
1  - thread starting
Wait thread finish
all done


1  - after sleep


In this example:

1. We import the threading module.
2. We create an object of the Thread class, passing it the function to run (target) and the arguments for that function (args).
3. We start the thread with the start() method. When thread_function() returns, the thread terminates automatically.

You can also pass daemon=True when creating a Thread object, which makes it a daemon thread.

In [None]:
x = threading.Thread(target=thread_function, args=(1,), daemon=True)

In general, a daemon is a process that runs in the background.

Python distinguishes between normal threads and daemon threads. When the application exits, it waits for normal threads to finish properly, while daemon threads are simply killed. You can think of a daemon thread as a background thread whose termination you don't have to worry about.
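The difference at exit can be checked by running a tiny program in a child interpreter, once with a normal thread and once with a daemon thread. This is a sketch under stated assumptions: the SCRIPT text and the run_child helper are invented for illustration, and the child is launched with subprocess and sys.executable so that its exit does not affect the current process.

```python
import subprocess
import sys

# Program executed in the child interpreter; {daemon} is filled in by run_child().
SCRIPT = """
import threading, time

def worker():
    time.sleep(0.5)
    print("worker finished")

threading.Thread(target=worker, daemon={daemon}).start()
print("main finished")
"""


def run_child(daemon):
    """Run SCRIPT in a fresh interpreter and return the lines it printed."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT.format(daemon=daemon)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print("normal thread:", run_child(False))
    print("daemon thread:", run_child(True))
```

With a normal thread the child waits for the worker, so both messages appear. With a daemon thread the child exits as soon as the main thread is done, and the worker never gets to print.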

In the previous example, x.join() was commented out, which is why "all done" is printed before "1  - after sleep".

x.join() tells the main thread to wait for thread x to finish. This is useful when child threads do some work and the main thread then processes the data they have prepared.
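As a minimal sketch of that pattern, the child threads can write into a shared list and the main thread can read it only after every join() has returned. The names compute_squares and worker are hypothetical, and each thread writes to its own slot so that no extra locking is needed.

```python
import threading


def compute_squares(numbers):
    """Square each number in its own thread and return the results in order."""
    results = [None] * len(numbers)

    def worker(position, value):
        results[position] = value * value  # each thread fills only its own slot

    threads = [
        threading.Thread(target=worker, args=(position, value))
        for position, value in enumerate(numbers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # after this loop every slot has been filled

    return results


if __name__ == "__main__":
    squares = compute_squares([1, 2, 3])
    print("sum of squares:", sum(squares))
```

Without the join() calls, the main thread could read the list while some slots still hold None.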

The example above creates a single thread. To run several, you can create them in a loop and keep them in a list:

In [2]:
import threading
import time


def thread_function(name):
    print(name, " - thread start")
    time.sleep(2)  # simulate a blocking operation
    print(name, "- thread job's done")


if __name__ == "__main__":
    threads = []
    # Create and start all threads first so that they run concurrently.
    for index in range(3):
        print("create thread - ", index)
        x = threading.Thread(target=thread_function, args=(index,))
        threads.append(x)
        x.start()

    # Then wait for each thread in turn.
    for index, thread in enumerate(threads):
        print("before join - ", index)
        thread.join()
        print("after join - ", index)

create thread -  0
0  - thread start
create thread -  1
1  - thread start
create thread -  2
2  - thread start
before join -  0
10 - thread job's done
after join -  0
before join -  1
2 - thread job's done
 - thread job's done
after join -  1
before join -  2
after join -  2


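You can see that, after the threads start, the main thread blocks waiting for thread number 0. Since all the threads finish at almost the same time, the remaining join() calls return quickly. Note that some output lines are interleaved (for example, "10 - thread job's done"), because the threads write to standard output concurrently and their print() calls are not synchronized with each other.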
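If whole, unmixed output lines matter, one common approach is to guard print() with a lock. The following is only a sketch of that idea, not part of the original example: print_lock and run_threads are made-up names, and the sleep is shortened to keep the run quick.

```python
import threading
import time

print_lock = threading.Lock()  # shared by all threads that print


def thread_function(name):
    with print_lock:  # only one thread prints at a time
        print(name, "- thread start")
    time.sleep(0.1)
    with print_lock:
        print(name, "- thread job's done")


def run_threads(count):
    """Start `count` threads and wait for all of them to finish."""
    threads = [
        threading.Thread(target=thread_function, args=(index,))
        for index in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == "__main__":
    run_threads(3)
```

The order of the lines can still vary from run to run, but each line is printed in one piece.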

In addition to creating threads manually and storing them in a list, Python provides ThreadPoolExecutor, which makes it easier to run tasks on a pool of N threads.

In [3]:
from concurrent.futures import ThreadPoolExecutor
import threading
import time


def thread_function():
    time.sleep(2)
    print("thread_function Executed {}".format(threading.current_thread()))


def main():
    # Leaving the with block shuts the pool down and waits for both tasks.
    with ThreadPoolExecutor(max_workers=3) as executor:
        executor.submit(thread_function)
        executor.submit(thread_function)


if __name__ == '__main__':
    main()

thread_function Executed <Thread(ThreadPoolExecutor-0_0, started 9716)>thread_function Executed <Thread(ThreadPoolExecutor-0_1, started 15108)>



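This example creates a ThreadPoolExecutor with up to 3 worker threads and submits the function to it twice through the executor object. As the output shows, the two calls are executed by different threads. Both messages end up on the same line because the two threads print at almost the same moment.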
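The submit() method returns a Future object, which can be used to collect a return value from the task. Here is a minimal sketch of that usage. The functions slow_square and square_all are hypothetical, and the short sleep stands in for real work.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_square(number):
    time.sleep(0.1)  # stand-in for a slow operation
    return number * number


def square_all(numbers, max_workers=3):
    """Square every number on a thread pool and return the results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() schedules the call and immediately returns a Future.
        futures = [executor.submit(slow_square, number) for number in numbers]
        # result() blocks until the corresponding task has finished.
        return [future.result() for future in futures]


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

Because the results are read from the futures in submission order, the output order matches the input order even if the tasks finish in a different order.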