Based on **Francesco Pierfederici: Distributed Computing with Python, Chapter 3**

### Sometimes threads can hurt performance

In [1]:
from time import time
from threading import Thread

In [2]:
def fib(n):
    # Check the invalid and zero cases first; otherwise "n <= 2" would shadow them.
    if n < 0:
        raise ValueError('fib(n) is undefined for n < 0')
    elif n == 0:
        return 0
    elif n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


In [3]:
fib(5)

5

In [4]:
# Calculate the "fibnum"-th Fibonacci number threadnum times, each on its own thread.

def runthreads(threadnum, fibnum):
    t0 = time()
    for i in range(threadnum):
        t = Thread(target=fib, args=(fibnum,))  # spawn a new thread
        t.start()
    dt = time() - t0
    print(dt)  # elapsed time once all threads are started (the threads are not joined)

In [5]:

runthreads(1,34)
runthreads(2,34)
runthreads(3,34)
runthreads(4,34)
runthreads(8,34)

0.005505561828613281
0.0312960147857666
0.04142189025878906
0.15059971809387207
0.3381540775299072


Interesting! Adding more threads only increases the measured time.

**Clearly, something is not quite right**, as we would have expected the threads 
to run in parallel (again, on a quad-core machine).
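
Note that `runthreads` above does not wait for its threads to finish, so its timings cover only the time until the last thread has been started. A minimal sketch of a variant that also joins the threads is given here; the helper name `run_threads_joined` is hypothetical, `perf_counter` is swapped in for `time`, and a small `fibnum` is used only to keep the run short.

```python
from threading import Thread
from time import perf_counter


def fib(n):
    """Naive recursive Fibonacci, used only as a CPU-bound workload."""
    if n < 0:
        raise ValueError('fib(n) is undefined for n < 0')
    if n == 0:
        return 0
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def run_threads_joined(threadnum, fibnum):
    """Hypothetical helper: run fib(fibnum) on threadnum threads and wait for all."""
    t0 = perf_counter()
    threads = [Thread(target=fib, args=(fibnum,)) for _ in range(threadnum)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread has finished its computation
    return perf_counter() - t0


# Total wall-clock time per thread count (small fibnum, so this runs quickly).
timings = {threadnum: run_threads_joined(threadnum, 20) for threadnum in (1, 2, 4)}
for threadnum, dt in timings.items():
    print(threadnum, dt)
```

With a larger `fibnum`, the total time on an interpreter with a GIL would be expected to grow roughly in proportion to the number of threads rather than stay flat.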

It turns out that there is something not obvious going on deep inside the Python 
interpreter that is affecting our CPU-bound threads. 

That thing is called the **Global Interpreter Lock (GIL)**. 

As the name implies, the **GIL is a global lock** that is used, 
mostly, to **keep reference counting sane** (remember when we talked about that a little 
while ago?). The consequence of the GIL is that even though Python threads are real 
OS-native threads, **only one of them can execute Python bytecode at any given point in time**.

This has led some to say that the **Python interpreter is a single-threaded interpreter**, 
which is not quite true. Conceptually, however, the statement is not 
completely wrong either. 

The situation we just witnessed is very **similar to the 
behavior we observed when writing coroutines**. In that case, in fact, only one piece 
of code could run at any given point in time. 

Things just work, meaning **we get the 
parallelism that we expect, when one coroutine or thread waits for I/O and another 
one takes over the CPU**. We do not get a performance speedup, however, when a task 
needs the CPU for a long time, as is the case with CPU-bound tasks such as 
the Fibonacci example.
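
The waiting case can be illustrated with a small sketch. Here `time.sleep` stands in for a blocking I/O call (an assumption made purely for illustration; sleeping releases the GIL in CPython), and the names `fake_io` and `run_waiting_threads` are hypothetical.

```python
from threading import Thread
from time import perf_counter, sleep


def fake_io(seconds):
    """Stand-in for a blocking I/O call; the thread just waits."""
    sleep(seconds)


def run_waiting_threads(threadnum, seconds):
    """Start threadnum waiting threads, join them, and return the elapsed time."""
    t0 = perf_counter()
    threads = [Thread(target=fake_io, args=(seconds,)) for _ in range(threadnum)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return perf_counter() - t0


# Four threads that each wait 0.2 s overlap their waits instead of queueing up.
elapsed = run_waiting_threads(4, 0.2)
print(elapsed)
```

The elapsed time should be close to a single wait rather than four waits back to back, which is the behavior that makes threads useful for I/O-bound work despite the GIL.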

**Not all Python interpreters have the GIL; Jython, for instance, does not.**
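
To see which interpreter a notebook is running on, the standard library's `platform.python_implementation()` can be used. This is only a quick check: the name identifies the implementation, and whether a GIL is present still has to be looked up for that implementation and version.

```python
import platform

# Returns a string such as 'CPython', 'PyPy', 'Jython' or 'IronPython'.
implementation = platform.python_implementation()
print(implementation, platform.python_version())
```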