# Threading and Multiprocessing

With threading and multiprocessing you can run code in parallel and speed up your programs. 

It is important to understand the difference between a process and a thread and the advantages and disadvantages of both. 

We will also look at how threads are limited by the Global Interpreter Lock (GIL), and at how easily we can use Python's built-in `threading` and `multiprocessing` modules to create multiple threads or processes. 

## Difference between a process and thread

A process is an 'instance' of a program. So if you run a Firefox browser, that's one process. You could start another browser, which would be two processes. Likewise, one Python interpreter is one process. 

A thread on the other hand is an 'entity' within a process. Processes can have multiple threads inside. 

## Processes

Processes take advantage of multiple CPUs and cores, so you can execute your code on several CPUs in parallel. Each process has its own memory space, which is not shared with other processes. Processes are great for CPU-bound work, for example when you have a large amount of data and need to run a lot of expensive computations on it. With multiprocessing you can process the data on different CPUs and so speed up your code's execution. A new process is started independently of other processes, and processes are easily interruptible and killable. There is one GIL per process, so multiprocessing avoids the GIL limitation. 

### Disadvantages of processes

A process is heavyweight, so it takes a lot of memory, and starting a process is slower than starting a thread. Because processes have separate memory spaces, sharing memory between them is not easy, so 'interprocess communication' (IPC) is more complicated. 

## Threads 

A thread is an entity within a process that can be scheduled for execution. It is also known as a 'lightweight process', and a process can spawn multiple threads. All threads within a process share the same memory, and they are lightweight, so starting a thread is faster than starting a process. Threads are great for input/output (I/O) bound tasks, where your program interacts with slower devices such as a hard drive or a network connection. With threading, your program can use the time spent waiting for these devices to switch to other threads and do processing in the meantime. This is how you can speed up your code with threading. 
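As a rough illustration of the I/O case, the sketch below times the same set of waits run one after another and then in separate threads. The function name `fake_download` and the constants `DELAY` and `NUM_TASKS` are made up for this example, and `time.sleep` only stands in for a real disk or network wait.

```python
import time
from threading import Thread

DELAY = 0.2      # seconds each simulated I/O call waits (assumed value)
NUM_TASKS = 4    # number of simulated I/O calls (assumed value)

def fake_download(delay):
    # time.sleep stands in for waiting on a disk or a network connection
    time.sleep(delay)

# run the tasks one after another
start = time.perf_counter()
for _ in range(NUM_TASKS):
    fake_download(DELAY)
sequential_time = time.perf_counter() - start

# run the same tasks in separate threads so that the waits overlap
start = time.perf_counter()
threads = [Thread(target=fake_download, args=(DELAY,)) for _ in range(NUM_TASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_time = time.perf_counter() - start

print(f'sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s')
```

The exact timings depend on the machine, but the threaded run should take roughly the time of a single wait, because the threads sleep at the same time.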

### Disadvantages of threading

In CPython, threading is limited by the GIL, which allows only one thread to execute Python code at a time, so there is no actual parallel computation going on in multi-threading. Threading therefore brings no speed-up for CPU-bound tasks. Threads are also not interruptible or killable. Be careful with memory leaks, and since threads share the same memory you also have to be careful with 'race conditions'. 

#### What are race conditions?

A race condition occurs when two or more threads want to modify the same variable at the same time. This easily causes bugs or crashes. 

#### What are memory leaks?

A memory leak in Python occurs when the interpreter manages memory incorrectly, so that memory which is no longer needed is not released. 

#### What is the GIL?

The GIL, or Global Interpreter Lock, is a lock in Python that allows only one thread at a time to execute. It is very controversial in the Python community. It is needed because memory management in CPython (the standard implementation of Python from python.org) is not thread-safe. 

CPython manages memory with a technique called 'reference counting'. Every object created in Python has a reference count that keeps track of the number of references pointing to it, and when this count reaches 0 the memory occupied by the object can be released. The problem in multi-threading is that this reference count needs protection from race conditions in which two threads increase or decrease its value simultaneously. When this happens, it can either leak memory that is never released or incorrectly release memory while a reference to the object still exists. This is the reason the GIL exists in Python. 

There are a few ways to avoid the GIL if you want parallel computation: use multiprocessing; use a different Python implementation without a GIL, such as Jython or IronPython; or use Python as a wrapper for third-party libraries such as NumPy or SciPy, which call code that is executed in C/C++. 
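The reference counting described above can be observed from Python with `sys.getrefcount`. This is only a small sketch for CPython: the absolute numbers it prints vary between versions (the call itself temporarily adds a reference), so only the change between the readings is meaningful.

```python
import sys

data = [1, 2, 3]
before = sys.getrefcount(data)

# binding a second name to the same list adds one reference
alias = data
after_alias = sys.getrefcount(data)

# deleting that name removes the reference again
del alias
after_del = sys.getrefcount(data)

print(before, after_alias, after_del)
```

It is this per-object counter that would be corrupted if two threads updated it at the same moment without a lock.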


# Code for Multi-processing 

In [9]:
from multiprocessing import Process
import os
import time

#define function to be used in processes
def square_numbers():
    for i in range(100):
        i * i
        time.sleep(0.5)

# the guard stops child processes from re-running this block on
# platforms that start a process by importing the main module again
if __name__ == '__main__':

    #create a list to store processes
    processes = []

    # use one process per available CPU
    num_processes = os.cpu_count()

    print(num_processes)

    #create processes
    for i in range(num_processes):
        p = Process(target=square_numbers)
        processes.append(p)

    #start processes
    for p in processes:
        p.start()

    #join processes
    for p in processes:
        # block the main thread until each process has finished
        p.join()

    print('End main')

8
End main


# Code for threading


We set up the `threading` module in exactly the same way as the `multiprocessing` module.

In [12]:
from threading import Thread
import os
import time

#create a list to store threads
threads = []

# define number of threads
num_threads = 8

print(num_threads)

#define function to be used in threads
def square_numbers():
    for i in range(100):
        i * i
        time.sleep(0.5)

#create threads
for i in range(num_threads):
    # spawn new thread
    t = Thread(target=square_numbers)
    threads.append(t)
    
#start threads
for t in threads:
    t.start()
    
#join threads
for t in threads:
    # block the main thread until each thread has finished
    t.join()
    
print('End main')

8
End main


### Create and start multiple threads and share data between threads 



In [15]:
from threading import Thread
import time

# simulate a database
database_value = 0 

def increase():
    global database_value
    
    #simulating database access
    local_copy = database_value
    local_copy +=1
    time.sleep(0.1)
    database_value = local_copy
    
if __name__ == '__main__':
    
    # print start value 
    print('Start value: ', database_value)
    
    #Create the threads
    thread1 = Thread(target=increase)
    thread2 = Thread(target=increase)
    
    #Start the threads
    thread1.start()
    thread2.start()
    
    #Join the threads
    thread1.join()
    thread2.join()
    
    print("End value: ", database_value)
    
    print('End main')

Start value:  0
End value:  1
End main


As you can see the outputs for this script are:
    
- Start value:  0
- End value:  1
- End main


But since the function has been called twice, the *End value should be 2*. It is still 1 because a 'race condition' has occurred: both threads try to modify the same variable at the same time, causing a bug.
Let's step through this program to see why the bug occurs.

### Step through - 'debugging' the problem

In our `increase` function there is a call to `time.sleep`. While `thread1` is waiting in that call, the program switches to `thread2`, which then reads `database_value` and copies it to its own `local_copy`. At this point `database_value` is still 0, so `thread2` also has a local copy of 0, which it increases to 1. Then `thread2` hits its own `time.sleep` call, causing the program to switch back to `thread1`. Finally `thread1` finishes by copying its `local_copy` (which has the value 1) to `database_value` (which now becomes 1), and `thread2` does exactly the same, using its `local_copy` (which is also 1)! We have to do something to prevent this. 

### Using locks to prevent race conditions

So how do we prevent race conditions? We use a lock, which can be imported from the threading module. 


`from threading import Lock`

and create a lock

`lock = Lock()`

and this must be supplied as an argument to whichever function is the `target` of the thread. 

`def whatever_function(lock):
    pass`
    
So in our case we are supplying it to `increase`.
    
As `lock` is an argument, we must supply it *as a tuple* to the `Thread` constructor when creating a thread, with `args=(lock,)`. Because the tuple has only one element, the trailing comma is required: without it, `(lock)` is just `lock` in parentheses, not a tuple. 
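The effect of the trailing comma can be checked directly. In this small sketch the function `record` and the list `results` are made up for illustration; a plain string is passed instead of a lock so that the result is easy to print.

```python
from threading import Thread

results = []

def record(value):
    # store the argument so we can see what the thread received
    results.append(value)

# with the trailing comma this is a tuple, without it just a string
print(type(('hello',)))
print(type(('hello')))

# args must be a tuple (or another iterable) of positional arguments
t = Thread(target=record, args=('hello',))
t.start()
t.join()

print(results)
```

Passing `args=('hello')` instead would hand the thread the bare string, which is then unpacked character by character into five separate arguments, and the call to `record` would fail.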

Inside the function definition we call `lock.acquire()` before modifying the shared variables, and `lock.release()` once they have been modified. Our function definition will therefore become 

`def increase(lock):
    global database_value
    
    #acquire a lock before manipulating variables
    lock.acquire()
    
    #simulating database access
    local_copy = database_value
    local_copy +=1
    time.sleep(0.1)
    database_value = local_copy

    #release the lock after manipulating variables
    lock.release()`
    
Never forget to release the lock, otherwise the program will get stuck. 

Another way to achieve this is with a context manager (recommended):

`with lock:
    #do something`

Then we don't need the `lock.release()`, because the lock is released automatically when the `with` block exits. 

In [17]:
from threading import Thread, Lock
import time

# simulate a database
database_value = 0 

def increase(lock):
    global database_value
    
    #acquire a lock before manipulating variables
    lock.acquire()
    
    #simulating database access
    local_copy = database_value
    local_copy +=1
    time.sleep(0.1)
    database_value = local_copy

    #release the lock after manipulating variables
    lock.release()
    
if __name__ == '__main__':
    
    #create a Lock
    lock = Lock()
    
    # print start value 
    print('Start value: ', database_value)
    
    #Create the threads
    thread1 = Thread(target=increase, args=(lock,))
    thread2 = Thread(target=increase, args=(lock,))
    
    #Start the threads
    thread1.start()
    thread2.start()
    
    #Join the threads
    thread1.join()
    thread2.join()
    
    print("End value: ", database_value)
    
    print('End main')

Start value:  0
End value:  2
End main


We can see that the lock was successful at helping us to avoid race conditions and achieve the right answer.  
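For comparison, here is a sketch of the same program written with the context-manager form of the lock instead of explicit `acquire()` and `release()` calls. It reuses the names from the example above; building the threads in a list is just a shorter way of writing the same thing.

```python
from threading import Thread, Lock
import time

# simulate a database
database_value = 0

def increase(lock):
    global database_value

    # the lock is acquired on entering the block and released on leaving it,
    # even if an exception is raised inside
    with lock:
        local_copy = database_value
        local_copy += 1
        time.sleep(0.1)
        database_value = local_copy

lock = Lock()

threads = [Thread(target=increase, args=(lock,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('End value: ', database_value)
```

Because only one thread at a time can be inside the `with lock:` block, the second thread reads `database_value` only after the first has written it back.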

### How to use queues 

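Next we will use queues, which make data exchange between threads safe. They are excellent for data processing in multi-threading and multi-processing environments. For threads we import `Queue` from the `queue` module (the `multiprocessing` module provides its own `Queue` for exchanging data between processes). 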


In [23]:
from queue import Queue

A queue is a linear data structure that follows the FIFO (first in, first out) principle. It's like a first come, first served situation when a line of customers forms. 

We need to create the queue with:



In [24]:
queue = Queue()

and then we put elements in the queue:



In [25]:
queue.put(1)
queue.put(2)
queue.put(3)

So the queue order will be 1 then 2 then 3. 

To get and remove the first item, we can use `queue.get()`:

In [26]:
first = queue.get()

Printing `first` will yield the first item in the queue. 

In [28]:
print(first)

1


Now the queue only has 2 and 3 in it, as 1 has been retrieved and removed. 

When you have finished processing an item taken from the queue, you should call `queue.task_done()` to tell the queue that the item is complete. 

The `queue.join()` method blocks until all items in the queue have been retrieved and processed. This is similar to `thread.join()`, which we used before to block the main thread and force it to wait until all the threads had finished. 
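The interplay of `put()`, `get()`, `task_done()` and `join()` can be sketched with a single worker thread. The `worker` function, the `results` list and the use of `None` as a "no more work" signal are assumptions made for this example, not part of the `queue` module.

```python
from queue import Queue
from threading import Thread

q = Queue()
results = []

def worker():
    while True:
        # blocks until an item is available
        item = q.get()
        if item is None:
            # hypothetical sentinel value meaning "no more work"
            q.task_done()
            break
        results.append(item * item)
        # tell the queue that this item has been fully processed
        q.task_done()

t = Thread(target=worker)
t.start()

for n in (1, 2, 3):
    q.put(n)

# returns once task_done() has been called for every item put so far
q.join()
print(results)

# ask the worker to stop, then wait for the thread to finish
q.put(None)
t.join()
```

With a single worker the items are processed in FIFO order, so the squares appear in the same order in which the numbers were put into the queue.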

# Using Daemon threads 

Define the number of threads and create the queue with:

```python
q = Queue()
num_threads = 10
```
and make each thread a daemon thread with `thread.daemon = True`, which must be set before the thread is started.
 
```python
for i in range(num_threads):
    thread = Thread(target=worker, args=(q, lock))
    thread.daemon = True
    thread.start()
```

In a dummy function `worker` we create an infinite loop with `while True:` and use `value = q.get()` to get the first item from the queue in a thread-safe way. `q.get()` blocks until an item is available. 

We create 10 threads. After starting them, we fill the queue with `q.put()` inside `for i in range(1, 11):`. `q.put()` is also thread-safe: the queue does its own locking, so no two threads can modify it at the same time. 

We also pass a lock to the function. Holding it around `print` keeps the output of different threads from being interleaved. 

In [51]:
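from threading import Thread, Lock, current_thread
from queue import Queue

def worker(q, lock):
    while True:
        # blocks until an item is available
        value = q.get()
        
        #process q
        
        # hold the lock so that only one thread prints at a time
        with lock:
            print(f'{current_thread().name} got value {value} \n')
        q.task_done()

if __name__ == '__main__':
    
    #create the queue
    q = Queue()
    
    #create the lock
    lock = Lock()
    
    num_threads  = 10 
        
    for i in range(num_threads):
        thread = Thread(target=worker, args=(q, lock))
        # the daemon flag must be set before the thread is started
        thread.daemon = True
        thread.start()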

    for i in range(1, 11):
        q.put(i)

    q.join()
    
    print('End Main')

Thread-220 got value 1 
Thread-224 got value 5 
Thread-221 got value 2 
Thread-226 got value 8 
Thread-227 got value 7 

End Main



Thread-224 got value 6 

Thread-226 got value 9 

Thread-221 got value 3 

Thread-224 got value 10 


Thread-221 got value 4 



We can see that threads of different names are given the values. They are not sequential but all of the values from the queue get processed. 

### Why are we using a daemon thread?

From the [Python Documentation](https://docs.python.org/2/library/threading.html#thread-objects)
>  The significance of this flag is that the entire Python program exits when only daemon threads are left. The initial value is inherited from the creating thread.

By setting threads as daemon threads you can let them run and forget about them: when your program quits, any remaining daemon threads are killed automatically. Some threads run background tasks that are only useful while the main program is running, so it's okay to kill them off once the other, non-daemon threads have exited. In our example the workers run an infinite loop, so without the daemon flag the program would never exit. 
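The flag can also be passed straight to the `Thread` constructor with `daemon=True`. The short sketch below, with a made-up `background_task` function, shows this and also that the flag can no longer be changed once the thread has been started.

```python
import time
from threading import Thread

def background_task():
    # stands in for some long-running background work
    time.sleep(0.1)

# set the flag when creating the thread
t = Thread(target=background_task, daemon=True)
print(t.daemon)
t.start()

# changing the flag after start() raises a RuntimeError
try:
    t.daemon = False
except RuntimeError as error:
    changed_after_start = False
    print('Could not change the daemon flag:', error)
else:
    changed_after_start = True

t.join()
```

This is why the loop that creates the worker threads has to set `thread.daemon` before it calls `thread.start()`.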

