# Threads (`threading`) & Processes (`multiprocessing`)
- The `threading` and `multiprocessing` modules have very similar APIs.
- Both follow the same concept: code is put into workers, which are started from the main code.
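
As a minimal sketch of how closely the two APIs mirror each other, the snippet below starts one `threading.Thread` and one `multiprocessing.Process` through the same helper. The names `greet` and `run_worker` are made up for this illustration; only `target`, `args`, `start()`, `join()` and `is_alive()` come from the standard library.

```python
import multiprocessing
import threading


def greet(name):
    """Work done by the worker; prints from whichever thread or process runs it."""
    print("hello from", name)


def run_worker(worker_cls, name):
    """Start one worker of the given class, wait for it, and report if it is still alive."""
    worker = worker_cls(target=greet, args=(name,))
    worker.start()   # begin running greet() in the new thread or process
    worker.join()    # block until the worker has finished
    return worker.is_alive()


if __name__ == "__main__":
    # The same helper works for both classes because their interfaces match.
    run_worker(threading.Thread, "a thread")
    run_worker(multiprocessing.Process, "a process")
```

The `if __name__ == "__main__":` guard matters mainly for the process case, where the main module may be imported again by the child process on some platforms.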

## Threads
To create a thread, subclass `threading.Thread` and implement the `run` method. Override `__init__` only if the worker needs its own setup.

That is all the code you need to successfully create and instantiate a thread in Python.

In [20]:
import threading
import time

class Worker(threading.Thread):
    # The worker's constructor. When __init__ is overridden, the call to the
    # parent constructor via super() is vital for the thread to function properly.
    def __init__(self):
        super(Worker, self).__init__()

    def run(self):
        for i in range(10):
            print(i)
            time.sleep(0.05)
        

# This initializes thread1 as an instance of our Worker thread
thread1 = Worker()
# start() runs the worker's run() method in a new thread
thread1.start()

0
1
2
3
4
5
6
7
8
9
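
Subclassing is not the only option: `threading.Thread` also accepts a callable through its `target` argument. The sketch below is an illustration under that assumption of use; the function `count` and the list `seen` are hypothetical names. It additionally calls `join()` so that the main code waits for the thread before reading the shared list.

```python
import threading
import time


def count(limit, seen):
    """Append 0 .. limit-1 to the shared list, pausing briefly between items."""
    for i in range(limit):
        seen.append(i)
        time.sleep(0.01)


seen = []
thread = threading.Thread(target=count, args=(5, seen))
thread.start()
thread.join()   # block until count() has returned
print(seen)
```

Threads share the memory of the process that created them, which is why the main code can read `seen` directly after `join()`.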


We can now create multiple threads that run concurrently ('pseudo'-parallel): while one thread sleeps, the others keep working, so they do not block each other.

In [21]:
thread1 = Worker()
thread2 = Worker()
thread3 = Worker()
thread1.start()
thread2.start()
thread3.start()

0
0
0
1
1
1
2
2
2
3
3
3
4
4
4
5
5
5
6
6
6
7
7
7
8
8
8
99

9
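
The `99` near the end of the output above is probably two threads printing at almost the same moment, so their output ran together. One possible way to keep each line intact is to guard `print` with a `threading.Lock`. This is a sketch only: `LockedWorker`, `print_lock` and `log` are names invented for the example.

```python
import threading
import time

print_lock = threading.Lock()


class LockedWorker(threading.Thread):
    def __init__(self, label, log):
        super().__init__()
        self.label = label
        self.log = log

    def run(self):
        for i in range(3):
            # Only one thread at a time may print and record its step.
            with print_lock:
                print(self.label, i)
                self.log.append((self.label, i))
            time.sleep(0.01)


log = []
workers = [LockedWorker(label, log) for label in ("a", "b", "c")]
for w in workers:
    w.start()
for w in workers:
    w.join()   # wait until every worker has finished
```

The order in which the three workers take turns is still up to the scheduler; the lock only ensures that lines are not mixed.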


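## Problems with the global interpreter lock (GIL)
The `threading` code does not use multiple cores for CPU-bound work: in standard CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time.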

In [23]:
import threading
import time
import math

class FactorialWorker(threading.Thread):
    # The worker's constructor. When __init__ is overridden, the call to the
    # parent constructor via super() is vital for the thread to function properly.
    def __init__(self):
        super(FactorialWorker, self).__init__()

    def run(self):
        res = math.factorial(1500000)
        
        
thread1 = FactorialWorker()
thread1.start()

In [24]:
thread1 = FactorialWorker()
thread2 = FactorialWorker()
thread3 = FactorialWorker()
thread1.start()
thread2.start()
thread3.start()
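
One way to observe the effect is to time the same CPU-bound work run sequentially and run in threads. The sketch below uses a plain Python loop instead of `math.factorial`; the helper names `busy`, `run_sequential` and `run_threaded` are hypothetical. On a standard CPython build with the GIL, the two timings are expected to be roughly equal, but the actual numbers depend on the machine and the interpreter build.

```python
import threading
import time


def busy(n):
    """CPU-bound pure-Python loop; returns 0 + 1 + ... + (n - 1)."""
    total = 0
    for i in range(n):
        total += i
    return total


def run_sequential(n, repeats):
    """Run busy(n) `repeats` times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        busy(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    """Run busy(n) in `repeats` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=busy, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n, repeats = 2_000_000, 3
    print("sequential: {:.2f} s".format(run_sequential(n, repeats)))
    print("threaded:   {:.2f} s".format(run_threaded(n, repeats)))
```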

## Multiprocessing
The same calculation with `multiprocessing` is able to use multiple cores, because each process runs its own Python interpreter.
- Define the worker function and create multiple processes.

In [25]:
import math
import multiprocessing

def worker(num):
    """process worker function"""
    print('Worker:', num, multiprocessing.current_process().name)
    res = math.factorial(1500000)
    print('Worker:', num, 'finished')
    return

jobs = []
for i in range(6):
    p = multiprocessing.Process(target=worker, args=(i,))
    jobs.append(p)
    p.start()

Worker: 0 Process-17
Worker: 1 Process-18
Worker: 2 Process-19
Worker: 3 Process-20
Worker: 4 Process-21
Worker: 5 Process-22
Worker: 1 finished
Worker: 3 finished
Worker: 0 finished
Worker: 4 finished
Worker: 5 finished
Worker: 2 finished


## Waiting for processes
- To wait until a process has completed its work and exited, use the `join()` method.
- Work can be distributed to the processes via arguments to worker functions.

In [6]:
import os
 
from multiprocessing import Process
 
def doubler(number):
    """
    A doubling function that can be used by a process
    """
    result = number * 2
    proc = os.getpid()
    print('{0} doubled to {1} by process id: {2}'.format(
        number, result, proc))
 
if __name__ == '__main__':
    numbers = [5, 10, 15, 20, 25]
    procs = []
 
    for number in numbers:
        proc = Process(target=doubler, args=(number,))
        procs.append(proc)
        proc.start()
 
    for proc in procs:
        proc.join()

5 doubled to 10 by process id: 21713
10 doubled to 20 by process id: 21714
15 doubled to 30 by process id: 21719
20 doubled to 40 by process id: 21722
25 doubled to 50 by process id: 21723


## Pool class
The `Pool` class represents a **pool of worker processes**.

Its methods let you offload tasks to the worker processes. Let's look at a really simple example:

In [7]:
from multiprocessing import Pool
 
def doubler(number):
    return number * 2
 
if __name__ == '__main__':
    numbers = [5, 10, 20]
    # the with-block shuts the pool down once the work is done
    with Pool(processes=3) as pool:
        print(pool.map(doubler, numbers))

[10, 20, 40]
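
Besides `map`, a pool offers other ways to hand out work. As a rough sketch, `starmap` unpacks each tuple into several arguments, and `apply_async` submits a single call and returns a handle whose `get()` waits for the result. The function `power` and the wrapper `main` are invented for this illustration.

```python
from multiprocessing import Pool


def power(base, exponent):
    """Return base raised to exponent."""
    return base ** exponent


def main():
    pairs = [(2, 3), (3, 2), (10, 0)]
    with Pool(processes=2) as pool:
        # starmap calls power(2, 3), power(3, 2), power(10, 0)
        print(pool.starmap(power, pairs))
        # apply_async returns immediately; get() blocks until the result is ready
        pending = pool.apply_async(power, (2, 10))
        print(pending.get(timeout=10))


if __name__ == "__main__":
    main()
```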


## Process Communication

Processes do not share memory with each other, so data has to be passed between them explicitly, for example through a `multiprocessing.Queue`.

In [8]:
from multiprocessing import Process, Queue
 
sentinel = -1
 
def creator(data, q):
    """
    Creates data to be consumed by putting each item
    on the queue
    """
    print('Creating data and putting it on the queue')
    for item in data:
        q.put(item)
 
def my_consumer(q):
    """
    Consumes some data and works on it
 
    In this case, all it does is double the input
    """
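    while True:
        data = q.get()
        print('data found to be processed: {}'.format(data))
        processed = data * 2
        print(processed)
 
        # compare by value, not identity; the check comes after processing,
        # so the sentinel itself is doubled and printed as well
        if data == sentinel:
            break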
 
if __name__ == '__main__':
    q = Queue()
    data = [5, 10, 13, -1]
    process_one = Process(target=creator, args=(data, q))
    process_two = Process(target=my_consumer, args=(q,))
    process_one.start()
    process_two.start()
 
    q.close()
    q.join_thread()
 
    process_one.join()
    process_two.join()

Creating data and putting it on the queue
data found to be processed: 5
10
data found to be processed: 10
20
data found to be processed: 13
26
data found to be processed: -1
-2
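
In the output above the sentinel `-1` is itself doubled before the consumer stops. A possible variation, sketched here under the assumption that `None` is never valid data, checks the sentinel first and sends the results back through a second queue. The names `STOP`, `double_all` and `main` are hypothetical.

```python
from multiprocessing import Process, Queue

STOP = None  # assumed sentinel: None is taken to never occur as real data


def double_all(tasks, results):
    """Double every item from tasks onto results until STOP arrives."""
    while True:
        item = tasks.get()
        if item is STOP:
            break               # stop before processing the sentinel
        results.put(item * 2)
    results.put(STOP)           # tell the reader that no more results follow


def main():
    tasks, results = Queue(), Queue()
    consumer = Process(target=double_all, args=(tasks, results))
    consumer.start()

    for item in [5, 10, 13]:
        tasks.put(item)
    tasks.put(STOP)

    doubled = []
    while True:                 # drain the results before joining the process
        item = results.get()
        if item is STOP:
            break
        doubled.append(item)

    consumer.join()
    print(doubled)


if __name__ == "__main__":
    main()
```

Reading all results before calling `join()` avoids waiting on a process that is itself still waiting to flush items onto the queue.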
