## Threading and multiprocessing

### Threading
#### Pros
* Shared memory - makes access to state from another context easier
* Lightweight - low memory footprint
* Allows you to easily make responsive UIs
* In CPython, C extension modules that properly release the GIL will run in parallel
* Great option for I/O-bound applications

#### Cons
* CPython - subject to the GIL, so only one thread executes Python bytecode at a time
* Not interruptible/killable
* Unless you follow a command queue/message pump model (using the ```queue``` module), manual use of synchronization primitives becomes a necessity (and you have to decide on the granularity of locking)
* Code is usually harder to understand and to get right - the potential for race conditions increases dramatically
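
The command queue/message pump model mentioned in the list above can be sketched roughly as follows. This is a minimal illustration, not a prescribed design: the `worker` function, the `None` sentinel, and the squaring "work" are assumptions made for the example. The point is that the threads only exchange data through ```queue.Queue``` objects, so no explicit locks are needed.

```python
import queue
import threading

def worker(tasks, results):
    """Process commands from `tasks` until the None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:          # sentinel (an assumption of this sketch): stop
            break
        results.put(item * item)  # placeholder for real work

tasks = queue.Queue()
results = queue.Queue()

t = threading.Thread(target=worker, args=(tasks, results))
t.start()

for n in range(5):
    tasks.put(n)
tasks.put(None)   # ask the worker to stop
t.join()

# A single worker reads the FIFO queue in order, so results keep that order.
squares = [results.get() for _ in range(5)]
print(squares)
```

With more than one worker the results would no longer be guaranteed to arrive in submission order.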


### Multiprocessing

#### Pros
* Separate memory space
* Code is usually straightforward
* Takes advantage of multiple CPUs & cores
* Avoids GIL limitations for CPython
* Eliminates most needs for synchronization primitives unless you use shared memory (instead, it's more of a communication model for IPC)
* Child processes are interruptible/killable
* Python's ```multiprocessing``` module includes useful abstractions with an interface much like ```threading.Thread```
* A must with CPython for CPU-bound processing

#### Cons
* IPC is a little more complicated and has more overhead (communication model vs. shared memory/objects)
* Larger memory footprint


In [24]:
import time
from threading import Thread

def print_time(thread_name, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("%s: %s" % (thread_name, time.ctime(time.time())))

if __name__ == '__main__':
    # Create two threads as follows
    t1 = Thread(target=print_time, args=("Thread-1", 2, ))
    t2 = Thread(target=print_time, args=("Thread-2", 4, ))
    t1.start()
    t2.start()

Thread-1: Tue Sep  4 16:44:50 2018
Thread-2: Tue Sep  4 16:44:52 2018
Thread-1: Tue Sep  4 16:44:52 2018
Thread-1: Tue Sep  4 16:44:54 2018
Thread-2: Tue Sep  4 16:44:56 2018
Thread-1: Tue Sep  4 16:44:56 2018
Thread-1: Tue Sep  4 16:44:58 2018
Thread-2: Tue Sep  4 16:45:00 2018
Thread-2: Tue Sep  4 16:45:04 2018


In [16]:
import threading
import time

class MyThread(threading.Thread):
    def __init__(self, name, counter):
        threading.Thread.__init__(self)
        self.name = name
        self.counter = counter

    def run(self):
        print("Starting " + self.name)
        print_time(self.name, self.counter, 5)
        print("Exiting " + self.name)

def print_time(thread_name, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (thread_name, time.ctime(time.time())))
        counter -= 1

        
# Create new threads
thread1 = MyThread("Thread-1", 1)
thread2 = MyThread("Thread-2", 2)

# Start new Threads
thread1.start()
thread2.start()

print("Exiting Main Thread")

Starting Thread-1
Starting Thread-2
Exiting Main Thread
Thread-1: Tue Sep  4 16:41:41 2018
Thread-1: Tue Sep  4 16:41:42 2018
Thread-2: Tue Sep  4 16:41:42 2018
Thread-1: Tue Sep  4 16:41:43 2018
Thread-1: Tue Sep  4 16:41:44 2018
Thread-2: Tue Sep  4 16:41:44 2018
Thread-1: Tue Sep  4 16:41:45 2018
Exiting Thread-1
Thread-2: Tue Sep  4 16:41:46 2018
Thread-2: Tue Sep  4 16:41:48 2018
Thread-2: Tue Sep  4 16:41:50 2018
Exiting Thread-2


### Synchronization
The ```threading``` module provided with Python includes a simple locking mechanism that allows you to synchronize threads. A new lock is created by calling ```threading.Lock()```, which returns the new lock object.

The ```acquire(blocking=True, timeout=-1)``` method of the lock object is used to force threads to pass through a section of code one at a time. The optional ```blocking``` parameter controls whether the thread waits to acquire the lock.

If ```blocking``` is ```False```, the call returns immediately: ```False``` if the lock cannot be acquired and ```True``` if it was acquired. If ```blocking``` is ```True``` (the default), the thread blocks until the lock is released and then acquires it.

The ```release()``` method of the lock object is used to release the lock when it is no longer required.
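
As a small sketch of the blocking and non-blocking forms of ```acquire()``` described above, the following runs in a single thread so the return values are easy to follow. The variable names are only illustrative.

```python
import threading

lock = threading.Lock()

first = lock.acquire()                  # blocking by default; the lock is free
second = lock.acquire(blocking=False)   # already held, so this returns False
timed = lock.acquire(timeout=0.1)       # waits up to 0.1 s, then gives up

lock.release()
after_release = lock.acquire(blocking=False)  # free again, so this succeeds
lock.release()

print(first, second, timed, after_release)  # True False False True
```

Note that a plain ```Lock``` is not reentrant: even the thread that holds it cannot acquire it a second time.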


In [17]:
import threading
import time

class MyThread (threading.Thread):
    
    thread_lock = threading.Lock()

    def __init__(self, name, counter):
        threading.Thread.__init__(self)
        self.name = name
        self.counter = counter

    def run(self):
        print("Starting " + self.name)
        # Get lock to synchronize threads
        MyThread.thread_lock.acquire()
        print_time(self.name, self.counter, 3)
        # Free lock to release next thread
        MyThread.thread_lock.release()

def print_time(thread_name, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (thread_name, time.ctime(time.time())))
        counter -= 1
        
        
# Create new threads
threads = (MyThread("Thread-1", 1), 
           MyThread("Thread-2", 2))

# Start all threads
for x in threads: 
    x.start()

# Wait for all threads to complete
for x in threads:
    x.join()

print("Exiting Main Thread")        

Starting Thread-1
Starting Thread-2
Thread-1: Tue Sep  4 16:42:03 2018
Thread-1: Tue Sep  4 16:42:04 2018
Thread-1: Tue Sep  4 16:42:05 2018
Thread-2: Tue Sep  4 16:42:07 2018
Thread-2: Tue Sep  4 16:42:09 2018
Thread-2: Tue Sep  4 16:42:11 2018
Exiting Main Thread


**```RLock```** - a reentrant lock is a synchronization primitive that may be acquired multiple times by the same thread. Internally, it uses the concepts of “owning thread” and “recursion level” in addition to the locked/unlocked state used by primitive locks. In the locked state, some thread owns the lock; in the unlocked state, no thread owns it.
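
The difference from a plain ```Lock``` can be illustrated with a short sketch. The helper `nested_acquire` is a name invented for this example; it tries to take the same lock a second time from the thread that already holds it.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

def nested_acquire(lock):
    """Try to take `lock` again from the thread that already holds it."""
    with lock:
        got_it = lock.acquire(blocking=False)  # second acquisition, same thread
        if got_it:
            lock.release()                     # undo the inner acquisition
        return got_it

plain_result = nested_acquire(plain)           # a Lock refuses the second acquire
reentrant_result = nested_acquire(reentrant)   # an RLock raises its recursion level
print(plain_result, reentrant_result)  # False True
```

An ```RLock``` must be released once for every time it was acquired before another thread can take it.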

Locks, RLocks, Semaphores, and Conditions can also be used as **context managers** in a ```with``` statement:

In [25]:
import threading
import time

class MyThread (threading.Thread):
    
    thread_lock = threading.Lock()

    def __init__(self, name, counter):
        threading.Thread.__init__(self)
        self.name = name
        self.counter = counter

    def run(self):
        print("Starting " + self.name)
        # Get lock to synchronize threads
        with MyThread.thread_lock: # acquire and release
            print_time(self.name, self.counter, 3)

def print_time(thread_name, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (thread_name, time.ctime(time.time())))
        counter -= 1
        
# Create new threads
threads = (MyThread("Thread-1", 1), 
           MyThread("Thread-2", 2))

# Start all threads
for x in threads: 
    x.start()

# Wait for all threads to complete
for x in threads:
    x.join()

print("Exiting Main Thread")        

Starting Thread-1
Starting Thread-2
Thread-1: Tue Sep  4 16:45:08 2018
Thread-1: Tue Sep  4 16:45:09 2018
Thread-1: Tue Sep  4 16:45:10 2018
Thread-2: Tue Sep  4 16:45:13 2018
Thread-2: Tue Sep  4 16:45:15 2018
Thread-2: Tue Sep  4 16:45:17 2018
Exiting Main Thread


### Condition

A condition variable is always associated with some kind of lock; this can be passed in or one will be created by default. (Passing one in is useful when several condition variables must share the same lock.)

A condition variable has ```acquire()``` and ```release()``` methods that call the corresponding methods of the associated lock. It also has a ```wait()``` method, and ```notify()``` and ```notify_all()``` methods (```notifyAll()``` is a deprecated alias of the latter). These three must only be called when the calling thread has acquired the lock.
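
A bare ```wait()``` can miss a notification that was sent before the waiting thread got to it. One common way around this, sketched below under the assumption of a simple shared list called `items`, is ```wait_for()```, which checks a predicate before waiting and again after every wake-up.

```python
import threading

items = []
consumed = []
cv = threading.Condition()

def consumer():
    with cv:
        # wait_for() returns at once if the predicate already holds and
        # re-checks it after every wake-up, so the start order of the two
        # threads does not matter.
        cv.wait_for(lambda: len(items) > 0)
        consumed.append(items.pop())

def producer():
    with cv:
        items.append("resource")
        cv.notify_all()

c = threading.Thread(target=consumer)
p = threading.Thread(target=producer)
c.start()
p.start()
c.join()
p.join()
print(consumed)  # ['resource']
```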



In [5]:
import threading
import time

def consumer(cv):
    print("Consumer thread started ...")
    with cv:
        print('Consumer waiting ...')
        cv.wait()
        print('Consumer consumed the resource')

def producer(cv):
    print('Producer thread started ...')
    with cv:
        print('Making resource available')
        print('Notifying to all consumers')
        cv.notify_all()


condition = threading.Condition()
cs1 = threading.Thread(name='consumer1', target=consumer, args=(condition,))
cs2 = threading.Thread(name='consumer2', target=consumer, args=(condition,))
pd = threading.Thread(name='producer', target=producer, args=(condition,))

cs1.start()
time.sleep(2)
cs2.start()
time.sleep(2)
pd.start()        

Consumer thread started ...
Consumer waiting ...
Consumer thread started ...
Consumer waiting ...
Producer thread started ...
Making resource available
Notifying to all consumers
Consumer consumed the resource
Consumer consumed the resource


### Semaphores

A semaphore manages an internal counter which is decremented by each ```acquire()``` call and incremented by each ```release()``` call. The counter can never go below zero; when ```acquire()``` finds that it is zero, it blocks, waiting until some other thread calls ```release()```.
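
A typical use of the counter is to cap how many threads may be inside a section at once. The sketch below is illustrative only: the limit of 2, the `stats` dictionary, and the short sleep standing in for real work are all assumptions of the example.

```python
import threading
import time

MAX_CONCURRENT = 2                       # illustrative limit
slots = threading.Semaphore(MAX_CONCURRENT)
stats_lock = threading.Lock()
stats = {"active": 0, "peak": 0}

def task():
    with slots:                          # acquire() on entry, release() on exit
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.05)                 # stand-in for real work
        with stats_lock:
            stats["active"] -= 1

threads = [threading.Thread(target=task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The peak never exceeds the semaphore's initial value.
print("peak concurrency:", stats["peak"])
```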

In [6]:
import threading
import time

shared_list = []

def writer(ws, rs):
    print('Writer thread started ...')
    phrase = "Hello world"
    for word in phrase.split():
        ws.acquire()
        print('Writer thread prints', word)
        shared_list.append(word)
        rs.release()
    rs.release()
    
def reader(ws, rs):
    print('Reader thread started ...')
    while True:
        rs.acquire()
        if not shared_list:
            break
        word = shared_list.pop()
        print('Reader read', word)
        ws.release()
        

write_sem = threading.Semaphore(1)
read_sem = threading.Semaphore(0)
wr = threading.Thread(name='writer', target=writer, args=(write_sem, read_sem))
rd = threading.Thread(name='reader', target=reader, args=(write_sem, read_sem))
wr.start()
time.sleep(2)
rd.start()

Writer thread started ...
Writer thread prints Hello
Reader thread started ...
Reader read Hello
Writer thread prints world
Reader read world


### Event

This is one of the simplest mechanisms for communication between threads: one thread signals an event and other threads wait for it.

An event object manages an internal flag that can be set to true with the ```set()``` method and reset to false with the ```clear()``` method. The ```wait()``` method blocks until the flag is true.
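
The flag semantics can be seen in a short sketch. ```wait()``` also accepts a timeout and returns the state of the flag, which makes it easy to tell a signal from a timeout. The `worker` function and the `log` list are names made up for this example.

```python
import threading

ready = threading.Event()
log = []

def worker():
    ready.wait()                 # blocks until another thread calls set()
    log.append("worker ran")

t = threading.Thread(target=worker)
t.start()

before = ready.wait(timeout=0.05)   # flag not set yet, so this times out: False
ready.set()                         # wakes every thread waiting on the event
t.join()
after = ready.wait(timeout=0.05)    # flag is set, so this returns True at once
ready.clear()                       # reset the flag for later reuse

print(before, after, ready.is_set(), log)
```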

In [7]:
import threading
import time


def foo(ev):
    print('Foo', end='')
    ev.set()


def bar(ev):
    while not ev.is_set():
        print('Waiting ...')
        ev.wait(1)
    print('Bar')


print_event = threading.Event()
 
ft = threading.Thread(name='foo', target=foo, args=(print_event,))
bt = threading.Thread(name='bar', target=bar, args=(print_event,))
bt.start()
time.sleep(3)
ft.start()

Waiting ...
Waiting ...
Waiting ...
FooBar


### Timer

This class represents an action that should be run only after a certain amount of time has passed — a timer. Timer is a subclass of Thread and as such also functions as an example of creating custom threads.

Timers are started, as with threads, by calling their ```start()``` method. The timer can be stopped (before its action has begun) by calling the ```cancel()``` method. The interval the timer will wait before executing its action may not be exactly the same as the interval specified by the user.
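
A minimal sketch of ```cancel()```, assuming a shared list `calls` that records which timers actually ran their action:

```python
from threading import Timer

calls = []

cancelled = Timer(0.5, calls.append, args=("cancelled timer",))
cancelled.start()
cancelled.cancel()   # the action has not begun yet, so it will never run
cancelled.join()     # returns promptly once the timer thread exits

fired = Timer(0.05, calls.append, args=("fired timer",))
fired.start()
fired.join()         # Timer is a Thread, so join() waits for the action

print(calls)  # ['fired timer']
```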

In [23]:
from threading import Timer


def hello():
    print("hello, world")


t = Timer(5, hello)
t.start()

hello, world


### Barrier

New in version 3.2.

This class provides a simple synchronization primitive for use by a fixed number of threads that need to wait for each other. Each of the threads tries to pass the barrier by calling the ```wait()``` method and will block until all of the threads have made their ```wait()``` calls. At this point, the threads are released simultaneously.

The barrier can be reused any number of times for the same number of threads.
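
Reuse can be sketched with three threads that pass the same barrier twice. The example also records the return value of ```wait()```, which is an integer in `range(parties)` that differs for each thread in a given round. The `rounds` dictionary and its lock are assumptions of this sketch.

```python
import threading

PARTIES = 3
barrier = threading.Barrier(PARTIES)
rounds = {0: [], 1: []}
rounds_lock = threading.Lock()

def worker():
    for round_no in (0, 1):          # the same barrier is reused each round
        index = barrier.wait()       # distinct int in range(PARTIES) per thread
        with rounds_lock:
            rounds[round_no].append(index)

threads = [threading.Thread(target=worker) for _ in range(PARTIES)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(rounds[0]), sorted(rounds[1]))  # [0, 1, 2] [0, 1, 2]
```

The returned index is handy for electing one thread per round to do some housekeeping, for example `if index == 0: ...`.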

In [22]:
import time
from threading import Barrier, Thread

def runner():
    b.wait()
    print("Run")

def commentator():
    for _ in range(10):
        print("Ready:", b.n_waiting)
        time.sleep(1)

b = Barrier(7)

comment_thread = Thread(target=commentator)
comment_thread.start()

runner_threads = []

for _ in range(7):
    runner_thread = Thread(target=runner)
    runner_threads.append(runner_thread)
    runner_thread.start()
    time.sleep(1)

Ready: 0
Ready: 1
Ready: 2
Ready: 3
Ready: 4
Ready: 5
Ready: 6
Run
Run
Run
Run
Run
Run
Run
Ready: 0
Ready: 0
Ready: 0


## Global interpreter lock (GIL)

The global interpreter lock is the mechanism used by the CPython interpreter to ensure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, at the expense of much of the parallelism afforded by multi-processor machines.

However, some extension modules, either standard or third-party, are designed so as to release the GIL when doing computationally-intensive tasks such as compression or hashing. Also, the GIL is always released when doing I/O.

Past efforts to create a “free-threaded” interpreter (one which locks shared data at a much finer granularity) were not successful because performance suffered in the common single-processor case, and overcoming this performance issue was expected to make the implementation much more complicated and therefore costlier to maintain. More recently, CPython 3.13 added an optional, experimental free-threaded build (PEP 703) in which the GIL can be disabled.



In [10]:
import sys

a = []
b = a
sys.getrefcount(a)

3

![GIL checkinterval](files/img/1.png)

In [21]:
# Legacy API: deprecated since Python 3.2 and removed in Python 3.9
sys.getcheckinterval()

100

![GIL checkinterval](files/img/2.png)

In [20]:
sys.getswitchinterval()

0.005
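
The switch interval shown above can also be changed at runtime with ```sys.setswitchinterval()```. The following is only a sketch of the round trip; the value `0.01` is an arbitrary choice for illustration, and the right setting (if the default needs changing at all) depends on the workload.

```python
import sys

original = sys.getswitchinterval()   # the interval currently in effect

sys.setswitchinterval(0.01)          # ask for thread switches less often
changed = sys.getswitchinterval()

sys.setswitchinterval(original)      # restore the previous value
restored = sys.getswitchinterval()

print(original, changed, restored)
```

The interval is the time after which the interpreter asks the running thread to give up the GIL; it is not a guarantee of when another thread actually gets to run.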