# Parallel programming for CPU 

### Thread-based parallelism via the threading module

The [threading](https://docs.python.org/3/library/threading.html) module provides a high-level, object-oriented API for working with concurrency in Python. It manages the execution of threads within a single process: Thread objects run concurrently in the same process space and share memory with one another. Threads are an easy way to scale tasks that are I/O bound rather than CPU bound, because in CPython the global interpreter lock lets only one thread execute Python bytecode at a time, while a thread that is waiting on I/O lets the others run.
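
To make the I/O-bound case concrete, here is a minimal sketch that compares running five waits one after another with running them in threads. The helper names (`fake_io`, `run_sequential`, `run_threaded`) are hypothetical, and `time.sleep()` only stands in for a real I/O call such as a network request.

```python
import threading
import time

def fake_io(delay):
    """Hypothetical stand-in for an I/O call; sleeping lets other threads run."""
    time.sleep(delay)

def run_sequential(delays):
    """Run the waits one after another and return the elapsed time."""
    start = time.perf_counter()
    for d in delays:
        fake_io(d)
    return time.perf_counter() - start

def run_threaded(delays):
    """Run each wait in its own thread and return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(d,)) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

delays = [0.2] * 5
sequential = run_sequential(delays)
threaded = run_threaded(delays)
print('sequential: {:.2f}s, threaded: {:.2f}s'.format(sequential, threaded))
```

The sequential run should take roughly the sum of the delays, while the threaded run should take roughly as long as the longest single delay. The exact timings depend on the machine.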

----

In [60]:
import threading

def worker():
    """thread worker function"""
    print('Worker')


threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()

Worker
Worker
Worker
Worker
Worker


It is useful to be able to spawn a thread and pass it arguments that tell it what work to do. Any type of object can be passed as an argument to the thread. This example passes a number, which the thread sleeps for and then prints.

In [61]:
import threading
import time

def worker(number):
    """thread worker function"""
    time.sleep(number)
    print('Worker: %s' % number)
    
threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=[i])
    threads.append(t)
    t.start()

Worker: 0
Worker: 1
Worker: 2
Worker: 3
Worker: 4


Threads can be named with:

In [9]:
t = threading.Thread(name='worker-{}'.format(0), target=worker)
print(t.name)

worker-0

In order to identify the current thread one can use:

In [14]:
print(threading.current_thread().name)
print(threading.current_thread().native_id)

MainThread
22055


----

## Daemon vs. Non-daemon Threads

Up to this point, the examples above have implicitly waited for all threads to complete their work before exiting (these are called non-daemon threads). Sometimes it is beneficial for a program to spawn a thread as a daemon, which runs without blocking the main program from exiting.

Using daemon threads is useful for services where there may not be an easy way to interrupt the thread, or where letting the thread die in the middle of its work does not lose or corrupt data (for example, a thread that generates “heart beats” for a service monitoring tool). To mark a thread as a daemon, pass `daemon=True` when constructing it or set its `daemon` attribute to `True` before calling `start()`.

By default, a thread inherits its daemon status from the thread that creates it, so threads created from the main thread are not daemons.
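
As a minimal sketch of the two ways to mark a thread as a daemon, the snippet below builds three threads without starting them and inspects their `daemon` attribute. The `noop` target is a hypothetical placeholder.

```python
import threading

def noop():
    """Hypothetical placeholder target; the threads are never started."""
    pass

# Option 1: pass daemon=True to the constructor.
d1 = threading.Thread(target=noop, daemon=True)

# Option 2: set the attribute after construction, before start() is called.
d2 = threading.Thread(target=noop)
d2.daemon = True

# Left alone, a thread takes the daemon status of the thread that created it.
t = threading.Thread(target=noop)

print(d1.daemon, d2.daemon, t.daemon)
```

When this runs from the main thread, the last value is expected to be `False`, since the main thread is not a daemon.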

----

In [26]:
%%writefile tdeamons.py
import threading
import time

def daemon():
    print('Starting', threading.current_thread().name)
    time.sleep(5)
    print('Exiting', threading.current_thread().name)


def non_daemon():
    print('Starting', threading.current_thread().name)
    time.sleep(2)
    print('Exiting', threading.current_thread().name)

d = threading.Thread(name='daemon', target=daemon, daemon=True)

t = threading.Thread(name='non-daemon', target=non_daemon, daemon=False)

d.start()
t.start()

# Uncomment the next line to make the main thread wait for the daemon thread
#d.join()

Overwriting tdeamons.py


If you want the main program (thread) to wait until a daemon thread has completed its work, use the `join()` method. (Try it in the previous example!)

- By default, `join()` blocks indefinitely. It is also possible to pass a float value representing the number of seconds to wait for the thread to become inactive. If the thread does not complete within the timeout period, `join()` returns anyway.
- `join()` is not only useful with daemon threads; it can also act as a barrier for non-daemon threads.
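
The timeout behaviour described above can be sketched as follows. Because `join()` always returns `None`, the usual way to tell whether the timeout expired is to check `is_alive()` afterwards. The `slow_worker` function is a hypothetical example target.

```python
import threading
import time

def slow_worker():
    """Hypothetical worker that takes about one second."""
    time.sleep(1.0)

t = threading.Thread(target=slow_worker, daemon=True)
t.start()

# Wait at most 0.1 s; join() returns even though the thread is still running.
t.join(timeout=0.1)
still_running = t.is_alive()
print('still running after timeout:', still_running)

# Without a timeout, join() blocks until the thread has finished.
t.join()
finished = not t.is_alive()
print('finished:', finished)
```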

In [29]:
!python tdeamons.py

Starting daemon
Starting non-daemon
Exiting non-daemon


----

## Enumerating over all active threads

It is not necessary to retain an explicit handle to all of the daemon threads in order to ensure they have completed before exiting the main process. `threading.enumerate()` returns a list of active Thread instances. The list includes the current thread, and since joining the current thread would deadlock (`join()` raises a `RuntimeError` in that case), it must be skipped.

----

In [63]:
#%%writefile print_active_threads.py
#import threading

#print out all active threads
for t in threading.enumerate():
    print(t)

<_MainThread(MainThread, started 23422740358976)>
<Thread(Thread-3, started daemon 23422690563840)>
<Heartbeat(Thread-4, started daemon 23422688462592)>
<Thread(Thread-5, started daemon 23422682158848)>
<Thread(Thread-6, started daemon 23422680057600)>
<ControlThread(Thread-2, started daemon 23422677956352)>
<HistorySavingThread(IPythonHistorySavingThread, started 23422675855104)>
<ParentPollerUnix(Thread-1, started daemon 23422673491712)>


In [62]:
!python print_active_threads.py

<_MainThread(MainThread, started 23207208236864)>


In [54]:
%%writefile threads_enumarate.py
import threading
import time

def worker(pause):
    """thread worker function"""
    thread_id = threading.current_thread().native_id
    print('Thread {} sleeping for {}'.format(thread_id, pause))
    time.sleep(pause)
    print('Thread {} is awake!'.format(thread_id))

start = time.perf_counter()

for i in range(5):
    t = threading.Thread(target=worker, args=[i], daemon=True)
    t.start()

main_thread = threading.main_thread()

# Join every thread that is still active, skipping the main thread itself
for t in threading.enumerate():
    if t is main_thread:
        continue
    t.join()
    print('joined %s' % t.native_id)

end = time.perf_counter()
print('Finished in {} sec'.format(round(end-start)))

Overwriting threads_enumarate.py


In [55]:
!python threads_enumarate.py

Thread 4606 sleeping for 0
Thread 4606 is awake!
Thread 4607 sleeping for 1
Thread 4608 sleeping for 2
Thread 4609 sleeping for 3
Thread 4610 sleeping for 4
Thread 4607 is awake!
joined 4607
Thread 4608 is awake!
joined 4608
Thread 4609 is awake!
joined 4609
Thread 4610 is awake!
joined 4610
Finished in 4 sec


---
## Signaling Between Threads

Although the point of using multiple threads is to run separate operations concurrently, there are times when it is important to be able to synchronize the operations in two or more threads. Event objects are a simple way to communicate between threads safely.

An Event manages an internal flag that callers can control with the `set()` and `clear()` methods. Other threads can use `wait()` to pause until the flag is set, effectively blocking progress until allowed to continue.

---

In [8]:
import logging
import threading
import time

def wait_for_event(e):
    """Wait for the event to be set before doing anything"""
    logging.debug('wait_for_event starting')
    event_is_set = e.wait()
    logging.debug('event set: %s', event_is_set)

def wait_for_event_timeout(e, t):
    """Wait t seconds and then timeout"""
    while not e.is_set():
        logging.debug('wait_for_event_timeout starting')
        event_is_set = e.wait(t)
        logging.debug('event set: %s', event_is_set)
        if event_is_set:
            logging.debug('processing event')
        else:
            logging.debug('doing other work')

logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

#initialize the event object
e = threading.Event()

# initialize thread 1 and name it block, because it will wait indefinitely for an event to be set
t1 = threading.Thread(
    name='block',
    target=wait_for_event,
    args=(e,),
)
t1.start()

# initialize thread 2 and name it nonblock, because it will wait for an event to be set or timeout, whichever comes first
t2 = threading.Thread(
    name='nonblock',
    target=wait_for_event_timeout,
    args=(e, 5),
)
t2.start()

logging.debug('No thread will do anything because each thread is waiting for Event.set() to be called')
time.sleep(3)
e.set()
logging.debug('Event is set')



(block     ) wait_for_event starting
(nonblock  ) wait_for_event_timeout starting
(MainThread) No thread will do anything because each thread is waiting for Event.set() to be called
(MainThread) Event is set
(block     ) event set: True
(nonblock  ) event set: True
(nonblock  ) processing event


----
## Controlling Access to Resources


In addition to synchronizing the operations of threads, it is also important to be able to control access to shared resources to prevent corruption or missed data.

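In CPython, individual operations on the built-in data structures (lists, dictionaries, etc.) are thread-safe as a side effect of the global interpreter lock. Data structures implemented in Python do not have that protection, and neither do updates to simpler types like integers and floats: a statement such as `value = value + 1` is a separate read, add, and write, and another thread can run between those steps.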

To guard against simultaneous access to an object, use a `Lock` object.

----

In [1]:
%%writefile t_lock.py
import logging
import random
import threading
import time

class Counter:
    def __init__(self, start=0):
        self.lock = threading.Lock()
        self.value = start

    def increment(self):
        logging.debug('Waiting for lock')
        # Acquire the lock; any other thread calling increment() blocks here until it is released
        self.lock.acquire()
        logging.debug('Acquired lock')
        # try/finally guarantees that the lock is released even if the update raises
        try:
            logging.debug('Incrementing')
            # Increment the shared value
            self.value = self.value + 1
        finally:
            logging.debug('Releasing lock')
            # Release the lock
            self.lock.release()

def worker(c):
    for i in range(2):
        # Python 3.10+ appends the target name, e.g. "Thread-1 (worker)", so match on the prefix
        if threading.current_thread().name.startswith("Thread-1"):
            pause = 1
        else:
            pause = 5 
        logging.debug('Sleeping %0.02f', pause)
        time.sleep(pause)
        c.increment()
    logging.debug('Done')

    
logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

counter = Counter()
for i in range(2):
    t = threading.Thread(target=worker, args=(counter,))
    t.start()

logging.debug('Waiting for worker threads')
main_thread = threading.main_thread()
for t in threading.enumerate():
    if t is not main_thread:
        t.join()

logging.debug('Counter: %d', counter.value)

Overwriting t_lock.py


In [2]:
!python t_lock.py

(Thread-1  ) Sleeping 1.00
(Thread-2  ) Sleeping 5.00
(MainThread) Waiting for worker threads
(Thread-1  ) Waiting for lock
(Thread-1  ) Acquired lock
(Thread-1  ) Incrementing
(Thread-1  ) Releasing lock
(Thread-1  ) Sleeping 1.00
(Thread-1  ) Waiting for lock
(Thread-1  ) Acquired lock
(Thread-1  ) Incrementing
(Thread-1  ) Releasing lock
(Thread-1  ) Done
(Thread-2  ) Waiting for lock
(Thread-2  ) Acquired lock
(Thread-2  ) Incrementing
(Thread-2  ) Releasing lock
(Thread-2  ) Sleeping 5.00
(Thread-2  ) Waiting for lock
(Thread-2  ) Acquired lock
(Thread-2  ) Incrementing
(Thread-2  ) Releasing lock
(Thread-2  ) Done
(MainThread) Counter: 4
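
The acquire/try/finally pattern in `Counter.increment()` can also be written with a `with` statement, since a `Lock` is a context manager. The sketch below is a trimmed variant of the example above, without the logging and sleeping; the iteration and thread counts are arbitrary choices for illustration.

```python
import threading

class Counter:
    def __init__(self, start=0):
        self.lock = threading.Lock()
        self.value = start

    def increment(self):
        # "with" acquires the lock on entry and releases it on exit,
        # even if the body raises an exception.
        with self.lock:
            self.value = self.value + 1

def worker(c, n):
    for _ in range(n):
        c.increment()

counter = Counter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('Counter:', counter.value)  # prints: Counter: 4000
```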


#### `Lock()` vs `RLock()`

The main difference is that a `Lock` can only be acquired once. It cannot be acquired again until it is released. (After it has been released, it can be re-acquired by any thread.)

An `RLock` on the other hand, can be acquired multiple times, by the same thread. It needs to be released the same number of times in order to be "unlocked".

Another difference is that an acquired Lock can be released by any thread, while an acquired RLock can only be released by the thread which acquired it.
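
The differences listed above can be checked with a short sketch. It uses `acquire(blocking=False)` so that a failed acquire returns `False` instead of blocking forever; the variable and helper names are hypothetical.

```python
import threading

# Lock: a second acquire fails (or blocks) until the lock is released.
lock = threading.Lock()
lock.acquire()
lock_reacquired = lock.acquire(blocking=False)    # False: already held

# Any thread may release a Lock, not just the one that acquired it.
releaser = threading.Thread(target=lock.release)
releaser.start()
releaser.join()
lock_still_held = lock.locked()                   # False: released by the other thread

# RLock: the owning thread can acquire it repeatedly ...
rlock = threading.RLock()
rlock.acquire()
rlock_reacquired = rlock.acquire(blocking=False)  # True: same thread

# ... but only the owning thread may release it.
errors = []

def release_from_other_thread():
    try:
        rlock.release()
    except RuntimeError as exc:
        errors.append(exc)

other = threading.Thread(target=release_from_other_thread)
other.start()
other.join()

# The owner must release once per acquire.
rlock.release()
rlock.release()

print(lock_reacquired, lock_still_held, rlock_reacquired, len(errors))
# prints: False False True 1
```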

## Synchronizing Threads
### Condition

In addition to using `Event` objects, another way of synchronizing threads is to use a `Condition` object. Because the `Condition` uses a `Lock`, it can be tied to a shared resource, allowing multiple threads to wait for the resource to be updated. In this example, the `consumer()` threads wait on the `Condition` until they are notified. The `producer()` thread is responsible for making the resource available and notifying the other threads that they can continue.

In [53]:
import logging
import threading
import time


def consumer(cond):
    """wait for the condition and use the resource"""
    logging.debug('Starting consumer thread')
    with cond:
        cond.wait()
        logging.debug('Resource is available to consumer')


def producer(cond):
    """set up the resource to be used by the consumer"""
    logging.debug('Starting producer thread')
    time.sleep(5)
    with cond:
        logging.debug('Making resource available')
        cond.notify_all()


logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s (%(threadName)-2s) %(message)s',
)

condition = threading.Condition()
c1 = threading.Thread(name='consumer 1', target=consumer,
                      args=(condition,))
c2 = threading.Thread(name='consumer 2', target=consumer,
                      args=(condition,))
p = threading.Thread(name='producer', target=producer,
                     args=(condition,))

c1.start()
c2.start()
p.start()

2022-04-26 16:41:48,021 (consumer 1) Starting consumer thread
2022-04-26 16:41:48,023 (consumer 2) Starting consumer thread
2022-04-26 16:41:48,030 (producer) Starting producer thread
2022-04-26 16:41:53,036 (producer) Making resource available
2022-04-26 16:41:53,037 (consumer 2) Resource is available to consumer
2022-04-26 16:41:53,039 (consumer 1) Resource is available to consumer
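
A bare `cond.wait()` only wakes up for notifications sent while the thread is already waiting, so a consumer that starts after the producer has finished would wait forever. A minimal sketch of one way around this is to tie the condition to actual shared state and use `wait_for()` with a predicate. The `items` list and the variable names are assumptions made for illustration, not part of the example above.

```python
import threading

items = []        # shared resource, guarded by the condition's lock
consumed = []
cond = threading.Condition()

def consumer():
    with cond:
        # wait_for() returns at once if the predicate is already true and
        # re-checks it after every wake-up, so an early notify is not lost.
        cond.wait_for(lambda: len(items) > 0)
        consumed.append(items.pop())

def producer():
    with cond:
        items.append('resource')
        cond.notify_all()

p = threading.Thread(name='producer', target=producer)
c = threading.Thread(name='consumer', target=consumer)

# Deliberately run the producer to completion before the consumer starts.
p.start()
p.join()
c.start()
c.join(timeout=5)

print(consumed)  # prints: ['resource']
```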


### Barriers

Barriers are another thread synchronization mechanism. A Barrier establishes a control point and all participating threads block until all of the participating “parties” have reached that point. It lets threads start up separately and then pause until they are all ready to proceed.

In [58]:
import threading
import time

def worker(barrier):
    print(threading.current_thread().name,
          'waiting for barrier with {} others'.format(barrier.n_waiting))
    worker_id = barrier.wait()
    print(threading.current_thread().name,' after barrier. Worker Id: ',worker_id)

NUM_THREADS = 3

barrier = threading.Barrier(NUM_THREADS)

threads = [
    threading.Thread(
        name='worker-%s' % i,
        target=worker,
        args=(barrier,),
    )
    for i in range(NUM_THREADS)
]

for t in threads:
    print(t.name, 'starting')
    t.start()
    time.sleep(3)

for t in threads:
    t.join()

worker-0 starting
worker-0 waiting for barrier with 0 others
worker-1 starting
worker-1 waiting for barrier with 1 others
worker-2 starting
worker-2 waiting for barrier with 2 others
worker-2  after barrier. Worker Id:  2
worker-0worker-1  after barrier. Worker Id:  0
  after barrier. Worker Id:  1


### Semaphores

Sometimes it is useful to allow more than one worker access to a resource at a time, while still limiting the overall number.
For example, a connection pool might support a fixed number of simultaneous connections, or a network application might support a fixed number of concurrent downloads. A `Semaphore` is one way to manage those connections.

In [63]:
import logging
import random
import threading
import time

class ActivePool:

    def __init__(self):
        super(ActivePool, self).__init__()
        self.active = []
        self.lock = threading.Lock()

    def makeActive(self, name):
        with self.lock:
            self.active.append(name)
            logging.debug('Running: %s', self.active)

    def makeInactive(self, name):
        with self.lock:
            self.active.remove(name)
            logging.debug('Running: %s', self.active)

def worker(s, pool):
    logging.debug('Waiting to join the pool')
    with s:
        name = threading.current_thread().name
        pool.makeActive(name)
        time.sleep(4)
        pool.makeInactive(name)

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s (%(threadName)-2s) %(message)s',
)

pool = ActivePool()
s = threading.Semaphore(2)
for i in range(5):
    t = threading.Thread(
        target=worker,
        name=str(i),
        args=(s, pool),
    )
    t.start()

2022-04-26 16:56:09,510 (0 ) Waiting to join the pool
2022-04-26 16:56:09,514 (0 ) Running: ['0']
2022-04-26 16:56:09,513 (2 ) Waiting to join the pool
2022-04-26 16:56:09,515 (3 ) Waiting to join the pool
2022-04-26 16:56:09,515 (4 ) Waiting to join the pool
2022-04-26 16:56:09,517 (2 ) Running: ['0', '2']
2022-04-26 16:56:09,511 (1 ) Waiting to join the pool
2022-04-26 16:56:13,519 (0 ) Running: ['2']
2022-04-26 16:56:13,521 (3 ) Running: ['2', '3']
2022-04-26 16:56:13,524 (2 ) Running: ['3']
2022-04-26 16:56:13,526 (4 ) Running: ['3', '4']
2022-04-26 16:56:17,526 (3 ) Running: ['4']
2022-04-26 16:56:17,527 (1 ) Running: ['4', '1']
2022-04-26 16:56:17,529 (4 ) Running: ['1']
2022-04-26 16:56:21,533 (1 ) Running: []


In this example, the ActivePool class simply serves as a convenient way to track which threads are able to run at a given moment. A real resource pool would allocate a connection or some other value to the newly active thread, and reclaim the value when the thread is done. Here, it is just used to hold the names of the active threads to show that at most two are running concurrently.
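
The "at most two" claim can also be checked directly rather than read off the log. The sketch below is a hypothetical variant that counts the active workers and records the peak; `LIMIT`, `active`, and `max_active` are names introduced here for illustration.

```python
import threading
import time

LIMIT = 2
sem = threading.Semaphore(LIMIT)
state_lock = threading.Lock()   # protects the two counters below
active = 0
max_active = 0

def worker():
    global active, max_active
    with sem:                   # at most LIMIT threads get past this line at once
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.1)         # simulate holding the limited resource
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('peak concurrency:', max_active)
```

The peak should never exceed `LIMIT`, however the five threads happen to be scheduled.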

### Thread-specific Data

While some resources need to be locked so multiple threads can use them, others need to be protected so that they are hidden from threads that do not own them. The `local()` class creates an object capable of hiding values from view in separate threads.

In [64]:
import random
import threading
import logging

def show_value(data):
    try:
        val = data.value
    except AttributeError:
        logging.debug('No value yet')
    else:
        logging.debug('value=%s', val)

def worker(data):
    show_value(data)
    data.value = random.randint(1, 100)
    show_value(data)

logging.basicConfig(
    level=logging.DEBUG,
    format='(%(threadName)-10s) %(message)s',
)

local_data = threading.local()
show_value(local_data)
local_data.value = 1000
show_value(local_data)

for i in range(2):
    t = threading.Thread(target=worker, args=(local_data,))
    t.start()

2022-04-26 16:58:38,552 (MainThread) No value yet
2022-04-26 16:58:38,555 (MainThread) value=1000
2022-04-26 16:58:38,562 (Thread-7) No value yet
2022-04-26 16:58:38,563 (Thread-8) No value yet
2022-04-26 16:58:38,564 (Thread-7) value=65
2022-04-26 16:58:38,566 (Thread-8) value=60
