### This notebook is from a LinkedIn class on Python concurrent programming

### sequential/serial execution
* a program executes a series of instructions sequentially
* only one instruction is executed at any given moment
* the speed of the program is limited by the CPU and how fast it can execute that series of instructions

### parallel programming
* with multiple processors, the instructions can be broken down into independent parts and executed simultaneously by different processors
* components that depend on all parts require coordination between the different parts
* coordinating the actions adds complexity and overhead, so the speedup is not linear in the number of processors
* parallel execution increases throughput by letting us
  * accomplish a single task faster
  * accomplish more tasks in a given time
  * increase the scale of the problem that can be solved. Big computational tasks have to rely on parallel programming to save time, and the time saved outweighs the cost of the added hardware
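
One common way to quantify why the speedup is not linear is Amdahl's law. The notes above do not name it, so treat its use here as an assumption. The sketch below assumes a fraction `parallel_fraction` of the work parallelizes perfectly and ignores coordination overhead, so real speedups would be lower still. The function name `amdahl_speedup` is illustrative.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup when only part of a program can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# assumed workload: 90% of the program can be parallelized
for n in (1, 2, 4, 8, 16):
    print(n, 'processors ->', round(amdahl_speedup(0.9, n), 2), 'x speedup')
```

With 90% of the work parallelizable, 16 processors give roughly a 6.4x speedup, and under this model no number of processors can exceed 10x.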
  
### multiprocessor architectures
* Flynn's taxonomy (4 classes of computer architecture based on the number of concurrent instruction/control streams and the number of data streams)
  * SISD (single instruction single data)
    + sequential computer with a single processor unit
    + one single instruction at any given moment
  * SIMD (single instruction multiple data)  
    + parallel computer with multiple processor units
    + execute the same instruction at any given moment, but can operate on different data elements
    + for example, two cooks both chopping, one an onion and one a carrot, with their operations in sync
    + suitable for applications that perform the same handful of operations on a massive set of data elements, such as image analysis. Modern computers use GPUs with SIMD instructions to do that
  * MISD (multiple instruction, single data)
    + each processor unit independently executes its own separate series of instructions, but all of them operate on the same single stream of data.
    + not a commonly used architecture
  * MIMD (multiple instruction, multiple data)
    + multiple processor units. Every processor unit can process a different series of instructions
    + at the same time, each of those processors can be operating on a different set of data
    + most commonly used architecture in Flynn's taxonomy, from multicore PCs to networked clusters in supercomputers.
    + separated further into two parallel programming models:
      + SPMD (single program, multiple data)
        + multiple processing units execute a copy of the same single program simultaneously.
        + each can use different data.
        + different from SIMD: in SIMD, processing units execute the same instruction at the same time; in SPMD, processing units just execute the same program
        + the processes can run asynchronously, and the program usually includes conditional logic that allows different tasks to execute only specific parts of the program
        + for example, two processors execute the same recipe, but each can execute a different part of the recipe
        + the most common style of parallel programming: using a multiprocessor computer to execute the same program on a MIMD architecture
      + MPMD (multiple program, multiple data)
        + Each processing unit is processing a different program.
        + processors execute independently, running different programs and possibly working on different data (e.g. a head/manager node with many worker nodes, for functional decomposition)
* another way to categorize computer architectures is by how memory is organized and how the computer accesses data
  + memory usually operates at a speed slower than the processor
  + when one processor is reading or writing to memory, it prevents other processors from accessing that same memory element
  + two main memory architectures for parallel computing
    + shared memory
      + all processors access the same memory through a global address space. Although each processor executes its own instructions independently, if one processor changes a memory location, all the other processors will see the change.
      + this doesn't mean all the data is on the same physical device. It could be spread across a cluster of systems. The key is that all processors see everything that happens in the shared memory space.
      + easier to program, since it is easy to access data in shared memory
      + difficult to scale, since adding more processors to a shared memory system increases the traffic on the shared memory bus and the cost of maintaining cache coherency through communication between all the parts.
      + the programmer is responsible for synchronizing memory accesses to ensure correct behavior.
      + shared memory architectures fall into two categories based on how processors are connected to memory and how fast they can access it
        + uniform memory access (UMA)
          + all processors have equal access to the memory and can access it equally fast.
          + a symmetric multiprocessing (SMP) system is a typical UMA architecture.
            + two or more identical processors are connected to a single shared memory through a system bus (each processor connects to its cache, which connects to the system bus, which connects to main memory; all connections are bidirectional)
            + each processor core of a computer or mobile phone is treated as a separate processor in an SMP architecture.
              + each core has its own cache, a small, very fast piece of memory that only it can see. The core uses it to store data it works with frequently.
              + the challenge is that if a processor copies data from shared memory and changes it in its local cache, the change needs to be written back to shared memory before another processor reads the old value. This issue is called cache coherency. It is handled by the hardware in multicore processors
        + nonuniform memory access (NUMA)
          + physically connects multiple SMP systems (each a UMA-type architecture) together. Access is non-uniform because some processors have quicker access to certain parts of memory than others: the SMP systems sit at different positions on the interconnecting bus, and reaching memory over that bus takes longer than reaching memory within the same SMP system. Overall, every processor can still see everything in memory.
    + distributed memory
      + each processor has its own memory with its own address space; there is no global address space. All processors are connected through some sort of network (such as Ethernet).
      + each processor operates independently. If it makes changes to its local memory, that change is not automatically reflected in the memory of other processors.
      + it is up to the programmer to explicitly define how and when data is communicated between the nodes, which makes programming more difficult
      + the advantage of distributed memory is that it is scalable
        + adding more processors to the system also adds more memory. This makes it cost-effective to use commodity, off-the-shelf computers and network equipment to build large distributed memory systems.
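
The distributed memory model, where data moves only through explicit messages, can be imitated on one machine. In the rough sketch below a thread stands in for a node and `queue.Queue` objects stand in for the network; both are assumptions made for illustration, since real threads do share memory. The names `node`, `to_node`, and `from_node` are hypothetical.

```python
import queue
import threading

def node(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """A 'node' with private local state; data arrives only via messages."""
    local_total = 0                  # private state, never touched by other nodes
    while True:
        message = inbox.get()        # explicit receive
        if message is None:          # sentinel: no more work
            break
        local_total += message
    outbox.put(local_total)          # explicit send of the result

to_node, from_node = queue.Queue(), queue.Queue()
worker = threading.Thread(target=node, args=(to_node, from_node))
worker.start()

for value in (1, 2, 3):
    to_node.put(value)               # the programmer decides what is sent and when
to_node.put(None)
worker.join()

result = from_node.get()
print('node reported total:', result)
```

Nothing here relies on a shared variable: the main thread learns the total only because the node sends it back.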

### Threads and processes
* process:
  + when a computer runs an application, that instance of the program executing is referred to as a process
    + includes code, data, and state information
    + independent instance of a running program
    + has its own separate memory address space
    + a computer can have hundreds of processes at the same time, and the operating system's job is to manage all of them
    + sharing resources between processes requires inter-process communication (IPC)
      + sockets and pipes
      + shared memory
      + remote procedure calls
* within each process, there are one or more smaller sub-elements called threads
  + each thread is an independent path of execution through the program
  + a different sequence of instructions
  + only exists as part of a process (subset of a process)
  + the basic unit that the OS manages. The OS schedules threads for execution and allocates time on the processor to execute them.
  + threads of the same process share the process's address space, so they can access the same resources and memory, including code, variables, and data, which makes it easy for them to work together.
  + sharing resources between processes is not as easy as sharing between threads in the same process.
  + threads are lightweight and require less overhead to create and terminate
  + the operating system can switch between threads faster than between processes
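
As a small illustration of threads sharing their process's address space, the sketch below has three threads append to one list and report the process ID they run in. The names `shared_log` and `worker` are hypothetical.

```python
import os
import threading

shared_log = []                      # lives in the process's address space

def worker(label: str) -> None:
    # every thread sees the same list object and the same process ID
    shared_log.append((label, os.getpid()))

threads = [threading.Thread(target=worker, args=(f'thread-{i}',)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_log)
```

All three entries carry the same process ID, and no inter-process communication was needed to collect them.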

### concurrency and parallel execution
* concurrency: the ability of a program to be broken into parts that can run independently of each other. These parts can be executed out of order or partially out of order without affecting the result.
* without multiple processors, independent tasks are executed by switching back and forth between them, and only one task executes at any moment. This may give the illusion of parallel execution, but it is only concurrent execution.
  + with parallel hardware, such as multiple processors, multiple tasks can be executed simultaneously; then we have parallel execution
* concurrency refers to the program structure that makes it possible to deal with multiple things at once
* parallelism refers to simultaneous execution, actually doing multiple things at once
* concurrent programming is useful for I/O-dependent tasks. While one thread is waiting for an I/O response, another thread can accept the user's input.
* parallel processing is useful for computationally intensive tasks, such as matrix multiplication.

### concurrent python thread
* using threads to handle concurrent tasks in Python is straightforward.
* however, the CPython interpreter will not allow threads to execute Python code simultaneously in parallel, because of the GIL (global interpreter lock)
* the global interpreter lock is a mechanism that limits Python to executing only one thread at a time when CPython is used as the interpreter
* the GIL is a simple way to provide thread-safe memory management inside the interpreter.
* multithreading is still useful for many I/O-bound applications, since a thread releases the GIL while it waits for I/O, letting other threads run
* for CPU-bound applications, such as intensive computational tasks, the GIL can negatively impact performance.
  + we can implement parallel algorithms as external library functions (for example in C++) called from Python.
  + you can also use the Python multiprocessing package to use multiple processes instead of multiple threads.
    + each process has its own separate interpreter with its own GIL, so different processes can execute simultaneously
    + communication between processes is more difficult than between threads
    + uses more resources compared to creating multiple threads
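
To see why threads still help with I/O-bound work despite the GIL, the sketch below uses `time.sleep` as a stand-in for a blocking I/O call (an assumption; a real program would wait on a socket or file). The helper names `fake_io_task`, `run_sequential`, and `run_threaded` are illustrative.

```python
import threading
import time

def fake_io_task(delay: float) -> None:
    time.sleep(delay)    # stands in for blocking I/O; the GIL is released while sleeping

def run_sequential(n: int, delay: float) -> float:
    start = time.perf_counter()
    for _ in range(n):
        fake_io_task(delay)
    return time.perf_counter() - start

def run_threaded(n: int, delay: float) -> float:
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io_task, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

sequential = run_sequential(4, 0.2)
threaded = run_threaded(4, 0.2)
print(f'sequential: {sequential:.2f}s, threaded: {threaded:.2f}s')
```

The threaded run should take roughly as long as one wait rather than four, because the waits overlap. A CPU-bound loop in place of the sleep would not show this benefit under the GIL.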
    

In [1]:
import os
import threading

# a simple function that wastes CPU cycles forever
def cpu_waster():
    while True:
        pass

# display information about this process
print('\n  Process ID: ', os.getpid())
print('Thread Count: ', threading.active_count())
for thread in threading.enumerate():
    print(thread)

print('\nStarting 12 CPU Wasters...')
for i in range(12):
    threading.Thread(target=cpu_waster).start()

# display information about this process
print('\n  Process ID: ', os.getpid())
print('Thread Count: ', threading.active_count())
for thread in threading.enumerate():
    print(thread)



  Process ID:  10944
Thread Count:  6
<_MainThread(MainThread, started 6764)>
<Thread(IOPub, started daemon 7740)>
<Heartbeat(Heartbeat, started daemon 1356)>
<ControlThread(Control, started daemon 8664)>
<HistorySavingThread(IPythonHistorySavingThread, started 10936)>
<ParentPollerWindows(Thread-4, started daemon 2876)>

Starting 12 CPU Wasters...

  Process ID:  10944
Thread Count:  18
<_MainThread(MainThread, started 6764)>
<Thread(IOPub, started daemon 7740)>
<Heartbeat(Heartbeat, started daemon 1356)>
<ControlThread(Control, started daemon 8664)>
<HistorySavingThread(IPythonHistorySavingThread, started 10936)>
<ParentPollerWindows(Thread-4, started daemon 2876)>
<Thread(Thread-5 (cpu_waster), started 3156)>
<Thread(Thread-6 (cpu_waster), started 10548)>
<Thread(Thread-7 (cpu_waster), started 7372)>
<Thread(Thread-8 (cpu_waster), started 7884)>
<Thread(Thread-9 (cpu_waster), started 6424)>
<Thread(Thread-10 (cpu_waster), started 3352)>
<Thread(Thread-11 (cpu_waster), started 9220)

### multiprocessing module
* for true parallel execution in Python, we need to use multiprocessing rather than multithreading
* to switch from multithreading to multiprocessing, we do the following
  + import multiprocessing
  + put the code that starts the processes inside an `if __name__ == "__main__":` block
  + replace
  ```python
  threading.Thread(target=cpu_waster).start()
  ```
  with
  ```python
  import multiprocessing as mp
  if __name__ == "__main__":
    for i in range(12):
        mp.Process(target=cpu_waster).start()
    
    for thread in threading.enumerate():
        print(thread)
  ``` 
* we need to put the mp.Process code inside the `if __name__ == "__main__":` block because
  + with the spawn start method (the default on Windows and macOS), each new process imports the entire script to find the cpu_waster function and its other dependencies
  + in effect, each child process runs the script again. Without the `if` condition that lets only the main module spawn new processes, every child would in turn try to spawn its own child processes
* the entire code snippet is in the following cell (without the `if` condition, every child would also reach line 22 and try to spawn more processes)

In [None]:
""" Threads that waste CPU cycles """

import os
import threading
import multiprocessing as mp

# a simple function that wastes CPU cycles forever
def cpu_waster():
    while True:
        pass

print('Hi! My name is', __name__)
if __name__ == '__main__':
    # display information about this process
    print('\n  Process ID: ', os.getpid())
    print('Thread Count: ', threading.active_count())
    for thread in threading.enumerate():
        print(thread)

    print('\nStarting 12 CPU Wasters...')
    for i in range(12):
        mp.Process(target=cpu_waster).start()

    # display information about this process
    print('\n  Process ID: ', os.getpid())
    print('Thread Count: ', threading.active_count())
    for thread in threading.enumerate():
        print(thread)


### Scheduler
* Operating system function that assigns processes and threads to run on available CPUs
* scheduler makes it possible for multiple programs to run concurrently on a single processor
* when a process is created and ready to run, it gets loaded into memory and placed in the ready queue
* the scheduler cycles through the ready processes so each gets a chance to execute on the processor
* if there are multiple processors, the OS will schedule processes to run on each of them to make the most of the additional resources
* the scheduler handles situations such as the following:
  + a process runs until it finishes, and the scheduler then assigns another process to that processor
  + a process gets blocked waiting for an I/O event and moves to a separate I/O waiting queue so another process can run
  + the scheduler determines that a process has spent its fair share of time on the processor and swaps it out for another process from the ready queue; this is called a context switch. In a context switch:
    + the OS has to save the state, or context, of the process or thread that was running so it can be resumed later
    + the OS has to load the context of the new process or thread to run
    + a context switch is not instantaneous. It takes time to save and restore the registers and memory state, so the scheduler needs a strategy for how frequently it switches between processes.
* scheduling algorithms
  + first come, first served
  + shortest job next
  + priority
  + shortest remaining time
  + round robin
  + multilevel queues
  + some of these algorithms are preemptive
    + meaning a running lower-priority process may be paused when a higher-priority process enters the ready state.
    + non-preemptive algorithms let a process that has entered the running state keep running for its allotted time
* scheduling goals
  + which algorithm to choose depends on the scheduling goals and different algorithms will be used by different OS
    + max throughput
    + max fairness
    + min wait time
    + min latency
  + your program should not rely on the order in which multiple processes/threads run
  + your program should not assume that each process/thread gets an equal amount of time
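
As a rough sketch of one of the listed algorithms, the function below simulates round-robin scheduling over a ready queue. It is a simplified model, not how any particular OS implements it: the burst times, the quantum, and the name `round_robin` are all assumptions, and the cost of a context switch is ignored.

```python
from collections import deque

def round_robin(burst_times: dict, quantum: int) -> list:
    """Return the order in which processes finish under round-robin scheduling."""
    ready_queue = deque(burst_times.items())   # (name, remaining time)
    finished = []
    while ready_queue:
        name, remaining = ready_queue.popleft()
        if remaining > quantum:
            # time slice used up: preempt and requeue at the back (a context switch)
            ready_queue.append((name, remaining - quantum))
        else:
            finished.append(name)              # completes within this slice
    return finished

print(round_robin({'A': 5, 'B': 2, 'C': 8}, quantum=3))   # ['B', 'A', 'C']
```

The short job B finishes first even though A arrived first, which is the fairness trade-off round robin makes compared with first come, first served.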

In [None]:
""" Two threads chopping vegetables """

import threading
import time

chopping = True

def vegetable_chopper():
    name = threading.current_thread().name  # getName() is deprecated since Python 3.10
    
    # set up local variable to count how many times the while loop executes
    vegetable_count = 0
    while chopping:
        print(name, 'chopped a vegetable!')
        vegetable_count += 1
    print(name, 'chopped', vegetable_count, 'vegetables.')

if __name__ == '__main__':
    threading.Thread(target=vegetable_chopper, name='Barron').start()
    threading.Thread(target=vegetable_chopper, name='Olivia').start()

    time.sleep(1)    # chop vegetables for 1 second
    chopping = False # stop both threads from chopping


The above code shows that
* it is unpredictable how many times each thread executes the while loop
* the module-level variable chopping, defined outside vegetable_chopper, controls the while loop inside the function

### thread lifecycle
when a program or process starts, it starts as a single thread (the main thread)
* the main thread can start, or spawn, additional child threads that are part of the same process but execute independently to do other tasks
* child threads can spawn their own child threads. When they finish executing, they notify their parent threads and terminate
* the main thread is usually the last thread to finish execution
* four states of a thread
  + new
    + when a new thread is spawned/created
    + the thread isn't running and doesn't take any CPU resources
    + when the thread is created, it is assigned a function to execute
    + some programming languages require you to explicitly start a thread after creating it so that it moves to the runnable state
  + runnable
    + the OS can schedule the thread to execute
    + through context switches, the thread gets swapped with other threads to run on one of the available processors
  + blocked
    + when a thread needs to wait for an event to occur, such as an external input or a timer, it goes to the blocked state
    + a blocked thread does not use any CPU resources
    + the OS returns the thread to the runnable state when the event it is waiting for occurs
    + when a thread needs to wait for other threads (e.g. its child threads) to complete their jobs, we use join()
      + join() waits until another thread completes its execution
      + when join() is called, the calling thread goes to the blocked state and waits until the other thread is done
  + terminated
    + when the thread finishes execution, it notifies its parent thread and goes to the terminated state
    + a thread that is abnormally aborted also goes to the terminated state

### code example
In the next cell, the code demonstrates the different life cycle stages of a thread
* Barron is the main thread; it spawns the child thread, Olivia
* the Olivia thread is created by the `__init__()` method of the Thread class when ChefOlivia is instantiated by the main thread
* the Olivia thread starts (enters the runnable state) when the main thread calls its start() method
* the main thread waits for the Olivia thread by calling olivia.join()
  + this blocks the main thread's execution until the Olivia thread has terminated
* after the Olivia thread is done, the main thread resumes execution and finishes
* we can check whether a thread is alive by calling its is_alive() method
  + a thread that has been created but not yet started is not alive
  + a thread that has been started is alive, even while it is sleeping
  + a thread that has terminated is not alive

In [None]:
""" Two threads cooking soup """

import threading
import time

class ChefOlivia(threading.Thread):

    def __init__(self):
        super().__init__()

    def run(self):
        print('Olivia started & waiting for sausage to thaw...')
        time.sleep(3)
        print('Olivia is done cutting sausage.')

# main thread
if __name__ == '__main__':
    print("Barron started & requesting Olivia's help.")
    olivia = ChefOlivia()
    print('  Olivia alive?:', olivia.is_alive())

    print('Barron tells Olivia to start.')
    olivia.start()
    print('  Olivia alive?:', olivia.is_alive())

    print('Barron continues cooking soup.')
    time.sleep(0.5)
    print('  Olivia alive?:', olivia.is_alive())

    print('Barron patiently waits for Olivia to finish and join...')
    olivia.join()
    print('  Olivia alive?:', olivia.is_alive())

    print('Barron and Olivia are both done!')


### Two ways to create threads in python
* create a class that inherits from the Thread class and overrides its `__init__()` and `run()` methods
  + you should only override these two methods
  + you need to call `super().__init__()` in the `__init__()` method
```python
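import threading
import time

class ChefOlivia(threading.Thread):
    def __init__(self):
        super().__init__()  # required so the Thread machinery is initialized

    def run(self):
        # run() holds the work the thread performs once start() is called
        print('Olivia started & waiting for sausage to thaw...')
        time.sleep(3)
        print('Olivia is done cutting sausage.')

olivia = ChefOlivia()
olivia.start()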
```

* instantiate a Thread object directly and pass the function to run as its target
```python
threading.Thread(target=vegetable_chopper, name='Barron').start()
```
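
When the target function takes parameters, they can be passed through the `args` argument of `Thread`. The sketch below is a minimal illustration; the `chop` function and the `results` dictionary are hypothetical and not part of the course code.

```python
import threading

results = {}

def chop(vegetable: str, count: int) -> None:
    # hypothetical worker: records how many pieces it "chopped"
    results[vegetable] = count * 2

# args passes positional arguments to the target function
threads = [
    threading.Thread(target=chop, args=('onion', 3), name='Barron'),
    threading.Thread(target=chop, args=('carrot', 5), name='Olivia'),
]
for t in threads:
    t.start()
for t in threads:
    t.join()      # wait for both workers before reading the results

print(results)
```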
        
    

### Daemon threads
* example: a garbage collector
  * automatic memory management running in the background
  * reclaims memory no longer in use by the program
* if the main thread spawns a child thread that keeps running in the background, the program cannot exit when the main thread finishes, because the child thread is still executing
  + to solve this problem, we can make the child thread a daemon (background) thread
    + a daemon thread does not prevent the program/process from terminating while it is still running
    + by default, threads are created as non-daemon; you need to explicitly turn a thread into a daemon thread
    + a daemon thread is detached from the main thread; it stops abruptly when the program exits, which happens once the main thread and all other non-daemon threads have finished
      + make sure stopping the daemon thread prematurely has no negative side effects
* the following cell shows a daemon thread code example
  + new threads inherit their daemon status from the thread that creates them
  + the main thread is a normal (non-daemon) thread, so child threads spawned from main are normal threads by default
  + you set the thread's daemon status (line 14 of the code) before starting the thread
  + when the program ends, any remaining daemon threads are abandoned

In [None]:
# demo of daemon threads
""" Barron finishes cooking while Olivia cleans """

import threading
import time

def kitchen_cleaner():
    while True:
        print('Olivia cleaned the kitchen.')
        time.sleep(1)

if __name__ == '__main__':
    olivia = threading.Thread(target=kitchen_cleaner)
    olivia.daemon = True
    olivia.start()

    print('Barron is cooking...')
    time.sleep(0.6)
    print('Barron is cooking...')
    time.sleep(0.6)
    print('Barron is cooking...')
    time.sleep(0.6)
    print('Barron is done!')
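
The rule that a new thread inherits its daemon status from the thread that creates it can be checked with a small experiment. In the sketch below the names `parent`, `child`, and `inherited` are hypothetical; each parent creates a child without setting `daemon` and records the flag the child received.

```python
import threading

inherited = {}

def child():
    pass

def parent():
    # a thread created here inherits the daemon flag of the creating thread
    t = threading.Thread(target=child)
    inherited[threading.current_thread().name] = t.daemon
    t.start()
    t.join()

normal_parent = threading.Thread(target=parent, name='normal-parent')
daemon_parent = threading.Thread(target=parent, name='daemon-parent', daemon=True)
for p in (normal_parent, daemon_parent):
    p.start()
for p in (normal_parent, daemon_parent):
    p.join()

print(inherited)
```

The child of the daemon parent reports `True` and the child of the normal parent reports `False`.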


### Data Race
* a data race occurs when two or more concurrent threads access the same memory location
* and at least one thread is modifying it
* what happens when a thread updates a value
  1. read the value from the memory location
  2. calculate the modified value
  3. write the calculated value back to the memory location
  + if another thread changes the value stored in memory between steps 1 and 3, that change is overwritten, which is the data race
  + for example, two threads read the memory location at the same time; one thread writes its update, and then the other writes an update computed from the outdated value it read before the first thread's update, so the first update is lost
  + since the timing of thread scheduling is not predictable, the resulting value in memory is not predictable and is sometimes incorrect, especially when there are a lot of updates
* the best way to prevent this is to pay attention whenever two or more threads access the same resource
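
The lost update described above can be replayed deterministically. The sketch below is a single-threaded simulation, not real threading: it performs the read, calculate, and write steps of two imaginary threads by hand, in the unlucky order. The dictionary `memory` stands in for the shared memory location.

```python
memory = {'garlic_count': 0}               # stands in for a shared memory location

# both "threads" perform step 1 (read) before either performs step 3 (write)
barron_read = memory['garlic_count']       # Barron reads 0
olivia_read = memory['garlic_count']       # Olivia reads 0 as well

barron_new = barron_read + 1               # step 2 for Barron
olivia_new = olivia_read + 1               # step 2 for Olivia, using a stale value

memory['garlic_count'] = barron_new        # Barron writes 1
memory['garlic_count'] = olivia_new        # Olivia overwrites it with 1

print(memory['garlic_count'])              # 1, although two increments ran
```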

In [None]:
# demonstration of a data race
# two threads each increment the same variable (garlic_count) 10 million times
# this can produce unpredictable, inconsistent results in the variable's value due to the data race
# notice that even though there is only one increment statement, the program needs
# to read, calculate, and then write the updated garlic_count value (the three actions are not atomic)
""" Two shoppers adding items to a shared notepad """

import threading

garlic_count = 0

def shopper():
    global garlic_count
    for i in range(10_000_000):
        garlic_count += 1

if __name__ == '__main__':
    barron = threading.Thread(target=shopper)
    olivia = threading.Thread(target=shopper)
    barron.start()
    olivia.start()
    barron.join()
    olivia.join()
    print('We should buy', garlic_count, 'garlic.')

### critical section and mutex(lock)
* critical section
  * a critical section is a code segment that accesses a shared resource, such as a data structure, memory, or an external device, and that may not operate correctly when multiple threads access it concurrently
  * a critical section needs to be protected so that only one thread or process executes it at a time
* mutex (lock)
  + a mutex (lock) is a mechanism for implementing mutual exclusion
  + only one thread or process can possess it at a time
  + this limits access to the critical section
  + when a thread tries to acquire a lock that is already taken, it blocks/waits until the lock becomes available
  + critical sections (protected sections of code) should be as short as possible
* atomic operation
  + acquiring the lock is an atomic operation, meaning
    + it executes as a single action relative to other threads
    + it cannot be interrupted by other concurrent threads
* in Python, we use the Lock object from the threading package, as shown in the following demo cell
  + the critical section in line 15 is protected by a mutex: the lock is acquired in line 14 and released after the increment.
  + note that we protect only the shortest necessary part of the code (garlic_count += 1), so the increment behaves atomically with respect to the other thread

In [None]:
""" Two shoppers adding items to a shared notepad """

import threading
import time

garlic_count = 0
pencil = threading.Lock()

def shopper():
    global garlic_count
    for i in range(5):
        print(threading.current_thread().name, 'is thinking.')
        time.sleep(0.5)
        pencil.acquire()
        garlic_count += 1
        pencil.release()

if __name__ == '__main__':
    barron = threading.Thread(target=shopper)
    olivia = threading.Thread(target=shopper)
    barron.start()
    olivia.start()
    barron.join()
    olivia.join()
    print('We should buy', garlic_count, 'garlic.')
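
A `threading.Lock` can also be used as a context manager, which acquires the lock on entry and releases it on exit even if the protected code raises an exception. The sketch below is a variant of the shopper that uses this form; the `times` parameter and the iteration count are assumptions added for illustration.

```python
import threading

garlic_count = 0
pencil = threading.Lock()

def shopper(times: int) -> None:
    global garlic_count
    for _ in range(times):
        with pencil:              # acquire on entry, release on exit
            garlic_count += 1     # the critical section

threads = [threading.Thread(target=shopper, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print('We should buy', garlic_count, 'garlic.')
```

Because every increment happens while the lock is held, the final count equals the total number of increments.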


### Deadlock
* deadlock: if a thread tries to acquire a lock it already holds, it blocks forever, and all processes and threads waiting for that lock are unable to continue executing. This is called a deadlock
* reentrant mutex
  * when a program needs to lock a mutex multiple times before unlocking it, you should use a reentrant mutex
  * a reentrant mutex can be locked multiple times by the same thread
  * it records how many times it has been locked, and it must be unlocked as many times as it was locked before another thread can lock it
* one example: a function incrementCounter() locks the mutex, and myFunction(), which also locks the mutex, calls it. With a reentrant mutex, the thread executing myFunction can acquire the lock a second time inside incrementCounter and then unlock it twice to release it

```python
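import threading

lock = threading.RLock()  # reentrant: the owning thread may acquire it again
counter = 0

def incrementCounter():
    global counter
    with lock:  # second acquisition when called from myFunction
        counter += 1

def myFunction():
    with lock:  # first acquisition
        incrementCounter()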
```

* another use case for a reentrant mutex is a recursive function that uses a lock, which is why it is also called a recursive mutex. The following terms all mean the same thing
  + reentrant mutex
  + reentrant lock
  + recursive mutex
  + recursive lock
* Python's implementation of a reentrant lock is RLock, as shown in the following cell
* differences between Lock and RLock in Python
  + a Lock can be released by a different thread than the one that acquired it
  + RLock must be released by the same thread that acquired it
    + in addition, it must be released the same number of times it was acquired

In [None]:
""" Two shoppers adding garlic and potatoes to a shared notepad """

import threading

garlic_count = 0
potato_count = 0
pencil = threading.RLock()

def add_garlic():
    global garlic_count
    pencil.acquire()
    garlic_count += 1
    pencil.release()

# add_potato calls add_garlic, so the same thread acquires the lock again before releasing it
def add_potato():
    global potato_count
    pencil.acquire()
    potato_count += 1
    add_garlic()
    pencil.release()

def shopper():
    for i in range(10_000):
        add_garlic()
        add_potato()

if __name__ == '__main__':
    barron = threading.Thread(target=shopper)
    olivia = threading.Thread(target=shopper)
    barron.start()
    olivia.start()
    barron.join()
    olivia.join()
    print('We should buy', garlic_count, 'garlic.')
    print('We should buy', potato_count, 'potatoes.')
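
The difference between `Lock` and `RLock` when the owning thread asks for the lock a second time can be probed safely with a non-blocking acquire. This is a minimal sketch; the variable names are illustrative, and `blocking=False` is used only so that the plain `Lock` case reports failure instead of hanging.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

plain.acquire()
# a second blocking acquire() here would hang forever, so probe without blocking
plain_second_try = plain.acquire(blocking=False)
plain.release()

reentrant.acquire()
reentrant_second_try = reentrant.acquire(blocking=False)   # same thread: succeeds
reentrant.release()
reentrant.release()              # one release per successful acquire

print('Lock re-acquired: ', plain_second_try)
print('RLock re-acquired:', reentrant_second_try)
```

The plain lock refuses the second acquisition (`False`), while the reentrant lock grants it (`True`) and then needs two releases.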


### Try Lock
* when a thread has other tasks to do, it does not have to block and wait for a lock. Try lock works as follows:
  + it is a non-blocking lock/acquire method for a mutex
  + try lock does the following:
    + if the mutex is available, acquire it and return True
    + otherwise, immediately return False so the thread can work on other tasks
* in Python, a non-blocking lock attempt is made by passing blocking=False when acquiring the lock; the call returns a bool
  + in the following cell, while the lock is unavailable, the thread keeps incrementing items_to_add instead of waiting, which improves productivity

In [None]:
""" Two shoppers adding items to a shared notepad """

import threading
import time

items_on_notepad = 0
pencil = threading.Lock()

def shopper():
    global items_on_notepad
    name = threading.current_thread().name
    items_to_add = 0
    while items_on_notepad <= 20:
        if items_to_add and pencil.acquire(blocking=False): # add item(s) to shared items_on_notepad
            items_on_notepad += items_to_add
            print(name, 'added', items_to_add, 'item(s) to notepad.')
            items_to_add = 0
            time.sleep(0.3) # time spent writing
            pencil.release()
        else: # look for other things to buy
            time.sleep(0.1) # time spent searching
            items_to_add += 1
            print(name, 'found something else to buy.')

if __name__ == '__main__':
    barron = threading.Thread(target=shopper, name='Barron')
    olivia = threading.Thread(target=shopper, name='Olivia')
    start_time = time.perf_counter()
    barron.start()
    olivia.start()
    barron.join()
    olivia.join()
    elapsed_time = time.perf_counter() - start_time
    print('Elapsed Time: {:.2f} seconds'.format(elapsed_time))


### reader-writer lock
* an ordinary lock blocks every other thread from the critical section, which is inefficient because threads that only read the shared data are safe to run together and shouldn't be blocked
* a reader-writer lock, or shared mutex, can be locked in two ways
  + in shared read mode, which allows multiple threads that only need to read to hold it simultaneously
  + or in exclusive write mode, which allows only one thread at a time to write to the resource
  + when the lock switches between these two modes, a thread may need to wait
* a reader-writer lock works best when there are many more reading threads than writing threads
* in the following cell of code
  + RWLockFair gives equal priority to readers and writers
  + gen_rlock() generates a reader lock object
  + gen_wlock() generates a writer lock object

In [None]:
""" Several users reading a calendar, but only a few users updating it """

import threading
from readerwriterlock import rwlock

WEEKDAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
today = 0

marker = rwlock.RWLockFair()

def calendar_reader(id_number):
    global today
    read_marker = marker.gen_rlock()
    name = 'Reader-' + str(id_number)
    while today < len(WEEKDAYS)-1:
        read_marker.acquire()
        print(name, 'sees that today is', WEEKDAYS[today], '-read count:', read_marker.c_rw_lock.v_read_count)
        read_marker.release()

def calendar_writer(id_number):
    global today
    write_marker = marker.gen_wlock()
    name = 'Writer-' + str(id_number)
    while today < len(WEEKDAYS)-1:
        write_marker.acquire()
        if today < len(WEEKDAYS)-1: # re-check under the lock; another writer may have advanced the date
            today = (today + 1) % 7
            print(name, 'updated date to ', WEEKDAYS[today])
        write_marker.release()

if __name__ == '__main__':
    # create ten reader threads
    for i in range(10):
        threading.Thread(target=calendar_reader, args=(i,)).start()
    # ...but only two writer threads
    for i in range(2):
        threading.Thread(target=calendar_writer, args=(i,)).start()
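
To make the two locking modes concrete, here is a minimal sketch of a reader-writer lock built only on `threading.Condition`. The class `SimpleRWLock` and its method names are hypothetical and are not part of the `readerwriterlock` package; unlike `RWLockFair`, this sketch favors readers, so a steady stream of readers could starve a writer.

```python
import threading

class SimpleRWLock:
    """Hypothetical minimal reader-writer lock (reader-preferring, not fair)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # number of threads currently reading
        self._writing = False    # True while a writer holds the lock

    def acquire_read(self):
        with self._cond:
            while self._writing:             # readers only wait for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:           # last reader out wakes waiting writers
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers > 0:  # writers need exclusive access
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()          # wake both readers and writers


rw = SimpleRWLock()
rw.acquire_read()
rw.acquire_read()           # a second reader is admitted without blocking
print(rw._readers)          # 2
rw.release_read()
rw.release_read()

rw.acquire_write()          # exclusive: no readers and no other writer
print(rw._writing)          # True
rw.release_write()
```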


### Deadlock
* in a system with multiple threads and multiple locks, deadlock occurs when each thread is waiting for another thread to take action. For example, suppose each thread needs to acquire two locks to access a resource. If each of two threads acquires one lock and then waits for the other, neither thread can make progress and they are in deadlock.
* this is a common challenge when using mutexes to protect critical sections of code.
* we want the program to be free of deadlock to guarantee liveness
* liveness:
  + a set of properties that require a system to make progress
* deadlock depends on thread timing, so it may happen in some runs and not in others, which makes it difficult to identify
* one solution is lock ordering: keep a priority list of the locks and have every thread acquire them in that order
   + ensure locks are always taken in the same order by any thread
* another solution is to use lock timeout
  + put a timeout on lock attempts
  + if a thread cannot acquire all locks within the time limit:
    + back up and free all locks taken
    + wait for a random amount of time
    + try again
* the code in the following cell implements three threads acquiring locks in a consistent order with priorities a > b > c

In [None]:
""" Three philosophers, thinking and eating sushi """

import threading

chopstick_a = threading.Lock()
chopstick_b = threading.Lock()
chopstick_c = threading.Lock()
sushi_count = 500

def philosopher(name, first_chopstick, second_chopstick):
    global sushi_count
    while sushi_count > 0: # eat sushi until it's all gone
        first_chopstick.acquire()
        second_chopstick.acquire()

        if sushi_count > 0:
            sushi_count -= 1
            print(name, 'took a piece! Sushi remaining:', sushi_count)

        second_chopstick.release()
        first_chopstick.release()
        
if __name__ == '__main__':
    threading.Thread(target=philosopher, args=('Barron', chopstick_a, chopstick_b)).start()
    threading.Thread(target=philosopher, args=('Olivia', chopstick_b, chopstick_c)).start()
    threading.Thread(target=philosopher, args=('Steve', chopstick_a, chopstick_c)).start()
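
The lock-timeout alternative described above can be sketched as follows. The helper `acquire_all`, its parameters, and the timeout values are assumptions made for illustration; only `Lock.acquire(timeout=...)` comes from the standard library. The two workers deliberately take the locks in opposite order, which could deadlock with plain blocking `acquire` calls.

```python
import random
import threading
import time

def acquire_all(locks, timeout=0.01, max_backoff=0.01):
    """Hypothetical helper: take every lock in `locks`, or back off and retry."""
    while True:
        taken = []
        for lock in locks:
            if lock.acquire(timeout=timeout):   # give up on this lock after `timeout` seconds
                taken.append(lock)
            else:
                break
        if len(taken) == len(locks):
            return                              # success: the caller now holds all locks
        for lock in reversed(taken):            # back up and free all locks taken
            lock.release()
        time.sleep(random.uniform(0, max_backoff))  # wait a random time, then try again

lock_a = threading.Lock()
lock_b = threading.Lock()
counter = 0

def worker(first, second):
    global counter
    for _ in range(100):
        acquire_all([first, second])
        try:
            counter += 1                        # critical section, protected by both locks
        finally:
            second.release()
            first.release()

t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))  # opposite order on purpose
t1.start()
t2.start()
t1.join()
t2.join()
print(counter)  # 200
```

The random wait matters: if both threads retried after the same fixed delay, they could keep colliding.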


### Abandoned lock
* when a thread or process acquires a lock and then terminates for an unexpected reason, it may not automatically release the lock
* other threads waiting for the lock will never acquire it
* the solution is to put the critical-section code in a try block and the lock release in a finally block, so the lock is released even if an exception is raised
* the example code is shown in the following cell

In [None]:
""" Three philosophers, thinking and eating sushi """

import threading

chopstick_a = threading.Lock()
chopstick_b = threading.Lock()
chopstick_c = threading.Lock()
sushi_count = 500
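some_lock = threading.Lock()

# pattern 1: release in a finally block so the lock is freed even if an error occurs
some_lock.acquire()
try:
    pass # do something...
finally:
    some_lock.release()

# pattern 2: a with statement acquires and releases the lock automatically
with some_lock:
    pass # do something...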

def philosopher(name, first_chopstick, second_chopstick):
    global sushi_count
    while sushi_count > 0: # eat sushi until it's all gone
        first_chopstick.acquire()
        second_chopstick.acquire()
        try:
            if sushi_count > 0:
                sushi_count -= 1
                print(name, 'took a piece! Sushi remaining:', sushi_count)
            if sushi_count == 10:
                print(1/0) # deliberately raise ZeroDivisionError to simulate a crash
        finally:
            second_chopstick.release()
            first_chopstick.release()

if __name__ == '__main__':
    threading.Thread(target=philosopher, args=('Barron', chopstick_a, chopstick_b)).start()
    threading.Thread(target=philosopher, args=('Olivia', chopstick_b, chopstick_c)).start()
    threading.Thread(target=philosopher, args=('Steve', chopstick_a, chopstick_c)).start()


### Abandoned lock using context manager
* we can use Python's context manager (the `with` statement) to handle the lock release, as shown in the following cell
* as shown in lines 13 and 14, each `with` statement acquires its lock on entry and releases it on exit, even if an exception is raised.

In [None]:
""" Three philosophers, thinking and eating sushi """

import threading

chopstick_a = threading.Lock()
chopstick_b = threading.Lock()
chopstick_c = threading.Lock()
sushi_count = 500

def philosopher(name, first_chopstick, second_chopstick):
    global sushi_count
    while sushi_count > 0: # eat sushi until it's all gone
        with first_chopstick:
            with second_chopstick:
                if sushi_count > 0:
                    sushi_count -= 1
                    print(name, 'took a piece! Sushi remaining:', sushi_count)

                if sushi_count == 10:
                    print(1/0) # deliberately raise ZeroDivisionError to simulate a crash

if __name__ == '__main__':
    threading.Thread(target=philosopher, args=('Barron', chopstick_a, chopstick_b)).start()
    threading.Thread(target=philosopher, args=('Olivia', chopstick_b, chopstick_c)).start()
    threading.Thread(target=philosopher, args=('Steve', chopstick_a, chopstick_c)).start()
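
A quick way to confirm that the `with` statement releases the lock after a failure is to inspect `Lock.locked()` before and after an exception. This is a minimal single-threaded sketch; the variable `held_inside` is only there for illustration.

```python
import threading

lock = threading.Lock()

try:
    with lock:
        held_inside = lock.locked()   # True while inside the with block
        1 / 0                         # simulate an unexpected failure
except ZeroDivisionError:
    pass

print(held_inside)      # True
print(lock.locked())    # False: the with statement released the lock
```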


### Starvation
* starvation occurs when a thread or process is perpetually denied the resources it needs, so it cannot make progress
* starvation can happen when
  + threads with different priorities compete for resources, and low-priority threads may never get access
  + there are too many threads competing (as shown in the following cell)

In [None]:
""" Three philosophers, thinking and eating sushi """

import threading

chopstick_a = threading.Lock()
chopstick_b = threading.Lock()
chopstick_c = threading.Lock()
sushi_count = 5000

def philosopher(name, first_chopstick, second_chopstick):
    global sushi_count
    sushi_eaten = 0
    while sushi_count > 0: # eat sushi until it's all gone
        with first_chopstick:
            with second_chopstick:
                if sushi_count > 0:
                    sushi_count -= 1
                    sushi_eaten += 1
                    print(name, 'took a piece! Sushi remaining:', sushi_count)
    print(name, 'took', sushi_eaten, 'pieces')

if __name__ == '__main__':
    for thread in range(50):
        threading.Thread(target=philosopher, args=('Barron', chopstick_a, chopstick_b)).start()
        threading.Thread(target=philosopher, args=('Olivia', chopstick_a, chopstick_b)).start()
        threading.Thread(target=philosopher, args=('Steve', chopstick_a, chopstick_b)).start()
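
To see how unevenly a contended lock can be shared, each thread can be given a unique name and its turns tallied. The sketch below is an assumption-laden variant of the cell above (the names `eater`, `turns`, and `remaining` are hypothetical). The per-thread counts differ from run to run and depend on the interpreter and operating system scheduler, so no particular distribution is shown.

```python
import threading
from collections import Counter

lock = threading.Lock()
remaining = 2000
turns = Counter()      # turns[name] = pieces taken by that thread

def eater(name):
    global remaining
    while True:
        with lock:
            if remaining <= 0:
                return
            remaining -= 1
            turns[name] += 1   # updated only while holding the lock

names = ['T' + str(i) for i in range(20)]
threads = [threading.Thread(target=eater, args=(n,)) for n in names]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sum(turns.values()))                 # 2000
print(sorted(turns[n] for n in names))     # per-thread counts; a 0 means that thread starved
```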


### Livelock
* multiple threads or processes actively respond to each other to resolve a conflict, but that response itself prevents them from making progress
* no thread makes progress because each one keeps giving up its lock in response to the others.
* to resolve it, make sure only one thread takes action at a time, chosen by priority or by another mechanism such as random selection
* unlike deadlock, threads in livelock are actively executing, just without useful progress.
* in the following cell, livelock is resolved by having a thread sleep for a random fraction of a second after it releases its first lock because the second lock was unavailable. This gives other threads time to acquire both locks, instead of the thread immediately re-acquiring the first lock it just released




In [None]:
""" Three philosophers, thinking and eating sushi """

import threading
import time
from random import random

chopstick_a = threading.Lock()
chopstick_b = threading.Lock()
chopstick_c = threading.Lock()
sushi_count = 500

def philosopher(name, first_chopstick, second_chopstick):
    global sushi_count
    while sushi_count > 0: # eat sushi until it's all gone
        first_chopstick.acquire()
        if not second_chopstick.acquire(blocking=False):
            print(name, 'released their first chopstick.')
            first_chopstick.release()
            time.sleep(random()/10)
        else:
            try:
                if sushi_count > 0:
                    sushi_count -= 1
                    print(name, 'took a piece! Sushi remaining:', sushi_count)
            finally:
                second_chopstick.release()
                first_chopstick.release()

if __name__ == '__main__':
    threading.Thread(target=philosopher, args=('Barron', chopstick_a, chopstick_b)).start()
    threading.Thread(target=philosopher, args=('Olivia', chopstick_b, chopstick_c)).start()
    threading.Thread(target=philosopher, args=('Steve', chopstick_c, chopstick_a)).start()
