## 1. Random Numbers

## 2. Decorators

## 3. Generators

## 4. Threading vs Multiprocessing (Theory time 🤓)


## Process:
- An instance of program
- Takes advantage of multiple CPU's & cores
- Memory is not shared b/w processes
- Great for cpu-bound processing
- New process is started independently of another process
- they are killable/interruptable
- One GIL (Global interpreter lock) for each process -> avoid GIL limitation
- More resource-intensive as each process
- Higher overhead due to inter-process communication
- Better for CPU-intensive tasks (like complex calculations)
- Starting a process is slower that starting a thread
- Larger memory footprint


## Threads:
An entity within a process that can be scheduled for execution (Also known as "leightweight process"). A Process can spawn multiple threads. The main difference is that all threads within a process share the same memory.

- Multiple threads can be spawned within one process
- Memory is shared between all threads
- Starting a thread is faster than starting a process
- Great for I/O-bound tasks
- Leightweight - low memory footprint
- One GIL for all threads, i.e. threads are limited by GIL
- Multithreading has no effect for CPU-bound tasks due to the GIL
- Not interruptible/killable -> be careful with memory leaks
- increased potential for race conditions

## GIL - Global interpreter lock
- Lock that allows only one thread to hold control of the Python interpreter. 
- It allows only one thread to execute at a time even in a multi-threaded architecture.

### Why is it needed?
It is needed because CPython's (reference implementation of Python) memory management is not thread-safe. 
Python uses reference counting for memory management.

It means that objects created in Python have a reference count variable that keeps track of the number of references that point to the object. 
When this count reaches zero, the memory occupied by the object is released. 
The problem was that this reference count variable needed protection from race conditions where two threads increase or decrease its value simultaneously. 

If this happens, it can cause either leaked memory that is never released or incorrectly release the memory while a reference to that object still exists.

### How to avoid the GIL
The GIL is very controversial in the Python community. 
The main way to avoid the GIL is by using multiprocessing instead of threading. 
Another (however uncomfortable) solution would be to avoid the CPython implementation and use a free-threaded Python implementation like Jython or IronPython. 
A third option is to move parts of the application out into binary extensions modules, i.e. use Python as a wrapper for third party libraries (e.g. in C/C++). This is the path taken by numypy and scipy.

In [6]:
# Multiprocessing
from multiprocessing import Process
import os 
import time

def sqaure():
    for i in range(100):
        i*i
        time.sleep(0.01)

processes=[]
num_process = os.cpu_count()

for i in range(num_process):
    p = Process(target=sqaure)
    processes.append(p)

for p in processes:
    p.start()

# join is used to block main thread until all processes are finished 
for p in processes:
    p.join()
    
print('End main process')

# -----------------------

# Multithreading
from threading import Thread
        
threads=[]
num_threads = 11

for i in range(num_threads):
    t = Thread(target=sqaure)
    threads.append(t)

for t in threads:
    t.start()

for t in threads:
    t.join()
    
print('End main thread')



End main process
End main thread


## 5. Multithreading

- Locks are used to avoid race condition 
- A lock is like a token that only one thread can hold at a time. 
- Other threads must wait until the lock is released.
- A race condition occurs when two or more threads can access shared data and they try to change it at the same time.
- Because the thread scheduling algorithm can swap between threads at any time, you don't know the order in which the threads will attempt to access the shared data.

In [11]:
from threading import Thread
import time

db_val=0

def increase():
    global db_val
    local_val=db_val
    local_val+=1
    time.sleep(0.01)
    db_val=local_val
    # time.sleep(0.01)

if __name__=='__main__':
    print('Start db_val: ', db_val)
    
    t1 = Thread(target=increase)
    t2 = Thread(target=increase)

    t1.start()
    t2.start()

    t1.join()
    t2.join()
    
    print('End db_val: ', db_val) 
    # due to race condition caused by t1's sleep time execution of thread
    # shifts to t2 being executed in meantime and both threads end up copying 1 into db_val
    # this can be avoided by using locks
    # so we get 1 instead of 2

Start db_val:  0
End db_val:  1


In [1]:
# Locks are used to avoid race condition 
# A lock is like a token that only one thread can hold at a time. 
# If a thread wants to access shared resources, it must first acquire the lock. 
# Other threads must wait until the lock is released.
# If the state is locked, it does not allow other concurrent threads to enter this code section until the state is unlocked again.

from threading import Thread, Lock
import time

db_val=0

def increase(lock):
    global db_val

    lock.acquire()
    local_val=db_val

    # processing
    local_val+=1
    time.sleep(0.01)
    db_val=local_val
    
    lock.release()


# using lock with context manager 
def increase2(lock):
    global db_val

    with lock:
        local_val=db_val
        local_val+=1
        time.sleep(0.01)
        db_val=local_val

if __name__=='__main__':
    print('Start db_val: ', db_val)
    lock = Lock()
    t1 = Thread(target=increase2, args=(lock,))
    t2 = Thread(target=increase2, args=(lock,))

    t1.start()
    t2.start()

    t1.join()
    t2.join()
    
    print('End db_val: ', db_val) 



Start db_val:  0
End db_val:  2


## 5.1 Queues for mt-mp data exchange

Queues can be used for thread-safe/process-safe data exchanges and data processing both in a multithreaded and a multiprocessing environment.

A queue is a linear data structure that follows the First In First Out (FIFO) principle. A good example is a queue of customers that are waiting in line, where the customer that came first is served first.

#### Operations with a queue are thread-safe. Important methods are:

- `q.get()` : Remove and return the first item. By default, it blocks until the item is available.
- `q.put(item)` : Puts element at the end of the queue. By default, it blocks until a free slot is available.
- `q.task_done()` : Indicate that a formerly enqueued task is complete. For each get() you should call this after you are done with your task for this item.
- `q.join()` : Blocks until all items in the queue have been gotten and proccessed (task_done() has been called for each item).
- `q.empty()` : Return True if the queue is empty.

#### Daemon threads
- Daemon threads are background threads that automatically die when the main program ends. 
- This is why the infinite loops inside the worker methods can be exited. Without a daemon process we would have to use a signalling mechanism such as a threading.
- Event to stop the worker. But be careful with daemon processes: They are abruptly stopped and their resources (e.g. open files or database transactions) may not be released/completed properly.

In [None]:
from threading import Thread, Lock, current_thread
from queue import Queue

def worker(q, lock):
    val = q.get() # blocks until the item is available

    # do stuff...
    with lock: # prevent printing at the same time with this lock
        print(f"in {current_thread().name} got {val}")
        q.task_done()
        
    # For each get(), a subsequent call to task_done() tells the queue
    # that the processing on this item is complete.
    # If all tasks are done, q.join() can unblock

if __name__ == "__main__":
    q = Queue()
    lock = Lock()
    num_thread = 10

    for i in range(num_thread):
        t = Thread(name=f"thread_num_{i+1}", target=worker, args=(q,lock))
        t.daemon = True # dies when the main thread dies
        t.start()

    # fill the queue with items
    for i in range(20):
        q.put(i)
        
    q.join()  # Blocks until all items in the queue have been gotten and processed.
    print('main done')


in thread_num_1 got 0
in thread_num_4 got 1
in thread_num_7 got 2
in thread_num_9 got 3
in thread_num_10 got 4
in thread_num_8 got 5
in thread_num_2 got 6
in thread_num_5 got 7
in thread_num_6 got 8
in thread_num_3 got 9


## 6. Multiprocessing
- Call ```process.join()``` to tell the program that it should wait for this process to complete before it continues with the rest of the code.

### Share data between processes
Since processes don't live in the same memory space, they do not have access to the same (public) data. Thus, they need special shared memory objects to share data.

Data can be stored in a shared memory variable using Value or Array.

- `Value(type, value)`: Create a ctypes object of type type. Access the value with .target.
- `Array(type, value)`: Create a ctypes array with elements of type type. Access the values with [].


In [7]:
# Task: Create two processes, each process should have access to a shared variable 
# and modify it (in this case only increase it repeatedly by 1 for 100 times). 
# Create another two processes that share an array and modify (increase) all the elements in the array.

from multiprocessing import Process, Value, Array
import time

def add_100(num):
    for _ in range(100):
        time.sleep(0.01)
        num.value += 1
    
def add_100_to_array(nums):
    for _ in range(100):
        time.sleep(0.01)
        for i in range(len(nums)):
            nums[i]+=1

if __name__ == "__main__":
    shared_number_1 = Value('i', 0) 
    print('Value at beginning:', shared_number_1.value)

    shared_array = Array('d', [0.0, 100.0, 200.0])
    print('Array at beginning:', shared_array[:])

    process1 = Process(target=add_100, args=(shared_number_1,))
    process2 = Process(target=add_100, args=(shared_number_1,))

    process3 = Process(target=add_100_to_array, args=(shared_array,))
    process4 = Process(target=add_100_to_array, args=(shared_array,))

    process1.start()
    process2.start()
    process3.start()
    process4.start()

    process1.join()
    process2.join()
    process3.join()
    process4.join()

    print('Value at end:', shared_number_1.value)
    print('Array at end:', shared_array[:])

    print('end main')
            

AttributeError: 'Synchronized' object has no attribute 'Value'

## 7. Function Arguments

## 8. Shallow vs Deep Copying

## 9. Asterisk Operator *

## 10. Context Managers