# Intermediate Python
### Patrick Loeber, python-engineer.com
### https://www.youtube.com/watch?v=HGOBQPFzWKo
(3:53:31)
September 17, 2022

## THREADING vs MULTIPROCESSING:
### Both allow running code in parallel to speed it up.

* a process is an instance of a program running on the computer. It is heavyweight, using much more memory than a thread.
* a thread is a sequence of instructions within a process that can be scheduled independently, so several threads can make progress concurrently.
* a process can have multiple threads.
* a thread is often called a lightweight process; the operating system schedules it separately from the other threads.
* a process has its own memory space, while threads share the memory space of their process.
* threads are faster to start and to switch between than processes.
* a race condition is a situation where two or more threads access shared data and try to change it at the same time
* threads suit I/O-bound (input/output) tasks, while processes suit CPU-bound tasks.
* the GIL (global interpreter lock) is a mutex in standard CPython that protects access to Python objects, preventing multiple threads from executing Python bytecode at once
* because the GIL allows only one thread to execute Python code at a time, threads help mostly with I/O-bound tasks, where a waiting thread releases the GIL
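
The I/O-bound point can be illustrated with a minimal sketch, assuming `time.sleep` as a stand-in for a blocking I/O call. The function name `fake_io` and the timings are hypothetical. A sleeping thread releases the GIL, so the five waits overlap instead of adding up.

```python
import time
from threading import Thread

results = []  # threads share their process's memory, so all of them can append here

def fake_io(task_id):
    # time.sleep stands in for a blocking I/O call (network, disk)
    time.sleep(0.2)
    results.append(task_id)

start = time.perf_counter()
threads = [Thread(target=fake_io, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Run one after another, the five waits would take about 1 second in total
print(f"{len(results)} tasks finished in {elapsed:.2f}s")
```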

### Avoid:
1. memory leaks = memory that is allocated but never freed
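2. memory fragmentation = free memory split into many small, non-contiguous blocks, so a large allocation can fail even though enough memory is free in total
3. memory bloat = a program holding on to far more memory than it needs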

### THREADING MODULE:
* threading module provides a high-level interface for creating threads.
* threading.Thread class is used to create a thread.
* threading.Thread class has the following methods and attributes:
    * start() - starts the thread.
    * run() - method that contains the code to be executed by the thread.
    * join() - waits for the thread to terminate.
    * is_alive() - returns True if the thread is still executing (the older isAlive() spelling was removed in Python 3.9).
    * name - attribute that gets or sets the name of the thread (getName() and setName() are deprecated since Python 3.10).
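
A minimal sketch of the thread lifecycle, using `is_alive()` and the `name` attribute from current Python versions. The function `wait_briefly` and the thread name are hypothetical.

```python
import time
from threading import Thread

def wait_briefly():
    time.sleep(0.2)  # keep the thread alive long enough to observe it

t = Thread(target=wait_briefly, name="demo-thread")

alive_before_start = t.is_alive()   # not started yet
t.start()
alive_while_running = t.is_alive()  # the thread is still sleeping
t.join()                            # block until the thread terminates
alive_after_join = t.is_alive()     # the thread has finished

print(t.name, alive_before_start, alive_while_running, alive_after_join)
```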

### MULTIPROCESSING MODULE:
* multiprocessing module provides a high-level interface for creating processes.
* multiprocessing.Process class is used to create a process.
* multiprocessing.Process class has the following methods and attributes:
    * start() - starts the process.
    * run() - method that contains the code to be executed by the process.
    * join() - waits for the process to terminate.
    * is_alive() - returns True if the process is still running.
    * name - attribute that gets or sets the name of the process.
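
A comparable sketch for `multiprocessing.Process`, meant to be run as a script rather than in an interactive session. The function `work` and the process name are hypothetical.

```python
from multiprocessing import Process

def work():
    print("hello from the child process")

if __name__ == "__main__":
    p = Process(target=work, name="demo-process")
    print(p.name, p.is_alive())  # created but not started, so not alive
    p.start()
    p.join()                     # wait for the child to exit
    print(p.exitcode)            # 0 after a clean exit
```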

### MULTIPROCESSING:
Processes do not live in the same memory space, so they need special shared-memory objects (such as Value and Array) to share data.

In [79]:
from multiprocessing import Process, Value, Array, Lock
import os
import time

In [80]:
def square_numbers():
    for i in range(1000):
        i*i
        time.sleep(0.1)

if __name__ == '__main__':
    processes = []
    # a good number would be how many CPUs there are on the machine
    number_of_processes = os.cpu_count()

    # create processes
    for i in range(number_of_processes):
        # target is the function to execute; pass the function
        # itself, not the result of calling it
        p = Process(target=square_numbers)
        processes.append(p)

    for p in processes:
        p.start()

    # waiting for a process to finish and blocking main thread
    for p in processes:
        p.join()

    print('end main')


#### More in depth look at multiprocessing (04:31:05)

In [None]:
def add100(numbers, lock):
    for x in range(100):
        time.sleep(0.01)
        for number in range(len(numbers)):
            with lock:
                numbers[number] += 1

# If actually running this file, not just importing
if __name__ == '__main__':

    lock = Lock()
    # shared array of doubles ('d'), with three starting values
    shared_array = Array('d', [0.0, 100.0, 200.0])
    print("Array at beginning is: ", shared_array[:])

    process_01 = Process(target=add100, args=(shared_array, lock))
    process_02 = Process(target=add100, args=(shared_array, lock))

    process_01.start()
    process_02.start()

    process_01.join()
    process_02.join()

    print("Array at end is: ", shared_array[:])
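
The same pattern works for a single shared number with `Value`. This is a minimal sketch, assuming a hypothetical helper `add_many`; `get_lock()` returns the lock that `Value` creates for itself by default.

```python
from multiprocessing import Process, Value

def add_many(counter, times):
    for _ in range(times):
        # hold the Value's own lock so the read-modify-write is atomic
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    shared_number = Value('i', 0)  # 'i' = signed int, starting at 0

    workers = [Process(target=add_many, args=(shared_number, 100))
               for _ in range(2)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    print("Value at end is:", shared_number.value)  # 200
```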

In [None]:
# Queue Version

from multiprocessing import Queue

In [None]:
def square(numbers, queue):
    for integer in numbers:
        queue.put(integer * integer)


def make_negative(numbers, queue):
    for number in numbers:
        queue.put(-1 * number)

In [None]:
# If actually running this file, not just importing
if __name__ == '__main__':

    numbers = range(1, 6)
    process_queue = Queue()

    q_process01 = Process(target=square, args=(numbers, process_queue))
    q_process02 = Process(target=make_negative, args=(numbers, process_queue))

    q_process01.start()
    q_process02.start()

    q_process01.join()
    q_process02.join()

    while not process_queue.empty():
        print(process_queue.get())



#### PROCESS POOL:
An object that manages a pool of worker processes to which jobs can be submitted. It manages the available processes and splits data into separate sections to be processed.

Important methods are: map, apply, join, close

In [81]:
from multiprocessing import Pool

In [83]:
def cube(number):
    return (number * number * number)

if __name__ == '__main__':
    numbers = range(0, 10)
    pool = Pool()

    # This splits the iterable into chunks and submits them
    # to the worker processes in the pool, which run the
    # function in parallel.
    result = pool.map(cube, numbers)
    # This runs the function once in a worker process;
    # the arguments must be passed as a tuple
    pool.apply(cube, (numbers[0],))
    pool.close()
    pool.join()

    print(result)
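
A more compact sketch of the same idea uses the pool as a context manager, which shuts the pool down when the block ends. It assumes the code is saved and run as a script, since worker processes may be unable to find functions that were defined in an interactive session. The worker count of 2 is arbitrary.

```python
from multiprocessing import Pool

def cube(number):
    return number ** 3

if __name__ == "__main__":
    # leaving the with-block terminates the pool's worker processes
    with Pool(processes=2) as pool:
        mapped = pool.map(cube, range(5))  # one call per item, spread over workers
        single = pool.apply(cube, (3,))    # one call; args passed as a tuple

    print(mapped, single)  # [0, 1, 8, 27, 64] 27
```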




### MULTI-THREADING:

In [None]:
from threading import Thread

In [None]:
def square_numbers():
    for i in range(100):
        i*i

if __name__ == '__main__':
    threads = []
    number_of_thread = 10

    for i in range(number_of_thread):
        # pass the function itself; calling it here would run
        # it in the main thread instead
        t = Thread(target=square_numbers)
        threads.append(t)

    for t in threads:
        t.start()

    # Wait and block main thread until complete
    for t in threads:
        t.join()

    print('end main')

### THREADING:

In [None]:
import time
from threading import Thread, Lock

# Sharing data between threads
database_value = 0

def increase(lock):
    # global lets the function rebind the module-level value
    # that all threads share
    global database_value

    with lock: # This acquires and releases the lock automatically
        # every time you acquire a lock, it must be released
        local_copy = database_value
        # gets the database value and stores it in a local copy
        # modify the local copy
        # processing
        local_copy += 1
        time.sleep(0.1)
        database_value = local_copy




if __name__ == "__main__":
    # This lock prevents multiple threads from running the
    # protected code at the same time
    lock = Lock()

    print('start value: ', database_value)

    thread1 = Thread(target=increase, args=(lock,))
    thread2 = Thread(target=increase, args=(lock,))

    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

    print('end value: ', database_value)
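
The `with` statement is shorthand for an explicit acquire/release pair. A minimal sketch of the explicit form follows, assuming a hypothetical counter and function `increase_many`. The `try`/`finally` guarantees the release even if the protected code raises.

```python
from threading import Thread, Lock

counter = 0
counter_lock = Lock()

def increase_many(times):
    global counter
    for _ in range(times):
        counter_lock.acquire()
        try:
            counter += 1  # only one thread at a time runs this line
        finally:
            # always release, even if the protected code raises
            counter_lock.release()

threads = [Thread(target=increase_many, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 4000
```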



### Queues in threading:
great for thread-safe and process-safe data exchange and data processing in multi-threaded and multi-processing environments.

Queues are thread-safe (queue.Queue) and process-safe (multiprocessing.Queue), meaning that multiple threads or processes can share the same queue and access it simultaneously without corrupting the data.

A queue is a linear structure that stores items in a First-In-First-Out (FIFO) manner. A good example of a queue is any queue of consumers for a resource where the consumer that came first is served first.

#### Queue functions include:
1. .get() = remove and return an item from the queue (the dequeue operation)
2. .put() = put an item into the queue (the enqueue operation)
3. .task_done() = indicate that a formerly enqueued task is complete
4. .join() = block until all items in the queue have been gotten and processed
5. .empty() = check if the queue is empty
6. .full() = check if the queue is full
7. .qsize() = return the approximate size of the queue
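
A short sketch of the inspection methods on the standard library's `queue.Queue` (`qsize()`, `empty()`, `full()`), assuming a bounded queue of two items chosen only for illustration:

```python
from queue import Queue, Empty

q = Queue(maxsize=2)        # bounded queue: holds at most two items
q.put("a")
q.put("b")

size_when_full = q.qsize()  # 2
was_full = q.full()         # True: a further put() would block

first = q.get()             # "a" (first in, first out)
second = q.get()            # "b"
was_empty = q.empty()       # True

try:
    q.get_nowait()          # non-blocking get on an empty queue
    raised_empty = False
except Empty:
    raised_empty = True

print(size_when_full, was_full, first, second, was_empty, raised_empty)
```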


In [None]:
from queue import Queue

if __name__ == "__main__":
    q = Queue()
    q.put(1)
    q.put(2)
    q.put(3)
    # 1 enters the queue, then 2 then 3.

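    # get() removes and returns the first item; drain all three
    # so that q.join() below does not block forever
    while not q.empty():
        item = q.get()
        print(item)
        # When finished processing an item, always call q.task_done()
        q.task_done()

    # returns once task_done() has been called for every item put
    q.join()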

    print("end main")

In [None]:
from threading import current_thread

In [None]:
def worker(queue, lock):
    while True:
        value04 = queue.get()
        # processing
        with lock:
            print(f"In {current_thread().name} and got {value04}")
        queue.task_done()
        # signal we are done

if __name__ == "__main__":
    queue01 = Queue()
    lock01 = Lock()
    number_threads = 10

    for i in range(number_threads):
        thread03 = Thread(target=worker, args=(queue01, lock01))
        # daemon = background thread that will die when main
        # thread dies
        thread03.daemon=True
        thread03.start()

    for i in range(1, 21):
        queue01.put(i)

    queue01.join()

    print('end main')
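
An alternative to daemon threads is to stop each worker with a sentinel value. This is a minimal sketch, not the approach used above; the sentinel `STOP`, the `results` list, and the doubling step are hypothetical placeholders.

```python
from queue import Queue
from threading import Thread

STOP = None  # hypothetical sentinel meaning "no more work"

def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is STOP:
            tasks.task_done()
            break                  # leave the loop so the thread can end
        results.append(item * 2)   # placeholder for real processing
        tasks.task_done()

tasks = Queue()
results = []
threads = [Thread(target=worker, args=(tasks, results)) for _ in range(3)]
for t in threads:
    t.start()

for i in range(1, 11):
    tasks.put(i)
for _ in threads:
    tasks.put(STOP)   # one sentinel per worker

tasks.join()          # every item, sentinels included, is marked done
for t in threads:
    t.join()          # workers exit on their own; no daemon flag needed

print(sorted(results))
```

Because every worker consumes exactly one sentinel, all threads end cleanly and can be joined, at the cost of one extra `put()` per worker.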
