# Multitasking

In Python, multitasking refers to the ability of a program to perform multiple tasks concurrently or simultaneously.

Python provides various techniques for achieving multitasking, including:

- Threads: Threads are lightweight, independent units of execution within a program. They allow multiple tasks to run concurrently within the same process and share its memory.
- Processes: Processes are independent instances of a program that run in their own memory space. They can communicate with each other through inter-process communication (IPC) mechanisms.
- Asyncio: `asyncio` is a standard-library module for writing asynchronous code. It provides a way to write concurrent code using coroutines, which are functions that can be paused and resumed.

Used well, these techniques can improve performance: threads and asyncio mainly help I/O-bound programs overlap their waiting, while processes let CPU-bound work run in parallel on several cores.
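To make "paused and resumed" concrete, here is a minimal asyncio sketch. The coroutine names `fetch` and `main` and the delays are illustrative assumptions, not part of any particular application.

```python
import asyncio


async def fetch(name, delay):
    # await pauses this coroutine; the event loop can run others meanwhile
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs both coroutines concurrently and returns their
    # results in the order the coroutines were passed in
    return await asyncio.gather(fetch("a", 0.02), fetch("b", 0.01))


results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```

Both coroutines run in a single thread: "b" finishes first, but `gather()` still reports the results in argument order.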

## Things we will review 
- Processes and threads
- Very basic threads in Python
- The Fork/Join model
- Synchronisation objects
- Simple use of Lock
- Queue objects
- The trouble with threads
- Using the multiprocessing module
- Queue objects
- concurrent.futures

## Processes and threads

We guess that you probably know what a process is, but you might
not have met threads before. Even if you have, through a
language like C# or Java, then some aspects of Python
multithreading may surprise you.

Python multithreading has both strengths and weaknesses.

Programming with threads is different. You never know when
some other thread is going to alter the global resource you are
using, so synchronisation is essential. Of course there is an
overhead in doing that.

The Python threading module is a high-level interface built on
the low-level `_thread` module (named `thread` in Python 2). If you
have used threads procedurally, for example Win32 threads or pthreads
from C/C++, then the low-level interface will look familiar.

## The Fork/Join model

The term fork/join commonly describes a simple model with a
single fork point, where the worker threads are started, and a single
join point, where they all come back together. It is a robust model
that is well worth using.

The general threading model, by contrast, can be chaotic, in more
than one sense of the word.
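A minimal fork/join sketch with the threading module is shown here. The names `work` and `fork_join` are hypothetical, and each worker writes only to its own slot of the result list so that no lock is needed in this simple case.

```python
from threading import Thread


def work(index, results):
    # Each worker writes only to its own slot of the shared list
    results[index] = index * index


def fork_join(n):
    results = [None] * n
    threads = [Thread(target=work, args=(i, results)) for i in range(n)]
    for t in threads:   # fork: start every worker
        t.start()
    for t in threads:   # join: wait for all of them at a single point
        t.join()
    return results


print(fork_join(4))  # [0, 1, 4, 9]
```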

## Issues with threads
Threads are very difficult to code:
- Sharing variables requires locking mechanisms
- Subtle timing differences can make debugging difficult

The Python Global Interpreter Lock (GIL):
- The GIL locks the interpreter, so only one thread executes Python byte-code at a time
- Before Python 3.2, a running thread was asked to give up the GIL about every 100 byte-code instructions
- From Python 3.2, the switch is time-based instead (5 ms by default, see `sys.getswitchinterval()`), which reduces the overhead
- The GIL simplifies and protects the interpreter
- The GIL does not mean that:
  - Python is not multi-threaded: C extension modules can release the GIL and run threads in parallel
  - You don't need to worry about locking: you certainly do!
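The interval at which CPython asks a running thread to give up the GIL can be inspected from Python itself. This is a small sketch, assuming CPython 3.2 or later; the value printed depends on the interpreter's configuration.

```python
import sys

# Time, in seconds, after which a running thread is asked to release the GIL
interval = sys.getswitchinterval()
print(f"switch interval: {interval} s")
```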




In [1]:
from threading import Thread
import time

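def test_function(name):
    # Announce which thread is running, then simulate slow work
    print("From thread", name)
    time.sleep(5)

# args must be a tuple; a bare string would be unpacked into single characters
th1 = Thread(target=test_function, args=('th 1',))
th2 = Thread(target=test_function, args=('th 2',))

# Fork: start both threads
th1.start()
th2.start()

print("from main")

# Join: wait for both threads to finish
th1.join()
th2.join()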

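Sample output (the threads print at the same time, so the text interleaves and the exact order varies between runs):

From threadFrom thread from main th 2
th 1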



## Synchronisation objects

The threading module contains a range of objects that can be
used for thread synchronisation.
- Condition variables let multiple threads synchronise their execution based on a shared condition. A condition variable consists of a mutex (lock) and a queue of waiting threads.
- A Barrier lets a set of threads synchronise at a specific point. It waits until a specified number of threads have reached it, then releases them all at once so they can continue executing.
- Events allow threads to wait for a specific event to occur. An event can be set or cleared, and threads can wait for the event to be set before continuing their execution.
- Thread-local storage lets a thread share a variable between functions as if it were global, while each thread sees its own separate value. Generally this is a hack to allow a single-threaded program to be converted to multi-threaded. Avoid globals and you won't need it.
- Locks are often required, and represent the basic locking mechanism. A thread either has ownership of a lock or waits for it.
- Semaphores, on the other hand, are not "owned" by anyone. We put a limit on the number of threads that can hold a semaphore at once; if more come along, they wait until a thread releases it.
- Timers run a function once after a specified delay. To repeat at intervals, the function has to start a new timer each time.
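Two of these objects can be sketched together: an `Event` that holds the workers back until the main thread signals, and a `Semaphore` that limits how many workers run the guarded section at once. The function name `run_demo`, the worker count, and the limit of 2 are assumptions made for illustration.

```python
import threading
import time


def run_demo(num_workers=5, limit=2):
    start = threading.Event()            # workers wait until this is set
    slots = threading.Semaphore(limit)   # at most `limit` workers inside at once
    state_lock = threading.Lock()        # protects the bookkeeping below
    state = {"active": 0, "peak": 0}

    def worker():
        start.wait()                     # block until the main thread signals
        with slots:                      # take a slot; released on exit
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)             # simulated work
            with state_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    start.set()                          # release all waiting workers
    for t in threads:
        t.join()
    return state["peak"]


peak = run_demo()
print("peak concurrency:", peak)         # never more than the limit of 2
```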

## Simple use of locks
We will look at using locks to create a "critical section"
of code (a general concurrency term, also used by Windows for one of
its synchronisation objects, meaning code that only one thread can run at a time).
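A minimal sketch of a critical section follows, assuming a hypothetical `Counter` class shared by several threads. The `with` statement acquires the lock on entry and releases it on exit, even if an exception is raised.

```python
from threading import Lock, Thread


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = Lock()

    def increment(self):
        # Critical section: only one thread at a time may update value
        with self._lock:
            self.value += 1


def run_counter(num_threads=4, times=10_000):
    counter = Counter()

    def add_many():
        for _ in range(times):
            counter.increment()

    threads = [Thread(target=add_many) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_counter())  # 40000
```

Without the lock, `self.value += 1` is a read-modify-write sequence that two threads can interleave, so updates may be lost.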

## Queues 
Task Queues: A task queue is a common pattern in which a set of worker threads pull tasks from a queue and execute them. The queue module provides the Queue class, which is a thread-safe implementation of a task queue. A use case of task queues can be found in a web application that receives requests and needs to perform tasks in the background, such as sending emails or generating reports.

The `queue` module (named `Queue` in Python 2) supplies a serialised
communication mechanism between threads. All the locking is done for
us: each `put()` and `get()` call is thread-safe, so we do not need to
worry about atomicity and other nasty details.

Queues are really designed for continuous use, and terminating
processing can be problematic. What you should not do is
terminate the primary thread without waiting for the worker
threads to finish processing.

Queues are ideal for the producer-consumer model, where one
set of threads adds data items to the queue and another set
removes and processes them.

Message Passing: The queue module can also be used for message passing between threads. One thread can send messages to another thread by adding them to a queue, and the other thread can receive messages by removing them from the queue. This pattern can be useful in many applications that require inter-thread communication, such as distributed computing or real-time systems.
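The producer-consumer pattern and a clean shutdown can be sketched as follows. The sentinel object `STOP` and the function names are hypothetical; putting one sentinel per worker on the queue is one common way to tell workers to finish, not the only one.

```python
import queue
import threading

STOP = object()   # hypothetical sentinel that tells a worker to exit


def consumer(tasks, results):
    while True:
        item = tasks.get()        # blocks until an item is available
        if item is STOP:
            break
        results.put(item * 2)     # the "processing" step


def run_pipeline(items, num_workers=3):
    tasks = queue.Queue()
    results = queue.Queue()
    workers = [threading.Thread(target=consumer, args=(tasks, results))
               for _ in range(num_workers)]
    for w in workers:
        w.start()
    for item in items:            # producer side: add the work
        tasks.put(item)
    for _ in workers:             # one sentinel per worker
        tasks.put(STOP)
    for w in workers:             # wait for the workers before returning
        w.join()
    out = []
    while not results.empty():    # safe here: all workers have finished
        out.append(results.get())
    return sorted(out)


print(run_pipeline([1, 2, 3, 4]))  # [2, 4, 6, 8]
```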


## Python multiprocessing Process class
The `multiprocessing.Process` class is an abstraction that sets up another Python process, gives it code to run, and lets the parent application control its execution. Its two most important methods are `start()` and `join()`.

First, write the function that the process will run. Then instantiate a `Process` object, passing that function as `target`. Nothing happens until you call `start()`, at which point the function runs in the new process. Its return value is discarded, so results have to be sent back another way, for example through a `multiprocessing.Queue`. Calling `join()` makes the parent wait until the process has finished.

On Unix, a process that has finished but has not been joined remains as a zombie until it is joined, so if you create many processes, join them to avoid tying up system resources. To pass arguments to the target function, use the `args` keyword argument, which takes a tuple.
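The `start()`/`join()` sequence can be sketched like this. The function names `count_matches` and `worker` and the sample lines are hypothetical; a `multiprocessing.Queue` carries the result back because the return value of a `Process` target is discarded.

```python
import multiprocessing


def count_matches(lines, term):
    """Return how many lines contain term."""
    return sum(1 for line in lines if term in line)


def worker(lines, term, results):
    # Runs in the child process; send the result back through the queue
    results.put(count_matches(lines, term))


if __name__ == "__main__":
    lines = ["to be", "or not", "to be"]   # hypothetical sample data
    results = multiprocessing.Queue()

    proc = multiprocessing.Process(target=worker, args=(lines, "be", results))
    proc.start()             # nothing runs until start() is called
    count = results.get()    # read the result before joining
    proc.join()              # wait for the child process to exit
    print(count, proc.exitcode)
```

The `if __name__ == "__main__":` guard matters: on platforms that start child processes by importing the main module, it stops the child from creating processes of its own.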



In [None]:
import multiprocessing


def finder(data):
    line = data[0]
    search_term = data[1]
    if search_term in line:
        return line
    else:
        return None


if __name__ == '__main__':
    #prepare data
    data_lines = []
    with open('Shakespeare.as.you.like.it.txt', 'r') as shaky:
        data_lines = shaky.readlines()
    search_term = 'o'
    data_lines = [[line, search_term] for line in data_lines]

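    # Create a Pool with 4 worker processes; the with block cleans it up
    with multiprocessing.Pool(processes=4) as pool:
        # map() blocks until every line has been processed
        results = pool.map(finder, data_lines)

    # Print the results (None for lines that did not match)
    print(results)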


## Concurrent futures
The `concurrent.futures` module (added in Python 3.2) does not replace
the multiprocessing and threading modules, but provides an abstraction
for them. Here, "futures" does not refer to a future release of Python,
but to something that is executed in the future, so this is about
asynchronous execution.

The main classes used are `ProcessPoolExecutor` and
`ThreadPoolExecutor`, thinly veiled layers above multiprocessing
and multithreading. All the caveats and issues of using the
lower-level modules are still there.

There are several ways to run an asynchronous task, most
commonly using `submit()`, passing a callable object and any
required parameters. An alternative is `map()`, which behaves in a
similar way to the `map()` builtin.

Once submitted, tasks can start running at once: we do not have to
explicitly start them as we do with the lower-level modules. What
would be a `join()` at a lower level is a `wait()` here. `wait()`
returns the sets of finished and unfinished futures, and calling
`result()` on a future gives us the returned value of the callable.
An iterator, `as_completed()`, is also supplied that yields each
future as it finishes; see the examples below.
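As a small sketch of the two calls not used in the notebook cells, `map()` and `wait()`, assuming a trivial `square` function and a hypothetical helper named `run`:

```python
from concurrent.futures import ThreadPoolExecutor, wait


def square(n):
    return n * n


def run():
    with ThreadPoolExecutor(max_workers=4) as executor:
        # map() yields results in the same order as the inputs
        mapped = list(executor.map(square, [1, 2, 3]))

        # submit() returns a Future for each task
        futures = [executor.submit(square, n) for n in [4, 5]]

        # wait() blocks until all futures finish and returns two sets
        done, not_done = wait(futures)
        submitted = sorted(f.result() for f in done)
    return mapped, submitted, len(not_done)


print(run())  # ([1, 4, 9], [16, 25], 0)
```

`done` is a set, so the results from it are sorted here to give a stable order.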




In [None]:
import concurrent.futures


def finder(data):
    line = data[0]
    search_term = data[1]
    if search_term in line:
        return line
    else:
        return None


if __name__ == '__main__':
    # prepare data
    data_lines = []
    with open('Shakespeare.as.you.like.it.txt', 'r') as shaky:
        data_lines = shaky.readlines()
    search_term = 'o'
    data_lines = [[line, search_term] for line in data_lines]

    # Create a ThreadPoolExecutor with 4 worker threads
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:

        # Submit one finder task per [line, search_term] pair
        results = [executor.submit(finder, item) for item in data_lines]

        # Process the results as they become available (completion order, not file order)
        for future in concurrent.futures.as_completed(results):
            print(future.result())


In [None]:
import concurrent.futures

def square(n):
    '''
    Return the square of n.
    '''
    return n**2

if __name__ == '__main__':
    numbers = [1, 2, 3, 4, 5]
    
    # Create a ThreadPoolExecutor 
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        
        # Submit one square task per number; each call returns a Future
        results = [executor.submit(square, n) for n in numbers]
        
        # Asynchronously process the results as they become available
        for future in concurrent.futures.as_completed(results):
            print(future.result())
