# Working with Threads in Python

## What is a Thread?

In computing, a process is an instance of a computer program that is being executed. Any process has 3 basic components:  

- An executable program.  
- The associated data needed by the program (variables, work space, buffers, etc.).  
- The execution context of the program (the state of the process).  

A thread is an entity within a process that can be scheduled for execution. It is also the smallest unit of processing that an OS (Operating System) can schedule.  

In simple words, a thread is a sequence of instructions within a program that can be executed independently of other code. For simplicity, you can think of a thread as a subset of a process.  

Multithreading is defined as the ability of a processor to execute multiple threads concurrently.  

On a simple, single-core CPU, this is achieved by switching frequently between threads, which is called context switching. In a context switch, the state of one thread is saved and the state of another thread is loaded whenever an interrupt (due to I/O or set manually) occurs. Context switching happens so frequently that all the threads appear to be running in parallel (this is termed multitasking).  

Below is a simple program demonstrating how to set up and run threads. It runs two threads, each with its own function.

In [None]:
# Python program to illustrate the concept
# of threading
# importing the threading module
import threading
  
def print_cube(num):
    """
    function to print cube of given num
    """
    print(f"Cube: {num**3}")

def print_square(num):
    """
    function to print square of given num
    """
    print(f"Square: {num**2}")

# creating thread
t1 = threading.Thread(target=print_square, args=(10,))
t2 = threading.Thread(target=print_cube, args=(10,))

# starting thread 1
t1.start()
# starting thread 2
t2.start()

# wait until thread 1 is completely executed
t1.join()
# wait until thread 2 is completely executed
t2.join()

# both threads completely executed
print("Done!")

### Creating a Thread

The line:  
`t1 = threading.Thread(target=print_square, args=(10,))`  
demonstrates the normal way to create a thread object to be run in the future. The `target` argument is the function to be called when the thread is started. The `args` argument is a tuple of the arguments to be supplied to that function. Another argument, `kwargs`, can be supplied if you want to provide keyword arguments to the thread function.  

The `start` method can then be called on the thread object. This calls the function with the provided arguments in its own separate thread.  

The `join` method can be used when you need to wait for a thread to complete. Call it at any point, and the thread that calls `join` (in this case the main execution thread) will wait for the joined thread to complete before continuing.  
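
As a small sketch of `args`, `kwargs`, `start` and `join` working together, the example below passes one positional and two keyword arguments to a thread. The `describe` function and the `results` list are hypothetical names made up for this illustration.

```python
import threading

results = []  # hypothetical shared list, used only to record what the thread did

def describe(num, power=2, label="Result"):
    """Hypothetical helper: raise num to a power and record a labelled string."""
    results.append(f"{label}: {num ** power}")

# Positional arguments go in `args`, keyword arguments go in `kwargs`.
t = threading.Thread(
    target=describe,
    args=(10,),
    kwargs={"power": 3, "label": "Cube"},
)

t.start()  # runs describe(10, power=3, label="Cube") in a new thread
t.join()   # the main thread waits here until describe() has returned

print(results)       # ['Cube: 1000']
print(t.is_alive())  # False, because join() only returns once the thread is done
```

Because `join` has returned before `results` is printed, the main thread is guaranteed to see the string that the worker thread appended.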

## Exercises: Basics of Threading

1. Create a function that prints each number from 1 to 20 in a for loop. Create two threads that run this function and start them one immediately after the other. Join each thread so your program waits for both to finish before it ends. 

2. Edit your previous function so it prints each number from 1 to n, where n is an argument provided to the function. Create two threads and run them, this time having one thread print numbers up to 50 and the other print numbers up to 70. 

3. Create a global list. Write one function that appends a list of strings, given to it as an argument, to the global list. Write a second function that prints each element in the global list one at a time. Finally, run the first function as a thread with five random words, and then run the second function as a thread, making sure that the first thread has completed so that the second can print all of those strings.

4. Redo question 1, but this time run five threads at once, using a for loop or a data structure to manage them all.

## Race Conditions

You may have noticed that in problem 1 above the print statements from the two threads were intermingled. While this isn't a big deal here, imagine if two threads were accessing a bank account and adjusting its balance instead. This could lead to serious problems if they both try to add money at the same time. This sort of problem, where the timing of two threads can affect what should be a consistent result, is called a 'Race Condition'.

Concurrent access to a shared resource can lead to a race condition.

A race condition occurs when two or more threads can access shared data and they try to change it at the same time. As a result, the values of variables may be unpredictable and vary depending on the timing of context switches between the threads.
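
To make the bank account scenario concrete, here is a minimal sketch that replays one unlucky interleaving by hand. It deliberately uses no real threads, so the outcome is the same on every run; the variable names and amounts are hypothetical and chosen only for illustration.

```python
# A deterministic replay of a "lost update".
# The steps two threads would perform are written out in an unlucky order.
balance = 100

# "Thread A" and "thread B" both read the balance before either one writes.
read_by_a = balance        # A reads 100
read_by_b = balance        # B reads 100 (A has not written yet)

balance = read_by_a + 50   # A writes 150
balance = read_by_b + 50   # B writes 150, overwriting A's deposit

print(balance)             # 150, not the 200 that two deposits of 50 should give
```

With real threads, the same loss happens whenever a context switch falls between one thread's read and its write.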

Consider the program below to understand the concept of a race condition. (Jupyter notebooks seem to avoid switching threads too often, and depending on your Python version the problem may show up only rarely, so you might need to run it a few times before you see it.)

In [None]:
import threading
  
# global variable x
x = 0
  
def increment():
    """
    function to increment global variable x
    """
    global x
    x += 1

def thread_task():
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        increment()

def main_task():
    global x
    # setting global variable x as 0
    x = 0
  
    # creating threads
    t1 = threading.Thread(target=thread_task)
    t2 = threading.Thread(target=thread_task)
  
    # start threads
    t1.start()
    t2.start()
  
    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print(f"Iteration {i}: x = {x}")

In the program above:  

- Two threads, `t1` and `t2`, are created in the `main_task` function, and the global variable `x` is set to 0.  
- Each thread has the target function `thread_task`, which calls the `increment` function 100000 times.  
- The `increment` function increments the global variable `x` by 1 on each call.  
- The expected final value of `x` is 200000, but across the 10 iterations of `main_task` we may get different values.  

This happens because both threads access the shared variable `x` concurrently. `x += 1` is not a single step: the thread reads `x`, adds 1, and writes the result back, so a switch between the read and the write lets one thread overwrite the other's update. This unpredictability in the value of `x` is a race condition.  

## Solving Race Conditions Using Locks 

The `threading` module provides a `Lock` class to deal with race conditions.

The `Lock` class provides the following methods:

`acquire(blocking=True, timeout=-1)`: Acquires the lock, either blocking or non-blocking.  

- When invoked with the `blocking` argument set to `True` (the default), the calling thread is blocked until the lock is unlocked; the lock is then set to locked and the call returns `True`. The optional `timeout` limits how long a blocking call waits.  
- When invoked with the `blocking` argument set to `False`, the calling thread is not blocked. If the lock is unlocked, it is set to locked and the call returns `True`; otherwise the call returns `False` immediately.  

`release()`: Releases the lock.  

- When the lock is locked, it is reset to unlocked and the call returns. If any other threads are blocked waiting for the lock to become unlocked, exactly one of them is allowed to proceed.  
- If the lock is already unlocked, a `RuntimeError` is raised.  
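
The return values described above can be checked from a single thread. This is only a sketch: the names `first`, `second` and `release_error` are hypothetical, and it assumes Python 3, where an extra `release()` surfaces as a `RuntimeError` (`threading.ThreadError`, where it still exists, is an alias for it).

```python
import threading

lock = threading.Lock()

first = lock.acquire()                 # lock was free: locks it and returns True
second = lock.acquire(blocking=False)  # already held: returns False immediately
print(first, second)                   # True False

lock.release()                         # the lock is unlocked again
print(lock.locked())                   # False

release_error = None
try:
    lock.release()                     # releasing an unlocked lock is an error
except RuntimeError as exc:
    release_error = exc
    print(f"Second release failed: {type(exc).__name__}")
```

Note that a plain `Lock` is not reentrant: the second `acquire` fails even though the same thread already holds the lock, which is why the non-blocking form is used here to avoid a deadlock.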

The code below demonstrates the use of locks:  

In [None]:
import threading
  
# global variable x
x = 0

def increment():
    """
    function to increment global variable x
    """
    global x
    x += 1

def thread_task(lock):
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        lock.acquire()
        increment()
        lock.release()

def main_task():
    global x
    # setting global variable x as 0
    x = 0
  
    # creating a lock
    lock = threading.Lock()
  
    # creating threads
    t1 = threading.Thread(target=thread_task, args=(lock,))
    t2 = threading.Thread(target=thread_task, args=(lock,))
  
    # start threads
    t1.start()
    t2.start()
    
    # wait until threads finish their job
    t1.join()
    t2.join()

for i in range(10):
    main_task()
    print(f"Iteration {i}: x = {x}")

Let's walk through the code above step by step:  

1. First, a `Lock` object is created:  
   `lock = threading.Lock()`  
2. Then the lock is passed as an argument to the target function:  
   `t1 = threading.Thread(target=thread_task, args=(lock,))`  
   `t2 = threading.Thread(target=thread_task, args=(lock,))`  
3. In the critical section of the target function, we take the lock with `lock.acquire()`. Once the lock is acquired, no other thread can enter the critical section (here, the call to `increment`) until the lock is released with `lock.release()`:  
   `lock.acquire()`  
   `increment()`  
   `lock.release()`  

As you can see in the results, the final value of `x` comes out to be 200000 every time, which is the expected final result.  
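
A `Lock` can also be used as a context manager, which pairs each `acquire()` with a `release()` automatically, even if the protected code raises an exception. The sketch below is one possible variation on the pattern above rather than a required change; `counter`, `counter_lock` and `add_many` are hypothetical names.

```python
import threading

counter = 0                       # hypothetical shared counter
counter_lock = threading.Lock()   # protects every update to `counter`

def add_many(times):
    """Increment the shared counter `times` times, one locked update at a time."""
    global counter
    for _ in range(times):
        with counter_lock:        # acquire() on entry, release() on exit
            counter += 1

# Two threads, each performing 100000 protected increments.
threads = [threading.Thread(target=add_many, args=(100000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)                    # 200000
```

The `with` form is usually preferred because the lock cannot be left held by accident if an exception occurs between acquiring and releasing it.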

## Exercise: Working with Locks

1. Reuse the code you wrote for question 1 of the previous exercise, but this time use locks to ensure each thread prints all of its numbers before the other can start.

2. Create a global list containing the number 0. Then create a function that checks the last value in that list, adds one to it, and appends that value to the global list so the list becomes [0, 1, 2, 3, ...]. Have the function do this 50 times. Create two threads and run them at the same time, using locks to ensure that the numbers stay in sequence.