# Threads and Processes

## Process
A process is a program in execution. It is more than the program code: it also includes the program counter, the process stack, the register contents and so on. The program code by itself is only the text section.

## Thread
A thread is a lightweight unit of execution inside a process that can be managed independently by a scheduler. Threads can improve application performance by letting work run concurrently. A thread shares the data segment, code segment, open files etc. with its peer threads, while it keeps its own registers, stack and program counter.

For simplicity, you can assume that a thread is simply a subset of a process!
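
To make the shared versus private split concrete, here is a minimal sketch. The names `shared_log`, `log_lock` and `worker` are made up for this illustration. Each thread keeps `local_total` as its own local variable, while all of them append to the same module-level list.

```python
import threading

shared_log = []                 # one list, visible to every thread in the process
log_lock = threading.Lock()     # guards the shared list

def worker(tag):
    local_total = 0             # local variable: private to the thread running this call
    for i in range(3):
        local_total += i
    with log_lock:
        shared_log.append((tag, local_total))

threads = [threading.Thread(target=worker, args=(f"t{n}",)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Sorted because the order in which threads append is not guaranteed
print(sorted(shared_log))
```

Every thread computes its own `local_total`, yet all three results end up in the single shared list.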

In [6]:
import threading 
import os 

In [7]:
def print_cube(num): 
    print("cube of a number",num * num * num) 
def print_square(num): 
    print("square of a number",num * num)  
if __name__ == "__main__": 
    # creating thread 
    t1 = threading.Thread(target=print_square, args=(10,)) 
    t2 = threading.Thread(target=print_cube, args=(10,)) 
    # starting thread 1 
    t1.start() 
    # starting thread 2 
    t2.start() 
    # wait until thread 1 is completely executed 
    t1.join() 
    # wait until thread 2 is completely executed 
    t2.join() 
    # both threads completely executed 
    print("Done!")


square of a number 100
cube of a number 1000
Done!


# Creating Threads

To create a new thread, we create an object of the Thread class. It takes the following arguments:
    target: the function to be executed by the thread
    args: the arguments to be passed to the target function
 
To start a thread, we call the start method of the Thread class.
 
Once the threads start, the current program (you can think of it as the main thread) also keeps on executing. To pause the current program until a thread is complete, we call the join method.
 
As a result, the current program first waits for t1 to complete and then for t2. Once they are finished, the remaining statements of the current program are executed.
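
The sketch below illustrates the start and join behaviour described above. The function `slow_task`, the `events` list and the delays are made up for this example. It also passes arguments through `kwargs`, which `Thread` accepts in addition to `args`.

```python
import threading
import time

events = []   # records what happened, in order

def slow_task(label, delay):
    time.sleep(delay)
    events.append(f"{label} finished")

t1 = threading.Thread(target=slow_task, args=("t1", 0.2))
t2 = threading.Thread(target=slow_task, kwargs={"label": "t2", "delay": 0.05})

t1.start()
t2.start()
events.append("main keeps running")   # the main thread does not wait after start()

t1.join()   # block until t1 is done
t2.join()   # block until t2 is done
events.append("main resumes after both joins")

print(events)
```

The main thread records its first entry while both workers are still sleeping, and its last entry only after both join calls have returned.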

In [8]:
def task1(): 
    print(threading.current_thread().name)
    
def task2(): 
    print(threading.current_thread().name) 
   
  
if __name__ == "__main__": 
  
    # print ID of current process 
    print("ID of process running main program:",os.getpid()) 
  
    # print name of main thread (taken from the current thread object) 
    print("Main thread name:",threading.current_thread().name) 
  
    # creating threads 
    t1 = threading.Thread(target=task1, name='t1') 
    t2 = threading.Thread(target=task2, name='t2')   
  
    # starting threads 
    t1.start() 
    t2.start() 
  
    # wait until all threads finish 
    t1.join() 
    t2.join()

ID of process running main program: 287770
Main thread name: MainThread
t1
t2


# Concept of Globals

In [9]:
x=20 # global
def f1():
	print(x)
 
f1() # 20

20


In [16]:
x=20 # global
def f1():
    x=x+1 # error
    print(x)
 
f1() 

UnboundLocalError: local variable 'x' referenced before assignment

In [11]:
x=20 # global
def f1():
	global x
	x=x+1
	print(x)
 
f1() # 21

21


# Inter Process Communication

A process can be of two types:
1. Independent process.
2. Co-operating process.

An independent process is not affected by the execution of other processes, while
a co-operating process can be affected by other executing processes. One might
expect independently running processes to be the most efficient, but in practice
there are many situations where co-operation can be used to increase
computational speed, convenience and modularity. Inter process communication
(IPC) is a mechanism that allows processes to communicate with each other and
synchronize their actions. The communication between these processes can be seen
as a method of co-operation between them. Processes can communicate with each
other in two ways:
1. Shared Memory
2. Message passing

Figure 1 below shows the basic structure of communication between processes via the shared memory method and via the message passing method.

<img src='https://media.geeksforgeeks.org/wp-content/uploads/1-76.png'>
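
As a rough illustration of the message passing method, the sketch below sends results from a child process to the parent through a `multiprocessing.Queue`. The function names `square_worker` and `collect_squares`, and the use of `None` as an end-of-stream marker, are choices made for this example only.

```python
import multiprocessing

def square_worker(numbers, out_queue):
    """Runs in the child process: sends each square back as a message."""
    for n in numbers:
        out_queue.put(n * n)
    out_queue.put(None)   # sentinel meaning "no more messages"

def collect_squares(numbers):
    """Runs in the parent process: starts the child and reads its messages."""
    out_queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=square_worker, args=(numbers, out_queue))
    child.start()
    received = []
    while True:
        item = out_queue.get(timeout=10)   # blocks until the child sends something
        if item is None:
            break
        received.append(item)
    child.join()
    return received

if __name__ == "__main__":
    print(collect_squares([1, 2, 3, 4]))
```

Nothing is shared here: the parent only sees the values the child explicitly puts on the queue.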

In [15]:
import multiprocessing
import time
result = []
def square_list(mylist):
    global result
    # append squares of mylist to global list result
    for num in mylist:
        result.append(num * num)
    # print global list result 
    print("Result(in process p1)",result)
    time.sleep(1)

if __name__ == "__main__":
    # input list
    mylist = [1,2,3,4]
    # creating new process
    p1 = multiprocessing.Process(target=square_list, args=(mylist,))
    # starting process
    p1.start()
    # wait until process is finished
    p1.join()
    # print global result list
    print("Result(in main program)",result)

Result(in process p1) [1, 4, 9, 16]
Result(in main program) []


In [18]:
def square_list(mylist, result, square_sum):
    #append squares of mylist to result array
    for idx, num in enumerate(mylist):
        result[idx] = num * num
    # square_sum value
    square_sum.value = sum(result)

    # print result Array 
    print("Result(in process p1):",result[:])
    # print square_sum Value
    print("Sum of squares(in process p1):",square_sum.value)

if __name__ == "__main__":
    # input list
    mylist = [1,2,3,4]
    # creating Array of int data type with space for 4 integers
    result = multiprocessing.Array('i', 4)
    # creating Value of int data type
    square_sum = multiprocessing.Value('i')
    # print(square_sum)
    # creating new process
    p1 = multiprocessing.Process(target=square_list, args=(mylist, result, square_sum))
    # starting process
    p1.start()
    # wait until process is finished
    p1.join()
    # print result array
    print("Result(in main program):",result[:])
    # print square_sum Value
    print("Sum of squares(in main program):",square_sum.value)

Result(in process p1): [1, 4, 9, 16]
Sum of squares(in process p1): 30
Result(in main program): [1, 4, 9, 16]
Sum of squares(in main program): 30


# Multiprocessing
## What is multiprocessing?

Multiprocessing refers to the ability of a system to use more than one
processor at the same time. Applications in a multiprocessing system are broken
into smaller routines that run independently. The operating system allocates these
routines to the processors, improving the performance of the system.

## Why multiprocessing?
Consider a computer system with a single processor. If it is assigned several
processes at the same time, it has to interrupt each task and switch briefly
to another to keep all of the processes going.
This situation is just like a chef working alone in a kitchen, who has to do several
tasks like baking, stirring, kneading dough, etc.
Multitasking: running more than one process on a single processor by switching between them
Parallel processing: running a process on more than one processor

In [19]:
import multiprocessing

def print_cube(num):
    print("Cube of a number is",num * num * num)

def print_square(num):
    print("square of a number is",num * num)

if __name__ == "__main__":
    # creating processes
    p1 = multiprocessing.Process(target=print_square, args=(10, ))
    p2 = multiprocessing.Process(target=print_cube, args=(10, )) 
    # starting process 1
    p1.start()
    # starting process 2
    p2.start()
    # wait until process 1 is finished
    p1.join()
    # wait until process 2 is finished
    p2.join()
    # both processes finished
    print("Done!")

square of a number is 100Cube of a number is
 1000
Done!


In [20]:
# importing the multiprocessing module
import multiprocessing
import os
def f1():
    print("p1_id: ",os.getpid())

def f2():
    print("p2_id: ",os.getpid())

if __name__ == "__main__":
    # printing main program process id
    print("main process id",os.getpid())
    # creating processes
    p1 = multiprocessing.Process(target=f1) 
    p2 = multiprocessing.Process(target=f2)
    # starting processes
    p1.start()
    p2.start()
    # wait until processes are finished 
    p1.join()
    p2.join()
    # both processes finished
    print("Both processes finished execution!")
    # check if processes are alive
    print("p1 status is alive?:",p1.is_alive())
    print("p2 status is alive?:",p2.is_alive())

main process id 287770
p1_id:  292330p2_id: 
 292331
Both processes finished execution!
p1 status is alive?: False
p2 status is alive?: False


# Producer Consumer Problem

The producer's job is to generate a piece of data, put it into the buffer and
start again. At the same time, the consumer is consuming the data (i.e.,
removing it from the buffer) one piece at a time.

```python
from threading import Thread

class ProducerThread(Thread):
    def run(self):
        pass

class ConsumerThread(Thread):
    def run(self):
        pass
```

The problem describes two parties, the producer and the consumer, that share a
common buffer.
Here both are threads, so we keep one global variable that is modified by both
the Producer and Consumer threads. The producer produces data and adds it to the buffer.
The consumer consumes data from the buffer, i.e. removes it from the buffer.

In [23]:
from threading import Thread, Lock
import time
import random
buffer = []
lock = Lock()
class ProducerThread(Thread):
    def run(self):
        nums = range(5) # range object covering 0, 1, 2, 3, 4
        global buffer
        while True:
            num = random.choice(nums) # Selects a random number from 0, 1, 2, 3, 4
            lock.acquire()
            buffer.append(num)
            print("Produced", num)
            lock.release()
            time.sleep(random.random())

class ConsumerThread(Thread):
    def run(self):
        global buffer
        while True:
            lock.acquire()
            if not buffer:
                print ("Nothing in buffer, but consumer will try to consume")
            num = buffer.pop(0)
            print ("Consumed", num)
            lock.release()
            time.sleep(random.random())

ProducerThread().start()
ConsumerThread().start()

Produced 2
Consumed 2
Produced 1
Consumed 1
Produced 0
Consumed 0


Exception in thread Thread-11:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "<ipython-input-23-556d2ad12a41>", line 25, in run
IndexError: pop from empty list


Nothing in buffer, but consumer will try to consume


# Explanation
1. We started one producer thread (hereafter referred to as the producer) and one consumer thread (hereafter referred to as the consumer).
2. The producer keeps adding to the buffer and the consumer keeps removing from it.
3. Since the buffer is a shared variable, we access it only while holding a lock to avoid race conditions.
4. At some point, the consumer has consumed everything and the producer is still sleeping. The consumer tries to consume more, but since the buffer is empty, an IndexError is raised.
5. On every execution, before the IndexError is raised, you will see the print statement “Nothing in buffer, but consumer will try to consume”, which explains why you are getting the error.

This implementation therefore shows the wrong behaviour.
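
One side effect worth noting: in the listing above the IndexError is raised between `lock.acquire()` and `lock.release()`, so the consumer thread ends while still holding the lock, and the producer would then block on its next `acquire()`. The sketch below is a separate, minimal illustration, not part of the original listing, and the names `consume_manual` and `consume_with` are made up. It contrasts manual acquire/release with a `with` statement, which releases the lock even when an exception is raised.

```python
import threading

lock = threading.Lock()
buffer = []   # deliberately left empty

def consume_manual():
    lock.acquire()
    num = buffer.pop(0)   # raises IndexError, so release() below never runs
    lock.release()
    return num

def consume_with():
    with lock:            # the lock is released on exit, even when pop() raises
        return buffer.pop(0)

try:
    consume_manual()
except IndexError:
    print("manual acquire/release, lock still held:", lock.locked())
    lock.release()        # clean up by hand so the next call can acquire it

try:
    consume_with()
except IndexError:
    print("with-statement, lock still held:", lock.locked())
```

The `with` form does not fix the empty-buffer problem itself; it only keeps a failed consumer from leaving the lock held.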

# What is the correct behaviour?
When there is nothing in the buffer, the consumer should stop running and wait instead of
trying to consume from the buffer. Once the producer adds something to the buffer, there should be
a way for it to notify the consumer that it has added something. The consumer can then
consume from the buffer again, and the IndexError is never raised.

# About Condition
A Condition object allows one or more threads to wait until they are notified by another thread, as the Python `threading` documentation puts it.

And this is exactly what we want. We want the consumer to wait when the buffer is empty and resume
only when it gets notified by the producer. The producer should notify only after it adds something to
the buffer. So after a notification from the producer, we can be sure that the buffer is not empty and hence no
error can crop up when the consumer consumes.

1. A condition is always associated with a lock.
2. A condition has acquire() and release() methods that call the corresponding methods of the associated lock.

Because the condition's acquire() and release() call the lock’s acquire() and release() internally,
we can replace lock instances with condition instances and our locking behaviour will keep working
properly.
The consumer needs to wait using a condition instance and the producer needs to notify the consumer using
a condition instance too. They must use the same condition instance for the wait and notify
functionality to work properly.
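
A small sketch of the relationship between a condition and its lock follows. It passes an explicit `Lock` to `Condition` purely to make the underlying lock observable. The listing that follows calls `Condition()` with no argument, in which case the condition creates its own lock.

```python
import threading

lock = threading.Lock()
condition = threading.Condition(lock)   # the condition uses this lock internally

condition.acquire()
held_after_acquire = lock.locked()      # the underlying lock is now held

condition.release()
held_after_release = lock.locked()      # and released again

print(held_after_acquire, held_after_release)
```

Acquiring the condition and acquiring its lock are the same operation, which is why the condition can stand in for the lock.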

In [26]:
from threading import Condition,Thread
import random
import time
buffer = []
condition = Condition()
class ConsumerThread(Thread):
    def run(self):
        global buffer
        while True:
            condition.acquire()
            if not buffer:
                print ("Nothing in buffer, consumer is waiting")
                condition.wait()
                print ("Producer added something to buffer and notified the consumer")
            num = buffer.pop(0)
            print ("Consumed", num)
            condition.release()
            time.sleep(random.random())

class ProducerThread(Thread):
    def run(self):
        nums = range(5)
        global buffer
        while True:
            condition.acquire()
            num = random.choice(nums)
            buffer.append(num)
            print ("Produced", num)
            condition.notify()
            condition.release()
            time.sleep(random.random())

ProducerThread().start()
ConsumerThread().start()

Produced 2
Consumed 2
Nothing in buffer, consumer is waiting
Produced 0
Producer added something to buffer and notified the consumer
Consumed 0
Nothing in buffer, consumer is waiting
Produced 0
Producer added something to buffer and notified the consumer
Consumed 0
Produced 2
Consumed 2
Produced 3
Produced 4
Consumed 3
Produced 1
Consumed 4
Produced 3
Consumed 1
Consumed 3
Nothing in buffer, consumer is waiting
Produced 1
Producer added something to buffer and notified the consumer
Consumed 1
Nothing in buffer, consumer is waiting
Produced 1
Producer added something to buffer and notified the consumer
Consumed 1
Produced 3
Consumed 3
Produced 2
Consumed 2
Produced 2
Consumed 2
Nothing in buffer, consumer is waiting
Produced 3
Producer added something to buffer and notified the consumer
Consumed 3
Produced 0
Consumed 0
Nothing in buffer, consumer is waiting
Produced 1
Producer added something to buffer and notified the consumer
Consumed 1
Nothing in buffer, consumer is waiting
Produced 

# Explanation
1. For the consumer, we check whether the buffer is empty before consuming.
2. If it is, we call wait() on the condition instance.
3. wait() blocks the consumer and also releases the lock associated with the condition. This lock was held by the consumer, so the consumer gives up its hold on the lock.
4. Until the consumer is notified, it will not run.
5. The producer can acquire the lock because it was released by the consumer.
6. The producer puts data in the buffer and calls notify() on the condition instance.
7. Once notify() is called on the condition, the consumer wakes up. But waking up doesn’t mean it starts executing: it must first reacquire the lock.
8. notify() does not release the lock. Even after notify(), the lock is still held by the producer.
9. The producer explicitly releases the lock by calling condition.release().
10. The consumer then starts running again. It now finds data in the buffer, so no IndexError is raised.
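
The steps above can also be sketched in a finite form that terminates. This version is a variation, not the listing above. It uses `with condition:` in place of explicit acquire()/release() calls, re-checks the buffer in a `while` loop after every wake-up (a common hardening that matters once there is more than one consumer), and produces a fixed list `ITEMS` of made-up sample values.

```python
import threading

buffer = []
consumed = []
condition = threading.Condition()
ITEMS = [10, 20, 30]   # hypothetical sample data

def producer():
    for item in ITEMS:
        with condition:            # acquire the condition's lock
            buffer.append(item)
            condition.notify()     # wake one waiting thread; the lock is still held here
        # leaving the with-block releases the lock, so the consumer can proceed

def consumer():
    for _ in range(len(ITEMS)):
        with condition:
            while not buffer:      # re-check the buffer after every wake-up
                condition.wait()   # releases the lock while blocked, reacquires it before returning
            consumed.append(buffer.pop(0))

c = threading.Thread(target=consumer)
p = threading.Thread(target=producer)
c.start()
p.start()
p.join()
c.join()

print(consumed)
```

Because the producer appends in order and the consumer always pops from the front, the items are consumed in the order they were produced.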