### Parallelism and Concurrency in Python

Parallelism: Refers to performing multiple tasks at the same instant, for example on separate CPU cores.

Multiprocessing: Refers to distributing tasks over CPU cores.
For CPU-bound tasks, we can use Python's multiprocessing module. We create a Pool object, which offers a convenient means to parallelize the execution of a function across multiple input values.


Reference: https://hackernoon.com/parallelism-and-concurrency-in-python-concept-code-3w75430wo

In [2]:
import multiprocessing
import os
import time
import numpy as np

def DotProduct(A):
    # A is a pair of matrices; the product is computed only to time it, so nothing is returned.
    dot_product = np.dot(A[0],A[1])
    return

List = [[np.arange(1000000).reshape(5000,200),np.arange(1000000).reshape(200,5000)],
        [np.arange(1000000).reshape(500,2000),np.arange(1000000).reshape(2000,500)],
        [np.arange(1000000).reshape(5000,200),np.arange(1000000).reshape(200,5000)]]

if __name__ == "__main__":
    # Without multiprocessing, i.e. everything runs in a single process
    start = time.time()
    B = list(map(DotProduct,List))
    end = time.time() - start
    print("Full time taken: ",end," seconds")
    
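    # With the multiprocessing module on multiple cores.
    # cpu_count() returns the number of logical CPUs, which can exceed the number of physical cores.
    start = time.time()
    pool = multiprocessing.cpu_count()
    print(pool)
    with multiprocessing.Pool(pool) as p:
        # NOTE: imap() returns a lazy iterator immediately. It is only printed here, never consumed,
        # so the timer below does not wait for the dot products to finish.
        print(p.imap(DotProduct,List))
    end = time.time() - start
    print("Full time taken: ",end," seconds")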

Full time taken:  15.325034856796265  seconds
4
<multiprocessing.pool.IMapIterator object at 0x0000027BD7C74408>
Full time taken:  0.31531214714050293  seconds


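Multiprocessing tends to take longer on Windows, where worker processes are started with the spawn method (each worker launches a fresh interpreter and re-imports the main module) rather than fork. The reason is explained in more detail here: https://stackoverflow.com/questions/52465237/multiprocessing-slower-than-serial-processing-in-windows-but-not-in-linux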

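In the cell above, p.imap appears to run much faster than p.map would, but the comparison is misleading. imap() is a lazier version of map(): it returns an iterator immediately, whereas map() blocks until all results are ready. Because the iterator is only printed and never consumed, the 0.3 s reading covers pool start-up and task submission, not the dot products themselves. For a fair comparison the results must be consumed inside the timed region, for example with list(p.imap(DotProduct, List)).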
Read more here: https://docs.python.org/3/library/multiprocessing.html?highlight=process#the-spawn-and-forkserver-start-methods
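
To compare map() and imap() on equal terms, both sets of results have to be consumed before the timer stops. The following is a minimal sketch, assuming a hypothetical CPU-bound helper `slow_square` and a pool of 2 workers (neither comes from the original cell); actual timings will depend on the machine and operating system.

```python
import multiprocessing
import time


def slow_square(n):
    """Hypothetical stand-in for a CPU-bound task (not from the original cell)."""
    total = 0
    for _ in range(200_000):
        total += n * n  # busy work so the call takes measurable time
    return n * n


def run_map(pool, values):
    # map() blocks until every result is ready and returns a list.
    return pool.map(slow_square, values)


def run_imap(pool, values):
    # imap() returns a lazy iterator at once; list() forces every result to be computed.
    return list(pool.imap(slow_square, values))


if __name__ == "__main__":
    values = list(range(8))
    with multiprocessing.Pool(2) as pool:
        for label, runner in (("map", run_map), ("imap", run_imap)):
            start = time.perf_counter()
            results = runner(pool, values)
            elapsed = time.perf_counter() - start
            print(label, results, f"{elapsed:.3f} s")
```

Once the iterator is consumed, both calls do the same amount of work, so any remaining difference comes from how results are collected rather than from skipped computation.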

There are other libraries, such as Ray (https://github.com/ray-project/ray), that provide other efficient ways of running work in parallel.


Concurrency: Refers to making progress on multiple tasks during overlapping time periods. The tasks may be interleaved in any order and do not have to run at the same instant.

Multithreading: Running multiple threads within a single process. Multithreading works well for IO-bound tasks (for example, sending multiple requests to servers concurrently). Threads share the PID (process ID) of the process that created them; each thread has its own thread identifier and is launched with its start() method. Call the thread's join() method when the following lines of code should run only after the thread finishes its job. In standard CPython builds, the GIL (global interpreter lock) lets only one thread execute Python bytecode at a time, so threads rarely speed up CPU-bound code and timings can vary a lot between runs.
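
As a rough illustration of threads on IO-bound work, the sketch below starts several threads, waits for them with join(), and records the PID and thread identifier each one sees. The helper names `fake_request` and `run_threads` are hypothetical, and `time.sleep` is only an assumed stand-in for waiting on a server.

```python
import os
import threading
import time


def fake_request(name, results):
    """Simulate an IO-bound call; time.sleep stands in for waiting on a server."""
    time.sleep(0.2)
    # Record the process ID and this thread's identifier.
    results[name] = (os.getpid(), threading.get_ident())


def run_threads(n):
    results = {}
    threads = [
        threading.Thread(target=fake_request, args=(f"req-{i}", results))
        for i in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread has finished
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_threads(5)
    pids = {pid for pid, _ in results.values()}
    idents = {ident for _, ident in results.values()}
    print("distinct PIDs:", len(pids))
    print("distinct thread identifiers:", len(idents))
    print(f"elapsed: {elapsed:.2f} s")
```

Because the threads spend their time sleeping rather than executing Python bytecode, the waits overlap and the total should stay well below the sum of the individual delays.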

Async IO: a single-threaded, single-process design paradigm that achieves concurrency through cooperative multitasking (more details to follow).
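
A minimal sketch of the idea using the standard asyncio module: three simulated IO waits run concurrently on one thread. The coroutine name `fake_io` and the 0.2 s delays are assumptions chosen for illustration, with `asyncio.sleep` standing in for a real network wait.

```python
import asyncio
import time


async def fake_io(name, delay):
    # asyncio.sleep hands control back to the event loop while "waiting".
    await asyncio.sleep(delay)
    return name


async def main():
    start = time.perf_counter()
    # gather() runs the coroutines concurrently and returns results in input order.
    results = await asyncio.gather(
        fake_io("a", 0.2), fake_io("b", 0.2), fake_io("c", 0.2)
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f} s")
```

No extra threads or processes are created here; the event loop switches between coroutines whenever one of them awaits.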

Note: A program that runs in parallel is also concurrent, but the reverse is not true.

In [5]:
import threading
import os
import time
import numpy as np

def BasicOperation():
    #square of a number
    def square(number):
        return number*number
    #cube of a number
    def cube(number):
        return number**3
    #nth power of a number
    def nth_power(number,power):
        return number**power
    #sum of the first n natural numbers
    def sum_of_n_numbers(number):
        return number*(number+1)/2
    
    #using functions to drive a program
    
    print("square of 5 is " , square(5))
    print("cube of 5 is " , cube(5))
    print("5 raise to power 2 is " , nth_power(5,2))
    print("sum of first 5 numbers is" , sum_of_n_numbers(5))
    
def DotProduct():
    A = np.arange(1000000).reshape(5000,200)
    B = np.arange(1000000).reshape(200,5000)
    Dot = np.dot(A,B)
    

if __name__ == "__main__":
    
    #without threading
    start = time.time()
    BasicOperation()
    Mid = time.time() - start
    print("Mid time taken: ", Mid ," seconds")
    DotProduct()
    end = time.time() - start
    print("Full time taken: ", end ," seconds")
    
    #with threading
    start = time.time()
    Thread_1 = threading.Thread(target = BasicOperation, name = ' Basic Operation Thread ')
    Thread_2 = threading.Thread(target = DotProduct, name = ' Dot Product Thread')
    Thread_1.start()
    Thread_2.start()
    Thread_1.join()
    Mid = time.time() - start
    print("Mid time taken: ", Mid ," seconds")
    Thread_2.join()
    end = time.time() - start
    print("Full time taken: ", end ," seconds")

square of 5 is  25
cube of 5 is  125
5 raise to power 2 is  25
sum of first 5 numbers is 15.0
Mid time taken:  0.000993967056274414  seconds
Full time taken:  7.3741419315338135  seconds
square of 5 is  25
cube of 5 is  125
5 raise to power 2 is  25
sum of first 5 numbers is 15.0
Mid time taken:  0.003990888595581055  seconds
Full time taken:  7.333523988723755  seconds


TODO:
https://stackoverflow.com/questions/4844637/what-is-the-difference-between-concurrency-parallelism-and-asynchronous-methods

https://medium.com/swift-india/concurrency-parallelism-threads-processes-async-and-sync-related-39fd951bc61d

Async IO : https://realpython.com/async-io-python/
https://realpython.com/python-concurrency/
