# Async Python: The Different Forms of Concurrency

### Quick Recap
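Here is what we have established so far:

* Sync: Blocking operations. Each call has to finish before the next one starts.
* Async: Non-blocking operations. A call can be started and the program can move on to other work while it waits.
* Concurrency: Making progress on several tasks over the same period, possibly by interleaving them.
* Parallelism: Executing several tasks at the same instant, for example on multiple cores.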
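
To make the difference between blocking and non-blocking calls concrete, here is a small sketch. It is not part of the original notebook: the helper names blocking_tasks and non_blocking_tasks and the 0.1 second delay are assumptions chosen for illustration. The first helper sleeps three times in a row, the second lets the three sleeps overlap on a single thread.

```python
import asyncio
import time

DELAY = 0.1  # illustrative delay in seconds (an assumption, not from the notebook)


def blocking_tasks(count):
    """Run `count` blocking sleeps one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(count):
        time.sleep(DELAY)  # nothing else can run while this call blocks
    return time.perf_counter() - start


async def _non_blocking(count):
    # asyncio.sleep hands control back to the event loop, so the sleeps overlap
    await asyncio.gather(*(asyncio.sleep(DELAY) for _ in range(count)))


def non_blocking_tasks(count):
    """Run `count` non-blocking sleeps concurrently and return the elapsed time."""
    start = time.perf_counter()
    asyncio.run(_non_blocking(count))
    return time.perf_counter() - start


if __name__ == "__main__":
    print("blocking:     {:.2f}s".format(blocking_tasks(3)))
    print("non-blocking: {:.2f}s".format(non_blocking_tasks(3)))
```

Run as a script, the blocking version should take roughly three times the delay and the non-blocking version roughly one. Inside Jupyter, which already runs an event loop, asyncio.run raises an error, so there you would await _non_blocking(3) directly in a cell.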

## Threads

Python has had threads for a very long time. Threads allow us to run our operations concurrently. In CPython, however, the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time, so threading alone cannot provide true parallelism for CPU-bound code. The multiprocessing module works around this by running separate processes, which makes it possible to leverage multiple cores with Python.

In [2]:
import threading
import time
import random


def worker(number):
    sleep = random.randrange(1, 10)
    time.sleep(sleep)
    print("I am Worker {}, I slept for {} seconds".format(number, sleep))


for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    t.start()

print("All Threads are queued, let's see when they finish!")

All Threads are queued, let's see when they finish!
I am Worker 0, I slept for 2 seconds
I am Worker 3, I slept for 3 seconds
I am Worker 1, I slept for 7 seconds
I am Worker 2, I slept for 9 seconds
I am Worker 4, I slept for 9 seconds
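
The loop above starts the threads and moves straight on, which is why the final message is printed before any worker reports back. If we want to wait for the workers, we can call join() on each thread. A minimal sketch, assuming shortened sleeps and a hypothetical run_workers helper that collects the worker numbers so the outcome can be checked:

```python
import threading
import time


def worker(number, delay, results, lock):
    time.sleep(delay)            # stands in for a blocking operation
    with lock:                   # guard the shared list while appending
        results.append(number)


def run_workers(count, delay=0.05):
    """Start `count` threads, wait for all of them, return the collected numbers."""
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(i, delay, results, lock))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # block until this thread has finished
    return sorted(results)


if __name__ == "__main__":
    print(run_workers(5))
    print("All threads have finished")
```

Here the final message is printed only after every thread has completed, because join() does not return until its thread is done.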


## Global Interpreter Lock (GIL)

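* Only one thread can execute Python bytecode at a time.
* The Python interpreter switches between threads to allow concurrency.
* The GIL is specific to CPython (the de facto reference implementation). Other implementations such as Jython and IronPython do not have a GIL.
* The GIL keeps single-threaded programs fast, since the interpreter does not need fine-grained locks around every object.
* For I/O-bound operations the GIL usually does little harm, because a thread releases it while waiting on I/O.
* The GIL makes it easy to integrate C libraries that are not thread-safe. Thanks to the GIL, we have many high-performance extensions/modules written in C.
* For CPU-bound tasks, the interpreter periodically makes the running thread give up the GIL, so one thread does not block the others indefinitely. Python 2 checked every N ticks (bytecode instructions); since Python 3.2 the switch is based on a time interval (5 ms by default, see sys.getswitchinterval()).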
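
A rough way to observe the points above is to time a CPU-bound function run twice in a row and then in two threads. This is only a sketch: count_down and the workload size are illustrative assumptions, the timings depend on the machine, and free-threaded CPython builds behave differently.

```python
import sys
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; the running thread holds the GIL while it executes."""
    while n > 0:
        n -= 1
    return n


def timed_sequential(n):
    """Run the countdown twice, one after the other."""
    start = time.perf_counter()
    count_down(n)
    count_down(n)
    return time.perf_counter() - start


def timed_threaded(n):
    """Run the countdown in two threads at once."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # illustrative workload size
    # Interval (in seconds) after which a running thread is asked to release the GIL
    print("switch interval:", sys.getswitchinterval())
    print("sequential: {:.3f}s".format(timed_sequential(n)))
    print("threaded:   {:.3f}s".format(timed_threaded(n)))
```

On a standard CPython build the two timings are usually close to each other, since the two threads take turns instead of running in parallel.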


## Processes

In [3]:
import multiprocessing
import time
import random


def worker(number):
    sleep = random.randrange(1, 10)
    time.sleep(sleep)
    print("I am Worker {}, I slept for {} seconds".format(number, sleep))


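if __name__ == "__main__":
    # The guard keeps child processes from re-running this block on
    # platforms that start them by re-importing the module (spawn).
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()

    print("All Processes are queued, let's see when they finish!")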

All Processes are queued, let's see when they finish!
I am Worker 4, I slept for 2 seconds
I am Worker 2, I slept for 6 seconds
I am Worker 3, I slept for 7 seconds
I am Worker 1, I slept for 8 seconds
I am Worker 0, I slept for 8 seconds


With the Pool class, we can also distribute the execution of one function across multiple processes, one call per input value. Taking the example from the official docs:

In [8]:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    # The with-block terminates the pool's worker processes on exit
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

[1, 4, 9]


Here, instead of iterating over the list of values and calling f on them one by one, we hand the work to a pool of worker processes. The pool distributes the inputs among its workers, so f(1), f(2) and f(3) can run in different processes at the same time. Finally, the results are collected into a list in the same order as the inputs. This allows us to break down heavy computations into smaller parts and run them in parallel for faster calculation.
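
As a sketch of breaking a heavier computation into parts, the example below splits a sum of squares into chunks and lets the pool work on one chunk per task. The helper names sum_of_squares and make_chunks, the chunk count and the range size are assumptions made for illustration; they are not from the official docs.

```python
from multiprocessing import Pool


def sum_of_squares(bounds):
    """Sum x*x for x in [start, stop); one independent piece of the work."""
    start, stop = bounds
    return sum(x * x for x in range(start, stop))


def make_chunks(n, parts):
    """Split range(n) into at most `parts` contiguous (start, stop) pairs (n > 0)."""
    step = -(-n // parts)  # ceiling division
    return [(i, min(i + step, n)) for i in range(0, n, step)]


if __name__ == "__main__":
    chunks = make_chunks(1_000_000, 4)
    with Pool(4) as pool:
        # Each chunk is sent to a worker process; results come back in input order
        partial_sums = pool.map(sum_of_squares, chunks)
    print(sum(partial_sums))
```

Whether this is actually faster than a plain loop depends on how expensive each chunk is compared to the cost of starting processes and passing data between them.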

## The concurrent.futures module

The concurrent.futures module packs some really great stuff for writing asynchronous code easily. My favorites are the ThreadPoolExecutor and the ProcessPoolExecutor. These executors maintain a pool of threads or processes. We submit our tasks to the pool and it runs each task in an available thread or process. Submitting a task returns a Future object, which we can use to check whether the task has completed and to get its result.

In [9]:
from concurrent.futures import ThreadPoolExecutor
from time import sleep
 
def return_after_5_secs(message):
    sleep(5)
    return message
 
pool = ThreadPoolExecutor(3)
 
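# Pass the argument directly; ("hello") is just a string, not a tuple
future = pool.submit(return_after_5_secs, "hello")
print(future.done())    # False: the task has only just started
sleep(5)
print(future.done())    # may still be False: the task needs slightly more than 5 seconds
print(future.result())  # blocks until the result is available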

False
False
hello
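
When several tasks are submitted, we usually do not want to poll done() by hand. A minimal sketch using as_completed and the executor as a context manager follows; the return_after helper, the job list and the delays are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep


def return_after(delay, message):
    sleep(delay)
    return message


def collect_in_completion_order(jobs):
    """Submit (delay, message) jobs and return the messages as they finish."""
    finished = []
    # Leaving the with-block waits for pending tasks and shuts the pool down
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(return_after, delay, msg) for delay, msg in jobs]
        for future in as_completed(futures):  # yields futures as they complete
            finished.append(future.result())
    return finished


if __name__ == "__main__":
    jobs = [(0.3, "slow"), (0.1, "fast"), (0.2, "medium")]
    print(collect_in_completion_order(jobs))
```

Since all three jobs fit in the pool at once, the messages should come back ordered by delay rather than by submission order.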


## Asyncio - Why, What and How?

In [10]:
import asyncio
import datetime
import random


async def my_sleep_func():
    await asyncio.sleep(random.randint(0, 5))


async def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print("Loop: {} Time: {}".format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        await my_sleep_func()


loop = asyncio.get_event_loop()

asyncio.ensure_future(display_date(1, loop))
asyncio.ensure_future(display_date(2, loop))

loop.run_forever()

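RuntimeError: This event loop is already running

(Jupyter already runs an event loop, so loop.run_forever() raises this error inside a notebook. The two scheduled tasks still run on the notebook's loop and produce the output below. As a plain script, the call simply starts the loop.)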

Loop: 1 Time: 2019-01-09 15:07:05.601780
Loop: 2 Time: 2019-01-09 15:07:05.601923
Loop: 2 Time: 2019-01-09 15:07:06.603976
Loop: 2 Time: 2019-01-09 15:07:08.606170
Loop: 1 Time: 2019-01-09 15:07:09.603845
Loop: 2 Time: 2019-01-09 15:07:12.608543
Loop: 1 Time: 2019-01-09 15:07:14.607422
Loop: 2 Time: 2019-01-09 15:07:14.610369
Loop: 1 Time: 2019-01-09 15:07:16.611080
Loop: 1 Time: 2019-01-09 15:07:17.612879
Loop: 2 Time: 2019-01-09 15:07:18.612759
Loop: 2 Time: 2019-01-09 15:07:21.615997
Loop: 1 Time: 2019-01-09 15:07:21.616366
Loop: 1 Time: 2019-01-09 15:07:25.620888
Loop: 2 Time: 2019-01-09 15:07:26.617913
Loop: 1 Time: 2019-01-09 15:07:27.623849
Loop: 2 Time: 2019-01-09 15:07:29.621666
Loop: 1 Time: 2019-01-09 15:07:31.627451
Loop: 2 Time: 2019-01-09 15:07:34.626050
Loop: 1 Time: 2019-01-09 15:07:34.628731
Loop: 2 Time: 2019-01-09 15:07:35.628248
Loop: 1 Time: 2019-01-09 15:07:35.629765
Loop: 1 Time: 2019-01-09 15:07:38.633137
Loop: 2 Time: 2019-01-09 15:07:39.630841
Loop: 1 Time: 20

* We have an async function, display_date, which takes a number (as an identifier) and the event loop as parameters.
* The function loops until roughly 50 seconds have passed. During this period, it repeatedly prints out the time and takes a nap. The await keyword suspends the coroutine until another awaitable (here the my_sleep_func coroutine) completes, and the event loop is free to run other tasks in the meantime.
* We schedule each coroutine on the event loop with asyncio.ensure_future, which wraps it in a Task.
* We start running the event loop.
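
On Python 3.7 and later, the same idea can be written without handling the loop object directly. The sketch below is an assumed variant, not the notebook's code: ticker and main are hypothetical names, the loop runs a fixed number of iterations instead of 50 seconds, and the steps are recorded in a list so the interleaving can be inspected.

```python
import asyncio
import random


async def my_sleep_func(max_delay):
    await asyncio.sleep(random.uniform(0, max_delay))


async def ticker(num, iterations, max_delay, log):
    for step in range(iterations):
        log.append((num, step))         # record which task ran at this point
        await my_sleep_func(max_delay)  # hand control back to the event loop


async def main(iterations=3, max_delay=0.05):
    log = []
    # gather schedules both coroutines and waits until both have finished
    await asyncio.gather(
        ticker(1, iterations, max_delay, log),
        ticker(2, iterations, max_delay, log),
    )
    return log


if __name__ == "__main__":
    # asyncio.run creates the event loop, runs main() and closes the loop
    for num, step in asyncio.run(main()):
        print("Loop: {} Step: {}".format(num, step))
```

Unlike run_forever, asyncio.run returns once main() has finished. In a Jupyter cell, where a loop is already running, you would write await main() instead.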

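### Rule of thumb:

In [None]:
# choose_concurrency_model is an illustrative name, not a library function
def choose_concurrency_model(io_bound, io_very_slow=False):
    """Return the suggested concurrency model for a workload."""
    if io_bound:
        if io_very_slow:
            return "asyncio"
        else:
            return "threads"
    else:
        return "multiprocessing"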

* CPU Bound => Multi Processing
* I/O Bound, Fast I/O, Limited Number of Connections => Multi Threading
* I/O Bound, Slow I/O, Many connections => Asyncio