# Intro
Author: Sam Sinayoko

Date: 2020-06-20


A 2 part tour of asyncio. The first part is a quick overview of how asyncio helps solve IO bound problems using asynchronous IO via the async/await API. Uses 
- `async def`
- `await`
- `asyncio.wait`
- `loop.get_event_loop`

I'm a fan of `concurrent.futures` pool executors, and was wondering how to run a pool of tasks with asyncio. I'm showing how to do this with `asyncio.wait` and I roll my own little async AsyncPoolExecutor. I also look the built-in `loop.run_in_executor`, which let's you do essentially the same but for blocking functions.

In part 2, I dive under the covers and work through the asyncio jargon like `Futures`, `Task`, `Coroutines`, and how they relate to generators and `yield from`. 

# Part1: Asyncio Quick Tour 

A Quick Tour of Asyncio of async await and how it provides an alternative way to to deal with concurrency compared to threads. 

## The non-asyncio approach using threads
Let's speedup a simple IO bound problem using threads. 

In [None]:
import time
import asyncio
from concurrent.futures import ThreadPoolExecutor

In [2]:
from contextlib import contextmanager

In [3]:
def slow(x):
    print(x)
    time.sleep(0.05)
    return x

def main_synchronous():
    res = []
    for i in range(10):
        res.append(slow(i))
    return res

In [4]:
%time main_synchronous()

0
1
2
3
4
5
6
7
8
9
CPU times: user 5.53 ms, sys: 3.07 ms, total: 8.6 ms
Wall time: 530 ms


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [5]:
def main_threads():
    with ThreadPoolExecutor(10) as pool:
        res = list(pool.map(slow, range(10)))
    return res

In [6]:
%time main_threads()

0
1
23

4
5
6
78

9
CPU times: user 6.39 ms, sys: 4.99 ms, total: 11.4 ms
Wall time: 57.4 ms


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

## The asyncio approach
Let's solve the same problem using async/await. The main idea is to define asyncio compatible asynchronous functions using `async def`, where such functions must at some point `await` from another asynchronous function. The simplest example of asynchronous function is probably `asyncio.sleep`, so let's use that. 
To call an asyncronous function, one must also `await` it.

The asyncio compatible asynchronous functions are called coroutines. 

Outside of Jupyter, call `loop = asyncio.get_event_loop; loop.run_until_complete(coroutine)` to trigger the execution. 

### Full on concurrency with `asyncio.wait`. 
We put the coroutines we want to execute concurrently in a list, then await `asyncio.wait` to fire all coroutines. 

In [7]:
async def slow_async(x):
    print(x)
    import random
    s = random.random()
    await asyncio.sleep(s)
    return x

async def main_async_nolimit():
    """Fire all tasks at once"""
    t0 = time.perf_counter()
    done, pending = await asyncio.wait([slow_async(i) for i in range(10)], timeout=5)
    res = [task.result() for task in done]
    t1 = time.perf_counter()
    print(t1 - t0)
    return res

In [8]:
await main_async_nolimit()

5
3
1
4
6
0
2
9
8
7
0.9748185430653393


[9, 1, 6, 4, 8, 0, 5, 7, 3, 2]

The downside of the above approach is that it gives little control over the amount of concurrency: all concurrent tasks at fired at once. Let's see how we can control that better. 

### Controlled concurrency with a pool of coroutines
We can limit the number of concurrent tasks by hand via the `return_when='FIRST_COMPLETED'` parameter of `asyncio.wait`. We'll set up a fixed size queue of coroutines and will only push extra coroutines into that queue as it starts emptying. 

In [9]:
async def main_async_pool1(max_workers=None, N=10):
    """Limits the number of concurerent tasks"""
    
    if max_workers is None:
        max_workers = N
    t0 = time.perf_counter()
    tasks = [slow_async(i) for i in range(10)]
    pending = set()
    all_done = set()
    while tasks:
        #print('len(tasks) = {}, len(queue) = {}'.format(len(tasks), len(queue)))
        while True:
            if tasks and len(pending) < max_workers:
                #print('pushing 1 tasks')
                pending.add(tasks.pop())
            else:
                break
        if pending:
            done, pending = await asyncio.wait(pending, return_when='FIRST_COMPLETED')  
            #return {'done': done, 'pending': pending}
            all_done.update(done)
    res = [task.result() for task in all_done]
    t1 = time.perf_counter()
    print(t1 - t0)
    return res

await main_async_pool1(3)

7
8
9
6
5
4
3
2
1
0
1.707174857147038


[0, 8, 5, 9, 7, 2, 4, 6]

**better version using contextmanager**

We can wrap the above into an asynchronous Pool executor via a context manager similar to what's available in the concurrent.futures package, complete with a `map` method to run a function over an iterable asynchronously. 


In [10]:
class AsyncPoolExecutor:
    """Setting up an async Pool of executors with asyncio with an API similar to concurrent.futures"""
    
    def __init__(self, max_workers=None):
        self.max_workers = max_workers
        
    def __enter__(self):
        return self 
    
    def __exit__(self, *args, **kwargs):
        pass 
    
    async def map(self, foo, iterable):
        pending = set()
        finished_tasks = set()
        tasks = [foo(i) for i in iterable]
        while tasks or pending:
            #print('len(tasks) = {}, len(pending) = {}'.format(len(tasks), len(pending)))
            while True:
                if tasks and len(pending) < self.max_workers:
                    #print('pushing 1 tasks')
                    pending.add(tasks.pop())
                else:
                    break
            done, pending = await asyncio.wait(pending, return_when='FIRST_COMPLETED')  
            #return {'done': done, 'pending': pending}
            finished_tasks.update(done)
        #print('finished_tasks = ', finished_tasks)
        return [task.result() for task in finished_tasks]
    
    
async def main_async_pool2(max_workers=None, N=5):
    """Fire all tasks at once"""
    
    if max_workers is None:
        max_workers = N
    t0 = time.perf_counter()
    res = []
    with AsyncPoolExecutor(max_workers=max_workers) as pool:
        res = await pool.map(slow_async, range(N))
    t1 = time.perf_counter()
    print(t1 - t0)
    return res

In [11]:
await main_async_pool2()

3
4
1
0
2
0.9532711869105697


[4, 3, 0, 1, 2]

In [12]:
loop = asyncio.get_event_loop()

### Controlled concurrency with a pool of synchronous functions
Sometimes you don't control the code you'd like to use with asyncio. For example boto3 is synchronous, and you may not want to use some of the functionality without having to rewrite the whole library. For such cases, you can combine `asyncio` with a PoolExecutor from `concurrent.futures` to turn a blocking function into an ansynchronous one that can be awaited and is compatible with asyncio. 

The trick is to use`loop.run_in_executor`.

In [13]:
async def main_async_run_in_executor(max_workers=None, N=10):
    """This uses a pool of executors and a blocking function! Nice to wrap existing code"""
    t0 = time.perf_counter()
    res = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        loop = asyncio.get_event_loop()
        # loop.run_in_executor runs blocking code is separate threads/processes to make it async
        # note that ThreadPoolExecutor(max_workers=None) is actually the default, so we could set `pool` below 
        # to None and get that behaviour. 
        tasks = [loop.run_in_executor(pool, slow, i) for i in range(N)]
        done, pending = await asyncio.wait(tasks)
        res = [task.result() for task in done]
    t1 = time.perf_counter()
    print(t1 - t0)
    return res

In [14]:
await main_async_run_in_executor()

0
1
2
3
4
56

7
89

0.060840144054964185


[9, 0, 5, 4, 6, 1, 7, 2, 8, 3]

# Part 2: AsyncIO Under the Cover
This is gonna explain some of the asyncio jargon, like Coroutines, Tasks and Futures, starting from Generators.

## Coroutine as generators
Like a blank space representing a value that will be populated asynchronously at some point in the future. That blank space is the return value of the *coroutine function*. Somewhere in the definition of a *coroutine function*, there will be a point where the execution is suspended asynchronously while we wait for the completion of another coroutine. 

This mechanism is implemented via a the `yield from <some_generator>` operator, which suspends the execution until `<some_generator>` returns a value. 

In [15]:
def i_am_a_coroutine():
    """Must yield form another coroutine.
    
    Just using asyncio.sleep for now!
    """
    yield  
    return 'Coroutine says hi!'

In [16]:
def coroutine_as_generator(coroutine_function): 
    """Turn coroutine into generator: yield None while result isn't ready yet, else yield the result"""
    coroutine = coroutine_function()
    future = asyncio.ensure_future(coroutine)
    while True:
        if not future.done():
            time.sleep(0.1)
            # yielding nothing if future isn't completed
            yield None
        else:
            # otherwise yielding the result
            yield future.result()
            break

In [17]:
g = coroutine_as_generator(i_am_a_coroutine)

In [18]:
try:
    res = next(g)  # keep executing this by hand until you get a result (or Stop Iteration)
    if res is None:
        print('Result not ready yet. Please try again.')
except StopIteration:
    pass
if res is not None:
    print(res)

Result not ready yet. Please try again.


## Asyncio Coroutines
Coroutines that are compatible with asyncio must yield from asynchronous objects like futures, tasks or other coroutines. 
Let's see how this works using the `asyncio.sleep` coroutine, which sleeps for 2 seconds asynchronously. 

Here we still use the old `yield from` like before to await the result from `asyncio.sleep`. We can start by using the same "coroutine_as_generator" function to run our coroutine.
This demonstrates that there's no need for any additional magic to run asyncio coroutines. They're basically just generators. 

In [19]:
def coroutine():
    print('Waiting 2 seconds')
    yield from asyncio.sleep(2)
    return 1

In [20]:
g = coroutine_as_generator(coroutine)

In [21]:
# keep executing this cell by hand until you get a result 
try:
    res = next(g)  
    if res is None:
        print('Result not ready yet. Please try again.')
except StopIteration:
    pass
if res is not None:
    print(res)

Result not ready yet. Please try again.


## Awaiting coroutines
If we use the `asyncio.coroutine` decorator we can now await our coroutine. This automates the repeated calls to `next()` while we wait for the coroutine to complete its execution. 

In [22]:
awaitable_coroutine_function = asyncio.coroutine(coroutine)
await awaitable_coroutine_function()

Waiting 2 seconds
Waiting 2 seconds


1

So we could have written our coroutine as 

In [23]:
@asyncio.coroutine
def coroutine_func():
    print('Waiting 2 seconds')
    yield from asyncio.sleep(2)
    return 1

await coroutine_func()

Waiting 2 seconds


1

## Async sugar
Using the `@asyncio.coroutine` decorator is the old style way of defining coroutines. The new way is to use the `async def` keyword to define the coroutine function. The only difference with the previous example is we can no longer use `yield from`: `yield form` must be replaced by `await`. So we now have

In [24]:
async def coroutine_func():
    print('Waiting 2 seconds')
    await asyncio.sleep(2)
    return 1

await coroutine_func()

Waiting 2 seconds


1

## Coroutine vs Coroutine Function

Confusingly, calling a coroutine (function) defined via `async_def` & `await` (or `asyncio.coroutine` and `yield from`) returns a coroutine. This is similar to generators, where the term generator is used both to denote the `generator function`, and the output of the generator function. 

In [25]:
coroutine_func()

<coroutine object async-def-wrapper.<locals>.coroutine_func at 0x10c4ac6d0>

[From Python 3.6 documentation on coroutines](https://docs.python.org/3.6/library/asyncio-task.html?highlight=coroutine#coroutines)
> The word “coroutine”, like the word “generator”, is used for two different (though related) concepts:
>  - The function that defines a coroutine (a function definition using async def or decorated with @asyncio.coroutine). If disambiguation is needed we will call this a coroutine function (iscoroutinefunction() returns True).
>  - The object obtained by calling a coroutine function. This object represents a computation or an I/O operation (usually a combination) that will complete eventually. If disambiguation is needed we will call it a coroutine object (iscoroutine() returns True).


## Futures

An object "encapsulating the asynchronous execution of a callable". The `asyncio.Future` class is very similar to the `concurrent.futures.Future` class. Makes it possible to represent an asynchronous execution and to pass it around. It's largely the same as a coroutine (not a *coroutine function*). The main difference is that it has a few more methods allowing to control the execution (if it's not completed yet):
- futures can be cancelled by calling the `cancel` method
- their status can be monitored via the `cancelled`, `running` and `done` methods

In [26]:
async def my_name_is(future):
    await asyncio.sleep(1)
    print(f'(Is future {future} cancelled? {future.cancelled()})')
    print(f'(Is future {future} done? {future.done()})')
    future.set_result('Sam')

async def whats_your_name():
    print("What's your name mate?")
    future = asyncio.Future()
    coroutine = my_name_is(future)
    await coroutine
    print(f'(Is future {future} done? {future.done()})')
    print(f"My name is {future.result()}")
    


In [27]:
await whats_your_name()

What's your name mate?
(Is future <Future pending> cancelled? False)
(Is future <Future pending> done? False)
(Is future <Future finished result='Sam'> done? True)
My name is Sam


## Tasks
[From the docs](https://docs.python.org/3/library/asyncio-task.html#asyncio.Task):

> A Future-like object that runs a Python coroutine. Not thread-safe.
>
> Tasks are used to run coroutines in event loops. If a coroutine awaits on a Future, the Task suspends the execution of the coroutine and waits for the completion of the Future. When the Future is done, the execution of the wrapped coroutine resumes.
>
> Event loops use cooperative scheduling: an event loop runs one Task at a time. While a Task awaits for the completion of a Future, the event loop runs other Tasks, callbacks, or performs IO operations.

So this provides an alternative way to run a coroutine. The advantage over awaiting a coroutine is that tasks can be cancelled, or monitored just like futures. The difference with a future is that task is scheduled straight away. 

In [28]:
coroutine = whats_your_name()

In [29]:
task = loop.create_task(coroutine) # create and schedule a task
print(task)

<Task pending coro=<whats_your_name() running at <ipython-input-26-271ca7af892c>:7>>


In [30]:
await asyncio.sleep(1)

What's your name mate?
(Is future <Future pending> cancelled? False)
(Is future <Future pending> done? False)
(Is future <Future finished result='Sam'> done? True)
My name is Sam


In [31]:
task

<Task finished coro=<whats_your_name() done, defined at <ipython-input-26-271ca7af892c>:7> result=None>

**asyncio.ensure_future vs loop.create_task**

We can also create a task via `asyncio.ensure_future`. The main different is `ensure_future` can take a coroutine or a future as an input and will return a task, whereas `loop.create_task` can only take a coroutine. If you know you're getting a coroutine, use `create_task`, otherwise use `ensure_future`. 

In [32]:
task = asyncio.ensure_future(whats_your_name())
print(task)

<Task pending coro=<whats_your_name() running at <ipython-input-26-271ca7af892c>:7>>


In [33]:
task

<Task pending coro=<whats_your_name() running at <ipython-input-26-271ca7af892c>:7>>

What's your name mate?
(Is future <Future pending> cancelled? False)
(Is future <Future pending> done? False)
(Is future <Future finished result='Sam'> done? True)
My name is Sam


# References

- Getting started with Asyncio (8mn, 30k views) https://www.youtube.com/watch?v=L3RyxVOLjz8
- Nice and simple 15mn intro (views): https://www.youtube.com/watch?v=tSLDcRkgTsY. Part 1 is based on this talk. 
- Another short intro (30mn, 71k views): https://www.youtube.com/watch?v=BI0asZuqFXM. 
- The mind bending "Concurrency from the Ground Up" talk by the Amazing David Beazley https://www.youtube.com/watch?v=MCs5OvhV9S4. Part 2 is partly based on this. 