# A short guide to asynchronous programming with asyncio
by Karol Horosin

# You will learn
* What asynchronous programming is
* The basics of the asyncio library
* How to implement a simple async program that makes HTTP requests, and how it compares to a synchronous version

# What is asyncio
* In its current form, part of the Python standard library since 3.5 (the module itself was added in 3.4)
* Provides an event loop that executes asynchronous functions (coroutines) written with the `async` and `await` Python keywords
* Has a shallow learning curve and needs minimal boilerplate

# When to use asyncio

* In I/O-heavy applications
* If you don't want to deal with the complexities of parallel code
* Multithreading and multiprocessing are resource heavy: the default(!) stack size of a new thread on Linux is typically at least 2 MB, depending on architecture and resource limits ([source](http://man7.org/linux/man-pages/man3/pthread_create.3.html))
* Asynchronous execution is another way to optimize, and it works well together with multiprocessing
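
To make the resource argument concrete, here is a minimal sketch that runs thousands of concurrent waits on a single thread. The names `wait_briefly` and `run_many` are made up for this illustration, and `asyncio.run` assumes Python 3.7 or later.

```python
import asyncio
import time


async def wait_briefly(i):
    # Hypothetical task: asyncio.sleep stands in for an I/O wait
    await asyncio.sleep(0.1)
    return i


async def run_many(n):
    # All n coroutines wait concurrently on one thread,
    # so no per-task thread stack is needed
    return await asyncio.gather(*(wait_briefly(i) for i in range(n)))


if __name__ == "__main__":
    start = time.time()
    results = asyncio.run(run_many(10000))
    print("{} tasks finished in {}s".format(len(results), time.time() - start))
```

Starting the same number of OS threads would require stack space to be reserved for every one of them.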


# Useful concepts

An **event loop** handles the execution of registered tasks. It is responsible for the flow of control.

**Coroutines** are functions that give control back to the loop while waiting for some action to happen. A coroutine needs to be scheduled in the loop.
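
A small sketch of both concepts, assuming Python 3.7+ for `asyncio.run`; `greet` and `demo` are hypothetical names. Calling a coroutine function only creates a coroutine object, and nothing runs until the event loop executes it.

```python
import asyncio


async def greet(name):
    # await hands control back to the event loop until the sleep is over
    await asyncio.sleep(0.01)
    return "Hello, {}".format(name)


def demo():
    coro = greet("asyncio")              # creates a coroutine object, runs nothing yet
    created = asyncio.iscoroutine(coro)  # True: it still has to be scheduled
    result = asyncio.run(coro)           # the loop executes the coroutine to completion
    return created, result


if __name__ == "__main__":
    print(demo())
```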

# Execution of synchronous and asynchronous programs
![slide](images/01-sync-vs-async.jpg)

![slide](images/02-loop-1.jpg)

![slide](images/03-loop-2.jpg)

![slide](images/04-loop-3.jpg)

# Standard way of executing tasks
`time.sleep(1)` emulates a synchronous operation that takes 1 second

In [1]:
import time


def task1():
    print('Running task 1')
    time.sleep(1)
    print('Finished task 1')


def task2():
    print('Running task 2')
    time.sleep(1)
    print('Finished task 2')


if __name__ == "__main__":
    
    start = time.time()
    
    task1()
    task2()
    
    print("Finished in {}s".format(time.time() - start))

Running task 1
Finished task 1
Running task 2
Finished task 2
Finished in 2.001476287841797s


# Using asyncio
`asyncio.sleep(1)` emulates an **asynchronous** operation that takes 1 second

In [5]:
import time
import asyncio


async def task1():
    print('Running task 1')
    await asyncio.sleep(1)
    print('Finished task 1')


async def task2():
    print('Running task 2')
    await asyncio.sleep(1)
    print('Finished task 2')


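async def main():
    # gather schedules both coroutines on the loop and waits for all of them
    tasks = [task1(), task2()]
    results = await asyncio.gather(*tasks)


if __name__ == "__main__":
    start = time.time()

    # asyncio.run (Python 3.7+) creates the event loop, runs main() on it and
    # closes it; it replaces get_event_loop() + run_until_complete()
    asyncio.run(main())

    print("Finished in {}s".format(time.time() - start))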

Running task 2
Running task 1
Finished task 2
Finished task 1
Finished in 1.00266695022583s
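
The `results` value in `main()` above goes unused because the tasks return nothing. As a hedged sketch of what `asyncio.gather` gives back when coroutines do return values (the `work` and `collect` helpers are invented for this example): results come back in the order the coroutines were passed in, not in the order they finished.

```python
import asyncio


async def work(name, delay, finished):
    await asyncio.sleep(delay)
    finished.append(name)  # records the order in which tasks complete
    return name.upper()


async def collect():
    finished = []
    # "slow" is passed first but completes last
    results = await asyncio.gather(
        work("slow", 0.2, finished),
        work("fast", 0.05, finished),
    )
    return results, finished


if __name__ == "__main__":
    results, finished = asyncio.run(collect())
    print("gather results:", results)
    print("completion order:", finished)
```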


# Typical asynchronous operations

* Retrieving and updating the contents of a database
* Fetching data from web APIs
* Any network operation
* Using other software tools in subprocesses - pipelines!
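
As an illustration of the subprocess case, here is a minimal sketch using `asyncio.create_subprocess_exec` from the standard library. The child commands are placeholders: each one just starts another Python interpreter that prints a line, and `run_tool` and `run_tools` are hypothetical helper names.

```python
import asyncio
import sys


async def run_tool(code):
    # Start a child process without blocking the event loop
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", code,
        stdout=asyncio.subprocess.PIPE,
    )
    # Wait for the process to exit and collect what it wrote to stdout
    stdout, _ = await proc.communicate()
    return stdout.decode().strip()


async def run_tools():
    # Both child processes run at the same time
    return await asyncio.gather(
        run_tool("print('first tool')"),
        run_tool("print('second tool')"),
    )


if __name__ == "__main__":
    print(asyncio.run(run_tools()))
```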

# Making code asynchronous

1) Convert your functions to coroutines
  * put `async` in front of the function definition: `async def function()`
  * replace synchronous operations with their asynchronous equivalents
  * `await` them to give control back to the loop

2) Create an event loop to execute the tasks

`loop = asyncio.get_event_loop()`

3) Run the tasks on the loop

`loop.run_until_complete(task())`

On Python 3.7 and later, `asyncio.run(task())` covers steps 2 and 3 in a single call.
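
Put together, the three steps might look like the sketch below. `load_sync`, `load_async` and `load_all` are hypothetical functions in which `sleep` stands in for a real blocking or awaitable operation, and the final call uses `asyncio.run`, which needs Python 3.7+.

```python
import asyncio
import time


def load_sync(name):
    # Original blocking version: nothing else can run during the sleep
    time.sleep(0.2)
    return "{} loaded".format(name)


async def load_async(name):
    # Step 1: async def, plus the awaited asynchronous equivalent of the operation
    await asyncio.sleep(0.2)
    return "{} loaded".format(name)


async def load_all(names):
    return await asyncio.gather(*(load_async(name) for name in names))


if __name__ == "__main__":
    names = ["a", "b", "c"]

    start = time.time()
    print([load_sync(name) for name in names])  # one after another
    print("Sync finished in {}s".format(time.time() - start))

    start = time.time()
    # Steps 2 and 3: asyncio.run creates the loop and runs the coroutine on it
    print(asyncio.run(load_all(names)))
    print("Async finished in {}s".format(time.time() - start))
```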

# Fetching data from an API

# Synchronous implementation

In [3]:
import time
import urllib.request

BASE_URL = 'https://jsonplaceholder.typicode.com/posts/'
POSTS = 10


def fetch_api_sync(post):
    print('Fetching post {}, '.format(post), end='')
    # Blocks until the response arrives; the with block closes the connection
    with urllib.request.urlopen(BASE_URL + str(post)) as response:
        response.read()
    print('Fetched post {}, '.format(post), end='')


if __name__ == "__main__":
    start = time.time()
    # Requests run one after another, so their durations add up
    for post in range(1, POSTS + 1):
        fetch_api_sync(post)
    print()
    print("Finished in {}s".format(time.time() - start))

Fetching post 1, Fetched post 1, Fetching post 2, Fetched post 2, Fetching post 3, Fetched post 3, Fetching post 4, Fetched post 4, Fetching post 5, Fetched post 5, Fetching post 6, Fetched post 6, Fetching post 7, Fetched post 7, Fetching post 8, Fetched post 8, Fetching post 9, Fetched post 9, Fetching post 10, Fetched post 10, 
Finished in 2.856227159500122s


# Asynchronous implementation

In [4]:
import asyncio
import aiohttp
import time


BASE_URL = 'https://jsonplaceholder.typicode.com/posts/'
POSTS = 10
MAX_CONNECTIONS = 10


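async def fetch_api_async(post, session):
    print('Fetching post {}, '.format(post), end='')
    # async with releases the connection back to the pool when the block exits
    async with session.get(BASE_URL + str(post)) as response:
        await response.read()
    print('Fetched post {}, '.format(post), end='')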


async def main():
    
    conn = aiohttp.TCPConnector(limit=MAX_CONNECTIONS)
    async with aiohttp.ClientSession(connector=conn) as session:
        tasks = [fetch_api_async(post, session) for post in range(1, POSTS+1)]
        await asyncio.gather(*tasks)


if __name__ == "__main__":
    start = time.time()

    # asyncio.run (Python 3.7+) replaces get_event_loop() + run_until_complete()
    asyncio.run(main())
    print()
    print("Finished in {}s".format(time.time() - start))

Fetching post 7, Fetching post 2, Fetching post 8, Fetching post 3, Fetching post 9, Fetching post 4, Fetching post 10, Fetching post 5, Fetching post 6, Fetching post 1, Fetched post 8, Fetched post 1, Fetched post 10, Fetched post 7, Fetched post 5, Fetched post 2, Fetched post 9, Fetched post 4, Fetched post 6, Fetched post 3, 
Finished in 0.13844084739685059s
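
`MAX_CONNECTIONS` caps how many requests are in flight at once through the `limit` argument of `aiohttp.TCPConnector`. The same idea can be sketched with only the standard library by using `asyncio.Semaphore`. This is an analogy rather than a description of what aiohttp does internally; `limited_fetch`, `fetch_all` and the `state` dictionary are hypothetical, and `asyncio.sleep` stands in for the HTTP request.

```python
import asyncio


async def limited_fetch(post, semaphore, state):
    # At most max_connections coroutines can be inside this block at once
    async with semaphore:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stands in for the HTTP request
        state["active"] -= 1
    return post


async def fetch_all(posts, max_connections):
    semaphore = asyncio.Semaphore(max_connections)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(limited_fetch(post, semaphore, state) for post in range(1, posts + 1))
    )
    return results, state["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(fetch_all(10, 3))
    print("Fetched {} posts, never more than {} at once".format(len(results), peak))
```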


# Other tips
If running the examples in a Jupyter notebook fails with an event loop error (for example, the loop is reported as already running), try downgrading tornado, which Jupyter uses:
`pip3 install tornado==4.5.3`

## Thanks!

# Acknowledgements
* Thanks to @klaussweiss for investigating the default stack size for new threads
* Thanks to @nielsdenissen for the inspiration for the slides visualising how tasks are executed in a loop: https://github.com/nielsdenissen/pydata-asyncio