# Asynchronous - Python 3.5

## Green Threads
Green threads are a primitive level of asynchronous programming.
A green thread looks and feels exactly like a normal thread, except that the threads are scheduled by application code
rather than by the operating system. Gevent is a well-known Python library for using green threads.
Gevent is basically green threads (via the greenlet library) plus an event loop that provides non-blocking network I/O.
Gevent monkey patches common Python libraries to have non-blocking I/O.
Here is an example using gevent to make requests to multiple URLs at once:


In [None]:
import gevent.monkey

# Patch the standard library before importing modules that use sockets.
gevent.monkey.patch_all()

from urllib.request import urlopen

urls = ['http://www.google.com', 'http://www.yandex.ru', 'http://www.python.org']


def print_head(url):
    print('Starting {}'.format(url))
    data = urlopen(url).read()
    print('{}: {} bytes: {}'.format(url, len(data), data))


# Start one green thread per URL, then wait for all of them to finish.
jobs = [gevent.spawn(print_head, _url) for _url in urls]

gevent.wait(jobs)

Starting http://www.google.com
Starting http://www.yandex.ru
Starting http://www.python.org
http://www.google.com: 13295 bytes: b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-IN"><head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"><meta content="/images/branding/googleg/1x/googleg_standard_color_128dp.png" itemprop="image"><title>Google</title><script>(function(){window.google={kEI:\'-LAFWtjEHcjgvgT_16n4Dg\',kEXPI:\'201793,1354277,1354723,1354915,1355217,1355263,1355528,1355736,1355866,1355893,1356033,1356369,3700337,3700429,3700440,3700489,4029815,4031109,4043492,4045841,4048347,4061945,4076999,4079954,4081039,4081165,4093169,4095909,4097153,4097922,4097929,4098721,4098728,4098752,4101430,4101437,4102238,4103475,4103845,4103861,4104258,4104414,4105240,4109316,4109490,4113217,4115289,4115697,4116350,4116724,4116731,4116926,4116928,4116935,4117328,4117980,4118227,4118546,4118798,4119032,4119034,4119036,4120415,4120660,4121035,41211



http://www.yandex.ru: 64827 bytes: b'<!DOCTYPE html><html class="i-ua_js_no i-ua_css_standart i-ua_browser_ i-ua_browser_desktop i-ua_platform_other" lang="ru"><head xmlns:og="http://ogp.me/ns#"><meta http-equiv="X-UA-Compatible" content="IE=edge"><title>\xd0\xaf\xd0\xbd\xd0\xb4\xd0\xb5\xd0\xba\xd1\x81</title><link rel="shortcut icon" href="//yastatic.net/iconostasis/_/8lFaTHLDzmsEZz-5XaQg9iTWZGE.png"><meta http-equiv=Content-Type content="text/html;charset=UTF-8"><link rel="apple-touch-icon" href="//yastatic.net/iconostasis/_/5mdPq4V7ghRgzBvMkCaTzd2fjYg.png" sizes="76x76"><link rel="apple-touch-icon" href="//yastatic.net/iconostasis/_/s-hGoCQMUosTziuARBks08IUxmc.png" sizes="120x120"><link rel="apple-touch-icon" href="//yastatic.net/iconostasis/_/KnU823iWwj_vrPra7x9aQ-4yjRw.png" sizes="152x152"><link rel="apple-touch-icon" href="//yastatic.net/iconostasis/_/wT9gfGZZ80sP0VsoR6dgDyXJf2Y.png" sizes="180x180"><link rel="alternate" type="application/rss+xml" title="\xd0\x9d\xd0\xbe\xd0\xb2\

As you can see, the gevent API looks and feels just like threading. However, under the hood, it’s using coroutines
rather than actual threads, and running them on an event loop for scheduling. This means you get the benefits of
lightweight threading without needing to understand coroutines, but you still have all the other issues that threading
brings. Gevent is a good library for those who already understand threading and want lighter-weight threads.


## Event Loop? Coroutines? Woah, slow down, I’m lost…
Let’s clear up some things about how asynchronous programming works. One way to do asynchronous programming is with an
event loop. The event loop is exactly what it sounds like: there is a queue of events/jobs and a loop that
constantly pulls jobs off the queue and runs them. These jobs are called coroutines. Each one is a small set of
instructions, including which events to put back on the queue, if any.
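
The queue-and-loop idea described above can be sketched in a few lines of plain Python. This is only a toy, not how gevent, Tornado, or asyncio are implemented: it assumes each job is a generator that pauses with `yield`, and the names `run_event_loop` and `countdown` are made up for the illustration.

```python
from collections import deque


def run_event_loop(jobs):
    """Toy event loop: run generator-based jobs round-robin until all finish."""
    queue = deque(jobs)
    while queue:
        job = queue.popleft()   # pull the next job off the queue
        try:
            next(job)           # run it until it pauses at its next yield
        except StopIteration:
            continue            # the job finished, so it is not re-queued
        queue.append(job)       # the job paused, so put it back on the queue


log = []


def countdown(name, n):
    """Hypothetical job: record a step, then hand control back to the loop."""
    while n > 0:
        log.append('{} {}'.format(name, n))
        yield                   # pause here so other jobs get a turn
        n -= 1


run_event_loop([countdown('a', 2), countdown('b', 3)])
print(log)
```

The two jobs take turns: each one runs until its next `yield`, and only one job is ever running at a time.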

## Callback Style Async
While many asynchronous libraries exist in Python, the most popular ones are probably Tornado and gevent. As we have already talked about gevent, let’s focus a little on how Tornado works. Tornado is an asynchronous web framework that uses the callback style to do asynchronous network I/O. A callback is a function you hand over with the meaning “once this is done, execute this function”. It’s basically a “when done” hook for your code. In other words, a callback is like calling a customer service line, immediately leaving your number and hanging up so they can call you back when they are available, rather than waiting on hold forever.

Let’s take a look at how to do the same thing as above using Tornado.

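In [None]:
# Note: the callback argument to fetch() was removed in Tornado 6.0,
# so this callback-style example needs an older Tornado release.
import tornado.ioloop
from tornado.httpclient import AsyncHTTPClient

urls = ['http://www.google.com', 'http://www.yandex.ru', 'http://www.python.org']


def handle_response(response):
    # Errors arrive as an attribute on the response instead of being raised.
    if response.error:
        print("Error:", response.error)
    else:
        url = response.request.url
        data = response.body
        print('{}: {} bytes: {}'.format(url, len(data), data))


http_client = AsyncHTTPClient()
for url in urls:
    # fetch() returns immediately; handle_response runs once the reply arrives.
    http_client.fetch(url, handle_response)

# The loop runs until it is stopped, so this script does not exit on its own.
tornado.ioloop.IOLoop.instance().start()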

To explain the code a little, the call inside the for loop uses a Tornado method called AsyncHTTPClient.fetch, which fetches a URL in a non-blocking way. This method essentially executes and returns immediately, allowing the program to do other things while waiting on the network call. Because the next line is reached before the URL has been fetched, the method cannot return the response itself. The solution is that instead of returning the response, fetch calls a function, the callback, with the result. The callback in this example is handle_response.

## Callback Hell
In the previous example, you will notice that the very first line of handle_response checks for an error. This is required because the failure cannot be raised as an exception in your code. If an exception were raised, it would not be handled by the proper section of code, due to the event loop. When fetch is executed, it starts the HTTP call, then puts handling the response on the event loop. By the time the error is noticed, the call stack contains only the event loop and the callback, with none of our code to handle the exception. So any exception thrown in the callback lands in the event loop rather than in code of ours that could handle it. Therefore all errors have to be passed as objects rather than raised. This means if you forget to check for errors, your errors will be swallowed. Anyone familiar with Go will recognize this style, as returning errors as values is the convention throughout the language. It is also one of the most complained-about aspects of Go.

The other problem with callbacks is that in an asynchronous world, the only way to not block things is with a callback. This can lead to a very long chain of callback after callback after callback. Since you lose access to the stack and variables, you end up shoving large objects into all your callbacks, but if you’re using third-party APIs, you can’t pass anything into the callback that isn’t expected. This also becomes a problem because every callback acts like a thread, but there is no built-in way to “gather” the tasks. Let’s say, for example, you wanted to call three APIs, wait until all three are done, and return the aggregated results. In the gevent world you could do this directly, but with plain callbacks you cannot.

You would have to hack around it by saving results to some global state variables, and in each callback you would have to check whether it’s the last result or not.
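
The global-state workaround might look roughly like the sketch below. It makes no network calls: `fake_fetch` is a hypothetical stand-in for a callback-style fetch, and the example.com URLs are placeholders that are only used as dictionary keys.

```python
results = {}        # shared state that every callback writes into
aggregated = None   # filled in by whichever callback happens to run last

# Hypothetical endpoints, used only as dictionary keys here.
api_urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c']


def fake_fetch(url, callback):
    """Hypothetical stand-in for a callback-style fetch; no network involved."""
    callback(url, 'body of {}'.format(url))


def on_response(url, body):
    global aggregated
    results[url] = body
    # Every callback has to ask: "am I the last one?"
    if len(results) == len(api_urls):
        aggregated = [results[u] for u in api_urls]


for api_url in api_urls:
    fake_fetch(api_url, on_response)

print(aggregated)
```

The aggregation logic lives inside the callback and depends on module-level variables, which is exactly the bookkeeping the paragraph above complains about.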

## Comparisons
Let’s compare so far. If we want to prevent I/O from blocking, we have to use either threads or async. Threads come with issues such as resource starvation, dead-locks, and race conditions. They also create context-switching overhead for the CPU. Async programming can solve the context-switching overhead, but comes with its own problems. In Python, our options are green threads or the callback style of async programming.
### Green Threads Style
- Threads are controlled at the application level, rather than by the operating system
- Feel like threads; good for those who understand threading
- Include all the problems of normal thread-based programming other than CPU context switching
### Callback Style
- Not like threaded programs at all
- Threads/coroutines are invisible to the programmer
- Callbacks swallow exceptions
- Callbacks are not gather-able
- Callback after callback gets confusing and hard to debug

## How can we improve?
Up until Python 3.3, this really was the best you could do. To do better, you need more language support: Python would need some way to execute a method partially, halt execution, and maintain stack objects and exceptions throughout. If you’re familiar with Python concepts, you might realize I am hinting at generators. Generators allow a function to return a sequence of values one item at a time, halting execution until the next item is needed. The problem with generators is that they must be completely consumed by the function calling them. In other words, a generator cannot call another generator and halt the execution of both. That was the case until PEP 380 added the *yield from* syntax, which allows a generator to yield the results of another generator.

While async isn’t really the intention of generators, they provide all the features needed to make async great. Generators maintain a stack and can raise exceptions. If you were to write an event loop that ran generators, you could have a great *async* library. And thus, the *asyncio* library was born. All you have to do is add a *@coroutine* decorator and asyncio will patch your generator into a coroutine. Here is an example of us calling the same three URLs as before:

In [None]:
import asyncio
import aiohttp

urls = ['http://www.google.com', 'http://www.yandex.ru', 'http://www.python.org']


@asyncio.coroutine  # generator-based style; deprecated in Python 3.8, removed in 3.11
def call_url(url):
    print('Starting {}'.format(url))
    with aiohttp.ClientSession() as session:  # plain "with" only works on old aiohttp releases
        response = yield from session.get(url)
        data = yield from response.text()
        print('{}: {} bytes: {}'.format(url, len(data), data))
        return data


futures = [call_url(url) for url in urls]

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(futures))

### A couple things to note here:
1) We are not looking for errors, because errors get passed up the stack correctly.
2) We can return an object if we want.
3) We can start all coroutines, and gather them later.
4) No callbacks
5) The line after each yield from doesn’t execute until the call it waits on is completely done. (feels synchronous/familiar)

Life is great! The only problem is that a coroutine written with yield from looks way too much like a generator, and it could cause problems if it actually was a plain generator.
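
To make the generator mechanics behind this concrete, here is a small sketch of *yield from* without any networking. It assumes nothing beyond the standard language, and the function names are made up for the illustration. The first half shows an outer generator pausing inside an inner one and receiving its return value; the second half shows an exception travelling up the stack to the caller.

```python
def inner():
    received = yield 'inner waiting'   # pause; the caller sends a value back in
    return received * 2                # becomes the value of "yield from inner()"


def outer():
    result = yield from inner()        # outer is paused while inner is paused
    yield 'outer got {}'.format(result)


gen = outer()
first = next(gen)        # runs until inner's yield
second = gen.send(21)    # resumes inner, which returns 42 to outer


def failing():
    yield 'about to fail'
    raise ValueError('boom')


def guarded():
    try:
        yield from failing()           # the exception surfaces here, in the caller
    except ValueError as exc:
        yield 'caught {}'.format(exc)


steps = list(guarded())
print(first, second, steps)
```

Because the exception arrives at the `yield from` line, an ordinary `try`/`except` works, which is what callbacks could not offer.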

### Async and Await

The asyncio library was gaining a lot of traction, so it became part of the standard library in Python 3.4. Python 3.5 then added the keywords async and await. The keywords are designed to make it clearer that your code is asynchronous, so your methods are not confused with generators. The async keyword goes before def to show that a method is asynchronous. The await keyword replaces yield from and makes it clearer that you are waiting for a coroutine to finish. Here is our example again, but with the async/await keywords.

In [None]:
import asyncio
import aiohttp

urls = ['http://www.google.com', 'http://www.yandex.ru', 'http://www.python.org']


async def call_url(url):
    print('Starting {}'.format(url))
    async with aiohttp.ClientSession() as session:
        response = await session.get(url)
        data = await response.text()
        print('{}: {} bytes: {}'.format(url, len(data), data))
        return data


async def main():
    # asyncio.wait() stopped accepting bare coroutines in Python 3.11;
    # gather() takes them directly and waits for all of them.
    return await asyncio.gather(*(call_url(url) for url in urls))


loop = asyncio.get_event_loop()
loop.run_until_complete(main())  # on Python 3.7+, asyncio.run(main()) is the usual entry point

Basically, what is happening here is that an async method, when called, does not run its body right away; it returns a coroutine object, which can then be awaited.
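
A tiny sketch of that behaviour, assuming Python 3.7 or newer so that `asyncio.run` is available (the `add` function is made up for the illustration):

```python
import asyncio
import inspect


async def add(a, b):
    await asyncio.sleep(0)   # give control back to the event loop once
    return a + b


coro = add(1, 2)                           # nothing inside add() has run yet
is_coroutine = inspect.iscoroutine(coro)   # it is just a coroutine object


async def main():
    return await coro                      # awaiting it actually runs the body


total = asyncio.run(main())
print(is_coroutine, total)
```

Forgetting the `await` is a common mistake: the call then hands back the coroutine object and the body never runs.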

### We Have Arrived

Python finally has an excellent asynchronous framework, asyncio. Let’s take a look at all the problems of threading and see if we have solved them.
#### CPU Context switching: 
asyncio is asynchronous and uses an event loop; it allows you to have application controlled context switches while waiting for I/O. No CPU switching found here!
#### Race Conditions: 
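Because asyncio only runs a single coroutine at a time and switches only at points you define, code between two await points runs without interruption, which rules out most race conditions. Shared state can still change while a coroutine is suspended at an await, so keep each read-modify-write on one side of an await.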
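
As a rough illustration, the sketch below has ten coroutines updating a shared counter with no lock. It assumes Python 3.7+ for `asyncio.run`; the `worker` function and the counts are arbitrary choices for the example.

```python
import asyncio

counter = 0


async def worker(times):
    global counter
    for _ in range(times):
        counter += 1             # no await inside the update, so nothing can interleave
        await asyncio.sleep(0)   # explicit switch point: other workers run here


async def main():
    await asyncio.gather(*(worker(1000) for _ in range(10)))


asyncio.run(main())
print(counter)
```

No increments are lost, because a switch can only happen at the `await`. If the read and the write of `counter` were separated by an `await`, updates could be lost again.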
#### Dead-Locks/Live-Locks: 
Since you don’t have to worry about race conditions, you don’t have to use locks at all. This makes you pretty safe from dead-locks. You could still get into a dead-lock situation if you require two coroutines to wake each other, but that is so rare you would almost have to try to make it happen.
#### Resource Starvation: 
Because coroutines are all run on a single thread, and don’t require extra sockets or memory, it would be a lot harder to run out of resources. However, asyncio does have an “executor pool”, which is essentially a thread pool. If you were to run too many things in an executor pool, you could still run out of resources. However, using too many executors is an anti-pattern, and not something you would probably do very often.
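
For reference, handing blocking work to an executor might look like the following sketch. It assumes Python 3.7+ (`asyncio.run`, `asyncio.get_running_loop`); `blocking_io` is a hypothetical stand-in for a library call that cannot be awaited, and the pool size of 2 is an arbitrary cap chosen for the example.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_io(seconds):
    """Hypothetical blocking call, such as a library with no async version."""
    time.sleep(seconds)
    return seconds


async def main():
    loop = asyncio.get_running_loop()
    # A small, bounded pool keeps thread usage under control.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [loop.run_in_executor(pool, blocking_io, 0.05) for _ in range(4)]
        return await asyncio.gather(*futures)


executor_results = asyncio.run(main())
print(executor_results)
```

Each `run_in_executor` call occupies a real thread while it runs, which is why an unbounded number of them brings the resource problems of threading back.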

To be fair, while asyncio is pretty great, it does come with its own problems. First, asyncio is new to Python. There are some weird edge cases that will leave you wanting more. Second, when you go fully asynchronous, it means your entire codebase has to be asynchronous. Every. Single. Piece. This is because synchronous functions might take up too much time, thereby blocking your event loop. The libraries for asyncio are still young and maturing, so it is sometimes hard to find an asynchronous version for part of your stack.
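
The effect of one synchronous call on the event loop can be seen in a small timing sketch. It assumes Python 3.7+ for `asyncio.run`; the function names and the 0.1 second delay are made up for the illustration, and the exact timings will vary by machine.

```python
import asyncio
import time


async def polite(delay):
    await asyncio.sleep(delay)   # yields to the event loop while waiting


async def rude(delay):
    time.sleep(delay)            # blocks the whole event loop while waiting


async def timed(factory, delay, count):
    """Run `count` coroutines concurrently and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(factory(delay) for _ in range(count)))
    return time.perf_counter() - start


polite_elapsed = asyncio.run(timed(polite, 0.1, 3))
rude_elapsed = asyncio.run(timed(rude, 0.1, 3))
print('polite: {:.2f}s, rude: {:.2f}s'.format(polite_elapsed, rude_elapsed))
```

The three polite coroutines wait at the same time, while the three blocking ones run strictly one after another, so the second figure is roughly three times the first.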