# Agenda -- asyncio

1. `asyncio` basics -- what is it?
2. Basic use of `asyncio`
3. Scheduling and waiting
4. Deeper with the event loop
5. What if there's no coroutine?  What then?
6. Example: HTTP client
7. Example: Chatbot
8. `asyncio` vs. threads

# Why do we need asyncio?

# Reactor pattern

The problem with traditional approaches to concurrency is:
- Threads are lightweight, but hard to work with... and even lightweight threads can overburden a server at a certain point.
- Processes are easier to work with, and you can even have a lot of them on a server, but they are very heavyweight, and can bring your server down.

The reactor pattern says:
- Have one process
- Have one thread

The idea is: You have a list of functions, and you loop over that list again and again and again. You give each function the chance to execute for a little bit of time.  This way, you can handle a ton of incoming network connections, because the only overhead is additional functions.

Each time we get a new network connection, we call our function one more time. If there are *n* incoming connections, then we have *n* calls to our function in progress.  Because the overhead of a function call is so much lower than that of a thread or process, we can get away with this.
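As a rough illustration of that loop (a toy sketch, not how Twisted or `asyncio` is actually implemented), the code below keeps a queue of handler functions and calls each one in turn until all of them report that they're done. The names `make_handler` and `run_loop` are made up for this example.

```python
from collections import deque

def make_handler(name, steps, log):
    """Return a function that does one small step of work per call.

    The function returns True while it still has work left, False when done.
    """
    remaining = steps

    def handler():
        nonlocal remaining
        log.append(f'{name}: step {steps - remaining + 1}')
        remaining -= 1
        return remaining > 0

    return handler

def run_loop(handlers):
    """Call each handler in turn, again and again, until all are finished."""
    pending = deque(handlers)
    while pending:
        handler = pending.popleft()
        if handler():                # still has work? back of the line
            pending.append(handler)

log = []
run_loop([make_handler('conn-1', 2, log),
          make_handler('conn-2', 3, log)])
print(log)
```

The two "connections" take turns, one step at a time, all in one thread. Adding another connection only means adding another function to the queue.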

The "Twisted" framework in Python has existed for 20+ years, and has used this technique.

JavaScript's Node.js runtime for server-side Web apps has been doing this for 10-15 years already.

`asyncio` is still something of a work in progress.
- The API is stable, with fewer changes with each Python version
- A growing number of libraries support it
- A growing number of people are using it

BUT it is still:
- Hard to understand
- Hard to integrate with much existing software
- Unclearly documented in many places

# What kinds of problems does `asyncio` solve?

Network applications are mostly idle.
- When we want to request something from the network (as a client), we wait until we get a response.
- When we are running a server, much of the time is idle, while we wait for the client either to send a request or to process our response.

There's a ton of idle time there!  That's where `asyncio` comes in.

`asyncio` **DOES NOT PROMISE** that our code will run in parallel.  Each of our "tasks" will get a little bit of time to run, before it's expected to cede control back to the other tasks we're running.  But that's OK, because it'll cede control when it knows it'll have to wait a while before getting more communication.

With `asyncio`, we know exactly when in a function's execution we might cede control to another task. By using local variables, we know that our task won't interfere with any other tasks.
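Here is a small sketch of that idea, using `asyncio.sleep` as a stand-in for waiting on the network. The `worker` function and the `events` list are made up for this example. It is written to run as a plain script; in a notebook, where a loop is already running, you would write `await main()` instead of `asyncio.run(main())`.

```python
import asyncio

async def worker(name, delay, events):
    events.append(f'{name} start')
    await asyncio.sleep(delay)      # the one place this task cedes control
    events.append(f'{name} end')

async def main():
    events = []
    # run both coroutines concurrently, in one thread
    await asyncio.gather(worker('a', 0.02, events),
                         worker('b', 0.01, events))
    return events

events = asyncio.run(main())
print(events)
```

Both workers start before either one finishes, because each gives up control at its `await`. Nothing runs in parallel: at any moment, only one of them is executing.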



# Early `asyncio` was based on generators

Behind the scenes, there are still some Python generators (and generator functions) hiding.  However, modern `asyncio` code doesn't really use generators directly; we write `async def` and `await` instead.

That said, generators can really help us to understand what's happening in `asyncio`.

In [1]:
# dumbest function in the world

def myfunc():
    return 1
    return 2
    return 3

In [3]:
myfunc()  # we run it, and it gets to "return 1", returns 1, and that's the end!

1

In [4]:
import dis
dis.dis(myfunc)

  2           0 LOAD_CONST               1 (1)
              2 RETURN_VALUE


In [5]:
# generator function

def mygen():
    yield 1
    yield 2
    yield 3
    
# yield means: give a value back and wait -- go to sleep    

In [7]:
# when you run a generator function, you get a generator object back
# generator objects implement the iterator protocol

mygen()

<generator object mygen at 0x1110cd230>

In [9]:
# each time we ask for the next element from the generator (i.e., mygen()),
# the generator function runs up to and including the next "yield".  "yield"
# tells the generator function to return a value, and go to sleep, remembering
# where it was!

for one_item in mygen():
    print(one_item)

1
2
3


In [12]:
def count_up_to(maxnum):
    for one_number in range(maxnum):
        yield one_number
        
def fib(maxnum):
    first = 0
    second = 1
    counter = 0
    while True:
        yield first
        counter += 1
        first, second = second, first+second
        
        if counter >= maxnum:
            break
            
def squares(maxnum):
    for one_number in range(maxnum):
        yield one_number ** 2
        
g1 = count_up_to(10)        
g2 = fib(8)
g3 = squares(11)

all_generators = [g1, g2, g3]

while all_generators:
    # iterate over a copy, so that removing a finished generator
    # doesn't make the loop skip the one after it
    for one_generator in list(all_generators):
        try:
            value = next(one_generator)
            print(f'{one_generator.__name__}: {value}')
        except StopIteration:
            all_generators.remove(one_generator)

count_up_to: 0
fib: 0
squares: 0
count_up_to: 1
fib: 1
squares: 1
count_up_to: 2
fib: 1
squares: 4
count_up_to: 3
fib: 2
squares: 9
count_up_to: 4
fib: 3
squares: 16
count_up_to: 5
fib: 5
squares: 25
count_up_to: 6
fib: 8
squares: 36
count_up_to: 7
fib: 13
squares: 49
count_up_to: 8
squares: 64
count_up_to: 9
squares: 81
squares: 100


# What did we just see?

- Generator functions look like regular functions, but when you run them, you get *generator objects*.
- A generator implements the iterator protocol, so we can use it in a `for` loop.
- Each time we ask for the generator to run a little bit, it does so until it hits `yield`.  That's the signal to let someone else run; the generator goes to sleep, and will pick up where it left off.
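To see the "go to sleep and pick up where it left off" behavior without a `for` loop, here is a short sketch that drives the same `mygen` by hand with `next`. The variable names are just for illustration.

```python
def mygen():
    yield 1
    yield 2
    yield 3

g = mygen()
first = next(g)     # runs up to the first yield, then pauses
second = next(g)    # resumes after the first yield, pauses at the second
third = next(g)

try:
    next(g)         # nothing left to yield
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```

A `for` loop does the same thing for us: it calls `next` repeatedly and stops quietly when it gets `StopIteration`.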

# Basic `asyncio` 

In `asyncio`, you don't call your functions directly.  Rather, you put your function on the "event loop," a loop that goes through each function you've added and gives it a chance to run for a bit, again and again, until the function is done.

- When we write a function for `asyncio`, it's called a "coroutine function."  
- When we run a coroutine function, we get back a "coroutine object."

How do we write a coroutine, vs. a regular function? We use the special syntax `async def` instead of `def`.


In [13]:
async def main():
    print('Hello, world!')

print(main)


<function main at 0x1121c3c70>


In [14]:
main()

<coroutine object main at 0x112071b60>
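Calling `main()` only creates the coroutine object; nothing has been printed yet. One way to actually run it, sketched here for a plain script, is `asyncio.run`. The `return 'done'` line is added just for this example. In a notebook, which already has a loop running, `asyncio.run()` raises an error; there you can write `await main()` at the top level of a cell instead.

```python
import asyncio

async def main():
    print('Hello, world!')
    return 'done'

# asyncio.run() starts an event loop, runs the coroutine object
# to completion, and then closes the loop
result = asyncio.run(main())
print(result)
```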

# Keywords

- `async` before `def` means: We are defining a coroutine function
- `await` can only be used inside of a coroutine function, and it means: I know that what I'm about to run is going to take a while to get back to me, so I'll wait here while it runs, and will continue when it returns

You can only use `await` on values that are "awaitable," meaning that they're designed to be used with `await`, such as coroutine objects and tasks.  The other thing to remember/realize is that `await` does cede control of the CPU to the loop, but this function itself stays paused at that line until the awaited value is ready.
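A quick way to check whether something is awaitable is `inspect.isawaitable`. The sketch below (with a made-up `greet` coroutine, written to run as a script) also shows what happens when you `await` something that isn't awaitable.

```python
import asyncio
import inspect

async def greet():
    return 'hi'

async def main():
    coro = greet()
    coro_ok = inspect.isawaitable(coro)     # coroutine objects are awaitable
    int_ok = inspect.isawaitable(42)        # plain values are not
    value = await coro

    try:
        await 42                # not awaitable: fails when this line runs
        error = None
    except TypeError as e:
        error = type(e).__name__

    return coro_ok, int_ok, value, error

result = asyncio.run(main())
print(result)
```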

Raymond Hettinger compares threading, processes, and asyncio: https://www.youtube.com/watch?v=9zinZmE3Ogk

- We write coroutine functions.
- When we execute coroutine functions, we get coroutine objects.
- We can then ask `asyncio` to put our coroutine objects on the loop.
- When we do that, our coroutine is known as a "task."

A task is a scheduled coroutine object.
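Those steps might look like the sketch below, which uses `asyncio.create_task` to put a coroutine object on the loop. The `fetch` coroutine and the `log` list are made up for this example, and the code is written to run as a script.

```python
import asyncio

async def fetch(name, log):
    log.append(f'{name} running')
    await asyncio.sleep(0.01)
    return f'{name} result'

async def main():
    log = []
    coro = fetch('job', log)            # coroutine object: nothing has run yet
    task = asyncio.create_task(coro)    # now it's scheduled on the loop
    is_task = isinstance(task, asyncio.Task)
    ran_before_await = list(log)        # still empty: main() hasn't ceded control
    result = await task                 # cede control; the task gets to run
    return is_task, ran_before_await, task.done(), result, log

is_task, ran_before_await, done, result, log = asyncio.run(main())
print(is_task, ran_before_await, done, result, log)
```

Scheduling the task doesn't run it right away. It only gets its turn once `main` reaches an `await` and hands control back to the loop.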

# Threads vs. `asyncio`, and operating systems

With threads (and processes), our tasks are given an amount of time to run. When that time slice is up, the OS yanks control away from the thread/process.  That's why you can have race conditions, because you don't know when that might happen, or what you might be in the middle of.

- Good news: No thread/process can gum up the system
- Bad news: You end up with race conditions, etc.

With `asyncio`, our tasks decide when they're good and ready to cede control with `await`.

- Good news: This makes it easier to reason about things, and to ensure there aren't weird conditions
- Bad news: A poorly behaved task (i.e., one without any `await` in it) can monopolize the CPU.
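To make the "bad news" concrete, here is a sketch comparing a task that blocks with `time.sleep` against one that cedes control with `await asyncio.sleep`. All of the names (`ticker`, `rude`, `polite`, `run_with`) are made up for this example, which is written to run as a script.

```python
import asyncio
import time

async def ticker(events):
    for i in range(3):
        events.append(f'tick {i}')
        await asyncio.sleep(0.01)

async def rude(events):
    events.append('rude start')
    time.sleep(0.1)             # blocks the whole thread; never cedes control
    events.append('rude end')

async def polite(events):
    events.append('polite start')
    await asyncio.sleep(0.1)    # cedes control while waiting
    events.append('polite end')

async def run_with(other):
    events = []
    await asyncio.gather(other(events), ticker(events))
    return events

rude_events = asyncio.run(run_with(rude))
polite_events = asyncio.run(run_with(polite))
print(rude_events)
print(polite_events)
```

With `rude`, the ticker can't even start until the blocking call has finished. With `polite`, all three ticks happen while the other task is waiting.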

Before modern versions of the Mac OS and Windows, those operating systems used "cooperative multitasking": a program would tell the OS when it was ready to cede control of the CPU. That meant a badly behaved application could lock up your whole system. That's much hard