### Coroutines

What is a coroutine?

The `co` in coroutine comes from **cooperative**.

A coroutine is a generalization of subroutines (think functions), focused on **cooperation** between routines.

If you are familiar with multi-threading, this is similar in some ways. In multi-threaded applications, however, the **operating system** usually decides when to suspend one thread and resume another, without asking permission (so-called **preemptive** multitasking). Here we have routines that voluntarily yield control to something else, hence the term **cooperative**.

We actually have all the tools we need to start looking at this.

The key tool is the `yield` statement we studied in the last section on generator functions.

Let's dig a little further to truly understand what coroutines are and how they can be used.

We'll need to first define quickly what a queue is.

It is a collection where items are added to the back of the queue and removed from the front. It works much like a queue in a supermarket: you join at the back, and the person at the front is the first one to leave the queue and go to the checkout counter.

This is also called a First-In First-Out data structure.

(For comparison, you also have a **stack**, which is like a stack of pancakes: the last cooked pancake is placed on top of the stack (called a **push**), and it's the first one you take from the stack and eat (called a **pop**). That is called Last-In First-Out, or equivalently First-In Last-Out.)
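To make the two orderings concrete, here is a minimal sketch that uses a plain list both ways. The names `fifo_order` and `lifo_order` are just for illustration.

```python
# A plain list used two ways (illustrative only).
items = ['a', 'b', 'c']

# Queue behavior (FIFO): remove from the front, so the oldest item leaves first.
queue = list(items)
fifo_order = [queue.pop(0) for _ in range(len(items))]

# Stack behavior (LIFO): remove from the end, so the newest item leaves first.
stack = list(items)
lifo_order = [stack.pop() for _ in range(len(items))]

print(fifo_order)  # ['a', 'b', 'c']
print(lifo_order)  # ['c', 'b', 'a']
```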

We could just use a simple list to act as a queue, but lists are not particularly efficient when adding elements to the beginning of the list. They are fine for adding elements to the end, but less so at inserting elements elsewhere, including at the front, because every element after the insertion point has to be shifted.

So, instead of using a list, let's just use a more efficient data structure for our queue.

The `queue` module has some queue implementations, including some very specialized ones. Since Python 3.7, it also has the `SimpleQueue` class, which is more lightweight.

In this case though, I'm going to use the `deque` class (double-ended queue) from the `collections` module. It is very efficient at adding and removing elements from both the start and the end of the queue, so it's very general purpose and widely used. The `queue` implementations are more specialized and have several features useful for multi-tasking that we won't actually need.
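For comparison only, here is a minimal sketch of what basic usage of `SimpleQueue` looks like, using its `put`, `get` and `empty` methods. We won't use it in the rest of this section, and the variable names are just for illustration.

```python
from queue import SimpleQueue

q = SimpleQueue()
for item in ('a', 'b', 'c'):
    q.put(item)          # add to the back of the queue

drained = []
while not q.empty():
    drained.append(q.get())  # remove from the front of the queue

print(drained)  # ['a', 'b', 'c']
```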

In [1]:
from collections import deque

We can specify a maximum size for the queue when we create it. This allows us to limit the number of items in the queue.

We can then add and remove items by using the methods:
* `append`: appends an element to the right of the queue
* `appendleft`: appends an element to the left of the queue
* `pop`: remove and return the element at the very right of the queue
* `popleft`: remove and return the element at the very left of the queue

(Note that I'm avoiding calling it the start and end of the queue, because what you consider the start/end of the queue might depend on how you are using it)
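As a small sketch of that point, a `deque` behaves as a First-In First-Out queue in either orientation, as long as we add on one side and remove from the other. The names `q1`, `q2`, `order1` and `order2` are just for illustration.

```python
from collections import deque

# Orientation 1: join on the right, leave from the left.
q1 = deque()
for item in (1, 2, 3):
    q1.append(item)
order1 = [q1.popleft() for _ in range(3)]

# Orientation 2: join on the left, leave from the right.
q2 = deque()
for item in (1, 2, 3):
    q2.appendleft(item)
order2 = [q2.pop() for _ in range(3)]

print(order1, order2)  # [1, 2, 3] [1, 2, 3]
```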

Let's just try it out to make sure we're comfortable with it:

In [2]:
dq = deque([1, 2, 3, 4, 5])
dq

deque([1, 2, 3, 4, 5])

In [3]:
dq.append(100)
dq

deque([1, 2, 3, 4, 5, 100])

In [4]:
dq

deque([1, 2, 3, 4, 5, 100])

In [5]:
dq.appendleft(-10)
dq

deque([-10, 1, 2, 3, 4, 5, 100])

In [6]:
dq.pop()

100

In [7]:
dq

deque([-10, 1, 2, 3, 4, 5])

In [8]:
dq.popleft()

-10

In [9]:
dq

deque([1, 2, 3, 4, 5])

We can create a capped queue:

In [10]:
dq = deque([1, 2, 3, 4], maxlen=5)

In [11]:
dq.append(100)
dq

deque([1, 2, 3, 4, 100], maxlen=5)

In [12]:
dq.append(200)
dq

deque([2, 3, 4, 100, 200], maxlen=5)

In [13]:
dq.append(300)
dq

deque([3, 4, 100, 200, 300], maxlen=5)

As you can see the first item (`2`) was automatically discarded from the left of the queue when we added `300` to the right.
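The same thing happens in the other direction. As a minimal sketch (the name `window` is just for illustration), adding to the left of a full queue discards an item from the right:

```python
from collections import deque

window = deque([1, 2, 3], maxlen=3)
window.appendleft(0)   # the deque is full, so 3 is dropped from the right
print(list(window))    # [0, 1, 2]
```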

We can also find the number of elements in the queue by using the `len()` function:

In [14]:
len(dq)

5

as well as query the `maxlen`:

In [15]:
dq.maxlen

5

There are more methods, but these will do for now.
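For reference, here is a brief sketch of three of those other methods: `extend`, `extendleft` and `rotate`. We won't need them below, and the name `d` is just for illustration.

```python
from collections import deque

d = deque([1, 2, 3])
d.extend([4, 5])        # add several items on the right
d.extendleft([0, -1])   # items are added one at a time on the left, so their order reverses
print(list(d))          # [-1, 0, 1, 2, 3, 4, 5]

d.rotate(2)             # shift items two steps to the right, wrapping around
print(list(d))          # [4, 5, -1, 0, 1, 2, 3]
```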

Now let's create an empty queue, and write two functions - one that will add elements to the queue, and one that will consume elements from the queue:

In [16]:
def produce_elements(dq):
    for i in range(1, 36):
        dq.appendleft(i)

In [17]:
def consume_elements(dq):
    while len(dq) > 0:
        item = dq.pop()
        print('processing item', item)

Now we can use them as follows:

In [18]:
def coordinator():
    dq = deque()
    produce_elements(dq)
    consume_elements(dq)

In [19]:
coordinator()

processing item 1
processing item 2
processing item 3
processing item 4
processing item 5
processing item 6
processing item 7
processing item 8
processing item 9
processing item 10
processing item 11
processing item 12
processing item 13
processing item 14
processing item 15
processing item 16
processing item 17
processing item 18
processing item 19
processing item 20
processing item 21
processing item 22
processing item 23
processing item 24
processing item 25
processing item 26
processing item 27
processing item 28
processing item 29
processing item 30
processing item 31
processing item 32
processing item 33
processing item 34
processing item 35


But suppose now that the `produce_elements` function is reading a ton of data from somewhere (maybe an API call that returns course ratings on some Python course :-) ).

The goal is to process these after some time, and not wait until all the items have been added to the queue - maybe the incoming stream is infinite even.

In that case, we want to "pause" adding elements to the queue, process (consume) those items, then once they've all been processed we want to resume adding elements, and rinse and repeat.

We'll use a capped `deque`, and change our producer and consumer slightly, so that each one does its work, then yields control back to the caller once it's done: the producer adds elements to the queue, and the consumer removes and processes elements from the queue:

In [20]:
def produce_elements(dq, n):
    for i in range(1, n):
        dq.appendleft(i)
        if len(dq) == dq.maxlen:
            print('queue full - yielding control')
            yield
        
def consume_elements(dq):
    while True:
        while len(dq) > 0:
            print('processing ', dq.pop())
        print('queue empty - yielding control')
        yield
    
def coordinator():
    dq = deque(maxlen=10)
    producer = produce_elements(dq, 36)
    consumer = consume_elements(dq)
    while True:
        try:
            print('producing...')
            next(producer)
        except StopIteration:
            # producer finished
            break
        finally:
            print('consuming...')
            next(consumer)

In [21]:
coordinator()

producing...
queue full - yielding control
consuming...
processing  1
processing  2
processing  3
processing  4
processing  5
processing  6
processing  7
processing  8
processing  9
processing  10
queue empty - yielding control
producing...
queue full - yielding control
consuming...
processing  11
processing  12
processing  13
processing  14
processing  15
processing  16
processing  17
processing  18
processing  19
processing  20
queue empty - yielding control
producing...
queue full - yielding control
consuming...
processing  21
processing  22
processing  23
processing  24
processing  25
processing  26
processing  27
processing  28
processing  29
processing  30
queue empty - yielding control
producing...
consuming...
processing  31
processing  32
processing  33
processing  34
processing  35
queue empty - yielding control


Notice a **really important** point here: the producer and consumer generator functions do not use `yield` for iteration purposes. They use `yield` only to suspend themselves and cooperatively hand control back to the caller, our coordinator function in this case.

Similarly, we are not using `next` for iteration purposes, but more for starting and resuming the generators.

This is a fundamentally different idea than using `yield` to implement iterators, and forms the basis for the idea of using generators as coroutines.
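To illustrate the same idea with more than two routines, here is a minimal sketch of a round-robin coordinator. The `countdown` task and the `run_round_robin` function are hypothetical names made up for this example. Each task does one unit of work and then yields, and the coordinator keeps the suspended generators in a `deque`, resuming them in turn.

```python
from collections import deque

def countdown(name, n, log):
    # Hypothetical task: record one unit of work, then yield control.
    while n > 0:
        log.append((name, n))
        n -= 1
        yield

def run_round_robin(tasks):
    # Keep the suspended generators in a queue and resume each in turn.
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # start or resume the task until its next yield
        except StopIteration:
            continue            # task finished, so drop it
        ready.append(task)      # task still has work, so it rejoins at the back

log = []
run_round_robin([countdown('A', 2, log), countdown('B', 3, log)])
print(log)
# [('A', 2), ('B', 3), ('A', 1), ('B', 2), ('B', 1)]
```

Neither task is interrupted from outside: each one runs until it chooses to `yield`, which is what makes this cooperative.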

### Timings using Lists and Deques for Queues

Let's see some timing differences between `lists` and `deques` when inserting and popping elements. We'll compare this with appending elements to a `list` as well.

In [1]:
from timeit import timeit

In [62]:
list_size = 10_000

def append_to_list(n=list_size):
    lst = []
    for i in range(n):
        lst.append(i)

def insert_front_of_list(n=list_size):
    lst = []
    for i in range(n):
        lst.insert(0, i)
        
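# Note: a default argument is bound once, at definition time, so the first
# call empties this list and any later calls find nothing left to pop.
lst = [i for i in range(list_size)]
def pop_from_list(lst=lst):
    for _ in range(len(lst)):
        lst.pop()
        
lst = [i for i in range(list_size)]
def pop_from_front_of_list(lst=lst):
    for _ in range(len(lst)):
        lst.pop(0)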

Let's time those:

In [63]:
timeit('append_to_list()', globals=globals(), number=1_000)

0.8679745109602663

In [64]:
timeit('insert_front_of_list()', globals=globals(), number=1_000)

20.793169873565148

In [65]:
timeit('pop_from_list()', globals=globals(), number=1_000)

0.0017591912596799375

In [66]:
timeit('pop_from_front_of_list()', globals=globals(), number=1_000)

0.012326529086294613

As you can see, inserting elements at the front of a list is not very efficient compared to appending them at the end, and the same goes for popping. So lists are OK to use as stacks, but not as queues.

The standard library's `deque` is efficient at adding/removing items from both the start and end of the collection:

In [49]:
from collections import deque

In [67]:
list_size = 10_000

def append_to_deque(n=list_size):
    dq = deque()
    for i in range(n):
        dq.append(i)

def insert_front_of_deque(n=list_size):
    dq = deque()
    for i in range(n):
        dq.appendleft(i)
        
# Loop over the length of the deque itself, not the (unrelated) global list.
dq = deque(i for i in range(list_size))
def pop_from_deque(dq=dq):
    for _ in range(len(dq)):
        dq.pop()
        
dq = deque(i for i in range(list_size))
def pop_from_front_of_deque(dq=dq):
    for _ in range(len(dq)):
        dq.popleft()

In [68]:
timeit('append_to_deque()', globals=globals(), number=1_000)

0.8704001035901001

In [69]:
timeit('insert_front_of_deque()', globals=globals(), number=1_000)

0.8407907529494878

In [70]:
timeit('pop_from_deque()', globals=globals(), number=1_000)

0.000532037516904893

In [71]:
timeit('pop_from_front_of_deque()', globals=globals(), number=1_000)

0.0005195763528718089
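If we want every timed run to start from a full collection, one option is to rebuild the collection inside the timed function. Here is a minimal sketch, assuming hypothetical helper names `drain_list_front` and `drain_deque_front`. Note that the construction cost is then included in each measurement, and the actual numbers depend on the machine, so none are shown here.

```python
from collections import deque
from timeit import timeit

SIZE = 10_000

def drain_list_front(n=SIZE):
    lst = list(range(n))      # fresh list on every call
    while lst:
        lst.pop(0)            # remove from the front of the list
    return len(lst)

def drain_deque_front(n=SIZE):
    dq = deque(range(n))      # fresh deque on every call
    while dq:
        dq.popleft()          # remove from the left of the deque
    return len(dq)

# timeit accepts a callable as the statement to time.
list_time = timeit(drain_list_front, number=20)
deque_time = timeit(drain_deque_front, number=20)

print(f'list.pop(0):     {list_time:.4f}s')
print(f'deque.popleft(): {deque_time:.4f}s')
```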