# Iterators

Most Python programmers are familiar with the concept of iterating over some sort of 
collection (for example, strings, lists, tuples, file objects, and so on):


In [1]:
for i in range(3):
    print(i)

0
1
2


In [2]:
for line in open('hello_world.txt'):
    print(line, end='')

Hello
World!
How are you doing today?
:-)




The reason we can iterate over all sorts of objects, and not just lists or strings, is the 
iteration protocol. The iteration protocol defines a standard interface for iteration: 
an object that implements __iter__ and __next__ (or __iter__ and next in Python 
2.x) is an iterator and, as the name suggests, can be iterated over, as shown in the 
following code snippet:

In [4]:
class MyIterator(object):
    def __init__(self, xs):
        self.xs = xs

    def __iter__(self):
        return self

    def __next__(self):
        if self.xs:
            return self.xs.pop(0)
        else:
            raise StopIteration

In [6]:
for i in MyIterator(['a', 'b', 'c']):
    print(i)

a
b
c


\__iter__ returns the iterator object itself (here, self), and the \__next__ method returns the individual 
elements of the sequence one by one.

To better see how the protocol works, we can unroll the loop manually as the 
following piece of code shows:


In [10]:
itrtr = MyIterator([3, 4, 5, 6])

print(next(itrtr))
print(next(itrtr))

3
4


Once the sequence is exhausted, next() raises a StopIteration 
exception. The for loop in Python uses the same mechanism; it calls 
next() on its iterator and catches the StopIteration exception to know when to stop.

In [11]:
print(next(itrtr))
print(next(itrtr))
print(next(itrtr))

5
6


StopIteration: 
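
The mechanism described above can be sketched as a plain while loop. This is only an approximation of what a for loop does; the helper name manual_for is hypothetical and exists purely for illustration.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:
            break                      # sequence exhausted: leave the loop
        body(item)


seen = []
manual_for(['a', 'b', 'c'], seen.append)
print(seen)  # ['a', 'b', 'c']
```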

# Generators

A generator is simply a callable that generates a sequence of results rather than 
returning a single result. This is achieved by yielding (by way of the yield keyword) 
the individual values rather than returning them, as we can see in the following 
example (generators.py):


In [14]:
def mygenerator(n):
    while n:
        n -= 1
        yield n

for i in mygenerator(3):
    print(i)

2
1
0


It is the simple presence of yield that makes mygenerator a generator and not a 
simple function. The interesting behavior in the preceding code is that calling the 
generator function does not start the generation of the sequence at all; it just creates 
a generator object, as the following example shows:

In [16]:
g=mygenerator(2)

In [17]:
next(g)

1

In [18]:
next(g)

0

In [19]:
next(g)

StopIteration: 

Each next() call produces a value from the generated sequence until the sequence 
is empty, and that is when we get the StopIteration exception instead. This is the 
same behavior that we saw when we looked at iterators. Essentially, generators  
are a simple way to write iterators without the need for defining classes with their 
\__iter__ and \__next__ methods.
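
To make the comparison concrete, here is a minimal sketch of a generator that plays the role of the MyIterator class shown earlier. The name my_iterator is hypothetical; note also that, unlike the class, this version does not remove items from the list it is given.

```python
def my_iterator(xs):
    """Generator counterpart of the MyIterator class (illustrative name)."""
    for x in xs:
        yield x


g = my_iterator(['a', 'b', 'c'])
print(iter(g) is g)   # True: a generator object is its own iterator
print(list(g))        # ['a', 'b', 'c']
```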

As a side note, you should keep in mind that generators are one-shot operations; it is 
not possible to iterate over the generated sequence more than once. To do that, you have 
to call the generator function again to get a fresh generator object.
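
A short sketch of the one-shot behaviour, reusing the mygenerator function from above (the variable names are only illustrative):

```python
def mygenerator(n):
    while n:
        n -= 1
        yield n


g = mygenerator(3)
first_pass = list(g)               # consumes the generator
second_pass = list(g)              # the same object is now exhausted
fresh_pass = list(mygenerator(3))  # a new call gives a new generator

print(first_pass, second_pass, fresh_pass)  # [2, 1, 0] [] [2, 1, 0]
```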

# Coroutines

The same yield expression used in generator functions to produce a sequence  
of values can be used on the right-hand side of an assignment to consume values. 
This allows the creation of coroutines.

A coroutine is simply a type of function  
that can suspend and resume its execution at well-defined locations in its code  
(via yield expressions).

It is important to keep in mind that coroutines, despite being implemented as 
enhanced generators, are not conceptually generators themselves. The reason is  
that coroutines are not associated with iteration.

### Let's create some coroutines and see how we can use them. 

There are three main constructs in coroutines, which are stated as follows:

• yield: This expression suspends the execution of the coroutine and, on resumption, evaluates to the value that was sent in <br>
• send(): This is used to pass data to a coroutine (and hence resume its 
execution) <br>
• close(): This is used to terminate a coroutine <br>
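
The three constructs listed above can be seen together in a minimal sketch. The collector coroutine and the received list are hypothetical names used only for this illustration.

```python
def collector(store):
    """Illustrative coroutine: appends every value it receives to `store`."""
    try:
        while True:
            item = (yield)        # suspend here until send() delivers a value
            store.append(item)
    except GeneratorExit:
        store.append('closed')    # close() raised GeneratorExit at the yield


received = []
c = collector(received)
next(c)            # advance to the first yield
c.send('a')        # resumes the coroutine; `item` becomes 'a'
c.send('b')
c.close()          # terminates the coroutine
print(received)    # ['a', 'b', 'closed']
```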

In [21]:
def complain_about(substring):
    print('Please talk to me!')
    try:
        while True:
            text = (yield)
            if substring in text:
                print('Oh no: I found a %s again!'
                      % (substring))
    except GeneratorExit:
        print('Ok, ok: I am quitting.')

We start off by defining our coroutine; it is just a function (we called it complain_about) that takes a single argument: a string. 

After printing a message, it enters an infinite loop enclosed in a try/except clause. 

This means that the only way to exit the loop is via an exception. 

We are particularly interested in a very specific exception: GeneratorExit. When we catch one of these, we simply clean up and quit.

The body of the loop itself is pretty simple; we use a yield expression to fetch data (somehow) and store it in the variable text. Then, we simply check whether substring is in text, and if so, we whine a bit.

In [29]:
c = complain_about('Machine Learning')

In [30]:
next(c)

Please talk to me!


In [31]:
c.send('Test data')

In [32]:
c.send('Some more random text')

In [33]:
c.send('I love Machine Learning because it is cool!')

Oh no: I found a Machine Learning again!


In [34]:
c.send('Hello')

In [35]:
next(c)

TypeError: argument of type 'NoneType' is not iterable

The execution of complain_about('Machine Learning') creates the coroutine, but nothing else 
seems to happen. 

In order to use the newly created coroutine, we need to call next() 
on it, just like we had to do with generators. In fact, we see that it is only after calling 
next() that we get Please talk to me! printed on the screen.
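
The need for that first next() call can be checked directly. In this small sketch (echo is a hypothetical coroutine), sending a value before the coroutine has reached its first yield is rejected with a TypeError, while next(), which behaves like send(None), is accepted.

```python
def echo():
    while True:
        received = (yield)
        print('got', received)


c = echo()
try:
    c.send('too early')        # the coroutine has not reached a yield yet
    error = None
except TypeError as exc:
    error = str(exc)
print(error)                   # message explaining that only None may be sent first

next(c)                        # run up to the first yield
c.send('now it works')         # prints: got now it works
```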

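At this point, the coroutine has reached the text = (yield) line, which means  
that it suspends its execution. Control goes back to the interpreter so that we 
can send data to the coroutine itself. We do that using its send() method. 
Calling next() on a suspended coroutine, as in In [35], is equivalent to send(None), 
so the check substring in text fails with the TypeError shown above.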

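We can stop the coroutine by calling its close() method, which results in a 
GeneratorExit exception being raised inside the coroutine. The only thing that a 
coroutine is allowed to do at this point is catch the exception, do some cleaning up, 
and exit. The following snippet shows how to close the coroutine. Since the uncaught 
TypeError above has already terminated c, the call prints nothing here; on a coroutine 
that is still suspended, it would print Ok, ok: I am quitting.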

In [36]:
c.close()
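
As a hedged sketch of what close() does on a coroutine that is still suspended, the following reuses a trimmed copy of complain_about with a made-up substring. It also shows that closing twice is harmless and that sending to a closed coroutine raises StopIteration.

```python
def complain_about(substring):
    try:
        while True:
            text = (yield)
            if substring in text:
                print('Oh no: I found a %s again!' % substring)
    except GeneratorExit:
        print('Ok, ok: I am quitting.')


c = complain_about('spam')
next(c)
c.close()                  # prints: Ok, ok: I am quitting.
c.close()                  # closing an already finished coroutine does nothing

try:
    c.send('more spam')
    outcome = 'still running'
except StopIteration:
    outcome = 'finished'
print(outcome)             # finished
```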

### Decorators

When using coroutines, most people find having to call next() on the coroutine 
rather annoying and end up using a decorator to avoid the extra call, as the following 
example shows:

In [38]:
def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

In [39]:
@coroutine
def complain_about2(substring):
    print('Please talk to me!')
    while True:
        text = (yield)
        if substring in text:
            print('Oh no: I found a %s again!'% (substring))
            

In [40]:
c = complain_about2('JavaScript')

Please talk to me!


In [41]:
c.send('Test data with JavaScript somewhere in it')

Oh no: I found a JavaScript again!


In [42]:
c.send('Hello')

In [43]:
c.close()
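
One possible refinement of the decorator, offered as a sketch rather than as the notebook's own method, is to apply functools.wraps so that the decorated function keeps its name and docstring. The running_total coroutine below is a hypothetical example.

```python
import functools


def coroutine(fn):
    """Prime a generator-based coroutine so callers can send() right away."""
    @functools.wraps(fn)           # keep fn.__name__ and fn.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper


@coroutine
def running_total(results):
    """Illustrative coroutine: appends the running sum to `results`."""
    total = 0
    while True:
        total += (yield)
        results.append(total)


results = []
rt = running_total(results)
rt.send(1)
rt.send(2)
print(running_total.__name__, results)   # running_total [1, 3]
```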

Coroutines can be arranged in rather complex hierarchies, with one coroutine sending 
data to multiple other ones and getting data from multiple sources as well.

# An asynchronous example

To keep things simple but still interesting, let's write a tool that, given a text file, will 
count the occurrences of a given word.

In [61]:
# !time(!grep -io love pg2600.txt | wc -l)

In [62]:
def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

In [73]:
#Reading the file line by line (done by the cat function)

#The first function, cat, acts as the data source for the whole program; it reads the 
#file line by line and sends each line to grep (child.send(line)). If we want a 
#case-insensitive match, then we simply make line lowercase; otherwise, we pass it 
#unchanged.

def cat(f, case_insensitive, child):
    if case_insensitive:
        line_processor = lambda l: l.lower()
    else:
        line_processor = lambda l: l

    for line in f:
        child.send(line_processor(line))

In [64]:
#Counting the occurrences of substring in each line

#The grep function is our first coroutine. In it, we enter an infinite loop where we 
#keep receiving data (text = (yield)), count the occurrences of substring in 
#text, and send that number of occurrences to the next coroutine (count in our case) 
#with child.send(text.count(substring)).

@coroutine
def grep(substring, case_insensitive, child):
    if case_insensitive:
        substring = substring.lower()
    while True:
        text = (yield)
        child.send(text.count(substring))

In [72]:
#Adding up all the numbers and printing out the total

#The count coroutine keeps a running total, n, of the numbers it receives (n += 
#(yield)) from grep. It catches the GeneratorExit exception raised inside each coroutine 
#when it is closed (which in our case happens implicitly: once cat returns, CPython 
#garbage-collects the coroutine objects and closes them) to know when to print out 
#substring and n.

@coroutine
def count(substring):
    n = 0
    try:
        while True:
            n += (yield)
    except GeneratorExit:
        print(substring, n)

In [77]:
cat(f=open('pg2600.txt'), case_insensitive=True, child=grep(substring='love', case_insensitive=True, child=count('love')))

love 677
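
The same pipeline can be tried without pg2600.txt. The sketch below is a variation, not the notebook's code: it reads from an in-memory io.StringIO with made-up sample text, stores the result in a totals dictionary (a hypothetical extra parameter) instead of printing it, and closes each stage explicitly rather than relying on garbage collection.

```python
import io


def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper


def cat(f, case_insensitive, child):
    for line in f:
        child.send(line.lower() if case_insensitive else line)
    child.close()                      # explicit shutdown at end of input


@coroutine
def grep(substring, case_insensitive, child):
    if case_insensitive:
        substring = substring.lower()
    try:
        while True:
            text = (yield)
            child.send(text.count(substring))
    finally:
        child.close()                  # propagate the shutdown downstream


@coroutine
def count(substring, totals):
    n = 0
    try:
        while True:
            n += (yield)
    finally:
        totals[substring] = n          # record the result when closed


totals = {}
sample = io.StringIO('Love is love.\nNo match here.\nAll you need is LOVE\n')
cat(sample, True, grep('love', True, count('love', totals)))
print(totals)   # {'love': 3}
```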


Things become interesting when we start organizing coroutines into complex  
graphs. For instance, we might want to count the occurrences of multiple words  
in the input file.

In [78]:
def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper

def cat(f, case_insensitive, child):
    if case_insensitive:
        line_processor = lambda l: l.lower()
    else:
        line_processor = lambda l: l

    for line in f:
        child.send(line_processor(line))

@coroutine
def grep(substring, case_insensitive, child):
    if case_insensitive:
        substring = substring.lower()
    while True:
        text = (yield)
        child.send(text.count(substring))

@coroutine
def count(substring):
    n = 0
    try:
        while True:
            n += (yield)
    except GeneratorExit:
        print(substring, n)

@coroutine
def fanout(children):
    while True:
        data = (yield)
        for child in children:
            child.send(data)

In [81]:
cat(f=open('pg2600.txt'), case_insensitive=True, child=fanout(children=[grep(substring=p,case_insensitive=True,child=count(p)) for p in ['love','hate','hope']]))

hate 103
love 677
hope 158
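
The order of the output lines above depends on when each count coroutine happens to be closed. A minimal sketch of a fanout that closes its children explicitly, and therefore in a predictable order, could look like this; count_word, totals, and the sample lines are hypothetical.

```python
def coroutine(fn):
    def wrapper(*args, **kwargs):
        c = fn(*args, **kwargs)
        next(c)
        return c
    return wrapper


@coroutine
def fanout(children):
    try:
        while True:
            data = (yield)
            for child in children:
                child.send(data)       # broadcast each item to every child
    finally:
        for child in children:
            child.close()              # close children in list order


@coroutine
def count_word(word, totals):
    n = 0
    try:
        while True:
            n += (yield).lower().count(word)
    finally:
        totals[word] = n


totals = {}
root = fanout([count_word(w, totals) for w in ['love', 'hate', 'hope']])
for line in ['Love and hope', 'hate, hate, HOPE', 'nothing']:
    root.send(line)
root.close()
print(totals)   # {'love': 1, 'hate': 2, 'hope': 2}
```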


Python 3.4 introduced a new library for asynchronous I/O called asyncio.
Python 3.5 introduced native coroutines, defined with async def and driven with await.
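
For comparison, here is a minimal sketch of the async def and await syntax. It assumes Python 3.7 or later, where asyncio.run is available; count_word and the sample text are hypothetical.

```python
import asyncio


async def count_word(text, word):
    """Illustrative native coroutine: counts `word` in `text`."""
    await asyncio.sleep(0)            # hand control back to the event loop once
    return text.lower().count(word)


async def main():
    text = 'Love, hate and more love'
    # Run the three coroutines concurrently; results come back in argument order.
    return await asyncio.gather(
        *(count_word(text, w) for w in ['love', 'hate', 'hope'])
    )


counts = asyncio.run(main())
print(counts)   # [2, 1, 0]
```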