# Chapter 17: Iterators, Generators, and Classic Coroutines

## A Sequence of Words

In [71]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:
    
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text) # 1
    
    def __getitem__(self, index):
        return self.words[index]  # 2
    
    def __len__(self):
        return len(self.words) # 3
    
    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text) # 4

1. `.findall` returns all non-overlapping matches of the pattern as a list of strings.
2. `self.words` holds the result of `.findall`, so we simply return the word at the given index.
3. To complete the sequence protocol, we implement `__len__`, although it is not needed to make the object iterable.
4. By default, `reprlib.repr` limits the generated string to 30 characters.

Testing iteration on a `Sentence` instance:
1. A sentence is created from a string.
2. Note the output of `__repr__`, abbreviated with `...` by `reprlib.repr`.
3. `Sentence` instances are iterable; we'll see why in a moment.
4. Being iterable, `Sentence` objects can be used as input to build lists and other iterable types.

In [72]:
s = Sentence('"The time has come," the Walrus said,')
s

Sentence('"The time ha... Walrus said,')

In [73]:
for word in s:
    print(word)

The
time
has
come
the
Walrus
said


In [74]:
list(s)

['The', 'time', 'has', 'come', 'the', 'Walrus', 'said']

In [75]:
words = Sentence('This is a test')
iterator = iter(words)
print(next(iterator))

This


## Why Sequences Are Iterable: The iter Function
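
A brief sketch of what this heading refers to: when Python needs to iterate over an object `x`, it calls `iter(x)`. The `iter` builtin first looks for `__iter__`; if that is missing but `__getitem__` is implemented, it builds an iterator that fetches items by index, starting from 0, until `IndexError` is raised; otherwise it raises `TypeError`. That fallback is why the first `Sentence` class is iterable without defining `__iter__`. The `Spam` class below is a hypothetical name used only for illustration.

```python
class Spam:
    """Hypothetical class: implements __getitem__ but not __iter__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)  # signals the end of the sequence
        return index * 10


spam = Spam()
print(hasattr(spam, '__iter__'))  # False: no __iter__ method anywhere
print(list(spam))                 # [0, 10, 20], via the __getitem__ fallback

# An object with neither __iter__ nor __getitem__ is not iterable.
try:
    iter(object())
except TypeError as exc:
    print('TypeError:', exc)
```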

## Sentence Take #2: A Classic Iterator

In [59]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return f'Sentence({reprlib.repr(self.text)})'

    def __iter__(self):
        return SentenceIterator(self.words)
        
class SentenceIterator:
    
    def __init__(self, words):
        self.words = words
        self.index = 0
        
    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word
    
    def __iter__(self):
        return self

In [76]:
words = Sentence('This is a test')
iterator = iter(words)
print(next(iterator))

This
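
To make the iterator protocol more concrete, here is a minimal sketch of roughly what a `for` loop does with an iterable: it calls `iter()` once, then `next()` repeatedly until `StopIteration` is raised. The example uses a plain string instead of `Sentence` so that it is self-contained; the variable names are arbitrary.

```python
s = 'ABC'
it = iter(s)          # build an iterator from the iterable
collected = []
while True:
    try:
        collected.append(next(it))  # fetch the next item
    except StopIteration:           # the iterator is exhausted
        break

print(collected)       # ['A', 'B', 'C']
print(list(it))        # []: an exhausted iterator stays exhausted
print(list(iter(s)))   # ['A', 'B', 'C']: a fresh iterator starts over
```

An iterator cannot be reset, which is why the iterable (`Sentence`) and the iterator (`SentenceIterator`) are separate classes: each call to `iter()` on the iterable must produce a new, independent iterator.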


## Sentence Take #4: Lazy Generator

`Sentence` implementations #1 to #3 build a list of all words in the text and bind it to the `self.words` attribute. This requires processing the entire text up front, and the list may use as much memory as the text itself. That is not lazy enough.

In [77]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:
    
    def __init__(self, text):
        self.text = text #1
        
    def __repr__(self):
        return f'Sentence({reprlib.repr(self.text)})'
    
    def __iter__(self):
        for match in RE_WORD.finditer(self.text): #2
            yield match.group()  #3

In [78]:
words = Sentence('This is a test')
iterator = iter(words)
print(next(iterator))

This


1. No need to have a word list.
2. `finditer` builds an iterator over the matches of `RE_WORD` on `self.text`, yielding `re.Match` instances.
3. `match.group()` extracts the matched text from the `re.Match` instance.
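
The laziness can be observed directly. In the sketch below, a hypothetical generator function `gen_ab` appends to a `log` list (both names are made up for this illustration) so we can see that calling the function runs none of its body, and that each `next()` runs only as far as the following `yield`.

```python
log = []  # records which parts of the generator body have run


def gen_ab():
    log.append('start')
    yield 'A'
    log.append('continue')
    yield 'B'
    log.append('end')


g = gen_ab()
after_call = list(log)    # nothing has run yet
first = next(g)
after_first = list(log)   # body ran up to the first yield
rest = list(g)            # drives the generator to completion

print(after_call)   # []
print(first)        # A
print(after_first)  # ['start']
print(rest)         # ['B']
print(log)          # ['start', 'continue', 'end']
```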

## Sentence Take #5: Lazy Generator Expression

We can replace the simple generator function with a generator expression.

In [82]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:
    
    def __init__(self, text):
        self.text = text #1
        
    def __repr__(self):
        return f'Sentence({reprlib.repr(self.text)})'
    
    def __iter__(self):
        return (match.group() for match in RE_WORD.finditer(self.text)) #2

In [83]:
words = Sentence('This is a test')
iterator = iter(words)
print(next(iterator))

This


## When to Use Generator Expressions

A generator expression is a syntactic shortcut to create a generator without defining and calling a function.

A generator function, on the other hand, is more flexible: we can code complex logic with multiple statements, and we can even use it as a coroutine.

If the generator expression spans more than a couple of lines, I prefer to code a generator function for the sake of readability.
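
As a rough illustration of that trade-off, the two hypothetical helpers below (`words_genfunc` and `words_genexp` are names invented here, as is the length filter) produce the same words. The function version has room for intermediate variables and comments, while the expression version packs everything into one line.

```python
import re

RE_WORD = re.compile(r'\w+')


def words_genfunc(text):
    """Generator function: easy to extend with more statements."""
    for match in RE_WORD.finditer(text):
        word = match.group()
        if len(word) > 2:      # illustrative filter: skip short words
            yield word.lower()


def words_genexp(text):
    """Same logic as a generator expression."""
    return (m.group().lower() for m in RE_WORD.finditer(text)
            if len(m.group()) > 2)


text = 'The time has come, so it is said'
print(list(words_genfunc(text)))  # ['the', 'time', 'has', 'come', 'said']
print(list(words_genexp(text)))   # same result
```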

## Contrasting Iterators and Generators

- iterator: General term for any object that implements a `__next__` method. Iterators are designed to produce data that is consumed by client code. In practice, most iterators we use in Python are generators.
- generator: An iterator built by the Python compiler. To create a generator, we don't implement `__next__`. Instead, we use the `yield` keyword to make a generator function, which is a factory of generator objects. A generator expression is another way to build a generator object. Generator objects provide `__next__`, so they are iterators. Since Python 3.6, we also have asynchronous generators, declared with `async def`.

In [5]:
def g():
    yield 0

ge = (c for c in 'XYZ')
g(), ge

(<generator object g at 0x00000208B4704350>,
 <generator object <genexpr> at 0x00000208B47043C0>)

In [6]:
type(g()), type(ge)

(generator, generator)

In [8]:
print(dir(g()))

['__class__', '__del__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__name__', '__ne__', '__new__', '__next__', '__qualname__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'gi_code', 'gi_frame', 'gi_running', 'gi_yieldfrom', 'send', 'throw']
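
The relationship between these terms can also be checked with the abstract base classes in `collections.abc`. This is a small sketch rather than an exhaustive test: a generator object is an iterator (and therefore iterable), whereas a list is iterable but not an iterator.

```python
from collections import abc


def g():
    yield 0


gen = g()
print(isinstance(gen, abc.Generator))      # True
print(isinstance(gen, abc.Iterator))       # True: it has __next__ and __iter__
print(iter(gen) is gen)                    # True: an iterator returns itself
print(isinstance([1, 2], abc.Iterable))    # True
print(isinstance([1, 2], abc.Iterator))    # False: a list has no __next__
```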


## Rewrite Sentence Class to a Generator Function

In [100]:
import re

RE_WORD = re.compile(r'\w+')

def sentence(text):
    for match in RE_WORD.finditer(text):
        yield match.group()

In [101]:
msg = 'Hello Fluent Python'
s = sentence(msg)
s

<generator object sentence at 0x00000250ABE762D0>

In [102]:
print(next(s))

Hello


## An Arithmetic Progression Generator

The `range` builtin generates a bounded arithmetic progression (AP) of integers.

In [84]:
class ArithmeticProgression:
    
    def __init__(self, begin, step, end=None): #1
        self.begin = begin
        self.step = step
        self.end = end # None -> "infinite" series
        
    def __iter__(self):
        result_type = type(self.begin + self.step) #2
        result = result_type(self.begin) #3
        forever = self.end is None #4
        # end = result_type(self.end) if not forever else end 
        index = 0
        while forever or result < self.end: #5
            yield result  #6
            index += 1
            result = self.begin + self.step * index #7

1. `__init__` requires two arguments: `begin` and `step`. `end` is optional; if it is `None`, the series will be unbounded.
2. Get the type of the sum of `self.begin` and `self.step`. For example, if one is `int` and the other is `float`, `result_type` will be `float`.
3. This line makes a `result` with the same numeric value as `self.begin`, but coerced to `result_type`.
4. For readability, the `forever` flag will be `True` if `end` is `None`, resulting in an unbounded series.
5. This loop runs `forever` or until `result` matches or exceeds `self.end`. When this loop exits, so does the generator.
6. The current `result` is produced.
7. The next potential `result` is calculated. It may never be yielded, because the `while` loop may terminate first.
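
One reason to compute each term as `begin + step * index`, rather than adding `step` to the previous result, is that repeated floating-point addition accumulates rounding error. The sketch below contrasts the two approaches; `ap_by_addition` and `ap_by_multiplication` are hypothetical helper names introduced only for this comparison.

```python
def ap_by_addition(begin, step, count):
    """Hypothetical variant: add `step` to the previous term each time."""
    result = begin
    terms = []
    for _ in range(count):
        terms.append(result)
        result += step  # rounding errors accumulate here
    return terms


def ap_by_multiplication(begin, step, count):
    """Same formula as the class above: begin + step * index."""
    return [begin + step * index for index in range(count)]


added = ap_by_addition(0.0, 0.1, 11)
multiplied = ap_by_multiplication(0.0, 0.1, 11)

print(multiplied[-1])               # 1.0
print(added[-1])                    # close to, but not exactly, 1.0
print(added[-1] == multiplied[-1])  # False
```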

In [40]:
ap = ArithmeticProgression(0, 1, 3)
print(list(ap))

ap = ArithmeticProgression(1, .5, 3)
print(list(ap))

ap = ArithmeticProgression(0, 1/3, 1)
print(list(ap))

[0, 1, 2]
[1.0, 1.5, 2.0, 2.5]
[0.0, 0.3333333333333333, 0.6666666666666666]


In [89]:
# Doesn't support complex numbers if end is not None
# TypeError: '<' not supported between instances of 'complex' and 'complex'
# Are there any ways to compare complex numbers?

ap = ArithmeticProgression(1, 0.5+.5j)
iterator = iter(ap)
for _ in range(10):
    print(next(iterator))

(1+0j)
(1.5+0.5j)
(2+1j)
(2.5+1.5j)
(3+2j)
(3.5+2.5j)
(4+3j)
(4.5+3.5j)
(5+4j)
(5.5+4.5j)


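Maybe we don't need a class here; all we need is a generator function.
Calling it returns a generator object, which has both `__next__` and `__iter__` methods, so it can be passed directly to `next()`.
An `ArithmeticProgression` instance, by contrast, has only `__iter__` and no `__next__`: it is iterable but not an iterator, so we must call `iter()` on it before calling `next()`.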
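
The difference between the two designs can be seen in a short, self-contained sketch. `CountUp` and `count_up` are hypothetical stand-ins (simpler than the arithmetic progression code) for an iterable class and a generator function. Note the trade-off it exposes: the class instance can be iterated more than once, while a generator object is consumed after a single pass.

```python
class CountUp:
    """Hypothetical iterable class: __iter__ is a generator method."""

    def __init__(self, stop):
        self.stop = stop

    def __iter__(self):
        n = 0
        while n < self.stop:
            yield n
            n += 1


def count_up(stop):
    """Hypothetical generator function with the same logic."""
    n = 0
    while n < stop:
        yield n
        n += 1


# The class instance is iterable, but it is not an iterator.
try:
    next(CountUp(3))
except TypeError as exc:
    print('TypeError:', exc)

print(next(count_up(3)))        # 0: a generator object supports next()

c = CountUp(3)
print(list(c), list(c))         # [0, 1, 2] [0, 1, 2]: a new generator each time

gen = count_up(3)
print(list(gen), list(gen))     # [0, 1, 2] []: exhausted after one pass
```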


In [91]:
def arithmetic_progression(begin, step, end=None):
    result_type = type(begin + step)
    result = result_type(begin)
    forever = end is None
    index = 0
    while forever or result < end:
        yield result
        index += 1
        result = begin + step * index

In [94]:
ap = arithmetic_progression(0, 1, 3)
print(next(ap))
print(list(ap))

0
[1, 2]


## Arithmetic Progression with itertools

In [105]:
import itertools
gen = itertools.count(1, .5)
print(next(gen))
print(next(gen))
print(next(gen))

1
1.5
2.0


In [106]:
import itertools
gen = itertools.count(1, .5j)
print(next(gen))
print(next(gen))
print(next(gen))

1
(1+0.5j)
(1+1j)


`itertools.count` never stops, so if you call `list(count())`, Python will try to build a `list` that would fill all available memory and crash the program.

On the other hand, there is the `itertools.takewhile` function, which returns an iterator that consumes another iterable and stops as soon as a given predicate evaluates to `False`.

In [107]:
gen = itertools.takewhile(lambda n: n < 3, itertools.count(1, .5))
print(list(gen))

[1, 1.5, 2.0, 2.5]
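
Combining the two, an arithmetic progression can be sketched without writing a loop at all. The function name `aritprog_gen` below is chosen for illustration, and the design is one possible approach, not the only one. Note that this function contains no `yield`, so it is not a generator function itself; it simply returns an iterator built by `itertools`.

```python
import itertools


def aritprog_gen(begin, step, end=None):
    """Return an iterator over an arithmetic progression (illustrative sketch)."""
    first = type(begin + step)(begin)      # coerce begin to the result type
    ap_gen = itertools.count(first, step)  # unbounded progression
    if end is None:
        return ap_gen
    # Bounded case: stop as soon as a term reaches or exceeds `end`.
    return itertools.takewhile(lambda n: n < end, ap_gen)


print(list(aritprog_gen(0, 1, 3)))    # [0, 1, 2]
print(list(aritprog_gen(1, .5, 3)))   # [1.0, 1.5, 2.0, 2.5]

unbounded = aritprog_gen(10, 5)
print(next(unbounded), next(unbounded), next(unbounded))  # 10 15 20
```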


## Generator Functions in the Standard Library

### Table 17.1 Filtering generator functions

| Module | Function | Description |
| --- | --- | --- |
| itertools | `takewhile(predicate, it)` | yields items while `predicate` is truthy, then stops without checking any further items |
| itertools | `dropwhile(predicate, it)` | skips items while `predicate` is truthy, then yields every remaining item without further checks |
| itertools | `compress(it, selector_it)` | consumes two iterables in parallel, yielding items from `it` whenever the corresponding item in `selector_it` is truthy |
| builtin | `filter(predicate, it)` | applies `predicate` to each item of `it`, yielding the item if `predicate(item)` is truthy; if `predicate` is `None`, only truthy items are yielded |
| itertools | `filterfalse(predicate, it)` | same as `filter` with the logic negated: yields the item whenever `predicate(item)` is falsy |
| itertools | `islice(it, stop)` or `islice(it, start, stop, step=1)` | yields items from a slice of `it`, similar to `s[:stop]` or `s[start:stop:step]`, except that `it` can be any iterable and the operation is lazy |

In [108]:
vowel = lambda c: c.lower() in 'aeiou'

print(list(filter(vowel, 'Aardvark')))

['A', 'a', 'a']


In [109]:
import itertools

print(list(itertools.filterfalse(vowel, 'Aardvark')))
print(list(itertools.dropwhile(vowel, 'Aardvark')))
print(list(itertools.takewhile(vowel, 'Aardvark')))
print(list(itertools.compress('Aardvark', (1,0,1,1,0,1))))
print(list(itertools.islice('Aardvark', 4)))
print(list(itertools.islice('Aardvark', 4, 7)))
print(list(itertools.islice('Aardvark', 1, 7, 2)))

['r', 'd', 'v', 'r', 'k']
['r', 'd', 'v', 'a', 'r', 'k']
['A', 'a']
['A', 'r', 'd', 'a']
['A', 'a', 'r', 'd']
['v', 'a', 'r']
['a', 'd', 'a']


### Table 17.2 Mapping generator functions

| Module | Function | Description |
| --- | --- | --- |
| itertools | `accumulate(it, [func])` | yields accumulated sums; if `func` is provided, yields the result of applying it to the first pair of items, then to that result and the next item, and so on |
| itertools | `starmap(func, it)` | applies `func` to each item of `it`, unpacking the item as the arguments: each item must be an iterable `iit`, and the call is `func(*iit)` |
| builtin | `map(func, it)` | applies `func` to each item of `it`, yielding the result; if N iterables are given, `func` must take N arguments and the iterables are consumed in parallel |
| builtin | `enumerate(it, start=0)` | yields 2-tuples of the form `(index, item)`, where `index` is counted from `start` and `item` is taken from `it` |
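
The following cell is a small sketch exercising each function in Table 17.2. The `sample` list and the choice of `operator.mul` and `max` as binary functions are arbitrary, picked only to keep the results easy to verify by hand.

```python
import itertools
import operator

sample = [5, 4, 2, 8]

# accumulate: running totals, or running results of another binary function
print(list(itertools.accumulate(sample)))                # [5, 9, 11, 19]
print(list(itertools.accumulate(sample, max)))           # [5, 5, 5, 8]
print(list(itertools.accumulate(sample, operator.mul)))  # [5, 20, 40, 320]

# enumerate: (index, item) pairs, counting from `start`
print(list(enumerate('abc', 1)))       # [(1, 'a'), (2, 'b'), (3, 'c')]

# map with two iterables: func receives one item from each
print(list(map(operator.mul, range(3), [2, 4, 8])))      # [0, 4, 16]

# starmap: each item is unpacked into the arguments of func
print(list(itertools.starmap(operator.mul, enumerate('abc', 1))))  # ['a', 'bb', 'ccc']
```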