## 17. Iterators, Generators, and Classic Coroutines

In [1]:
import re
import reprlib

In [2]:
RE_WORD = re.compile(r'\w+')

class Sentence:

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __getitem__(self, index):
        return self.words[index]
    
    def __len__(self):
        return len(self.words)
    
    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

In [3]:
s = Sentence('"The time has come," the Walrus said,')
s

Sentence('"The time ha... Walrus said,')

In [4]:
for word in s:
    print(word)

The
time
has
come
the
Walrus
said


In [5]:
list(s)

['The', 'time', 'has', 'come', 'the', 'Walrus', 'said']

In [6]:
s[0]

'The'

In [7]:
s[5]

'Walrus'

In [8]:
s[-1]

'said'

## Why Sequences Are Iterable: The `iter` function

Whenever Python needs to iterate over an object `x`, it automatically calls `iter(x)`.

The `iter` built-in function:

1. Checks whether the object implements `__iter__`, and calls that to obtain an iterator.
2. If `__iter__` is not implemented, but `__getitem__` is, then `iter()` creates an iterator that tries to fetch items by index, starting from `0` (zero) and stopping when `__getitem__` raises `IndexError`.
3. If that fails, Python raises `TypeError`, usually saying `'C' object is not iterable`, where `C` is the class of the target object.
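
The three steps above can be checked with a small experiment. This is only a minimal sketch: the class names `WithIter`, `WithGetitem`, and `Neither` are hypothetical and exist purely for illustration.

```python
class WithIter:
    """Step 1: implements __iter__, so iter() calls it directly."""

    def __iter__(self):
        return iter(['a', 'b'])


class WithGetitem:
    """Step 2: no __iter__, but __getitem__ accepts indexes starting at 0."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)  # tells the fallback iterator to stop
        return index * 10


class Neither:
    """Step 3: neither method is present, so iter() raises TypeError."""


from_iter = list(WithIter())
from_getitem = list(WithGetitem())

try:
    iter(Neither())
except TypeError as exc:
    error_message = str(exc)

print(from_iter)
print(from_getitem)
print(error_message)
```

The second class never defines `__iter__`, yet `list()` can still consume it, which is the same fallback that makes the first `Sentence` class iterable.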

In [10]:
class Spam:
    def __getitem__(self, i):
        print('->', i)
        raise IndexError()
    
spam_can = Spam()
iter(spam_can)

<iterator at 0x7fb6f8181db0>

In [11]:
list(spam_can)

-> 0


[]

In [12]:
from collections import abc
isinstance(spam_can, abc.Iterable)

False

In [13]:
class GooseSpam:
    def __iter__(self):
        pass

from collections import abc
issubclass(GooseSpam, abc.Iterable)

True

In [14]:
goose_spam_can = GooseSpam()
isinstance(goose_spam_can, abc.Iterable)

True
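
The results above suggest that `abc.Iterable` recognizes classes with `__iter__` but not objects that are iterable only through `__getitem__`. A more dependable check is to call `iter()` and catch `TypeError`. The helper name `is_iterable` below is hypothetical; it is a sketch, not a standard library function.

```python
from collections import abc


def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class Spam:
    # Only __getitem__: iterable through the index-based fallback.
    def __getitem__(self, i):
        raise IndexError()


spam_can = Spam()
abc_says = isinstance(spam_can, abc.Iterable)  # the ABC only looks for __iter__
iter_says = is_iterable(spam_can)              # iter() also accepts __getitem__
number_says = is_iterable(42)                  # int has neither method

print(abc_says, iter_says, number_says)
```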

### Using `iter` with a Callable

We can call `iter()` with two arguments to create an iterator from a function or any other callable object. In this usage, the first argument must be a callable that is invoked repeatedly, with no arguments, to produce values. The second argument is a _sentinel_: a marker value which, when returned by the callable, causes the iterator to raise `StopIteration` instead of yielding the sentinel.

In [28]:
from random import randint

def d6():
    return randint(1, 6)


In [29]:
d6_iter = iter(d6, 1)
d6_iter

<callable_iterator at 0x7fb6e28edd80>

In [31]:
for roll in d6_iter:
    print(roll)
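
Because `d6` is random, the loop above prints a different series on each run and stops as soon as a `1` is rolled; the `1` itself is never printed. A deterministic sketch of the same two-argument form is reading fixed-size blocks from a binary stream until `read` returns an empty `bytes` object. The stream contents and the block size of 4 are made up for illustration.

```python
from functools import partial
from io import BytesIO

stream = BytesIO(b'abcdefghij')  # stand-in for a real binary file

# partial(stream.read, 4) is a zero-argument callable; b'' is the sentinel
# that read() returns once the stream is exhausted.
blocks = list(iter(partial(stream.read, 4), b''))

print(blocks)
```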

## Iterables Versus Iterators

_iterable_
> Any object from which the `iter` built-in function can obtain an iterator. Objects implementing an `__iter__` method returning an iterator are iterable. Sequences are always iterable, as are objects implementing a `__getitem__` method that accepts 0-based indexes.

In [32]:
s = 'ABC'
for char in s:
    print(char)

A
B
C
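
The `for` statement hides the iterator. A rough hand-written equivalent, offered as a sketch of what happens behind the scenes, builds the iterator with `iter()` and pulls items with `next()` until `StopIteration` is raised. The list named `chars` is an assumption of this example, used instead of printing so the result is easy to inspect.

```python
s = 'ABC'
it = iter(s)  # build an iterator from the iterable
chars = []

while True:
    try:
        chars.append(next(it))  # fetch the next item
    except StopIteration:       # raised when the iterator is exhausted
        break

leftover = list(it)  # an exhausted iterator stays exhausted

print(chars)
print(leftover)
```

To go over the string again, a new iterator has to be built by calling `iter(s)` once more.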


## Sentence Classes with `__iter__`

The next variations of `Sentence` implement the standard iterable protocol, first by implementing the Iterator design pattern, and then with generator functions.


### Sentence Take #2: A Classic Iterator

In [33]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return f'Sentence({reprlib.repr(self.text)})'
    
    def __iter__(self):
        return SentenceIterator(self.words)
    
class SentenceIterator:

    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word
    
    def __iter__(self):
        return self
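
One consequence of keeping the iterator in a separate class is that every call to `iter()` on a `Sentence` returns a fresh, independent `SentenceIterator`. The sketch below uses a condensed copy of the two classes (without `__repr__`) so it runs on its own; the sample text is arbitrary.

```python
import re

RE_WORD = re.compile(r'\w+')


class SentenceIterator:
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word

    def __iter__(self):
        return self


class Sentence:
    def __init__(self, text):
        self.words = RE_WORD.findall(text)

    def __iter__(self):
        return SentenceIterator(self.words)  # a new iterator on every call


s = Sentence('Life of Brian')
it1 = iter(s)
it2 = iter(s)

first = next(it1)        # advances it1 only
second = next(it1)
other_first = next(it2)  # it2 still starts at the beginning
same_object = iter(it1) is it1  # an iterator returns itself from __iter__

print(first, second, other_first, same_object)
```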

### Sentence Take #3: A Generator Function

In [34]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')

class Sentence:

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)
    
    def __iter__(self):
        for word in self.words:
            yield word
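
In this version `__iter__` is a generator function: calling it returns a generator object, which is itself an iterator, so no separate `SentenceIterator` class is needed. The mechanics can be sketched with a tiny stand-alone generator; the name `gen_123` is hypothetical and not part of the `Sentence` code.

```python
def gen_123():
    yield 1
    yield 2
    yield 3


g = gen_123()    # calling the function does not run its body yet
first = next(g)  # runs the body up to the first yield
rest = list(g)   # drains the remaining values

try:
    next(g)
    exhausted = False
except StopIteration:  # raised once the function body has returned
    exhausted = True

print(first, rest, exhausted)
```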
