# Iterables, Iterators, and Generators
The yield keyword was added in Python 2.2 (2001). It allows the construction of generators, which work as iterators.

## Sentence Take #1: A Sequence of Words

In [9]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string: '\w' is not a valid str escape sequence

class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __getitem__(self, index):
        return self.words[index]

    def __len__(self):
        return len(self.words)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)


In [2]:
s = Sentence('"The time has come," the Walrus said,') 

In [3]:
s

Sentence('"The time ha... Walrus said,')

In [4]:
for word in s:
    print(word)


The
time
has
come
the
Walrus
said


In [6]:
s[0], s[1], s[2]

('The', 'time', 'has')

## Why Sequences Are Iterable: The iter Function
Whenever the interpreter needs to iterate over an object x, it automatically calls iter(x). The iter built-in function:
-  Checks whether the object implements __iter__, and calls that to obtain an iterator.
-  If __iter__ is not implemented, but __getitem__ is implemented, Python creates an iterator that attempts to fetch items in order, starting from index 0 (zero).
-  If that fails, Python raises TypeError, usually saying “C object is not iterable,” where C is the class of the target object.

That is why any Python sequence is iterable: they all implement __getitem__. In fact, the standard sequences also implement __iter__, and yours should too, because the special handling of __getitem__ exists for backward compatibility reasons and may be gone in the future (although it is not deprecated as I write this).
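
The three cases that iter handles can be tried out in a minimal sketch. The class names WithIter, WithGetitem, and Neither are hypothetical and exist only for this illustration.

```python
class WithIter:
    """Iterable through an explicit __iter__ method."""
    def __iter__(self):
        return iter(['a', 'b'])


class WithGetitem:
    """No __iter__: iter() falls back to __getitem__ with indexes 0, 1, 2, ..."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)  # IndexError ends the fallback iteration
        return index * 10


class Neither:
    """Implements neither method, so iter() cannot build an iterator."""


from_iter = list(iter(WithIter()))
from_getitem = list(iter(WithGetitem()))

try:
    iter(Neither())
except TypeError as exc:
    error_message = str(exc)

print(from_iter)      # ['a', 'b']
print(from_getitem)   # [0, 10, 20]
print(error_message)  # 'Neither' object is not iterable
```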

## Iterables Versus Iterators
An iterable is any object from which the iter built-in function can obtain an iterator. Objects implementing an __iter__ method returning an iterator are iterable. Sequences are always iterable, as are objects implementing a __getitem__ method that takes 0-based indexes.
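
One consequence of this definition is worth a quick check: an object that relies only on __getitem__ is accepted by iter, yet it does not pass an isinstance test against collections.abc.Iterable, because that ABC only looks for __iter__. The Squares class below is a hypothetical example, not part of the Sentence code.

```python
from collections import abc


class Squares:
    """Hypothetical class with only __getitem__, taking 0-based indexes."""
    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)
        return index ** 2


sq = Squares()
passes_abc_check = isinstance(sq, abc.Iterable)  # the ABC only checks for __iter__
items = list(iter(sq))                           # iter() still succeeds

print(passes_abc_check)  # False
print(items)             # [0, 1, 4, 9]
```

So calling iter(x) and handling TypeError is a more reliable test of iterability than the isinstance check.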

It’s important to be clear about the relationship between iterables and iterators: Python obtains iterators from iterables.

In [1]:
s = "ABC"  # s is the iterable string

In [2]:
for char in s:
    print(char)

A
B
C


If there were no for statement, we could emulate the loop by hand with a while loop:

In [4]:
s = "ABC"
it = iter(s)
it

<str_iterator at 0x4e797b8>

In [5]:
while True:
    try:
        print(next(it))
    except StopIteration:  # StopIteration signals that the iterator is exhausted.
        del it
        break


A
B
C


The iterator interface is formalized in the collections.abc.Iterator ABC, which defines the __next__ abstract method and subclasses Iterable, where the abstract __iter__ method is defined. The best way to check if an object x is an iterator is to call isinstance(x, abc.Iterator).
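
As a small sketch of that relationship, the check below contrasts a list (iterable, but not an iterator) with the iterator obtained from it. The variable names are chosen for this illustration only.

```python
from collections import abc

numbers = [1, 2, 3]         # a list is iterable, but it has no __next__
numbers_it = iter(numbers)  # the iterator built from the list

list_is_iterator = isinstance(numbers, abc.Iterator)
iter_is_iterator = isinstance(numbers_it, abc.Iterator)
iter_is_iterable = isinstance(numbers_it, abc.Iterable)  # Iterator subclasses Iterable

print(list_is_iterator, iter_is_iterator, iter_is_iterable)  # False True True
```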

In [8]:
import collections.abc  # import the submodule explicitly so collections.abc is always available

s = "12345"
it = iter(s)
isinstance(it, collections.abc.Iterator)

True

In [10]:
s3 = Sentence('Pig and Pepper')
it = iter(s3)
it

<iterator at 0x50dc080>

In [11]:
next(it)

'Pig'

In [12]:
next(it)

'and'

In [13]:
next(it)

'Pepper'

In [14]:
import traceback

try:
    next(it)
except StopIteration:
    traceback.print_exc()


Traceback (most recent call last):
  File "<ipython-input-14-8d2715f26bbc>", line 4, in <module>
    next(it)
StopIteration


In [15]:
list(it)  # the iterator is exhausted: its 3 words were already consumed

[]

In [16]:
list(iter(s3))  # to go over the sentence again, build a new iterator

['Pig', 'and', 'Pepper']

An iterator is any object that implements the no-argument __next__ method, which returns the next item in a series or raises StopIteration when there are no more items. Python iterators also implement the __iter__ method, so they are iterable as well.
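
A minimal sketch of an object meeting this definition follows. Countdown is a hypothetical class written for this illustration.

```python
class Countdown:
    """Hypothetical iterator producing start, start - 1, ..., 1."""
    def __init__(self, start):
        self.current = start

    def __next__(self):  # takes no argument besides self
        if self.current <= 0:
            raise StopIteration  # signals that there are no more items
        value = self.current
        self.current -= 1
        return value

    def __iter__(self):  # an iterator returns itself
        return self


countdown = Countdown(3)
first = next(countdown)
rest = list(countdown)
again = list(countdown)  # the iterator stays exhausted

print(first, rest, again)  # 3 [2, 1] []
```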

## Sentence Take #2: A Classic Iterator
The next Sentence class is built according to the classic Iterator design pattern, following the blueprint in the GoF book (Design Patterns, by Gamma, Helm, Johnson, and Vlissides).

In [17]:
import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string: '\w' is not a valid str escape sequence

class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        return SentenceIterator(self.words) 


In [18]:
class SentenceIterator:
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word

    def __iter__(self):
        return self


Note that implementing __iter__ in SentenceIterator is not actually needed for this example to work, but it’s the right thing to do: iterators are supposed to implement both __next__ and __iter__, and doing so makes our iterator pass the issubclass(SentenceIterator, abc.Iterator) test. If we had subclassed SentenceIterator from abc.Iterator, we’d inherit the concrete abc.Iterator.__iter__ method.
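
The subclassing alternative mentioned above could look like the sketch below. WordIterator is a hypothetical name used to avoid clashing with SentenceIterator. It defines only __next__ and inherits __iter__ from abc.Iterator.

```python
from collections import abc


class WordIterator(abc.Iterator):
    """Hypothetical variant of SentenceIterator that inherits __iter__."""
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration from None
        self.index += 1
        return word


word_it = WordIterator(['Pig', 'and', 'Pepper'])
returns_self = iter(word_it) is word_it  # inherited __iter__ returns self
words = list(word_it)

print(returns_self)  # True
print(words)         # ['Pig', 'and', 'Pepper']
```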

In [19]:
s = Sentence('Pig and Pepper')

In [21]:
s.__iter__()

<__main__.SentenceIterator at 0x53614a8>

In [22]:
it = iter(s)
it

<__main__.SentenceIterator at 0x53610b8>

In [23]:
list(it)

['Pig', 'and', 'Pepper']

In [24]:
isinstance(it, collections.abc.Iterator)

True

## Making Sentence an Iterator: Bad Idea
A common cause of errors in building iterables and iterators is to confuse the two. To be clear: iterables have an __iter__ method that instantiates a new iterator every time. Iterators implement a __next__ method that returns individual items, and an __iter__ method that returns self.

Therefore, iterators are also iterable, but iterables are not iterators. A proper implementation of the pattern requires each call to iter(my_iterable) to create a new, independent, iterator.
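
The difference can be observed with a plain list standing in for the iterable. In this sketch each call to iter on the list gives an independent iterator, while calling iter on an iterator returns the same object with the same state.

```python
words = ['Pig', 'and', 'Pepper']  # a list is an iterable, not an iterator

first_it = iter(words)
second_it = iter(words)
are_distinct = first_it is not second_it  # each call builds a new iterator

a = next(first_it)   # advances first_it only
b = next(first_it)
c = next(second_it)  # second_it still starts at the beginning

same_object = iter(first_it) is first_it  # an iterator returns itself
remaining = list(iter(first_it))          # so iteration resumes where it stopped

print(are_distinct, a, b, c)   # True Pig and Pig
print(same_object, remaining)  # True ['Pepper']
```

If Sentence were its own iterator, two loops over the same instance would share one position, as first_it does here, instead of traversing the words independently.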

## Sentence Take #3: A Generator Function