### Sequences (Iterables) - The `iter` function
- When the interpreter needs to iterate over an object `x`, it calls `iter(x)`.
- The `iter` function:
    - Checks whether the object implements an `__iter__` method and, if so, calls it to obtain an iterator.
    - Falls back to `__getitem__` if it's implemented and `__iter__` isn't, building an iterator that fetches items by index, starting from 0.
    - Raises `TypeError` if that fails.
- An object is considered iterable if it implements `__iter__`, or if it implements `__getitem__`, so long as `__getitem__` accepts integer keys starting from 0.
- The most accurate way to check whether an object is iterable is to call `iter` on it and handle the `TypeError` raised if it isn't.
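
The check described above can be sketched as follows. `Spam` and `is_iterable` are hypothetical names introduced only for this illustration. The sketch also shows why an `isinstance` check against `abc.Iterable` is less accurate: it misses objects that are iterable only through `__getitem__`.

```python
from collections import abc


class Spam:
    """Hypothetical class that is iterable only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


def is_iterable(obj):
    """Hypothetical helper: True if iter() can build an iterator from obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(is_iterable(Spam()))               # True
print(isinstance(Spam(), abc.Iterable))  # False: the ABC only looks for __iter__
print(list(Spam()))                      # [0, 10, 20]
print(is_iterable(42))                   # False
```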

> #### `iterable`
> Any object from which the built-in function `iter` can obtain an iterator. Objects implementing an `__iter__` method are iterable. Sequences are always iterable, as are objects implementing a `__getitem__` method that takes 0-based indices.

> This means Python obtains iterators from iterables.
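
As a rough illustration of that relationship, a `for` loop over a string can be emulated by hand with `iter` and `next`. The variable names here are chosen only for this sketch.

```python
s = 'ABC'

# Roughly what `for char in s: collected.append(char)` does behind the scenes.
it = iter(s)           # obtain an iterator from the iterable
collected = []
while True:
    try:
        collected.append(next(it))  # fetch the next item
    except StopIteration:           # the iterator is exhausted
        del it                      # release the reference to the iterator
        break

print(collected)  # ['A', 'B', 'C']
```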


In [6]:
# Sentence - take 1

import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string avoids the invalid '\w' escape


class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __getitem__(self, index):
        return self.words[index]

    def __len__(self):
        return len(self.words)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)


### Iterators
- The standard interface for an iterator has two methods:
    - `__next__` - Return the next available item. Raise `StopIteration` when there are no more items.
    - `__iter__` - Return `self`. Allows iterators to be used where iterables are expected, e.g., in `for` loops.

> #### iterator
> Any object that implements the no-argument `__next__` method, which returns the next item in a series or raises `StopIteration` when there are no more items. Python iterators also implement the `__iter__` method so that they are _iterable_ as well.


- Iterables have an `__iter__` method that instantiates a new iterator every time. Iterators implement a `__next__` method that returns individual items and an `__iter__` method that returns `self`.
> Iterators are also iterable, but iterables are not iterators.

- An iterable should never act as an iterator over itself, i.e., iterables must implement `__iter__` but not `__next__`.
- Iterators should have an `__iter__` method that just returns `self` so that the iterator is iterable, for convenience.
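
A small sketch of the distinction, using a built-in `list` as the iterable. Each call to `iter` on the list returns a new, independent iterator, while calling `iter` on an iterator returns the iterator itself.

```python
numbers = [1, 2, 3]          # an iterable, not an iterator

it1 = iter(numbers)
it2 = iter(numbers)
print(it1 is it2)            # False: a fresh iterator every time
print(next(it1), next(it1))  # 1 2
print(next(it2))             # 1: it2 keeps its own position

print(iter(it1) is it1)      # True: an iterator's __iter__ returns self
print(list(it1))             # [3]: only the remaining item
print(list(it1))             # []: an exhausted iterator stays exhausted
print(hasattr(numbers, '__next__'))  # False: the list itself is not an iterator
```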

## Classic (GoF) `Iterator` pattern

In [7]:
"""
Sentence - Take 2

This example shows that Sentence is iterable since it implements
an __iter__ method.

Not very Pythonic.
"""
import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string avoids the invalid '\w' escape


class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        return SentenceIterator(self.words)


class SentenceIterator:
    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()

        self.index += 1
        return word

    # implementing __iter__ in an iterator is not necessary to make the iterator
    # work. It's the right thing to do though, since iterators are supposed to
    # implement both the __next__ and __iter__ methods.
    # Doing so makes the iterator pass the
    # `issubclass(SentenceIterator, abc.Iterator)` test.
    def __iter__(self):
        return self


## Using generators
- You can replace the `SentenceIterator` class with a generator.
- Any function with the `yield` keyword in its body is a generator function.
- A generator function is a function that, when called, returns a generator object.
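
A minimal sketch of these points, using a hypothetical `gen_123` function. Calling it does not run its body. It returns a generator object, which is itself an iterator.

```python
import inspect


def gen_123():
    """Hypothetical generator function: its body contains `yield`."""
    yield 1
    yield 2
    yield 3


print(inspect.isgeneratorfunction(gen_123))  # True

g = gen_123()             # calling it returns a generator object
print(type(g).__name__)   # generator
print(iter(g) is g)       # True: generators are iterators
print(next(g))            # 1
print(list(g))            # [2, 3]: the remaining items
```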

In [9]:
# Sentence - Take 3.

import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string avoids the invalid '\w' escape


class Sentence:
    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        for word in self.words:
            yield word
        return  # Not needed: the function can simply fall through
                # and return automatically.


## Lazy Implementation Using Generator Functions
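- A lazy implementation postpones producing values until the last possible moment.
- This saves memory and may avoid useless processing.
- Takes 1 to 3 are not lazy: `__init__` eagerly builds a list of every word in the text, even if the caller only iterates over the first few.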
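
To make the idea concrete, here is a small sketch with a hypothetical infinite generator, `squares`, that records each value it is asked to compute. Taking four items with `itertools.islice` triggers exactly four computations, even though the generator could go on forever.

```python
import itertools

produced = []  # records which inputs were actually processed


def squares():
    """Hypothetical infinite generator: computes each square on demand."""
    n = 0
    while True:
        produced.append(n)
        yield n * n
        n += 1


first_four = list(itertools.islice(squares(), 4))
print(first_four)  # [0, 1, 4, 9]
print(produced)    # [0, 1, 2, 3]: nothing beyond the requested items was computed
```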

In [11]:
# Sentence - Take #4
import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string avoids the invalid '\w' escape


class Sentence:
    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        # finditer builds an iterator over the matches of RE_WORD, yielding
        # re.Match objects one at a time.
        for match in RE_WORD.finditer(self.text):
            yield match.group()  # extract and yield the actual matched text.


## Generator Expressions
- Simple generator functions can be replaced by generator expressions.
- A generator expression is like a lazy version of a list comprehension: it produces items on demand instead of building a whole list.
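
A rough sketch of the practical difference, assuming CPython, where `sys.getsizeof` reports the size of the container object itself. The variable names are chosen only for this illustration.

```python
import sys

squares_list = [n * n for n in range(100_000)]  # built in full, right now
squares_gen = (n * n for n in range(100_000))   # nothing computed yet

# The generator object stays small no matter how many items it will yield.
print(sys.getsizeof(squares_gen) < sys.getsizeof(squares_list))  # True

total = sum(squares_gen)            # items are produced one at a time
print(total == sum(squares_list))   # True
print(sum(squares_gen))             # 0: the generator is now exhausted
```

Unlike the list, the generator can be consumed only once, so a fresh generator expression is needed for each pass over the data.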

In [14]:
def gen_AB():
    print('Start')
    yield 'A'
    print('Continue')
    yield 'B'
    print('End.')

In [15]:
res1 = [x * 3 for x in gen_AB()]

Start
Continue
End.


In [17]:
for i in res1:
    print('-->', i)

--> AAA
--> BBB


- Notice that the list comprehension is built eagerly: the body of `gen_AB` runs to completion, producing values and printing output, as soon as `res1` is assigned.

In [18]:
res2 = (x * 3 for x in gen_AB())

In [19]:
res2

<generator object <genexpr> at 0x10f97d620>

In [20]:
for i in res2:
    print('-->', i)

Start
--> AAA
Continue
--> BBB
End.


- Notice that no output is produced until the generator is iterated over and has to produce values. Only then does the body of `gen_AB` execute, yielding values and printing output.

In [21]:
# Sentence - Take 5.
import re
import reprlib

RE_WORD = re.compile(r'\w+')  # raw string avoids the invalid '\w' escape


class Sentence:
    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return 'Sentence(%s)' % reprlib.repr(self.text)

    def __iter__(self):
        # Use a generator expression to build a generator and return it.
        return (match.group() for match in RE_WORD.finditer(self.text))
