# Iterators, Containers, Generators

Talks by David Beazley:
  * [Generator Tricks for Systems Programmers](http://www.dabeaz.com/generators/) [[downloaded pdf](./Generators.pdf)]
  * [Generators: The Final Frontier](http://www.dabeaz.com/finalgenerator/) [[downloaded pdf](./FinalGenerator.pdf)]

## Summary

Iterators, iterables, containers, and generators are all objects that semantically represent a collection of items that can be looped over. Two built-in functions, `iter(obj)` and `next(obj)`, internally call the passed-in object's `obj.__iter__()` and `obj.__next__()` methods respectively. The usual calling pattern is:

```
it = iter(obj)
x = next(it)
```

### Iterator
Classes that implement both the `__iter__` and `__next__` methods. The `__next__` method returns the next element and raises `StopIteration` when there are no more elements. The `__iter__` method just returns `self`.

### Containers
Classes that implement the `__iter__` method, which should return an iterator object. These classes do not have to implement the `__next__` method because the iterator they return does that. However, in order to be a container, the class must have access to some iterator that it can return and that can in turn loop over the contained elements.

### Generators
Calling a function that `yield`s a value returns an object of type `<class 'generator'>`. Generators are themselves iterators, i.e., when a generator object is passed to `iter(gen)` it returns itself, which can then be looped over.

### Iterables
Classes that only have to implement the `__iter__` method. They don't have to implement `__next__` because the object returned by `__iter__` does it. Generators and containers are examples of iterables.

### Common Patterns
Generator functions are an easy way to implement containers. The `__iter__` method of the container can simply `yield` values. This means that calling `__iter__` returns a `generator`, which is an iterator. This meets the requirements to be a container. So far I have never had to implement a pure iterator or a container. In most cases I implement a generator.

If I want to wrap an existing iterable in a generator, then I can use the `yield from` syntax.
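A minimal sketch of that wrapping pattern, assuming two hypothetical names (`chain_all` and `Playlist`) that are only for illustration. `yield from` delegates to the inner iterable, which is roughly the same as writing `for x in iterable: yield x`.

```python
def chain_all(*iterables):
    """Hypothetical helper: yield every item of each iterable in turn."""
    for iterable in iterables:
        # Delegate to the inner iterable; its items are yielded one by one.
        yield from iterable


class Playlist:
    """Hypothetical container that wraps a list without exposing it."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __iter__(self):
        # `yield from` makes __iter__ a generator function, so every call
        # returns a new generator over the wrapped list.
        yield from self._songs


print(list(chain_all([1, 2], (3, 4), "ab")))  # [1, 2, 3, 4, 'a', 'b']
print(list(Playlist(["intro", "outro"])))     # ['intro', 'outro']
```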


## Iterable vs. Iterator

An **iterator** is an object:

  * With state that remembers where it is during iteration.
  * With a `__next__` method that:
    - Returns the next value in the iteration
    - Updates the state to point at the next value
    - Signals when it is done by raising `StopIteration`
  * That is self-iterable, i.e., it has an `__iter__` method that returns `self`.

An **iterable** is:

  * Anything that can be looped over
  * Anything you can call with `iter()` that will return an **iterator**
  * An object that defines `__iter__` that returns a fresh **iterator**, or it may have a `__getitem__` method suitable for indexed lookup.
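The distinction can be checked interactively. The sketch below uses a plain list as the iterable, plus a hypothetical `Squares` class (not part of these notes) to illustrate the `__getitem__` fallback, where `iter()` indexes from 0 until `IndexError` is raised.

```python
numbers = [10, 20, 30]  # a list is an iterable, but not an iterator

print(hasattr(numbers, "__iter__"))  # True
print(hasattr(numbers, "__next__"))  # False

it1 = iter(numbers)
it2 = iter(numbers)
print(it1 is it2)        # False: each iter() call gives a fresh iterator
print(iter(it1) is it1)  # True: an iterator is self-iterable

print(next(it1), next(it1))  # 10 20
print(next(it2))             # 10: it2 keeps its own position


class Squares:
    """Hypothetical class that only defines __getitem__."""

    def __init__(self, n):
        self._n = n

    def __getitem__(self, index):
        if index >= self._n:
            raise IndexError(index)  # tells iter() that the sequence has ended
        return index * index


print(list(Squares(4)))  # [0, 1, 4, 9]
```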

## Iterators
Any object that supports the **iterator protocol**, i.e., the following two methods, is called an iterator.

`iterator.__iter__()`

Return the iterator object itself.


`iterator.__next__()`

Return the next item in line. If there are no further items, raise the `StopIteration` exception. Once this happens, all subsequent calls must also raise the exception.

```python
class FibonacciIterator:
    def __init__(self, capacity=10):
        self._num1 = 0
        self._num2 = 1
        self._capacity = capacity
        self._cursor = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._cursor < self._capacity:
            ans = self._num1 + self._num2
            self._num1, self._num2 = self._num2, ans
            self._cursor += 1
            return ans
        else:
            raise StopIteration("Capacity exceeded!")


for x in FibonacciIterator(5):
    print(x)
```

```
1
2
3
5
8
```


## Builtins

### iter
The builtin `iter(obj)` function calls the `__iter__()` method on the passed in object.

### next
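The builtin `next(obj)` function calls the `__next__()` method on the passed-in object. It does not handle the `StopIteration` exception unless a default value is passed as the second argument, in which case that default is returned instead.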
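A short sketch of both behaviours of `next`, using a two-element list purely as an example input:

```python
it = iter([1, 2])
print(next(it))  # 1
print(next(it))  # 2

# With a default, exhaustion returns the default instead of raising.
print(next(it, None))    # None
print(next(it, "done"))  # done

# Without a default, StopIteration propagates to the caller.
try:
    next(it)
except StopIteration:
    print("exhausted")
```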


These two functions are used when a `for` loop runs over an iterator:
  1. The iterator is instantiated by calling `obj = FibonacciIterator(5)`.
  2. The builtin `i = iter(obj)` is called, which returns the FibonacciIterator object itself. This seems like a no-op but will come in handy next when we use containers and generators instead of raw iterators.
  3. The builtin `x = next(i)` is called, getting the first value in `x`. After that `next(i)` is called repeatedly until a `StopIteration` exception occurs.

```python
fib_nums = FibonacciIterator(3)  # Step 1 in a for loop

iterator = iter(fib_nums)  # Step 2 in the for loop
print(type(iterator))  # Note that iter() returned the same FibonacciIterator object

x1 = next(iterator)  # Step 3 in the for loop
x2 = next(iterator)
x3 = next(iterator)
print(x1, x2, x3)

try:
    next(iterator)  # The iterator is exhausted, so this raises StopIteration
except StopIteration as si:
    print("StopIteration: ", si)
```

```
<class '__main__.FibonacciIterator'>
1 2 3
StopIteration:  Capacity exceeded!
```
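The three steps can also be written as an explicit loop. This is only a rough sketch of what a `for` loop does, and `manual_for` is a hypothetical helper name, not something Python provides.

```python
def manual_for(iterable, body):
    """Roughly what `for x in iterable: body(x)` does."""
    iterator = iter(iterable)  # step 2: ask the object for an iterator
    while True:
        try:
            x = next(iterator)  # step 3: fetch the next value
        except StopIteration:
            break  # the loop ends quietly when the iterator is exhausted
        body(x)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```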


## Containers
These are wrappers for iterators for when I want to get iterator semantics without exposing the underlying iterator. They only have to implement the `__iter__()` method which will return the underlying iterator.

The same 3 steps are performed when containers are used in the context of a for loop:
  1. The container object is instantiated `obj = FibonacciNumbers(5, 8, 3)`.
  2. The builtin `i = iter(obj)` is called to get the underlying iterator.
  3. The builtin `x = next(i)` is called, getting the first value in `x`. After that `next(i)` is called repeatedly until a `StopIteration` exception occurs.

```python
class FibonacciNumbers:
    def __init__(self, num1, num2, capacity):
        self._iterator = FibonacciIterator(capacity=capacity)
        self._iterator._num1 = num1
        self._iterator._num2 = num2

    def __iter__(self):
        return self._iterator


for x in FibonacciNumbers(5, 8, 3):
    print(x)
```

```
13
21
34
```


```python
fib_nums = FibonacciNumbers(5, 8, 3)

fib_iter = iter(fib_nums)
print(type(fib_iter))  # The underlying FibonacciIterator, not the container

x1 = next(fib_iter)
x2 = next(fib_iter)
x3 = next(fib_iter)
print(x1, x2, x3)

try:
    next(fib_iter)
except StopIteration as si:
    print("StopIteration: ", si)
```

```
<class '__main__.FibonacciIterator'>
13 21 34
StopIteration:  Capacity exceeded!
```
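One consequence worth noting: a container that stores a single iterator, as `FibonacciNumbers` does, can only be looped over once, because every `iter()` call hands back the same, already advanced iterator. The sketch below contrasts that with a container that builds a new iterator on each call. All three class names are hypothetical and exist only for this illustration.

```python
class CountUp:
    """Hypothetical iterator: yields 1..limit, then raises StopIteration."""

    def __init__(self, limit):
        self._limit = limit
        self._current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._current >= self._limit:
            raise StopIteration
        self._current += 1
        return self._current


class SharedIteratorContainer:
    """Keeps one iterator for its whole lifetime: single pass only."""

    def __init__(self, limit):
        self._iterator = CountUp(limit)

    def __iter__(self):
        return self._iterator


class FreshIteratorContainer:
    """Builds a new iterator per iter() call: can be looped over repeatedly."""

    def __init__(self, limit):
        self._limit = limit

    def __iter__(self):
        return CountUp(self._limit)


shared = SharedIteratorContainer(3)
print(list(shared), list(shared))  # [1, 2, 3] []

fresh = FreshIteratorContainer(3)
print(list(fresh), list(fresh))    # [1, 2, 3] [1, 2, 3]
```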


## Generators
Generators are iterators. However, unlike the previous examples, where iterators and containers were defined as classes that implement the iterator protocol, a generator can be defined as a function. Calling any function that `yield`s a value returns an object of type `generator`.

The same 3 steps are performed in the context of a for loop:
  1. The generator object is instantiated by calling the function `g = genfunc()`.
  2. The builtin `i = iter(g)` is called to get the underlying iterator, which is the generator itself.
  3. The builtin `x = next(i)` is called repeatedly until `StopIteration` is raised.

```python
def gen_fibs(cap):
    i, j = 0, 1
    curr = 0
    while curr < cap:
        x = i + j
        yield x
        i, j = j, x
        curr += 1


for x in gen_fibs(3):
    print(x)
```

```
1
2
3
```


```python
obj = gen_fibs(3)
print(type(obj))
i = iter(obj)
print(type(i))
x1 = next(i)
x2 = next(i)
x3 = next(i)
print(x1, x2, x3)

try:
    next(i)
except StopIteration as si:
    print("StopIteration: ", si)
```

```
<class 'generator'>
<class 'generator'>
1 2 3
StopIteration:  
```
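Because a generator object is its own iterator, it can be consumed only once. To loop again, the generator function has to be called again. A small sketch, using a hypothetical `count_to` generator function:

```python
def count_to(limit):
    """Hypothetical generator function yielding 1..limit."""
    n = 1
    while n <= limit:
        yield n
        n += 1


gen = count_to(3)
print(iter(gen) is gen)  # True: iter() returns the generator itself

print(next(gen))  # 1
print(list(gen))  # [2, 3]: continues from where next() left off
print(list(gen))  # []: the generator is now exhausted

print(list(count_to(3)))  # [1, 2, 3]: a new call gives a new generator
```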


### Generator Classes
These are really just containers. While generators can be defined as functions, it is also possible to implement a generator as a class. Strictly speaking this is a container with a single `__iter__()` method that is supposed to return the underlying iterator. However, any function that `yield`s a value implicitly returns a generator object, which is an iterator because internally it implements the iterator protocol. So having my container's `__iter__()` method `yield` values will do the trick.

```python
class FibonacciGenerator:
    def __init__(self, num1, num2, capacity):
        self._num1 = num1
        self._num2 = num2
        self._capacity = capacity

    def __iter__(self):
        # Keep the running state in locals so that every call to iter()
        # starts again from the original seed values.
        num1, num2 = self._num1, self._num2
        curr = 0
        while curr < self._capacity:
            x = num1 + num2
            yield x
            num1, num2 = num2, x
            curr += 1


for x in FibonacciGenerator(5, 8, 3):
    print(x)
```

```
13
21
34
```


```python
obj = FibonacciGenerator(5, 8, 3)
print(type(obj))
i = iter(obj)
print(type(i))
x1 = next(i)
x2 = next(i)
x3 = next(i)
print(x1, x2, x3)

try:
    next(i)
except StopIteration as si:
    print("StopIteration: ", si)
```

```
<class '__main__.FibonacciGenerator'>
<class 'generator'>
13 21 34
StopIteration:  
```
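Since `__iter__` is a generator function here, every `iter()` call creates a new generator object. As long as the loop state lives in local variables rather than on `self`, the same object can be iterated repeatedly, even in nested loops. A minimal sketch with a hypothetical `FibonacciSeries` class:

```python
class FibonacciSeries:
    """Hypothetical container whose __iter__ keeps all loop state local."""

    def __init__(self, num1, num2, capacity):
        self._num1 = num1
        self._num2 = num2
        self._capacity = capacity

    def __iter__(self):
        a, b = self._num1, self._num2
        for _ in range(self._capacity):
            a, b = b, a + b
            yield b


series = FibonacciSeries(5, 8, 3)
print(list(series))  # [13, 21, 34]
print(list(series))  # [13, 21, 34]: a second pass starts from the seeds again

# Nested loops get independent generators over the same object.
pairs = [(x, y) for x in series for y in series]
print(len(pairs))  # 9
```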
