# Iterators and Generators

# What is an Iterator?

An Iterator is any object that implements the Iterator protocol. Iterators support:

- Looping through the object in a `for` loop
- Using `in` to check whether the object contains some element

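Most built-in containers, such as `list`, `dict`, and `tuple`, support more! Strictly speaking, these are *iterables* rather than Iterators: they provide `__iter__()`, and calling `iter()` on them returns the actual Iterator. The extra functionality is not part of the Iterator protocol per se.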

In [1]:
t = 'a', 'b', 'c'
for e in t:
    print(e)

a
b
c
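
A quick way to see which side of the protocol an object is on is to test it against the abstract base classes in `collections.abc`. The following is a minimal sketch; the variable names are only illustrative.

```python
from collections.abc import Iterable, Iterator

t = 'a', 'b', 'c'

# A tuple can be looped over, so it counts as an Iterable ...
is_iterable = isinstance(t, Iterable)
# ... but it has no __next__() method, so it is not itself an Iterator.
is_iterator = isinstance(t, Iterator)

# The object returned by iter() is the actual Iterator.
it = iter(t)
iter_is_iterator = isinstance(it, Iterator)

# Membership tests work directly on the tuple.
has_b = 'b' in t
has_z = 'z' in t

print(is_iterable, is_iterator, iter_is_iterator, has_b, has_z)
```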


#### `iter()`, `next()`, and `StopIteration`

Let's consider the protocol more formally:

- First, `iter()` provides an actual iterator object
- Then, `next()` retrieves elements from this iterator object
- Until the iterator is exhausted, in which case a `StopIteration` is raised

In [2]:
i = iter(t)
print(next(i))
print(next(i))
print(next(i))
print(next(i))

a
b
c


StopIteration: 

In [None]:
i = iter(t)
while True:
    try:
        e = next(i)
    except StopIteration:
        break
    print(e)


# Creating your own Iterator

An `Iterator` is a class that implements (at least) the following methods:

- `__iter__()`
- `__next__()`

Let's implement an `Iterator` that allows you to walk through its elements in random order.

In [3]:
import random


class RandomIterator:
    
    def __init__(self, *elements):
        
        self._elements = list(elements)
        
    def __iter__(self):
        
        random.shuffle(self._elements)
        self._cursor = 0
        return self
    
    def __next__(self):
        
        if self._cursor >= len(self._elements):
            raise StopIteration()
        e = self._elements[self._cursor]
        self._cursor += 1
        return e


Now let's see our `RandomIterator` in action!

In [4]:
i = RandomIterator(1, 2, 3)
for e in i:
    print(e)
print('--')
for e in i:
    print(e)


2
3
1
--
3
1
2
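
One design consequence: `RandomIterator` keeps its cursor on the object itself, so two loops over the same instance (for example, nested loops) share that cursor and interfere with each other. A possible alternative, sketched below under the assumption that independent loops are wanted, is to let `__iter__()` return a fresh iterator every time. The class name `RandomOrder` is hypothetical.

```python
import random


class RandomOrder:
    """Hypothetical iterable: each iter() call returns an independent iterator."""

    def __init__(self, *elements):
        self._elements = list(elements)

    def __iter__(self):
        # random.sample() returns a new shuffled list and leaves the original untouched.
        shuffled = random.sample(self._elements, k=len(self._elements))
        return iter(shuffled)


r = RandomOrder(1, 2, 3)
# Nested loops over the same object: each loop gets its own cursor.
pairs = [(a, b) for a in r for b in r]
print(len(pairs))  # 9, because the inner loop always runs to completion
```

Here `RandomOrder` is only an iterable; the iterator is the list iterator returned by `iter(shuffled)`.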


# Exploring Generators

A Generator is a function with one or more `yield` statements. Calling the function does not run its body; it returns a Generator object. Each `yield` hands a value to the caller and suspends the Generator. When `next()` is called on a suspended Generator, it picks up where it left off.

In [5]:
def my_generator():
    
    yield 'a'
    yield 'b'
    yield 'c'

Because a Generator is an Iterator, it supports `for` and `in`.

In [6]:
print('d' in my_generator())

False


In [7]:
print('c' in my_generator())

True
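
Note that the cells above create a new Generator for every test. On a single Generator object, `in` consumes elements while it searches. A small sketch that reuses `my_generator` from above:

```python
def my_generator():
    yield 'a'
    yield 'b'
    yield 'c'


g = my_generator()
found_b = 'b' in g       # True; the search consumed 'a' and 'b'
remaining = list(g)      # only 'c' is left
found_again = 'b' in g   # False; the Generator is now exhausted
print(found_b, remaining, found_again)  # True ['c'] False
```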


You can also use `next()` to iterate through a Generator. After all, it is an Iterator!

In [8]:
g = my_generator()  # calling the function creates the Generator; no iter() call is needed
print(next(g))
print(next(g))
print(next(g))
print(next(g))

a
b
c


StopIteration: 

#### Sending information to a Generator

You can use `send()` to send information to a Generator. Inside the Generator, this will become the return value of `yield`. Because the Generator gives output before it receives input, the first `send()` must send `None`! The flow of information between a Generator and the caller of the Generator can be tricky to follow.

Let's define a Generator that gives random replies until it receives 'Bye!'.

In [9]:
import random

SENTENCES = [
    'How are you?',
    'Fine, thank you!',
    'Nothing much',
    'Just chillin'
]


def random_conversation():
    
    recv = yield 'Hi'
    while recv != 'Bye!':
        recv = yield random.choice(SENTENCES)


So how can we use this Generator?

In [10]:
g = random_conversation() # init (like calling iter())

print(g.send(None))
while True:
    try:
        x = input()
        print(x)
        reply = g.send(x)
    except StopIteration:
        break
    print('>>> ' + reply)
print('Conversation over!')


Hi
1
>>> Just chillin
Bye!
Conversation over!
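
Because the conversation above depends on `input()` and random choices, the order of events can be hard to trace. The following deterministic sketch shows the same `send()` mechanics with a hypothetical `running_average` Generator, which yields the mean of all values sent so far.

```python
def running_average():
    """Hypothetical Generator: yields the mean of all values sent so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Output comes first; the Generator then waits here for the next send().
        value = yield average
        total += value
        count += 1
        average = total / count


g = running_average()
first = next(g)   # equivalent to g.send(None); runs up to the first yield
a1 = g.send(10)   # 10.0
a2 = g.send(20)   # 15.0
a3 = g.send(30)   # 20.0
g.close()         # stop the otherwise endless Generator
print(first, a1, a2, a3)
```

The first value is `None` because nothing has been sent yet when the Generator reaches its first `yield`.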


# Lazy Evaluation

#### Eager evaluation

Let's consider a function that generates a Fibonacci series of length `n`. This is an *eager* implementation, because the full series is created and held in memory at once.

In [11]:
def eager_fibonacci(n):
    
    l = [1, 1]
    for i in range(n-2):
        l.append(sum(l[-2:]))
    return l[:n]  # truncate so that n = 0 and n = 1 also give the right length

print(eager_fibonacci(10))

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]


#### Lazy evaluation

Now let's consider a Generator function that also generates a Fibonacci series, but does so one number at a time. This is a *lazy* implementation, because only two numbers are held in memory at once (we need two numbers in order to determine the next number in the series). We also no longer need to specify the length of the series in advance. We just keep going!

In [12]:
def lazy_fibonacci():
    
    yield 1
    yield 1
    l = [1, 1]
    while True:
        l = [l[-1], sum(l[-2:])]
        yield l[-1]

        
for i, f in enumerate(lazy_fibonacci()):
    if i == 10:
        break
    print(f)


1
1
2
3
5
8
13
21
34
55
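
Instead of counting with `enumerate()` and breaking out of the loop, the standard library's `itertools.islice()` can take a fixed number of elements from an endless Generator. A minimal sketch follows; `lazy_fibonacci` is rewritten here with tuple unpacking, which is an alternative formulation and not the version above.

```python
from itertools import islice


def lazy_fibonacci():
    # Only the two most recent numbers are kept.
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


first_ten = list(islice(lazy_fibonacci(), 10))
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```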


# Coroutines

#### Two Generators in parallel

The simplest way to run two Generators in parallel is to `zip()` them together in a `for` loop.

In [13]:
def fibonacci():
        
    yield 1
    yield 1
    l = [1, 1]
    while True:
        l = [l[-1], sum(l[-2:])]
        yield l[-1]
        

def tribonacci():

    yield 0
    yield 1
    yield 1
    l = [0, 1, 1]
    while True:
        l = [l[-2], l[-1], sum(l[-3:])]
        yield l[-1]


for i, (f, t) in enumerate(zip(fibonacci(), tribonacci())):
    if i == 10:
        break
    print(f, t)


1 0
1 1
2 1
3 2
5 4
8 7
13 13
21 24
34 44
55 81
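
`zip()` stops as soon as its shortest input is exhausted, so a finite `range()` placed first can bound two endless Generators without an explicit `break`. The sketch below assumes compact tuple-unpacking versions of the two Generators, which produce the same series as the ones above.

```python
def fibonacci():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


def tribonacci():
    a, b, c = 0, 1, 1
    while True:
        yield a
        a, b, c = b, c, a + b + c


# range(5) runs out first, so zip() stops after five rows.
rows = list(zip(range(5), fibonacci(), tribonacci()))
for i, f, t in rows:
    print(i, f, t)
```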


#### Letting two Generators communicate

Let's consider a slightly more complicated example in which two Generators need to communicate with each other. The `speaker` yields random sentences. These are sent to the `replier`, which ends the conversation when the `speaker` has said 'Bye!' and otherwise replies with a random sentence.

In [20]:

import random


SENTENCES = [
    'How are you?',
    'Fine, thank you!',
    'Nothing much',
    'Just chillin',
    'Bye!'
    ]


def speaker():
    
    while True:
        yield random.choice(SENTENCES)

        
def replier():
    
    while True:
        recv = yield
        print('Received: %s' % recv)
        if recv == 'Bye!':
            break
        print('Replied: %s' % random.choice(SENTENCES))
        
s = speaker()
r = replier()
r.send(None)
while True:
    recv = s.send(None)
    try:
        r.send(recv)
    except StopIteration:
        break


Received: Nothing much
Replied: Nothing much
Received: How are you?
Replied: Fine, thank you!
Received: Just chillin
Replied: Nothing much
Received: Bye!
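
The output above changes on every run. To follow the hand-off between the two Generators step by step, here is a deterministic sketch of the same pattern. The fixed script and the `log` list are assumptions made for illustration; they replace `random.choice()` and `print()`.

```python
def speaker(script):
    # Yields each sentence of a fixed script, one at a time.
    for sentence in script:
        yield sentence


def replier(log):
    # Records everything it receives and finishes after 'Bye!'.
    while True:
        recv = yield
        log.append(recv)
        if recv == 'Bye!':
            return


log = []
s = speaker(['How are you?', 'Nothing much', 'Bye!', 'never sent'])
r = replier(log)
next(r)  # prime the replier so that it is waiting at its first yield
for sentence in s:
    try:
        r.send(sentence)
    except StopIteration:
        # The replier has returned, so the conversation is over.
        break
print(log)  # ['How are you?', 'Nothing much', 'Bye!']
```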


# Collections - Convenience Iterators

#### `collections.namedtuple`

`namedtuple()` is a factory function; that is, it generates a class, and not an instance of a class (an object).

So there are two steps:

1. First, use `namedtuple()` to generate a `class`
2. And then create an instance of this `class`

In [21]:
from collections import namedtuple

Person = namedtuple('Person', ['name', 'age'])
jay_z = Person(name='Sean Carter', age=47)
print('%s is %s years old' % (jay_z.name, jay_z.age))

Sean Carter is 47 years old


In [22]:
name, age = jay_z
print(name, age)

Sean Carter 47
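
Instances of a class made by `namedtuple()` are still tuples, so their fields cannot be reassigned. The sketch below shows this, together with the `_replace()` and `_asdict()` helper methods that generated classes provide.

```python
from collections import namedtuple

Person = namedtuple('Person', ['name', 'age'])
jay_z = Person(name='Sean Carter', age=47)

# Fields are read-only, as with any tuple.
try:
    jay_z.age = 48
    mutated = True
except AttributeError:
    mutated = False

older = jay_z._replace(age=48)  # returns a new Person; jay_z is unchanged
as_dict = jay_z._asdict()       # mapping from field names to values
print(mutated, older.age, jay_z.age)
```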


#### `collections.OrderedDict`

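If your code requires that the order of elements in a `dict` is preserved, use `OrderedDict`, even if your version of Python (e.g. CPython 3.6) already does this! Since Python 3.7, regular `dict` objects are guaranteed to preserve insertion order, so `OrderedDict` mainly serves to make the requirement explicit.

Otherwise, `OrderedDict` behaves almost exactly like a regular `dict`.


Iterating over an `OrderedDict` yields its elements in insertion order.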

In [23]:
from collections import OrderedDict

d = OrderedDict([
    ('Lizard', 'Reptile'),
    ('Whale', 'Mammal')
])


for species, class_ in d.items():
    print('%s is a %s' % (species, class_))


Lizard is a Reptile
Whale is a Mammal
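
Two places where `OrderedDict` differs from a regular `dict` are equality comparison, which takes order into account, and the `move_to_end()` method. A minimal sketch:

```python
from collections import OrderedDict

a = OrderedDict([('Lizard', 'Reptile'), ('Whale', 'Mammal')])
b = OrderedDict([('Whale', 'Mammal'), ('Lizard', 'Reptile')])

# Two OrderedDicts are equal only if their order matches too.
equal_as_ordered = a == b
# Regular dicts ignore order when comparing.
equal_as_plain = dict(a) == dict(b)

a.move_to_end('Lizard')      # reorder in place
order_after_move = list(a)   # keys in their new order
print(equal_as_ordered, equal_as_plain, order_after_move)
```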


#### `collections.defaultdict`

Use `defaultdict` if you want non-existing keys to return a default value, rather than raise a `KeyError`.

Otherwise, `defaultdict` behaves exactly like a regular `dict`.

In [24]:
from collections import defaultdict

favorite_programming_languages = {
    'Claire': 'Assembler',
    'John': 'Ruby',
    'Sarah': 'JavaScript'
}

d = defaultdict(lambda: 'Python')
d.update(favorite_programming_languages)
print(d['John'])

Ruby


In [26]:
print(d['Xxx'])

Python
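
Two details are easy to miss: looking up a missing key with `[]` also stores the default in the `defaultdict`, whereas `get()` does not call the default factory. The sketch below shows both, followed by a common use of `defaultdict(list)` for grouping. The word list is made up for illustration.

```python
from collections import defaultdict

d = defaultdict(lambda: 'Python')
before = 'Xxx' in d     # False: the key does not exist yet
value = d['Xxx']        # 'Python'; the lookup also stores the key
after = 'Xxx' in d      # True
via_get = d.get('Yyy')  # None: get() bypasses the default factory

# Grouping words by first letter without checking whether the key exists.
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)

print(before, value, after, via_get, dict(groups))
```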
