# Iterables, iterators, and generators

We've seen many examples of `for`-loops. Anything that one can iterate over is an _iterable_. Examples of iterables are ranges, lists, strings, sets, dicts, and dict items.

In [None]:
# Iterating over a dict is the same as iterating over the dict's keys
d = {1: 'a', 2: 'b'}
items = [i for i in d]
print(*items, sep=", ")

In [None]:
# Iterating over a dict's items returns key-value pairs
items = [i for i in d.items()]
print(*items, sep=", ")

In [None]:
# Iterate multiple times over the same range instance
r = range(3)
items_1 = [i for i in r]
items_2 = [i for i in r]
print(*items_1, sep=", ")
print(*items_2, sep=", ")

All iterables have in common that one can call `iter()` on them, which returns an _iterator_. An iterator is something that one can call `next()` on: each call returns the next item, and `StopIteration` is raised when no items are left. An iterator is also an iterable, and calling `iter()` on an iterator simply returns the iterator itself. However, not all iterables are iterators.

In [None]:
list_iterator = iter([1, 2, 3])
print(next(list_iterator))
print(next(list_iterator))
print(next(list_iterator))
# print(next(list_iterator))  # Uncommenting this line raises StopIteration: the iterator is exhausted

In [None]:
# Calling iter() on an iterator returns the iterator itself
list_iterator = iter([1, 2, 3])
list_iterator is iter(list_iterator)

In [None]:
# Iterators are iterables
list_iterator = iter([1, 2, 3])
items = [i for i in list_iterator]
print(*items, sep=", ")
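The difference between an iterable and an iterator can be made concrete with a small sketch. The variable names below are illustrative only; the point is that a list can be traversed many times through fresh iterators, while a single iterator is used up after one pass.

```python
numbers = [1, 2, 3]  # a list is an iterable, but not an iterator

# next() only works on iterators, so calling it on a list fails
try:
    next(numbers)
    list_is_iterator = True
except TypeError:
    list_is_iterator = False
print(list_is_iterator)  # False

# Each call to iter() on a list returns a fresh, independent iterator
first_pass = list(iter(numbers))
second_pass = list(iter(numbers))
print(first_pass, second_pass)  # [1, 2, 3] [1, 2, 3]

# An iterator, in contrast, is used up after a single pass
it = iter(numbers)
consumed = list(it)
leftover = list(it)
print(consumed, leftover)  # [1, 2, 3] []
```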

Calling `iter()` on an iterable and then `next()` on the resulting iterator is rarely done by hand, because a regular `for`-loop takes care of both. It is still good to know about these functions, because you may want to implement your own classes (to be discussed later) whose instances are iterable. To do so, you define the `__iter__()` method so that it returns an iterator. A common approach is to let `__iter__()` return the object itself and to implement the `__next__()` method, which returns the next item or raises `StopIteration` when no items are left. These methods are invoked when `iter()` and `next()` are called on the instances, and should not normally be called directly.
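As a minimal sketch of that pattern, here is a hypothetical `Countdown` class (the name and behavior are made up for illustration) whose `__iter__()` returns the object itself. The second half shows roughly what a `for`-loop does behind the scenes.

```python
class Countdown:
    """Hypothetical example class: counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The object is its own iterator
        return self

    def __next__(self):
        if self.current <= 0:
            # Signals that there are no more items
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# A for-loop calls iter() once and then next() until StopIteration is raised
for_loop_items = [n for n in Countdown(3)]

# Roughly what the for-loop does behind the scenes
manual_items = []
iterator = iter(Countdown(3))
while True:
    try:
        manual_items.append(next(iterator))
    except StopIteration:
        break

print(for_loop_items, manual_items)  # [3, 2, 1] [3, 2, 1]
```

Because `__iter__()` returns the object itself, a `Countdown` instance can be traversed only once, just like the list iterators shown earlier.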

### Lazy evaluation

A list is an example of an iterable that is _eagerly evaluated_, meaning that all the items of the list are evaluated when the list is created or when they are appended to the list. Not all iterables are eagerly evaluated. For some iterables, an item is computed only when `next()` is called on the iterable's iterator. Such iterables are said to be _lazy_.

Lazy evaluation has the advantage that items are only computed *if* they are needed (saving computation time) and *when* they are needed (saving need for intermediate storage).
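The difference can be observed by recording when a function is actually called. The sketch below uses the built-in `map()`, which returns a lazy iterator in Python 3; `slow_square` and `calls` are hypothetical names introduced only for this illustration.

```python
calls = []  # records which inputs were actually processed

def slow_square(x):
    """Hypothetical stand-in for an expensive computation."""
    calls.append(x)
    return x * x

# Eager: the list comprehension computes all five squares immediately
calls.clear()
eager = [slow_square(x) for x in range(5)]
eager_calls = list(calls)

# Lazy: map() returns an iterator; nothing is computed yet
calls.clear()
lazy = map(slow_square, range(5))
calls_before_next = list(calls)

first = next(lazy)  # computes only the first square
calls_after_next = list(calls)

print(eager_calls)        # [0, 1, 2, 3, 4]
print(calls_before_next)  # []
print(calls_after_next)   # [0]
```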

In [None]:
from itertools import count

# count() returns an iterator of all ints from 0 to infinity 

for n in count():
    if n > 10000:
        break
n

In [None]:
# Ranges are lazily evaluated in Python 3 (in Python 2, range() returned a list)
r = range(10000000000000000000000000000000000000000000)

In [None]:
# Zips are lazily evaluated in Python 3 (in Python 2, zip() returned a list)
zip(r, r)

__Note:__ Many objects that were eagerly evaluated in Python 2 are lazily evaluated in Python 3.
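To see that such a huge range and zip are still usable, one can pull out just a few items. This is a small sketch; `itertools.islice` is used here to take the first items of a lazy iterable without computing the rest.

```python
from itertools import islice

r = range(10**42)  # created instantly; no items are stored

# Individual items and membership tests do not require materializing the range
print(r[5])         # 5
print(10**41 in r)  # True

# islice() takes only the first few items of the lazy zip
first_pairs = list(islice(zip(r, r), 3))
print(first_pairs)  # [(0, 0), (1, 1), (2, 2)]
```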

## Generators

Generators are a special kind of iterator that can easily be defined (i.e., there is no need to define a `class`) using either a generator expression or a generator function. We've seen them already, but because they fit so nicely into Python's comprehension framework, you may not have noticed them.

### Generator expressions

Generator expressions are comprehensions that evaluate to a generator. They are similar to list comprehensions, except that parentheses (`()`) are used instead of square brackets (`[]`).

In [None]:
(i for i in range(1, 21) if i % 3 == 0 or i % 5 == 0)

In [None]:
g = _  # In IPython/Jupyter, _ holds the most recent cell output
next(g)

In [None]:
next(g)

In [None]:
list(g)

In [None]:
g = (i for i in range(1, 11))
sum(g)
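Because a generator is an iterator, it can be consumed only once. The following sketch illustrates this by summing the same generator twice and comparing it with a list.

```python
g = (i for i in range(1, 11))
first_sum = sum(g)   # consumes all items
second_sum = sum(g)  # the generator is now empty
print(first_sum, second_sum)  # 55 0

# A list, in contrast, can be traversed repeatedly
numbers = [i for i in range(1, 11)]
print(sum(numbers), sum(numbers))  # 55 55
```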

A special rule applies when a generator expression is passed as the only argument to a function: in that case, the parentheses enclosing the generator expression may be dropped. That is why you'll see

```Python
sum(i for i in range(1, 11))
```
instead of 
```Python
sum((i for i in range(1, 11)))
```

However, if there are more arguments, the parentheses must be present.

In [None]:
next((i for i in range(1, 11) if i % 3 == 0 and i % 5 == 0), 42)

### Generator functions

Generator functions are functions that produce generators. The keyword `yield` is used for the items that the generator produces.

Example:

In [None]:
def gen_ints(start=0, end=None):
    """Generates a sequence of ints from start (inclusive) to end (exclusive).
    If end is None, the sequence is infinite."""
    i = start
    while end is None or i < end:  # "not end" would wrongly treat end=0 as infinite
        yield i
        i += 1

If you are able to define a function that prints a sequence of items to the console, then you are able to define a generator function simply by replacing the `print()` calls with `yield` statements.

Note that none of the statements in the generator function body are executed at the time the generator function is called. Those statements are only executed when `next()` is called on the generator, and each call runs the body only up to the next `yield`.

In [None]:
one_to_ten = gen_ints(1, 11) # No item computed yet
print(next(one_to_ten), end=" ") # First item computed
for n in one_to_ten:
    print(n, end=" ") # remaining items computed
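The timing can be made visible by logging from inside the generator body. In this sketch, `traced_gen` and `log` are hypothetical names used only to show when each statement runs.

```python
log = []  # records when statements in the generator body run

def traced_gen():
    """Hypothetical generator function that logs its progress."""
    log.append("started")
    yield 1
    log.append("resumed")
    yield 2

g = traced_gen()
log_after_call = list(log)    # the body has not started yet
first = next(g)
log_after_first = list(log)   # ran up to the first yield
second = next(g)
log_after_second = list(log)  # resumed and ran up to the second yield

print(log_after_call)    # []
print(log_after_first)   # ['started']
print(log_after_second)  # ['started', 'resumed']
```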

Generator functions are useful for decoupling how items are generated from what is done with the items. Let's assume you have a function that prints a bunch of stuff to the console. This function is doing mainly two things: 1) determining what to print, and 2) printing. When these two things are combined into one function, the function is really hard to test. How do you determine that your function is printing the right stuff using an automated test? By mocking the console? However, when you extract the first part into a generator that generates the items to print, you can easily test that generator. 
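A minimal sketch of this decoupling, assuming a made-up reporting task: `report_lines` decides what to print, `print_report` only prints. Both names and the pass/fail rule are hypothetical.

```python
def report_lines(scores):
    """Hypothetical generator: decides what to print."""
    for name, score in scores.items():
        status = "pass" if score >= 50 else "fail"
        yield f"{name}: {status}"

def print_report(scores):
    """Thin wrapper that only does the printing."""
    for line in report_lines(scores):
        print(line)

# The generator can be tested without touching the console
assert list(report_lines({"Ann": 72, "Bob": 35})) == ["Ann: pass", "Bob: fail"]

print_report({"Ann": 72, "Bob": 35})
```

The printing wrapper is now so thin that there is little left in it that could go wrong, while the logic lives in a function whose output can be compared against a list.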

### Exercises

1. I have a list of str's `privileged_users = ["Erik", "Mike", "Rob"]` and a function

```Python
def get_privileged_users():
    return privileged_users
```

I'm worried that a caller might append other names to the list, yet I want to give general read access to it. Discuss how I can modify `get_privileged_users` in such a way that a caller may "see" the contents without having access to the list itself.

In [None]:
privileged_users = ["Erik", "Mike", "Rob"]

def get_privileged_users():
    pass

2. Print the first 5 lines of a file.

In [None]:
with open('assets/honeyproduction.csv', mode='r') as f:
    # f is a line (i.e., str) iterator
    # print the first 5 lines of f
    pass
    

3. Write a generator _expression_ for all even numbers from 0 to infinity.
Hint: use `itertools.count`.

In [None]:
def count_evens():
    return None # Replace None with a generator expression

4. Write a function `fizz_buzz(n)` that takes a positive int `n` as argument and returns "Fizz" if `n` is divisible by 3, "Buzz" if `n` is divisible by 5, and "FizzBuzz" if `n` is divisible by both 3 and 5. If `n` is divisible by neither 3 nor 5, the function shall return `n`.

In [None]:
def fizz_buzz(n):
    pass

5. Write a generator expression for the infinite FizzBuzz sequence: 1, 2, Fizz, 4, Buzz, Fizz, 7, ... 

In [None]:
def fizz_buzz_seq():
    return None # Replace None with a generator expression

6. Create a list of the first 100 items in the FizzBuzz sequence

In [None]:
fizz_buzz_100_list = None # Replace None with a list comprehension using fizz_buzz_seq()

Another solution:

In [None]:
from itertools import cycle, count

fizzes = cycle(('', '', 'Fizz'))
buzzes = cycle(('', '', '', '', 'Buzz'))
fizz_buzzes = (fizz + buzz or n 
               for fizz, buzz, n in zip(fizzes, buzzes, count(1)))
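Since `fizz_buzzes` is an infinite generator, it cannot be turned into a list directly. One way to inspect it, sketched here with `itertools.islice`, is to take only the first few items. The definitions are repeated so that the block runs on its own.

```python
from itertools import count, cycle, islice

fizzes = cycle(('', '', 'Fizz'))
buzzes = cycle(('', '', '', '', 'Buzz'))
# An empty string is falsy, so "fizz + buzz or n" falls back to the number n
fizz_buzzes = (fizz + buzz or n
               for fizz, buzz, n in zip(fizzes, buzzes, count(1)))

first_15 = list(islice(fizz_buzzes, 15))
print(first_15)
# [1, 2, 'Fizz', 4, 'Buzz', 'Fizz', 7, 8, 'Fizz', 'Buzz', 11, 'Fizz', 13, 14, 'FizzBuzz']
```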