## Generators

Python generators are a simple way to create iterators. All the work described in the previous notes is handled automatically by generators. Simply put, a generator is written as a function that returns an object (an iterator) that we can iterate over, one value at a time.

A generator function produces a sequence of results instead of a single value. Rather than using `return` to return a value once, it uses `yield` to produce a (potentially infinite) sequence of values:

In [38]:
def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1

Each time the `yield` statement is executed, the function produces a new value and pauses until the next value is requested.

In [91]:
y = yrange(3)
print(y)

<generator object yrange at 0x7ff5d3314d60>


In [92]:
next(y)

0

Because it responds to `next`, a generator is also an iterator.
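As a minimal sketch of what "is also an iterator" means in practice, the block below repeats `yrange` so that it runs on its own, then checks two things: that `iter()` returns the generator itself, and that `next` raises `StopIteration` once the values run out. The variable names `values`, `same_object`, and `exhausted` are only illustrative.

```python
def yrange(n):
    """Same generator function as above, repeated so this block is self-contained."""
    i = 0
    while i < n:
        yield i
        i += 1

y = yrange(3)
values = [next(y), next(y), next(y)]  # consumes 0, 1 and 2

# A generator is its own iterator: iter() hands back the same object.
same_object = iter(y) is y

# The generator is now exhausted, so one more next() raises StopIteration.
try:
    next(y)
    exhausted = False
except StopIteration:
    exhausted = True

print(values, same_object, exhausted)  # [0, 1, 2] True True
```

A `for` loop relies on exactly this protocol: it keeps calling `next` and stops quietly when `StopIteration` is raised.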

The word "generator" is confusingly used to mean both the function that generates and what it generates. We'll refer to the generated object as a "generator" and the function that generates it as a "generator function".

When a generator function is called, it returns a generator object without executing any of the function body. When `next` is called on that object for the first time, the function runs until it reaches the first `yield` statement. The yielded value is returned by that `next` call, and the function stays paused there until `next` is called again.
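The order of events described above can be made visible with a small trace. This is only a sketch: `traced` and `log` are hypothetical names introduced for the illustration.

```python
log = []  # records the order in which things happen

def traced():
    """Hypothetical generator function that logs each step of its execution."""
    log.append("started")
    yield 1
    log.append("resumed")
    yield 2

g = traced()
log.append("created")    # the function body has not run yet
first = next(g)          # runs the body up to the first yield
log.append("got first")
second = next(g)         # resumes right after the first yield

print(log)  # ['created', 'started', 'got first', 'resumed']
```

Note that "created" is logged before "started": calling `traced()` only builds the generator object, and the body starts running on the first `next` call.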

The following example demonstrates the interplay between the `yield` statement and `next` calls on a generator object:

In [121]:
def integers():
    """Infinite sequence of integers."""
    i = 1
    while True:
        yield i
        i = i + 1  # equivalent to: i += 1

def squares():
    for i in integers():
        yield i * i  # equivalent to: i ** 2

def take(n, seq):
    """Return up to the first n values from the given iterator."""
    result = []
    try:
        for _ in range(n):
            result.append(next(seq))
    except StopIteration:
        pass  # the iterator ran out early; return what was collected so far
    return result

print(take(5, squares()))

[1, 4, 9, 16, 25]


In [119]:
x = squares()

In [120]:
next(x)

1

In [34]:
help(integers)

Help on function integers in module __main__:

integers()
    Infinite sequence of integers.



### Generator Expressions

Generator expressions are the generator version of list comprehensions. They look like list comprehensions, but are written with parentheses instead of square brackets and produce a generator instead of a list:

In [128]:
a = (x * x for x in range(100))
print(a)

<generator object <genexpr> at 0x7ff5d331e190>


In [125]:
sum(a)

328350

Generator expressions can be passed as arguments to any function that consumes an iterable, as with `sum` above.
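A few more built-ins that accept a generator expression directly are sketched below. The `words` list is made-up sample data. When the generator expression is the only argument, the extra pair of parentheses can be dropped.

```python
words = ["generator", "iterator", "yield", "next"]  # illustrative data

# The generator expression is the sole argument, so no extra parentheses are needed.
longest = max(len(w) for w in words)
total_chars = sum(len(w) for w in words)

# any() stops consuming the generator as soon as it sees a true value.
has_short = any(len(w) < 5 for w in words)

# str.join accepts any iterable of strings, including a generator expression.
joined = "-".join(w.upper() for w in words)

print(longest, total_chars, has_short, joined)
# 9 26 True GENERATOR-ITERATOR-YIELD-NEXT
```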

Notice that printing the generator expression does not print the contents; one way to print the contents of a generator expression is to pass it to the `list` constructor:

In [122]:
G = (n ** 2 for n in range(12))
list(G)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

### Lists Are Collections of Values; Generators Are Recipes for Producing Values

When you create a list, you are building a collection of values, and there is a memory cost associated with that. When you create a generator, you are not building a collection of values, but a recipe for producing those values. Both can be iterated over in the same way, as we can see here:

In [9]:
L = [n ** 2 for n in range(12)]
for val in L:
    print(val, end=' ')

0 1 4 9 16 25 36 49 64 81 100 121 

In [10]:
G = (n ** 2 for n in range(12))
for val in G:
    print(val, end=' ')

0 1 4 9 16 25 36 49 64 81 100 121 

The difference is that a generator expression does not compute its values until they are needed. This saves memory, and it can save computation too when not all of the values end up being used. It also means that while the size of a list is limited by available memory, a generator expression can describe a sequence of any length, even an infinite one.
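One rough way to see the memory difference is `sys.getsizeof`. This is only a sketch under some caveats: `getsizeof` reports the size of the container object itself (not the integers it refers to), and the exact byte counts depend on the Python version and platform, so only the comparison is shown.

```python
import sys

n = 100_000
squares_list = [i ** 2 for i in range(n)]  # all values are computed and stored now
squares_gen = (i ** 2 for i in range(n))   # nothing has been computed yet

list_size = sys.getsizeof(squares_list)  # grows with n
gen_size = sys.getsizeof(squares_gen)    # small and independent of n

print(list_size > gen_size)  # True

# Both still produce the same values when consumed.
print(sum(squares_gen) == sum(squares_list))  # True
```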

An example of an infinite generator expression can be created using the `count` iterator defined in `itertools`. The `count` iterator will go on happily counting forever until you tell it to stop; this makes it convenient to create generators that will also go on forever:

In [132]:
from itertools import count
factors = [2, 3, 5, 7]
G = (i for i in count() if all(i % n > 0 for n in factors))
for val in G:
    print(val, end=' ')
    if val > 40:
        break

1 11 13 17 19 23 29 31 37 41 

If we were to expand the list of factors appropriately, we would have the beginnings of a prime number generator.
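The explicit `break` in the loop above is one way to stop an infinite generator. Another option, sketched here, is `itertools.islice`, which takes a fixed number of items from any iterator. The name `first_ten` is just illustrative.

```python
from itertools import count, islice

factors = [2, 3, 5, 7]
G = (i for i in count() if all(i % n > 0 for n in factors))

# islice stops after 10 items, so the infinite generator is never run to exhaustion.
first_ten = list(islice(G, 10))
print(first_ten)  # [1, 11, 13, 17, 19, 23, 29, 31, 37, 41]
```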

### Lists Can Be Iterated Multiple Times; Generator Expressions Are Single-Use

This is one of those potential gotchas of generator expressions. With a list, we can straightforwardly do this:

In [14]:
L = [n ** 2 for n in range(12)]
for val in L:
    print(val, end=' ')
    
print()
    
for val in L:
    print(val, end=' ')

0 1 4 9 16 25 36 49 64 81 100 121 
0 1 4 9 16 25 36 49 64 81 100 121 

A generator expression, on the other hand, is used up after one iteration:

In [17]:
G = (n ** 2 for n in range(12))
list(G)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

In [18]:
list(G)

[]

This can be very useful, because a generator remembers where it left off, so iteration can be paused and resumed:

In [19]:
G = (n ** 2 for n in range(12))
for n in G:
    print(n, end=' ')
    if n > 30:
        break
        
print("\ndoing something in between")

for n in G:
    print(n, end=' ')

0 1 4 9 16 25 36 
doing something in between
49 64 81 100 121 
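When the values are needed more than once, one common workaround is to wrap the generator expression in a function and call it again for each pass. The sketch below uses a hypothetical helper, `squares_upto`, to show both the resumable state and the fresh-generator approach.

```python
def squares_upto(n):
    """Hypothetical helper: returns a fresh generator on every call."""
    return (k ** 2 for k in range(n))

G = squares_upto(6)
head = [next(G), next(G)]  # takes the first two values
tail = list(G)             # continues from where head stopped
again = list(G)            # G is exhausted, so nothing is left

fresh = list(squares_upto(6))  # a new generator starts from the beginning

print(head, tail, again)  # [0, 1] [4, 9, 16, 25] []
print(fresh)              # [0, 1, 4, 9, 16, 25]
```

If the sequence is small enough to fit in memory, storing it once with `list(...)` and reusing the list is the simpler choice.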

### Example: Prime Number Generator

In [21]:
def gen_primes(N):
    """Generate all primes less than N"""
    primes = set()
    for n in range(2, N):
        if all(n % p > 0 for p in primes):
            primes.add(n)
            yield n
            
print(list(gen_primes(100)))

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
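As a possible variation (not the function above), the same idea can be written as an unbounded generator by combining it with `itertools.count`. This sketch makes two changes that are assumptions of the illustration: it keeps the primes in a list so they stay in increasing order, and it only tests divisors `p` with `p * p <= n`. The name `gen_all_primes` is hypothetical.

```python
from itertools import count, islice, takewhile

def gen_all_primes():
    """Yield primes indefinitely (illustrative variant of gen_primes)."""
    primes = []  # primes found so far, in increasing order
    for n in count(2):
        # A composite n must have a prime factor p with p * p <= n,
        # so larger primes do not need to be tested.
        candidates = takewhile(lambda p: p * p <= n, primes)
        if all(n % p > 0 for p in candidates):
            primes.append(n)
            yield n

first_primes = list(islice(gen_all_primes(), 10))
print(first_primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Because the generator never ends on its own, the caller decides how many values to take, here with `islice`.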
