# Generators

In computer science, a generator is a special routine that can be used to control the iteration behavior of a loop.

A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.

Python provides tools that produce results only when needed:
1. **Generator functions:** These are coded as normal **def** statements but use **yield** to return results one at a time, suspending and resuming their state between each result.
2. **Generator expressions:** These are similar to list comprehensions, but they return an object that produces results on demand instead of building a result list.

Because neither of them constructs a result list all at once, they save memory and let computation time be spread across requests for results. Both achieve this by implementing the **iteration protocol**.
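To make the memory claim concrete, here is a minimal sketch comparing a list comprehension with a generator expression over the same range. The variable names and the range size are illustrative choices, and the exact byte counts depend on the Python version and platform, so only the relative sizes are compared.

```python
import sys

# A list comprehension stores all 100,000 results at once.
numbers_list = [x * 2 for x in range(100_000)]

# A generator expression stores only the state needed to produce the next result.
numbers_gen = (x * 2 for x in range(100_000))

# sys.getsizeof measures only the container object itself,
# not the integers it refers to, but the gap is still large.
list_size = sys.getsizeof(numbers_list)
gen_size = sys.getsizeof(numbers_gen)
print(list_size > gen_size)

# Both produce the same values when consumed.
same_total = sum(numbers_list) == sum(numbers_gen)
print(same_total)
```

Note that sum() consumes numbers_gen, so it cannot be iterated a second time, while numbers_list can be reused.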

## Generator Functions: yield vs. return

We can write functions that send back a value and can later be resumed, picking up where they left off. Such functions are called **generator functions** because they generate a sequence of values over time.

Generator functions are not much different from normal functions, and they are coded with def statements. When created, however, they are automatically made to implement the iteration protocol so that they can appear in iteration contexts.

Normal functions return a value and then exit. But generator functions automatically **suspend and resume** their execution. Because of that, they are often a useful alternative to both computing an entire series of values up front and manually saving and restoring state in classes. Because the state that generator functions retain when they are suspended includes their local scope, their local variables retain information and make it available when the functions are resumed.
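As a small illustration of local state surviving between suspensions, the sketch below uses a hypothetical running_total() generator (not part of the original text). Its local variable total keeps its value each time the function is suspended and resumed, with no class or global variable needed.

```python
def running_total(values):
    """Yield the cumulative sum after each value (illustrative helper)."""
    total = 0                 # local variable, preserved between yields
    for v in values:
        total += v
        yield total           # suspend here; total is remembered


totals = running_total([1, 2, 3, 4])
print(next(totals))   # 1
print(next(totals))   # 3
print(list(totals))   # [6, 10]  (the remaining values)
```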

The primary difference between generator and normal functions is that a generator **yields** a value, rather than **returns** a value. The yield suspends the function and sends a value back to the caller while retaining enough state to let the function resume immediately after the last yield that ran. This allows the generator function to produce a series of values over time rather than computing them all at once and sending them back in a list.

Generators are closely bound up with the iteration protocol. Iterator objects define a **__next__()** method, which either returns the next item in the iteration or raises the special **StopIteration** exception to end it. An iterable object's iterator is fetched with the **iter** built-in function.
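The protocol can be stepped through by hand. This sketch uses a plain list, with illustrative variable names, to show iter() fetching an iterator and next() driving it until StopIteration is raised.

```python
letters = ['a', 'b']      # a list is iterable, but it is not its own iterator
it = iter(letters)        # fetch an iterator from the iterable

first = next(it)          # equivalent to it.__next__()
second = next(it)
print(first, second)      # a b

exhausted = False
try:
    next(it)              # no items left
except StopIteration:
    exhausted = True      # this is the signal a for loop uses to stop
print(exhausted)          # True
```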

**for** loops use this iteration protocol to step through a sequence or value generator if the protocol is supported. Otherwise, iteration falls back on repeatedly indexing sequences.
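The indexing fallback can be seen with a class that defines only __getitem__. The Squares class below is a hypothetical example written for this sketch. It has no __iter__ or __next__ method, yet a for loop still works by indexing from 0 upward until IndexError is raised.

```python
class Squares:
    """Hypothetical sequence-like class that defines only __getitem__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)   # signals the end of the sequence
        return index * index


collected = []
for value in Squares(4):      # indexes 0, 1, 2, ... until IndexError
    collected.append(value)
print(collected)              # [0, 1, 4, 9]
```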

To support this protocol, functions containing a yield statement are compiled specially as generators. They return a generator object when they are called. The returned object supports the iteration interface with an automatically created __next__() method to resume execution. Generator functions may also have a return statement, which simply terminates the generation of values by raising a StopIteration exception, just as happens after any normal function exit.
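A short sketch of the return behavior, using a hypothetical first_n() generator function: the return statement ends the generator, and the caller sees that as a StopIteration exception. The generator object is also its own iterator.

```python
def first_n(n):
    """Yield 0, 1, ..., n - 1, then stop (illustrative helper)."""
    i = 0
    while True:
        if i >= n:
            return            # ends the generator; the caller sees StopIteration
        yield i
        i += 1


gen = first_n(2)
print(iter(gen) is gen)       # True: a generator is its own iterator
print(next(gen))              # 0
print(next(gen))              # 1

finished = False
try:
    next(gen)
except StopIteration:
    finished = True
print(finished)               # True
```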

The net effect is that generator functions, coded as def statements containing a yield statement, are automatically made to support the iteration protocol and thus may be used in any iteration context to produce results over time and on demand.

Let's look at the interactive example below:

In [2]:
def create_counter(n):
    print('create_counter()')
    while True:
        yield n
        print('increment n')
        n += 1

c = create_counter(2)
print(c)

print(next(c))
print(next(c))
print(next(c))

<generator object create_counter at 0x7fe13b894eb0>
create_counter()
2
increment n
3
increment n
4


Here are the things happening in the code:
1. The presence of the yield keyword in create_counter() means that this is not a normal function. It is a special kind of function which generates values one at a time. We can think of it as a resumable function. Calling it will return a generator that can be used to generate successive values of n.
2. To create an instance of the create_counter() generator, just call it like any other function. Note that this does not actually execute the function code. We can tell this because the first line of the create_counter() function calls print(), but nothing was printed from the line:

In [3]:
c = create_counter(2)

3. The create_counter() function returns a generator object.
4. The next() function takes a generator object and returns its next value. The first time we call next() with the counter generator, it executes the code in create_counter() up to the first yield statement, then returns the value that was yielded. In this case, that will be 2, because we originally created the generator by calling create_counter(2).
5. Repeatedly calling next() with the same generator object resumes exactly where it left off and continues until it hits the next yield statement. All variables, local state, etc. are saved on yield and restored on next(). The next line of code waiting to be executed calls print(), which prints increment n. After that comes the statement n += 1. Then it loops through the while loop again, and the first thing it hits is the statement yield n, which saves the state of everything and returns the current value of n (now 3).
6. The third time we call next(c), we do all the same things again, but this time n is now 4.
7. Since create_counter() sets up an infinite loop, we could theoretically do this forever, and it would just keep incrementing n and spitting out values. 
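Because such a generator never finishes, passing it straight to list() would never return. One common way to take a bounded slice is itertools.islice from the standard library. The sketch below uses count_from(), a hypothetical quiet variant of create_counter() without the print() calls.

```python
from itertools import islice


def count_from(n):
    """Hypothetical variant of create_counter() without the print() calls."""
    while True:
        yield n
        n += 1


# islice stops after five items, so the infinite loop is never exhausted.
first_five = list(islice(count_from(2), 5))
print(first_five)   # [2, 3, 4, 5, 6]
```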

## Generator Expressions: Iterators with Comprehensions

The notions of iterators and list comprehensions are combined in **generator expressions**. Generator expressions are similar to list comprehensions, but they are enclosed in parentheses instead of square brackets:

In [4]:
# List comprehension makes a list
l = [ x ** 3 for x in range(5)]
print(l)

# Generator expression makes an iterable
g = (x ** 3 for x in range(5))
print(g)

[0, 1, 8, 27, 64]
<generator object <genexpr> at 0x7fe123ff42e0>


Actually, coding a list comprehension is essentially the same as wrapping a generator expression in a **list** built-in call to force it to produce all its results in a list at once:

In [5]:
list(x ** 3 for x in range(5))

[0, 1, 8, 27, 64]

But in terms of operation, generator expressions are very different. Instead of building the result list in memory, they return a generator object. The returned object supports the **iteration protocol** to yield one piece of the result list at a time in any iteration context:

In [6]:
gen = (x ** 3 for x in range(5))
print(next(gen))
print(next(gen))
print(next(gen))
print(next(gen))
print(next(gen))

0
1
8
27
64


Typically, we don't see the **next** iterator machinery under the hood of a generator expression like this because **for** loops trigger **next** for us automatically:

In [7]:
for n in (x ** 3 for x in range(5)):
    print('%s, %s' % (n, n * n))

0, 0
1, 1
8, 64
27, 729
64, 4096


Generator expressions are a memory-space optimization. They do not require the entire result list to be constructed all at once while the square-bracketed list comprehension does. They may also run slightly slower in practice, so they are probably best used only for very large result sets.
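Two practical consequences are sketched here, with illustrative variable names. First, a generator expression can be passed directly to a function such as sum() without extra parentheses, so no intermediate list is built. Second, unlike a list, a generator object can be iterated only once.

```python
# Passed as the sole argument, the generator expression needs no extra parentheses.
total = sum(x ** 3 for x in range(5))
print(total)          # 100

g = (x ** 3 for x in range(5))
first_pass = list(g)      # consumes the generator
second_pass = list(g)     # nothing is left to produce
print(first_pass)         # [0, 1, 8, 27, 64]
print(second_pass)        # []
```

If the results are needed more than once, building a list is usually the simpler choice.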

## Generator: Functions vs. Expressions

The same iteration can be coded with either a generator function or a generator expression. Let's look at the following example which repeats each character in a string five times:

In [8]:
G = (c * 5 for c in 'Python')
list(G)

['PPPPP', 'yyyyy', 'ttttt', 'hhhhh', 'ooooo', 'nnnnn']

The equivalent **generator function** requires a little more code, but as a multistatement function it can code more logic and use more state information if needed:

In [9]:
def repeat5times(x):
    for c in x:
        yield c * 5
G = repeat5times('Python')
list(G)

['PPPPP', 'yyyyy', 'ttttt', 'hhhhh', 'ooooo', 'nnnnn']
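As a sketch of the extra logic and state a generator function can carry, the hypothetical repeat_growing() below skips spaces and repeats each remaining character one more time than the previous one. The counter is state that would be awkward to express in a single generator expression.

```python
def repeat_growing(text):
    """Hypothetical variant of repeat5times() with a filter and a counter."""
    count = 1                 # state carried across yields
    for c in text:
        if c == ' ':
            continue          # extra logic: skip spaces
        yield c * count
        count += 1


result = list(repeat_growing('a bc'))
print(result)   # ['a', 'bb', 'ccc']
```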

Both expressions and functions support automatic and manual iteration. The **list** call in the examples above iterates automatically.
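For comparison, manual iteration with next() works the same way on both forms. This sketch redefines repeat5times() so that it is self-contained, and uses a shorter input string to keep the output small.

```python
def repeat5times(x):
    for c in x:
        yield c * 5


from_function = repeat5times('Py')
from_expression = (c * 5 for c in 'Py')

# Manual iteration: each next() call advances one step.
manual_function = next(from_function)
manual_expression = next(from_expression)
print(manual_function, manual_expression)   # PPPPP PPPPP

# Automatic iteration picks up the remaining items.
rest_function = list(from_function)
rest_expression = list(from_expression)
print(rest_function, rest_expression)       # ['yyyyy'] ['yyyyy']
```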