## Item 16: Consider Generators Instead of Returning Lists

* Calling a generator function does not run its body; the call immediately returns an iterator.
* With each call to the next built-in function, the iterator advances the generator to its next yield expression.
* Each value passed to yield by the generator is returned by the iterator to the caller.
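
The three points above can be checked with a small sketch. The generator `count_up_to` and the `events` list are hypothetical names introduced only for this illustration; the list records when the generator body actually starts running.

```python
events = []  # hypothetical log: records when the generator body runs


def count_up_to(limit):
    """Hypothetical generator used only for this illustration."""
    events.append("body started")  # runs on the first next() call, not at call time
    for n in range(1, limit + 1):
        yield n  # execution pauses here until next() is called again


it = count_up_to(2)              # returns an iterator; the body has not started
started_at_call = list(events)   # snapshot of the log, still empty at this point

first = next(it)                 # starts the body and runs to the first yield
second = next(it)                # resumes and runs to the second yield
exhausted = next(it, "done")     # body has finished, so the default is returned

print(started_at_call, events, first, second, exhausted)
```

Passing a default to next is just one way to handle exhaustion; without it, the third call would raise StopIteration.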

### Simple function

In [None]:
# simple function using append method

def index_words(text):
    results = []
    if text:
        results.append(0)
    for i, letter in enumerate(text):
        if letter == ' ':
            results.append(i + 1)
    return results

In [None]:
# input text
address = "Four score and seven years ago..."

In [None]:
res = index_words(address)
res

* Problem 1

    * The code is a bit dense and noisy.
        * Each time a new result is found, I call the append method.
        * There is one line for creating the result list and another for returning it.
        * A better way to write this function is to use a generator.

### Better way using a generator

In [None]:
# easier to read; all interactions with the result list have been eliminated
def index_words_iter(text):
    if text:
        yield 0
    for i, letter in enumerate(text):
        if letter == ' ':
            yield i + 1

In [None]:
index_words_iter(address)
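
The call above only displays a generator object; no indices have been computed yet. As a minimal sketch (repeating the function and input from the cells above so that it runs on its own), the iterator can be advanced by hand with next, and whatever remains can be collected with list:

```python
def index_words_iter(text):
    if text:
        yield 0
    for i, letter in enumerate(text):
        if letter == ' ':
            yield i + 1


address = "Four score and seven years ago..."

it = index_words_iter(address)
first = next(it)    # index of the first word
second = next(it)   # index of the word after the first space
rest = list(it)     # drains the values that have not been consumed yet

print(first, second, rest)  # 0 5 [11, 15, 21, 27]
```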

In [None]:
res_2 = list(index_words_iter(address))

In [None]:
res_2

In [None]:
iter(res_2)

* Problem 2

    * index_words requires all results to be stored in the list before being returned.
    * For huge inputs, this can cause the program to run out of memory and crash.

### Generator that streams input from a file

* The working memory for this function is bounded by the maximum length of one line of input.

In [None]:
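def index_file(f):
    """
    A generator that streams input from a file one line at a time
    and yields outputs one word at a time
    """
    offset = 0  # number of characters consumed so far, newlines included
    for line in f:
        if line:
            yield offset  # a word starts at the beginning of each line
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset  # a word starts right after each space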
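
To try the streaming generator without a file on disk, the sketch below uses io.StringIO as a stand-in for an open file; the two-line sample text is an assumption made for illustration. itertools.islice takes only the first few results, so the generator never has to run to the end of the input.

```python
import io
import itertools


def index_file(f):
    offset = 0
    for line in f:
        if line:
            yield offset
        for letter in line:
            offset += 1
            if letter == ' ':
                yield offset


# Assumed sample input; an in-memory stand-in for an open text file
handle = io.StringIO("Four score and seven\nyears ago...\n")

it = index_file(handle)
first_three = list(itertools.islice(it, 3))  # pulls exactly three results
remaining = list(it)                         # continues where islice stopped

print(first_three, remaining)  # [0, 5, 11] [15, 21, 27]
```

Only the current line is held in memory at any point, which is what keeps the working memory bounded regardless of the total input size.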

In [None]:
path = "../data/address.txt"

In [None]:
# the iterator returned by a generator is stateful and can't be reused

with open(path) as f:
    it = index_file(f)
    print(list(it))
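
The comment in the cell above says the iterator is stateful. A minimal sketch of what that means in practice, using index_words_iter on a short assumed input: once the iterator has been consumed, a second pass yields nothing, and a fresh call to the generator function is needed to iterate again.

```python
def index_words_iter(text):
    if text:
        yield 0
    for i, letter in enumerate(text):
        if letter == ' ':
            yield i + 1


it = index_words_iter("Four score")
first_pass = list(it)    # consumes every value
second_pass = list(it)   # the same iterator is now exhausted

# Calling the generator function again creates a new, independent iterator
fresh_pass = list(index_words_iter("Four score"))

print(first_pass, second_pass, fresh_pass)  # [0, 5] [] [0, 5]
```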

### Things to Remember

* Using generators can be clearer than the alternative of returning lists of accumulated results.
* The iterator returned by a generator produces the sequence of values passed to yield expressions within the generator function’s body, in the order they are yielded.
* Generators can produce a sequence of outputs for arbitrarily large inputs because their working memory doesn’t include all inputs and outputs.