<table style="float: left;">
<tbody>
<tr>
<td ><img src="https://static1.squarespace.com/static/5992c2c7a803bb8283297efe/t/59c803110abd04d34ca9a1f0/1530629279239/" alt="Kenzie Logo" width="93" height="93" /></td>
<td >
<h1>&nbsp;A Brief Tour of Generators&nbsp;</h1>
</td>
</tr>
</tbody>
</table>

Generators are very easy to implement, but a bit difficult to understand.

Generators are used to create _iterators_, but with a different approach than writing an iterator class. A generator is a simple function that produces a series of items one at a time, pausing after each item instead of returning them all at once.


When iteration over a set of items starts using the `for` statement, the generator is run. Once the generator's function code reaches a `yield` statement, the generator yields its execution back to the for loop, returning a new value from the set. The generator function can generate as many values (possibly infinite) as it wants, yielding each one in its turn.

### Iteration Review
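Before we jump into generators, it may be beneficial to review a bit about iteration in Python.  There is a distinction between an _iterator_ and an object that is _iterable_.  An _iterable_ is any object that can hand out an iterator when passed to `iter()`, such as a list or a string.  What exactly is an _iterator_?

 - An iterator is an object that produces the items of an iteration, one at a time.
 - Implements an `__iter__` method, which returns the iterator itself
 - Implements a `__next__` method, which returns the next item or raises `StopIteration`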
 
Iterators are **stateful**, which means they are only able to traverse a sequence ONCE.  After one traversal of a sequence, the iterator is _exhausted_.
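
To make the distinction concrete, here is a small sketch. The `Countdown` class is a hypothetical name invented for this illustration; it shows a hand-written iterator that implements both methods, next to a plain list, which is iterable but is not itself an iterator.

```python
class Countdown:
    """A hand-written iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that the iterator is exhausted
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first_pass = list(countdown)   # [3, 2, 1]
second_pass = list(countdown)  # [] because the iterator is already exhausted

numbers = [3, 2, 1]            # a list is iterable, but it is not an iterator
list_iterator = iter(numbers)  # iter() asks the iterable for a fresh iterator
print(first_pass, second_pass, hasattr(numbers, '__next__'))
```

Notice how much bookkeeping the class needs just to remember where it is. That bookkeeping is what a generator takes care of for you.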

### Generators
Generators arrived way back in Python 2.2, from [PEP-255](https://www.python.org/dev/peps/pep-0255/), and have been enabled by default since Python 2.3.  This PEP introduced the idea of creating an iterator within a single function by using a single new keyword: `yield`.  BUT WAIT you may ask -- "If iterators must be stateful (so they can know when they are exhausted), how can this be done from a function?"

Well, a GENERATOR allows an ordinary function to store iterator state, AND generate the members of a sequence, one at a time.  Producing each value only when it is asked for is known as **LAZY EVALUATION**.  The ordinary function is transformed into a generator function simply by using the `yield` keyword.  Calling (invoking) the function does not run its body; instead, the call returns a generator object.

From the [Python glossary](https://docs.python.org/3/glossary.html):

> **Generator**: A function which returns a generator iterator. It looks like a normal function except that it contains `yield` expressions for producing a series of values usable in a for-loop or that can be retrieved one at a time with the `next()` function.<br>
Each `yield` temporarily suspends processing, remembering the location execution state (including local variables and pending try-statements). When the generator iterator resumes, it picks up where it left off (in contrast to functions which start fresh on every invocation).

When Python sees a function with a `yield` keyword inside, it treats it differently.

Example:


In [None]:
import random

students = ['james', 'doug', 'michael', 'jen', 'clint', 'jenny', 'michelle', 'lea', 'PK']
def student_spotlight_picker():
    """Yields student names in random order, until all students are exhausted"""
    random.shuffle(students)
    print('Shuffled students: ' + str(students))
    for student in students:
        print("Yielding student " + student)
        yield student
        

In [None]:
# Invoke the generator function to get our generator object
it = student_spotlight_picker()

# WHAT THE HELL IS IT???
print(it)
print(type(it))

# Let's peek under the hood
help(it)

### IMPORTANT
Generator functions are not executed when they are invoked, only when the resulting generator is _iterated_ over.  This is an important difference between generators and regular functions.  Python knows the function is a generator function, and will return a generator object during invocation, without executing the function body.

After the function produces the generator object, you must iterate that object according to the Python iteration protocol.
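
A minimal sketch of this behavior follows. The `greeter` function and the `events` list are hypothetical names used only for this illustration; the list records which parts of the function body have actually run.

```python
events = []  # records which parts of the body have run so far


def greeter():
    events.append('body started')
    yield 'hello'
    events.append('body resumed')
    yield 'goodbye'


gen = greeter()                   # calling the function runs none of its body
events_after_call = list(events)  # still empty

first = next(gen)                 # runs the body up to the first yield
events_after_next = list(events)  # only the code before the first yield has run
print(events_after_call, first, events_after_next)
```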

In [None]:
# How to use IT?  Just keep callin' next(it)
next(it)

In [None]:
next(it)

In [None]:
it.__next__()  # next(it) calls this method under the hood (Python 2 spelled it it.next())

NOTICE that generators _freeze their state_ after a yield statement.  They suspend their state of execution until the next `next()` call. 

In [None]:
# Let's exhaust the rest of them!
# The for-loop simply calls next() until the StopIteration exception is raised and then it terminates.
for s in it:
    print(s)

The for-loop was used above, because it follows the Python iteration protocol.  It will continue calling the iterator's `__next__()` method until a `StopIteration` exception is raised.
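
As a rough sketch of what the for-loop does on our behalf, the loop can be written out by hand. The `letters` generator is a hypothetical example, and the real for-loop is implemented inside the interpreter rather than as Python code like this.

```python
def letters():
    yield 'a'
    yield 'b'
    yield 'c'


collected = []
iterator = iter(letters())     # step 1: ask the iterable for an iterator
while True:
    try:
        item = next(iterator)  # step 2: ask for the next value
    except StopIteration:
        break                  # step 3: stop quietly once it is exhausted
    collected.append(item)     # this line plays the role of the loop body

print(collected)
```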

Let's try it again, but without a for-loop this time, so we can see the `StopIteration` exception:

In [None]:
def simple_gen():
    seq = [1, 2, 3, 4]
    for i in seq:
        yield i

it = simple_gen()
# Looky, no for-loop!
print(next(it))
print(next(it))
print(next(it))
print(next(it))
# Wait for it ....
print(next(it))


### Generator return statement
Any `return` statement within a generator function ends the iteration, which raises a `StopIteration` exception in the caller.

In [None]:
def simple_gen():
    yield 'Michael'
    yield 'Jenny'
    yield 'Lea'
    return  # raises StopIteration!

it = simple_gen()
print(next(it))
print(next(it))
print(next(it))
print(next(it))
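
In Python 3, a generator's `return` statement may also carry a value, which is attached to the `StopIteration` exception as its `value` attribute. The sketch below uses a hypothetical `names_then_done` generator to show where that value ends up.

```python
def names_then_done():
    yield 'Michael'
    yield 'Jenny'
    return 'no more students'  # becomes the value of the StopIteration exception


gen = names_then_done()
yielded = [next(gen), next(gen)]

try:
    next(gen)                  # nothing left to yield, so the return is reached
except StopIteration as exc:
    final_value = exc.value

print(yielded, final_value)
```

A for-loop swallows the `StopIteration` exception, so the returned value is only visible when you call `next()` yourself as shown here.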

### What happens if we use `iter()` on a generator?

In [None]:
def simple_gen():
    yield 'PK'
    yield 'Doug'
    yield 'James'
    return
gen = simple_gen()

if iter(gen) == gen.__iter__() == gen:
    print("Same generator object instance!")
    
if gen is iter(gen):
    print("Generator is its own iterator!")

if id(gen) == id(iter(gen)):
    print("Stop me when this gets old")

## Exercise
In the previous example, we knew all the student names up front.  We could have just written a for-loop to iterate the list itself.  But what if the list is potentially huge?  Like an infinite series of numbers?


<img align=left src="https://upload.wikimedia.org/math/7/6/f/76f99713cf111eb035d908228c351710.png" width=200px/>
<br clear=left>



Write a generator function which yields the Fibonacci series. The series is calculated as follows: the first two numbers of the series are always equal to 1, and each consecutive number is the sum of the previous two numbers. Hint: Can you use only two variables in the generator function? Remember that assignments can be done simultaneously...



In [None]:
# fill in this function
def fib():
    pass

# testing code
import types
# make sure you wrote a generator!
assert type(fib()) is types.GeneratorType, "You did not write a generator function!"

# If you make it here, let's print the first 11 values from your fib generator (indices 0 through 10)
for i, n in enumerate(fib()):
    print("fib(%d) = %d" % (i, n))
    if i == 10:
        break
    

In [None]:
# Here is one solution (b64)
soln = 'CmRlZiBmaWIoKToKICAgIGEsIGIgPSAwLCAxCiAgICB3aGlsZSBUcnVlOiAgICAgICAgICAgICMgRmlyc3QgaXRlcmF0aW9uOgogICAgICAgIHlpZWxkIGEgICAgICAgICAgICAjIHlpZWxkIDAgdG8gc3RhcnQgd2l0aCBhbmQgdGhlbgogICAgICAgIGEsIGIgPSBiLCBhICsgYiAgICAjIGEgd2lsbCBub3cgYmUgMSwgYW5kIGIgd2lsbCBhbHNvIGJlIDEsICgwICsgMSkK'

### When should I use a generator?
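The general rule of thumb is that a generator can often replace a function that builds and returns a list.  Look for a function pattern that accumulates something into a list, during a loop.  If a caller still needs a real list (for `len()`, indexing, or more than one pass), it can wrap the generator in `list()`.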

To use a generator instead, just insert a `yield` statement at the point of accumulation.

Example:

In [None]:
# A familiar function to all ...
def div_by_5_and_7(max_num):
    """Returns a list of numbers that are divisible by 5 AND 7"""
    result = []
    for n in range(1, max_num + 1):
        if not n%5 and not n%7:
            result.append(n)
    return result

div_by_5_and_7(500)

In [None]:
# Presto Chango
def div_by_5_and_7(max_num):
    """Yields the numbers up to max_num that are divisible by 5 AND 7"""
    for n in range(1, max_num + 1):
        if not n%5 and not n%7:
            yield n
            
list(div_by_5_and_7(500))
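
Because the generator version produces values lazily, the caller decides how many values to take, and the upper bound is no longer required. The sketch below assumes a hypothetical unbounded variant named `div_by_5_and_7_forever`, and uses `itertools.islice` to take only the first few results.

```python
from itertools import count, islice


def div_by_5_and_7_forever():
    """Yields every positive number divisible by 5 AND 7, with no upper bound"""
    for n in count(1):
        if not n % 5 and not n % 7:
            yield n


# Only as many numbers are computed as the caller asks for
first_four = list(islice(div_by_5_and_7_forever(), 4))
print(first_four)  # [35, 70, 105, 140]
```

A list-returning version of this function could never finish, since it would have to collect an infinite number of results before returning.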

## Real-world example: Database Chunking
The function below acts as a wrapper around `dbcursor.fetchmany()`.  A business may use very large datasets for analytics or reporting.  If the dataset is larger than the available system memory, it's not possible to fetch the entire set in a single database read.  However, if the data is fetched one row at a time, every row pays the cost of a network round trip.  Fetching in chunks and yielding the rows one by one gives the caller a simple row-at-a-time loop without either problem.

In [None]:
def fetch_many_wrapper(dbcursor, count=20000):
    """Fetch data in chunks, instead of one row at a time"""
    done = False
    while not done:
        items = dbcursor.fetchmany(count)
        done = len(items) == 0
        if not done:
            for item in items:
                yield item
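
Here is one way the wrapper might be used. This is only a sketch: it assumes an in-memory SQLite database with a made-up `readings` table, chosen because the standard library's `sqlite3` cursor provides `fetchmany()`. The wrapper is repeated so that the example runs on its own.

```python
import sqlite3


def fetch_many_wrapper(dbcursor, count=20000):
    """Repeated from above so that this sketch runs on its own"""
    done = False
    while not done:
        items = dbcursor.fetchmany(count)
        done = len(items) == 0
        for item in items:
            yield item


# Hypothetical in-memory table, for demonstration only
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE readings (id INTEGER, value REAL)')
connection.executemany(
    'INSERT INTO readings VALUES (?, ?)',
    [(i, i * 0.5) for i in range(10)],
)

cursor = connection.execute('SELECT id, value FROM readings ORDER BY id')
# Rows are fetched 4 at a time, but the caller sees one row per iteration
rows = list(fetch_many_wrapper(cursor, count=4))
connection.close()
print(len(rows), rows[0], rows[-1])
```

With a real dataset you would loop over the wrapper directly rather than calling `list()` on it, since building a list would pull every row into memory again.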

## Conclusions
Python generators are a powerful but misunderstood tool. They are often treated as too difficult a concept for beginning programmers to learn, which creates the illusion that beginners should hold off on learning generators until they are ready.

Generators are lazy because they only give us a value when we ask for it. The ultimate result is that generators are incredibly memory efficient, which makes them a perfect candidate for reading and using "Big Data" files. Once we ask for the next value of a generator, the generator no longer holds on to the old value, so it can be discarded as soon as our own code is done with it. Once we traverse the entire generator and drop our reference to it, the generator itself is discarded from memory as well.
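
A rough way to see the difference is to compare the size of a list with the size of a generator object that produces the same values. The `squares` generator below is a hypothetical example. Note that `sys.getsizeof` reports only the size of the container object itself, so treat the numbers as an indication rather than a precise measurement.

```python
import sys


def squares(limit):
    """Yields the squares of 0 up to (but not including) limit"""
    for n in range(limit):
        yield n * n


squares_list = list(squares(1_000_000))  # every value is stored at once
squares_gen = squares(1_000_000)         # values are produced on demand

list_bytes = sys.getsizeof(squares_list)  # grows with the number of items
gen_bytes = sys.getsizeof(squares_gen)    # stays small regardless of limit
print(list_bytes, gen_bytes)

total = sum(squares_gen)  # consumes the generator one value at a time
```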

Generators provide for **Lazy evaluation**.  Being lazy is (sometimes) good.