Generator functions allow us to write a function that can send back a value and then later resume where it left off. This lets us generate a sequence of values over time, rather than holding all of them in memory at once.

The main difference in syntax is the use of a yield statement. Calling a generator function does not run its body, return a value, and exit. Instead, the call returns a generator object that supports the iteration protocol, and the body runs piece by piece as values are requested.

Generator functions automatically suspend and resume their execution and state around the last point of value generation. The advantage is that instead of having to compute an entire series of values up front, the generator computes one value and waits until the next value is requested.
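As a minimal sketch of this suspend-and-resume behavior, the example below records the order in which things happen. The names two_values and log are made up for illustration only.

```python
log = []  # records the order of events (illustrative helper, not part of any API)

def two_values():
    # Hypothetical generator used only to show where execution pauses.
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2

gen = two_values()         # creates the generator object; the body has not run yet
log.append("created")
first = next(gen)          # runs the body up to the first yield, then pauses
log.append("got first")
second = next(gen)         # resumes right after the first yield

print(first, second)
print(log)
```

Note that "created" is logged before "started": calling the generator function only builds the generator object, and the body begins executing on the first next() call.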

For example, the range() function doesn't produce a list in memory for all the values from start to stop. Instead it just keeps track of the start, stop, and step size, and produces numbers on demand. (Strictly speaking, range() returns a lazy range object rather than a generator, but the memory-saving idea is the same.) If a user did need the list, they have to convert the range object to a list with list(range(0,10)).
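A quick way to see this laziness is to compare a very large range with a much smaller list. This is a rough sketch: sys.getsizeof reports only the size of the container object itself in CPython, so treat the comparison as indicative rather than exact.

```python
import sys

r = range(0, 10)
as_list = list(r)                  # explicit conversion when a real list is needed
print(as_list)

big = range(10**9)                 # no billion-element list is built here
small_list = list(range(1000))

print(len(big))                    # a range still knows its length
# The huge range object takes less space than a 1000-element list.
print(sys.getsizeof(big) < sys.getsizeof(small_list))
```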

Let's start by creating a normal function. 

In [1]:
def create_cubes(n):
    result = []
    for x in range(n):
        result.append(x**3)
    return result

In [2]:
create_cubes(10)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

The way the code is written with the empty list, we have to keep that entire list in memory. Sometimes that may be useful, but say you wanted to do:

In [3]:
for x in create_cubes(10):
    print(x)

0
1
8
27
64
125
216
343
512
729


We really only needed one value at a time, but we are holding all of them in memory for the entire execution of the for loop. We just need the previous value and the formula to generate the next number. So rather than holding the entire list in memory, it would be better if we yielded the numbers. 

In [4]:
def create_cubes(n):
    for x in range(n):
        yield x**3

In [5]:
for x in create_cubes(10):
    print(x)

0
1
8
27
64
125
216
343
512
729


Notice that we get back the same results, but now create_cubes() is far more memory efficient. So now the create_cubes() function is a generator. However, keep in mind that if you call create_cubes(10) on its own, you don't get the list; you get a generator object with its location in memory. You have to iterate through it if you want the numbers.

In [6]:
create_cubes(10)

<generator object create_cubes at 0x0000024AA383D848>

You can pass the generator object to list() to get back a list.

In [7]:
list(create_cubes(10))

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
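To make the memory difference between the two versions concrete, here is a rough comparison. The function names cubes_as_list and cubes_as_generator are renamed copies of the two create_cubes versions, introduced here only so both can exist side by side. sys.getsizeof measures just the list or generator object itself, not the integers it refers to, so the real gap is larger than what is shown.

```python
import sys

def cubes_as_list(n):
    # Builds and returns the whole list at once.
    return [x**3 for x in range(n)]

def cubes_as_generator(n):
    # Produces one cube at a time.
    for x in range(n):
        yield x**3

n = 100_000
list_size = sys.getsizeof(cubes_as_list(n))
gen_size = sys.getsizeof(cubes_as_generator(n))

print(list_size > gen_size)   # True

# Both versions still produce the same values.
same_values = list(cubes_as_generator(10)) == cubes_as_list(10)
print(same_values)            # True
```

The generator's size stays the same no matter how large n is, while the list grows with n.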

Let's do an example for the Fibonacci sequence.

In [9]:
def gen_fibon(n):
    
    a = 1
    b = 1
    for i in range(n):
        yield a
        a,b = b, a+b

In [11]:
for num in gen_fibon(10):
    print(num)

1
1
2
3
5
8
13
21
34
55


The key to really understanding generators is the next() function and the iter() function. 

In [12]:
def simple_gen():
    for x in range(3):
        yield x

In [14]:
for number in simple_gen():
    print(number)

0
1
2


In [16]:
g = simple_gen()

In [17]:
print(next(g))

0


In [18]:
print(next(g))

1


This is what the generator object is doing internally each time execution reaches the yield keyword: it remembers where it left off and returns the next value based on whatever formula it is following.

The for loop is repeatedly calling next() on the generator object. 
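Roughly speaking, a for loop over a generator behaves like the while loop sketched below. This is a simplified picture of what Python does for us: it obtains an iterator, calls next() until a StopIteration exception signals that the generator is finished, and then exits quietly. The name collected is just an illustrative variable.

```python
def simple_gen():
    for x in range(3):
        yield x

collected = []
iterator = iter(simple_gen())   # for a generator, iter() returns the generator itself

while True:
    try:
        value = next(iterator)  # ask for the next value
    except StopIteration:       # raised when the generator has no more values
        break
    collected.append(value)     # this is the "body" of the loop

print(collected)                # [0, 1, 2]
```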

The iter() function returns an iterator for an iterable object. For example, we know we can iterate through a string with a for loop.

In [19]:
s = 'hello'

In [20]:
for letter in s:
    print(letter)

h
e
l
l
o


This doesn't mean we can iterate through s using next(), because the string object is not an iterator...

In [21]:
next(s)

TypeError: 'str' object is not an iterator

To turn the string into an iterator so we can call next() on it...

In [22]:
s_iter = iter(s)

In [23]:
next(s_iter)

'h'
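If we keep calling next() on a string iterator, it eventually runs out of letters and raises StopIteration, just like a generator does. The sketch below shows that, along with the optional second argument to next(), which supplies a default value instead of raising. The names letters, exhausted, and fallback are illustrative.

```python
s = 'hello'
s_iter = iter(s)

# Pull all five letters out one at a time.
letters = [next(s_iter) for _ in range(5)]
print(letters)

# A sixth call has nothing left to return.
exhausted = False
try:
    next(s_iter)
except StopIteration:
    exhausted = True
print(exhausted)

# Passing a default avoids the exception.
fallback = next(s_iter, None)
print(fallback)
```

The original string s is unchanged, so calling iter(s) again gives a fresh iterator that starts back at 'h'.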