# Python Generator Functions

In [1]:
import numpy as np
import pandas as pd
import sys


## References

<https://jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/>  
<http://inmachineswetrust.com/posts/understanding-generators/#cell3>  
<http://tech.pro/tutorial/2136/a-gentle-introduction-to-generators-in-python>  

## Overview

This summary is drawn from Jeff Knupp's [blog](https://jeffknupp.com/blog/2013/04/07/improve-your-python-yield-and-generators-explained/).

A generator is a function that contains one or more `yield` expressions. Calling it does not run the body; it returns a generator object, which is a special kind of iterator. Each time a value is requested, the body runs until it reaches a `yield`, which hands a value and control back to the caller while the generator keeps its local state and its position in the code. The next request resumes execution immediately after that `yield`, using the state left behind by the previous call. A `return` statement (including the implicit `return None` at the end of the function) ends the generator and discards its internal state.

In [2]:
def simple_generator_function():
    yield 1
    yield 2
    yield 3
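
The pause-and-resume behaviour can be made visible with a small sketch. The generator `counting_generator` and its `log` argument are hypothetical names introduced only for this illustration; they are not part of the original notebook.

```python
def counting_generator(log):
    """Hypothetical generator that records when each part of its body runs."""
    log.append("started")
    total = 0  # local state that survives between next() calls
    for step in (1, 2, 3):
        total += step
        yield total  # execution pauses here and hands `total` to the caller
        log.append(f"resumed after yielding {total}")
    log.append("finished")


log = []
gen = counting_generator(log)
log_before_first_next = list(log)  # still empty: the body has not started

first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes after the first yield, runs to the second

print(first, second)
print(log)
```

Calling the function only creates the generator object, so nothing is logged until the first `next()`. The running `total` is kept between calls without any explicit bookkeeping by the caller.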

An iterator is an object that provides a series of values, one for each call to the built-in `next()` function (which invokes the object's `__next__()` method). Iterators can also be used in `for` statements, which call `next()` implicitly.

In [3]:
for value in simple_generator_function():
    print(value)

1
2
3


In [4]:
our_generator = simple_generator_function()
print(next(our_generator))
print(next(our_generator))
print(next(our_generator))


1
2
3
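
For comparison, the iterator that the generator function provides automatically could be written by hand roughly as follows. This is only a sketch: the class name `SimpleIterator` is hypothetical, and it mimics just the counting behaviour of `simple_generator_function`, not every feature of a real generator object.

```python
class SimpleIterator:
    """Hypothetical hand-written equivalent of simple_generator_function()."""

    def __init__(self):
        self._current = 0  # state that a generator would keep implicitly

    def __iter__(self):
        # An iterator returns itself from __iter__(), so it works in for loops.
        return self

    def __next__(self):
        if self._current >= 3:
            raise StopIteration  # signals that the series is finished
        self._current += 1
        return self._current


def simple_generator_function():
    yield 1
    yield 2
    yield 3


from_class = list(SimpleIterator())
from_generator = list(simple_generator_function())
print(from_class, from_generator)
```

Both objects produce the same series; the generator version simply leaves the state handling and the `StopIteration` signalling to Python.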


If a generator function executes `return` or reaches the end of its definition, a `StopIteration` exception is raised. This signals to whoever was calling `next()` that the generator is exhausted (this is normal iterator behavior).


In [5]:
our_generator = simple_generator_function()
for value in our_generator:
    pass
# uncomment the next line to cause a StopIteration error:
# print(next(our_generator))
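
When `next()` is called by hand rather than by a `for` loop, the exhausted case has to be handled explicitly. A minimal sketch of two common options, catching `StopIteration` or passing a default value to `next()`:

```python
def simple_generator_function():
    yield 1
    yield 2
    yield 3


our_generator = simple_generator_function()
consumed = list(our_generator)  # exhausts the generator

# Option 1: catch the exception raised by an exhausted generator.
try:
    next(our_generator)
    exhausted = False
except StopIteration:
    exhausted = True

# Option 2: give next() a default, which is returned instead of raising.
fallback = next(our_generator, None)

print(consumed, exhausted, fallback)
```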

Create a new generator by calling the generator function again:

In [6]:
new_generator = simple_generator_function()
print(next(new_generator)) # perfectly valid

1


The following example calculates prime numbers using a generator function.
Note that `yield` is only executed when the number is prime; it hands the prime value and control back to whoever is iterating over `get_primes`.

The prime calculation is enclosed in a `while` loop so that execution does not fall through to the implied `return None` at the end of the function until `maxNum` has been passed; with the default `maxNum` of `sys.maxsize` the loop is effectively unbounded. This is a fairly common idiom in generators (often written as `while True:`): the function seldom reaches an explicit or implied `return` statement.


In [7]:
def is_prime(number):
    if number > 1:
        if number == 2:
            return True
        if number % 2 == 0:
            return False
        for current in range(3, int(np.sqrt(number) + 1), 2):
            if number % current == 0: 
                return False
        return True
    return False

def get_primes(number, maxNum=sys.maxsize):
    """Yield the primes between number and maxNum (inclusive)."""
    while number <= maxNum:
        if is_prime(number):
            yield number
        number += 1

In [8]:
for prime in get_primes(7):
    print(prime)
    if prime > 11:
        break

7
11
13


In [9]:
our_generator = get_primes(7)
print(next(our_generator))
print(next(our_generator))
print(next(our_generator))

7
11
13
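
Instead of breaking out of the loop by hand, a fixed number of values can be taken from an unbounded generator with `itertools.islice`. The sketch below is self-contained and uses its own helper names: `primes_from` is a hypothetical unbounded variant of `get_primes`, and `is_prime` here uses `math.isqrt` (Python 3.8 or later) rather than NumPy.

```python
import itertools
import math


def is_prime(number):
    """Trial division by odd numbers up to the integer square root."""
    if number < 2:
        return False
    if number % 2 == 0:
        return number == 2
    for divisor in range(3, math.isqrt(number) + 1, 2):
        if number % divisor == 0:
            return False
    return True


def primes_from(start):
    """Hypothetical unbounded variant of get_primes(): never returns."""
    number = start
    while True:
        if is_prime(number):
            yield number
        number += 1


# islice stops requesting values after five, so the infinite loop is harmless.
first_five = list(itertools.islice(primes_from(7), 5))
print(first_five)
```

Because values are only computed on request, the generator never does more work than the consumer asks for.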


## Pandas

Since Pandas 0.19, a generator can be used to initialise the values in a DataFrame. The DataFrame constructor appears to consume the generator to fill the rows of the DataFrame.  
<https://stackoverflow.com/questions/18915941/create-a-pandas-dataframe-from-generator>  
<https://codereview.stackexchange.com/questions/162402/importing-database-of-4-million-rows-into-pandas-dataframe>


In [10]:
pd.DataFrame(get_primes(7,20))

    0
0   7
1  11
2  13
3  17
4  19


## Generator expressions

List comprehensions are a convenient way to construct a customized list object. For example, let's create a list containing the cubes of even integers between 0 and 20 inclusive and display each element.

In [11]:
cubes_list = [x ** 3 for x in range(21) if x % 2 == 0]

for i in cubes_list:
    print(i, end=' ')

0 8 64 216 512 1000 1728 2744 4096 5832 8000 

If we swap out the brackets for parentheses in the list comprehension, we have a generator expression, which produces a generator that successively yields the same sequence of numbers. It's very important to note that a generator expression is not a tuple comprehension: it evaluates to a generator object, not a tuple object.

In [12]:
cubes_gen = (x ** 3 for x in range(21) if x % 2 == 0)

for i in cubes_gen:
    print(i, end=' ')

0 8 64 216 512 1000 1728 2744 4096 5832 8000 
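
One practical consequence of getting a generator rather than a tuple or list is that the values can be consumed only once. A short sketch of that difference, using the same cubes as above (the variable names for the totals are introduced here for illustration):

```python
import types

cubes_gen = (x ** 3 for x in range(21) if x % 2 == 0)

# The expression produces a generator object, not a tuple.
is_generator = isinstance(cubes_gen, types.GeneratorType)
is_tuple = isinstance(cubes_gen, tuple)

first_total = sum(cubes_gen)   # consumes every value
second_total = sum(cubes_gen)  # nothing left: the generator is exhausted

# A list keeps its elements, so it can be traversed repeatedly.
cubes_list = [x ** 3 for x in range(21) if x % 2 == 0]
list_totals = (sum(cubes_list), sum(cubes_list))

print(is_generator, is_tuple)
print(first_total, second_total, list_totals)
```

If the values are needed more than once, either rebuild the generator expression or store the results in a list.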