# Introduction to Python Generators
by: PyLadies London (2016)

**Disclaimer: The authors of this Jupyter notebook take no responsibility for any crashes, chaotic-computing-situations or general programming shenanigans that may follow from using this code.**


A generator is a language feature that allows a function to return values lazily (on demand) instead of all at once. Generators were introduced into Python in 2001 by [PEP 255, "Simple Generators"](https://www.python.org/dev/peps/pep-0255/) (PEP stands for Python Enhancement Proposal). In the sections below, we will explore some very basic aspects of Python generators and how to write them.

(Please note: most of the code below is just for demo and does not represent 'clean, beautiful or otherwise desirable code style'. Use at your own peril. )

(Also please note that this is a beginner/intermediate workshop, so if you are a Python-ninja-super-rockstar-programmer-Guido-van-Rossum guru, you will probably spend this workshop bored to tears. :D )

(And the very last note, I promise, is that the humour during this workshop will be bad. I apologise in advance. )

### Pre-requisites
Very basic familiarity with the following Python features is required for this tutorial:
* variables, assigning variables, basic arithmetic manipulation
* functions
* ``for``-loops
* list comprehensions (some examples use them)

### Motivating Example 
Let's start with a small example. Suppose we want to write a function ``generate_number`` which takes some value ``n`` and returns the first ``n`` multiples of 3, that is ``3*i`` for each ``i`` from 0 up to (but not including) ``n``.
We can quickly come up with a function to give us the desired result. 

In [1]:
#sample function 
def generate_number(n):
    return [3*i for i in range(n)]
# let's try it out
generate_number(4)

[0, 3, 6, 9]

For small sizes of ``n``, you will probably be fine with returning the entire list and keeping it in memory for the duration of your program. However, imagine someone decided to call this function with n=1000 or n=100 000 multiple times. Your Python process might be carrying around some heavy memory baggage! In fact, we can use a tool called ``psutil`` to measure what happens to the memory of a Python process when we start generating large lists and keeping them in memory (there are various kinds of memory that you can measure on your PC; I'm being a bit vague here intentionally). See here for a [handy introduction to psutil](http://fa.bianp.net/blog/2013/different-ways-to-get-memory-consumption-or-lessons-learned-from-memory_profiler/) by Fabian Pedregosa. You should note that the blog post is from 2013 and the API for psutil has changed a bit. For the most up-to-date information, please check the [docs](https://pythonhosted.org/psutil/#quick-links).

In [2]:
#mini program to measure memory usage 
import psutil 
import os # this module provides functions for interacting with the OS, e.g. os.getpid()

def generate_number(n):
    return [3*i for i in range(n)]

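def measure_proc_memory():
    proc = psutil.Process(os.getpid()) # handle to the current Python process
    mem_info = proc.memory_info().rss # resident set size in bytes (same value as index [0])
    print 'Memory consumed (bytes): ', mem_info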

def main():
    some_list = generate_number(1000)
    measure_proc_memory()
    
    
main()
    

ImportError: No module named psutil

In fact, if you experiment a bit with ever larger values of ``n``, you will notice that at some point Python will throw a ``MemoryError`` at you.
If you don't need to carry out operations on the whole list at once, you can use a generator to produce the elements one at a time, on demand. Let's convert the original ``generate_number`` function into a generator.

### Writing a generator function

In [3]:
# convert generate_number into a generator
def gen_numbers(n):
    for i in range(n):
        yield 3*i

There is one key concept introduced in the example above:
* the ``yield`` keyword

Instead of returning the entire list like we did in the original example, we are now yielding individual elements one at a time.

In [4]:
gen_numbers(4)

<generator object gen_numbers at 0x7f9dc8d67280>

The second thing you should notice is that calling the ``gen_numbers`` function did not return any numbers. Instead we got a generator object which we can iterate through to produce numbers on-demand.
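To make the "on-demand" point a little more concrete, here is a minimal sketch that compares the size of a fully built list with the size of a generator object, using only ``sys.getsizeof`` from the standard library. The value ``n = 100000`` and the variable names are chosen purely for illustration, and ``print(...)`` is written with parentheses so the snippet also runs on Python 3. Keep in mind that ``sys.getsizeof`` only measures the container object itself, not the numbers stored in it, so treat the figures as a rough indication rather than a proper memory profile.

```python
import sys

def gen_numbers(n):
    # same generator function as above, repeated so the snippet is self-contained
    for i in range(n):
        yield 3 * i

n = 100000  # illustrative size, not taken from the workshop
as_list = [3 * i for i in range(n)]  # every element is computed and stored up front
as_gen = gen_numbers(n)              # nothing has been computed yet

# getsizeof reports the size of the container only (not the integers inside),
# which is still enough to see the difference in scale
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

print("list object: %d bytes" % list_size)
print("generator object: %d bytes" % gen_size)
```

The size of the generator object does not depend on ``n``, because it only remembers where it is in the loop, while the list grows with every element.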

In [5]:
#create a generator object and retrieve its values

In [6]:
gen_obj = gen_numbers(3)
print next(gen_obj) #1st element
print next(gen_obj) #2nd element
print next(gen_obj) #3rd element

0
3
6


After that the generator will be 'exhausted', which means that calling ``next`` on the generator object again will not give you any more elements. Instead, you will get a ``StopIteration`` exception. To start iterating again, we have to create a new generator object by calling ``gen_numbers`` again.

In [7]:
#calling `next` on an exhausted generator will result in StopIteration
print next(gen_obj)

StopIteration: 
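In practice you rarely have to deal with ``StopIteration`` by hand. The sketch below (variable names are illustrative) shows three common ways of living with exhaustion: letting a ``for``-loop handle it, catching the exception yourself, and giving ``next`` a default value to return instead of raising. It uses ``print(...)`` with parentheses so that it also runs on Python 3.

```python
def gen_numbers(n):
    # same generator function as above, repeated so the snippet is self-contained
    for i in range(n):
        yield 3 * i

# 1) a for-loop calls next() for us and stops quietly on StopIteration
collected = []
for value in gen_numbers(3):
    collected.append(value)
print(collected)  # [0, 3, 6]

# 2) catching the exception by hand
gen_obj = gen_numbers(1)
first = next(gen_obj)      # the only element
try:
    next(gen_obj)          # the generator is exhausted now
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True

# 3) next() accepts a default that is returned instead of raising
fallback = next(gen_obj, None)

# a brand new generator object starts from the beginning again
restarted = next(gen_numbers(1))
```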

## Exercises : Part 1
1.1 Write a generator function that yields a single number

1.2 Write a generator function that yields letters from the English alphabet

## Generator Expressions
Generating a list of values with particular characteristics is such a ubiquitous programming task that Python has a dedicated language feature for it, known as 'list comprehensions'. For example, to generate ``3*i`` for every ``i`` between 0 and 4, we can employ the following list comprehension:

In [8]:
[ 3*i for i in range(5)]

[0, 3, 6, 9, 12]

List comprehensions also allow us to include filtering based on conditions. For example, the list comprehension below takes ``3*i`` for every ``i`` between 0 and 15 and keeps only the results whose value modulo 5 is equal to 0 (i.e., they are divisible by 5 without a remainder).

In [9]:
[ 3*i for i in range(16) if 3*i%5==0]

[0, 15, 30, 45]

In the world of generators, a sort-of analogue for the list comprehension is a generator expression. It looks like a list comprehension written with round brackets instead of square ones. For example, we can produce ``3*i`` for every ``i`` between 0 and 4 in a lazy manner by employing the generator expression below. Notice that this does not return a list, but a generator object.

In [10]:
( 3*i for i in range(5))

<generator object <genexpr> at 0x7f9dc8d672d0>

We can then iterate through the resulting generator object using the same ``next`` function we saw before:

In [11]:
gen_obj =(3*i for i in range(5))
next(gen_obj)

0
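One thing that is easy to miss here is that a generator expression can only be walked through once. The sketch below (variable names are illustrative; ``print(...)`` is written so that it also runs on Python 3) takes the first value with ``next``, collects what is left with ``list``, and then shows that nothing remains. It also shows that a generator expression can be handed straight to a function such as ``sum`` without ever building a list.

```python
gen_obj = (3 * i for i in range(5))

first = next(gen_obj)   # consumes the first value
rest = list(gen_obj)    # consumes everything that is left
again = list(gen_obj)   # the generator is exhausted, so this is empty

print(first)  # 0
print(rest)   # [3, 6, 9, 12]
print(again)  # []

# when a generator expression is the only argument of a call,
# the extra pair of parentheses can be dropped
total = sum(3 * i for i in range(5))
print(total)  # 30
```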

We can also use the resulting generator objects in a ``for``-loop : 

In [12]:
for element in ( 'Hello World, ' + name for name in ('Tom', 'Karlie', 'Taylor', 'Gigi')):
    print element

Hello World, Tom
Hello World, Karlie
Hello World, Taylor
Hello World, Gigi


### Nested Generator Expressions
Like nested list comprehensions, generator expressions can also be nested. In the example below we first add 2 to a number (in the innermost expression) and then turn the result into a string. (This is a rather trivial example: you could add 2 and pass the result directly into ``str`` in a single expression, but it illustrates the point of using nested generator expressions.)

In [13]:
#nested generator expressions
gen_obj = ( str(n) for n in (i+2 for i in (1,2,3)))
print next(gen_obj)
print next(gen_obj)
print next(gen_obj)

3
4
5
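Nesting does not change the lazy behaviour: each call to ``next`` pulls a single value through both stages. The sketch below splits the nested expression into two named stages and records when the inner work actually happens. The helper ``add_two`` and the ``log`` list are hypothetical additions made only for this illustration.

```python
log = []  # hypothetical helper list that records when work really happens

def add_two(i):
    log.append("add_two(%d)" % i)
    return i + 2

inner = (add_two(i) for i in (1, 2, 3))  # stage 1: nothing runs yet
outer = (str(n) for n in inner)          # stage 2: still nothing runs

calls_before = len(log)  # no call to add_two so far
first = next(outer)      # pulls exactly one value through both stages
calls_after = len(log)   # add_two has now run once

print(first)  # 3
print(log)    # ['add_two(1)']
```

Giving each stage its own name, as done here, is also a reasonable way to keep longer pipelines readable.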


## Exercises: Part 2
2.1 Write a generator expression that returns the first letter of each word in the list ['New York', 'London', 'Paris', 'Helsinki' ]

2.2 Write a regular Python function that accepts a temperature value as an argument and converts that value from degrees Fahrenheit into degrees Celsius (the formula for this conversion is C = (F - 32) * 5/9; in Python 2, write ``5.0/9`` so that integer division does not turn the factor into 0). Then write a generator expression that yields Celsius values from a list of Fahrenheit values of your choice.

## Bidirectional communication with generators

Based on the previous two sections, we know that introducing the ``yield`` statement into a function turns it into a generator that hands values to its caller. But is it possible to send values from the caller back into the generator? Bidirectional communication between the generator function and its caller was introduced in [PEP 342 (Python 2.5) "Coroutines via Enhanced Generators"](https://www.python.org/dev/peps/pep-0342/). Let's look at a simple example of how we can pass values back into the generator.

In [53]:
def adder(n): #generator function that adds numbers
    i=0
    for _ in range(n):
        add = yield i 
        print '::adder: add has a value: ', add
        i+=add
        print '::adder: i has a value: ', i
        
        
def main():
    gen_obj = adder(10)
    print '::main: Value returned when gen initialised:',next(gen_obj) # we have to call 'next' on a new generator object to 'initialise'/start the generator
    returned_value = gen_obj.send(5) # after the generator has been initialised, we can send values using .send
    print '::main: returned_value: ', returned_value # value returned is 5, because i started at 0 and the sent value 5 was added to it
    returned_value = gen_obj.send(2)
    print '::main: returned_value: ', returned_value
    
main()

::main: Value returned when gen initialised: 0
::adder: add has a value:  5
::adder: i has a value:  5
::main: returned_value:  5
::adder: add has a value:  2
::adder: i has a value:  7
::main: returned_value:  7


Notice what happens when we call ``.send()`` with a value on a brand new generator object, without calling ``next`` on it first:

In [14]:
def main2():
    gen_obj = adder(10)
    #print next(gen_obj) # we have to call 'next' on a new generator object to 'initialise'/start the generator
    returned_value = gen_obj.send(5) # sending a value before the generator has been started raises an error
main2()

TypeError: can't send non-None value to a just-started generator

Notice that the error message says: "can't send non-None value to a just-started generator"
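The wording of that message hints at a second way of starting a generator: a just-started generator does accept ``None``. The sketch below reuses the ``adder`` coroutine (without its print statements) to show both the failing call and ``send(None)``, which behaves like ``next``. Variable names are illustrative, and ``print(...)`` is written so that the snippet also runs on Python 3.

```python
def adder(n):
    # same coroutine as above, without the print statements
    i = 0
    for _ in range(n):
        add = yield i
        i += add

# sending a real value before the generator has reached its first yield fails
fresh = adder(10)
try:
    fresh.send(5)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# send(None) is accepted on a just-started generator and acts like next()
primed = adder(10)
first = primed.send(None)  # runs up to the first yield, which hands back 0
second = primed.send(5)    # add = 5, so i becomes 5 and the next yield hands back 5
print(first)   # 0
print(second)  # 5
```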

### Flow of execution in coroutines
To clarify what happened in the previous section, let's briefly look at another coroutine and use print statements to examine how the execution logic flows within the function.

In [16]:
def simple_coroutine():
    value=0
    print 'Executing everything until the first yield'
    print 'Value is currently: ', value
    value = yield 1
    print 'Executing everything after the first yield'
    print 'Value is currently: ', value
    
obj = simple_coroutine()
next(obj)

Executing everything until the first yield
Value is currently:  0


1

Let's examine a few of the things that happened in the example above:

1) When we called ``next(obj)`` on the generator object, the function body ran up to the first ``yield``, and the yielded value (1 in this case) was returned to us.

In [17]:
obj.send(5)

Executing everything after the first yield
Value is currently:  5


StopIteration: 

When we call ``obj.send(5)``, the sent value is assigned to the variable ``value`` and execution resumes after the first ``yield``. There is nothing left to yield, so the function ends and Python raises ``StopIteration`` to signal that the generator has been exhausted.
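Two details of this flow are worth trying out for yourself: the caller can catch the ``StopIteration`` raised by ``send``, and resuming with plain ``next`` makes the ``yield`` expression evaluate to ``None``. The sketch below is a trimmed variant of ``simple_coroutine``. The ``seen`` list parameter is a hypothetical addition that stands in for the print statements, so that we can inspect what the coroutine received.

```python
def simple_coroutine(seen):
    # 'seen' is a hypothetical list used here instead of print statements
    value = yield 1
    seen.append(value)  # whatever the caller used to resume the coroutine

seen = []

# resume with send(5): the yield expression evaluates to 5
obj = simple_coroutine(seen)
first = next(obj)
try:
    obj.send(5)
except StopIteration:
    pass  # expected: there is no second yield

# resume with next(): the yield expression evaluates to None
obj2 = simple_coroutine(seen)
next(obj2)
try:
    next(obj2)
except StopIteration:
    pass

print(seen)  # [5, None]
```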

## Exercises: Part 3
3.1 Add logic to the ``simple_coroutine`` function so that, after calling ``obj.send(5)``, it yields the integer 2 instead of raising ``StopIteration``.

# Bibliography and Further Reading:
[Scipy Lecture Notes: 2.1 Advanced Python ](http://www.scipy-lectures.org/advanced/advanced_python/#generators)

_Distributed Computing with Python_ by Francesco Pierfederici

[PyCon 2014 - David Beazley: Generators - the Final Frontier ](https://www.youtube.com/watch?v=D1twn9kLmYg)

[PyData Silicon Valley 2014- James Powell: Generators Will Free Your Mind ](https://www.youtube.com/watch?v=RdhoN4VVqq8)