This week we will look at generator functions and exception handling.
Most of the material covered here is based on Chapter 14 in Fluent Python by L. Ramalho.

There are several terms we have encountered in the past, and we need to make sure we are clear on their meaning.

In pure programming terms, the following holds:

<b>Iterator</b> is an object that supports iterating over a sequence of objects (i.e. an iterable). The simplest way to think about it is that an iterator allows us to call 'next' on a sequence of objects. That means that iterators implement the \__next__ method, which enables doing this -> ...[i]

<b>Iterable</b> is an object that implements \__iter__ and/or \__getitem__.
   Iterables return iterators when iter() is called on them (which in turn calls \__iter__). Note that an iterable does not need to implement \__next__. It simply returns an iterator that does.
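
To make the distinction concrete, here is a small sketch using a plain list (the variable names are only illustrative). The list is the iterable; the object returned by iter() is the iterator.

```python
numbers = [10, 20, 30]                # a list is an iterable

print(hasattr(numbers, "__iter__"))   # True
print(hasattr(numbers, "__next__"))   # False: the list itself is not an iterator

it = iter(numbers)                    # iter() calls numbers.__iter__() and returns an iterator
print(hasattr(it, "__next__"))        # True

print(next(it))                       # 10
print(next(it))                       # 20
print(next(it))                       # 30
# one more next(it) would raise StopIteration
```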

<b>Generator</b> in Python is an object that gets produced by a generator function or a generator expression. We have already seen generator expressions in Week 3. Also, Python generators are the 'pythonic' way to create iterators, because implementing the classic Iterator pattern by hand is not idiomatic in Python (i.e. not 'pythonic').
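
As a rough comparison, the sketch below writes the same counter twice: once as a hand-written iterator class and once as a generator function. Both names (CountUpTo and count_up_to) are made up for this illustration.

```python
class CountUpTo:
    """Classic Iterator pattern: the class keeps track of its own state."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self              # an iterator returns itself from __iter__

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration  # signals that there is nothing left
        self.current += 1
        return self.current


def count_up_to(limit):
    """Generator function: Python keeps track of the state for us."""
    current = 0
    while current < limit:
        current += 1
        yield current


print(list(CountUpTo(3)))     # [1, 2, 3]
print(list(count_up_to(3)))   # [1, 2, 3]
```

The generator version needs no \__iter__ or \__next__ of its own: calling the function returns an object that already has both.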

An aside: what happens when you call a function from another function? How does Python know where to return? 
Before branching off to another function, Python saves a pointer to the current program execution state. This state is pushed onto the stack - the space in computer memory that keeps track of what is 'going on'.
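
A small sketch of this idea, using the standard inspect module to peek at the stack (the function names inner and outer are only illustrative):

```python
import inspect

def inner():
    # inspect.stack()[0] is the frame of inner itself,
    # inspect.stack()[1] is the frame of whoever called it
    caller = inspect.stack()[1].function
    return caller

def outer():
    result = inner()     # Python remembers this spot and comes back to it
    return "inner was called from " + result

print(outer())   # inner was called from outer
```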

<b>Generator expressions</b> are short (lambda-like) expressions that return a generator.


<b>Generator functions</b> are explicit functions (explicit because they use the <i>yield</i> keyword).

Thus, both generator expressions and generator functions produce generators.

In [None]:
#iterable superset of iterator, iterator subset of iterable
#iterators can be implemented as generator functions

In [1]:
import pandas as pd
import numpy as np

## In Week 3 We Saw This:

In [2]:
n = 10
genExpr = (i for i in range(1,n,1))   # generator expression that returns a generator

In [3]:
print (next(genExpr)) #implements __next__
print (next(genExpr))
print (next(genExpr))
print (next(genExpr))
print (next(genExpr))

1
2
3
4
5


In [4]:
type(genExpr)

generator

In [None]:
import collections.abc as col   # the Iterable ABC lives in collections.abc (it was removed from collections in Python 3.10)
isinstance(genExpr, col.Iterable)

## We can Implement the Same Functionality with a Generator Function

In [5]:
def genFunc(n):
    
    for i in range(1,n,1):
        yield i # yield hands back i and pauses here, saving the function's state (including where i is in the range);
                # return would end the function for good, so we would have to call it again from the start

In [6]:
x = genFunc(10)

In [7]:
type (x)

generator

In [8]:
print (next(x))
print (next(x))
print (next(x))
print (next(x))
print (next(x))

1
2
3
4
5


## Let's look at a Few More Examples

### Example 1: Fibonacci Series

In [9]:
# implemented recursively

def Fibo_recursive(n):
    
    if n == 0 or n==1:
        return n
    else:
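        return Fibo_recursive(n-1)+Fibo_recursive(n-2) # every call makes two more calls and each needs its own stack frame;
                                                       # a large n is very slow and deep recursion can hit Python's recursion limit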

In [10]:
n = 10
fib_rec = list(map(Fibo_recursive, range(1,n)))

In [11]:
fib_rec

[1, 1, 2, 3, 5, 8, 13, 21, 34]

In [54]:
# implemented as a generator function

In [12]:
def Fibo_generator(n):
    
    a, b = 0,1
    
    for i in range(n):
        yield a
        a, b = b, a+b # a becomes b, b becomes a+b: we build up from the smallest value,
                      # whereas the recursive version starts at n and works its way down
                      # the first yield hands back 0 and then a and b are updated; yield is what makes this a generator

In [13]:
x = Fibo_generator(10)

In [14]:
fib_gen = [i for i in Fibo_generator(n)] # the list comprehension exhausts the generator and collects every yielded value

In [15]:
fib_gen

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

At this stage you should recognize several advantages of the generator function over the recursive approach:

(1) It saves memory

(2) It is easier to read
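
To get a feel for how much bookkeeping the recursive version does, the sketch below counts the function calls needed for a single value. Both functions are re-implementations written for this illustration, and the counter argument is an added assumption, not part of the original Fibo_recursive.

```python
def fibo_recursive_counted(n, counter):
    """Same logic as the recursive version, plus a count of the calls made."""
    counter[0] += 1                      # counter is a one-element list shared by all calls
    if n == 0 or n == 1:
        return n
    return (fibo_recursive_counted(n - 1, counter)
            + fibo_recursive_counted(n - 2, counter))

def fibo_generator(n):
    """Same logic as the generator version: one loop step per value."""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b

counter = [0]
print(fibo_recursive_counted(20, counter))   # 6765
print(counter[0])                            # 21891 calls for this one value
print(list(fibo_generator(21))[-1])          # 6765, reached in 21 loop steps
```

Each of those calls needs its own stack frame while it runs, whereas the generator only ever keeps a and b.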

#### What are the similarities in the execution of these two implementations of Fibo?

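Very informally, both save the state of program execution: recursion saves it on the call stack for every pending call, while a generator saves its own state at each yield.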
Let's look at how yield works:

In [16]:
def yield_next_i(i):
    
    j = i+1
    yield j    # line 1
    j = j+1   
    yield j    # line 2
    j = j+1
    yield j    # line 3

In [17]:
yielder = yield_next_i(100)

In [18]:
print (next(yielder))   # the programme state is saved just after line 1

101


In [19]:
print (next(yielder))   # execution resumes just after line 1, increments j and yields it;
                        # the programme state is now saved just after line 2

102


In [20]:
print (next(yielder))   # the last value is yielded; the state is saved just after line 3

103


In [21]:
print (next(yielder)) # no yield is left, so the generator is exhausted and raises StopIteration; use yield for very large n in your iterable

StopIteration: 
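
StopIteration is how a generator signals that it is exhausted. A for loop or a list comprehension catches it for us; the sketch below does the same by hand with try - except (three_values is a made-up stand-in for the generator function above).

```python
def three_values():
    yield 101
    yield 102
    yield 103

gen = three_values()

# Roughly what a for loop does behind the scenes
collected = []
while True:
    try:
        value = next(gen)
    except StopIteration:
        break                  # the generator is exhausted, leave the loop quietly
    collected.append(value)

print(collected)               # [101, 102, 103]

# next() also accepts a default that is returned instead of raising
print(next(gen, "nothing left"))   # nothing left
```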

### Example 2: Parsing Sentence One Word at a Time:

In [22]:
def sentence_parser(sentence):
    ''' Parses a string and returns one word at a time
    '''
    
    for word in sentence.split(" "):
        yield word

In [23]:
sentence = "Generators are everywhere in Python. They make our lives so much simpler!"

In [24]:
my_parser = sentence_parser(sentence)

In [25]:
print (next(my_parser))
print (next(my_parser))
print (next(my_parser))
print (next(my_parser))

Generators
are
everywhere
in


In [26]:
# I can also capture the whole thing by using a list comprehension

In [27]:
print ([word for word in sentence_parser(sentence)])

['Generators', 'are', 'everywhere', 'in', 'Python.', 'They', 'make', 'our', 'lives', 'so', 'much', 'simpler!']


### Connecting to svv, querying a table and looping through the results a batch at a time

In [28]:
# my svv implements psycopg2.connect and returns the connection, models trained on a batch
import svv_connector as svv

def get_svv_results(query, batch_size):
    ''' Connects to svv, executes a query and provides results back one batch_size at a time
    '''
    
    con = None
    cur = None  #defined up front so that finally can run safely even if connecting fails
    
    try: 
        con = svv.get_svv_connection()

        cur = con.cursor()
        cur.execute(query) #the whole query still runs in Redshift
        
        while True:
            res = cur.fetchmany(batch_size) #brings only batch_size rows into memory at a time
            if not res: #no rows left, so the generator is exhausted
                break
            yield res #hands back one batch and pauses here until the next call
    
    except Exception as inst:
        print(inst) #print the exception: wrong query, bad connection, bad logic in the function
    finally:
        #always release whatever was actually opened
        if cur is not None:
            cur.close()
        if con is not None:
            con.close()

In [29]:
query = "Select Top 1000 * from sandbox.viewers_summary_hc_2018c_es;"

In [30]:
my_results = get_svv_results(query, 100)

In [31]:
print (next(my_results)) #first 100 rows

[('9a9215e2-8c1a-4e82-9789-65c18bde3ad9', 'F', 43, 'ad.axa.h5.hw.wo', 'CM0 8', 'SOUTHMINSTER', 1, 'Active', '3', '10', '07', '0', '0', '11', '3', '11', '2', '1', '0', '4', 47700, 'H', '3', 0, 0, 0, 0, '[{"id":"LONG-RUNNING DRAMA","ranking":"1"}]', Decimal('89483.8000000000'), Decimal('86638.4110000000'), Decimal('0.5080'), 'CM0_8', 'CM0_8', 'Suburbanites', 'Semi-detached suburbia', 'White suburban communities', 0), ('bc345304-d35c-4dbf-b0d6-8134d38be8f2', 'F', 67, 'ad.axa.hw.wo', 'KY16 0', 'ST. ANDREWS', 0, 'Active', '5', '05', '05', '1', '0', '05', '5', '13', '5', '1', '0', '3', 31900, 'G', '1', 0, 0, 0, 0, '[{"id":"CHAT AND MAGAZINE","ranking":"1"},{"id":"CELEBRITIES","ranking":"2"}]', Decimal('560.2320000000'), Decimal('461579.7670000000'), Decimal('0.0012'), 'KY16_0', 'KY16_0', 'Rural residents', 'Rural tenants', 'Rural white-collar workers', 0), ('685d57bd-1acc-4b61-ac2f-5d22c45570fa', 'M', 50, 'ad.me', 'SK7 3', 'STOCKPORT', 1, 'Active', '3', '01', '04', '1', '0', '11', '2', '10',

In [32]:
print (next(my_results))

[('ee51beb4-5183-47ba-b13f-033ca8a47035', 'F', 36, 'ad.axa.hc.wo', 'SE24 0', 'LONDON', 5, 'Active', '1', '03', '05', '1', '1', '05', '3', '04', '1', '1', '0', '7', 82200, 'C', '8', 0, 0, 0, 0, '[{"id":"CRIME AND THRILLERS","ranking":"1"}]', Decimal('14027.8130000000'), Decimal('71850.1830000000'), Decimal('0.1633'), 'SE24_0', 'SE24_0', 'Ethnicity central', 'Aspirational techies', 'Old EU tech workers', 1), ('e5de3e25-53c2-4870-a1f1-e3c936624ac8', 'F', 46, 'ad.axa.h5.hw.wo', 'AB53 5', 'TURRIFF', 24, 'Active', '2', '04', '05', '1', '0', '07', '2', '07', '2', '1', '0', '0', 11800, 'J', '1', 0, 0, 0, 0, '[{"id":"DATING SHOWS","ranking":"1"},{"id":"FACTUAL ENTERTAINMENT","ranking":"2"}]', Decimal('2795.9600000000'), Decimal('227318.8970000000'), Decimal('0.0121'), 'AB53_5', 'AB53_5', 'Rural residents', 'Farming communities', 'Rural workers and families', 0), ('4edba9dd-5ddf-4854-8dfb-6f999e2a5e98', 'M', 41, 'ad.axa.hc.me', 'GL52 8', 'CHELTENHAM', 43, 'Active', '2', '07', '00', '1', '1', '04

In [33]:
# There are several things to note about the example above - especially try - except - finally
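
One thing worth seeing in isolation is when the finally block of a generator runs. The sketch below replaces the database with a plain list and records events in a log list; batches_with_cleanup and log are made-up names for this illustration.

```python
def batches_with_cleanup(items, batch_size, log):
    """Yields items in batches; log is a plain list that records what happens."""
    try:
        log.append("open")                   # stands in for opening a connection
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]
    except Exception as inst:
        log.append("error: " + str(inst))    # only errors raised inside the generator
    finally:
        log.append("close")                  # stands in for closing the connection


log = []
gen = batches_with_cleanup([1, 2, 3, 4, 5], 2, log)
print(log)          # []  nothing has run yet: the body starts on the first next()

print(next(gen))    # [1, 2]
print(log)          # ['open']  paused at yield, so finally has not run

gen.close()         # abandon the generator early
print(log)          # ['open', 'close']  finally still runs
```

So the clean-up happens when the generator is exhausted or closed, not after each batch.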

#### 1. Where can this be useful?

In [None]:
#big data sets, no need for locally saved csvs etc
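
As a rough, self-contained illustration of working through a table one batch at a time, the sketch below uses Python's built-in sqlite3 with an in-memory database as a stand-in for the real connection. The table viewers, its columns and the helper fetch_in_batches are all made up for this example.

```python
import sqlite3

def fetch_in_batches(cursor, batch_size):
    """Yield lists of rows until the cursor has no more rows."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows

# In-memory database with a hypothetical table
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE viewers (id INTEGER, age INTEGER)")
con.executemany("INSERT INTO viewers VALUES (?, ?)",
                [(i, 20 + i % 50) for i in range(1000)])

cur = con.execute("SELECT id, age FROM viewers")

row_count = 0
age_total = 0
for batch in fetch_in_batches(cur, 100):     # each batch is a list of at most 100 rows
    row_count += len(batch)
    age_total += sum(age for _, age in batch)

print(row_count)                 # 1000
print(age_total / row_count)     # 44.5, computed without building one big list of rows
con.close()
```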

#### 2. What is wrong with this function? 

In [None]:
#split up the function: connect, run query, close connection. Too much for one object, best practices.

#### Answer: it is doing too much. We need to separate what it is from what it does. 

#### We will learn how to do this next week. Next week - classes!

# Your Homework

Write a generator function to produce an arithmetic progression between any given two numbers X and Y by a given constant C.

For example, the following is an arithmetic progression between 10 and 22 by 2:

10, 12, 14, 16, 18, 20, 22

Handle the case when Y < X as an exception, either by using try - except or by checking if Y < X and raising an exception like this:

In [None]:
if Y< X:
    raise Exception("End must be greater than Start.")
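
One detail to keep in mind: a check placed inside a generator function does not run when the function is called, only when the first value is requested. The sketch below shows this with a deliberately different, made-up generator (checked_countdown), so the try - except has to wrap the point where values are consumed.

```python
def checked_countdown(start, end):
    """Hypothetical generator: counts down from start to end, rejects end > start."""
    if end > start:
        raise ValueError("End must not be greater than Start.")
    while start >= end:
        yield start
        start -= 1

gen = checked_countdown(1, 5)    # no exception yet: the body has not started running

try:
    first = next(gen)            # the check runs here, on the first next()
except ValueError as inst:
    first = None
    print("caught:", inst)       # caught: End must not be greater than Start.

print(list(checked_countdown(3, 1)))   # [3, 2, 1]
```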

In [14]:
def arithmetic_progression(X,Y,C):
    if Y < X:
        raise Exception("End must be greater than Start")
    while X <= Y: # include Y itself when the progression lands on it, as in the 10 to 22 example
        yield X
        X = X + C

In [15]:
print ([i for i in arithmetic_progression(10, 22, 2)])

[10, 12, 14, 16, 18, 20, 22]


In [13]:
print ([i for i in arithmetic_progression(10, 8, 2)])

Exception: End must be greater than Start

In [16]:
print ([i for i in arithmetic_progression(8, 10, 0.15)])

[8, 8.15, 8.3, 8.450000000000001, 8.600000000000001, 8.750000000000002, 8.900000000000002, 9.050000000000002, 9.200000000000003, 9.350000000000003, 9.500000000000004, 9.650000000000004, 9.800000000000004, 9.950000000000005]
