In [None]:
import os
array_file = iter(open(os.path.join('.','attached','test_array.dat'), 'r'))

# List Comprehensions and Generators
Download this notebook from Canvas under Files > Python > MarcusCode > Comps_and_generators.ipynb<br/>
or clone my repo at https://github.com/betteridiot/b575f18

# List Comprehensions

As we *all* know, mathematicians and statisticians like to make simple things complicated. Instead of just saying `1+2=3`, we first have to prove that `+` is in fact addition. Then we have to ensure that our assumptions only apply to the select cases we expect our "algorithm" to be used. See, I am boring myself already.

### However, something useful came out of this banal minutiae--set notation:

This is something like:<br/>
$V=(1,2,\ \ldots\ ,n)$<br/> 
$S= \{x^2:\ x \in \mathbb{R}\}$

While it looks like absolute garbage when you are about to take an exam...granted, I usually forget that `x` is a letter in the alphabet during exams, it has its uses.

In Python, users tend to use ***'for loops'*** to iterate through an entire sample space. For example:

In [None]:
S = []
for x in range(1000):
    S.append(x**2)
print(S[:10])

# There's got to be a better way!

There is, Kevin! This same step can be compressed to a single line in Python. The example below is called a ***list comprehension***:

In [None]:
S = [x**2 for x in range(1000)]
print(S[:10])

# Syntax of a *list* comprehension

<center><img src="http://python-3-patterns-idioms-test.readthedocs.io/en/latest/_images/listComprehensions.gif"/></center>

# Not just math

Remember your homework? You had to process each line of an array one at a time, `strip()`, `split()`, and convert the object type. There were ultimately 2 ways to accomplish this:

In [None]:
# First way: functionally (or standard for loop)


In [None]:
# Second way: list comprehension


See how **easy** and **short** that is visually?
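As a minimal sketch of the two approaches, assume the rows are tab-delimited numbers. The `lines` list below is a made-up stand-in for the file's contents, not the actual homework data.

```python
# Hypothetical stand-in for lines read from a tab-delimited file
lines = ["1.0\t2.0\t3.0\n", "4.0\t5.0\t6.0\n"]

# First way: standard for loops
rows_loop = []
for line in lines:
    row = []
    for field in line.strip().split("\t"):  # drop the newline, split on tabs
        row.append(float(field))            # convert each field to a float
    rows_loop.append(row)

# Second way: a (nested) list comprehension doing the same work
rows_comp = [[float(field) for field in line.strip().split("\t")] for line in lines]

print(rows_loop == rows_comp)  # True
```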

List comprehensions are one of the foundations that Python coders use when they want to do something simple that requires a for loop. However, list comprehensions follow most of the same rules that apply to regular for loops, which makes them capable of much more.

In [None]:
# Codons


# But there is more!

You can add predicates (or conditionals) to the for loop inside a comprehension to handle simple cases:
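A small sketch of what a predicate can look like. The numbers and labels here are arbitrary illustrations: a trailing `if` filters items out, while a conditional expression placed before the `for` transforms every item.

```python
# A trailing `if` keeps only the items that pass the test
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]

# A conditional expression before `for` changes what is stored, but keeps every item
labels = ["even" if x % 2 == 0 else "odd" for x in range(4)]
print(labels)  # ['even', 'odd', 'even', 'odd']
```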

In [None]:
# predicates


However, the sky is the limit when it comes to complexity. Ever hear of the 'Fizz Buzz' interview question?<br/>
For those that haven't, the 'Fizz Buzz' question requires a programmer to do the following:<br/>
1. return 'Fizz' if the number is divisible by some number (usually 3)
2. return 'Buzz' if it is divisible by another number (usually 5)
3. return 'FizzBuzz' if divisible by both
4. return just the number if none of the above
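One possible one-line version of the rules above, assuming the usual divisors 3 and 5 and the range 1 through 15 (both are choices made for this sketch). Divisibility by both is checked first, since otherwise the 'Fizz' branch would catch those numbers.

```python
# Chained conditional expressions, evaluated left to right
fizzbuzz = ["FizzBuzz" if n % 15 == 0 else "Fizz" if n % 3 == 0 else "Buzz" if n % 5 == 0 else n
            for n in range(1, 16)]
print(fizzbuzz)
```

It works, but notice how much harder it is to read than the four rules it implements.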

In [None]:
# One-line 'FizzBuzz'


There are some trade-offs though when you start making your list comprehensions more complex:<br/>
1. Readability
2. Understandability
3. Conventionally, complex comprehensions are frowned upon because they go against the 'Zen of Python'

In [None]:
# What's the 'Zen of Python' you ask?
import this

# Does more than just lists too

In [2]:
from string import ascii_lowercase as lower

In [None]:
# Dictionary Comprehensions


In [None]:
# Set comprehensions


Let's take a second and look at the above example. What has happened here?
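One way to see it, as a sketch: the same comprehension syntax with curly braces builds dictionaries and sets. The `lower` alias is the one imported above; the word used for the set is an arbitrary choice.

```python
from string import ascii_lowercase as lower

# Dictionary comprehension: map each letter to its position in the alphabet
positions = {letter: index for index, letter in enumerate(lower, start=1)}
print(positions["a"], positions["z"])  # 1 26

# Set comprehension: duplicates collapse and no order is guaranteed
letters_used = {char for char in "mississippi"}
print(sorted(letters_used))  # ['i', 'm', 'p', 's']
```

Eleven characters went into the set, but only four distinct letters came out.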

# Sometimes, doing it all at once isn't the best approach

# Workshop time!

In [None]:
# Team Classic
def stack_coins_classic(n):
    stack = True # a stack of 4 coins
    chocolates = []
    for i in range(n):
        coins = 0       
        for j in range(4):
            coins += 1
        chocolates.append(stack)
    print('All chocolates ready')
    return chocolates

def make_envelopes_classic(chocolates):
    # for each stack of coins, fill one envelope
    # (the original referenced `stack`, which is not defined in this function)
    envelopes = [stack for stack in chocolates]
    print('All envelopes full')
    return envelopes

def hand_out_classic(envelopes):
    # for each envelope made, hand them out
    for env in envelopes:
        pass
    print('Deliveries complete')

def new_year_classic(n):
    chocolates = stack_coins_classic(n)
    envelopes = make_envelopes_classic(chocolates)
    deliveries = hand_out_classic(envelopes)
    print('Delivery Complete')

In [None]:
# Team Generator
def stack_coins_gen(n):
    stack = True
    for i in range(n):
        coins = 0     
        for j in range(4):
            coins += 1
        print(f'Number {i+1} stack ready')
        yield stack

def make_envelopes_gen(chocolates):
    # Pull one stack at a time from upstream and pass it along in an envelope
    for stack in chocolates:
        print('Envelope full')
        yield stack

def hand_out_gen(envelopes):
    # Pull one envelope at a time from upstream and deliver it
    for envelope in envelopes:
        print('Out for delivery')
        yield envelope

def new_year_gen(n):
    coins = stack_coins_gen(n)
    envelopes = make_envelopes_gen(coins)
    deliveries = hand_out_gen(envelopes)
    # Asking for each delivery pulls one stack through the whole pipeline
    for delivery in deliveries:
        pass
    return 'Delivery Complete'

In [None]:
def vs():
    return print()

In [None]:
%time new_year_classic(5)
vs()

%time new_year_gen(5)

# Generators

<center><img src="http://nvie.com/img/relationships.png"/></center>

At the core of `generators` and `iterators` lies the concept of **lazy (or on-demand) evaluation**. This means that it won't evaluate the *`next`* expression until you ask it to.
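A minimal sketch of lazy evaluation. The `noisy_squares` helper and the `computed` list are made up for this illustration; the list records which values have actually been worked on.

```python
computed = []  # records every value the generator has actually processed

def noisy_squares(n):
    # Hypothetical helper: note each value as it is computed, then yield its square
    for x in range(n):
        computed.append(x)
        yield x ** 2

gen = noisy_squares(3)
print(computed)   # [] : creating the generator ran none of its body
print(next(gen))  # 0
print(computed)   # [0] : only the first value has been computed
```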

In [4]:
from IPython.display import HTML

#HTML('<center><img src=https://media.giphy.com/media/fd61KYrYcyoQo/giphy.gif></center>')
HTML('<center><img src=https://media.giphy.com/media/MB7K6KdwWfN7y/giphy.gif></center>')

## Lists vs iterators

In [None]:
# iterate through list


Now that I have gone through all the elements of the list, what happens if I try to go back and see something?

In [None]:
# Print out a slice

You have already seen and used *iterables* before. An example of one is seen just above: `range`

In [None]:
# Iterate through range


But the key method associated with `iterators` is `next()`. Watch what happens when we call `next()` on a range.

In [None]:
a_range = range(10)
next(a_range)

Now what happens when I try to look back at `a_range`?

In [None]:
# Print slice of range


So, `range` is iterable, but not an iterator. It also doesn't store state. Can we hack it though?

In [None]:
# iter() turns any iterable (an object with an __iter__ method) into an iterator (an object with a __next__ method)
b_range = iter(range(10))
for i in range(5):
    print(next(b_range))

`iterators` are *consumed* when they have been processed. They cannot go backwards, and they preserve no state after completion. However, think of this more as a feature than a fault.
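A short sketch of what being consumed means in practice, using an arbitrary three-item list: the iterator can be walked only once, while the list it came from is untouched.

```python
numbers = [1, 2, 3]
it = iter(numbers)

first_pass = list(it)   # walks the iterator to the end
second_pass = list(it)  # nothing is left to hand out

print(first_pass)    # [1, 2, 3]
print(second_pass)   # []
print(numbers[:2])   # [1, 2] : the original list can still be sliced
```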

## What are some use cases for ***not*** wanting to keep all of your data?

In [None]:
# Classic prime number example of an iterator
def check_prime(number):
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True

In [None]:
class Primes:
    def __init__(self, max_number):
        self.max = max_number
        self.number = 1

    def __iter__(self):
        return self

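    def __next__(self):
        # Loop (rather than recurse) until the next prime is found, so a long
        # gap between primes cannot run into Python's recursion limit
        while True:
            self.number += 1
            if self.number >= self.max:
                raise StopIteration
            if check_prime(self.number):
                return self.number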

In [None]:
# Instantiate the iterator
primes = Primes(100000000000)

In [None]:
# primes is an iterator, not a list of values...it waits to compute the next value
print(primes)

In [None]:
# Compute the next value without storing it
i = 0
for x in primes:
    if i > 100:
        break
    print(x)
    i += 1

The defining traits of `generators` are that they produce a sequence of results instead of a single value and that they preserve ***state***.<br/>
This means that every time the `generator` gives control back to the user, it keeps its local variables and remembers where it paused.
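A minimal sketch of preserved state. The `running_total` generator is a made-up example: its local variable `total` survives between calls to `next()`.

```python
def running_total(values):
    total = 0  # local state that persists across yields
    for value in values:
        total += value
        yield total

totals = running_total([5, 10, 20])
print(next(totals))  # 5
print(next(totals))  # 15
print(next(totals))  # 35
```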

In [None]:
# Plain file reader
def opener(file_path, obj_type, r_or_w = 'r', delimiter = '\t'):
    with open(file_path, r_or_w) as infile:
        for line in infile:
            row = [obj_type(i) for i in line.strip().split(delimiter)]
            yield row

You may have noticed a keyword we have not used before: `yield`<br/>
`yield` is only used in `generators` and has a specific purpose:<br/>
it lets the function temporarily hand control (and a value) back to the caller.

In [None]:
array = opener('./attached/test_array.dat', float)
next(array)  # Runs the generator up to its first yield and returns the first row

In [None]:
# Let's start working
print(next(array))

The first call to `next()` starts the generator's body and runs it up to the first `yield`, which hands back the first row. <br/>
After that, each `next()` resumes the generator where it paused and runs it to the next `yield`.

In [None]:
# Multiple inline yields
def printer():
    i = 0
    yield 'Starting'
    for i in range(3):
        yield f'{i}'
    yield 'Complete'

In [None]:
example = printer()
print(next(example))

In [None]:
for i in range(3):
    print(next(example))

In [None]:
print(next(example))

# Coroutines: generators that can also receive values
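A minimal sketch of the idea, using a made-up `averager` coroutine: when `yield` is used as an expression, it evaluates to whatever the caller passes in with `send()`. The first `next()` call is needed to advance the coroutine to that first `yield` so it is ready to receive.

```python
def averager():
    # Hypothetical example: keeps a running average of the values sent in
    total, count = 0.0, 0
    average = None
    while True:
        value = yield average  # pause here; send() supplies `value`
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)            # prime the coroutine: run to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
avg.close()          # shut the coroutine down when finished
```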

In [None]:
# Pattern matcher that waits for entries passed in with send()
def grep(pattern):
    print(f'Looking for {pattern}')
    while True:
        entry = yield
        if pattern in entry:
            print(f"{pattern} found. You're awesome, yay and stuff...")
        else:
            print(f'"{entry}" is not "{pattern}" is it? Try again')

In [None]:
finder = grep('gen')
next(finder)

In [None]:
finder.send("generators are cool")

In [None]:
# Close the coroutine once we are done sending (sending after close() raises StopIteration)
finder.close()

# Generator expressions

To finish off the lesson, let's combine what we learned about list comprehensions and generators to make a generator expression.

In [None]:
%%time
# standard list appending
limit = 1e14
i = 0
make_squares = []
while True:
    entry = i ** 2
    if entry > limit:
        break
    else:
        make_squares.append(entry)
    i += 1

In [None]:
%%time
# list comprehension
squares = [x ** 2 for x in range(len(make_squares))]

In [None]:
%%time
# Generator expression
sq = (x**2 for x in range(len(make_squares)))

### Would you look at that?!? Wow, that is fast...wait, I tricked you. Generators are lazy, and so are generator expressions
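A rough sketch of what that laziness buys, with an arbitrary size of 100,000: the list holds every value in memory at once, while the generator expression is a small object that computes values only when asked. Exact byte counts vary by Python version and platform, so only the comparison is shown.

```python
import sys

squares_list = [x ** 2 for x in range(100_000)]  # every value computed and stored now
squares_gen = (x ** 2 for x in range(100_000))   # nothing computed yet

# The generator object is far smaller than the fully built list
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# The work happens only when something consumes the generator
print(sum(squares_gen) == sum(squares_list))  # True
```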

In [None]:
# Make an expression infinitely loopable
def fake_while_loop():
    i = 0
    while True:
        yield i
        i += 1

# set up my fake while loop
whl = fake_while_loop()

# Now, yield every square one at a time.
sqs = (x**2 for x in whl)

In [None]:
# Let's use it
sqs

In [None]:
%%time
# Okay, let's try that again, this time actually asking for as many squares as the list versions built
for i in range(len(make_squares)):
    next(sqs)

## Wait, the iterable `range` can be consumed by `list` and `sum`, and generators are iterables too. Can you consume a generator?

In [None]:
print(sum(x**2 for x in range(100)))

S = list(x**2 for x in range(1000))
print(S[:25])

If the last example looks kind of familiar, that is because it is the first example we looked at for list comprehensions. By using a generator expression that is consumed by `list`, we made our own list comprehension.

# Conclusions

1. List comprehensions can compress simple for loops down to a single line
2. List comprehensions tend to be more efficient than standard for loops when the data is sufficiently large
3. The same syntax used to make a list comprehension can be used to make dictionaries, sets, and generator expressions
4. Generators are iterators that lazily evaluate the next value and `yield` it back
5. Generators can receive values (through `send()`) as well as `yield` them
6. A generator (or any iterator) is consumed once it completes and cannot be reused
7. Custom coroutines have to be 'primed' once before use

# Advanced Coroutines

**Disclaimer**: I briefly discuss the decorator seen next. However, there is a lot more to decorators than the scope of this lecture can cover. If you do not understand the next part, don't worry: we are not expecting you to know them yet. We are using them here for presentation purposes only.

In [None]:
# Make a decorator that automatically primes a coroutine
def coroutine(func):
    def start(*args,**kwargs):
        cr = func(*args,**kwargs)
        next(cr)
        return cr
    return start

In [None]:
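# Chain coroutines: parse each line, average the row, print the result
@coroutine
def formatter(target):
    while True:
        entry = yield
        # Parse the value that was sent in, rather than relying on the global `line`
        target.send([float(i) for i in entry.strip().split('\t')])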
    
@coroutine
def row_mean(target):
    while True:
        row = yield
        target.send(sum(row)/len(row))

@coroutine
def cool_printer():
    while True:
        content = yield
        print(f'The average for the row is {content}')

cp = cool_printer()
avg = row_mean(cp)
f = formatter(avg)        
with open('./attached/test_array.dat', 'r') as infile:
    for line in infile:
        f.send(line)