# Generators
## Simple generator example

A generator is a function that returns a generator iterator.

A generator uses ```yield```, which temporarily suspends execution and remembers the location and execution state (including local variables and pending try-statements). Execution resumes from that point when the generator iterator is passed to ```next```.
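As a minimal sketch of this suspend-and-resume behaviour, the hypothetical `running_total` generator below (a name invented for illustration) keeps its local variable `total` alive between calls to `next`.

```python
def running_total(numbers):
    # hypothetical helper for illustration: yields the cumulative sum so far
    total = 0
    for n in numbers:
        total += n
        yield total  # execution pauses here; `total` and the loop position are kept

g = running_total([10, 20, 30])
first = next(g)   # runs the body until the first yield
second = next(g)  # resumes right after the yield, with `total` still 10
third = next(g)

print(first, second, third)
```

Each call to `next` continues from the previous `yield` rather than restarting the function body.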

In [1]:
def simpleGeneratorFun(): 
    yield 1
    yield 2
    yield 3

In [2]:
g = simpleGeneratorFun()

In [3]:
g

<generator object simpleGeneratorFun at 0x1023d5c78>

In [4]:
next(g)

1

In [5]:
next(g)

2

In [6]:
next(g)

3

In [7]:
next(g)

StopIteration: 

Generator iterators can be used in a ```for``` loop

In [None]:
g = simpleGeneratorFun()
for x in g:
    print(x)

But it can be consumed only once

In [None]:
for x in g:
    print(x)

Generator iterators can be converted into a ```list```

In [None]:
g = simpleGeneratorFun()
list(g)

Once consumed, it yields nothing more

In [None]:
list(g)

The generator above is equivalent to the class below

In [None]:
class EquivalentSimpleGenerator:
    def __init__(self):
        self.i = 1
        
    def __next__(self): # support next()
        if self.i <= 3:
            result = self.i
            self.i += 1
            return result
        else:
            raise StopIteration
            
    def __iter__(self): # support for loops and list()
        return self

In [None]:
g = EquivalentSimpleGenerator()
for x in g:
    print(x)

In [None]:
g = EquivalentSimpleGenerator()
list(g)

In [None]:
list(g)

In [None]:
g = EquivalentSimpleGenerator()
g

In [None]:
next(g)

In [None]:
next(g)

In [None]:
next(g)

In [None]:
next(g)
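The ```for``` loop and ```list``` rely only on ```__iter__``` and ```__next__```. A rough sketch of what a ```for``` loop does with them is shown here; `manual_for` and `one_two_three` are hypothetical names used only for this illustration.

```python
def manual_for(iterable):
    """Collect items roughly the way a for loop would, using only iter() and next()."""
    iterator = iter(iterable)      # calls iterable.__iter__()
    items = []
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:      # signals that the iterator is exhausted
            break
        items.append(item)
    return items

def one_two_three():
    yield 1
    yield 2
    yield 3

g = one_two_three()
collected = manual_for(g)
collected_again = manual_for(g)  # g is already exhausted
is_own_iterator = iter(g) is g   # an iterator returns itself from __iter__

print(collected, collected_again, is_own_iterator)
```

The second pass collects nothing because the same exhausted iterator is reused, which matches the consume-once behaviour seen earlier.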

In [None]:
g = simpleGeneratorFun()
for x in g:
    print(x)

In [None]:
g = simpleGeneratorFun()
list(g)

A ```generator``` is a shortcut for defining and creating a generator iterator object. Don't confuse calling it with a normal function call: calling a generator returns a generator iterator without running the function body.
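The difference from a normal function call can be sketched as follows. The names `normal_function`, `generator_function` and `calls` are hypothetical and exist only to record when each body actually runs.

```python
import inspect

calls = []

def normal_function():
    calls.append("normal")     # runs as soon as the function is called
    return 1

def generator_function():
    calls.append("generator")  # runs only once iteration starts
    yield 1

result = normal_function()
gen = generator_function()         # creates the generator iterator; the body has not run yet
calls_after_calling = list(calls)  # snapshot before any next()

first = next(gen)                  # now the body runs up to the first yield

print(calls_after_calling, calls)
print(inspect.isgeneratorfunction(generator_function), inspect.isgenerator(gen))
```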

## Another example

In [None]:
def generator_generator_function():
    i = 0
    while True:
        if i < 10:
            yield i
            i += 1
        else:
            break

In [None]:
g = generator_generator_function()

In [None]:
for x in g:
    print(x)

In [None]:
class EquvalentGenerator:
    def __init__(self):
        self.i = 0
        
    def __next__(self):
        if self.i < 10:
            result = self.i
            self.i += 1
            return result
        else:
            raise StopIteration
                
    def __iter__(self):
        return self
        

In [None]:
g = EquvalentGenerator()
list(g)

In [None]:
g = EquvalentGenerator()
for x in g:
    print(x)

## Generator comprehension

List comprehension: builds the whole list in memory, so it can be iterated any number of times

In [None]:
iterator = [x for x in range(4)]
list(iterator)

In [None]:
list(iterator)

Generator comprehension (generator expression): produces items lazily and can be consumed only once

In [None]:
iterator = (x for x in range(4))
iterator

In [None]:
list(iterator)

In [None]:
list(iterator)

## Why use generator iterators?

Generator iterators are lazy and thus produce items one at a time and only when asked. 
So they are much more memory efficient when dealing with large datasets.

But generator iterators can be consumed only once.
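A rough way to see the memory difference is to compare the size of a list with the size of a generator expression over the same range. This is only a sketch: `sys.getsizeof` reports the size of the container object itself, not of the items it refers to, and the exact numbers depend on the Python version.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]   # all items are built up front
squares_gen = (x * x for x in range(n))    # items are produced on demand

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small and independent of n

total_from_gen = sum(squares_gen)          # consumes the generator
total_again = sum(squares_gen)             # already exhausted, so nothing is left to add

print(list_size, gen_size)
print(total_from_gen, total_again)
```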

# itertools

Most itertools functions return iterators that, like generator iterators, are lazy and can be consumed only once

## Why use itertools?

* Using itertools can save a lot of for-loop code
* It forces you to separate the iteration pattern from the computation
* It makes intent more explicit
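As an illustration of these points, the sketch below computes the squares of the first three multiples of 7 twice: once with a hand-written loop and once with itertools. The task and the helper `is_multiple_of_7` are made up for this example.

```python
from itertools import count, islice

def is_multiple_of_7(n):
    return n % 7 == 0

# Plain loop: iteration control and computation are interleaved
loop_result = []
n = 1
while len(loop_result) < 3:
    if is_multiple_of_7(n):
        loop_result.append(n * n)
    n += 1

# itertools: each step of the iteration pattern is named explicitly
multiples = filter(is_multiple_of_7, count(1))  # lazy, infinite stream of multiples
squares = map(lambda m: m * m, multiples)       # the computation
itertools_result = list(islice(squares, 3))     # take only the first three

print(loop_result, itertools_result)
```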

In [None]:
from itertools import count, cycle, repeat, accumulate, chain, compress, dropwhile, filterfalse, groupby, islice, starmap, takewhile, tee, zip_longest, product, permutations, combinations, combinations_with_replacement

## Infinite iterators

### count

In [None]:
for i, x in enumerate(count(10)):
    print(x)
    if i==5:
        break

### cycle

In [None]:
for i, x in enumerate(cycle([1,2,3])):
    print(x)
    if i == 5:
        break

### repeat

In [None]:
for i, x in enumerate(repeat(1, 10)):
    print(x)

## Functions on iterables

### chain
Link several iterables together into one

In [None]:
for i , x in enumerate(chain([1,2,3], [11,22,33], [44, 55])):
    print(x)

### chain.from_iterable

In [None]:
for i , x in enumerate(chain.from_iterable([[1,2,3], [11,22,33], [44, 55]])):
    print(x)

### compress
Select items using a parallel iterable of indicators (selectors)

In [None]:
list(compress(['A', 'B', 'C'], [0, 1, 1]))

### filterfalse

In [None]:
def is_uppercase(s):
    return s.upper() == s

In [None]:
list(filterfalse(is_uppercase, ['A', 'B', 'c', 'C', 'D', 'e']))

Compare with ```filter```

In [None]:
list(filter(is_uppercase, ['A', 'B', 'c', 'C', 'D', 'e']))

### takewhile
Take items while the condition is met, and "break" at the first item for which it fails.

In [None]:
list(takewhile(is_uppercase, ['A', 'B', 'c', 'C', 'D']))

### dropwhile
Drop items while the condition is met. Once the condition fails for the first time, it is no longer checked and all remaining items are returned.

In [None]:
list(dropwhile(is_uppercase, ['A', 'B', 'c', 'd', 'C', 'D']))
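The two functions are complementary: applied with the same condition, ```takewhile``` returns the leading run and ```dropwhile``` returns everything after it. A small sketch, which redefines `is_uppercase` so that it is self-contained:

```python
from itertools import dropwhile, takewhile

def is_uppercase(s):
    return s.upper() == s

letters = ['A', 'B', 'c', 'd', 'C', 'D']

head = list(takewhile(is_uppercase, letters))  # stops at the first lowercase item
tail = list(dropwhile(is_uppercase, letters))  # starts at the first lowercase item

print(head)
print(tail)
```

Note that the later uppercase items `'C'` and `'D'` end up in `tail`, because the condition is no longer checked after its first failure.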

### groupby

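Note: this is not a global groupby; it only groups consecutive items that share the same key. If you need a global groupby, sort the data by the key first.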

In [None]:
data = [
    'apple',
    'bed',
    'apart',
    'bird'
]
# materialize each group so that its members are visible
[(key, list(group)) for key, group in groupby(data, key=lambda s: s[0])]

In [None]:
data = [
    'apple',
    'apart',
    'bird',
    'bed',
    'birth'
]
# materialize each group so that its members are visible
[(key, list(group)) for key, group in groupby(data, key=lambda s: s[0])]
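Because ```groupby``` only merges consecutive items with equal keys, unsorted input yields the same key more than once. The sketch below contrasts the two cases; `first_letter` and `words` are hypothetical names for this example.

```python
from itertools import groupby

words = ['apple', 'bed', 'apart', 'bird']

def first_letter(word):
    return word[0]

# Without sorting: 'a' and 'b' each appear as two separate groups
unsorted_groups = [(key, list(group)) for key, group in groupby(words, key=first_letter)]

# Sorting by the same key first gives one group per key
sorted_groups = [
    (key, list(group))
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
]

print(unsorted_groups)
print(sorted_groups)
```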

### islice

In [None]:
data = [
    'apple',
    'apart',
    'bird',
    'bed',
    'birth'
]
list(islice(['a', 'd', 'e', 'f', 'g'], 2, None))

### map and starmap

In [None]:
import operator as op

In [None]:
data = [
    (1, 2),
    (3, 4),
    (5, 6)
]

list(starmap(op.mul, data))

In [None]:
data = [1,2,3]
list(map(lambda x:x**2, data))

### zip_longest

In [None]:
list(zip_longest([1,2,3], [2,3], fillvalue=0))

Compare with ```zip```, which stops at the shortest iterable

In [None]:
list(zip([1,2,3], [2,3]))

### accumulate

In [None]:
list(
    accumulate([1,2,3,4,5], lambda x, y: x + y)
)

In [None]:
list(
    accumulate([1,2,3,4,5], lambda x, y: x * y)
)

## tee

```tee``` splits one iterable into several independent iterators (an optimized way of "copying" it)

In [None]:
a = [1,2,3,4,5,6,7,8]
b, c = tee(a, 2)

In [None]:
list(b)

In [None]:
list(b) # b is consumed

In [None]:
list(c)
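One common use of ```tee``` is to walk the same data at two offsets. The sketch below follows the well-known `pairwise` recipe; the function is written out here only for illustration.

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)  # two independent iterators over the same data
    next(b, None)         # advance the second iterator by one item
    return zip(a, b)

pairs = list(pairwise([1, 2, 3, 4]))
print(pairs)
```

After calling ```tee```, the original iterator should not be advanced directly, since doing so would cause the copies to miss items.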

## reduce
Although it is imported from ```functools``` rather than ```itertools```, ```reduce``` is often used together with itertools

In [None]:
from functools import reduce

In [None]:
reduce(lambda x, y: x + y, [1, 2, 3, 4])

Compare with ```accumulate```, which also yields every intermediate result

In [None]:
list(accumulate([1, 2, 3, 4], lambda x, y: x + y))
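The relationship between the two can be sketched as follows: ```reduce``` returns only the final value, which equals the last item produced by ```accumulate``` with the same function. This example uses `operator.add` in place of the lambda.

```python
from functools import reduce
from itertools import accumulate
import operator

numbers = [1, 2, 3, 4]

running = list(accumulate(numbers, operator.add))  # every intermediate sum
final = reduce(operator.add, numbers)              # only the final sum

print(running, final)
```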

A memory-efficient and concise way to find the person with the largest age

In [None]:
class Person:
    def __init__(self, age):
        self.age = age

In [None]:
people = (Person(age) for age in [32, 34, 29, 27, 31, 37, 18, 29])

In [None]:
def get_elder(p1, p2):
    return p1 if p1.age >= p2.age else p2

p_eldest = reduce(get_elder, people)
p_eldest.age

## Pipeline style

In [None]:
from functools import reduce

In [None]:
x = [1, 2, 3, 4]

squared = map(lambda x:x**2, x)

filtered = filter(lambda x:x > 4, squared)

reduce(lambda x, y: x + y, filtered)

For pipeline style, consider PyFunctional

In [None]:
from functional import seq

(
    seq(1, 2, 3, 4)
    .map(lambda x: x ** 2)
    .filter(lambda x: x > 4)
    .reduce(lambda x, y: x + y)
)

See the PyFunctional documentation for more

## Combinatoric iterators

In [None]:
list(product(['a', 'b', 'c'], [1,2,3]))

In [None]:
list(permutations(['a', 'b', 'c'], 2))

In [None]:
list(combinations(['a', 'b', 'c'], 2))

In [None]:
list(combinations_with_replacement(['a', 'b', 'c'], 2))
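As a sanity check on what each combinatoric iterator produces, the sketch below counts the results for the same inputs and compares them with the usual counting formulas. It assumes Python 3.8 or later for `math.comb` and `math.perm`.

```python
from itertools import combinations, combinations_with_replacement, permutations, product
from math import comb, perm  # available in Python 3.8+

letters = ['a', 'b', 'c']
n, r = len(letters), 2

n_product = len(list(product(letters, [1, 2, 3])))    # every letter paired with every number
n_permutations = len(list(permutations(letters, r)))  # ordered, no repeated items
n_combinations = len(list(combinations(letters, r)))  # unordered, no repeated items
n_with_replacement = len(list(combinations_with_replacement(letters, r)))  # unordered, repeats allowed

formulas_match = (
    n_product == 3 * 3
    and n_permutations == perm(n, r)
    and n_combinations == comb(n, r)
    and n_with_replacement == comb(n + r - 1, r)
)

print(n_product, n_permutations, n_combinations, n_with_replacement, formulas_match)
```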