# Comprehension
*Comprehensions should ideally be purely functional (no side effects)*

`[ expr(item) for item in iterable ]` -> list comprehension

`{ expr(item) for item in iterable }` -> set comprehension

`{ key_expr: value_expr for item in iterable }` -> dict comprehension
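
As a rough illustration of the forms above, each comprehension can be read as shorthand for an explicit loop that builds the collection. The `numbers` list and the variable names below are made up for this sketch and are not part of the notebook.

```python
# Illustrative data (an assumption, not from the notebook).
numbers = [1, 2, 2, 3]

# List comprehension and the explicit loop it stands for.
squares = [n * n for n in numbers]
squares_loop = []
for n in numbers:
    squares_loop.append(n * n)

# Set comprehension: duplicate results collapse into one element.
unique_squares = {n * n for n in numbers}

# Dict comprehension: a repeated key keeps the last value assigned to it.
square_of = {n: n * n for n in numbers}

print(squares, unique_squares, square_of)
```

The expression on the left only computes a value from `item`; it does not modify anything outside the comprehension, which is what keeps it functional.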

In [1]:
words = "I am a sentence with some words randomly writed".split()
from pprint import pprint as pp

In [3]:
#list comprehension
[len(word) for word in words]

[1, 2, 1, 8, 4, 4, 5, 8, 6]

In [4]:
#set comprehension
{len(word) for word in words}

{1, 2, 4, 5, 6, 8}

In [14]:
#dict comprehension
capitals_of_country = {'United Kingdom': 'London',
                       'Brazil': 'Brasilia',
                       'Argentina': 'Buenos Aires'}

country_of_capitals = {capital:country for country, capital in capitals_of_country.items()}
country_of_capitals

{'Brasilia': 'Brazil', 'Buenos Aires': 'Argentina', 'London': 'United Kingdom'}

In [16]:
#dict comprehension
words = "Bruno de Almeida Silveira".split()

first_letters = {word:word[0] for word in words}
first_letters

{'Almeida': 'A', 'Bruno': 'B', 'Silveira': 'S', 'de': 'd'}

In [19]:
#Filtering predicates
from math import sqrt

def is_prime(x):
    if x < 2:
        return False
    for i in range(2, int(sqrt(x)) + 1):
        if x % i == 0:
            return False
    return True

print([x for x in range(1001) if is_prime(x)])

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997]


### Iterable protocol
Iterables can be passed to the built-in `iter()` function to get an iterator.

`iterator = iter(iterable)`

### Iterator protocol
Iterator objects can be passed to the built-in `next()` function to fetch the next item.

`item = next(iterator)`

In [22]:
iterable = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday']

iterator = iter(iterable)

print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))

monday
tuesday
wednesday
thursday
friday
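
Calling `next()` once more on the exhausted iterator would raise `StopIteration`. The sketch below shows roughly what a `for` loop does with the two protocols; `manual_for` is a hypothetical helper written for this example, not a built-in.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # iterable protocol
    while True:
        try:
            item = next(iterator)  # iterator protocol
        except StopIteration:      # raised when there are no more items
            break
        collected.append(item)
    return collected


days = ['monday', 'tuesday']
result = manual_for(days)

# next() also accepts a default that is returned instead of raising.
exhausted = iter([])
fallback = next(exhausted, 'no more items')

print(result, fallback)
```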


# Generators
*Generators describe iterable sequences. All generators are iterators, and they are lazily evaluated.*

Use the `yield` keyword to produce the next value of the iterator.

- Generators resume execution where they left off on each `next()` call
- Can maintain state in local variables
- Allow complex control flow
- Lazy evaluation

In [27]:
#basic generator
def gen123():
    print('about to yield 1')
    yield 1
    print('about to yield 2')
    yield 2
    print('about to yield 3')
    yield 3
    print('about to return')
    
g = gen123()
print(next(g))
print(next(g))
print(next(g))
print(next(g))

about to yield 1
1
about to yield 2
2
about to yield 3
3
about to return


StopIteration: 

In [41]:
#generator examples
def take(count, iterable):
    counter = 0
    for item in iterable:
        if count == counter:
            return
        counter += 1
        yield item

def run_take():
    items = [2,4,6,8,10]
    for item in take(3, items):
        print(item)

def distinct(iterable):
    seen = set()
    for item in iterable:
        if item in seen:
            continue
        yield item
        seen.add(item)

def run_distinct():
    items = [1, 3, 3, 8, 9, 9]
    for item in distinct(items):
        print(item)
        
def run_both():
    items = [13, 6, 6, 2, 1, 1]
    for item in take(3, distinct(items)):
        print(item)
        
        
if __name__ == "__main__":
    print("running - take(3)")
    run_take()
    
    print("running - distinct")
    run_distinct()
    
    print("running - both")
    run_both()

running - take(3)
2
4
6
running - distinct
1
3
8
9
running - both
13
6
2
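
To make the laziness of such a pipeline visible, the sketch below wraps the source in a hypothetical `logged()` generator that records every item actually pulled from it. `take()` is copied from the cell above; `logged` and `pulled` are names invented for this example.

```python
def take(count, iterable):
    """Yield at most `count` items (same logic as take() above)."""
    counter = 0
    for item in iterable:
        if count == counter:
            return
        counter += 1
        yield item


def logged(iterable, log):
    """Hypothetical helper: record each item as it is requested."""
    for item in iterable:
        log.append(item)
        yield item


pulled = []
first_two = list(take(2, logged([10, 20, 30, 40], pulled)))

print(first_two)  # [10, 20]
print(pulled)     # [10, 20, 30]
```

Only three items were requested from the source: `take()` fetches the third one before it notices the limit, and the fourth is never touched.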


In [46]:
#working with infinite and laziness
def infinite_fib():
    yield 1
    a = 1
    b = 1
    while True:
        yield b
        a, b = b, a + b

for x in infinite_fib():
    print(x)

1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
1346269
2178309
3524578
5702887
9227465
14930352
24157817
39088169
63245986
102334155
165580141
267914296
433494437
701408733
1134903170
1836311903
2971215073
4807526976
7778742049
12586269025
20365011074
32951280099
53316291173
86267571272
139583862445
225851433717
365435296162
591286729879
956722026041
1548008755920
2504730781961
4052739537881
6557470319842
10610209857723
17167680177565
27777890035288
44945570212853
72723460248141
117669030460994
190392490709135
308061521170129
498454011879264
806515533049393
1304969544928657
2111485077978050
3416454622906707
5527939700884757
8944394323791464
14472334024676221
23416728348467685
37889062373143906
61305790721611591
99194853094755497
160500643816367088
259695496911122585
420196140727489673
679891637638612258
1100087778366101931
1779979416004714189
2880067194370816120
4660046610375530309
7540113804746346429


6470687397118762014123938098178388035471287069871898102109347426987448359897922497536149419677118018220313169946454624931318886158400087760334728232334221350903966433278010549019155895518777074607707565247504662835235893463715107285099366012064134172914973704187744806619931444346998384328127980876574845720856011988766376364575973943295403165754943269489339960406900535010439634887820469877932794881352181448465550134055175798915021386073064231426870436073846119034054951425586016028063728222984021151702583967798907087723760989336353320620093323499117437224522349390442762101959825940372799655560673019183979221452081649406087393449471246270643832819517037338301544095176579654070010154510850636907687836549266569341199668212762048657854526851413709483550426708922007872273611652549051541540912267172980592127661372917742525127446286231243534160031905645819609498765753991581574447028802985217704475546247035224093753700726361452251838974790469655411391610318288280194110057994887754026011812581463

KeyboardInterrupt: 
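
The loop above never ends on its own and has to be interrupted. One way to consume only part of an infinite generator is `itertools.islice`, sketched below. The generator is restated in a slightly condensed form so the block runs on its own; it is meant to produce the same sequence as `infinite_fib()` above.

```python
from itertools import islice


def infinite_fib():
    """Yield Fibonacci numbers forever: 1, 1, 2, 3, 5, ..."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


# islice stops asking for values after 10 items, so this terminates.
first_ten = list(islice(infinite_fib(), 10))
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```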

# Generator Comprehensions
`( expr(item) for item in iterable )`

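Creates a generator object that is evaluated lazily.

Generators are single-use objects: once consumed, they yield nothing more. They are useful for saving memory because items are produced one at a time instead of being stored.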

In [47]:
million_squares = (x*x for x in range(1, 1000001))
million_squares

<generator object <genexpr> at 0x7f93550f2570>

In [48]:
sum(x*x for x in range(1, 10000001))

333333383333335000000

In [51]:
sum(x for x in range(1001) if is_prime(x))

76127
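
A small sketch of the two properties mentioned for generator comprehensions, single use and low memory. The variable names are illustrative, and the exact sizes reported by `sys.getsizeof` depend on the Python build, so only the comparison is meaningful.

```python
import sys

squares = (x * x for x in range(5))
first_pass = list(squares)   # consumes the generator
second_pass = list(squares)  # nothing left to yield

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# The list stores every item; the generator only stores its own state.
list_size = sys.getsizeof([x * x for x in range(100000)])
gen_size = sys.getsizeof(x * x for x in range(100000))
print(gen_size < list_size)  # True
```

To iterate a second time, create a new generator object by evaluating the expression again.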

## Batteries for iterations - ITERTOOLS

In [62]:
from itertools import islice, count

all_primes = (x for x in count() if is_prime(x))

hundred_primes = islice(all_primes, 100)
print(list(hundred_primes))

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541]


## Iteration tools
#### built-in
- sum()
- any()
- zip()
- all()
- min()
- max()
- enumerate()
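
A short sketch of how some of these built-ins combine with generator expressions. The `cities` and `temperatures` data is invented for illustration.

```python
# Illustrative data (an assumption, not from the notebook).
cities = ['Oslo', 'Paris', 'Rome', 'Reykjavik']
temperatures = [12, 18, 25, 9]

any_hot = any(t > 20 for t in temperatures)       # at least one above 20
all_positive = all(t > 0 for t in temperatures)   # every one above 0

pairs = list(zip(cities, temperatures))           # pair items positionally
numbered = list(enumerate(cities, start=1))       # attach a running index

# Tuples compare element by element, so this picks the highest temperature.
warmest = max(zip(temperatures, cities))

print(any_hot, all_positive, warmest, sum(temperatures), min(temperatures))
```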

#### standard library `itertools` module
- chain()
- islice()
- count()
- see docs =]
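
A minimal sketch of the three `itertools` functions listed above. The sample lists are invented for this example.

```python
from itertools import chain, count, islice

weekdays = ['mon', 'tue']
weekend = ['sat', 'sun']

# chain() iterates over several iterables as if they were one.
all_days = list(chain(weekdays, weekend))

# count(start, step) is infinite, so it is bounded here with islice().
evens = list(islice(count(0, 2), 5))

# islice(iterable, start, stop) works like slicing, but lazily.
middle = list(islice(range(10), 2, 5))

print(all_days)  # ['mon', 'tue', 'sat', 'sun']
print(evens)     # [0, 2, 4, 6, 8]
print(middle)    # [2, 3, 4]
```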