# generator functions
A list of a billion numbers takes up a lot of memory. If you only want the elements one at a time, <br>
there’s no good reason to keep them all around. If you only end up needing the first several elements, <br>
generating the entire billion is hugely wasteful. Often all we need is to iterate over the collection using <br>
for and in. In this case we can create generators, which can be iterated over just like lists but<br>
generate their values lazily on demand.<br>
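
As a small sketch of that laziness, the hypothetical `naturals` helper below (a name invented for this illustration) is asked for a billion values, but `itertools.islice` only ever pulls the first five out of it:

```python
from itertools import islice

def naturals(limit):
    """Yield 0, 1, 2, ... up to (but not including) limit, one value at a time."""
    n = 0
    while n < limit:
        yield n
        n += 1

# A billion values are requested, but only the first five are ever computed.
first_five = list(islice(naturals(1_000_000_000), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```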

The flip side of laziness is that you can only iterate through a generator once. If you need to iterate <br>
through something multiple times, you’ll need to either re-create the generator each time or use a list. <br>
If generating the values is expensive, that might be a good reason to use a list instead.<br>
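
The one-pass behaviour can be seen in a few lines. This is only an illustration; `make_squares` is a made-up factory name:

```python
squares = (n * n for n in range(4))

first_pass = list(squares)   # consumes the generator
second_pass = list(squares)  # nothing is left to yield

print(first_pass)   # [0, 1, 4, 9]
print(second_pass)  # []

# Option 1: re-create the generator each time with a small factory function.
def make_squares():
    return (n * n for n in range(4))

# Option 2: materialize the values once and reuse the list.
cached = list(make_squares())
```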

GENERATOR FUNCTIONS
- They help keep code clean: the iteration logic lives in one function instead of being repeated in for-loops.
- A generator function returns a lazy iterator, an object that produces a sequence of values one at a time.
- You can loop over a lazy iterator like a list, but unlike a list it does not store its contents in memory.
- A function whose body contains __yield__ is a generator function; calling it gives you a generator object.
- With __return__ instead, the function hands back a single value and is finished.
- Calling a generator function creates a generator object. However, it does not start running the function.
- The function body only starts executing on the first next() call.
- The difference between yield and return is that yield returns a value and pauses the execution while maintaining the internal state, <br>
whereas the return statement returns a value and terminates the execution of the function.
- The generator function is called just like a normal function. However, __its execution is paused on encountering the yield keyword.__ <br>
This sends the yielded value to the calling environment. Meanwhile, __local variables and their states are saved internally.__ <br>
This includes any variable bindings local to the generator, the instruction pointer, the internal stack, and any exception handling.<br>
- This allows function execution to resume whenever you call one of the generator’s methods.
- That way, when next() is called on a generator object (either explicitly or implicitly within a for loop), <br>
execution continues right after the last yield and runs until the next yield is reached.
- Unless your generator is infinite, __you can iterate through it one time only.__
- Once all values have been evaluated, the generator is deemed exhausted. The iteration will stop and the for loop will exit.
- If you used next(), then instead you’ll get an explicit StopIteration exception.
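
The life cycle described in the list above can be traced with a small sketch. The `countdown` function is a made-up example, not part of any library:

```python
def countdown(n):
    print("body started")      # runs only when next() is called for the first time
    while n > 0:
        yield n                # pause here and hand n to the caller
        n -= 1

gen = countdown(2)             # nothing is printed yet: the body has not run
first = next(gen)              # prints "body started", then yields 2
second = next(gen)             # resumes after the yield, then yields 1

try:
    next(gen)                  # the loop ends, so the generator is exhausted
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)  # 2 1 True
```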

One of the __advantages__ of a generator over a list is that __elements are generated dynamically.__<br>
Since each item is produced only when it is requested, a generator is __more memory efficient__ than a list that holds every element at once.

    1. Do you need the entire results in memory?
    2. Do you need to reuse the raw results as is?
    3. Is your result reasonably small to fit in the memory?
    4. Do you want to process the results after you have obtained all the results?

If the answer to all of the above is yes, then a list (or another fully materialized collection) should suffice. Otherwise, you may want to consider using a generator to benefit from the delayed execution and yielding on the fly.


## yield

In [1]:
# One way to create generators is with functions and the yield keyword.
# This function behaves like range(n):
def generate_range(n):
    i = 0
    while i < n:
        yield i  # each time execution reaches yield, the generator produces one value
        i += 1

In [3]:
# The following loop will consume the yielded values one at a time until none are left:
for i in generate_range(5): 
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3
i: 4


## next
next() asks an iterator for its next value.

In [None]:
# A second way to create generators is a generator expression: a comprehension wrapped in parentheses.
evens_below_20 = (i for i in generate_range(20) if i % 2 == 0)

In [5]:
# iteration with next()
print(next(evens_below_20))
print(next(evens_below_20))
print(next(evens_below_20))
print(next(evens_below_20))

0
2
4
6


##  enumerate
yields (index, value) pairs for the items of any iterable

In [14]:
days = iter([ 'Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday' ])
for i, day in enumerate(days):
    print(f"day {i+1} is {day}")

day 1 is Sunday
day 2 is Monday
day 3 is Tuesday
day 4 is Wednesday
day 5 is Thursday
day 6 is Friday
day 7 is Saturday
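
Instead of adding 1 by hand, the numbering can be shifted with the `start` argument of `enumerate`. A short sketch with a shortened list of days:

```python
days = ['Sunday', 'Monday', 'Tuesday']

# start=1 makes the counter begin at 1 instead of 0.
labels = [f"day {i} is {day}" for i, day in enumerate(days, start=1)]
print(labels)  # ['day 1 is Sunday', 'day 2 is Monday', 'day 3 is Tuesday']
```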


## generator comprehension
- A generator comprehension (called a generator expression in the Python docs) is a shorter way of defining a simple generator function.
- They’re useful in the same cases where list comprehensions are used, with an added benefit: <br>
you can create them without building and holding the entire object in memory before iteration.

In [None]:
liste_ = [x * x for x in range(10) if x % 2 == 0]
print(liste_)

# (expression for i in s if condition)
gen = (x * x for x in range(10) if x % 2 == 0)
print(gen)
print(next(gen))
print(next(gen))
print(next(gen))

[0, 4, 16, 36, 64]
<generator object <genexpr> at 0x7ff81ec957b0>
0
4
16


In [None]:
# A generator expression can also be passed directly to a function.
# When it is the only argument, the extra pair of parentheses can be omitted.
sum(x * x for x in range(10))


285

## A Generator Solution

In [None]:
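# Sum the last column of a log file without loading the whole file into memory.
with open("../data/test.txt") as wwwlog:
    # rsplit(None, 1) splits on whitespace, so the trailing newline is not kept
    bytecolumn = (line.rsplit(None, 1)[1] for line in wwwlog)
    # skip '-' placeholders, convert everything else to int
    bytes_sent = (int(x) for x in bytecolumn if x != '-')
    print("Total", sum(bytes_sent))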

Total 135667
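
The same pipeline can be tried without the data file. The sketch below substitutes an in-memory `io.StringIO` with three invented log lines (the line format is an assumption made only for this illustration) and chains the same generator stages:

```python
import io

# Hypothetical log lines: the last column is a byte count, '-' means no data.
sample_log = io.StringIO(
    "GET /index.html 200 1024\n"
    "GET /missing 404 -\n"
    "GET /logo.png 200 2048\n"
)

last_column = (line.rsplit(None, 1)[1] for line in sample_log)  # stage 1: pick the column
byte_counts = (int(x) for x in last_column if x != '-')         # stage 2: filter and convert
total = sum(byte_counts)                                        # stage 3: pulls values through

print("Total", total)  # Total 3072
```

No stage does any work until `sum` starts asking for values, so only one line is held in memory at a time.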


## Performance - of generator objects
- In the run below, the list you get from the list comprehension is 87,616 bytes, while the generator object is only 112 bytes (the exact numbers vary by Python version and platform).
- This means that the list is over 700 times larger than the generator object!

In [None]:
import sys
nums_squared_lc = [i ** 2 for i in range(10000)]
print(sys.getsizeof(nums_squared_lc))

nums_squared_gc = (i ** 2 for i in range(10000))
print(sys.getsizeof(nums_squared_gc))

87616
112


## memory vs. speed
- If the list is smaller than the running machine’s available memory, then list comprehensions can be faster to evaluate than the equivalent generator expression.
- In the profiles below, summing across the list comprehension took about half the time of summing across the generator (0.001 s versus 0.002 s). Part of that gap is profiler overhead on the 10,001 generator resumptions.
- If speed is an issue and memory isn’t, then a list comprehension is likely a better tool for the job.

In [None]:
import cProfile
cProfile.run('sum([i * 2 for i in range(10000)])')

         5 function calls in 0.001 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.001    0.001 <string>:1(<listcomp>)
        1    0.000    0.000    0.001    0.001 <string>:1(<module>)
        1    0.000    0.000    0.001    0.001 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}




In [None]:
cProfile.run('sum((i * 2 for i in range(10000)))')

         10005 function calls in 0.002 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10001    0.001    0.000    0.001    0.000 <string>:1(<genexpr>)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.002    0.002 {built-in method builtins.exec}
        1    0.001    0.001    0.002    0.002 {built-in method builtins.sum}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
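
For timing without per-call profiler overhead, `timeit` is a reasonable alternative. The sketch below only sets up the measurement; the timings depend on the machine and Python version, so no expected numbers are shown:

```python
import timeit

# Time each variant over 100 runs; results differ between machines.
list_time = timeit.timeit('sum([i * 2 for i in range(10000)])', number=100)
gen_time = timeit.timeit('sum(i * 2 for i in range(10000))', number=100)

# Both variants compute the same total.
list_result = sum([i * 2 for i in range(10000)])
gen_result = sum(i * 2 for i in range(10000))

print(f"list comprehension:   {list_time:.4f} s")
print(f"generator expression: {gen_time:.4f} s")
```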




## send, throw, close - Advanced Generator Methods

### send()
- Resumes the execution and “sends” a value into the generator function.

In [None]:
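def f():
    while True:
        x = yield      # pause until a value arrives via send()
        yield x * 2    # hand the doubled value back to the caller

g = f()
next(g)       # advance to the first bare yield so the generator can receive a value
g.send(4)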

8

In [None]:
next(g)       # move past `yield x * 2` to the next bare yield
g.send(10)

20
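
A single yield can also receive and return a value in the same step. The following sketch keeps a running average; `running_average` is a made-up name for this illustration:

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # report the current average, then wait for a new value
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)            # prime the generator: run to the first yield
a1 = avg.send(10)    # 10.0
a2 = avg.send(20)    # 15.0
a3 = avg.send(30)    # 20.0
print(a1, a2, a3)
```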

### throw
- throw() raises an exception inside the generator, at the yield where it is currently paused.
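
A minimal sketch of throw(): the made-up `resilient_counter` generator catches a `ValueError` thrown by the caller and uses it as a reset signal. If the generator did not catch the exception, it would propagate back to the caller instead.

```python
def resilient_counter():
    n = 0
    while True:
        try:
            yield n
            n += 1
        except ValueError:
            n = 0            # reset when the caller throws ValueError

counter = resilient_counter()
a = next(counter)                          # 0
b = next(counter)                          # 1
c = counter.throw(ValueError("reset"))     # handled inside; the next yield gives 0
d = next(counter)                          # 1
print(a, b, c, d)
```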

### close
- close() stops a generator by raising GeneratorExit at the yield where it is paused; any later next() call raises StopIteration.
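
A small sketch of close(), using a made-up `ticker` generator whose `finally` block records that cleanup ran:

```python
log = []

def ticker():
    n = 0
    try:
        while True:
            yield n
            n += 1
    finally:
        log.append("cleaned up")   # runs when close() raises GeneratorExit

t = ticker()
first = next(t)    # 0
t.close()          # GeneratorExit is raised at the paused yield

try:
    next(t)
    still_running = True
except StopIteration:
    still_running = False

print(first, log, still_running)  # 0 ['cleaned up'] False
```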

# itertools
https://docs.python.org/3/library/itertools.html

## product

In [None]:
from itertools import product
# product can substitute nested for-loops

A = [1, 2, 3]
B = [1, 2]

# nested for-loop
print([(i, j) for i in A for j in B])

# substitution with product
print(list(product(A, B)))

C = [[7, 3], [9, 8], [1, 4]]
print(list(product(*C)))  # product of the sub-lists

# specify the number of repetitions to compute the product of an iterable with itself
print(list(product(B, repeat=3)))  # = product(B, B, B)


[(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
[(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
[(7, 9, 1), (7, 9, 4), (7, 8, 1), (7, 8, 4), (3, 9, 1), (3, 9, 4), (3, 8, 1), (3, 8, 4)]
[(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (2, 1, 1), (2, 1, 2), (2, 2, 1), (2, 2, 2)]


## permutation
- Permutations are ordered arrangements of objects selected from a set: the same objects
in a different order count as distinct. $\rightarrow (1, 2) \neq (2, 1)$
- itertools.permutations(iterable, r=None)
- r = length of each permutation; defaults to the length of the iterable
- Permutations are emitted in lexicographic order according to the order of the input.
- Elements are treated as unique based on their position, not on their value.

In [9]:
from itertools import permutations

A = [1,2,3]

print(len(list(permutations(A)))) # number of permutations
print(*list(permutations(A)))
print(*list(permutations(A, 2)))
print(*list(permutations("abc", 3)))

6
(1, 2, 3) (1, 3, 2) (2, 1, 3) (2, 3, 1) (3, 1, 2) (3, 2, 1)
(1, 2) (1, 3) (2, 1) (2, 3) (3, 1) (3, 2)
('a', 'b', 'c') ('a', 'c', 'b') ('b', 'a', 'c') ('b', 'c', 'a') ('c', 'a', 'b') ('c', 'b', 'a')
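
The point that elements are unique by position rather than by value shows up as soon as the input contains duplicates. A short sketch, which also checks the count against $n!/(n-r)!$:

```python
from itertools import permutations
from math import factorial

items = [1, 1, 2]                      # two equal values at different positions
perms = list(permutations(items, 2))
print(perms)
# [(1, 1), (1, 2), (1, 1), (1, 2), (2, 1), (2, 1)]

# The number of results is n! / (n - r)!, regardless of repeated values.
expected_count = factorial(3) // factorial(3 - 2)

# Duplicates can be removed afterwards if only distinct tuples are wanted.
distinct = sorted(set(perms))
print(distinct)  # [(1, 1), (1, 2), (2, 1)]
```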


## combination
- Combinations are the possible subsets of a certain length $r$ of a set, <br> where
the order does not matter. $\rightarrow (1,2) = (2,1)$
- Returns r-length subsequences of elements from the input iterable.
- Combinations are emitted in lexicographic sorted order. 
- So, if the input iterable is sorted, the combination tuples will be produced in sorted order.
- itertools.combinations(iterable, r)

In [10]:
from itertools import combinations

A = [1, 2, 3, 4]

print(len(list(combinations(A, 2))))  # number of combinations
print(*list(combinations(A, 2))) # combination of 4 elements in groups of 2
print(*list(combinations(A, 3))) # combinations of 4 elements in groups of 3
print(*list(combinations("abc", 2)))

6
(1, 2) (1, 3) (1, 4) (2, 3) (2, 4) (3, 4)
(1, 2, 3) (1, 2, 4) (1, 3, 4) (2, 3, 4)
('a', 'b') ('a', 'c') ('b', 'c')
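
As a quick check, the number of combinations should match the binomial coefficient, available as `math.comb` (Python 3.8 or later). The second part of this sketch shows that the output order follows the input order, not the sorted order of the values:

```python
from itertools import combinations
from math import comb

A = [1, 2, 3, 4]
pairs = list(combinations(A, 2))
print(len(pairs), comb(4, 2))  # 6 6

# An unsorted input gives tuples in input order.
unsorted_pairs = list(combinations([3, 1, 2], 2))
print(unsorted_pairs)  # [(3, 1), (3, 2), (1, 2)]
```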


In [3]:
from itertools import combinations_with_replacement, combinations

# each element taken from the list is put back ("replaced"), so it can be paired with itself
print(list(combinations_with_replacement('12345', 2))) 

A = [1, 1, 3, 3, 3]
print(list(combinations(A, 2)))

[('1', '1'), ('1', '2'), ('1', '3'), ('1', '4'), ('1', '5'), ('2', '2'), ('2', '3'), ('2', '4'), ('2', '5'), ('3', '3'), ('3', '4'), ('3', '5'), ('4', '4'), ('4', '5'), ('5', '5')]
[(1, 1), (1, 3), (1, 3), (1, 3), (1, 3), (1, 3), (1, 3), (3, 3), (3, 3), (3, 3)]


## cycle()
cycle() returns an iterator that repeats the items of a collection endlessly, starting over when it reaches the end

In [None]:
import itertools

seq1 = ['Joe', 'Jana', 'Joseph']
cycle1 = itertools.cycle(seq1)
print(next(cycle1))
print(next(cycle1))
print(next(cycle1))
print(next(cycle1)) # cycles to the beginning

Joe
Jana
Joseph
Joe
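
Because a cycle never ends, it is usually paired with something finite. One possible use, sketched here with invented task names, is round-robin assignment with `zip`, which stops as soon as the shorter input runs out:

```python
from itertools import cycle

workers = ['Joe', 'Jana', 'Joseph']
tasks = ['t1', 't2', 't3', 't4', 't5']   # hypothetical task names

# zip stops when tasks is exhausted, so the endless cycle is safe here.
assignment = list(zip(tasks, cycle(workers)))
print(assignment)
# [('t1', 'Joe'), ('t2', 'Jana'), ('t3', 'Joseph'), ('t4', 'Joe'), ('t5', 'Jana')]
```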


## count
count(start, step) yields start, start + step, start + 2 * step, ... without end

In [None]:
count1 = itertools.count(100, 10)
print(next(count1))
print(next(count1))
print(next(count1))
print(next(count1))
print(next(count1))

100
110
120
130
140
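
Since count() is infinite, `list(count(...))` would never finish. Two common ways to bound it, shown as a sketch, are `islice` and `zip`:

```python
from itertools import count, islice

# Take exactly five values from the endless counter.
tens = list(islice(count(100, 10), 5))
print(tens)  # [100, 110, 120, 130, 140]

# Number the items of a finite list; zip stops when the list ends.
numbered = list(zip(count(1), ['a', 'b', 'c']))
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```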


## accumulate 
yields running totals by default, or the running result of any two-argument function

In [22]:
vals = [10, 20, 60, 40, 50, 15, 30]
accu = itertools.accumulate(vals)
print(list(accu))

[10, 30, 90, 130, 180, 195, 225]


In [23]:
# goes over the numbers and sticks with the max 
accu2 = itertools.accumulate(vals, max)
print(list(accu2))

[10, 20, 60, 60, 60, 60, 60]
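
Two further variations, as a sketch: any two-argument function such as `operator.mul` can replace addition, and the `initial` keyword (Python 3.8 or later) puts a starting value in front of the results:

```python
from itertools import accumulate
import operator

nums = [1, 2, 3, 4]

running_product = list(accumulate(nums, operator.mul))
print(running_product)  # [1, 2, 6, 24]

# With initial=100 the output has one more element than the input.
with_start = list(accumulate(nums, initial=100))
print(with_start)  # [100, 101, 103, 106, 110]
```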


## chain
chains any number of iterables into one continuous sequence

In [None]:

x = itertools.chain('ABCD', '1234')
print(list(x))

['A', 'B', 'C', 'D', '1', '2', '3', '4']
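
When the iterables are already collected in a list, `chain.from_iterable` flattens them without unpacking. A short sketch:

```python
from itertools import chain

nested = [[1, 2], [3], [4, 5]]

# Equivalent to chain(*nested), but the outer iterable is also consumed lazily.
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5]
```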


## dropwhile() / takewhile()

In [None]:
def fct(x):
    return x < 40

In [None]:
print(vals)
# dropwhile skips values as long as fct returns True, then yields everything that follows
print(list(itertools.dropwhile(fct, vals)))
# takewhile yields values as long as fct returns True and stops at the first False
print(list(itertools.takewhile(fct, vals)))

[10, 20, 60, 40, 50, 15, 30]
[60, 40, 50, 15, 30]
[10, 20]
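
Both functions only look at the first point where the predicate fails, which makes them different from `filter`. The sketch below reuses the values from above with a renamed predicate (`below_40` is just an illustrative name):

```python
from itertools import dropwhile, takewhile

vals = [10, 20, 60, 40, 50, 15, 30]

def below_40(x):
    return x < 40

taken = list(takewhile(below_40, vals))    # stops at 60, never sees 15 and 30
dropped = list(dropwhile(below_40, vals))  # everything from 60 onward, including 15 and 30
filtered = list(filter(below_40, vals))    # tests every element

print(taken)     # [10, 20]
print(dropped)   # [60, 40, 50, 15, 30]
print(filtered)  # [10, 20, 15, 30]
```

With the same predicate, the two results always split the input at one point, so `taken + dropped` gives back the original list.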


In [None]:
def get_sequence_upto(x):
    for i in range(x):
        yield i

In [None]:
seq = get_sequence_upto(5)
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))
print(next(seq))  # the generator is exhausted, so this sixth next() call raises StopIteration

0
1
2
3
4


StopIteration: 

In [None]:
# In the following example, function square_of_sequence() acts as a generator.
# It yields the square of a number successively on every call of next().

def square_of_sequence(x):
    for i in range(x):
        yield i * i


gen = square_of_sequence(5)

while True:
    try:
        print("Received on next(): ", next(gen))
    except StopIteration:
        break

Received on next():  0
Received on next():  1
Received on next():  4
Received on next():  9
Received on next():  16


In [None]:
# We can use the for loop to traverse the elements over the generator. 
# In this case, the next() function is called implicitly and the StopIteration is also automatically taken care of.
squares = square_of_sequence(5)
for sqr in squares:
    print(sqr)

0
1
4
9
16



## groupby
https://docs.python.org/2/library/itertools.html#itertools.groupby

groupby yields (key, group) pairs, where each group is itself a lazy iterator over the matching consecutive items.

    A. Group consecutive items together
    B. Group all occurrences of an item, given a sorted iterable
    C. Specify how to group items with a key function *


In [64]:
from itertools import groupby
s = "AAAABBBBGGGGHHZZTTTTTTAGGAAAAGGGGGAA"
print("Key - Values in consecutive groups")
for key, group in groupby(s):
    print(f"{key} - {''.join(group)}") # you can unpack an iterator just once

Key - Values in consecutive groups
A - AAAA
B - BBBB
G - GGGG
H - HH
Z - ZZ
T - TTTTTT
A - A
G - GG
A - AAAA
G - GGGGG
A - AA
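
Cases B and C from the list above can be sketched together. Because groupby only groups neighbouring items, the input is sorted by the same key function first; the word list and the `first_letter` helper are invented for this illustration:

```python
from itertools import groupby

def first_letter(word):
    return word[0]

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apricot']

# Unsorted input: 'apricot' is not next to the other 'a' words, so 'a' appears twice.
consecutive_keys = [key for key, _ in groupby(words, key=first_letter)]
print(consecutive_keys)  # ['a', 'b', 'c', 'a']

# Sorting by the same key first gathers all occurrences into one group per key.
grouped = {
    key: list(group)
    for key, group in groupby(sorted(words, key=first_letter), key=first_letter)
}
print(grouped)
```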


In [51]:
from itertools import groupby

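# Run-length encode a string of digits read from standard input:
# each run becomes a (run length, digit) pair.
t = input()
kg = [(len(list(group)), int(key)) for key, group in groupby(t)]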
print(*kg)

(1, 1) (1, 7) (1, 6) (1, 1) (2, 5) (3, 2) (1, 5) (1, 2) (4, 5) (3, 2) (3, 3)
