Generator functions are functions which contain at least one **yield** statement.

When a generator function is called, Python creates a **generator** object.

Generators implement the **iterator protocol**.

Generators are inherently **lazy** iterators (*and can be infinite*).

Generators **are** iterators, and can be used in the same way (*for loops, comprehensions, etc*).

Generators become **exhausted** once the function **returns** (explicitly, or by reaching the end of its body).
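
As a minimal sketch of the laziness described above, the hypothetical `naturals` generator below never finishes on its own. `itertools.islice` requests only the first few values, so nothing beyond them is ever computed.

```python
from itertools import islice


def naturals(start=0):
    """Hypothetical infinite generator: yields start, start + 1, ... forever."""
    n = start
    while True:
        yield n
        n += 1


# islice pulls only five values; the generator simply stays suspended afterwards.
first_five = list(islice(naturals(), 5))
print(first_five)   # [0, 1, 2, 3, 4]
```

Iterating over `naturals()` directly with a plain `for` loop would never stop, which is why a bounded consumer such as `islice` is used here.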

<br>

### General information

In [1]:
def song():   # generator function (a function that uses the 'yield' statement)
    print('line 1')
    yield "I'm a lumberjack and I'm OK"         # 'yield' is a statement, parentheses are not needed
    print('line 2')
    yield 'I sleep all night and I work all day'
    return "Now generator is exhausted"         # the 'return' statement is optional
                                                # any code after 'return' is never executed
    
song()

<generator object song at 0x7fb15c437138>

In [2]:
gen = song()  # we can think of a generator function as a generator factory:
              # calling it runs none of the body and returns a generator.
              # A generator is an iterator,
              # since it implements the iterator protocol: __iter__ and __next__

next(gen)     # The generator is advanced by calling 'next()'.
              # The function body executes until it encounters a yield statement.
              # It yields the value and then suspends itself until 'next()' is called again.

line 1


"I'm a lumberjack and I'm OK"

In [3]:
next(gen)     #= gen.__next__()

line 2


'I sleep all night and I work all day'

In [4]:
# next(gen)   #> StopIteration: Now generator is exhausted
              # When the function body reaches 'return', StopIteration is raised.
              # The exception carries the returned object (shown here as its message).
              # Thus, when the function returns, the generator (iterator) is exhausted

In [5]:
# next(gen)   #> StopIteration:
              # A second call raises StopIteration again, this time without a message.
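
The object given to `return` is not lost: it travels on the `StopIteration` exception as its `value` attribute. The sketch below uses a shortened, hypothetical `short_song` generator (not the `song` function above) to show both the first and the second exhaustion.

```python
def short_song():
    """Hypothetical two-step generator: one yield, then a return value."""
    yield "I'm a lumberjack and I'm OK"
    return "Now generator is exhausted"


gen = short_song()
first = next(gen)           # runs the body up to the first yield

try:
    next(gen)               # the body reaches 'return' -> StopIteration
except StopIteration as exc:
    returned = exc.value    # the object given to 'return'

try:
    next(gen)               # already exhausted -> StopIteration again
except StopIteration as exc:
    second = exc.value      # None: there is no return value this time

print(returned, second)
```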

In [6]:
print('__iter__' in dir(gen))  # generators are iterators
print('__next__' in dir(gen))

print(iter(gen) is gen)

True
True
True


<br>

*Example: Fibonacci sequence as a generator*

```
0 1 1 2 3 5 8 13 ...

Fib(n) = Fib(n-1) + Fib(n-2)

Fib(0) = 0
Fib(1) = 1
```

In [7]:
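def fib_gen(n):
    # yields the first n Fibonacci numbers: Fib(0), Fib(1), ..., Fib(n-1)
    fib_0 = 0
    fib_1 = 1
    if n <= 0:
        return            # nothing to yield: the generator is exhausted immediately
    yield fib_0
    if n == 1:
        return
    yield fib_1
    for i in range(n-2):
        fib_0, fib_1 = fib_1, fib_0 + fib_1
        yield fib_1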

In [8]:
gen = fib_gen(7)

for num in gen:          #= print(*gen)
    print(num, end=' ')

0 1 1 2 3 5 8 

In [9]:
gen = fib_gen(2)
print(*gen)

0 1


In [10]:
gen = fib_gen(1)
print(*gen)

0


In [11]:
gen = fib_gen(0)
print(*gen)




<br>

### Making an iterable from a generator

In [12]:
class Squares:
    def __init__(self, n):
        self.n = n
        
    def __iter__(self):
        return Squares.squares_gen(self.n)   # return new iterator
    
    @staticmethod
    def squares_gen(n):
        for i in range(n):
            yield i ** 2
            
            
sq = Squares(5)
print(*sq)
print(*sq)

0 1 4 9 16
0 1 4 9 16


Note: Google's Python style guide explicitly discourages using static methods ([*google.github.io*](http://google.github.io/styleguide/pyguide.html#2174-decision) - '*write a module level function instead*')
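
A minimal sketch of what that advice could look like here: the generator function moves to module level and the class only calls it. The class name `SquaresModuleLevel` is hypothetical and used only to keep it apart from `Squares` above.

```python
def squares_gen(n):
    """Module-level generator function: yields 0**2, 1**2, ..., (n-1)**2."""
    for i in range(n):
        yield i ** 2


class SquaresModuleLevel:
    """Hypothetical variant of Squares that needs no static method."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # A fresh generator on every call, so the object can be iterated repeatedly.
        return squares_gen(self.n)


sq = SquaresModuleLevel(5)
print(*sq)   # 0 1 4 9 16
print(*sq)   # 0 1 4 9 16
```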

<br>
<br>

In [13]:
# closure-style alternative (from FAQ): SquaresClosure returns a function
# that creates a new generator each time it is called
def SquaresClosure(n):
    def inner():
        return squares_gen(n)
    
    def squares_gen(n):
        for i in range(n):
            yield i ** 2
    return inner

sq_func = SquaresClosure(5)
print(*sq_func())
print(*sq_func())

0 1 4 9 16
0 1 4 9 16


<br>

*Example with a card deck*:

In [14]:
from collections import namedtuple

Card = namedtuple('Card', 'rank, suit')
SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
RANKS = tuple(range(2, 11)) + tuple('JQKA')        # (2,3,4,5,6,7,8,9,10,'J','Q','K','A')

```
suit_index = card_index // len(RANKS)
rank_index = card_index % len(RANKS)
```
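
The two formulas above are what the built-in `divmod` computes in a single step. A small sketch, assuming the same 4 suits and 13 ranks as in the previous cell (redefined here so the block runs on its own); `card_at` is a hypothetical helper name.

```python
SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
RANKS = tuple(range(2, 11)) + tuple('JQKA')


def card_at(card_index):
    """Hypothetical helper: map a flat index 0..51 to a (rank, suit) pair."""
    # divmod returns (card_index // len(RANKS), card_index % len(RANKS))
    suit_index, rank_index = divmod(card_index, len(RANKS))
    return RANKS[rank_index], SUITS[suit_index]


print(card_at(0))    # (2, 'Spades')
print(card_at(13))   # (2, 'Hearts')
print(card_at(51))   # ('A', 'Clubs')
```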

In [15]:
def card_gen():
    for i in range(len(SUITS) * len(RANKS)):
        suit = SUITS[i // len(RANKS)]
        rank = RANKS[i % len(RANKS)]
        yield Card(rank, suit)
        
for card in card_gen():
    print(card)

Card(rank=2, suit='Spades')
Card(rank=3, suit='Spades')
Card(rank=4, suit='Spades')
Card(rank=5, suit='Spades')
Card(rank=6, suit='Spades')
Card(rank=7, suit='Spades')
Card(rank=8, suit='Spades')
Card(rank=9, suit='Spades')
Card(rank=10, suit='Spades')
Card(rank='J', suit='Spades')
Card(rank='Q', suit='Spades')
Card(rank='K', suit='Spades')
Card(rank='A', suit='Spades')
Card(rank=2, suit='Hearts')
Card(rank=3, suit='Hearts')
Card(rank=4, suit='Hearts')
Card(rank=5, suit='Hearts')
Card(rank=6, suit='Hearts')
Card(rank=7, suit='Hearts')
Card(rank=8, suit='Hearts')
Card(rank=9, suit='Hearts')
Card(rank=10, suit='Hearts')
Card(rank='J', suit='Hearts')
Card(rank='Q', suit='Hearts')
Card(rank='K', suit='Hearts')
Card(rank='A', suit='Hearts')
Card(rank=2, suit='Diamonds')
Card(rank=3, suit='Diamonds')
Card(rank=4, suit='Diamonds')
Card(rank=5, suit='Diamonds')
Card(rank=6, suit='Diamonds')
Card(rank=7, suit='Diamonds')
Card(rank=8, suit='Diamonds')
Card(rank=9, suit='Diamonds')
Card(rank=10, 

In [16]:
def card_gen_lazy():
    for suit in SUITS:
        for rank in RANKS:
            yield Card(rank, suit)

for card in card_gen_lazy():
    print(card)

Card(rank=2, suit='Spades')
Card(rank=3, suit='Spades')
Card(rank=4, suit='Spades')
Card(rank=5, suit='Spades')
Card(rank=6, suit='Spades')
Card(rank=7, suit='Spades')
Card(rank=8, suit='Spades')
Card(rank=9, suit='Spades')
Card(rank=10, suit='Spades')
Card(rank='J', suit='Spades')
Card(rank='Q', suit='Spades')
Card(rank='K', suit='Spades')
Card(rank='A', suit='Spades')
Card(rank=2, suit='Hearts')
Card(rank=3, suit='Hearts')
Card(rank=4, suit='Hearts')
Card(rank=5, suit='Hearts')
Card(rank=6, suit='Hearts')
Card(rank=7, suit='Hearts')
Card(rank=8, suit='Hearts')
Card(rank=9, suit='Hearts')
Card(rank=10, suit='Hearts')
Card(rank='J', suit='Hearts')
Card(rank='Q', suit='Hearts')
Card(rank='K', suit='Hearts')
Card(rank='A', suit='Hearts')
Card(rank=2, suit='Diamonds')
Card(rank=3, suit='Diamonds')
Card(rank=4, suit='Diamonds')
Card(rank=5, suit='Diamonds')
Card(rank=6, suit='Diamonds')
Card(rank=7, suit='Diamonds')
Card(rank=8, suit='Diamonds')
Card(rank=9, suit='Diamonds')
Card(rank=10, 

In [17]:
# wrap the generator in an iterable class and add a __reversed__ method
class CardDeck:
    SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
    RANKS = tuple(range(2, 11)) + tuple('JQKA')
    
    def __iter__(self):
        return CardDeck.card_gen_lazy()  # each time returns new iterator
    
    @staticmethod
    def card_gen_lazy():
        for suit in CardDeck.SUITS:
            for rank in CardDeck.RANKS:
                yield Card(rank, suit)

    def __reversed__(self):              # returns reversed iterator
        return CardDeck.reversed_card_gen_lazy()
    
    @staticmethod
    def reversed_card_gen_lazy():
        for suit in reversed(CardDeck.SUITS):
            for rank in reversed(CardDeck.RANKS):
                yield Card(rank, suit)

In [18]:
deck = CardDeck()
list(deck)

[Card(rank=2, suit='Spades'),
 Card(rank=3, suit='Spades'),
 Card(rank=4, suit='Spades'),
 Card(rank=5, suit='Spades'),
 Card(rank=6, suit='Spades'),
 Card(rank=7, suit='Spades'),
 Card(rank=8, suit='Spades'),
 Card(rank=9, suit='Spades'),
 Card(rank=10, suit='Spades'),
 Card(rank='J', suit='Spades'),
 Card(rank='Q', suit='Spades'),
 Card(rank='K', suit='Spades'),
 Card(rank='A', suit='Spades'),
 Card(rank=2, suit='Hearts'),
 Card(rank=3, suit='Hearts'),
 Card(rank=4, suit='Hearts'),
 Card(rank=5, suit='Hearts'),
 Card(rank=6, suit='Hearts'),
 Card(rank=7, suit='Hearts'),
 Card(rank=8, suit='Hearts'),
 Card(rank=9, suit='Hearts'),
 Card(rank=10, suit='Hearts'),
 Card(rank='J', suit='Hearts'),
 Card(rank='Q', suit='Hearts'),
 Card(rank='K', suit='Hearts'),
 Card(rank='A', suit='Hearts'),
 Card(rank=2, suit='Diamonds'),
 Card(rank=3, suit='Diamonds'),
 Card(rank=4, suit='Diamonds'),
 Card(rank=5, suit='Diamonds'),
 Card(rank=6, suit='Diamonds'),
 Card(rank=7, suit='Diamonds'),
 Card(rank

In [19]:
rev = reversed(CardDeck())
list(rev)

[Card(rank='A', suit='Clubs'),
 Card(rank='K', suit='Clubs'),
 Card(rank='Q', suit='Clubs'),
 Card(rank='J', suit='Clubs'),
 Card(rank=10, suit='Clubs'),
 Card(rank=9, suit='Clubs'),
 Card(rank=8, suit='Clubs'),
 Card(rank=7, suit='Clubs'),
 Card(rank=6, suit='Clubs'),
 Card(rank=5, suit='Clubs'),
 Card(rank=4, suit='Clubs'),
 Card(rank=3, suit='Clubs'),
 Card(rank=2, suit='Clubs'),
 Card(rank='A', suit='Diamonds'),
 Card(rank='K', suit='Diamonds'),
 Card(rank='Q', suit='Diamonds'),
 Card(rank='J', suit='Diamonds'),
 Card(rank=10, suit='Diamonds'),
 Card(rank=9, suit='Diamonds'),
 Card(rank=8, suit='Diamonds'),
 Card(rank=7, suit='Diamonds'),
 Card(rank=6, suit='Diamonds'),
 Card(rank=5, suit='Diamonds'),
 Card(rank=4, suit='Diamonds'),
 Card(rank=3, suit='Diamonds'),
 Card(rank=2, suit='Diamonds'),
 Card(rank='A', suit='Hearts'),
 Card(rank='K', suit='Hearts'),
 Card(rank='Q', suit='Hearts'),
 Card(rank='J', suit='Hearts'),
 Card(rank=10, suit='Hearts'),
 Card(rank=9, suit='Hearts'),


<br>

### Short form

In [20]:
gn = (i**2 for i in range(5))
print(gn)
print(type(gn))

<generator object <genexpr> at 0x7fb15c3f0408>
<class 'generator'>


In [21]:
print(*gn)

0 1 4 9 16


In [22]:
list(gn)  # the generator was exhausted by print(*gn), so the list is empty

[]
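
A short sketch of the single-use behaviour shown above: the same generator object cannot be restarted, but rebuilding the expression gives a fresh, independent one. The variable names are illustrative only.

```python
squares = (i ** 2 for i in range(5))

first_pass = list(squares)    # consumes every value
second_pass = list(squares)   # nothing left: the same object cannot restart

# Rebuilding the expression creates a new generator that starts from the beginning.
squares = (i ** 2 for i in range(5))
total = sum(squares)

print(first_pass, second_pass, total)   # [0, 1, 4, 9, 16] [] 30
```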

<br>

### Generator inside generator

In [23]:
mult_gen = ((i*j for j in range(1,11))
            for i in range(1,4))

list(mult_gen)  # only the outer generator is consumed: each row is still a generator object

[<generator object <genexpr>.<genexpr> at 0x7fb15c437408>,
 <generator object <genexpr>.<genexpr> at 0x7fb15c437e58>,
 <generator object <genexpr>.<genexpr> at 0x7fb15c437c00>]

In [24]:
mult_gen = ((i*j for j in range(1,11))
            for i in range(1,4))

[list(row) for row in mult_gen]

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
 [3, 6, 9, 12, 15, 18, 21, 24, 27, 30]]

Alternative ways to print the rows:

In [25]:
mult_gen = ((i*j for j in range(1,11))
            for i in range(1,4))

for row in mult_gen:
    print(', '.join([str(item) for item in row]))

1, 2, 3, 4, 5, 6, 7, 8, 9, 10
2, 4, 6, 8, 10, 12, 14, 16, 18, 20
3, 6, 9, 12, 15, 18, 21, 24, 27, 30


In [26]:
mult_gen = ((i*j for j in range(1,11))
            for i in range(1,4))

for row in mult_gen:
    print(*row, sep=', ')

1, 2, 3, 4, 5, 6, 7, 8, 9, 10
2, 4, 6, 8, 10, 12, 14, 16, 18, 20
3, 6, 9, 12, 15, 18, 21, 24, 27, 30


<br>

It can be useful to combine a generator expression with a traditional list comprehension, so that rows are produced lazily but each row is a ready-made list:

In [27]:
mult_comb = ([i*j for j in range(1,11)]
             for i in range(1,4))

list(mult_comb)

[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
 [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
 [3, 6, 9, 12, 15, 18, 21, 24, 27, 30]]

<br>
<br>

Example with Pascal's triangle:
```
    1
   1 1
  1 2 1
 1 3 3 1
1 4 6 4 1
```

We need to know how to calculate combinations:
```
C(n, k) = n! / (k! (n-k)!)
```

* row 0, column 0: n=0, k=0: c(0, 0) = 0! / (0! 0!) = 1 / 1 = 1
* row 4, column 2: n=4, k=2: c(4, 2) = 4! / (2! 2!) = 24 / 4 = 6
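
The two worked values can be checked in code. The sketch below writes the factorial formula out as a hypothetical helper `n_choose_k` and compares it with `math.comb`, which is available in the standard library from Python 3.8 onwards.

```python
from math import comb, factorial


def n_choose_k(n, k):
    """Hypothetical helper: the factorial formula; // keeps the result an exact int."""
    return factorial(n) // (factorial(k) * factorial(n - k))


print(n_choose_k(0, 0), n_choose_k(4, 2))   # 1 6

# math.comb gives the same values without a hand-written helper.
matches = all(n_choose_k(n, k) == comb(n, k)
              for n in range(10) for k in range(n + 1))
print(matches)   # True
```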

In other words, we need to calculate the following list of lists:
```
            c(0,0)
        c(1,0)  c(1,1)
    c(2,0)  c(2,1)  c(2,2)
c(3,0)  c(3,1)  c(3,2)  c(3,3)
```

In [28]:
from math import factorial

In [29]:
def combo(n, k):
    return factorial(n) // (factorial(k) * factorial(n-k))

In [30]:
size = 6  # global variable

In [31]:
# creating Pascal's triangle via list comprehension
pascal_ls = [[combo(n,k) for k in range(n+1)] for n in range(size+1)]

pascal_ls

[[1],
 [1, 1],
 [1, 2, 1],
 [1, 3, 3, 1],
 [1, 4, 6, 4, 1],
 [1, 5, 10, 10, 5, 1],
 [1, 6, 15, 20, 15, 6, 1]]

In [32]:
# creating Pascal's triangle using generators
pascal_gn = ((combo(n,k) for k in range(n+1)) for n in range(size+1))

[list(row) for row in pascal_gn]

[[1],
 [1, 1],
 [1, 2, 1],
 [1, 3, 3, 1],
 [1, 4, 6, 4, 1],
 [1, 5, 10, 10, 5, 1],
 [1, 6, 15, 20, 15, 6, 1]]
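
As an aside, the same rows can also come from a generator function rather than a nested generator expression. The sketch below deliberately swaps in a different technique from the factorial-based one above: each row is built from the previous row by adding neighbouring entries. The name `pascal_rows` is hypothetical.

```python
def pascal_rows(size):
    """Yield rows 0..size of Pascal's triangle, each built from the previous row."""
    row = [1]
    for _ in range(size + 1):
        yield row
        # Every inner entry is the sum of the two entries above it.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]


for row in pascal_rows(4):
    print(row)
```

Because `row` is rebound to a new list after each `yield`, rows that were already handed out are not modified later.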

<br>

### Performance research

In [33]:
from timeit import timeit

In [34]:
size = 600

In [35]:
timeit('[[combo(n,k) for k in range(n+1)] for n in range(size+1)]',
       globals=globals(), number=1)

2.35700031000124

In [36]:
timeit('((combo(n,k) for k in range(n+1)) for n in range(size+1))',
       globals=globals(), number=1)

5.489999239216559e-06

In [37]:
timeit('([combo(n,k) for k in range(n+1)] for n in range(size+1))',
       globals=globals(), number=1)

6.326001312118024e-06

In [38]:
timeit('[(combo(n,k) for k in range(n+1)) for n in range(size+1)]',
       globals=globals(), number=1)

0.00136031999863917
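
One way to see why the generator-expression timings above are so small is to count how often the element function actually runs. In this sketch `tracked_square` and `call_log` are hypothetical stand-ins for `combo`; the point is only when the calls happen.

```python
call_log = []


def tracked_square(x):
    """Hypothetical element function that records every call."""
    call_log.append(x)
    return x * x


lazy = (tracked_square(i) for i in range(1000))
calls_after_creation = len(call_log)    # 0: creating the generator computes nothing

first = next(lazy)
calls_after_one_next = len(call_log)    # 1: exactly one element has been computed

eager = [tracked_square(i) for i in range(1000)]
calls_after_list = len(call_log)        # 1001: the list computes all 1000 at once

print(calls_after_creation, calls_after_one_next, calls_after_list)
```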

<br>

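The timings above cover only object creation; a generator expression computes nothing until it is consumed. Time of creation plus iterating over all elements: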

In [39]:
def pascal_list(size):
    ls = [[combo(n,k) for k in range(n+1)] for n in range(size+1)]
    for row in ls:
        for item in row:
            pass

In [40]:
timeit('pascal_list(size)', globals=globals(), number=1)

2.326436156001364

In [41]:
def pascal_gen(size):
    gn = ((combo(n,k) for k in range(n+1)) for n in range(size+1))
    for row in gn:
        for item in row:
            pass

In [42]:
timeit('pascal_gen(size)', globals=globals(), number=1)

2.3458863039995776

<br>

Memory taken (the size tracemalloc attributes to the source line with the largest allocations):

In [43]:
import tracemalloc

In [44]:
def pascal_list(size):
    ls = [[combo(n,k) for k in range(n+1)] for n in range(size+1)]
    for row in ls:
        for item in row:
            pass
    stats = tracemalloc.take_snapshot().statistics('lineno')
    print(stats[0].size, 'bytes')

In [45]:
tracemalloc.stop()
tracemalloc.clear_traces()
tracemalloc.start()
pascal_list(300)

1998608 bytes


In [46]:
def pascal_gen(size):
    gn = ((combo(n,k) for k in range(n+1)) for n in range(size+1))
    for row in gn:
        for item in row:
            pass
    stats = tracemalloc.take_snapshot().statistics('lineno')
    print(stats[0].size, 'bytes')

In [47]:
tracemalloc.stop()
tracemalloc.clear_traces()
tracemalloc.start()
pascal_gen(300)

600 bytes
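
A rougher but quicker check, assuming CPython, is `sys.getsizeof`. It reports only the size of the container object itself, not of the integers it refers to, yet the difference is already visible because a list must hold one reference per element while a generator holds only its suspended state.

```python
import sys

n = 10_000
as_list = [i ** 2 for i in range(n)]
as_gen = (i ** 2 for i in range(n))

list_size = sys.getsizeof(as_list)   # grows with n
gen_size = sys.getsizeof(as_gen)     # small, whatever n is

print(list_size > gen_size)   # True
```

The exact byte counts differ between Python versions and platforms, so only the comparison is printed.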


<br>

### yield from

In [48]:
def matrix_gen(n):
    gen = ( (i*j for j in range(1, n+1))
            for i in range(1, n+1)
          )
    return gen

In [49]:
def matrix_iterator(n):
    for row in matrix_gen(n):
        for item in row:
            yield item
            
for item in matrix_iterator(3):
    print(item, end=' ')

1 2 3 2 4 6 3 6 9 

In [50]:
# alternative with 'yield from'

def matrix_iterator(n):
    for row in matrix_gen(n):
        yield from row
            
for item in matrix_iterator(3):
    print(item, end=' ')

1 2 3 2 4 6 3 6 9 
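
For this particular flattening task the standard library offers a ready-made tool as well: `itertools.chain.from_iterable` lazily walks each row in turn. A minimal sketch, with `matrix_gen` repeated so the block runs on its own:

```python
from itertools import chain


def matrix_gen(n):
    # Same nested generator expression as in the cell above.
    return ((i * j for j in range(1, n + 1)) for i in range(1, n + 1))


flat = chain.from_iterable(matrix_gen(3))
print(*flat)   # 1 2 3 2 4 6 3 6 9
```

`yield from` remains the more general choice inside a generator function, since it can be mixed freely with other statements.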

<br>

In [51]:
file_1 = 'working_files/car-brands-1.txt'
file_2 = 'working_files/car-brands-2.txt'
file_3 = 'working_files/car-brands-3.txt'
files = (file_1, file_2, file_3)

In [52]:
def gen_clean_data(file):
    with open(file) as fh:
        for row in fh:
            yield row.strip()


def brands(*files):
    for f_name in files:
        yield from gen_clean_data(f_name)
        

gen = brands(*files)
print(*gen, sep=', ')

Alfa Romeo, Aston Martin, Audi, Bentley, Benz, BMW, Bugatti, Cadillac, Chevrolet, Chrysler, Citroen, Corvette, DAF, Dacia, Daewoo, Daihatsu, Datsun, De Lorean, Dino, Dodge, Farboud, Ferrari, Fiat, Ford, Honda, Hummer, Hyundai, Jaguar, Jeep, KIA, Koenigsegg, Lada, Lamborghini, Lancia, Land Rover, Lexus, Ligier, Lincoln, Lotus, Martini, Maserati, Maybach, Mazda, McLaren, Mercedes-Benz, Mini, Mitsubishi, Nissan, Noble, Opel, Peugeot, Pontiac, Porsche, Renault, Rolls-Royce, Saab, Seat, Škoda, Smart, Spyker, Subaru, Suzuki, Toyota, Vauxhall, Volkswagen, Volvo
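
The `working_files` directory is not part of this text, so the sketch below reproduces the same pattern on throwaway files created with `tempfile`. The file names and contents are hypothetical stand-ins for the car-brands files, and the `encoding` argument is an addition made here for portability.

```python
import os
import tempfile


def gen_clean_data(file):
    # Yield each line of the file with surrounding whitespace removed.
    with open(file, encoding='utf-8') as fh:
        for row in fh:
            yield row.strip()


def brands(*files):
    # Delegate to one per-file generator after another.
    for f_name in files:
        yield from gen_clean_data(f_name)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample data, written only for this demonstration.
    contents = {'a.txt': 'Audi\nBMW\n', 'b.txt': '  Fiat \nFord\n'}
    paths = []
    for name, text in contents.items():
        path = os.path.join(tmp_dir, name)
        with open(path, 'w', encoding='utf-8') as fh:
            fh.write(text)
        paths.append(path)

    result = list(brands(*paths))   # consume while the files still exist

print(result)   # ['Audi', 'BMW', 'Fiat', 'Ford']
```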
