
# **Generators**

> Generators are functions that can be paused and resumed on the fly, returning an object that can be iterated over.





*   Unlike lists, they are lazy: they produce items one at a time and only when asked.
*   This makes them much more memory efficient when dealing with large datasets.
* A generator is defined like a normal function but uses the yield statement instead of return.



Source: https://www.python-engineer.com/courses/advancedpython/14-generators/

In [13]:
def mygenerator():
  yield 3
  yield 2
  yield 1

g = mygenerator()

for i in g:
  print(i)

3
2
1


In [14]:
g1 = mygenerator()

list(g1)

[3, 2, 1]

In [15]:
g2 = mygenerator()

while True:  # deliberately unguarded: the fourth next() raises StopIteration
  print(next(g2))


3
2
1


StopIteration: 

In [16]:
g3 = mygenerator()

while True:
  try:
    print(next(g3))
  except StopIteration:
    print("generator is iterated completely")
    break

3
2
1
generator is iterated completely


In [17]:
g4 = mygenerator()

s = sum(g4)  # 3 + 2 + 1 = 6
print(f"sum: {s}")


g5 = mygenerator()
sorted_list = sorted(g5)

print(f"sorted list: {sorted_list}")

sum: 6
sorted list: [1, 2, 3]
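Each cell above creates a fresh generator (g1, g2, ...) because a generator object can only be consumed once. The following minimal sketch, which repeats mygenerator so that it runs on its own, illustrates this; the variable names are only illustrative.

```python
def mygenerator():
    yield 3
    yield 2
    yield 1

g = mygenerator()
first_pass = list(g)    # consumes all three values
second_pass = list(g)   # the same object has nothing left to yield

print(first_pass)       # [3, 2, 1]
print(second_pass)      # []

# Calling the function again returns a new, unconsumed generator object.
fresh_pass = list(mygenerator())
print(fresh_pass)       # [3, 2, 1]
```

To iterate over the values more than once, either call the generator function again or store the values in a list.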


## Execution of a generator function



*   Calling the function does not execute its body. Instead, the call returns a generator object that is used to control execution.
* The generator object runs when next() is called. On the first call, execution begins at the start of the function and continues until the first yield statement, whose value is returned to the caller.
* Each subsequent call to next() resumes right after that yield (looping around if necessary) until another yield is reached. If the function ends without reaching another yield, for example because a loop condition is no longer met, a **StopIteration exception** is raised.



In [22]:
def countdown(num: int):
  print('starting')
  while num > 0:
    yield num
    num -= 1

cd = countdown(4)

value = next(cd)
print(value)
value = next(cd)
print(value)
value = next(cd)
print(value)
value = next(cd)
print(value)


starting
4
3
2
1
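The example above stops before the generator is exhausted. As a small sketch of the remaining behavior, the block below repeats countdown with a smaller argument and shows that the body does not start until the first next(), and that one call too many raises StopIteration. The names values, exhausted, and fallback are only illustrative.

```python
def countdown(num: int):
    print('starting')
    while num > 0:
        yield num
        num -= 1

cd = countdown(2)               # nothing is printed yet: the body has not started
values = [next(cd), next(cd)]   # 'starting' is printed during the first next()
print(values)                   # [2, 1]

try:
    next(cd)                    # the loop condition is now false, so the function ends
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)                # True

# next() also accepts a default value that is returned instead of raising.
fallback = next(cd, None)
print(fallback)                 # None
```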


## Big advantage: Generators save memory!

In [25]:
def firstn(n: int) -> list:
  nums = []
  num = 0

  while num < n:
    nums.append(num)
    num += 1
  return nums

print(firstn(10))
print(f"sum = {sum(firstn(10))}")

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
sum = 45


### Generators are more memory efficient

> Because the values are generated lazily, i.e. only when needed, a generator saves a lot of memory, especially when working with large data. Furthermore, we can start using the first elements without waiting for all of them to be generated.



In [26]:
from typing import Generator

def firstn_generators(n: int) -> Generator[int, None, None]:
  num = 0

  while num < n:
    yield num
    num += 1

print(f"sum = {sum(firstn_generators(10))}")

sum = 45


In [29]:
import sys

n = 100000
print(f"Size of a list object = {sys.getsizeof(firstn(n))} bytes")
print(f"Size of a generator object = {sys.getsizeof(firstn_generators(n))} bytes")

Size of a list object = 800984 bytes
Size of a generator object = 104 bytes
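Note that sys.getsizeof reports only the size of the list object itself (its internal array of references), not the integer objects it refers to, so the comparison above, if anything, understates the list's footprint. One rough way to compare peak memory use is the standard-library tracemalloc module. The sketch below repeats the two functions so that it runs on its own; peak_bytes is a hypothetical helper name, and the exact numbers depend on the Python version and platform.

```python
import tracemalloc

def firstn(n):
    nums = []
    num = 0
    while num < n:
        nums.append(num)
        num += 1
    return nums

def firstn_generators(n):
    num = 0
    while num < n:
        yield num
        num += 1

def peak_bytes(make_iterable, n):
    """Sum make_iterable(n) and return (total, peak traced memory in bytes)."""
    tracemalloc.start()
    total = sum(make_iterable(n))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak

n = 100000
list_total, list_peak = peak_bytes(firstn, n)
gen_total, gen_peak = peak_bytes(firstn_generators, n)

print(f"list:      sum = {list_total}, peak = {list_peak} bytes")
print(f"generator: sum = {gen_total}, peak = {gen_peak} bytes")
```

Both variants compute the same sum, but the list version has to hold all n values in memory at once, while the generator version holds only the current one.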


### Another example: Fibonacci numbers

In [31]:
def fibonacci(limit: int):
  a, b = 0, 1
  while a < limit:
    yield a
    a, b = b, a + b

fib = fibonacci(30)

for i in fib:
  print(i)

0
1
1
2
3
5
8
13
21
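Because values are produced only on demand, a generator does not even need an upper bound. As a sketch of this idea, the variant below (fibonacci_unbounded is a hypothetical name, not part of the original notebook) never terminates on its own, and itertools.islice takes just the first ten values from it.

```python
from itertools import islice

def fibonacci_unbounded():
    a, b = 0, 1
    while True:          # no limit: the caller decides how many values to take
        yield a
        a, b = b, a + b

first_ten = list(islice(fibonacci_unbounded(), 10))
print(first_ten)         # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Building the same sequence as a list would be impossible here, since the loop never ends.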


## Generator Expression

* Generator expressions use the same syntax as list comprehensions, except with parentheses instead of square brackets. Be careful not to mix them up: when every item is consumed anyway, a generator expression is often slower than the equivalent list comprehension because of the overhead of resuming the generator for each item (https://stackoverflow.com/questions/11964130/list-comprehension-vs-generator-expressions-weird-timeit-results/11964478#11964478).

In [32]:
mygen = (i for i in range(10) if i % 2 == 0)

for i in mygen:
  print(i)

0
2
4
6
8


In [34]:
n = 100000
lst = [i for i in range(n) if i % 2 == 0]
gen = (i for i in range(n) if i % 2 == 0)

print(f"Size of a list object = {sys.getsizeof(lst)} bytes")
print(f"Size of a generator object = {sys.getsizeof(gen)} bytes")


Size of a list object = 444376 bytes
Size of a generator object = 104 bytes
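Whether a generator expression is actually slower than a list comprehension depends on the Python version and the workload, so it is worth measuring. The sketch below uses the standard-library timeit module; sum_with_list and sum_with_generator are illustrative names, and no particular timing result is implied.

```python
import timeit

n = 100000

def sum_with_list():
    # Builds the full list in memory first, then sums it.
    return sum([i for i in range(n) if i % 2 == 0])

def sum_with_generator():
    # Feeds values to sum() one at a time; no intermediate list is built.
    return sum(i for i in range(n) if i % 2 == 0)

list_seconds = timeit.timeit(sum_with_list, number=20)
gen_seconds = timeit.timeit(sum_with_generator, number=20)

print(f"list comprehension:   {list_seconds:.4f} s")
print(f"generator expression: {gen_seconds:.4f} s")
```

Note that when a generator expression is the only argument of a function call, as in sum_with_generator, the extra pair of parentheses can be omitted.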


## Concept behind a generator

* The class below implements the same behavior as our generator, written by hand as an iterator. It has to implement `__iter__` and `__next__`, keep track of the current state (the current number in this case), and raise StopIteration once it is exhausted.

In [35]:
class firstn:
    def __init__(self, n):
        self.n = n
        self.num = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.num < self.n:
            cur = self.num
            self.num += 1
            return cur
        else:
            raise StopIteration()

firstn_object = firstn(1000000)
print(sum(firstn_object))

499999500000
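A generator function gives us the same protocol without the boilerplate. The short sketch below checks this on a generator object; firstn_generator is an illustrative name for the generator version of the class above.

```python
def firstn_generator(n):
    num = 0
    while num < n:
        yield num
        num += 1

gen = firstn_generator(3)

# A generator object provides both methods of the iterator protocol ...
has_protocol = hasattr(gen, "__iter__") and hasattr(gen, "__next__")
# ... and, like the class above, returns itself from __iter__.
is_own_iterator = iter(gen) is gen

# Calling __next__ directly is what next(gen) does behind the scenes.
values = [gen.__next__(), gen.__next__(), gen.__next__()]

print(has_protocol, is_own_iterator, values)   # True True [0, 1, 2]
```

The state that the class stores in self.num lives in the generator's local variable num, and the StopIteration is raised automatically when the function body ends.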
