A generator function is defined like a normal function, but it produces its values with the "yield" keyword instead of the return keyword. Calling it returns a generator object.

In [4]:
def mygenerator():
    yield 1
    yield 2
    yield 3
    
g = mygenerator()  #g is a generator object
g

<generator object mygenerator at 0x00785808>

I can loop over this object.

In [5]:
for i in g:
    print(i)

1
2
3

I can also request the values one at a time with the built-in next() function.


In [6]:
def mygenerator():
    yield 1
    yield 2
    yield 3
    
g = mygenerator()
value = next(g)
print(value)
value = next(g)
print(value)
value = next(g)
print(value)

1
2
3


If I call next(g) a fourth time, it raises a StopIteration exception, because a generator object always raises StopIteration when its body finishes without reaching another yield statement.
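
To make the fourth call concrete, here is a minimal sketch that catches the exception. The names values and exhausted are only illustrative. It also shows that next() accepts an optional default, which is returned instead of raising.

```python
def mygenerator():
    yield 1
    yield 2
    yield 3

g = mygenerator()
values = [next(g), next(g), next(g)]  # consume the three yielded values

try:
    next(g)  # no yield statement is left, so this raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

print(values)     # [1, 2, 3]
print(exhausted)  # True

# With a default argument, next() returns the default instead of raising.
print(next(g, 'done'))  # done
```

A for loop handles StopIteration internally, which is why the earlier loop simply ended after the last value.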

You can use generators as input to other functions that take iterables.

In [7]:
def mygenerator():
    yield 1
    yield 2
    yield 3
    
g = mygenerator()
sum(g)

6

In [8]:
def mygenerator():
    yield 1
    yield 2
    yield 3
    
g = mygenerator()
sorted(g)

[1, 2, 3]
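
Note that each cell above creates a fresh generator object. A sketch of why this matters, with illustrative variable names: a generator can only be consumed once, so a second function that receives the same object finds nothing left.

```python
def mygenerator():
    yield 1
    yield 2
    yield 3

g = mygenerator()
first_pass = sum(g)      # consumes every value
second_pass = sorted(g)  # the generator is already exhausted

print(first_pass)   # 6
print(second_pass)  # []
```

If the values are needed more than once, call the generator function again or store the values in a list.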

Let's have a closer look at the execution of a generator function.

In [9]:
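def countdown(num):
    print('Starting')  # runs only when the first value is requested
    while num > 0:
        yield num  # execution pauses here until next() is called again
        num -= 1

cd = countdown(4)  # nothing is printed yet: the body has not started running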

In [10]:
value = next(cd)

Starting


In [11]:
value

4

In [12]:
value = next(cd)

In [13]:
value

3
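
The cells above suggest that the function body does not run when the generator is created, and that it resumes right after the yield on each next() call. The sketch below records that order in a list instead of printing. The events list and its messages are hypothetical, added only for this illustration.

```python
events = []  # hypothetical log, used only to make the execution order visible

def countdown(num):
    events.append('Starting')
    while num > 0:
        yield num
        # This line runs only when the generator is resumed.
        events.append('resumed after yielding %d' % num)
        num -= 1

cd = countdown(2)
events.append('generator created')  # recorded before 'Starting'

first = next(cd)      # runs the body up to the first yield
second = next(cd)     # resumes after the yield and runs to the next one
remaining = list(cd)  # resumes once more, then the loop ends

for event in events:
    print(event)
```

The log shows 'generator created' before 'Starting', which confirms that calling countdown(2) only builds the generator object.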

Let's have a look at the big advantage of generators: they are very memory efficient. Because they produce values one at a time instead of storing them all, they save a lot of memory when you work with large data.

In [14]:
def firstn(n):
    nums = []
    num = 0
    while num < n:
        nums.append(num)
        num += 1
    return nums

firstn(10)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [15]:
sum(firstn(10))

45

In [16]:
def firstn_gen(n):
    num = 0
    while num<n:
        yield num
        num += 1
        
list(firstn_gen(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [17]:
sum(firstn_gen(10))

45

I didn't need to store all the numbers in a list, so I saved a lot of memory. Let's compare the sizes of the two objects.

In [18]:
import sys
print(sys.getsizeof(firstn(10)))  #returns the size of this object in bytes
print(sys.getsizeof(firstn_gen(10)))

92
56


In [19]:
print(sys.getsizeof(firstn(1000000)))
print(sys.getsizeof(firstn_gen(1000000)))

4348728
56
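
One caveat: sys.getsizeof only counts the object itself, so for a list it measures the array of references but not the integers it refers to. As a rough cross-check, the sketch below uses the standard tracemalloc module to compare peak allocations while summing. The helper peak_bytes is a hypothetical name, and the exact byte counts depend on the Python version and platform, so none are shown here.

```python
import tracemalloc

def firstn(n):
    nums = []
    num = 0
    while num < n:
        nums.append(num)
        num += 1
    return nums

def firstn_gen(n):
    num = 0
    while num < n:
        yield num
        num += 1

def peak_bytes(make_iterable, n):
    """Return the sum and the peak traced memory while computing it."""
    tracemalloc.start()
    total = sum(make_iterable(n))
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak

list_total, list_peak = peak_bytes(firstn, 100000)
gen_total, gen_peak = peak_bytes(firstn_gen, 100000)

print(list_total == gen_total)  # same result either way
print(list_peak > gen_peak)     # the list version allocates more at its peak
```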


Another advantage of the generator object is that we don't have to wait until all the elements have been generated before we start to use them.

In [20]:
def fibonacci(limit):
    # 0 1 1 2 3 5 8 13 ...
    a, b = 0, 1
    while a<limit:
        yield a
        a, b = b, a + b
        
fib = fibonacci(30)
for i in fib:
    print(i)

0
1
1
2
3
5
8
13
21
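
Because values are produced on demand, a generator does not even need an end. As a sketch, the variant below (fibonacci_forever is a name introduced here, not part of the notebook) never stops on its own, and itertools.islice takes only the first ten values.

```python
from itertools import islice

def fibonacci_forever():
    # Same recurrence as fibonacci(limit), but without a stopping condition.
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops requesting values after ten items, so the loop never runs away.
first_ten = list(islice(fibonacci_forever(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Building the same sequence as a list first would be impossible here, since the list would never be finished.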


Let's have a look at generator expressions. Generator expressions are written the same way as list comprehensions but with parentheses instead of square brackets.

In [22]:
mygenerator = (i for i in range(10) if i%2 == 0)
list(mygenerator)

[0, 2, 4, 6, 8]

In [24]:
mylist = [i for i in range(10)]
mylist

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [26]:
mygenerator = (i for i in range(1000000) if i%2 == 0)
mylist = [i for i in range(1000000)]
print(sys.getsizeof(mygenerator))
print(sys.getsizeof(mylist))

56
4348728
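
Generator expressions are often passed straight to a function that takes an iterable. A small sketch with illustrative variable names: when the expression is the only argument, the extra pair of parentheses can be omitted.

```python
# Sum of squares without building an intermediate list.
squares_total = sum(i * i for i in range(10))
print(squares_total)  # 285

# any() stops requesting values as soon as one condition is true.
has_large = any(i > 5 for i in range(10))
print(has_large)  # True
```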
