Generators simplify the creation of iterators. A generator is a function that produces a sequence of results instead of a single value.
Each time the yield statement is executed, the function generates a new value.

In [3]:
def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1
y = yrange(3)
print(y)
print(next(y))
print(next(y))
print(next(y))
print(next(y))

<generator object yrange at 0x04398AB0>
0
1
2


StopIteration: 

So a generator is also an iterator. You don’t have to worry about the iterator protocol.
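
To see what the generator saves you from writing, here is a minimal sketch of a hand-written iterator class that behaves like yrange. The class name YRange is only illustrative.

```python
class YRange:
    """Hand-written iterator that behaves like the yrange generator."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.i < self.n:
            value = self.i
            self.i += 1
            return value
        # Signal exhaustion, just as a finished generator does.
        raise StopIteration


def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1


print(list(YRange(3)))  # [0, 1, 2]
print(list(yrange(3)))  # [0, 1, 2]
```

The generator version keeps its state in ordinary local variables, so the bookkeeping in __init__ and __next__ is not needed.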

The word “generator” is confusingly used to mean both the function that generates and what it generates. In this chapter, I’ll use the word “generator” to mean the generated object and “generator function” to mean the function that generates it.
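
The distinction can be checked with the standard inspect module. This is a small sketch; countdown is a made-up generator function used only for illustration.

```python
import inspect


def countdown(n):  # hypothetical generator function
    while n > 0:
        yield n
        n -= 1


gen = countdown(3)  # calling it produces a generator object

print(inspect.isgeneratorfunction(countdown))  # True
print(inspect.isgenerator(gen))                # True
print(inspect.isgeneratorfunction(gen))        # False
print(iter(gen) is gen)                        # True: a generator is its own iterator
```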

Can you work out how it works internally?

When a generator function is called, it returns a generator object without beginning execution of the function body. When next is called for the first time, the function runs until it reaches a yield statement, and the yielded value is returned by that call to next. Each later call resumes execution right after that yield and runs until the next one. When the function body finishes, StopIteration is raised.
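
These stages can be observed with inspect.getgeneratorstate from the standard library. A minimal sketch, assuming a made-up generator function one_two:

```python
import inspect


def one_two():  # hypothetical generator function
    yield 1
    yield 2


g = one_two()
states = [inspect.getgeneratorstate(g)]  # created, body not started yet

next(g)
states.append(inspect.getgeneratorstate(g))  # paused at the first yield

next(g)
try:
    next(g)  # the body finishes, so StopIteration is raised
except StopIteration:
    pass
states.append(inspect.getgeneratorstate(g))  # finished

print(states)  # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```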

The following example demonstrates the interplay between yield and calls to the __next__ method of the generator object.

In [5]:
def foo():
    print("begin")
    for i in range(3):
        print("before yield", i)
        yield i
        print("after yield", i)
        print("end")
f = foo()
print(f)
print(next(f))
print(next(f))
print(next(f))
print(next(f))

<generator object foo at 0x047152B0>
begin
before yield 0
0
after yield 0
end
before yield 1
1
after yield 1
end
before yield 2
2
after yield 2
end


StopIteration: 

In [6]:
def integers():
    """Infinite sequence of integers."""
    i = 1
    while True:
        yield i
        i = i + 1

def squares():
    for i in integers():
        yield i * i

def take(n, seq):
    """Returns first n values from the given sequence."""
    seq = iter(seq)
    result = []
    try:
        for i in range(n):
            result.append(next(seq))
    except StopIteration:
        pass
    return result

print(take(5, squares())) # prints [1, 4, 9, 16, 25]

[1, 4, 9, 16, 25]
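
The standard library already offers a lazy counterpart to take: itertools.islice. The sketch below repeats integers and squares so that it runs on its own.

```python
import itertools


def integers():
    """Infinite sequence of integers."""
    i = 1
    while True:
        yield i
        i += 1


def squares():
    for i in integers():
        yield i * i


# islice stops after 5 items, so the infinite generator is never exhausted.
first_five = list(itertools.islice(squares(), 5))
print(first_five)  # [1, 4, 9, 16, 25]
```

Unlike take, islice returns an iterator rather than a list, so it can itself be used as a stage in a longer chain of generators.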


In [7]:
def readfiles(filenames):
    for f in filenames:
        # The with block closes each file once its lines have been read.
        with open(f) as fh:
            for line in fh:
                yield line

def grep(pattern, lines):
    return (line for line in lines if pattern in line)

def printlines(lines):
    for line in lines:
        print(line, end="")

def main(pattern, filenames):
    lines = readfiles(filenames)
    lines = grep(pattern, lines)
    printlines(lines)

The code is much simpler now with each function doing one small thing. We can move all these functions into a separate module and reuse it in other programs.
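
Because each stage only consumes an iterable of lines and produces another, a new stage can be slotted in without touching the others. A minimal sketch, assuming a hypothetical numbered stage and an in-memory list standing in for real files:

```python
def grep(pattern, lines):
    return (line for line in lines if pattern in line)


def numbered(lines):  # hypothetical extra stage
    """Prefix each line with its position in the incoming stream."""
    for number, line in enumerate(lines, start=1):
        yield f"{number}: {line}"


# Any iterable of strings works as input, not just open files.
sample = ["def foo():\n", "    pass\n", "def bar():\n"]
matches = list(numbered(grep("def", sample)))

for line in matches:
    print(line, end="")
```

Note that numbered counts the lines it receives, so placing it after grep numbers the matches rather than the original lines.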

Problem 2: Write a program that takes one or more filenames as arguments and prints all the lines which are longer than 40 characters.

Problem 3: Write a function findfiles that recursively descends the directory tree for the specified directory and generates paths of all the files in the tree.

Problem 4: Write a function to compute the number of Python files (.py extension) in a specified directory recursively.

Problem 5: Write a function to compute the total number of lines of code in all Python files in the specified directory recursively.

Problem 6: Write a function to compute the total number of lines of code, ignoring empty and comment lines, in all Python files in the specified directory recursively.

Problem 7: Write a program split.py that takes an integer n and a filename as command line arguments and splits the file into multiple small files, each having n lines.

## Itertools

The itertools module in the standard library provides a lot of interesting tools for working with iterators.

Let's look at some of the interesting functions.

chain – chains multiple iterators together.

In [12]:
import itertools
it1 = iter([1, 2, 3])
it2 = iter([4, 5, 6])
print(list(itertools.chain(it1, it2)))


[1, 2, 3, 4, 5, 6]
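
To connect chain back to generators, here is a sketch of how a similar function could be written as a generator function. The name my_chain is illustrative, and this is not the actual implementation in itertools.

```python
import itertools


def my_chain(*iterables):
    """Yield every item of each iterable in turn."""
    for iterable in iterables:
        # yield from delegates to the inner iterable until it is exhausted.
        yield from iterable


result = list(my_chain([1, 2, 3], iter([4, 5, 6])))
print(result)  # [1, 2, 3, 4, 5, 6]
```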


zip_longest – zips iterables together, filling in None once the shorter ones run out.

In [15]:
import itertools
for x, y in itertools.zip_longest(["a", "b", "c"], [1, 2, 3, 4]):
    print(x, y)

a 1
b 2
c 3
None 4
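
The filler used for the shorter input can be changed with the fillvalue keyword argument. A short sketch that also contrasts it with the built-in zip, which stops at the shortest input:

```python
import itertools

letters = ["a", "b", "c"]
numbers = [1, 2, 3, 4]

# Pad the shorter input with "-" instead of the default None.
padded = list(itertools.zip_longest(letters, numbers, fillvalue="-"))
print(padded)  # [('a', 1), ('b', 2), ('c', 3), ('-', 4)]

# The built-in zip drops the unmatched 4.
truncated = list(zip(letters, numbers))
print(truncated)  # [('a', 1), ('b', 2), ('c', 3)]
```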


Problem 8: Write a function peep that takes an iterator as argument and returns the first element and an equivalent iterator.
it = iter(range(5))
x, it1 = peep(it)
print(x, list(it1))

Problem 9: The built-in function enumerate takes an iterable and returns an iterator over pairs (index, value) for each value in the source.
>>> list(enumerate(["a", "b", "c"]))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> for i, c in enumerate(["a", "b", "c"]):
...     print(i, c)
...
0 a
1 b
2 c
Write a function my_enumerate that works like enumerate.

Problem 10: Implement a function izip that works like the built-in zip, which is lazy in Python 3 (it was itertools.izip in Python 2).



http://www.dabeaz.com/generators-uk/