The yield statement is used to define generators. It takes the place of a function's return and provides a result to the caller without destroying local variables. Unlike a function, which starts with a new set of variables on every call, a generator resumes execution where it left off.
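As a minimal sketch of that difference, compare a plain function with a generator function. The names `add_to_total` and `running_total` are made up for illustration. The function recreates `total` on every call, while the generator keeps it alive between values:

```python
def add_to_total(value):
    # A regular function: `total` is created anew on every call.
    total = 0
    total += value
    return total


def running_total(values):
    # A generator function: `total` survives between yields.
    total = 0
    for value in values:
        total += value
        yield total


print([add_to_total(v) for v in [1, 2, 3]])  # [1, 2, 3]
print(list(running_total([1, 2, 3])))        # [1, 3, 6]
```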

### About Python Generators

Since the yield keyword is only used with generators, it makes sense to recall the concept of generators first.

The idea of generators is to calculate a series of results one by one, on demand (on the fly). In the simplest case, a generator can be used like a list in which each element is calculated lazily. Let's compare a list and a generator that are built and iterated in the same way. The list holds powers of two, and the generator produces doubled numbers:

In [3]:
the_list = [2**x for x in range(5)]
print(type(the_list))
for element in the_list:
    print(element)

<class 'list'>
1
2
4
8
16


In [4]:
len(the_list)

5

In [8]:
# As easy as list comprehensions, but with '()' instead of '[]':
the_generator = (x+x for x in range(3))
print(type(the_generator))

for element in the_generator:
    print(element)

len(the_generator)

<class 'generator'>
0
2
4


TypeError: object of type 'generator' has no len()

Iterating over the list and the generator looks exactly the same. However, although the generator is iterable, it is not a collection, and thus has no length. Collections (lists, tuples, sets, etc.) keep all values in memory, and we can access them whenever needed. A generator calculates its values on the fly and forgets them, so it has no overview of its own result set.
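One consequence is worth seeing in code: because a generator forgets what it has produced, it can only be iterated once. The following small sketch (variable names are illustrative) also shows that a length only becomes available after the values are materialized into a list:

```python
doubles = (x + x for x in range(3))

first_pass = list(doubles)   # consumes the generator
second_pass = list(doubles)  # nothing is left to produce

print(first_pass)       # [0, 2, 4]
print(second_pass)      # []
print(len(first_pass))  # 3
```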

Generators are especially useful for memory-intensive tasks, where there is no need to keep all of the elements of a memory-heavy list accessible at the same time. Calculating a series of values one-by-one can also be useful in situations where the complete result is never needed, yielding intermediate results to the caller until some requirement is satisfied and further processing stops.
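To illustrate the second point, here is a sketch of an endless generator whose caller stops as soon as it has what it needs. The function names `powers_of_two` and `first_power_above`, and the limit of 1000, are assumptions chosen for this example:

```python
def powers_of_two():
    # Endless series: 1, 2, 4, 8, ... computed one value at a time.
    value = 1
    while True:
        yield value
        value *= 2


def first_power_above(limit):
    # Stop consuming as soon as the requirement is satisfied.
    for power in powers_of_two():
        if power > limit:
            return power


print(first_power_above(1000))  # 1024
```

The complete series is never built. Only the values up to the first match are calculated.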

### Using the Python "yield" keyword

A good example is a search task, where there is typically no need to wait for all results to be found. A user performing a file-system search would be happier to receive results on the fly than to wait for the search engine to go through every single file and only afterwards return results. Are there any people who really navigate through all Google search results until the last page?

Since a search like this is not easily expressed as a list comprehension, we are going to define a generator using a function with the yield keyword. The yield statement goes where the generator hands an intermediate result back to the caller; the generator then sleeps until the next value is requested. Let's define a generator that searches a huge text file for some keyword, line by line.

In [9]:
def search(keyword, filename):
    print('generator started')
    # The with block closes the file when the generator finishes or is closed
    with open(filename, 'r') as f:
        # Loop through the file line by line
        for line in f:
            if keyword in line:
                # Keyword found: hand the line to the caller and pause here
                yield line

Now, assuming that my "directory.txt" file contains a huge list of names and phone numbers, let's find someone with "Python" in the name:

In [13]:
the_generator = search('Python', 'directory.txt')
# Nothing happened

When we call the search function, the code in its body does not run. The generator function only returns a generator object, acting as a constructor:

In [14]:
print(type(search))
type(the_generator)

<class 'function'>


generator

This is a bit tricky, since everything below def search(keyword, filename): would normally execute when the function is called, but that is not the case for generators. In fact, there was even a long discussion in which "gen" and other keywords were suggested for defining a generator. However, Guido decided to stick with "def", and that's it. You can read the motivation in PEP 255.

To make the newly created generator calculate something, we need to access it via the iterator protocol, i.e. call the built-in next() on it:

In [15]:
print(next(the_generator)) # we need to have that directory file.

generator started


FileNotFoundError: [Errno 2] No such file or directory: 'directory.txt'
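Since "directory.txt" is not available here, the following sketch first writes a small temporary file so that the example can run end to end. The file contents and the names in it are invented for illustration, and `search` is a version of the function above written with a `with` block. It shows that the body starts only on the first `next()` call, and that `StopIteration` is raised once there are no more matches:

```python
import os
import tempfile


def search(keyword, filename):
    print('generator started')
    with open(filename, 'r') as f:
        for line in f:
            if keyword in line:
                yield line


# Hypothetical directory contents, made up for this example.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('Monty Python 555-0100\n')
    tmp.write('Guido van Rossum 555-0101\n')
    tmp.write('Python Pete 555-0102\n')
    path = tmp.name

the_generator = search('Python', path)  # the body has not run yet
first = next(the_generator)             # prints 'generator started'
second = next(the_generator)            # resumes after the first yield
print(first, end='')
print(second, end='')

try:
    next(the_generator)
except StopIteration:
    # Raised when the generator function runs to its end.
    print('no more results')

os.remove(path)
```

In everyday code a for loop handles `StopIteration` automatically, so explicit `next()` calls are mostly useful when only the first few results are needed.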

Ref: https://www.pythoncentral.io/python-generators-and-yield-keyword/