## Generators vs regular functions (or yield vs return) - basics

From python.org (https://docs.python.org/3/howto/functional.html):

__Generators__ are a special class of __functions__ that simplify the task of writing iterators.

__Regular__ functions compute a value and return it, but __generators__ return an __iterator__ that returns a stream of values.

An __iterator__ is an object representing a stream of data; this object returns the data one element at a time. 
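
The iterator protocol described above can be tried out on an ordinary list. This is a minimal sketch using only the built-in `iter()` and `next()` functions; the variable names are chosen for illustration.

```python
# A list is iterable; iter() asks it for an iterator over its elements.
numbers = [10, 20, 30]
it = iter(numbers)

# Each call to next() hands back one element.
first = next(it)
second = next(it)
third = next(it)
print(first, second, third)  # 10 20 30

# The iterator is now exhausted, but the list itself is unchanged.
remaining = list(it)
print(remaining)  # []
```

A generator object follows the same protocol, which is why `next()` and `for` loops work on it.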

Generators are also supported by other languages, for example C#: 
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/yield

### Generator basics

In [9]:
def example_generator():
    # use yield instead of return to turn a function into a generator.
    yield 1
    yield 2
    yield 30


In [2]:
# calling a generator function returns a generator object
g = example_generator()
print(g)

<generator object example_generator at 0x000002822AFC87C8>


In [3]:
# read out the first value
n = next(g)
print(n)

1


In [4]:
# continue
n = next(g)
print(n) # 2
n = next(g)
print(n) # 30


2
30


In [5]:
# what will happen when there are no more values?
n = next(g) # ?
print(n)

StopIteration: 
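
When calling `next()` by hand, the `StopIteration` exception shown above usually has to be dealt with. Two common options are sketched below: catching the exception, or passing a default value as the second argument to `next()`. The generator is repeated so the cell runs on its own; the names `exhausted` and `fallback` are illustrative.

```python
def example_generator():
    yield 1
    yield 2
    yield 30


g = example_generator()
values = [next(g), next(g), next(g)]  # consume all three values

# Option 1: catch the exception explicitly.
try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True

# Option 2: give next() a default that is returned instead of raising.
fallback = next(g, None)
print(fallback)  # None
```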

In [41]:
# normal use is in a for loop.
# a for loop iterates over the values that the generator yields
# and stops cleanly when the generator is exhausted.
for value in example_generator():
    print(value)

1
2
30
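
One point that is easy to miss: a generator object can only be iterated once. The sketch below repeats the example generator and shows that a second pass over the same object yields nothing, while calling the generator function again gives a fresh object.

```python
def example_generator():
    yield 1
    yield 2
    yield 30


g = example_generator()
first_pass = list(g)   # consumes the generator
second_pass = list(g)  # nothing left to yield
print(first_pass)      # [1, 2, 30]
print(second_pass)     # []

# Calling the function again creates a new, unconsumed generator.
fresh_pass = list(example_generator())
print(fresh_pass)      # [1, 2, 30]
```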


## Example - counting word lengths

In this example we want to write a function that computes the length of each word in a sentence and returns the lengths to the caller. The words are separated by spaces.

In [1]:
# Define the example sentence
words = 'this is a sentence to test the word count functions'

### Count word lengths function with return

In [6]:
def count_word_lengths(sentence):
    # define a list for storing the length of each word
    word_lengths = list()
    
    for word in sentence.split():
        # Add each word length to the list
        word_lengths.append(len(word))
    return word_lengths


In [7]:
print(count_word_lengths(words))


[4, 2, 1, 8, 2, 4, 3, 4, 5, 9]


### Count word lengths function with a generator function

In [11]:
def count_word_lengths_yield(sentence):
    for word in sentence.split():
        # no intermediate storage needed. 
        # saves two statements. this also makes the intent of the function clearer.
        yield len(word)


When calling the function you get a generator instead of a list.

In [12]:
print(count_word_lengths_yield(words))

<generator object count_word_lengths_yield at 0x00000216E80337C8>


It is still possible to convert this to a list using the list() function.

In [13]:
print(list(count_word_lengths_yield(words)))

[4, 2, 1, 8, 2, 4, 3, 4, 5, 9]
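
For a generator this small, Python also offers a generator expression, which builds the same kind of object without a separate function definition. This is an alternative sketch, not part of the original examples; the sentence is repeated so the cell is self-contained.

```python
words = 'this is a sentence to test the word count functions'

# Parentheses (instead of square brackets) make this a generator, not a list.
lengths = (len(word) for word in words.split())
print(lengths)  # <generator object <genexpr> at 0x...>

length_list = list(lengths)
print(length_list)  # [4, 2, 1, 8, 2, 4, 3, 4, 5, 9]
```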


### Using map to count word lengths

A more functional style using the built-in 'map' function is also useful in this case. Map applies a function to each element of a stream. Here the 'len' function is used.

Note: Pandas has a 'map' method as well that can be called on a Pandas Series object (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html).

In [16]:
# this gives the same result as the previous examples.
print(list(map(len, words.split())))

[4, 2, 1, 8, 2, 4, 3, 4, 5, 9]
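
In Python 3, `map` returns an iterator and, like a generator, only does work when a value is requested. The sketch below makes that visible with a hypothetical helper, `measured_len`, that records each call. Both the helper and the short test sentence are made up for illustration.

```python
calls = []


def measured_len(word):
    # Hypothetical helper: remember which words have been processed.
    calls.append(word)
    return len(word)


lazy = map(measured_len, 'a bb ccc'.split())
calls_before = len(calls)  # nothing has been computed yet
print(calls_before)        # 0

first = next(lazy)         # computes exactly one length
calls_after = len(calls)
print(first, calls_after)  # 1 1

rest = list(lazy)          # computes the remaining lengths
print(rest)                # [2, 3]
```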


## Usage of word lengths
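As an example, count the large words in a sentence by iterating over the word lengths. The same loop works whether the lengths come from a list or from a generator.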

In [20]:
def count_large_words(word_counter, sentence, max_length):
    # count the words that are longer than max_length characters.
    cnt = 0
    
    # you can use the 'for .. in' for iterating over both lists and generators.
    for word_length in word_counter(sentence):
        if word_length > max_length:
            cnt += 1
    return cnt

In [21]:
# use the return version to count the word lengths

cnt = count_large_words(count_word_lengths, words, 4)
print(cnt)

3


In [38]:
# using the yield version gives the same result, but the generator
# needs less storage because no intermediate list is built.

cnt = count_large_words(count_word_lengths_yield, words, 4)
print(cnt)



3


In [24]:
# using a one-liner :)
# in my opinion this is less readable in Python
cnt = len(list(filter(lambda x: x > 4, map(len, words.split()))))
print(cnt)

3


In [7]:
# using functools and reduce.
import functools

cnt = functools.reduce(lambda x, y: x + 1 if len(y) > 4 else x, words.split(), 0)
print(cnt)

3
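
A further variant, offered here as a possible middle ground between the loop and the one-liners above, combines `sum` with a generator expression. It builds no intermediate list and needs no lambda. The sentence is repeated so the cell runs on its own.

```python
words = 'this is a sentence to test the word count functions'

# Yield a 1 for every word longer than 4 characters and add them up.
cnt = sum(1 for word in words.split() if len(word) > 4)
print(cnt)  # 3
```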


### Infinite sequences

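A generator can also be used to simulate an infinite sequence of values, for example prime numbers or random numbers. Because values are produced only on request, the sequence never has to be stored in full. Below is an example that generates random x,y coordinates.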

In [27]:
import random


def rnd_coordinates(size_x, size_y):
    random.seed()
    while True:
        yield (random.randint(0, size_x - 1), random.randint(0, size_y - 1))


In [30]:
import itertools

# get 10 random coordinates.
for coord in itertools.islice(rnd_coordinates(50, 50), 10):
    print(coord)

(9, 9)
(20, 33)
(21, 7)
(40, 48)
(22, 5)
(10, 45)
(42, 2)
(32, 49)
(1, 30)
(33, 46)
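
The prime numbers mentioned above can be generated in the same way. The sketch below uses simple trial division against the primes found so far. It is meant to illustrate an infinite generator, not to be an efficient prime sieve, and the function name `primes` is chosen for illustration.

```python
import itertools


def primes():
    # Yield prime numbers forever, using trial division (simple, not fast).
    found = []
    for candidate in itertools.count(2):
        # A candidate is prime if no earlier prime divides it.
        if all(candidate % p != 0 for p in found):
            found.append(candidate)
            yield candidate


# islice takes a finite slice out of the infinite sequence.
first_ten = list(itertools.islice(primes(), 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

As with the random coordinates, `itertools.islice` is what keeps the loop finite. Iterating over `primes()` directly would never stop.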
