
<img src='images/gdd-logo.png' width='300px' align='right' style="padding: 15px">

# <font color='#1EB0E0'>Generators in Python</font>

Simply put, a generator function is a function that returns an iterator, an object we can iterate over one value at a time.

In this notebook we will cover:

- [The difference between Generator and Normal functions](#diff)
- [Build a generator](#build)
- [Python generators with a loop](#loop)
    - [<mark>Exercise: Build your own generator</mark>](#ex-build)
- [Python generator expression](#expression)
    - [<mark>Exercise: Use your generators</mark>](#ex-use)
- [Use of generators](#use)
- [Real life use case](#real)

---
<a id='diff'></a>
## Differences between Generator function and Normal function

Here is how a generator function differs from a normal function.

A generator function contains one or more <font color='green'>**yield**</font> statements.
When called, it returns a generator object (an iterator) but does not start executing the function body immediately.
Methods like `__iter__()` and `__next__()` are implemented automatically, so we can step through the items using `next()`.

Once the function yields, it is paused and control is transferred back to the caller.
Local variables and their states are remembered between successive calls.
Finally, when the function terminates, `StopIteration` is raised automatically on further calls.
Here is an example to illustrate all of the points stated above. We have a generator function named `my_gen()` with several yield statements.

---
<a id='build'></a>
## Build a generator

In [None]:
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n

An interactive run in the interpreter is given below. 

It returns an object but does not start execution immediately.

In [None]:
a = my_gen()

We can iterate through the items using `next()`.

In [None]:
next(a)

In [None]:
x = my_gen()
y = my_gen()

In [None]:
next(x)
next(y)
next(x)
next(x)
next(y)

Once the function yields, it is paused and control is transferred back to the caller.

Local variables and their states are remembered between successive calls.

In [None]:
next(a)

In [None]:
next(a)

Finally, when the function terminates, StopIteration is raised automatically on further calls.

In [None]:
next(a)

One interesting thing to note in the above example is that the value of variable n is remembered between each call.

Unlike normal functions, the local variables are not destroyed when the function yields. Furthermore, the generator object can be iterated only once.

To **restart the process** we need to create another generator object using something like `a = my_gen()`.
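
As a small sketch of this single-use behaviour (the function name `count_to_three` is made up for this illustration), a second pass over the same generator object yields nothing, while calling the function again gives a fresh generator:

```python
def count_to_three():
    # A tiny generator used only for this illustration
    yield 1
    yield 2
    yield 3

gen = count_to_three()
first_pass = list(gen)    # consumes every item
second_pass = list(gen)   # the generator is already exhausted

print(first_pass)
print(second_pass)

# Calling the function again creates a new, independent generator object
fresh_pass = list(count_to_three())
print(fresh_pass)
```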

One final thing to note is that we can use generators with **for loops** directly.

This is because a for loop obtains an iterator from the object with `iter()` and then repeatedly calls `next()` on it. The loop ends automatically when `StopIteration` is raised.

In [None]:
a = my_gen()
next(a)

In [None]:
a = my_gen()

# Iterate over the generator object we just created
for item in a:
    print(item)
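
To make the mechanics concrete, here is a rough sketch of what a for loop does behind the scenes. It uses a shortened copy of `my_gen()` without the `print` calls; the names `iterator` and `collected` are just for this illustration.

```python
def my_gen():
    # Shortened copy of the generator above, without the print calls
    n = 1
    yield n
    n += 1
    yield n
    n += 1
    yield n

iterator = iter(my_gen())   # a for loop first calls iter() on the object
collected = []
while True:
    try:
        item = next(iterator)   # then it calls next() repeatedly
    except StopIteration:
        break                   # StopIteration ends the loop quietly
    collected.append(item)

print(collected)
```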

---
<a id='loop'></a>
## Python Generators with a Loop

The above example has little practical use; we studied it only to see what happens in the background.

Normally, generator functions are implemented with a loop having a suitable terminating condition.

Let's take an example of a generator that yields the items of a sequence in reverse order.

In [None]:
def rev_iter(my_iter):
    length = len(my_iter)
    for i in range(length - 1, -1, -1):
        yield my_iter[i]

For comparison, here is a plain for loop over a list reversed with slicing:

In [None]:
for char in [1, 2, 3][::-1]:
    print(char)

In [None]:
for char in rev_iter([1, 2, 3, 4, 5, 6, 7]):
    print(char)

In this example, we have used the `range()` function to get the index in reverse order using the for loop.

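**Note:** This generator function not only works with lists, but also with other sequences that support `len()` and indexing, such as strings and tuples. It does not work with arbitrary iterables (for example, other generators), because they have no length.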
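
A quick check of `rev_iter` with a string and a tuple, plus a generator input, which has no length and therefore fails. This is only a sketch; the variable names are made up for the illustration.

```python
def rev_iter(my_iter):
    # Same generator as above: needs len() and indexing
    length = len(my_iter)
    for i in range(length - 1, -1, -1):
        yield my_iter[i]

reversed_string = list(rev_iter("hello"))
reversed_tuple = list(rev_iter((1, 2, 3)))
print(reversed_string)
print(reversed_tuple)

# A generator has no len(), so rev_iter fails once iteration starts
try:
    list(rev_iter(x for x in range(3)))
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)
```

For sequences, the built-in `reversed()` gives the same lazy behaviour without writing a generator by hand.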

---
<a id='ex-build'></a>
## <mark>Exercises: Create your own generators:</mark>

★ Create a generator for the 6 times table.

★★ Create a generator where you can select a letter of the alphabet.

★★★ Create a generator for the [Fibonacci sequence](https://www.onlinecasinolegends.nl/wp-content/uploads/2020/07/fibonacci-systeem.png)

**Answers**

In [None]:
# %load answers/ex-gen-build1.py

In [None]:
# %load answers/ex-gen-build2.py

In [None]:
# %load answers/ex-gen-build3.py

---
<a id='expression'></a>
## Python Generator Expression

Simple generators can be easily created on the fly using generator expressions. It makes building generators easy.

Similar to how lambda expressions create anonymous functions, generator expressions create generators without a named generator function.

The syntax of a generator expression is similar to that of a list comprehension in Python, but the square brackets are replaced with round parentheses.

The major difference between a list comprehension and a generator expression is that a list comprehension produces the entire list at once, while a generator expression produces one item at a time.

Generator expressions are evaluated lazily (producing items only when asked for). For this reason, a generator expression is much more memory efficient than an equivalent list comprehension.
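
One rough way to see the difference is `sys.getsizeof`. Note that for a list it reports only the list object itself, not the integers it refers to, so the real gap is even larger. The size `n` and the variable names are arbitrary choices for this sketch.

```python
import sys

n = 100_000
squares_list = [x**2 for x in range(n)]   # all items are built immediately
squares_gen = (x**2 for x in range(n))    # nothing is computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print("list:", list_size, "bytes")
print("generator:", gen_size, "bytes")
```

The generator object stays the same small size no matter how large `n` is, because it only stores where it is in the computation.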

In [None]:
# list of square numbers using a comprehension
list_ = [x**2 for x in range(1,5)]

# same thing can be done using a generator expression
generator = (x**2 for x in range(1,5))

print(list_)
print(generator)

We can see above that the generator expression did not produce the required result immediately. Instead, it returned a generator object, which produces items only on demand.

Here is how we can start getting items from the generator:

In [None]:
a = (x**2 for x in range(1,5))
list(a)

In [None]:
a = (x**2 for x in range(1,5))

print(next(a))
print(next(a))
print(next(a))
print(next(a))

next(a)

In [None]:
a = (x**2 for x in range(1,5))
print(next(a))
print(list(a))
print(next(a))

Generator expressions can be used as function arguments. When the generator expression is the only argument, the surrounding round parentheses can be dropped.

In [None]:
sum(x**2 for x in range(1,5))

max(x**2 for x in range(1,5))
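
The parentheses can only be dropped when the generator expression is the sole argument. As a short sketch, here `sum` gets a start value and `max` gets a `default`, so the expression needs its own parentheses:

```python
# Sole argument: no extra parentheses needed
total = sum(x**2 for x in range(1, 5))

# With a second argument, the generator expression must be parenthesized
total_with_start = sum((x**2 for x in range(1, 5)), 10)
largest = max((x**2 for x in range(1, 5)), default=0)

print(total)
print(total_with_start)
print(largest)
```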

---
<a id='ex-use'></a>
## <mark>Exercise: Use your generators</mark>

★ Sum up the six times table (up to $12 \times 6$)

★★ Print all the vowels and their place in the alphabet.

In [None]:
def get_alphabet_letter(alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'):
    for ind, letter in enumerate(alphabet):
        yield (letter, ind+1)

[x for x in get_alphabet_letter() if x[0] in 'AEIOU']

★★★ Multiply the first 10 numbers in the Fibonacci sequence

**Answers**

In [None]:
%load answers/ex-gen-use1.py

In [None]:
%load answers/ex-gen-use2.py

In [None]:
%load answers/ex-gen-use3.py

---
<a id='use'></a>
## Use of Python Generators

There are several reasons why generators are a powerful tool.

**1. Easy to Implement**

Generators can be implemented clearly and concisely compared to their iterator class counterparts. The following example implements a sequence of powers of 2 using an iterator class.
```python
class PowTwo:
    def __init__(self, max=0):
        self.n = 0
        self.max = max

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > self.max:
            raise StopIteration

        result = 2 ** self.n
        self.n += 1
        return result
```
The class above is lengthy and harder to follow. Now, let's do the same using a generator function.

In [None]:
def pow_two_gen(max=0):
    n = 0
    # Use <= so the output matches PowTwo, which yields 2**0 up to 2**max
    while n <= max:
        yield 2 ** n
        n += 1

Since generators keep track of details automatically, the implementation was concise and much cleaner.

**2. Memory Efficient**

A normal function that returns a sequence builds the entire sequence in memory before returning it. This is wasteful when the number of items is very large.

Generator implementation of such sequences is memory friendly and is preferred since it only produces one item at a time.

**3. Represent Infinite Stream**

Generators are an excellent way to represent an infinite stream of data. An infinite stream cannot be stored in memory, but because a generator produces only one item at a time, it can represent one.

The following generator function can generate all the non-negative even numbers (at least in theory).

In [None]:
def all_even():
    n = 0
    while True:
        yield n
        n += 2
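
Because this generator never stops, looping over it directly would run forever. One way to take a finite slice is `itertools.islice`, sketched here with the same `all_even` function:

```python
from itertools import islice

def all_even():
    # Same infinite generator as above
    n = 0
    while True:
        yield n
        n += 2

# Take only the first five items from the infinite stream
first_five = list(islice(all_even(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```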

**4. Pipelining Generators**

Multiple generators can be used to pipeline a series of operations. This is best illustrated using an example.

Suppose we have a generator that produces the numbers in the Fibonacci series. And we have another generator for squaring numbers.

If we want to find out the sum of squares of numbers in the Fibonacci series, we can do it in the following way by pipelining the output of generator functions together.

In [None]:
def fibonacci_numbers(nums):
    x, y = 0, 1
    for _ in range(nums):
        x, y = y, x+y
        yield x

def square(nums):
    for num in nums:
        yield num**2

print(sum(square(fibonacci_numbers(10))))

This pipelining is efficient and easy to read (and yes, a lot cooler!).
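
Extra stages can be slotted into the pipeline without touching the others. As a sketch, the hypothetical `only_even` stage below (not part of the original example) filters the Fibonacci numbers before they are squared:

```python
def fibonacci_numbers(nums):
    x, y = 0, 1
    for _ in range(nums):
        x, y = y, x + y
        yield x

def only_even(nums):
    # Hypothetical extra stage: pass through even numbers only
    for num in nums:
        if num % 2 == 0:
            yield num

def square(nums):
    for num in nums:
        yield num ** 2

# Of the first 10 Fibonacci numbers, only 2, 8 and 34 are even
result = sum(square(only_even(fibonacci_numbers(10))))
print(result)  # 1224
```

Each stage pulls one item at a time from the stage before it, so no intermediate list is ever built.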

---
<a id='real'></a>

## Real life use case

Let's say we have some files, in this case three different books. Imagine that these files are very large, yet we only need a certain portion of each one. We wouldn't want to load all the files into memory at once, as we might not have enough space. Instead, we can use a generator to go through the files one at a time and do what we need with each.

For this we're going to use the `glob.glob` function to list all the files with extension `.txt` in the `data` folder:

In [None]:
import glob

glob.glob('data/*.txt')

In [None]:
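def read_file(file_path):
    for path in glob.glob(file_path):
        # Open each file only when it is requested, and close it right after reading
        with open(path) as f:
            content = f.read()
        yield content

file = read_file('data/*.txt')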

Now each call to `next()` reads the next book, and we can take its first line (the title):

In [None]:
next(file).split('\n')[0]
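
If only the title is needed, the whole file does not have to be read at all. The sketch below uses a hypothetical helper, `first_lines` (not part of the original notebook), that reads just the first line of each matching file. It assumes the files are UTF-8 text.

```python
import glob

def first_lines(pattern):
    """Yield the first line of every file matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            # readline() reads only up to the first newline, not the whole file
            title = handle.readline().rstrip("\n")
        yield title

# Prints nothing if no files match the pattern
for title in first_lines("data/*.txt"):
    print(title)
```

Each file is opened only when its title is requested and closed again before the next one is touched.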