# Lecture 7: Iterators and Generators

> Iterators

> Generators

> The Itertools Module

## Iterators

Iterators are objects that can be iterated upon.

An object is called iterable if we can get an iterator from it. Most built-in containers in Python, such as list, tuple, and string, are iterables.

A Python iterator object must implement two special methods, __iter__() and __next__(), collectively called the iterator protocol.
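
The difference between an iterable and an iterator can be checked directly. The following is a minimal sketch using a list; the variable names are chosen only for illustration.

```python
numbers = [4, 7, 0, 3]

# A list is iterable: it has __iter__() but no __next__()
has_iter = hasattr(numbers, "__iter__")
has_next = hasattr(numbers, "__next__")
print(has_iter, has_next)

# iter() returns an iterator, which implements both methods
it = iter(numbers)
print(hasattr(it, "__iter__"), hasattr(it, "__next__"))

# Calling iter() on an iterator returns the same object
print(iter(it) is it)
```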

### Iterating Through an Iterator

We use the next() function to manually iterate through all the items of an iterator.

In [None]:
# define a list
my_list = [4, 7, 0, 3]

# get an iterator using iter()
my_iter = iter(my_list)

# iterate through it using next()
print(next(my_iter))
print(next(my_iter))

print(my_iter.__next__())
print(my_iter.__next__())

In [None]:
# When we reach the end and there is no more data to be returned, it will raise the StopIteration Exception

print(next(my_iter))

In [None]:
"""
A more elegant way of automatically iterating is by using the for loop.
Using this, we can iterate over any object that can return an iterator (for example list, string, file etc.)
"""

for element in my_list:
    print(element)

In [None]:
"""
As we see in the above example, the for loop was able to iterate automatically through the list.
"""

# for element in iterable:
#     # do something with element


# Is actually implemented as:
# create an iterator object from that iterable
iterable = [4, 7, 0, 3]

iter_obj = iter(iterable)
while True:
    try:
        element = next(iter_obj)
        print(element)
    except StopIteration:
        break
        
# So a for loop is effectively an infinite while loop that exits when StopIteration is raised.

### Building Custom Iterators

In [None]:
"""
The __iter__() method returns the iterator object itself.
If required, some initialization can be performed.

The __next__() method must return the next item in the sequence.
On reaching the end, and in subsequent calls, it must raise StopIteration.
"""

class PowTwo:
    """Class to implement an iterator of powers of two"""
    def __init__(self, max_value=0):
        self.max_value = max_value

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max_value:
            result = 2 ** self.n
            self.n += 1
            return result
        else:
            raise StopIteration


numbers = PowTwo(3)

# create an iterator from the object
iter_obj = iter(numbers)

print(next(iter_obj))
print(next(iter_obj))
print(next(iter_obj))
print(next(iter_obj))
# PowTwo(3) produces only four values (1, 2, 4, 8), so this call raises StopIteration
print(next(iter_obj))

In [None]:
# We can also use a for loop to iterate over our iterator class.
for i in PowTwo(5):
    print(i)

### Python Infinite Iterators

In [None]:
"""
The built-in function iter() can be called with two arguments 
where the first argument must be a callable object (function) and second is the sentinel. 
The iterator calls this function until the returned value is equal to the sentinel.
"""

# int() always returns 0
print(int())

# 0 never equals the sentinel 1, so this iterator never stops
inf = iter(int, 1)
next(inf)
next(inf)
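
The two-argument form of iter() is also useful when the callable eventually does return the sentinel. As a sketch, assume a text stream read in fixed-size chunks (the stream contents and the chunk size of 3 are made up for illustration); iteration stops once read() returns an empty string.

```python
import io
from functools import partial

stream = io.StringIO("abcdefgh")

# stream.read(3) returns '' at the end of the data, which is the sentinel
chunks = list(iter(partial(stream.read, 3), ""))
print(chunks)
```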

In [None]:
"""
We can also build our own infinite iterators. 
The following iterator will, theoretically, return all the odd numbers.
"""

class InfIter:
    """Infinite iterator to return all odd numbers"""
    def __iter__(self):
        self.num = 1
        return self

    def __next__(self):
        num = self.num
        self.num += 2
        return num

a = iter(InfIter())
print(next(a))
print(next(a))
print(next(a))
print(next(a))

In [None]:
"""
The advantage of using iterators is that they save resources.
As shown above, we can get all the odd numbers without storing them all in memory.
We can have infinite items (theoretically) in finite memory.
"""

print(next(a))

## Generators

There is a lot of work in building an iterator in Python.
We have to implement a class with __iter__() and __next__() methods,
keep track of internal state, and raise StopIteration when there are no more values to return.

Generators simplify the creation of iterators.
A generator is a function that produces a sequence of results instead of a single value.

### Create Generators in Python

It is fairly simple to create a generator in Python.
It is as easy as defining a normal function, but with a 'yield' statement instead of a 'return' statement.

The difference is that a return statement terminates a function entirely, whereas a yield statement pauses the function, saves all its state, and resumes from that point on the next call.

### Differences between Generator function and Normal function

> A generator function contains one or more yield statements.

> When called, it returns an object (iterator) but does not start execution immediately.

> Methods like __iter__() and __next__() are implemented automatically. So we can iterate through the items using next().

> Once the function yields, the function is paused and the control is transferred to the caller.

> Local variables and their states are remembered between successive calls.

> Finally, when the function terminates, StopIteration is raised automatically on further calls.
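
These stages can be observed with inspect.getgeneratorstate() from the standard library. The sketch below uses a hypothetical countdown() generator that is not part of the lecture material.

```python
import inspect

def countdown(n):
    # Hypothetical generator used only for this illustration
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)
states = [inspect.getgeneratorstate(gen)]      # created, body not started yet

first = next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the yield

rest = list(gen)                               # run the generator to completion
states.append(inspect.getgeneratorstate(gen))

print(states)
print(iter(gen) is gen)  # a generator is its own iterator
```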

In [None]:
# A simple generator function with one yield
def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1
        
y = yrange(3)
print(y)

print(next(y))
print(next(y))
print(next(y))
print(next(y))

In [None]:
# A simple generator function with several yields
def my_gen():
    n = 1
    print('This is printed first')
    yield n
    
    n += 1
    print('This is printed second')
    yield n
    
    n += 1
    print('This is printed at last')
    yield n

In [None]:
a = my_gen()

print(next(a))
print()
print(next(a))
print()
print(next(a))
print()
print(next(a))

In [None]:
# Using for loop
for item in my_gen():
    print(item)
    print()

In [None]:
"""
So a generator is also an iterator. You don’t have to worry about the iterator protocol.

We use the word "generator" to mean the generated object,
and "generator function" to mean the function that creates it.

When a generator function is called, it returns a generator object without even beginning execution of the function.

When next() is called for the first time, the function runs until it reaches a yield statement.
The yielded value is returned by that next() call.
"""

def foo():
    print("begin")
    for i in range(3):
        print("before yield", i)
        yield i
        print("after yield", i)
    print("end")

In [None]:
f = foo()
print(next(f))
print()
print(next(f))
print()
print(next(f))
print()
print(next(f))

### Python Generators with a Loop

In [None]:
# Let's take an example of a generator that reverses a string.
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]


# For loop to reverse the string
for char in rev_str("hello"):
    print(char)

### Generator Expressions

In [None]:
a = (x * x for x in range(10))
print(a)

In [None]:
sum((x * x for x in range(10))) == sum(x * x for x in range(10))
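
A generator expression produces its values lazily and can be consumed only once. The following sketch, with illustrative variable names, shows that a second pass over the same generator yields nothing.

```python
squares = (x * x for x in range(5))

first_pass = list(squares)    # consumes the generator
second_pass = list(squares)   # nothing left to produce

print(first_pass)
print(second_pass)
```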

### Use of Python Generators

#### 1. Easy to Implement

In [None]:
# An example implementing a sequence of powers of 2 using an iterator class

class PowTwo:
    def __init__(self, max_value=0):
        self.n = 0
        self.max_value = max_value

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > self.max_value:
            raise StopIteration

        result = 2 ** self.n
        self.n += 1
        return result

In [None]:
# Let's do the same using a generator function

def PowTwoGen(max_value=0):
    n = 0
    while n <= max_value:  # inclusive bound, matching the iterator class above
        yield 2 ** n
        n += 1

#### 2. Memory Efficient

A normal function that returns a sequence creates the entire sequence in memory before returning the result. This is overkill if the number of items in the sequence is very large.

Generator implementation of such sequences is memory friendly and is preferred since it only produces one item at a time.

In [None]:
my_list = [i for i in range(100)]
print(f"Size of my_list: {my_list.__sizeof__()}")

# named gen_expr to avoid shadowing the my_gen() function defined earlier
gen_expr = (i for i in range(100))
print(f"Size of gen_expr: {gen_expr.__sizeof__()}")

#### 3. Represent Infinite Stream

Generators are an excellent way to represent an infinite stream of data. An infinite stream cannot be stored in memory, but because a generator produces only one item at a time, it can represent one.

In [None]:
# The following generator function can generate all the even numbers (at least in theory).

def all_even():
    n = 0
    while True:
        yield n
        n += 2
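
An infinite generator must be bounded by the caller. One way to do this, shown here as a sketch, is itertools.islice(), which takes only the first few items.

```python
from itertools import islice

def all_even():
    # Same generator as above, repeated so this snippet runs on its own
    n = 0
    while True:
        yield n
        n += 2

# Take the first five values without ever exhausting the generator
first_five = list(islice(all_even(), 5))
print(first_five)
```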

#### 4. Pipelining Generators

Multiple generators can be used to pipeline a series of operations. This is best illustrated using an example.

Suppose we have one generator that produces the numbers in the Fibonacci series and another generator that squares numbers.

If we want to find out the sum of squares of numbers in the Fibonacci series, we can do it in the following way by pipelining the output of generator functions together.

In [None]:
def fibonacci_numbers(nums):
    x, y = 0, 1
    for _ in range(nums):
        x, y = y, x + y
        yield x

def square(nums):
    for num in nums:
        yield num ** 2

print(sum(square(fibonacci_numbers(10))))

In [None]:
# This pipelining is efficient and easy to read.
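
Pipeline stages can also be written as generator expressions. The sketch below adds a hypothetical filtering stage (keeping only even squares) that is not part of the original example; each value still flows through all stages one at a time.

```python
def fibonacci_numbers(nums):
    # Repeated from the example above so the snippet is self-contained
    x, y = 0, 1
    for _ in range(nums):
        x, y = y, x + y
        yield x

# Stage 2: square each number; stage 3: keep only the even squares
squares = (n ** 2 for n in fibonacci_numbers(10))
even_squares = (s for s in squares if s % 2 == 0)

total = sum(even_squares)
print(total)
```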

## The Itertools Module

### Function: accumulate()

The accumulate() function in the itertools module returns the running results of applying a two-argument function (func) to the items of an iterable. If no function is given, it computes running sums.

In [None]:
import itertools

# example 1
iterables = [1, 3, 6, 2, 7, 9, 3, 1, 11]
data = itertools.accumulate(iterables)
print(list(data))

In [None]:
# example 2
iterables = [1, 3, 6, 2, 7, 9, 3, 1, 11]
data = itertools.accumulate(iterables, lambda x, y : x * y)
print(list(data))

In [None]:
# example 3
def collect(x, y):
    z = y.lower()
    return x + z + z

iterables = ['A', 'B', 'C', 'D']
data = itertools.accumulate(iterables, collect)
print(list(data))

### Function: combinations()

In [None]:
shapes = ['circle', 'triangle', 'square']

result = itertools.combinations(shapes, 2)

for each in result:
    print(each)

### Function: combinations_with_replacement()

In [None]:
shapes = ['circle', 'triangle', 'square']

result = itertools.combinations_with_replacement(shapes, 2)

for each in result:
    print(each)

### Function: count()

Syntax: itertools.count(start=0, step=1)

In [None]:
for i in itertools.count(10, 3):
    print(i)
    if i > 20:
        break

### Function: cycle()

Syntax: itertools.cycle(iterable)

In [None]:
colors = ['red', 'orange', 'yellow', 'green', 'blue', 'violet']

count = 0
for color in itertools.cycle(colors):
    print(color)
    count += 1
    if count >= 10:
        break

### Function: chain()

Syntax: itertools.chain(*iterables)

In [None]:
colors = ['red', 'orange', 'yellow', 'green', 'blue']
shapes = ['circle', 'triangle', 'square', 'pentagon']

result = itertools.chain(colors, shapes)
print(result)

for each in result:
    print(each)

### Function: compress()

Syntax: itertools.compress(data, selectors)

In [None]:
shapes = ['circle', 'triangle', 'square', 'pentagon']
selections = [True, False, True, False]

result = itertools.compress(shapes, selections)
print(result)

for each in result:
    print(each)

### Function: groupby()

Syntax: itertools.groupby(iterable, key=None)

In [None]:
robots = [
    {'name': 'blaster',
    'faction': 'autobot'},
    
    {'name': 'galvatron',
    'faction': 'decepticon'},
    
    {'name': 'jazz',
    'faction': 'autobot'},
    
    {'name': 'metroplex',
    'faction': 'autobot'},
    
    {'name': 'megatron',
    'faction': 'decepticon'},
    
    {'name': 'starscream',
    'faction': 'decepticon'}
]

for key, group in itertools.groupby(robots, key=lambda x: x['faction']):
    print(key)
    print(list(group))
    print()
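
Note that groupby() only groups consecutive items with the same key, so on an unsorted list the same faction appears in several separate groups. A common approach, sketched here with a shortened, hypothetical data set and a helper named by_faction, is to sort by the same key first.

```python
import itertools

# Hypothetical, shortened data set for illustration
robots = [
    {'name': 'blaster', 'faction': 'autobot'},
    {'name': 'galvatron', 'faction': 'decepticon'},
    {'name': 'jazz', 'faction': 'autobot'},
]

def by_faction(robot):
    return robot['faction']

# Sort by the grouping key so that equal keys become adjacent
sorted_robots = sorted(robots, key=by_faction)

grouped = {
    key: [r['name'] for r in group]
    for key, group in itertools.groupby(sorted_robots, key=by_faction)
}
print(grouped)
```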

### Function: permutations()

Syntax: itertools.permutations(iterable, r=None)

In [None]:
alpha_data = ['a', 'b', 'c']

result = itertools.permutations(alpha_data)
print(result)

for each in result:
    print(each)
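
The optional r argument limits the length of each permutation. As a short sketch, asking for permutations of length 2 from three items gives 3 * 2 = 6 ordered pairs.

```python
import itertools

pairs = list(itertools.permutations(['a', 'b', 'c'], 2))
print(pairs)
print(len(pairs))
```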

### Function: product()

This function computes the Cartesian product of a series of iterables.

In [None]:
num_data = [1, 2, 3]
alpha_data = ['a', 'b', 'c']

result = itertools.product(num_data, alpha_data)
print(result)

for each in result:
    print(each)
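
The result of product() corresponds to nested for loops, with the rightmost iterable advancing fastest. The sketch below compares the two forms on the same data.

```python
import itertools

num_data = [1, 2, 3]
alpha_data = ['a', 'b', 'c']

# Equivalent nested loops written as a list comprehension
nested = [(n, a) for n in num_data for a in alpha_data]
from_product = list(itertools.product(num_data, alpha_data))

print(nested == from_product)
```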

### Function: repeat()

Syntax: itertools.repeat(object[, times])

This function repeats an object over and over again, unless a times argument is given.

In [None]:
# If we use the times argument, we can limit the number of times it will repeat.
for i in itertools.repeat("spam", 3):
    print(i)
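
Without a times argument, repeat() is typically paired with something finite. One possible use, sketched below, is supplying a constant argument to map(), which stops when the shortest input is exhausted.

```python
import itertools

# repeat(2) supplies the exponent for every value in range(5)
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)
```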

### References
<ol>
<li> <a href="https://www.programiz.com/python-programming/iterator">Python Iterators</a> </li>
<li> <a href="https://www.programiz.com/python-programming/generator">Python Generators</a> </li>
<li> <a href="https://medium.com/@jasonrigden/a-guide-to-python-itertools-82e5a306cdf8">A Guide to Python Itertools</a> </li>
</ol>