# Iterators

Iterators are objects with an <i>\_\_iter\_\_</i> and a <i>\_\_next\_\_</i> method. These methods allow you to iterate through the object in a _for_ loop and get successive values with the _next_ function. Data structures that you can iterate through in a _for_ loop like lists, strings, and dictionaries all have a corresponding iterator that can be instantiated by the _iter_ function. Here is an example of one:

In [1]:
list_iterator = iter([1, 2, 3, 4, 5])

You can put it in a _for_ loop like you would a list:

In [2]:
for i in list_iterator:
    print(i)

1
2
3
4
5


Once you iterate through an iterator, it is exhausted and cannot be iterated through again:

In [3]:
for i in list_iterator:
    print(i)

You can also get successive values one at a time with the _next_ function

In [4]:
list_iterator = iter([1, 2])
print(next(list_iterator))
print(next(list_iterator))

1
2


Once the iterator is exhausted, the _next_ function will raise a _StopIteration_ exception

In [5]:
next(list_iterator)

StopIteration: 

# Generators

A generator is a special kind of iterator returned by a generator function. A generator function is characterized by _yield_ statements in place of a _return_ statement. When the _next_ function is called on a generator, it will run the code of its parent generator function until it reaches a _yield_ statement. It will then return the object of the _yield_ statement and pause. The code will resume when the _next_ function is called on the generator again and pause after the next _yield_ statement. Here is an example:

In [6]:
def generator_func():
    a = 1
    yield a
    b = 2
    yield b
    c = 3
    yield c
    
my_generator = generator_func()
print(next(my_generator)) # The next function runs a = 1 inside the generator function, then when it hits yield a, it returns the object of yield a, which is a or 1.

1


In [7]:
print(next(my_generator)) # Running the next function again runs b = 2 and returns b.

2


In [8]:
print(next(my_generator)) # Similar story here

3


When the generator has finished running through the code in the generator function, the generator is exhausted

In [9]:
next(my_generator) # Running the next function on an exhausted generator raises a StopIteration exception like a normal iterator

StopIteration: 

You can also iterate through a generator with a _for_ loop like a normal iterator

In [10]:
for i in generator_func():
    print(i)

1
2
3


Generators are very useful if you want to iterate through many objects without actually storing all of those objects in memory at once. As a practical example, here is a function that gets all subsequences for an input list:

In [11]:
def get_all_subsequences(seq: list, start=0):
    subseqs = [list(seq)]
    for i in range(start, len(seq)):
        e = seq.pop(i)
        subseqs.extend(get_all_subsequences(seq, start=i))
        seq.insert(i, e)
    return subseqs

In order to profile its memory usage, we can use memory profiler. You can install it by running the following cell.

In [12]:
!pip3 install memory_profiler
%load_ext memory_profiler

Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Defaulting to user installation because normal site-packages is not writeable

[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.2.2[0m[39;49m -> [0m[32;49m22.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip[0m


Notice the memory increment when we iterate through it. We are storing all 2<sup>n</sup> possible subsequences in memory

In [13]:
%memit for subseq in get_all_subsequences(list(range(20))): pass

peak memory: 218.10 MiB, increment: 150.18 MiB


Now if we create an equivalent generator function, we can significantly decrease how much memory is being used at once

In [14]:
def get_all_subsequences_generator(seq: list, start=0):
    yield list(seq)
    for i in range(start, len(seq)):
        e = seq.pop(i)
        yield from get_all_subsequences(seq, start=i)
        seq.insert(i, e)

%memit for subseq in get_all_subsequences_generator(list(range(20))): pass

peak memory: 153.77 MiB, increment: 69.68 MiB


### Some additional notes on generators

In python3, _yield from a_ is equivalent to _for x in a; yield x_. For example:

In [15]:
def generator_func_for():
    for x in [1, 2, 3]:
        yield x

for i in generator_func_for():
    print(i)

1
2
3


In [16]:
def generator_func_yield_from():
    yield from [1, 2, 3]

for i in generator_func_yield_from():
    print(i)

1
2
3


Also in python3, a _return_ statement inside a generator function immediately exhausts the corresponding generator and will cause it to ignore any subsequent _yield_ statements. For example:

In [17]:
def generator_func_with_return():
    a = 1
    yield a
    b = 2
    return b
    c = 3
    yield c

my_generator = generator_func_with_return()
print(next(my_generator)) # The first yield statement is unaffected by return statement

1


The _return b_ statement in the generator function is actually equivalent to _raise StopIteration(b)_

In [18]:
next(my_generator)

StopIteration: 2

If you put it in a _for_ loop, it will ignore _yield_ statements after the _return_ statement

In [19]:
for i in generator_func_with_return():
    print(i)

1
