# Iteration in Python


## Range objects


In [451]:
my_range = range(10)

In [452]:
print(type(my_range))

<class 'range'>


In [453]:
print(my_range)

range(0, 10)


In [454]:
print(list(my_range))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [455]:
print(my_range[2])

2


In [456]:
import collections.abc

In [457]:
print(f"Are ranges iterables? {isinstance(
    my_range, collections.abc.Iterable)}")
print(f"Are ranges iterators?  {isinstance(
    my_range, collections.abc.Iterator)}")

Are ranges iterables? True
Are ranges iterators?  False


In [458]:
print(f"Does calling iter() on a range return the same object each time? {
      iter(my_range) is iter(my_range)}")

Does calling iter() on a range return the same object each time? False


In [459]:
my_generator_expression = (x for x in range(10))

In [460]:
print(type(my_generator_expression))

<class 'generator'>


In [461]:
try:
    print(my_generator_expression[0])
except TypeError as e:
    print(e)

'generator' object is not subscriptable


In [462]:
print(f"Are generators iterables? {isinstance(
    my_generator_expression, collections.abc.Iterable)}")
print(f"Are generators iterators?  {isinstance(
    my_generator_expression, collections.abc.Iterator)}")

Are generators iterables? True
Are generators iterators?  True


In [463]:
print(f"Does calling iter() on a generator return the same object each time? {
      iter(my_generator_expression) is iter(my_generator_expression)}")

Does calling iter() on a generator return the same object each time? True


In [464]:
print(f"Does calling iter() on a generator just return itself? {
      iter(my_generator_expression) is my_generator_expression}")

Does calling iter() on a generator just return itself? True


## Where does this make a difference?


This matters because it changes what it means to pass the objects around. The difference shows up any time you want to stop iterating when one condition is met and then continue iterating under a different condition, such as finding a pair of matching values in a list or string, or finding the first warm day after a freeze in temperature data.


Imagine that you wanted to find the positions of the first pair of matching quotation marks in the following string. You could iterate until you find the first quotation mark, and then continue iterating until you find the second one.


In [465]:
some_pangrams = """Two common examples of pangrams are "The quick red fox jumps over the lazy brown dog." and "Sphinx of black quartz, judge my vow." """
print(some_pangrams)

Two common examples of pangrams are "The quick red fox jumps over the lazy brown dog." and "Sphinx of black quartz, judge my vow." 


Let us try with generator expressions and with range objects directly.

### Iterating with a generator expression

In [466]:
def find_quote_with_generator_expression(quote_string=some_pangrams, starting_position=0):
    quote_start_pos = None
    quote_end_pos = None
    pos_generator = (i for i in range(starting_position, len(quote_string)))
    for possible_start_pos in pos_generator:
        if quote_string[possible_start_pos] == '"':
            quote_start_pos = possible_start_pos
            break
    # We have found the first quotation mark.  Now let us continue iterating until we find the next one.
    for possible_end_pos in pos_generator:
        # Uncomment the following print statement if you want a step-by-step view of the results
        # print(f"checking {possible_end_pos}, {quote_string[possible_end_pos]}")
        if quote_string[possible_end_pos] == '"':
            quote_end_pos = possible_end_pos
            break
    return (quote_start_pos, quote_end_pos)

In [467]:
print(find_quote_with_generator_expression())

(36, 85)


This works the way that we might expect.  The second **for** loop resumes where the first one stopped, so it finds the closing quotation mark.

### Iterating with range objects

In [468]:
def find_quote_with_range(quote_string=some_pangrams, starting_position=0):
    quote_start_pos = None
    quote_end_pos = None
    pos_range = range(starting_position, len(quote_string))
    for possible_start_pos in pos_range:
        if quote_string[possible_start_pos] == '"':
            quote_start_pos = possible_start_pos
            break
    # We have found the first quotation mark.  Now let us continue iterating until we find the next one.
    for possible_end_pos in pos_range:
        # Uncomment the following print statement if you want a step-by-step view of the results
        # print(f"checking {possible_end_pos}, {quote_string[possible_end_pos]}")
        if quote_string[possible_end_pos] == '"':
            quote_end_pos = possible_end_pos
            break
    return (quote_start_pos, quote_end_pos)

In [469]:
print(find_quote_with_range())

(36, 36)


This doesn't work the way that we might expect.  The function finds the first quotation mark both times, because each **for** loop starts from the beginning of the string.

This might seem counterintuitive.  In the first version we pass the same generator to both **for** loops and it works.  In the second version we pass the same **range** object to both **for** loops and it doesn't work.  What's the difference?

### Explaining the difference between using range and generator objects

The two functions above look similar but behave differently because Python's **for** loop doesn't iterate directly over the object you pass to the **in** clause.  Instead, it calls **iter()** on that object and iterates over whatever that call returns.

When **iter()** is called on a generator, or on any other iterator, it returns that same object, so the state of the iteration carries over.  A range object is an iterable but not an iterator: each call to **iter()** on it returns a new iterator that starts from the beginning.
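
To make this concrete, here is a rough sketch of what a **for** loop does behind the scenes. The helper name `manual_for_loop` is made up for this illustration, and the real loop machinery lives inside the interpreter, so treat this as an approximation rather than the actual implementation.

```python
def manual_for_loop(iterable):
    """Roughly what `for item in iterable: collected.append(item)` does."""
    collected = []
    iterator = iter(iterable)      # ask the object for an iterator
    while True:
        try:
            item = next(iterator)  # pull the next value
        except StopIteration:      # the iterator is exhausted
            break
        collected.append(item)
    return collected


numbers = range(3)
print(manual_for_loop(numbers))  # [0, 1, 2]
print(manual_for_loop(numbers))  # [0, 1, 2] again: iter() built a fresh iterator

squares = (n * n for n in range(3))
print(manual_for_loop(squares))  # [0, 1, 4]
print(manual_for_loop(squares))  # []: iter() returned the same, now exhausted, generator
```

The range can be looped over repeatedly because each `iter()` call starts over, while the generator can only be consumed once.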

### Can we fix the range-based function?

To fix the range-based version of the function, we need to make it more like the generator-based one.  The reasoning goes as follows:

1.  We need some object to hold the state of the iteration.
2.  We need both **for** loops to iterate over that object.
3.  A **for** loop obtains the object that it iterates over by calling **iter()** on the object in its **in** clause.
4.  Therefore, to control what the **for** loops iterate over, we need to control what **iter()** returns.
5.  Iterators return themselves when **iter()** is called on them.
6.  Therefore, an iterator object can be used to preserve the state of iteration across loops.

In [470]:
def find_quote_with_range_iterator(quote_string=some_pangrams, starting_position=0):
    quote_start_pos = None
    quote_end_pos = None
    pos_range = range(starting_position, len(quote_string))
    pos_range_iterator = iter(pos_range)
    for possible_start_pos in pos_range_iterator:
        if quote_string[possible_start_pos] == '"':
            quote_start_pos = possible_start_pos
            break
    # We have found the first quotation mark.  Now let us continue iterating until we find the next one.
    for possible_end_pos in pos_range_iterator:
        # Uncomment the following print statement if you want a step-by-step view of the results
        # print(f"checking {possible_end_pos}, {quote_string[possible_end_pos]}")
        if quote_string[possible_end_pos] == '"':
            quote_end_pos = possible_end_pos
            break
    return (quote_start_pos, quote_end_pos)

In [471]:
print(find_quote_with_range_iterator())

(36, 85)
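
The same pattern should carry over to the temperature example mentioned earlier. The sketch below is only an illustration: the readings in `daily_lows` are made up, and both the 10-degree `WARM_THRESHOLD` and the choice to treat anything below zero as a freeze are assumptions.

```python
# Made-up daily low temperatures in degrees Celsius, purely for illustration.
daily_lows = [4, 2, -1, -3, 0, 1, 6, 12, 9]
WARM_THRESHOLD = 10  # hypothetical cutoff for a "warm" day


def first_warm_day_after_freeze(temperatures, warm_threshold=WARM_THRESHOLD):
    """Return (freeze_day, warm_day) as indexes; None where nothing is found."""
    freeze_day = None
    warm_day = None
    # enumerate() already returns an iterator, so iter() is not strictly
    # needed here, but calling it makes the intent explicit and is harmless.
    days = iter(enumerate(temperatures))
    for day, temperature in days:
        if temperature < 0:  # assumption: a freeze means below zero
            freeze_day = day
            break
    # This loop resumes where the previous one stopped.
    for day, temperature in days:
        if temperature >= warm_threshold:
            warm_day = day
            break
    return (freeze_day, warm_day)


print(first_warm_day_after_freeze(daily_lows))  # (2, 7)
```

If there is no freeze at all, the first loop exhausts the iterator, the second loop has nothing left to consume, and the function returns `(None, None)`.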


## What about generator functions?

Each call to a generator function returns a new generator iterator.

This is easier to see with Python type hints: a generator function that **yield**s, say, integers is not annotated as returning an integer.  Calling it returns an iterator that produces them.

In [472]:
def my_generator_function():
    while True:
        yield 7

In [473]:
print(type(my_generator_function))

<class 'function'>


In [474]:
my_generator_function_result = my_generator_function()

In [475]:
type(my_generator_function_result)

generator

In [476]:
my_generator_iterator_result = next(my_generator_function_result)
print(f"The iterator produced {my_generator_iterator_result} of type {
      type(my_generator_iterator_result)}")

The iterator produced 7 of type <class 'int'>
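
A type annotation can make the point about return types concrete. The sketch below uses a hypothetical `count_up_to` function, which is not one of the examples above, annotated with `collections.abc.Iterator[int]` (subscripting that class assumes Python 3.9 or later). It also shows that two calls give two independent generators.

```python
from collections.abc import Iterator


def count_up_to(limit: int) -> Iterator[int]:
    """Hypothetical generator function: yields 0, 1, ..., limit - 1."""
    current = 0
    while current < limit:
        yield current
        current += 1


first = count_up_to(3)
second = count_up_to(3)

print(first is second)  # False: each call builds a new generator
print(next(first))      # 0
print(next(first))      # 1
print(next(second))     # 0: the second generator keeps its own state
```

The annotation says `Iterator[int]` rather than `int` because the call itself produces no integers. They only appear once something calls `next()` on the result.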
