In [1]:
%%HTML
<!-- execute this cell before continuing -->
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Lato">
<style>.reveal * { font-family: "Lato" !important; } .reveal h1, .reveal h2, .reveal h3, .reveal h4, .reveal h5, .reveal h6 { font-family: "Lato" !important; } .reveal .code_cell *, .reveal code, .reveal code * { font-family: monospace !important; }</style>

![Erudio logo](../../img/erudio-logo-small.png)

<h2 style="font-weight: bold;">
    Iterators, Generators, Context Managers &amp; Decorators
</h2>

![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)


<img src="../../img/python-logo.png" width="25%" align="right" style="padding-left: 30px;" />The Python programming language offers several "exotic" constructs for structuring programs.  Once you have used Python even a moderate amount, you have certainly **used** iterators, context managers, and decorators; however, you can work in Python for quite a while without creating your own instances of each.

In this module, we show you how to create and utilize custom iterators, generators, context managers, and decorators.  Each of these often makes programs clearer, more expressive, faster, and more robust.



<h2 style="font-weight: bold;">
    What is covered in this course?
</h2>

![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)

* Understand the iterator protocol
* Class-based and function-based iterators
* The `itertools` module
* A "calculus" of infinite or unbounded streams
* Standard library and custom context managers
* Use the `contextlib` module to aid writing context managers
* Write custom decorators as functions or classes
* Write parameterized decorators and decorator factories

<h2 style="font-weight: bold;">
    Learning Objectives
</h2>

![orange-divider](https://user-images.githubusercontent.com/7065401/98619088-44ab6000-22e1-11eb-8f6d-5532e68ab274.png)

* Construct lazy data streams with generator expressions and generator functions
* Write custom iterator classes
* Combine lazy data streams using itertools
* Wrap operations in custom contexts
* Decorate functions in reusable orthogonal ways

# The Iterator Protocol

Every Python programmer, even a beginner, uses iterables frequently.  They just don't always think about the fact that they are doing so.  Every time you write a loop or a list comprehension, or call the constructors `list()`, `set()`, or `tuple()`, you are using an iterable.

An important distinction has already arisen in this introduction.  Although the two terms are often used carelessly, *iterator* and *iterable* are two distinct kinds of things in Python.  However, a great many objects are both iterators and iterables, and hence the distinction is often easy to miss.

## Iterators

An *iterator* is entirely defined as an object that has a method named `.__next__()`; it **may**, on some call to that method, raise the exception `StopIteration`.  Once an iterator has raised `StopIteration` even *once*, it is *exhausted*. According to the Python documentation:

> Once an iterator’s `.__next__()` method raises StopIteration, it must continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.

The built-in function `next()` is a shortcut for calling the `.__next__()` method of an object.

To demonstrate, let us look at a "non-broken" (but also almost useless) iterator.

In [6]:
class MyIterator:
    def __init__(self):
        self.n = 10
        self.exhausted = False
    def __next__(self):
        if self.exhausted or self.n == 13:
            self.exhausted = True
            raise StopIteration
        self.n += 1
        return self.n

In [7]:
my_iterator = MyIterator()
while True:
    print(next(my_iterator))

11
12
13


StopIteration: 

In [8]:
print(next(my_iterator))  # Continues to raise StopIteration

StopIteration: 
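
When the `StopIteration` exception is inconvenient, the built-in `next()` also accepts an optional second argument, which is returned instead of raising once the iterator is exhausted.  A minimal sketch, using a hypothetical `CountTo` class (not part of the lesson's code) that behaves much like `MyIterator`:

```python
class CountTo:
    """Hypothetical iterator: returns 1, 2, ..., limit, then stays exhausted."""
    def __init__(self, limit):
        self.n = 0
        self.limit = limit

    def __next__(self):
        if self.n >= self.limit:
            raise StopIteration
        self.n += 1
        return self.n

counter = CountTo(2)
# The second argument to next() is a default used after exhaustion
results = [next(counter, None) for _ in range(4)]
print(results)
```

This only avoids the traceback; it does not make the object usable in a loop.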

The problem is that being able only to call `next()` on an object is not very flexible.  For example, being able to loop over an object to get each successive item would be a lot more convenient.

In [9]:
my_iterator = MyIterator()
for n in my_iterator:
    print(n)

TypeError: 'MyIterator' object is not iterable

## Iterables

In order to support a loop, or a list comprehension, or a constructor for a collection type, we instead need an **iterable**.  An iterable is simply an object that has a method named `.__iter__()` which returns an *iterator* when called with no arguments.  The built-in function `iter()` is a shortcut for calling that method, although it is uncommon to call it explicitly. More often, `.__iter__()` gets called behind the scenes in a loop or the like.

In [10]:
class MyIterable:
    def __iter__(self):
        return MyIterator()

In [11]:
my_iterable = MyIterable()
iter(my_iterable)

<__main__.MyIterator at 0x18eeaad98a0>

That isn't yet so useful, but it suddenly becomes so when we put it in a loop and other contexts that want an iterable.

In [12]:
list(my_iterable)

[11, 12, 13]

In [13]:
[n/7 for n in my_iterable]

[1.5714285714285714, 1.7142857142857142, 1.8571428571428572]

In [14]:
for n in my_iterable:
    print(n)

11
12
13


In [15]:
for x in enumerate(my_iterable):
    print(x)

(0, 11)
(1, 12)
(2, 13)


In [16]:
list(zip(my_iterable, "abcdefghi"))

[(11, 'a'), (12, 'b'), (13, 'c')]

Notice that `my_iterable` is **not** an iterator.

In [17]:
next(my_iterable)

TypeError: 'MyIterable' object is not an iterator
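
What a `for` loop does behind the scenes can be sketched roughly as follows.  The helper name `manual_for` is hypothetical and exists only for illustration; the real loop machinery is built into the interpreter and handles more cases (such as `else` clauses).

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does."""
    iterator = iter(iterable)      # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # exhaustion ends the loop quietly
        action(item)

seen = []
manual_for([11, 12, 13], seen.append)
print(seen)
```

The loop asks the iterable for an iterator once, then calls `next()` on that iterator until it is exhausted.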

## Dual-function objects

Many Python objects are both iterators and iterables.  For example, file objects do both things (and also a bunch more; they are also context managers, for example). We can open a Project Gutenberg dictionary of slang from 1913.  Or technically an excerpt from it, which I put in a file in the repository.

In [20]:
# Open a file in text mode, read some characters from it.
slang = open('data/slang.txt', encoding="utf8")
print(slang.read(388), end='')

~A 1~, first-rate, the very best; “she’s a prime girl, she is; she is
A 1.”—_Sam Slick_. The highest classification of ships at Lloyd’s; common
term in the United States; also at Liverpool and other English seaports.
Another, even more intensitive form is “first-class, letter A, No. 1.”
Some people choose to say A I, for no reason, however, beyond that of
being different from others.



Using the file as an iterator, we might do this to print the next 5 lines:

In [22]:
for i in range(5):
    print(next(slang), end='')

~About Right~, “to do the thing ABOUT RIGHT,” _i.e._, to do it
properly, soundly, correctly; “he guv it ’im ABOUT RIGHT,” _i.e._, he
beat him severely.

~Abraham-man~, a vagabond, such as were driven to beg about the country


Using the file as an iterable, we might work with it a bit differently:

In [23]:
count = 1
for line in slang:
    print(line, end='')
    if (count := count+1) > 8:
        break

after the dissolution of the monasteries.—_See_ BESS O’ BEDLAM,
_infra_. They are well described under the title of _Bedlam
Beggars_.—_Shakspeare’s K. Lear_, ii. 3.

    “And these, what name or title e’er they bear,
    Jarkman, or Patrico, Cranke, or Clapper-dudgeon,
    Frater, or ABRAM-MAN; I speak to all
    That stand in fair election for the title


---

So far this is rather formal, but in the next lesson we start to make this useful.

# Exercise

## Description

Some Python objects that you work with every time you code are iterators. Others are iterables.  Many are both. Some are neither.  In the setup below, a number of objects are defined; you need to characterize the behavior of each one correctly.  Everything is characterized wrongly in the setup; fix that.

## Setup

In [26]:
# Define names for the possible answers.
from collections import namedtuple
from enum import Enum
class Kind(Enum):
    ITERATOR, ITERABLE, BOTH, NEITHER, WRONG = range(5)

things = dict(
    a = [range(10), Kind.WRONG],
    b = [open('data/tmp-file', 'r+b'), Kind.WRONG],
    c = [[1, 2, 3, 4], Kind.WRONG],
    d = [(1, 2, 3, 4), Kind.WRONG],
    e = [123.45, Kind.WRONG],
    f = ["12345", Kind.WRONG],
    g = [zip("abc", "def"), Kind.WRONG],
    h = [lambda n: range(n), Kind.WRONG],
    i = [{1, 2, 3, 4}, Kind.WRONG],
    j = [{1: 2, 3: 4, 5: 6}, Kind.WRONG],
    k = [namedtuple("Thing", "a b c"), Kind.WRONG],
    l = [namedtuple("Thing", "a b c")(1, 2, 3), Kind.WRONG],
    m = [(n for n in range(10)), Kind.WRONG]
)

# Solution 1

In [27]:
from collections.abc import Iterable, Iterator

def kind(o):
    if isinstance(o, Iterable):
        if isinstance(o, Iterator):
            return Kind.BOTH
        else:
            return Kind.ITERABLE
    elif isinstance(o, Iterator):
        # Not reached in practice: the Iterator ABC also requires .__iter__()
        return Kind.ITERATOR
    else:
        return Kind.NEITHER
    
for k, v in things.items():
    v[1] = kind(v[0])
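
One caveat about checking against the abstract base classes: `collections.abc.Iterator` looks for both `.__iter__()` and `.__next__()`, which is stricter than the "only `.__next__()`" definition used in this lesson.  The sketch below uses two hypothetical classes to show the difference; an object like `MyIterator` would be reported as neither.

```python
from collections.abc import Iterable, Iterator

class OnlyNext:
    """Hypothetical class with __next__ but no __iter__, like MyIterator."""
    def __next__(self):
        raise StopIteration

class NextAndIter(OnlyNext):
    """Adding __iter__ satisfies the abstract base class."""
    def __iter__(self):
        return self

only_next_is_iterator = isinstance(OnlyNext(), Iterator)
both_is_iterator = isinstance(NextAndIter(), Iterator)
print(only_next_is_iterator, both_is_iterator)

# Every Iterator (in the ABC sense) is also an Iterable
print(issubclass(Iterator, Iterable))
```

None of the objects in the exercise setup has `.__next__()` without `.__iter__()`, so the stricter check does not change the answers here.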

# Solution 2

In [28]:
# This solution is somewhat coarser since it can consume elements; it is
# also subtly wrong about write-only files, and a few other things
def kind(o):
    try:
        iter(o)
        try:
            next(o)
            return Kind.BOTH
        except StopIteration:
            # An exhausted iterator is an iterator
            return Kind.BOTH
        except Exception:
            # next() failed for another reason (e.g. TypeError): iterable only
            return Kind.ITERABLE
    except Exception:
        # iter() failed, so the object is not an iterable
        try:
            next(o)
            return Kind.ITERATOR
        except StopIteration:
            return Kind.ITERATOR
        except Exception:
            return Kind.NEITHER
    
for k, v in things.items():
    v[1] = kind(v[0])        

# Test Cases

In [29]:
def test_kinds():
    kinds = [v[1] for v in things.values()]
    correct = {'a': Kind.ITERABLE,
               'b': Kind.BOTH,
               'c': Kind.ITERABLE,
               'd': Kind.ITERABLE,
               'e': Kind.NEITHER,
               'f': Kind.ITERABLE,
               'g': Kind.BOTH,
               'h': Kind.NEITHER,
               'i': Kind.ITERABLE,
               'j': Kind.ITERABLE,
               'k': Kind.NEITHER,
               'l': Kind.ITERABLE,
               'm': Kind.BOTH}
    for n, k in enumerate(correct):
        assert correct[k] == kinds[n], f"{k} is {correct[k]}"
    
test_kinds()

-------------
Materials licensed under [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) by the authors