<div style="position: relative;">
<img src="https://user-images.githubusercontent.com/7065401/98728503-5ab82f80-2378-11eb-9c79-adeb308fc647.png"></img>

<h1 style="color: white; position: absolute; top:27%; left:10%;">
     Advanced Python
</h1>
<h2 style="color: white; position: absolute; top:36%; left:10%;">
    Iterators, Generators, Context Managers, and Decorators
</h2>


<h3 style="color: #ef7d22; font-weight: normal; position: absolute; top:58%; left:10%;">
    David Mertz, Ph.D.
</h3>

<h3 style="color: #ef7d22; font-weight: normal; position: absolute; top:63%; left:10%;">
    Data Scientist
</h3>
</div>

# Class-Based Iterators

In this lesson we will create iterators that are more practically useful than the formal demonstration of the protocol we saw in the first lesson.

Many toy examples in tutorials take something that is already an iterable and create a custom class that simply iterates over that underlying iterable.  However, wrapping a list, a file handle, or a range object is underwhelming.  Let us pick an example that uses a little more original code; the first pass still qualifies as a "toy", but it heads in a more practical direction.

## A random number generator

The Python `random` module internally uses a kind of generator called the Mersenne Twister.  Like all pseudo-random number generators, it contains internal state, and from that state it produces a completely deterministic sequence of numbers.  *Eventually* this internal state will repeat and the numbers will cycle.  However, the period is `2**19937 - 1` values; consuming that many would take far longer than the lifetime of the universe, so the cycling is not a practical problem.

Let us create a much worse, but independent, random number generator with this same general property of using a finite amount of internal state.  Like Python's `random`, we may optionally seed this random number generator for repeatable results.
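The determinism described above is easy to observe with the standard library's `random` module itself: reseeding with the same value replays the same sequence.  A minimal sketch (the seed value `42` is arbitrary):

```python
import random

random.seed(42)
first = [random.random() for _ in range(5)]

random.seed(42)   # Same seed, so the internal state is reset identically
second = [random.random() for _ in range(5)]

print(first == second)
```

The two lists compare equal because the generator's output depends only on its internal state.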

This class will be both iterable and an iterator; it does so by using a very common trick of having its `.__iter__()` method simply return `self`.  Our iterator will produce numbers between 0 and 1, but will terminate iteration when a cycle is reached.

In [None]:
class Random:
    "Cyclical pseudo-random numbers. May be seeded with a list of integers"
    def __init__(self, seed=[907, 911, 919, 929, 937], scale=500):
        if not isinstance(seed, list) or not all(isinstance(n, int) for n in seed):
            raise ValueError("Seed must be a list of integers")
        self._seed = seed
        self._scale = scale
        # What internal states has the generator seen
        self._num = 0
        self._seed_pos = 0
        self._states = {(self._num, self._seed_pos)}
        
    def __iter__(self):
        return self
    
    def advance(self):
        self._num = (self._num + 13*self._seed[self._seed_pos]) % self._scale
        self._seed_pos = (self._seed_pos+1) % len(self._seed)
        if (self._num, self._seed_pos) in self._states:
            raise StopIteration
        self._states.add((self._num, self._seed_pos))
    
    def __next__(self):
        self.advance()
        return self._num/self._scale

We can do all of our iterator and iterable things with a `Random` instance.

In [None]:
rnd = Random([220, 231, 456, 789, 502])
next(rnd)

In [None]:
for n, r in enumerate(rnd):
    print(r, end=' ')
    if n > 15:
        break
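Because `.__iter__()` returns `self`, an object built this way is single-pass: a loop that breaks early leaves the iterator where it stopped, and an exhausted iterator stays exhausted.  A minimal sketch using a hypothetical `Countdown` class (not part of the lesson code) illustrates the behavior:

```python
class Countdown:
    "Hypothetical dual iterator/iterable counting down from `start` to 1"
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # The iterable *is* its own iterator

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

cd = Countdown(5)
same_object = iter(cd) is cd     # iter() hands back the very same object
head = [next(cd), next(cd)]      # Consumes the first two values
rest = list(cd)                  # Picks up where we left off
again = list(cd)                 # Already exhausted, so nothing is left
print(same_object, head, rest, again)
```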

Notice that we get different cycle lengths with different seeds.

In [None]:
rnd = Random([220, 231, 456, 789, 502])
len(list(rnd))

In [None]:
rnd = Random([321, 231, 456, 789, 502])
len(list(rnd))

And different sequences.

In [None]:
rnd0 = Random([220, 231, 456, 789, 502])
rnd1 = Random([321, 231, 456, 789, 502])
rnd2 = Random()
list(rnd0)[:10], list(rnd1)[:10], list(rnd2)[:10]
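One way to see why the cycle lengths differ: each full pass through the seed adds `13 * sum(seed)` to the running number (modulo `scale`), so the state first returns to its starting point after `scale // gcd(step, scale)` passes.  The sketch below checks that reasoning against a direct simulation of the same update rule.  The helper names `simulated_length` and `predicted_length` are hypothetical and not part of the `Random` class.

```python
from math import gcd

def simulated_length(seed, scale=500):
    "Count values produced before a (num, seed_pos) state repeats"
    num, pos = 0, 0
    seen = {(num, pos)}
    count = 0
    while True:
        # Same update rule as Random.advance()
        num = (num + 13 * seed[pos]) % scale
        pos = (pos + 1) % len(seed)
        if (num, pos) in seen:
            return count
        seen.add((num, pos))
        count += 1

def predicted_length(seed, scale=500):
    "Assumed closed form: full passes until the number returns to 0"
    step = (13 * sum(seed)) % scale
    passes = scale // gcd(step, scale)
    # The final step lands back on the start state and yields nothing
    return passes * len(seed) - 1

for seed in ([220, 231, 456, 789, 502], [321, 231, 456, 789, 502]):
    print(simulated_length(seed), predicted_length(seed))
```

Under this reasoning, a seed whose sum shares more factors with `scale` cycles sooner.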

## An iterable data structure

Python does not have a binary tree data structure in its standard library.  One is easy to write, and it is sometimes a powerful data structure.  For illustration, a simple one is shown; this particular one is neither balanced nor sorted, although those are common properties to design for in specific use cases.

<img src="bintree.png" width="25%"/>

A fairly bare-bones binary tree requires very little code.  Even the `.__str__()` method is completely optional.

In [None]:
class BinTree:
    def __init__(self, val, _depth=0):
        self.val = val
        self.left = None
        self.right = None
        self._depth = _depth # Internal, not part of actual API
        
    def set_children(self, leftval, rightval):
        self.left = type(self)(leftval, _depth=self._depth+1)
        self.right = type(self)(rightval, _depth=self._depth+1)
        
    def __str__(self):
        if self.left is not None:   # Assume symmetry, i.e.: `self.right is not None`
            children = f"\n{self.left}{self.right}"
        else:
            children = "\n"
        return f"{'  '*self._depth}{self.__class__.__name__}({self.val}){children}"     

We can create the same tree as in the diagram.

In [None]:
a = BinTree('A')
a.set_children('B', 'F')
a.left.set_children('D', 'E')
a.right.set_children('C', 'I')
a.left.right.set_children('G', 'H')
a.left.right.right.set_children('J', 'K')

And print it off, leveraging the `.__str__()` method we included.

In [None]:
print(a)

In [None]:
print(a.left.right)

### Looping

One thing we **cannot** yet do is iterate over the nodes of these trees.  We have a decision.  We could definitely make a class that was a dual iterator/iterable as we did with `Random`, and have its `.__iter__()` return `self`.  However, this is a case where separating the two protocols makes sense.  One concern is that there are different ways to "walk" a tree: notably *depth-first* and *breadth-first*.  Perhaps we would like flexibility to decide that question later.

In [None]:
class IterBinTree(BinTree):
    def __init__(self, val, _depth=0, walker=None):
        if walker is None:
            walker = lambda _: iter([val, ...])
        self.walker = walker
        super().__init__(val, _depth)
        
    def __iter__(self):
        return self.walker(self)

In [None]:
a = IterBinTree('A')
a.set_children('B', 'F')
a.left.set_children('D', 'E')
a.right.set_children('C', 'I')
a.left.right.set_children('G', 'H')
a.left.right.right.set_children('J', 'K')

In [None]:
print(a)

We already have an iterable tree, one that follows the full *iterable* protocol.  However, it does not descend; it just loops over the top node's value, then an ellipsis.

In [None]:
for node in a:
    print(node, end=' ')

Now let us create a more useful *iterator* for a tree.

In [None]:
class TreeWalker:
    def __init__(self, tree):
        self.seq = [tree.val]
        if tree.left is not None:
            tree.left.walker = type(self)
            for val in tree.left:
                self.seq.append(val)
        if tree.right is not None:
            tree.right.walker = type(self)
            for val in tree.right:
                self.seq.append(val)
        self.pos = -1

    def __iter__(self):
        # An iterator should also be iterable; return the walker itself
        return self

    def __next__(self):
        self.pos += 1
        if self.pos >= len(self.seq):
            raise StopIteration
        return self.seq[self.pos]

In [None]:
a = IterBinTree('A', walker=TreeWalker)
a.set_children('B', 'F')
a.left.set_children('D', 'E')
a.right.set_children('C', 'I')
a.left.right.set_children('G', 'H')
a.left.right.right.set_children('J', 'K')

In [None]:
print(a)

In [None]:
for node in a:
    print(node, end=' ')

Since a new "walker" iterator is created each time we enter a loop or any other construct that consumes an iterable, later iterations reflect any changes made to the underlying tree.

In [None]:
# A ".remove_children()" method might be a better API
e = a.left.right
e.left = e.right = None
print(a)

In [None]:
' '.join(a)
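The same behavior can be seen in isolation.  In the sketch below, the hypothetical `Bag` and `BagIterator` classes (not part of the lesson code) separate the iterable from its iterator, so every call to `iter()` builds a fresh iterator that sees the container's current contents:

```python
class BagIterator:
    "Hypothetical iterator over a list held by a Bag"
    def __init__(self, items):
        self.items = items
        self.pos = -1

    def __iter__(self):
        return self

    def __next__(self):
        self.pos += 1
        if self.pos >= len(self.items):
            raise StopIteration
        return self.items[self.pos]

class Bag:
    "Hypothetical iterable; each iter() call returns a new BagIterator"
    def __init__(self, *items):
        self.items = list(items)

    def __iter__(self):
        return BagIterator(self.items)

bag = Bag('x', 'y')
first = list(bag)                    # One full pass
bag.items.append('z')                # Change the underlying data
second = list(bag)                   # A new iterator sees the change
distinct = iter(bag) is not iter(bag)
print(first, second, distinct)
```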

## Dynamic iterator

One strength of the design here is that we could substitute in a different kind of iterator if we want to walk the tree differently.  Here we would rather read right-to-left than left-to-right.

In [None]:
class RightToLeftWalker(TreeWalker):
    def __init__(self, tree):
        self.seq = [tree.val]
        if tree.right is not None:
            tree.right.walker = type(self)
            for val in tree.right:
                self.seq.append(val)
        if tree.left is not None:
            tree.left.walker = type(self)
            for val in tree.left:
                self.seq.append(val)
        self.pos = -1

In [None]:
a.walker = RightToLeftWalker
' '.join(a)
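Both walkers above are depth-first.  The breadth-first walk mentioned earlier could be sketched along the following lines.  This is only an illustration: `BreadthFirstWalker` and the stand-in `Node` class are hypothetical names, and the walker assumes nothing about a tree beyond its `.val`, `.left`, and `.right` attributes.

```python
from collections import deque

class Node:
    "Hypothetical stand-in exposing .val/.left/.right like BinTree"
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class BreadthFirstWalker:
    "Iterator producing node values level by level"
    def __init__(self, tree):
        self.queue = deque([tree])   # Nodes waiting to be visited

    def __iter__(self):
        return self

    def __next__(self):
        if not self.queue:
            raise StopIteration
        node = self.queue.popleft()
        # Children join the back of the queue, behind their "cousins"
        for child in (node.left, node.right):
            if child is not None:
                self.queue.append(child)
        return node.val

tree = Node('A',
            Node('B', Node('D'), Node('E')),
            Node('F', Node('C'), Node('I')))
order = list(BreadthFirstWalker(tree))
print(' '.join(order))
```

Since this walker takes a tree as its only argument, it should presumably also work when passed as `walker=BreadthFirstWalker` to `IterBinTree`.  Unlike `TreeWalker`, it computes each value lazily rather than building the whole sequence up front.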