# Iteration Inside Out

## Inside Python's Iteration Protocol

### Naomi Ceder
#### 2020-05-08 2 PM CDT, via https://www.twitch.tv/nceder/
#### Videos also archived at https://bit.ly/exploring-ceder

#### Yes, I am available for corporate training - contact me for info

#### https://naomiceder.tech, @naomiceder

## Before we start 

This notebook can (will) be found at https://github.com/nceder/exploring_python

*The Quick Python Book*, 3rd ed (contact me for a code) - http://bit.ly/quick-python

PyCon 2020 Online! - https://us.pycon.org/2020/online/ 

PyCon 2020 Online YouTube channel - https://www.youtube.com/channel/UCMjMBMGt0WJQLeluw6qNJuA

### A note about Python shells

We'll be using this notebook to create cells that connect to a session of the Python interpreter. This kind of session is often called a REPL (read-eval-print loop). It reads what you type, evaluates it, prints the result, and repeats until you stop it.

Other ways of having a Python REPL (what I, as an old-timer, call a "shell") are:
* running Python at the command line
* using IPython
* using the shell window in IDLE
* using the shell/command window in many IDEs

I'm using Jupyter so that I can package a little bit of text more easily. 

**If you want to play along (please do), you can use whatever works for you.**

## Iteration = repetition with code and data

## Iteration protocol

### “Python’s most powerful useful feature”

-- Dave Beazley, "[Iterations of Evolution: The Unauthorized Biography of the For-Loop](https://www.youtube.com/watch?v=2AXuhgid7E4)"

In [None]:
# for loop (Python style)
a_list = [1, 2, 3, 4]

for item in a_list:
    print(item)

## Obvious, right?

It wasn't always so obvious...

## It *used* to be surprising

### Python and `for` loops

The `for` statement in Python differs a bit 
from what you may be
used to in C or Pascal.  Rather than always iterating over an
arithmetic progression of numbers (like in Pascal), or leaving the user
completely free in the iteration test and step (as C), Python's for 
statement iterates over the items of any sequence (e.g., a list
or a string), in the order that they appear in the sequence.

-- Python V 1.1 Docs, 1994

### And it works the same for different types
* `for key in a_dictionary:`
* `for char in a_string:`
* `for record in query_results:`
* `for line in a_file:`

etc...

## How does that work?

* **How does a `for` loop know the “next” item?**
* **How can `for` loops use so many different types?**
* **What makes an object “work” in a `for` loop?**

## Iteration protocol

* iteration in Python relies on a **protocol**, not types (from Python 2.2)
* It's a good example of Python's “duck typing” - anything that follows the protocol can be iterated over

### Iteration Protocol: 
* for iteration you need an **iterable** object
* and an **iterator** (which Python usually handles for you)
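
As a rough sketch of those two roles, the built-in `iter()` and `next()` functions can be called by hand on a few built-in types. The variable names are only for illustration.

```python
# Ask three different built-in types for an iterator; each one follows the same protocol.
list_iterator = iter([1, 2, 3])
str_iterator = iter("abc")
dict_iterator = iter({"x": 10, "y": 20})   # iterating a dict yields its keys

first_from_list = next(list_iterator)   # 1
first_from_str = next(str_iterator)     # 'a'
first_from_dict = next(dict_iterator)   # 'x'

print(first_from_list, first_from_str, first_from_dict)
```

The list, string, and dict are the iterables; the objects returned by `iter()` are the iterators that keep track of what comes next.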

## iterable

An object capable of returning its members **one at a time.** Examples of iterables include **all sequence types** (such as `list`, `str`, and `tuple`) and **some non-sequence types** like `dict`, file objects, and objects of any **classes you define** with an **`__iter__()`** method or with a **`__getitem__()`** method that implements Sequence semantics.

Iterables can be used in a `for` loop and in many other places where a sequence is needed (`zip()`, `map()`, …). When an iterable object is passed as an argument to the built-in function `iter()`, it returns an **iterator** for the object. This iterator is good for **one pass** over the set of values. When using iterables, it is usually **not necessary to call `iter()`** or deal with iterator objects yourself. The `for` statement **does that automatically for you,** creating a **temporary unnamed variable** to hold the iterator for the duration of the loop. *See also iterator, sequence, and generator.*

--Python glossary

## Iterable
* returns members one at a time
* e.g., `list`, `str`, `tuple` (sequence types)
* any class with `__iter__()` method that returns iterator
* **or** any class with `__getitem__()` with sequence semantics
* `for` statement creates an unnamed iterator from iterable automatically

### An iterable...

must return an iterator when the `iter()` function is called on it.

#### There are 2 ways an object can return an iterator - it can
* have a **`__getitem__()`** method with Sequence semantics - i.e., access items by integer index in [ ].
* implement an **`__iter__()`** method that returns an iterator (more on this soon)


### Is it an iterable?
* Does it have an `__iter__()` method?

In [None]:
# check with hasattr
a_list = [1, 2, 3, 4]

hasattr(a_list, "__iter__")

* Does it have `__getitem__()` that is sequence compliant? (harder to decide)
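
Since the `__getitem__()` case is hard to judge by inspection, one practical check is simply to ask `iter()` and see whether it raises `TypeError`. The helper `is_iterable` and the class `OnlyGetItem` below are hypothetical names for this sketch. Note that `iter()` only checks that `__getitem__()` exists, not that it really follows sequence semantics.

```python
def is_iterable(obj):
    """Return True if iter() can produce an iterator for obj."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class OnlyGetItem:
    """Hypothetical class: no __iter__(), just __getitem__() with sequence semantics."""
    def __getitem__(self, index):
        if 0 <= index < 3:
            return index * 10
        raise IndexError(index)


print(hasattr(OnlyGetItem(), "__iter__"))  # False
print(is_iterable(OnlyGetItem()))          # True
print(is_iterable(42))                     # False
```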

## Let’s make an iterable -  `Repeater`

An object that can be iterated over and returns the same value a specified number of times.

```
repeat = Repeater("hello", 4)

for i in repeat:
    print(i)

hello
hello
hello
hello
```

### As an iterable, using `__getitem__()`

In [None]:
class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        
    def __getitem__(self, index):
        if 0 <= index < self.limit:
            return self.value
        else:
            raise IndexError

In [None]:
repeat = Repeater("hello", 4)

# does it have an __iter__ method?
hasattr(repeat, "__iter__")

In [None]:
# __getitem__ with sequence semantics?

repeat[4]

In [None]:
# can the iter() function return an iterator?

iter(repeat)

In [None]:
# for loop

for item in repeat:
    print(item)

In [None]:
# list comprehension

[x for x in repeat]

### Behind the scenes

* an iterator is being created from the `repeat` object
* it can return the items using integer indexes starting from 0
* it continues until an `IndexError` is raised
* each time the object is iterated over, a new iterator is created and it starts from the beginning
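
The steps above can be approximated in plain Python. This is only a sketch of the behavior, not how the interpreter actually implements it, and `items_by_index` is a made-up helper name.

```python
class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __getitem__(self, index):
        if 0 <= index < self.limit:
            return self.value
        raise IndexError(index)


def items_by_index(obj):
    """Collect items the way the automatic iterator fetches them: obj[0], obj[1], ..."""
    items = []
    index = 0
    while True:
        try:
            items.append(obj[index])
        except IndexError:      # the signal that the sequence has ended
            break
        index += 1
    return items


print(items_by_index(Repeater("hello", 4)))  # ['hello', 'hello', 'hello', 'hello']
```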

In [None]:
class Repeater:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        
    def __getitem__(self, index):      # The bit we need for an iterable
        if 0 <= index < self.limit:
            return self.value
        else:
            raise IndexError      # only needed if we want iteration to end

### Yes, it's really that simple...

* ONLY the `__getitem__()` method was needed
* an IndexError is needed to end iteration


### So let's make a Fibonacci iterable...

Create a class that, when instantiated, lets you use index access to get the nth Fibonacci number, again up to the limit specified.

* takes a maximum number
* uses sequence semantics


```
fibs = FibIterable(6)
fibs[5] --> 8
fibs[0] --> 1
```

To make a class indexable it needs to implement a `__getitem__` method that takes an index and returns the element **and** it needs to raise an IndexError when the index is >= the limit.
```
class SomeIndexable(object):
    def __getitem__(self, index):
         return an_item
```

You should be able to use this in a for loop as well.

In [None]:
class FibIterable:
    def __init__(self, limit):
        self.limit = limit
        
    def __getitem__(self, index):
        # sequence semantics: anything outside 0..limit-1 raises IndexError
        if not 0 <= index < self.limit:
            raise IndexError(index)
        a, b = 1, 1
        # fib[0] and fib[1] are both 1; each further step advances the pair
        for _ in range(1, index):
            a, b = b, a + b
        return b

fib = FibIterable(10)

fib[4]
#for f in fib:
#    print(f)

## But... what's an *Iterator*?

The Python `for` loop relies on being able to get a **next** item, but...

* the **iterable** doesn't know which item is next
* **the loop itself doesn't care** exactly where in the series that item is (or what type it is)
* the loop relies on the **iterator** to keep track of what's next
* any object that can do that can be iterated over, i.e., it is an **iterator**


An **iterator** has a `__next__()` method (`next()` in Python 2) that tracks and returns the next item in the series. You call the built-in `next()` function on the iterator to get that item.

### Iterator
* has `__next__()` method
* calls to `__next__()` method (`next()` function) return successive items
* raises `StopIteration` when no more data
* further calls just raise `StopIteration`
* must have `__iter__()` method, which returns self
* iterators are therefore iterables
* once exhausted they do not “refresh”

### iterator

An object representing **a stream of data**. Repeated **calls to the iterator’s `__next__()`** method (or passing it to the built-in function `next()`) **return successive items** in the stream. When **no more data are available a StopIteration exception is raised** instead. At this point, the iterator object is exhausted and any further calls to its `__next__()` method just raise StopIteration again... 

..Iterators are required to have an `__iter__()` method that returns the iterator object itself so **every iterator is also iterable** and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. **A container object (such as a list) produces a fresh new iterator each time** you pass it to the `iter()` function or use it in a for loop. Attempting this **with an iterator will just return the same exhausted iterator object** used in the previous iteration pass, making it appear like an empty container.

--Python glossary
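
The difference between a container and an iterator on a second pass can be sketched with a plain list. The variable names are only for illustration.

```python
numbers = [1, 2, 3]

# A list hands out a fresh iterator on every pass.
first_pass = [n for n in numbers]
second_pass = [n for n in numbers]

# An iterator is its own iterator, so a second pass finds it exhausted.
numbers_iterator = iter(numbers)
same_object = iter(numbers_iterator) is numbers_iterator
iterator_first_pass = [n for n in numbers_iterator]
iterator_second_pass = [n for n in numbers_iterator]

print(first_pass, second_pass)                     # [1, 2, 3] [1, 2, 3]
print(same_object)                                 # True
print(iterator_first_pass, iterator_second_pass)   # [1, 2, 3] []
```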

### Let’s make an iterator - `RepeatIterator`

* implement `__next__()` method to return next item
* implement `__iter__()` method to return itself

In [None]:
class RepeatIterator:
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        self.count = 0
        
    def __next__(self):  
        if self.count < self.limit:
            self.count += 1
            return self.value
        else:
            raise StopIteration
            
    def __iter__(self):
        return self


In [None]:
repeat_iter = RepeatIterator("Hi", 4)

# __getitem__ with sequence semantics?
repeat_iter[0]

In [None]:
repeat_iter = RepeatIterator("Hi", 4)
# does it have an __iter__ method?
hasattr(repeat_iter, "__iter__")

In [None]:
# does it return next item using next() function?

next(repeat_iter)

In [None]:
# calling iter on it, returns object itself
print(repeat_iter)

repeat_iter_iter = iter(repeat_iter)
print(repeat_iter_iter)

In [None]:
# calling iter() on iterable always returns new iterator
print(repeat)
old_repeat_iter = iter(repeat)
print(old_repeat_iter)

In [None]:
# after 1 next(), how many repetitions left?


for item in repeat_iter:
    print(item) 


### So making an iterator is pretty easy, too...
* `__next__()` method 
* `__iter__()` method that returns self
* “exhaustion” after one pass

In [None]:
# Let's loop again

for item in repeat_iter:
    print(item)


In [None]:
# one more next?
next(repeat_iter) 


### Behind the scenes
* `for` called `iter()` on repeat_iter, which returned itself (to anonymous var)
* `for` called `next()` on iterator to get values for loop
* `for` caught `StopIteration` and stopped iterating
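
Those three steps can be written out by hand as a `while` loop. This is a sketch of what the `for` statement does for you, and `manual_for` is a hypothetical helper name.

```python
def manual_for(iterable):
    """Print each item, spelling out by hand what a for loop does."""
    iterator = iter(iterable)        # for calls iter() once, up front
    seen = []
    while True:
        try:
            item = next(iterator)    # for calls next() on each pass
        except StopIteration:        # for catches this and ends quietly
            break
        print(item)
        seen.append(item)
    return seen


manual_for("abc")
```

The returned list is there only so the result is easy to inspect; a real `for` loop returns nothing.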

### (but you probably want to use a generator instead... see below)

## Iteration in Python

* is a **protocol** (since Python 2.2)
* requires an **iterable** to iterate over
* requires an **iterator** (often automatically created behind the scenes) to track what's **next**
* **iterators can be used as iterables,** but don't "renew"
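
If you want an object that does "renew", one common pattern is to keep the iterable and the iterator as separate classes, so that `__iter__()` returns a new iterator on every pass. `RepeatIterable` is a hypothetical name for this sketch; `RepeatIterator` is repeated here so the example stands on its own.

```python
class RepeatIterator:
    """Iterator: tracks the position and is used up after one pass."""
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit
        self.count = 0

    def __next__(self):
        if self.count >= self.limit:
            raise StopIteration
        self.count += 1
        return self.value

    def __iter__(self):
        return self


class RepeatIterable:
    """Hypothetical iterable: __iter__() builds a new iterator for every pass."""
    def __init__(self, value, limit):
        self.value = value
        self.limit = limit

    def __iter__(self):
        return RepeatIterator(self.value, self.limit)


greetings = RepeatIterable("hi", 2)
print(list(greetings))   # ['hi', 'hi']
print(list(greetings))   # ['hi', 'hi'] again, because each pass gets a new iterator
```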


In [None]:
class MyCounter(object):
    def __init__(self, limit):
        self.limit = limit
        self._i = 0
    
    # def next(self): in Python2.x
    def __next__(self):
        if self._i < self.limit:
            cur_value = self._i
            self._i += 1
            return cur_value
        else: 
            raise StopIteration()
    def __iter__(self):
        return self

for x in MyCounter(4):
    print(x)

### How about a Fibonacci iterator?

* takes maximum number
* doesn't use sequence semantics

Let's create an iterator class that returns Fibonacci numbers up to the limit given when the object was created.

So:
```
my_fibiter = FibIter(6)
for fib in my_fibiter:
    print(fib)
    
1
1
2
3
5
8
```

In [None]:
class FibIterator:
    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self.a = 1
        self.b = 1
        
    def __next__(self):
        # check the limit first, so limits below 2 also stop correctly
        if self.count >= self.limit:
            raise StopIteration
        if self.count < 2:
            fib = self.b
        else:
            fib = self.a + self.b
            self.a, self.b = self.b, fib
        self.count += 1
        return fib
                
    def __iter__(self):
        return self
    
fib2 = FibIterator(10)

for f in fib2:
    print(f)

## Generators

### Generator expressions

A generator expression is another way to create an iterator and use `next()` to step through it.

* similar to a list comprehension, but uses ( ) instead of [ ]
* evaluates lazily, producing each item only when it is requested (unlike a list comprehension)

In [None]:
# create generator expression
repeat_gen_exp = ("hiya" for _ in range(5))
print(repeat_gen_exp)
print(hasattr(repeat_gen_exp, "__iter__"))

print(next(repeat_gen_exp))
for x in repeat_gen_exp:
    print(x)

#for x in repeat_gen_exp:
#    print(x)

#next(repeat_gen_exp)

In [None]:
a_list = [1, 2, 3, 4]

# generator expression is evaluated lazily, so it sees items appended during the loop
a_list_gen = (z for z in a_list)
for x in a_list_gen:
    print(x)
    if len(a_list) < 10:
        a_list.append(x)
a_list

In [None]:
a_list = [1, 2, 3, 4]

# list comp is generated all at once, won't change
a_list_comp = [z for z in a_list]
for x in a_list_comp:
    print(x)
    if len(a_list) < 10:
        a_list.append(x)
a_list

### Behind the scenes

* a generator object is created when the expression is encountered, but its body does not run until it is iterated over
* a generator object is an iterator
* generator expressions behave like compact generator functions (see below)
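
One way to watch that laziness is to record when each item is actually computed. `noisy_double` and the `calls` list are hypothetical helpers for this sketch.

```python
calls = []

def noisy_double(n):
    """Hypothetical helper that records each call so we can see when work happens."""
    calls.append(n)
    return n * 2

lazy = (noisy_double(n) for n in [1, 2, 3])
calls_after_creation = list(calls)      # [] - nothing has run yet

first = next(lazy)
calls_after_first_next = list(calls)    # [1] - only one item was computed

eager = [noisy_double(n) for n in [10, 20]]
calls_after_list_comp = list(calls)     # [1, 10, 20] - the list comprehension ran at once

print(calls_after_creation, first, calls_after_first_next, calls_after_list_comp)
```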


## Generator functions

Generator functions are functions that behave like iterators. 

* They save their state, so that they can know which is next
* They use the `yield` keyword, instead of `return` (`yield` makes a function a generator)
* generator functions return iterators

In [None]:
def repeat_gen(value, limit):
    for i in range(limit):
        yield value
    
repeat_gen_obj = repeat_gen("olá", 5)
print(repeat_gen_obj)
for item in repeat_gen_obj: 
    print(item) 
    

for x in repeat_gen_obj:
    print("x =", x)

#repeat_gen_obj = repeat_gen("olá", 5)
#print(repeat_gen_obj)#next(gen_ob)
#print(repeat_gen_obj)
#print(hasattr(repeat_gen_obj, '__next__'))
#print(hasattr(repeat_gen_obj, '__iter__'))  

In [None]:
# But can you nest them?


for x in repeat_gen("hi", 5):
    print("x =", x)
    for z in repeat_gen("hello", 2):
        print("  z =", z )

print(repeat_gen)
print(repeat_gen("hi", 5))

### Behind the scenes

* calling a generator function (one that contains `yield`) creates a generator object; the body does not run yet
* a generator object is an iterator
* the generator object pauses and saves its state at each `yield`
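
A small sketch makes the pause-and-resume behavior visible. `countdown` is a hypothetical generator written for this illustration.

```python
def countdown(start):
    """Hypothetical generator that logs when its body actually runs."""
    print("body started")
    while start > 0:
        yield start          # pause here; the local variable start is kept
        start -= 1
    print("body finished")


counter = countdown(2)        # creates the generator object; nothing is printed yet
print(next(counter))          # prints "body started", then 2
print(next(counter))          # resumes after the yield, prints 1
print(next(counter, "done"))  # prints "body finished", then the default "done"
```

Passing a default as the second argument to `next()` avoids the `StopIteration` exception once the generator is exhausted.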

### Exercise: How could we make a fibonacci generator?

Considering what we've seen about generator functions, how could we make a generator function for Fibonacci numbers?

Based on the example above, implement a generator that yields the first n Fibonacci numbers.

In [None]:
def fib_gen(limit):
    cur_fib = 0
    next_fib = 1
    yield next_fib
    count = 1
    while count < limit:
        cur_fib, next_fib = next_fib, cur_fib + next_fib
        yield next_fib
        count += 1
        
for x in fib_gen(6):
    print(x)

## Resources

* [Python Tutorial - iterators](https://docs.python.org/2.7/tutorial/classes.html#iterators)
* [Python Tutorial - generators](https://docs.python.org/2.7/tutorial/classes.html#generators)
* [Python Tutorial - generator expressions](https://docs.python.org/2.7/tutorial/classes.html#generator-expressions)
* [Iterator types documentation](https://docs.python.org/dev/library/stdtypes.html#iterator-types)
* [Iterators, Functional Programming HOWTO](https://docs.python.org/dev/howto/functional.html#iterators)
* [Iterations of Evolution: The Unauthorized Biography of the For-Loop](https://www.youtube.com/watch?v=2AXuhgid7E4) - Dave Beazley, PyCon Pakistan 2017

## Thanks

### Final Notes


This notebook - https://github.com/nceder/exploring_python

Videos also archived at https://bit.ly/exploring-ceder

*The Quick Python Book*, 3rd ed, (contact me for a code) - http://bit.ly/quick-python

Me - https://naomiceder.tech, @naomiceder

PyCon 2020 Online! - https://us.pycon.org/2020/online/ 

PyCon 2020 Online YouTube channel - https://www.youtube.com/channel/UCMjMBMGt0WJQLeluw6qNJuA

In [None]:
class FibIter:
    def __init__(self, count):
        self._a, self._b = 0, 1
        self._count = count

    def __next__(self):
        if self._count <= 0:
            raise StopIteration

        self._count = self._count - 1
        self._a, self._b = self._b, self._a + self._b
        return self._a

    def __iter__(self):
        return self


fib6 = FibIter(6)
for fib in fib6:
    print(fib)

In [None]:
def fib_gen_while(value):
    a, b = 0, 1
    while value > 0:
        a, b = b, a + b
        value = value - 1
        yield a

def fib_gen_for(value):
    a, b = 0, 1
    for x in range(value):
        a, b = b, a + b
        yield a

def test_for():
    for fib in fib_gen_for(10):
        print(fib)
        
def test_while():
    for fib in fib_gen_while(10):
        print(fib)
        
%timeit test_for()

%timeit test_while()