Iterators, Generators and Classic Coroutines

In [10]:
import sentence
from sentence import Sentence
s = Sentence('"The time has come," the Walrus said,')

In [15]:
for word in s:
    print(word)

list(s)
s[0]
s[5]
s[-1]

The
time
has
come
the
Walrus
said


'said'

Why Sequences Are Iterable: The iter Function

In [1]:
#Whenever Python needs to iterate over an object x, it automatically calls iter(x).

#The iter built-in function:

#1 Checks whether the object implements __iter__ and calls that to obtain an iterator.

#2 If __iter__ is not implemented but __getitem__ is, iter() creates an iterator that tries to fetch items by index, starting from 0.

#3 If that fails, Python raises TypeError, usually saying "'C' object is not iterable", where C is the class of the target object.

Important:

In [2]:
#Hence an object is considered iterable not only when it implements the __iter__ method, but also when it implements only the __getitem__ method.

In [3]:
#Hence all sequences are iterable: every sequence implements __getitem__, and the standard sequences implement __iter__ as well.

In [None]:
class Spam:
    def __getitem__(self, i):
        print('->', i)
        raise IndexError()


spam_can = Spam()
iter(spam_can)   #succeeds: iter() falls back to __getitem__
list(spam_can)   #prints '-> 0', then returns [] because of the IndexError

#Checking whether an object is iterable

from collections import abc
isinstance(spam_can, abc.Iterable)   #False: the ABC only looks for __iter__


class GooseSpam:
    def __iter__(self):
        pass


issubclass(GooseSpam, abc.Iterable)   #True
goose_spam_can = GooseSpam()
isinstance(goose_spam_can, abc.Iterable)   #True

#Important:

In [21]:
#The most accurate way to check whether an object x is iterable is to call iter(x) and handle a TypeError exception if it isn't. This is more accurate than using isinstance(x, abc.Iterable), because iter(x) also considers the legacy __getitem__ method, while the Iterable ABC does not.
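The check described above can be sketched as a small helper. The names is_iterable and Legacy are hypothetical and exist only for this illustration; Legacy supports iteration solely through __getitem__.

```python
from collections import abc


def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


class Legacy:
    """Hypothetical class that is iterable only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


legacy = Legacy()
print(is_iterable(legacy))               # True: iter() falls back to __getitem__
print(isinstance(legacy, abc.Iterable))  # False: the ABC only looks for __iter__
print(list(legacy))                      # [0, 10, 20]
print(is_iterable(42))                   # False: int has neither method
```

If the next step is to iterate anyway, it is usually simpler to skip the helper and wrap the iteration itself in try/except TypeError.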


In [9]:
#Using iter with a callable
from random import randint
def d6():
    return randint(1, 6)  #Returns a random integer between 1 and 6, inclusive

d6_iter = iter(d6, 1)   #Calls d6 repeatedly and stops when it returns the sentinel 1, which is never yielded

for roll in d6_iter:
    print(roll)

6
5


In [None]:
#One useful application of the second form of iter() is to build a block reader, for example reading fixed-width blocks from a binary database file until the end of file is reached. Here process_block stands for whatever function handles each block:

from functools import partial

with open('mydata.db', 'rb') as f:
    read64 = partial(f.read, 64)
    for block in iter(read64, b''):
        process_block(block)
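The block reader can be tried without a real file. The sketch below assumes an in-memory io.BytesIO object as a stand-in for mydata.db and simply records block sizes instead of calling a processing function.

```python
import io
from functools import partial

# Hypothetical stand-in for a binary file: 150 bytes held in memory.
data = io.BytesIO(bytes(range(150)))

# partial builds a zero-argument callable that reads up to 64 bytes per call.
read64 = partial(data.read, 64)

# iter(callable, sentinel) keeps calling read64 until it returns b'' (end of data).
block_sizes = [len(block) for block in iter(read64, b'')]
print(block_sizes)  # [64, 64, 22]
```

The last block is shorter than 64 bytes because read returns whatever is left, and the empty bytes sentinel itself is never yielded.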

Iterables Versus Iterators

In [19]:
#Iterable

#Any object from which the iter built-in function can obtain an iterator. Objects implementing an __iter__ method returning an iterator are iterable. Sequences are always iterable, as are objects implementing a __getitem__ method that accepts 0-based indexes.

#Here is a simple for loop iterating over a str. The str 'ABC' is the iterable here. You don't see it, but there is an iterator behind the curtain.
from collections import abc
s = 'ABC'
for char in s:
    print(char)


A
B
C


In [33]:
a = iter(range(3))
a.__next__()
s = 'ABC'
iter(s).__next__()
#s.__next__() would fail: a str is iterable but is not an iterator, so it has no __next__ method

'A'

In [54]:
#If there were no for statement and we had to emulate the for machinery by hand with a while loop, this is what we'd have to write:


s = 'ABC'
a = []

while len(a)==0:
    b = iter(s)
    c = iter(s)
    while len(a)<len(s):
        a.append(c.__next__())
        try:
            print(b.__next__())
        except StopIteration:
            break
        
#The version given in the book:
#>>> s = 'ABC'
#>>> it = iter(s)
#>>> while True:
#...     try:
#...         print(next(it))
#...     except StopIteration:
#...         del it
#...         break

#Build an iterator it from the iterable.

#Repeatedly call next on the iterator to obtain the next item.



A
B
C


In [1]:
#StopIteration signals that the iterator is exhausted. This exception is handled internally by the for loop machinery and by other iteration contexts such as list comprehensions, iterable unpacking, etc.

#Python's standard interface for an iterator has two methods:

#__next__
#Returns the next item in the series, raising StopIteration if there are no more.

#__iter__
#Returns self; this allows iterators to be used where an iterable is expected, for example in a for loop.

#That interface is formalized in the collections.abc.Iterator ABC, which declares the __next__ abstract method and subclasses Iterable, where the abstract __iter__ method is declared.



In [2]:
#__subclasshook__ supports structural type checks with isinstance and issubclass. We saw it in "Structural Typing with ABCs".

#_check_methods traverses the __mro__ of the class to check whether the methods are implemented in its base classes. It is defined in the same module as the ABC. If the methods are implemented, the C class will be recognized as a virtual subclass of Iterator.


##IMP -- Because the only methods required of an iterator are __next__ and __iter__, there is no way to check whether there are remaining items, other than to call next() and catch StopIteration. Also, it's not possible to reset an iterator. If you need to start over, you need to call iter() on the iterable that built the iterator in the first place.

#Calling iter() on the iterator itself won't help either because, as mentioned, Iterator.__iter__ is implemented by returning self, so this will not reset a depleted iterator.

#That minimal interface is sensible because in reality not all iterators are resettable. For example, if an iterator is reading packets from the network, there's no way to rewind it.
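A short sketch of the points above, using a plain str as the iterable: a depleted iterator stays depleted, iter() on the iterator returns the same object, and only a new call to iter() on the original iterable starts over.

```python
s = 'ABC'
it = iter(s)

first_pass = list(it)     # consumes the iterator completely
second_pass = list(it)    # the iterator is exhausted, so nothing is left
same_object = iter(it) is it   # Iterator.__iter__ returns self: no reset

fresh = iter(s)           # starting over requires a new iterator from the iterable
restarted = next(fresh)

print(first_pass)   # ['A', 'B', 'C']
print(second_pass)  # []
print(same_object)  # True
print(restarted)    # A
```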

Sentence Classes with `__iter__`

In [None]:
#Sentence Take #2: A Classic Iterator


##Also: don't make the iterable an iterator for itself

#Iterables have an __iter__ method that instantiates a new iterator every time. Iterators implement a __next__ method that returns individual items, and an __iter__ method that returns self.

#Therefore, iterators are also iterable, but iterables are not iterators.

#It may be tempting to implement __next__ in addition to __iter__ in the Sentence class, making each Sentence instance at the same time an iterable and an iterator over itself. But this is rarely a good idea. It's also a common antipattern, according to Alex Martelli, who has a lot of experience reviewing Python code at Google.
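The separation between iterable and iterator might look like the sketch below. The sentence module used earlier is not shown in these notes, so the regular expression, the attribute names, and the SentenceIterator class body here are assumptions made for illustration.

```python
import re

RE_WORD = re.compile(r'\w+')  # assumed word pattern


class Sentence:
    """Iterable: builds a new iterator on every __iter__ call."""

    def __init__(self, text):
        self.text = text
        self.words = RE_WORD.findall(text)

    def __iter__(self):
        return SentenceIterator(self.words)


class SentenceIterator:
    """Iterator: keeps its own position and returns self from __iter__."""

    def __init__(self, words):
        self.words = words
        self.index = 0

    def __next__(self):
        try:
            word = self.words[self.index]
        except IndexError:
            raise StopIteration()
        self.index += 1
        return word

    def __iter__(self):
        return self


s = Sentence('"The time has come," the Walrus said,')
print(list(s))            # ['The', 'time', 'has', 'come', 'the', 'Walrus', 'said']
it = iter(s)
print(iter(it) is it)     # True: the iterator returns itself
print(iter(s) is iter(s)) # False: each call builds an independent iterator
```

Because every call to iter(s) returns a fresh SentenceIterator, several independent traversals of the same Sentence can be in progress at once, which would not be possible if Sentence were its own iterator.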


###Sentence Take #3: A Generator Function


In [None]:
#How a generator works

#Any Python function that has the yield keyword in its body is a generator function: a function which, when called, returns a generator object. In other words, a generator function is a generator factory.

In [1]:
#A generator function that yields three numbers


def gen_123():
    yield 1
    yield 2
    yield 3
g = gen_123()
next(g)
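The "generator factory" idea can be checked directly. This sketch redefines gen_123 so that it is self-contained, and uses inspect.isgeneratorfunction and collections.abc.Iterator to show that calling the function returns an object implementing the iterator interface.

```python
import inspect
from collections import abc


def gen_123():
    yield 1
    yield 2
    yield 3


is_gen_func = inspect.isgeneratorfunction(gen_123)  # the function itself is a factory
g = gen_123()                                       # calling it runs no body code yet
is_iterator = isinstance(g, abc.Iterator)           # generator objects have __next__ and __iter__
returns_self = iter(g) is g
items = list(g)                                     # drives the body through all three yields
after = next(g, 'exhausted')                        # default is returned once the generator is done

print(is_gen_func, is_iterator, returns_self)  # True True True
print(items)                                   # [1, 2, 3]
print(after)                                   # exhausted
```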

In [4]:
#The following example makes the interaction between a for loop and the body of the function more explicit


def gen_AB():
    print('start')
    yield 'A'
    print('continue')
    yield 'B'
    print('end.')

In [5]:
for c in gen_AB():
    print('-->',c)

start
--> A
continue
--> B
end.


In [10]:
a = iter(gen_AB())
a.__next__()

#Hence it is clear that __iter__ is a generator function which, when called, builds a generator object that implements the Iterator interface, so the SentenceIterator class is no longer needed.

##--About lazy evaluation

##Laziness is considered a good trait, at least in programming languages and APIs. A lazy implementation postpones producing values to the last possible moment. This saves memory and may avoid wasting CPU cycles, too.

start


'A'

Lazy Sentences

In [11]:
#Sentence Take #4: Lazy Generator

#The Iterator interface is designed to be lazy: next(it) yields one item at a time. The opposite of lazy is eager: lazy evaluation and eager evaluation are technical terms in programming language theory.


#Our Sentence implementations so far have not been lazy because __init__ eagerly builds a list of all words in the text, binding it to the self.words attribute. This requires processing the entire text, and the list may use as much memory as the text itself. Most of this work will be in vain if the user only iterates over the first couple of words. If you wonder, "Is there a lazy way of doing this in Python?", the answer is often "Yes".


##re.finditer is the lazy version of re.findall: instead of a list, it returns an iterator that produces match objects on demand
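A lazy Sentence along these lines might look like the following sketch. The sentence_gen2.py file is not shown in these notes, so the class name LazySentence and the word pattern are assumptions.

```python
import re

RE_WORD = re.compile(r'\w+')  # assumed word pattern


class LazySentence:
    """Hypothetical lazy variant: no word list is built in __init__."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        # finditer scans the text on demand, one match per next() call
        for match in RE_WORD.finditer(self.text):
            yield match.group()


words = iter(LazySentence('The time has come'))
first = next(words)    # only the first word has been matched so far
second = next(words)
rest = list(words)     # the remaining words are produced only now

print(first, second)   # The time
print(rest)            # ['has', 'come']
```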



In [None]:
#Refer to sentence_gen2.py


In [12]:
#The gen_AB generator function is used by a list comprehension, then by a generator expression

def gen_AB():
    print('start')
    yield 'A'
    print('continue')
    yield 'B'
    print('end.')

In [16]:
res1 = [x*3 for x in gen_AB()]


start
continue
end.


In [17]:
for i in res1:  #this for loop iterates over the res1 list built by the list comprehension
    print('---->',i)

----> AAA
----> BBB


In [19]:
res2 = (x*3 for x in gen_AB())
res2

<generator object <genexpr> at 0x7f81ff44be00>

In [20]:
for i in res2:
    print('---->',i)

start
----> AAA
continue
----> BBB
end.


In [21]:
#Only when the for loop iterates over res2 does this generator get items from gen_AB. Each iteration of the for loop implicitly calls next(res2), which in turn calls next() on the generator object returned by gen_AB(), advancing it to the next yield.

#refer to sentence_genexp.py

In [24]:

import sentence_genexp

s = sentence_genexp.Sentence("Helloe")

In [25]:
s

Sentence('Helloe')

When to Use Generator Expressions

In [26]:
#Contrasting Iterators and Generators

An Arithmetic Progression Generator

In [63]:
#My own version

class ArithmeticProgression1:
    def __init__(self, start, interval, end) -> None:
        self.start = start
        self.interval = interval
        self.end = end

    def __repr__(self) -> str:
        #Eagerly builds the whole series; unlike the book version, end is included when it is hit exactly
        cont = [self.start]
        while cont[-1] < self.end:
            cont.append(round(cont[-1] + self.interval, 3))
        if cont[-1] > self.end:
            cont.pop()  #drop the value that overshot end
        return f'{cont}'



In [65]:
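ap1 = ArithmeticProgression1(1, 0.2, 3)
#Note: the line below times only the lookup of the name ap1; use %timeit repr(ap1) to time building the series
%timeit ap1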

14.9 ns ± 0.858 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)


In [58]:
#The version from the book

class ArithmeticProgression2:
    def __init__(self, begin, step, end=None) -> None:
        self.begin = begin
        self.step = step
        self.end = end #None for infinite series


    def __iter__(self):
        result_type = type(self.begin + self.step)
        result = result_type(self.begin)
        forever = self.end is None
        index = 0
        while forever or result < self.end:
            yield result
            index += 1
            result = self.begin + self.step * index

In [66]:
ap2 = ArithmeticProgression2(1, 0.2, 5)
%timeit list(ap2)

3.78 µs ± 210 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


In [67]:
#Below is a generator function equivalent to the ArithmeticProgression class

def aritprog_gen(begin, step, end=None):
    result = type(begin + step)(begin)
    forever = end is None
    index = 0
    while forever or result < end:
        yield result
        index += 1
        result = begin + step * index

Arithmetic Progression with itertools


In [3]:
import itertools
gen = itertools.count(1,.5)
next(gen)

1

In [6]:
next(gen)

2.0

In [7]:
#However, itertools.count never stops: a for loop over gen, or a call to list(gen), would never finish

In [9]:
#On the other hand, there is the itertools.takewhile function: it returns a generator that consumes another generator and stops when a given predicate evaluates to False. So we can combine the two and write this:

gen = itertools.takewhile(lambda n: n < 3, itertools.count(1, .5))

In [11]:
list(gen)

[1, 1.5, 2.0, 2.5]

In [12]:
def aritprog_gen(begin, step, end=None):
    first = type(begin + step)(begin)
    ap_gen = itertools.count(first, step)
    if end is None:
        return ap_gen
    return itertools.takewhile(lambda n: n<end, ap_gen)

In [13]:
aritprog_gen(1,2)

count(1, 2)

Generator Functions in the Standard Library

In [14]:
#The standard library provides many generators, from plain-text file objects providing line-by-line iteration, to the awesome os.walk function, which yields filenames while traversing a directory tree, making recursive filesystem searches as simple as a for loop.

#The os.walk generator function is impressive, but in this section I want to focus on general-purpose functions that take arbitrary iterables as arguments and return generators that yield selected, computed, or rearranged items. In the following tables, I summarize two dozen of them, from the built-in, itertools, and functools modules. For convenience, I grouped them by high-level functionality, regardless of where they are defined.
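To make the os.walk remark concrete, here is a small sketch that builds a throwaway directory tree with tempfile and then collects every filename in it. The file and directory names are made up for the example. Each item os.walk yields is a (dirpath, dirnames, filenames) tuple, so the filenames come from its third element.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small hypothetical tree: root/top.txt, root/a/mid.txt, root/a/b/deep.txt
    os.makedirs(os.path.join(root, 'a', 'b'))
    for relative in ('top.txt',
                     os.path.join('a', 'mid.txt'),
                     os.path.join('a', 'b', 'deep.txt')):
        with open(os.path.join(root, relative), 'w') as f:
            f.write('x')

    # The recursive search is a plain nested loop over the generator.
    found = sorted(
        name
        for dirpath, dirnames, filenames in os.walk(root)
        for name in filenames
    )

print(found)  # ['deep.txt', 'mid.txt', 'top.txt']
```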


In [16]:
#Filtering generator function examples

def vowel(c):
    return c.lower() in 'aeiou'

list(filter(vowel, 'Aardvark'))  #list() consumes the iterator returned by filter, which keeps the items for which vowel is truthy

['A', 'a', 'a']

In [17]:
import itertools

list(itertools.filterfalse(vowel, 'Aardvark'))

['r', 'd', 'v', 'r', 'k']

In [19]:
list(itertools.dropwhile(vowel, 'Aardvark'))
#Skips items from the second argument while the predicate is truthy, then yields every remaining item without further checks

['r', 'd', 'v', 'a', 'r', 'k']

In [20]:
list(itertools.takewhile(vowel, 'Aardvark'))

['A', 'a']

In [22]:
list(itertools.compress('Aardvark', (1, 0, 1, 1, 0, 1)))

#Yields the items from the first argument whose corresponding item in the second argument (the selectors) is truthy

['A', 'r', 'd', 'a']

In [24]:
list(itertools.islice('Aardvark', 4))
#Yields the first 4 items (indexes 0 through 3)

['A', 'a', 'r', 'd']

In [26]:
list(itertools.islice('Aardvark', 4, 7))
#Yields the items from index 4 up to, but not including, index 7

['v', 'a', 'r']

In [28]:
list(itertools.islice('Aardvark', 1,7,2))
#Starts at index 1, stops before index 7,
#and yields every second item

['a', 'd', 'a']