#Examining Iterators in Python

In this session we are going to look at iterable and iterator objects.

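An iterable is an object that implements the \_\_iter\_\_ method. The \_\_iter\_\_() method must return an iterator.
An iterator is an object that implements the \_\_next\_\_ method, which returns the next item or raises StopIteration when there are no more items. If an iterable is also its own iterator, then its \_\_iter\_\_() method must return 'self'.


An iterable doesn't have to be an iterator. Python's iterator protocol does expect every iterator to also provide \_\_iter\_\_() returning 'self', so that an iterator can be used directly in a for loop. We can put \_\_iter\_\_() and \_\_next\_\_() in one class to make that class both an iterable and an iterator.

In [3]:
# this class is an iterator: it implements __next__
class Iterator_class:
    def __next__(self):
        # this stub has no items to produce, so it stops right away
        raise StopIteration

In [5]:
# this class is an iterable: __iter__ returns an iterator
class Iterable_class:
    def __iter__(self):
        return Iterator_class()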
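
To make the two roles concrete, here is a minimal sketch that follows the standard Python definitions (an iterable provides `__iter__`, an iterator provides `__next__`). The names `Countdown` and `CountdownIterator` are made up for this illustration and do not come from the book.

```python
class Countdown:
    """Iterable: knows how to hand out a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: produces the next value or raises StopIteration."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # Iterators conventionally return themselves here.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first_pass = list(countdown)     # [3, 2, 1]
second_pass = list(countdown)    # [3, 2, 1] again: each pass gets a new iterator

it = iter(countdown)
first_value = next(it)           # 3
```

Because the iterable creates a new iterator for every pass, the two passes do not interfere with each other. A class that returns 'self' from `__iter__` has to manage that state itself.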


From Dive Into Python 3, we are going to look at a class that is both an iterator and an iterable. It is going to read a file containing regular expression rules, then put the rules into a cache list to be used later.

In [6]:
class LazyRule:
    re_patternfile = 're_plural_rules.txt'

    def __init__(self):
        print('calling init')
        self.pattern_file = open(self.re_patternfile, 'r')
        self.cache = []

    def __iter__(self):
        self.cache_index = 0
        print('calling iter')
        return self

    def __next__(self):
        self.cache_index += 1
        print(self.cache_index, len(self.cache))
        if len(self.cache) >= self.cache_index:
            return self.cache[self.cache_index-1]

        if self.pattern_file.closed:
            print('stopIter: pattern file is closed ')
            raise StopIteration

        line = self.pattern_file.readline()
        if not line:
            print('stopIter: pattern file is exhausted')
            self.pattern_file.close()
            raise StopIteration

        pattern, search, replace = line.split(';', 3)
        func = build_match_and_apply(pattern, search, replace)
        self.cache.append(func)
        return func


In this class, the 're_patternfile' variable is a class variable. A class variable is available to all instances of the class and can be accessed via 'self.re_patternfile'.
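
The difference between a class variable and an instance variable can be sketched as follows. The class `Rule` and the file names are hypothetical and only serve this example.

```python
class Rule:
    pattern_path = 'rules.txt'      # class variable (hypothetical file name)

    def __init__(self):
        self.cache = []             # instance variable, one list per object


a = Rule()
b = Rule()

# Both instances see the class variable through the instance.
shared_before = (a.pattern_path, b.pattern_path)

# Assigning through an instance creates an instance attribute that
# shadows the class variable for that instance only.
a.pattern_path = 'other.txt'
after_instance_assignment = (a.pattern_path, b.pattern_path, Rule.pattern_path)

# Assigning on the class changes what every non-shadowing instance sees.
Rule.pattern_path = 'new_rules.txt'
after_class_assignment = (a.pattern_path, b.pattern_path)
```

Reading through 'self' falls back to the class when the instance has no attribute of that name, which is why 'self.re_patternfile' works inside \_\_init\_\_().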

Because our class is both an iterable and an iterator, the \_\_iter\_\_() method returns 'self' instead of returning a separate iterator object, and our class also implements the \_\_next\_\_() method. Here is how we use our LazyRule class:

        rules = LazyRule()
        for match_rule, apply_rule in rules:
            pass  # use match_rule and apply_rule here

First, we instantiate the class and assign that object to 'rules'. The instantiation calls \_\_init\_\_() to open self.re_patternfile. When we enter the for loop, Python calls the \_\_iter\_\_() method of rules. After that, the for loop keeps calling the \_\_next\_\_() method until StopIteration is raised.
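
What the for loop does behind the scenes can be approximated with a while loop. This is a rough sketch, not the interpreter's actual implementation, and `manual_for` is a name invented for the illustration.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)       # calls iterable.__iter__() once
    while True:
        try:
            item = next(iterator)   # calls iterator.__next__()
        except StopIteration:
            break                   # the loop ends quietly
        body(item)


seen = []
manual_for(['a', 'b', 'c'], seen.append)
```

The StopIteration exception never reaches our code when we use a real for loop; the loop treats it as the signal to stop.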

There are three parts in our \_\_next\_\_() method and we will dissect them next. However, we are going to visit them in reverse order.

        pattern, search, replace = line.split(';', 3)
        func = build_match_and_apply(pattern, search, replace)
        self.cache.append(func)
        return func
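This block at the end processes each new line from the rules file: it splits the line into three fields, builds the rule from them, appends the new rule to the cache, and returns func. Because the for loop unpacks each item into match_rule and apply_rule, func is expected to be a tuple of two functions.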
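
The notebook does not show build_match_and_apply itself. A version modelled on the closure-based helper from Dive Into Python 3 might look like the sketch below; the body of the function and the sample rule line are assumptions, not code taken from this notebook.

```python
import re


def build_match_and_apply(pattern, search, replace):
    """Return a (matches_rule, apply_rule) pair of closures (assumed shape)."""
    def matches_rule(word):
        return re.search(pattern, word)

    def apply_rule(word):
        return re.sub(search, replace, word)

    return matches_rule, apply_rule


# Hypothetical rule line in the ';'-separated format the class expects.
line = '[sxz]$;$;es\n'
pattern, search, replace = line.strip().split(';')
matches_rule, apply_rule = build_match_and_apply(pattern, search, replace)

plural = apply_rule('fox') if matches_rule('fox') else None
```

Note that this sketch strips the line before splitting. If a line read with readline() ends in a newline and is split without stripping, the newline stays attached to the last field.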

        if self.pattern_file.closed:
            print('stopIter: pattern file is closed ')
            raise StopIteration

        line = self.pattern_file.readline()
        if not line:
            print('stopIter: pattern file is exhausted')
            self.pattern_file.close()
            raise StopIteration
This block has two parts. First, if the file is already closed, it raises StopIteration; the for loop catches that exception and terminates. The second part reads the next line; readline() returns an empty string once the file is exhausted, in which case we close the file and raise StopIteration to terminate the for loop. When either of these conditions is met, we do not expand our cache list.
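
The 'if not line' test relies on how readline() behaves at the end of a file. Here is a small sketch of that behaviour, using io.StringIO as a stand-in for the rules file so that no file on disk is needed; the sample text is made up.

```python
import io

# io.StringIO stands in for the rules file (assumption for this example).
fake_file = io.StringIO('first\n\nthird\n')

lines = []
while True:
    line = fake_file.readline()
    if not line:            # readline() returns '' only at end of file
        fake_file.close()
        break
    lines.append(line)

# A blank line inside the file is '\n', which is truthy,
# so it does not end the loop early.
```

After the loop, 'fake_file.closed' is True, which is the same flag our \_\_next\_\_() method checks on later calls.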

        self.cache_index += 1
        print(self.cache_index, len(self.cache))
        if len(self.cache) >= self.cache_index:
            return self.cache[self.cache_index-1]
The first time we iterate over the rules, the 'if' condition in this block is always false: cache_index is incremented before the check, so it is always one more than the number of cached rules, and we fall through to read the file. Once everything in the rule file has been processed, StopIteration is raised and the loop ends. The next time we enter a for loop, \_\_iter\_\_() resets cache_index to zero and the 'if' condition becomes true. On each call, as long as the incremented cache_index does not exceed the cache length, \_\_next\_\_() returns the cached element and never reaches the 'if self.pattern_file.closed' check. (Remember that, once the first loop has run to completion, the file is closed, so later loops only return elements of the cache list.) When cache_index finally exceeds the length of the cache list, we no longer return a cached element; instead, we reach the block that checks for the closed file and raises StopIteration.
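
The two-pass behaviour can be observed with a stripped-down, in-memory variant of the same idea. This is a sketch under stated assumptions: `CachedLines`, its `reads` counter, and the use of io.StringIO instead of a real file are all invented for the illustration.

```python
import io


class CachedLines:
    """Reads lines lazily on the first pass, replays the cache afterwards."""

    def __init__(self, source):
        self.source = source        # any object with readline(), close(), closed
        self.cache = []
        self.reads = 0              # counts real reads, for illustration only

    def __iter__(self):
        self.cache_index = 0
        return self

    def __next__(self):
        self.cache_index += 1
        if len(self.cache) >= self.cache_index:
            return self.cache[self.cache_index - 1]     # replay from cache

        if self.source.closed:
            raise StopIteration

        line = self.source.readline()
        if not line:
            self.source.close()
            raise StopIteration

        self.reads += 1
        item = line.strip()
        self.cache.append(item)
        return item


lines = CachedLines(io.StringIO('a\nb\nc\n'))
first_pass = list(lines)
reads_after_first = lines.reads
second_pass = list(lines)
reads_after_second = lines.reads
```

Both passes yield the same items, but the read counter does not grow during the second pass, because every item comes from the cache.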

###Lessons Learned
1. In \_\_next\_\_(), the check that returns a cached item must come before the checks that raise StopIteration; otherwise later loops would stop immediately on the closed file.
2. \_\_iter\_\_() is called once each time we enter a for loop.
3. Iterator and iterable are different protocols.
4. One class can be both an iterator and an iterable.