# Lecture 11: Iterators and generators

- Implement iterator behavior in custom classes so they can be looped through using the same convenient syntax as built-in containers
- Use generators to create simple, custom iterators

__Reading material__: 
- [Python tutorial](https://docs.python.org/3.9/tutorial/) 9.8 - 9.10
- Optional: http://anandology.com/python-practice-book/iterators.html

We’re learning to write the blueprint for our own container objects, that is, objects that contain multiple elements that we can access individually and iterate over in a for loop. Python lists, tuples, sets, and dictionaries are all built-in container objects. 

In [1]:
for element in [1, 2, 3]:  # List
    print(element) 
for element in (1, 2, 3):  # tuple
    print(element)
for key in {'one':1, 'two':2}:  # dict
    print(key)
for char in "123":  # string
    print(char)

1
2
3
1
2
3
one
two
1
2
3


In [2]:
myList = [1, 2, 3]

In [5]:
# dir(myList)
myIterator = iter(myList)

In [6]:
type(myIterator)

list_iterator

In [7]:
# dir(myIterator)
print(next(myIterator))

1


In [5]:
print(next(myIterator))

2


In [6]:
print(next(myIterator))

3


In [8]:
print(next(myIterator))

StopIteration: 

In [7]:
# A for loop first creates an iterator by calling iter(myList),
# then repeatedly calls next() on that iterator until it raises StopIteration.
for elem in myList:
    print(elem)

1
2
3
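
To make the two steps in the comments above concrete, here is a minimal sketch of roughly what a for loop does behind the scenes. The names `my_list`, `iterator`, and `collected` are only illustrative, and the real loop is implemented inside the interpreter rather than as Python code.

```python
my_list = [1, 2, 3]
collected = []

iterator = iter(my_list)          # step 1: ask the container for an iterator
while True:
    try:
        element = next(iterator)  # step 2: ask the iterator for the next element
    except StopIteration:
        break                     # the iterator is exhausted, so the loop ends
    collected.append(element)     # this line plays the role of the loop body

print(collected)  # [1, 2, 3]
```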


## Iterators
Now let's create our own iterable container.
Our custom container might have a built-in container object as an instance variable. For instance, the following __Reverse__ class has an instance variable `data` that is already a container. In this case, we just need to write some code that tells a for loop how to iterate over that instance variable.

In [8]:
class Reverse:
    """Container that can be looped over backwards."""
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return ReverseIterator(self)
    
class ReverseIterator:
    def __init__(self, reverseObject):
        self.index = len(reverseObject.data)
        self.r = reverseObject
        
    def __next__(self):
        if self.index == 0:   # index starts at len(data) (4 for 'spam'); 0 means no elements are left
            raise StopIteration
        self.index = self.index - 1
        return self.r.data[self.index]

In [9]:
rev = Reverse('spam')

In [10]:
print(rev.data)

spam


In [11]:
revIter = iter(rev)

In [12]:
print(next(revIter))

m


In [13]:
print(next(revIter))

a


In [14]:
print(revIter.r, revIter.index)

<__main__.Reverse object at 0x103f6c880> 2


In [15]:
print(next(revIter))
print(revIter.r, revIter.index)

p
<__main__.Reverse object at 0x103f6c880> 1


In [16]:
print(next(revIter))
print(revIter.r, revIter.index)

s
<__main__.Reverse object at 0x103f6c880> 0


In [21]:
print(next(revIter))
print(revIter.r, revIter.index)

StopIteration: 

In [18]:
print(next(revIter))

StopIteration: 

In [22]:
for char in rev:
    print(char)

m
a
p
s


- The simplest sort of container object will have its own `__next__` method that, when called, returns to the for loop the next element in the container. When there are no more elements in the container, it raises a StopIteration exception (see 8.4 in [Python tutorial](https://docs.python.org/3.7/tutorial/) ) instead of returning an element. The for loop terminates when it gets this exception.

- In general, however, the container object doesn’t need to have its own `__next__` method. Instead, it may assign the job of picking the next element to a separate object, called an __iterator__.

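- In general, an iterator is any object that defines a suitable `__next__` method. When an iterator object’s `__next__` method is invoked, the method should return the next element of some collection, whatever "next" may mean for that collection. How the `__next__` method is written defines the order in which the elements of a collection are iterated over in a for loop. (Python's iterator protocol also expects an iterator to define an `__iter__` method that returns the iterator itself, so that the iterator can be used directly in a for loop.)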

- Your collection appoints an iterator by defining an `__iter__` method that returns an instance of an iterator object.
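
The division of labour described in these points can be sketched as follows. The `Countdown` and `CountdownIterator` classes are hypothetical names invented for this illustration; the point is only that a container which hands out a new iterator on every `__iter__` call can be looped over repeatedly, and even in nested loops.

```python
class Countdown:
    """Hypothetical container that counts down from start to 1."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A new iterator per call, so every loop gets its own position.
        return CountdownIterator(self.start)


class CountdownIterator:
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it can also be used in a for loop.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


c = Countdown(3)
first_pass = list(c)
second_pass = list(c)
print(first_pass, second_pass)  # [3, 2, 1] [3, 2, 1]

c2 = Countdown(2)
pairs = [(x, y) for x in c2 for y in c2]  # nested loops over the same container
print(pairs)  # [(2, 2), (2, 1), (1, 2), (1, 1)]
```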

If the collection has its own `__next__` method, the collection’s `__iter__` method can return `self`; the container then serves as its own iterator. In the following example, the __Reverse__ class is both the container and the iterator. In general, though, the iterator can be a separate object from the container, as in the first version above.

In [8]:
class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

In [9]:
rev = Reverse('apple')
iter(rev)
for char in rev:
    print(char)

e
l
p
p
a


In [10]:
rev = Reverse('apple')
revIter = iter(rev)
print(rev is revIter)

True


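Here's another example. The print calls show when the for loop calls `__iter__` and `__next__`. In the first version, the second loop prints no elements because `self.i` is never reset; the second version, `ThreeElementContainer2`, resets `self.i` in `__iter__`, so it can be looped over again.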

In [23]:
class ThreeElementContainer:
    def __init__(self, a = 0, b = 0, c = 0):
        self.a = a
        self.b = b
        self.c = c
        self.i = 0
    
    def __iter__(self):
        print("iter called")
        return self
        
    def __next__(self):
        print("next called")
        if self.i == 0:
            el = self.a
        elif self.i == 1:
            el = self.b
        elif self.i == 2:
            el = self.c
        else:
            print("raised an exception")
            raise StopIteration
        self.i += 1
        return el
        
    def __str__(self):
        return "[" + str(self.a) + ", " + str(self.b) + ", " + str(self.c)+ "]"
        
t = ThreeElementContainer(5,10,15)
print(t)


[5, 10, 15]


In [24]:
for el in t:
    print(el)

iter called
next called
5
next called
10
next called
15
next called
raised an exception


In [25]:
for el in t:
    print(el)

iter called
next called
raised an exception


In [26]:
class ThreeElementContainer2:
    def __init__(self, a = 0, b = 0, c = 0):
        self.a = a
        self.b = b
        self.c = c
        self.i = 0
    
    def __iter__(self):
        print("iter called")
        self.i = 0 # reset self.i
        return self
        
    def __next__(self):
        print("next called")
        if self.i == 0:
            el = self.a
        elif self.i == 1:
            el = self.b
        elif self.i == 2:
            el = self.c
        else:
            print("raised an exception")
            raise StopIteration
        self.i += 1
        return el
        
    def __str__(self):
        return "[" + str(self.a) + ", " + str(self.b) + ", " + str(self.c)+ "]"
        
t = ThreeElementContainer2(5,10,15)
print(t)

[5, 10, 15]


In [27]:
for el in t:
    print(el)

iter called
next called
5
next called
10
next called
15
next called
raised an exception


In [28]:
for el in t:
    print(el)

iter called
next called
5
next called
10
next called
15
next called
raised an exception
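
Resetting the index in `__iter__` makes repeated loops work, but the container is still its own iterator, so two loops running at the same time share one index. The sketch below uses a hypothetical `ResettingContainer` class (not part of the lecture code) that follows the same pattern to show what happens with nested loops.

```python
class ResettingContainer:
    """Hypothetical container that is its own iterator and resets in __iter__."""
    def __init__(self, items):
        self.items = list(items)
        self.i = 0

    def __iter__(self):
        self.i = 0          # reset, as in ThreeElementContainer2
        return self

    def __next__(self):
        if self.i >= len(self.items):
            raise StopIteration
        el = self.items[self.i]
        self.i += 1
        return el


c = ResettingContainer([1, 2])
pairs = []
for x in c:
    for y in c:             # the inner loop resets and then exhausts the shared index
        pairs.append((x, y))

print(pairs)  # [(1, 1), (1, 2)] whereas a list would give four pairs
```

If nested or concurrent iteration matters, returning a separate iterator object from `__iter__` avoids the shared state.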


## Generators
A generator is a more convenient way to write iterators like the ones above: you write an ordinary function that uses `yield`, and you do not have to write a class with `__iter__` and `__next__` methods.

Now let's write a __generator__ `every_other(data)` that yields every other element of the data.

In [23]:
def every_other(data):
    for index in range(0,len(data),2):
        yield data[index]   # hand back one element, then pause here until the next element is requested
        

In [24]:
for char in every_other("supercalifragilisticexpialidocious"):  # the call returns a generator for the for loop to iterate over
    print(char, end=" ")

s p r a i r g l s i e p a i o i u 

In [25]:
f = every_other("supercalifragilisticexpialidocious")

In [26]:
print(type(f))

<class 'generator'>


In [27]:
next(f)

's'

In [34]:
next(f)

'p'

In [35]:
next(f)

'r'

The key word with generators is `yield`. The function `every_other` contains no return statement, yet calling it returns a generator object without running any of its body. On the first `next` call, execution runs until it reaches the `yield` statement; the value of `data[index]` is handed to the caller and the generator is suspended. On the second `next` call, the generator resumes right after the `yield`, the for loop advances `index` to the next value in `range(0, len(data), 2)` (an increase of two), and execution reaches the `yield` statement again.

`yield` takes the place of a function's return statement, but it provides a result to the caller without destroying the local variables. On the next iteration the generator can therefore keep working with those values. A normal function starts with a fresh set of local variables on every call, whereas a generator resumes execution where it left off.
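
The suspend-and-resume behavior can be observed directly by recording when each part of a generator's body runs. This is a small sketch; the `count_up_to` function and its `log` parameter are made up for this illustration and are not part of the lecture code.

```python
def count_up_to(limit, log):
    log.append("started")
    n = 1
    while n <= limit:
        yield n                           # pause here; n survives until the next call
        log.append(f"resumed after {n}")
        n += 1
    log.append("finished")


log = []
gen = count_up_to(2, log)
log_after_call = list(log)    # still empty: calling the function runs none of its body
first = next(gen)
log_after_first = list(log)   # ['started']
second = next(gen)
log_after_second = list(log)  # ['started', 'resumed after 1']
remaining = list(gen)         # runs the rest of the body; nothing more is yielded

print(first, second, remaining)  # 1 2 []
print(log)  # ['started', 'resumed after 1', 'resumed after 2', 'finished']
```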

Now let's rewrite the ThreeElementContainer class using a generator:

In [1]:
class ThreeElementContainer:
    def __init__(self, a = 0, b = 0, c = 0):
        self.a = a
        self.b = b
        self.c = c
        # No self.i is needed: the generator keeps track of its own position.

    def __iter__(self):
        return self.generator()
    
    def generator(self):
        yield self.a
        yield self.b
        yield self.c
        # i = 0
        # yield self.a + i
        # i += 1
        # yield self.b + i
        # i += 1
        # yield self.c + i
        
    def __str__(self):
        return "[" + str(self.a) + ", " + str(self.b) + ", " + str(self.c)+ "]"
        
t = ThreeElementContainer(5,10,15)
# print(t)
for el in t:
    print(el)


5
10
15
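
A common shorthand, shown here as a sketch rather than as the lecture's own method, is to make `__iter__` itself a generator function, so that no separate helper method is needed. The class name `ThreeElements` is hypothetical. Because each call to `__iter__` creates a new generator, the container can be looped over more than once.

```python
class ThreeElements:
    """Hypothetical variant of the container above."""
    def __init__(self, a=0, b=0, c=0):
        self.a = a
        self.b = b
        self.c = c

    def __iter__(self):
        # This method contains yield, so calling it returns a new generator.
        yield self.a
        yield self.b
        yield self.c


t = ThreeElements(5, 10, 15)
first_pass = list(t)
second_pass = list(t)
print(first_pass)   # [5, 10, 15]
print(second_pass)  # [5, 10, 15]
```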


## Generator expressions
A generator expression is the generator equivalent of a list comprehension, written with parentheses instead of square brackets. Just as a list comprehension evaluates to a list, a generator expression evaluates to a generator, which produces its elements one at a time as they are requested.

In [12]:
[x * x for x in range(1,10)]
# This is a list comprehension

[1, 4, 9, 16, 25, 36, 49, 64, 81]

In [14]:
squares = (x * x for x in range(1,10))  # squares is generator
print(type(squares))
print(list(squares))

<class 'generator'>
[1, 4, 9, 16, 25, 36, 49, 64, 81]
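
Because a generator expression produces elements only on demand, it can be passed straight to a function that consumes an iterable, and it can even describe an endless sequence. The following sketch illustrates both; the names `total`, `evens`, and `first_three` are illustrative, and `itertools.count` comes from the standard library.

```python
import itertools

# No intermediate list is built; sum() pulls one square at a time.
total = sum(x * x for x in range(1, 10))
print(total)  # 285

# An endless generator: only the elements we ask for are ever computed.
evens = (n for n in itertools.count() if n % 2 == 0)
first_three = [next(evens) for _ in range(3)]
print(first_three)  # [0, 2, 4]
```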


We can also create the same generator every_other (as above) in a single line using a
lambda function and a generator expression.

In [16]:
every_other2 = lambda data: (char for char in data[::2])
for char in every_other2("supercalifragilisticexpialidocious"):
    print(char, end=" ")

print("")
# equivalent to
for char in (char for char in "supercalifragilisticexpialidocious"[::2]):
    print(char, end=" ")

print("")

# a more memory-efficient solution: indexing avoids the copy that data[::2] creates
every_other3 = lambda data: (data[index] for index in range(0, len(data), 2))
for char in every_other3("supercalifragilisticexpialidocious"):
    print(char, end=" ")

s p r a i r g l s i e p a i o i u 
s p r a i r g l s i e p a i o i u 
s p r a i r g l s i e p a i o i u 

In [4]:
squares = (x * x for x in range(1,4))
print(next(squares))
print(next(squares))
print(next(squares))
print(next(squares))

1
4
9


StopIteration: 

In [3]:
squares = (x * x for x in range(1,4))
for i in squares:
    print(i)

print("again!")

1
4
9
again!
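
The for loop above ends quietly because it handles `StopIteration` itself. One consequence worth seeing is that a generator can only be consumed once. The sketch below shows this, along with one possible way around it: a small function (the name `make_squares` is hypothetical) that builds a new generator each time it is called.

```python
squares = (x * x for x in range(1, 4))
first_pass = list(squares)
second_pass = list(squares)   # the generator is already exhausted
print(first_pass)   # [1, 4, 9]
print(second_pass)  # []


def make_squares():
    """Return a new generator on every call."""
    return (x * x for x in range(1, 4))


print(list(make_squares()))  # [1, 4, 9]
print(list(make_squares()))  # [1, 4, 9]
```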
