# Agenda: Making your classes iterable

1. What is the iterator protocol in Python?
2. How can we implement this protocol in our own classes?
3. How can we do it even better with two classes?
4. Using generators for yet a third way to accomplish things

# Iterator protocol

One of the first things that people learn when they learn Python is that they can iterate over a number of different data structures with a `for` loop.

In [1]:
for one_character in 'abcd':
    print(one_character)

a
b
c
d


In [2]:
for one_item in [10, 20, 30]:
    print(one_item)

10
20
30


# What's happening here? How can this be?

1. The `for` loop asks the object at the end of the line: Are you iterable?
   - If the answer is "no," then we get a `TypeError` exception, and the loop (and program) ends
2. If we didn't get an error, then the `for` loop asks the object for its next value
    - If there are no more values, then the object raises `StopIteration`
3. The value is assigned to the loop variable
4. The body of the loop executes with that loop variable assigned
5. We go back to step 2, asking for the next value

In [3]:
for counter in 5:
    print(counter)

TypeError: 'int' object is not iterable

In [4]:
for counter in range(5):
    print(counter)

0
1
2
3
4


# `for` loops are pretty dumb!

- The loop doesn't know how many iterations there will be, and doesn't control that number
- The loop doesn't know what values it will get back, or what type they are
- There is no index!
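
If we do want an index, the usual approach is to wrap the iterable in the builtin `enumerate`, which hands back `(index, value)` pairs. A minimal sketch (the name `pairs` is only there for illustration):

```python
# enumerate wraps any iterable and yields (index, value) tuples
pairs = []

for index, one_character in enumerate('abcd'):
    pairs.append((index, one_character))
    print(index, one_character)
```

The `for` loop itself still knows nothing about the index; `enumerate` is just another iterable that happens to produce numbered values.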

# The whole protocol is

- `iter`
- `next`
- `StopIteration`

Let's rewrite what we said before, and integrate these pieces.

1. The `for` loop invokes `iter` on the object at the end of the line. The value we get back is known as an `iterator`.
   - If the object is not iterable, then `iter` raises a `TypeError` exception, and the loop (and program) ends
2. If we didn't get an error, then the `for` loop invokes `next` on the returned iterator object
    - If there are no more values, then the iterator raises `StopIteration`
3. The value is assigned to the loop variable
4. The body of the loop executes with that loop variable assigned
5. We go back to step 2, asking for the next value
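
The steps above can be sketched by hand with a `while` loop. This is only an approximation of what a `for` loop does for us, and the list `seen` exists just so we can inspect the result afterwards:

```python
# Rough hand-written equivalent of:
#     for one_character in 'abcd':
#         print(one_character)
seen = []

iterator = iter('abcd')                  # step 1: ask the object for its iterator
while True:
    try:
        one_character = next(iterator)   # steps 2-3: get the next value, assign it
    except StopIteration:                # no more values, so leave the loop
        break
    seen.append(one_character)           # step 4: the loop body
    print(one_character)
```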

In [5]:
iter(5)

TypeError: 'int' object is not iterable

In [6]:
s = 'abcde'

In [7]:
iter(s)

<str_ascii_iterator at 0x1103abc10>

In [8]:
iter(s)

<str_ascii_iterator at 0x1103abe50>

In [9]:
iter(s)

<str_ascii_iterator at 0x1103aba30>

In [10]:
f = open('/etc/passwd')

In [11]:
iter(f)

<_io.TextIOWrapper name='/etc/passwd' mode='r' encoding='UTF-8'>

In [12]:
f

<_io.TextIOWrapper name='/etc/passwd' mode='r' encoding='UTF-8'>

In [13]:
iter(f) is f

True

We see that when we ask an object for its iterator with `iter`, we might get the object itself back. Some objects, such as files, are their own iterators. Other objects, such as strings, give us a new, special-purpose iterator object each time we ask.
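
A quick way to check which kind of object we have is to compare with `is`. A small sketch using a string (the variable names are only for illustration):

```python
s = 'abcde'
i = iter(s)

string_is_own_iterator = iter(s) is s     # False: the string hands out a separate iterator
iterator_is_own_iterator = iter(i) is i   # True: the iterator returns itself

print(string_is_own_iterator)
print(iterator_is_own_iterator)
```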

In [14]:
s

'abcde'

In [15]:
i = iter(s)   # now I've grabbed the iterator for s

In [16]:
next(i)

'a'

In [17]:
next(i)

'b'

In [18]:
next(i)

'c'

In [19]:
next(i)

'd'

In [20]:
next(i)

'e'

In [21]:
next(i)

StopIteration: 

# The iterator protocol is deliberately simple!

It doesn't let you go backwards. It doesn't let you reset. It doesn't let you skip forward.  You can only ask for the next value.
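
One consequence, sketched here with a string: once an iterator is exhausted, it stays exhausted. To start over, we ask the original object for a fresh iterator.

```python
s = 'abcde'
i = iter(s)

first_pass = list(i)     # consumes every value
second_pass = list(i)    # the same iterator has nothing left to give
fresh_pass = list(iter(s))   # a new iterator starts from the beginning

print(first_pass)
print(second_pass)
print(fresh_pass)
```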

For our own class to be iterable (and to act as its own iterator), it needs three things:

- It must respond to `iter`, by implementing the `__iter__` method
- It must respond to `next`, by implementing the `__next__` method
- It must raise `StopIteration` when we reach the end of the values in `__next__`.

In [22]:
s

'abcde'

In [23]:
dir(s)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'removeprefix',
 'removesuffix',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'stri

In [24]:
dir(f)

['_CHUNK_SIZE',
 '__class__',
 '__del__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_checkClosed',
 '_checkReadable',
 '_checkSeekable',
 '_checkWritable',
 '_finalizing',
 'buffer',
 'close',
 'closed',
 'detach',
 'encoding',
 'errors',
 'fileno',
 'flush',
 'isatty',
 'line_buffering',
 'mode',
 'name',
 'newlines',
 'read',
 'readable',
 'readline',
 'readlines',
 'reconfigure',
 'seek',
 'seekable',
 'tell',
 'truncate',
 'writable',
 'write',
 'write_through',
 'writelines']

In [26]:
# this is a very simple class to play with iteration

class MyClass:
    pass

m = MyClass()    

for one_item in m:
    print(one_item)

TypeError: 'MyClass' object is not iterable

In [29]:
# this is a very simple class to play with iteration

class MyClass:
    def __init__(self, data):
        self.data = data
    def __iter__(self):
        return self    # this says: I'm my own iterator!

m = MyClass('abcde')    

for one_item in m:
    print(one_item)

TypeError: iter() returned non-iterator of type 'MyClass'

In [30]:
# this is a very simple class to play with iteration

class MyClass:
    def __init__(self, data):
        self.data = data
        self.index = 0
    def __iter__(self):
        return self    # this says: I'm my own iterator!
    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration('too far!')

        value = self.data[self.index]   # Grab the current value
        self.index += 1
        return value

m = MyClass('abcde')    

for one_item in m:
    print(one_item)

a
b
c
d
e


In [31]:

class MyClass:
    def __init__(self, data):
        self.data = data
        self.index = 0
        print(f'\tIn __init__ with {self.data=} and {self.index=}')
    def __iter__(self):
        print(f'\tIn __iter__ with {self.data=} and {self.index=}')
        return self    # this says: I'm my own iterator!
    def __next__(self):
        print(f'\tIn __next__ with {self.data=} and {self.index=}')

        if self.index >= len(self.data):
            print('\t\tRaising StopIteration!')
            raise StopIteration('too far!')

        value = self.data[self.index]   # Grab the current value
        self.index += 1
        print(f'\t\tReturning {value}!')
        return value

m = MyClass('abcde')    

for one_item in m:
    print(one_item)

	In __init__ with self.data='abcde' and self.index=0
	In __iter__ with self.data='abcde' and self.index=0
	In __next__ with self.data='abcde' and self.index=0
		Returning a!
a
	In __next__ with self.data='abcde' and self.index=1
		Returning b!
b
	In __next__ with self.data='abcde' and self.index=2
		Returning c!
c
	In __next__ with self.data='abcde' and self.index=3
		Returning d!
d
	In __next__ with self.data='abcde' and self.index=4
		Returning e!
e
	In __next__ with self.data='abcde' and self.index=5
		Raising StopIteration!


# Exercise: Circle

1. Define a class, `Circle`, that takes two arguments:
    - The first will be an iterable sequence that we'll call `data`
    - The second will be an integer, the number of values we should get back when iterating, `maxtimes`
2. If we iterate over an instance of `Circle`, we get `maxtimes` iterations.
    - If `maxtimes` is smaller than the length of `data`, we'll only get `maxtimes` values
    - If `maxtimes` is *larger* than the length of `data`, we'll get `maxtimes` values by going through all of `data` and then starting from the beginning again!

```python
c = Circle('abcd', 7)

for one_item in c:
    print(one_item)   # a b c d a b c
```    

In [38]:
class Circle:
    def __init__(self, data, maxtimes):
        self.data = data
        self.maxtimes = maxtimes
        self.index = 0
    def __iter__(self):
        return self    # this says: I'm my own iterator!
    def __next__(self):
        if self.index >= self.maxtimes:
            raise StopIteration('too far!')

        value = self.data[self.index % len(self.data)]   
        self.index += 1
        return value

c = Circle('abcd', 9)    

for one_item in c:
    print(one_item)

a
b
c
d
a
b
c
d
a


In [39]:
# the % operator divides one integer by another, and returns the remainder

10 % 1

0

In [40]:
10 % 2 

0

In [41]:
10 % 3

1

In [42]:
10 % 4

2

In [43]:
10 % 5

0

In [44]:
10 % 6 

4

In [45]:
# reverse it, with a smaller number % a bigger number

0 % 5

0

In [46]:
1 % 5

1

In [47]:
2 % 5

2

In [48]:
3 % 5

3

In [49]:
4 % 5

4

In [50]:
5 % 5

0

In [51]:
6 % 5

1
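
Putting this together: taking the index modulo the length of `data` wraps it back to 0 whenever it runs off the end, which is what `Circle.__next__` relies on. A small sketch (the names `wrapped` and `letters` are only for illustration):

```python
data = 'abcd'

# indexes 0..8, each wrapped into the valid range 0..3
wrapped = [index % len(data) for index in range(9)]
letters = ''.join(data[index] for index in wrapped)

print(wrapped)
print(letters)
```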

In [52]:
s = 'abcde'

i1 = iter(s)
i2 = iter(s)

In [53]:
next(i1)

'a'

In [54]:
next(i2)

'a'

In [56]:
next(i1)

'b'

In [57]:
next(i1)

'c'

In [58]:
next(i1)

'd'

In [59]:
next(i2)

'b'

In [60]:
# watch this with Circle

c = Circle('abcd', 9)    

print('First time')
for one_item in c:
    print(one_item)

print('Second time')
for one_item in c:
    print(one_item)    

First time
a
b
c
d
a
b
c
d
a
Second time
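
The second loop prints nothing because `c` is its own iterator: `__iter__` returns the same object, and its `index` is still at `maxtimes` after the first loop. A minimal sketch that makes this visible, repeating the single-class `Circle` from above (the names `first_pass` and `second_pass` are only for illustration):

```python
class Circle:
    def __init__(self, data, maxtimes):
        self.data = data
        self.maxtimes = maxtimes
        self.index = 0
    def __iter__(self):
        return self    # the same object every time, with the same index
    def __next__(self):
        if self.index >= self.maxtimes:
            raise StopIteration
        value = self.data[self.index % len(self.data)]
        self.index += 1
        return value

c = Circle('abcd', 9)

first_pass = list(c)     # runs the iterator to the end
print(c.index)           # index is now equal to maxtimes
second_pass = list(c)    # __next__ raises StopIteration immediately
print(second_pass)
```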


In [63]:
# two-class approach to iterator design:
# (1) our main (iterable) class
# (2) our iterator class

class CircleIterator:
    def __init__(self, data, maxtimes):
        self.data = data
        self.maxtimes = maxtimes
        self.index = 0
    def __next__(self):
        if self.index >= self.maxtimes:
            raise StopIteration('too far!')

        value = self.data[self.index % len(self.data)]   
        self.index += 1
        return value

class Circle:
    def __init__(self, data, maxtimes):
        self.data = data
        self.maxtimes = maxtimes
    def __iter__(self):
        return CircleIterator(self.data, self.maxtimes)


c = Circle('abcd', 9)    

print('First time')
for one_item in c:
    print(one_item)

print('Second time')
for one_item in c:
    print(one_item)    

First time
a
b
c
d
a
b
c
d
a
Second time
a
b
c
d
a
b
c
d
a


In [64]:
iter(c)

<__main__.CircleIterator at 0x1100bf250>

In [65]:
iter(c)

<__main__.CircleIterator at 0x110213950>

# Two-part iterators

1. You can iterate over the original object as many times as you want.
2. You can have multiple iterators on the same object
3. The iterator class can be used by multiple classes
4. The main class can now concentrate on its specific responsibility
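
One refinement worth considering: Python's convention is that an iterator should itself respond to `iter` by returning itself, so that it can be passed to anything expecting an iterable (`list`, another `for` loop, and so on). The sketch below adds that `__iter__` to `CircleIterator` and also shows point 2, two independent iterators on the same object. The variable names are only for illustration.

```python
class CircleIterator:
    def __init__(self, data, maxtimes):
        self.data = data
        self.maxtimes = maxtimes
        self.index = 0
    def __iter__(self):
        return self    # an iterator is its own iterator
    def __next__(self):
        if self.index >= self.maxtimes:
            raise StopIteration
        value = self.data[self.index % len(self.data)]
        self.index += 1
        return value

class Circle:
    def __init__(self, data, maxtimes):
        self.data = data
        self.maxtimes = maxtimes
    def __iter__(self):
        return CircleIterator(self.data, self.maxtimes)

c = Circle('abcd', 9)
i1 = iter(c)
i2 = iter(c)

first_three = [next(i1), next(i1), next(i2)]   # i1 and i2 advance independently
rest_of_i1 = list(i1)                          # works because CircleIterator has __iter__

print(first_three)
print(rest_of_i1)
```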

# Exercise: Only vowels

1. Define a class, `OnlyVowels`, that takes a string as input.
2. When we iterate over an instance of `OnlyVowels`, we get, one by one, each of the vowels in the string. We ignore non-vowels entirely.
3. When there are no vowels left, we raise `StopIteration`
4. Create this class as its own iterator. Then separate it out into a two-class structure.

In [70]:
# single-class implementation

class OnlyVowels:
    def __init__(self, text):
        self.text = text
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        while self.index < len(self.text):
            value = self.text[self.index]
            self.index += 1

            if value in 'aeiou':
                return value

        raise StopIteration

In [71]:
text = 'hello to everyone out there!'

for one_vowel in OnlyVowels(text):
    print(one_vowel, end=' ')

e o o e e o e o u e e 

In [74]:
# two-class implementation

class OnlyVowelsIterator:
    def __init__(self, text):
        self.text = text
        self.index = 0
    def __next__(self):
        while self.index < len(self.text):
            value = self.text[self.index]
            self.index += 1

            if value in 'aeiou':
                return value

        raise StopIteration    
        
class OnlyVowels:
    def __init__(self, text):
        self.text = text
    def __iter__(self):
        return OnlyVowelsIterator(self.text)

text = 'hello to everyone out there!'

for one_vowel in OnlyVowels(text):
    print(one_vowel, end=' ')

e o o e e o e o u e e 

In [75]:
# if you want the benefits of a two-class iterator, but you don't want to write a second class,
# you could use a generator function! If __iter__ is implemented as a generator function (method),
# then it returns an iterator (a generator) that Python can invoke next on, time after time,
# getting one value for each iteration.

# single-class implementation, with __iter__ written as a generator

class OnlyVowels:
    def __init__(self, text):
        self.text = text
    def __iter__(self):
        index = 0    # local to each generator, so every loop starts from the beginning
        while index < len(self.text):
            value = self.text[index]
            index += 1

            if value in 'aeiou':
                yield value

text = 'hello to everyone out there!'

for one_vowel in OnlyVowels(text):
    print(one_vowel, end=' ')

e o o e e o e o u e e 
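
A more compact variant, sketched under the assumption that we don't need an explicit index at all: the generator can loop directly over the text. Because each call to `__iter__` creates a new generator with its own position, the same object can be iterated over more than once. The names `first_pass` and `second_pass` are only for illustration.

```python
class OnlyVowels:
    def __init__(self, text):
        self.text = text
    def __iter__(self):
        # each call creates a new generator, with its own position in the text
        for one_character in self.text:
            if one_character in 'aeiou':
                yield one_character

vowels = OnlyVowels('hello to everyone out there!')

first_pass = list(vowels)
second_pass = list(vowels)    # a fresh generator, so we get the same values again

print(' '.join(first_pass))
print(first_pass == second_pass)
```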

# Use itertools!

`itertools` is a module in Python's standard library. It includes about 20 different classes, such as `count`, `cycle`, `chain`, and `islice`, each of which produces a different kind of iterator.
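
As one illustration, the behavior of our `Circle` exercise can be approximated by combining `itertools.cycle` (repeat an iterable forever) with `itertools.islice` (stop after a given number of values). This is a sketch, not a replacement for the class; the helper name `circle` is made up for this example.

```python
from itertools import cycle, islice

def circle(data, maxtimes):
    # cycle repeats data endlessly; islice cuts it off after maxtimes values
    return islice(cycle(data), maxtimes)

result = list(circle('abcd', 7))
print(result)
```

Unlike the index-based `Circle`, this version works with any iterable, not only with sequences that support `len` and indexing.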