# Iterators and Iterables

Previously, we created **iterator** objects by implementing:

* a `__next__` method that returns the next element in the container
* an `__iter__` method that just returns the object itself (the iterator object)

With that in place, we could use a `for` loop, list comprehensions, and in fact pass the iterator object anywhere an iterable was expected (such as `enumerate`, `sorted`, and so on).

However, we were left with two outstanding issues/questions:
* when we looped over the iterator using a `for` loop (or a comprehension, or any other function that performs some form of iteration), we saw that `__iter__` was always called first.
* the iterator is exhausted once we have iterated over it fully, which means we have to create a new iterator every time we want to iterate over the collection again. Can we somehow avoid having to remember to do that every time?

#### How to avoid this?

In [2]:
class Cities:
    def __init__(self):
        self._cities = ['Paris', 'Berlin', 'Rome', 'Madrid', 'London']
        self._index = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        else:
            item = self._cities[self._index]
            self._index += 1
            return item
        

In [3]:
cities = Cities()
list(enumerate(cities))

[(0, 'Paris'), (1, 'Berlin'), (2, 'Rome'), (3, 'Madrid'), (4, 'London')]

With this design, we need to re-create the iterator every time we want to start iterating from the beginning:

In [4]:
cities = Cities()
[item.upper() for item in cities]

['PARIS', 'BERLIN', 'ROME', 'MADRID', 'LONDON']

In [5]:
cities = Cities()
sorted(cities)

['Berlin', 'London', 'Madrid', 'Paris', 'Rome']
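
To see why the re-creation is needed, here is a minimal sketch that reuses the `Cities` iterator design from above (with a shorter list, an assumption made only to keep the output small). Once the iterator has been consumed, a second pass over the same object yields nothing:

```python
class Cities:
    """Iterator over a fixed list of cities (same design as above)."""
    def __init__(self):
        # Shortened list, used here only for illustration
        self._cities = ['Paris', 'Berlin', 'Rome']
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        item = self._cities[self._index]
        self._index += 1
        return item


cities = Cities()
first_pass = list(cities)    # consumes every element
second_pass = list(cities)   # same object, already exhausted

print(first_pass)   # ['Paris', 'Berlin', 'Rome']
print(second_pass)  # []
```

The internal `_index` is never reset, so the only way to start over is to build a new `Cities` object.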

### Solution: split the iterator part of our code from the data part of our code.

In [6]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    

In [7]:
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item
        

In [8]:
cities = Cities()

In [9]:
iter_1 = CityIterator(cities)

In [10]:
for city in iter_1:
    print(city)

New York
Newark
New Delhi
Newcastle


In [11]:
iter_2 = CityIterator(cities)
[city.upper() for city in iter_2]

['NEW YORK', 'NEWARK', 'NEW DELHI', 'NEWCASTLE']

## One problem here

In [12]:
for city in cities:
    print(city)

TypeError: 'Cities' object is not iterable

### How to fix this?

#### Iterables

An **iterable** is an object that:
* implements the `__iter__` method
* and that method returns an **iterator** which can be used to iterate over the object

#### Iterable vs Iterator

An iterable is an object that implements<br>
`__iter__` → returns an iterator<br>

An iterator is an object that implements<br>
`__iter__` → returns itself (an iterator)<br>
`__next__` → returns the next element<br>
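
One practical way to tell the two apart, sketched here under the definitions above: calling `iter()` on an iterator returns the very same object, whereas calling it on an iterable that is not an iterator returns a new, separate object. The helper name `is_iterator` is hypothetical and introduced only for this illustration.

```python
def is_iterator(obj):
    """Return True if obj follows the iterator protocol described above.

    Hypothetical helper: requires __next__ and that iter(obj) is obj.
    """
    return hasattr(obj, '__next__') and iter(obj) is obj


numbers = [1, 2, 3]            # a list is an iterable, not an iterator
numbers_iter = iter(numbers)   # asking for an iterator gives a new object

print(is_iterator(numbers))                 # False
print(is_iterator(numbers_iter))            # True
print(iter(numbers_iter) is numbers_iter)   # True
```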



In [29]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __getitem__(self, s):
        print('getting item...')
        return self._cities[s]
    
    def __iter__(self):
        print('Calling Cities instance __iter__')
        return self.CityIterator(self)
    
    class CityIterator:
        def __init__(self, city_obj):
            # city_obj is an instance of Cities
            print('Calling CityIterator __init__')
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            print('Calling CityIterator instance __iter__')
            return self

        def __next__(self):
            print('Calling __next__')
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item

In [30]:
cities = Cities()

In [31]:
cities[0]

getting item...


'New York'

In [32]:
next(iter(cities))

Calling Cities instance __iter__
Calling CityIterator __init__
Calling __next__


'New York'

In [33]:
cities = Cities()
for city in cities:
    print(city)

Calling Cities instance __iter__
Calling CityIterator __init__
Calling __next__
New York
Calling __next__
Newark
Calling __next__
New Delhi
Calling __next__
Newcastle
Calling __next__
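
The trace above matches what a `for` loop does behind the scenes: it calls `iter()` once, then calls `next()` repeatedly until `StopIteration` is raised (which explains the final `Calling __next__` with no city printed after it). A rough hand-written equivalent, using a plain list in place of `Cities` so that the sketch stays self-contained:

```python
cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']

visited = []
city_iter = iter(cities)          # what the for loop calls first (__iter__)
while True:
    try:
        city = next(city_iter)    # one __next__ call per loop pass
    except StopIteration:
        break                     # the loop ends quietly once exhausted
    visited.append(city)

print(visited)
```

Because `Cities.__iter__` builds a new `CityIterator` on every call, each `for` loop over a `Cities` object starts from the first city again.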


## Python's Built-In Iterables and Iterators

Python provides many functions that return iterables or iterators.<br>
Additionally, the iterators perform lazy evaluation: each element is computed only when it is requested.<br>
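
As a small illustration of lazy evaluation, the sketch below records when a function is actually called. Note that `map` is not used elsewhere in these notes; it is chosen here only because it is a built-in that returns an iterator. The names `square` and `calls` are illustrative.

```python
calls = []

def square(x):
    calls.append(x)      # record each time the function actually runs
    return x * x

squares = map(square, [1, 2, 3])     # returns an iterator; nothing runs yet
calls_after_creation = list(calls)

first = next(squares)                # computes exactly one element
calls_after_first = list(calls)

rest = list(squares)                 # computes the remaining elements

print(calls_after_creation)  # []
print(first)                 # 1
print(calls_after_first)     # [1]
print(rest)                  # [4, 9]
```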

You should always know whether you are dealing with an iterable or an iterator. Why?<br>

If an object is an **iterable** (but not an iterator), you can iterate over it many times.<br>
If an object is an **iterator**, you can iterate over it only once.<br>


`range(10)` → iterable<br>
`zip(l1, l2)` → iterator<br>
`enumerate(l1)` → iterator<br>
`open('cars.csv')` → iterator<br>
dictionary `.keys()` → iterable<br>
dictionary `.values()` → iterable<br>
dictionary `.items()` → iterable<br>
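
The classification above can be checked programmatically. This sketch uses the standard library's `collections.abc.Iterator` as one possible test; the sample data (`l1`, `d`) is made up for the illustration, and `open('cars.csv')` is left out because it needs a file on disk.

```python
from collections.abc import Iterator

l1 = [1, 2, 3]
d = {'a': 1, 'b': 2}

candidates = {
    'range(10)': range(10),
    'zip(l1, l1)': zip(l1, l1),
    'enumerate(l1)': enumerate(l1),
    'd.keys()': d.keys(),
    'd.values()': d.values(),
    'd.items()': d.items(),
}

kinds = {}
for label, obj in candidates.items():
    # Every iterator is also an iterable, so only the Iterator test
    # separates the two groups.
    kinds[label] = 'iterator' if isinstance(obj, Iterator) else 'iterable'
    print(f'{label:15} -> {kinds[label]}')
```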


In [34]:
r_10 = range(10)

In [35]:
'__iter__' in dir(r_10)

True

But it is not an iterator:

In [36]:
'__next__' in dir(r_10)

False

However, we can request an iterator by calling the `__iter__` method, or simply by using the `iter()` function:

In [37]:
r_10_iter = iter(r_10)

In [38]:
'__iter__' in dir(r_10_iter)

True

In [39]:
'__next__' in dir(r_10_iter)

True

In [40]:
[num for num in range(10)]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [42]:
z = zip([1, 2, 3], 'abc')


In [43]:
z

<zip at 0x23020e0a440>

It is an iterator:

In [44]:
print('__iter__' in dir(z))
print('__next__' in dir(z))

True
True


In [45]:
list(z)

[(1, 'a'), (2, 'b'), (3, 'c')]

In [46]:
d = {'a': 1, 'b': 2}

In [47]:
keys = d.keys()

In [48]:
'__iter__' in dir(keys), '__next__' in dir(keys)

(True, False)

In [49]:
iter(keys) is keys

False

In [50]:
values = d.values()
iter(values) is values

False

In [51]:
items = d.items()
iter(items) is items

False