# 01 - Iterating Collections

#### Lecture

The following point is a key takeaway of these first few sections, so I have placed it at the very top where it's easy to find later. If you're reading chronologically, you don't need to worry about it right now:

- An iterable is an object that implements `__iter__` (returns an iterator)
- An iterator is an object that implements `__iter__` (returns itself) and `__next__`

We saw how sequence types support iteration by being able to access elements by index. We could even write our custom sequence types by implementing the `__getitem__` method.

But there are some limitations:

* items must be numerically indexable, with indexing starting at `0`
* cannot be used with unordered collections, such as sets

If we think about iterating over a collection, what we really need is a way to request the **next** item in the collection.

This is like picking marbles out from a bag. The marbles are unlabeled and we want to ensure that we don't pick the same one twice.

If we can do that, our collection does not require being indexable, nor does it need to be ordered (i.e. we don't need the notion of relative positions of elements in the container).

This is exactly what iterables are in general - they provide a method that returns the "next" element in the collection. This approach works equally well with sequence type collections, as well as unordered collection types such as sets.

Of course, the order in which **next** returns items from an unordered collection is not known in advance - and we see that when we iterate over a set for example:

In [1]:
s = {'x', 'y', 'b', 'c', 'a'}
for item in s:
    print(item)

y
a
c
b
x


As you can see, the order in which the elements of the set were returned did not match the order in which we added them to the set.

Furthermore, we cannot use indexing to access elements in a set:

In [2]:
s[0]

TypeError: 'set' object does not support indexing
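
Even though indexing fails, we can still ask a set for its items one at a time. The following is a minimal sketch that jumps slightly ahead and uses the built-in `iter()` and `next()` functions (both are explained in the following sections); the names `marble_picker` and `picked` are only illustrative:

```python
s = {'x', 'y', 'b', 'c', 'a'}

# Ask the set for an object that can hand out its items one by one.
marble_picker = iter(s)

picked = []
while True:
    try:
        # Each call returns an item we have not seen yet, in no guaranteed order.
        picked.append(next(marble_picker))
    except StopIteration:
        # Raised once every item has been handed out.
        break

# Every element was returned exactly once, even though sets cannot be indexed.
print(sorted(picked) == sorted(s))
```

This is the "marbles in a bag" idea: we never needed a position, only a way to request the next item.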

We have a couple of recommendations for general iteration:

- Be able to 'get the next item' in the collection (not necessarily through a sequential index).
- Make the collection finite.
- Allow exhaustion of the iterable via the exception `StopIteration`.
- Keep track of our position somehow to ensure that the same elements aren't returned multiple times.
- Be able to use a `for` loop, comprehension, etc.
- Restart the iteration from the beginning (without having to create a new instance of the object).
- Support the `__next__` special method which returns an item from the collection

#### Our Own Implementation

Let's go ahead and define a kind of iterable ourselves. 

What we'll want to do is write a container-like class that implements the `__next__` method instead of the `__getitem__` method.

Let's create our own implementation that we can iterate through to generate square numbers. 

- Since we want our collection to be finite, we'll require a specific length. This means we can also implement `__len__`.
- Every time we call `__next__`, it should return the next element in the collection - so we'll have to keep track of where we are in the iteration somehow. We'll do this with `i`.
- When we want to exhaust the iterable, we should raise the `StopIteration` error when `next` is called.

In [1]:
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0
    
    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result   
    
    def __len__(self):
        return self.length

Now let's generate some square values:

In [4]:
sq = Squares(5)

while True:
    try: 
        print(next(sq))
    
    except StopIteration:
        break

0
1
4
9
16


Now `sq` is exhausted. Calling `next(sq)` again will raise the exception:

In [8]:
print(next(sq))

StopIteration: 

But we still cannot iterate through the collection using a `for` loop or a comprehension, so our collection is technically NOT iterable:

In [9]:
sq = Squares(10)
for item in sq:
    print(item)

TypeError: 'Squares' object is not iterable

# 02 - Iterators

In the last lecture we saw that we could approach iterating over a collection using this concept of `next`.

But there were some downsides that we did not resolve (yet!):
* we cannot use a `for` loop
* once we exhaust the iteration (by repeatedly calling `next`), we're essentially done with the object. The only way to iterate through it again is to create a new instance of the object.

First we are going to look at making our `next` be usable in a `for` loop.

This idea of using `__next__` and the `StopIteration` exception is exactly what Python does.

So, somehow we need to tell Python that the object we are dealing with can be used with `next`. Python can see that we have a `__next__` method, but how do we tell it that our object will behave consistently with the `while` loop approach to iterating?

In other words, Python knows we have `__next__`, but how does it know that we signal exhaustion by raising `StopIteration`?

To do so, we create an `iterator` type object.

**Protocols**

A protocol is simply a fancy way of saying that our class is going to implement certain functionality that Python can count on - it's basically a contract with Python. We'll go into much more detail of protocols in Part 4 of the deep dive. 

**The Iterator Protocol**

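Here, if we tell Python that we're implementing particular methods, it will support certain things. In this case, it will support iterating over our object with `for` loops and comprehensions, which internally follow the same `next` / `StopIteration` pattern as our `while` loop.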

We're going to implement the **iterator protocol**.

Iterators are objects that implement:
* a `__next__` method
* an `__iter__` method that simply returns the object itself

That's it - that's all there is to an iterator - two methods, `__iter__` and `__next__`.

**If an object is an iterator, we can use it with `for` loops, comprehensions, etc.** We can also use `enumerate`, `sorted` and other functions because they support iterators.

How do we make our `Squares` instance an iterator? 

All we have to do is add an `__iter__` method and make it return `self`. That's it.

In [10]:
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0
    
    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result   
    
    def __iter__(self):
        return self

In [11]:
sq = Squares(5)

for item in sq:
    print(item)

0
1
4
9
16


This iterator cannot be restarted which is an issue, but it is not a requirement of the **iterator protocol**.

As we said above, `sorted` will work because `sorted` can take iterators:

In [12]:
sq = Squares(5)

sorted(sq)

[0, 1, 4, 9, 16]
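
To make the "cannot be restarted" point concrete, here is a small sketch that repeats the `Squares` iterator from above (so the block is self-contained) and consumes it twice. The variable names `first_pass` and `second_pass` are only illustrative:

```python
class Squares:
    """Iterator over the first `length` square numbers (same as above)."""

    def __init__(self, length):
        self.length = length
        self.i = 0

    def __next__(self):
        if self.i >= self.length:
            raise StopIteration
        result = self.i ** 2
        self.i += 1
        return result

    def __iter__(self):
        return self


sq = Squares(5)

first_pass = sorted(sq)   # consumes every item
second_pass = sorted(sq)  # same object, already exhausted

print(first_pass)
print(second_pass)
```

The second call gets an empty list because `iter(sq)` hands back the same, already exhausted object; nothing resets `self.i`.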

#### Iterators vs Iterables

In general iterables can be re-used for iteration (like lists, tuples, ranges, etc) - because they are not the object performing the iteration. 

The iterator does that since it implements `__next__`. However, since iterators also implement `__iter__` they are technically also iterables. 

But, iterables that are not iterators generally do not become exhausted. (We can technically break these conventions by making an iterator with `__iter__` that returns a new iterator instead of itself, but we don't need to worry about this.)

To reiterate,

- An iterable is an object that implements `__iter__` (returns an iterator)
- An iterator is an object that implements `__iter__` (returns itself) and `__next__`

So, iterables that are not iterators DO NOT implement `__next__`.
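
As a quick check of these two definitions, the sketch below compares a built-in list (an iterable) with the iterator that `iter()` returns for it. The variable names are only illustrative:

```python
numbers = [1, 2, 3]           # a list is an iterable, but not an iterator
numbers_iter = iter(numbers)  # the iterator Python builds for the list

# Only the iterator has __next__.
list_has_next = hasattr(numbers, '__next__')
iter_has_next = hasattr(numbers_iter, '__next__')

# An iterator's __iter__ returns itself; an iterable's returns a separate object.
iter_returns_self = iter(numbers_iter) is numbers_iter
list_returns_self = iter(numbers) is numbers

print(list_has_next, iter_has_next)
print(list_returns_self, iter_returns_self)
```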

So when does `__iter__` get called?

It won't be called with our `while True` approach - that only calls `__next__`. Let's add print statements to make this clear.

In [27]:
class Squares:
    def __init__(self, length):
        self.length = length
        self.i = 0
    
    def __next__(self):
        
        print('__next__ called')
        if self.i >= self.length:
            raise StopIteration
        else:
            result = self.i ** 2
            self.i += 1
            return result   
    
    def __iter__(self):
        
        print('__iter__ called')
        return self

In [28]:
sq = Squares(2)

while True:
    try:
        print(next(sq))
    
    except StopIteration:
        break

__next__ called
0
__next__ called
1
__next__ called


But `__iter__` **will** be called at the beginning when a `for` loop (or comprehension) is used.

In [29]:
sq = Squares(2)

for item in sq:
    print(item)

__iter__ called
__next__ called
0
__next__ called
1
__next__ called


So what's actually happening? 

Firstly, let's note:

Python needs a consistent way to get an iterator either from an iterable or from an iterator. 

- If the object is an iterable but not an iterator (it has no `__next__` implemented), Python calls its `__iter__` to obtain a separate iterator object. This iterator is a different object with its own memory address. When we loop over our object, Python actually calls `__next__` on that iterator until it gets `StopIteration`.

- If the object is already an iterator, then Python doesn't need a new iterator - `__iter__` can just return the object itself. Then, Python calls `__next__` on this iterator (the same `__next__` as in the original object) until it gets `StopIteration`.

In [49]:
sq = Squares(5)
sq_iterator = iter(sq)
print(sq_iterator is sq)

__iter__ called
True


Then, Python calls `__next__` on `sq_iterator`. **In this particular case**, `sq` IS `sq_iterator` so it doesn't matter whether we call it on `sq` or `sq_iterator`. But in general, Python calls `__next__` on the iterator object.

In [None]:
while True:
    try:
        print(next(sq_iterator))
    
    except StopIteration:
        break
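
Putting the pieces together, a `for` loop behaves roughly like the hand-written function below. This is only a sketch of the idea, not how Python implements it internally, and `manual_for` is a hypothetical helper name:

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does (hypothetical helper)."""
    # Step 1: always ask for an iterator first, whether or not we already have one.
    iterator = iter(iterable)
    while True:
        try:
            # Step 2: repeatedly request the next item from the iterator.
            item = next(iterator)
        except StopIteration:
            # Step 3: the iterator signals exhaustion, so the loop ends.
            break
        action(item)


collected = []
manual_for([10, 20, 30], collected.append)  # works with a sequence
manual_for({'only'}, collected.append)      # and with an unordered set
print(collected)
```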

# 03 - Iterators and Iterables

Previously we saw that we could create **iterator** objects by simply implementing:

* a `__next__` method that returns the next element in the container
* an `__iter__` method that just returns the object itself (the iterator object)

The drawback is that iterators get **exhausted**, and when this happens, the **iterator is a useless, throwaway object**. This means we have to create a new iterator every time we want to start a new iteration over the collection - can we somehow avoid having to remember to do that every time?

Let's break down the iterator into two distinct things:

1. The collection (container) of items/elements/marbles in a bag.
2. A method to iterate over the collection.

Why should we have to recreate the collection (1.) just to iterate over it again?

So if we separate our iterator into these two parts, then we should have:

1. A separate iterator object (which will always be a throwaway object - there's no escaping that), **created every time we need to start a fresh iteration**.
2. A collection which is iterable and **only created once**. This will be used to maintain/mutate our data so it may have `append`,`pop` methods, etc.

In this case, the iterator is responsible for iterating over the collection.

**Example**

Let's look at an example where we break up the iterator into a collection and an iterator object.

Firstly, the unseparated version:

In [1]:
class Cities:
    def __init__(self):
        self._cities = ['Paris', 'Berlin', 'Rome', 'Madrid', 'London']
        self._index = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._cities):
            raise StopIteration
        else:
            item = self._cities[self._index]
            self._index += 1
            return item

Now let's break it up. Remember, our `_cities` list object may contain millions of data points or be pulled from an API which may take a long time so we can see why it would be wasteful to have to create a new instance of `Cities` every time our iterator gets exhausted. 

In [5]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)

And let's create our iterator this way:

In [4]:
# container part
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)

# iterator part
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item

1. So now we can create our `Cities` instance **once** to generate a container of all our data elements (perhaps from an API).
2. Then, we create our iterator instance and pass it the **instance** of our collection.
3. Then, we can use the iterator to iterate through our collection using a `for` loop, for example. But once the loop terminates, our `city_iterator` is an exhausted, throwaway object.


In [52]:
cities = Cities()
city_iterator = CityIterator(cities)

for item in city_iterator:
    print(item)

New York
Newark
New Delhi
Newcastle


It would be nice if we didn't have to manually create a new iterator every time. 

This is where the **formal definition of a Python iterable** comes in...

**An iterable is a Python object that implements the iterable protocol. The iterable protocol requires that the object implements a single method: `__iter__`. This method returns a *new instance of the iterator object* which is used to iterate over the iterable.**

#### Making an Iterable

Let's quickly paste in from above our iterator which will be used by our iterable.

In [5]:
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item

So let's make our `Cities` instance a formal **iterable** by adding `__iter__` which returns a **new instance** of the iterator object.

In [1]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __iter__(self):
        return CityIterator(self)     

Don't let the `self` in `return CityIterator(self)` fool you into thinking that, since `self` appears in the return statement, `cities` must be an iterator. Why not?

Because `self` is only passed as an argument: we start with a `Cities` instance called `cities`, and when we call `iter(cities)`, its `__iter__` returns a **new instance** of `CityIterator` that holds a reference to `cities`. `Cities` itself has no `__next__`.

`CityIterator` *is* however an **iterator** because it has implemented **both** `__iter__` and `__next__`. 

As a result, **iterators are themselves iterables, but they're iterables that *can* become exhausted**.

Iterables on the other hand **never become exhausted** because they always return a new iterator that iterates over the original collection.

In any case, calling `iter()` on something will **always return an iterator**. 
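
One consequence is that each call to `iter()` on the iterable gives an independent iterator with its own position. The sketch below repeats compact versions of `Cities` and `CityIterator` so that it runs on its own; the variable names at the bottom are only illustrative:

```python
class CityIterator:
    def __init__(self, city_obj):
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._city_obj):
            raise StopIteration
        item = self._city_obj._cities[self._index]
        self._index += 1
        return item


class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']

    def __len__(self):
        return len(self._cities)

    def __iter__(self):
        # A brand new iterator on every call
        return CityIterator(self)


cities = Cities()
first_iter = iter(cities)
second_iter = iter(cities)

# Advance the first iterator twice; the second one is unaffected.
from_first = [next(first_iter), next(first_iter)]
from_second = next(second_iter)

print(first_iter is second_iter)
print(from_first, from_second)
```

Both iterators read from the same underlying list, but each keeps its own `_index`, which is why the iterable itself never appears exhausted.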

**Chronology**

When iterating over an **iterable**, Python first:
- Calls `iter()` to obtain an iterator.
- Then, it starts iterating over the **iterator** using `next` and `StopIteration`, etc.
- After the iteration is complete, if we run another `for` loop over the same iterable, it will work because **iterables never become exhausted**.

Here's proof:

In [7]:
cities = Cities()

for city in cities:
    print(city)

New York
Newark
New Delhi
Newcastle


In [8]:
for city in cities:
    print(city)

New York
Newark
New Delhi
Newcastle


Note that every time we execute a `for` loop, we always start off by calling `__iter__`. Updating `CityIterator` with a print statement in `__iter__`, `__init__` and `__next__`, and `Cities` with a print statement in `__iter__` will make that clear:

In [11]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __iter__(self):
        print('Cities __iter__ called')
        return CityIterator(self)     

In [12]:
class CityIterator:
    def __init__(self, city_obj):
        print('CityIterator object created!')
        # city_obj is an instance of Cities
        self._city_obj = city_obj
        self._index = 0
        
    def __iter__(self):
        print("CityIterator __iter__ called")
        return self
    
    def __next__(self):
        print("CityIterator __next__ called")
        if self._index >= len(self._city_obj):
            raise StopIteration
        else:
            item = self._city_obj._cities[self._index]
            self._index += 1
            return item

Now, let's call `iter()` ourselves and then run the `for` loop over the returned iterator.

In [18]:
cities = Cities()

city_iter_1 = iter(cities)

Cities __iter__ called
CityIterator object created!


In [19]:
for city in city_iter_1:
    print(city)

CityIterator __iter__ called
CityIterator __next__ called
New York
CityIterator __next__ called
Newark
CityIterator __next__ called
New Delhi
CityIterator __next__ called
Newcastle
CityIterator __next__ called


A `for` loop **always** calls `__iter__` first, expecting an iterator to be returned (Python needs a consistent way of ensuring that it's given an iterator, because only iterators implement `__next__`). This holds even when, as here, the object is already an iterator. Then it calls `__next__` on the provided iterator.

#### Final, Neat Iterator-Iterable Solution

To keep things self-contained, we can put the `CityIterator` class inside `Cities`. Note that `return CityIterator(self)` then becomes `return self.CityIterator(self)`.

In [None]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __iter__(self):
        print('Calling Cities instance __iter__')
        return self.CityIterator(self)
    
    class CityIterator:
        def __init__(self, city_obj):
            # city_obj is an instance of Cities
            print('Calling CityIterator __init__')
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            print('Calling CityIterator instance __iter__')
            return self

        def __next__(self):
            print('Calling __next__')
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item

#### Mixing Iterables and Sequences

`Cities` is an iterable but not a sequence because we haven't implemented `__getitem__`. Recall that if we implement `__getitem__` (and not `__iter__`) and use a `for` loop, Python repeatedly calls `__getitem__`.

But as we've just seen, every time we call a `for` loop, Python calls `__iter__` to return an iterator, and then repeatedly calls `__next__` until we exhaust the iterator.

So which approach does Python take?

Let's take the self-contained class above and add `__getitem__` (delegating the responsibilities of slicing and indexing to the underlying list object of our collection), so that our instance is a **sequence and an iterable**:

In [20]:
class Cities:
    def __init__(self):
        self._cities = ['New York', 'Newark', 'New Delhi', 'Newcastle']
        
    def __len__(self):
        return len(self._cities)
    
    def __iter__(self):
        print('Calling Cities instance __iter__')
        return self.CityIterator(self)
    
    def __getitem__(self, s):
        print('getting item via __getitem__')
        return self._cities[s]
    
    class CityIterator:
        def __init__(self, city_obj):
            # city_obj is an instance of Cities
            print('Calling CityIterator __init__')
            self._city_obj = city_obj
            self._index = 0

        def __iter__(self):
            print('Calling CityIterator instance __iter__')
            return self

        def __next__(self):
            print('Calling __next__')
            if self._index >= len(self._city_obj):
                raise StopIteration
            else:
                item = self._city_obj._cities[self._index]
                self._index += 1
                return item

In [21]:
cities = Cities()

for city in cities:
    print(city)

Calling Cities instance __iter__
Calling CityIterator __init__
Calling __next__
New York
Calling __next__
Newark
Calling __next__
New Delhi
Calling __next__
Newcastle
Calling __next__


As you can see, **Python prefers the iterator-iterable approach**.

So to sum up, when Python wants to loop over some object using `for`, it first looks for `__iter__`. If it can't find one, it falls back to the `__getitem__` method.

Python's list object has both `__iter__` and `__getitem__` implemented so it prefers to use the `__iter__` protocol.
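
To see the fallback in action, here is a minimal sketch of a class that implements only `__getitem__`. `SquareSequence` is a hypothetical class written for this illustration; Python builds an iterator for it that calls `__getitem__` with 0, 1, 2, ... until an `IndexError` is raised:

```python
class SquareSequence:
    """Sequence-style class with no __iter__ (hypothetical example)."""

    def __init__(self, length):
        self._length = length

    def __getitem__(self, index):
        # Raising IndexError is what tells Python the sequence has ended.
        if not 0 <= index < self._length:
            raise IndexError(index)
        return index ** 2


squares = SquareSequence(4)

has_iter = hasattr(squares, '__iter__')     # no __iter__ defined anywhere
values = [value for value in squares]       # iteration still works

print(has_iter)
print(values)
```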

# 04 - Example 1 - Consuming Iterators Manually

It can be useful to manually iterate through an iterator using the `next()` function.

A fairly typical use case for this would be when reading data from a CSV file where you know the first few lines consist of information about the data rather than just the data itself.

Let's try this using a CSV file I have saved alongside the Jupyter notebook.

Let's first load the data and see what it looks like:

In [51]:
with open("../Section 04 - Iterables and Iterators/cars.csv") as file:
    
        for idx, line in enumerate(file):
            if idx > 4:
                break
            
            else:
                print(line)

Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin

STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT

Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US

Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US

Plymouth Satellite;18.0;8;318.0;150.0;3436.;11.0;70;US



As we can see, the values are delimited by `;` and the first two lines consist of the column names, and column types.

The reason for the spacing between each line is that each line ends with a newline, and our print statement also emits a newline by default. So we'll have to strip those out.

Here's what we want to do: 
* read the first line to get the column headers and create a named tuple class
* read data types from second line and store this so we can cast the strings we are reading to the correct data type
* read the data rows and parse them into named tuples

As we might expect, as we're looping through each row we'll need to have `if` statements to catch and deal with the first two rows independently.

In [52]:
with open("../Section 04 - Iterables and Iterators/cars.csv") as file:
    row_index = 0
    for line in file:
        if row_index == 0:
            # header row
            headers = line.strip('\n').split(';')
            print(headers)
        elif row_index == 1:
            # data type row
            data_types = line.strip('\n').split(';')
            print(data_types)
        else:
            # data rows
            data = line.strip('\n').split(';')
            print(data)
        row_index += 1
        
        if row_index == 4:
            break

['Car', 'MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight', 'Acceleration', 'Model', 'Origin']
['STRING', 'DOUBLE', 'INT', 'DOUBLE', 'DOUBLE', 'DOUBLE', 'DOUBLE', 'INT', 'CAT']
['Chevrolet Chevelle Malibu', '18.0', '8', '307.0', '130.0', '3504.', '12.0', '70', 'US']
['Buick Skylark 320', '15.0', '8', '350.0', '165.0', '3693.', '11.5', '70', 'US']


With the code above, we get each row of data as a list whose elements correspond to the data headers if it's the first row, data types if it's the second row, and all other rows are the car data.

Just to make our goal clear, let's print out the data types and data below one another:

In [53]:
data_types = ['STRING', 'DOUBLE', 'INT', 'DOUBLE', 'DOUBLE', 'DOUBLE', 'DOUBLE', 'INT', 'CAT']
data_row = ['Chevrolet Chevelle Malibu', '18.0', '8', '307.0', '130.0', '3504.', '12.0', '70', 'US']

print(data_types)
print(data_row)

['STRING', 'DOUBLE', 'INT', 'DOUBLE', 'DOUBLE', 'DOUBLE', 'DOUBLE', 'INT', 'CAT']
['Chevrolet Chevelle Malibu', '18.0', '8', '307.0', '130.0', '3504.', '12.0', '70', 'US']


We want to apply the data types above to the data below, element-wise. We can do this with the following function and a list comprehension:

In [54]:
def cast(data_type, value):
    if data_type == 'DOUBLE':
        return float(value)
    elif data_type == 'INT':
        return int(value)
    else:
        return str(value)
    
[cast(data_type, value) for data_type, value in zip(data_types, data_row)]

['Chevrolet Chevelle Malibu', 18.0, 8, 307.0, 130.0, 3504.0, 12.0, 70, 'US']

As you can see, our data is now in the correct type (float values like 18.0 are actual floats and not strings: '18.0').

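Our code became messy with chained `if` statements because we had to deal with the first and second rows separately. But if we get an iterator from `file`, we can use `next()` to manually advance to the next line. (File objects are in fact their own iterators, so `iter(file)` simply returns `file`; calling `iter()` just makes the intent explicit.)

This works because an iterator is consumed as we go: once `next()` has read the two header lines, the `for` loop picks up at the first data row, which keeps our code clean.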

In [57]:
from collections import namedtuple
cars = []

def cast(data_type, value):
    if data_type == 'DOUBLE':
        return float(value)
    elif data_type == 'INT':
        return int(value)
    else:
        return str(value)

def cast_row(data_types, data_row):
    return [cast(data_type, value) for data_type, value in zip(data_types, data_row)]
            

with open("../Section 04 - Iterables and Iterators/cars.csv") as file:
    
    file_iter = iter(file)  
    
    headers = next(file_iter).strip('\n').split(';')      # get 0th row
    Car = namedtuple('Car', headers)
    
    data_types = next(file_iter).strip('\n').split(';')   # get 1st row
    
    for line in file_iter:
        data = line.strip('\n').split(';')
        data = cast_row(data_types, data)
        car = Car(*data)
        cars.append(car)

cars[0]

Car(Car='Chevrolet Chevelle Malibu', MPG=18.0, Cylinders=8, Displacement=307.0, Horsepower=130.0, Weight=3504.0, Acceleration=12.0, Model=70, Origin='US')

This approach is fine, but if we wanted we could clean things up a little bit more. 

If you noticed, we have an empty `cars` list defined outside the `with` block and we append to it from the inside. We can replace that with a list comprehension. In the full notes, the code is shortened significantly with two list comprehensions, but it becomes quite unreadable, so I've left it out here.
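
As a middle ground, here is a sketch that replaces only the `cars.append` loop with a single list comprehension. To keep it runnable without the CSV file, it assumes an in-memory `io.StringIO` holding the header lines and the first two data rows shown earlier; `SAMPLE` and `split_line` are hypothetical names introduced for this example:

```python
import io
from collections import namedtuple

# Stand-in for cars.csv: the two header lines plus two data rows from above.
SAMPLE = (
    "Car;MPG;Cylinders;Displacement;Horsepower;Weight;Acceleration;Model;Origin\n"
    "STRING;DOUBLE;INT;DOUBLE;DOUBLE;DOUBLE;DOUBLE;INT;CAT\n"
    "Chevrolet Chevelle Malibu;18.0;8;307.0;130.0;3504.;12.0;70;US\n"
    "Buick Skylark 320;15.0;8;350.0;165.0;3693.;11.5;70;US\n"
)


def cast(data_type, value):
    if data_type == 'DOUBLE':
        return float(value)
    elif data_type == 'INT':
        return int(value)
    else:
        return str(value)


def cast_row(data_types, data_row):
    return [cast(data_type, value) for data_type, value in zip(data_types, data_row)]


def split_line(line):
    # Hypothetical helper: drop the trailing newline and split on ';'
    return line.strip('\n').split(';')


with io.StringIO(SAMPLE) as file:
    file_iter = iter(file)

    headers = split_line(next(file_iter))      # consume the 0th row
    Car = namedtuple('Car', headers)

    data_types = split_line(next(file_iter))   # consume the 1st row

    # The comprehension starts at the first data row because the
    # two header rows have already been consumed from the iterator.
    cars = [Car(*cast_row(data_types, split_line(line))) for line in file_iter]

print(cars[0])
```

The comprehension must stay inside the `with` block, since it reads from the file iterator while the file is still open.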

# 05 - Example 2 - Cyclic Iterators

Here's an example - suppose we have a loop that iterates over some range of integers. As we loop through those integers we want to create a tuple containing the integer and a string that cycles over a finite set (smaller than the list of integers).

```
1, 2, 3, 4, 5, 6, 7, 8, 9, ...

N, S, W, E
```

and we want to generate

```
1N, 2S, 3W, 4E, 5N, 6S, 7W, 8E, 9N, ...
```


We could do it this way by creating a custom iterator for the list `['N', 'S', 'W', 'E']` that will cycle over that list indefinitely:

In [2]:
class CyclicIterator:
    def __init__(self, lst):
        self.lst = lst
        self.i = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        result = self.lst[self.i % len(self.lst)]
        self.i += 1
        return result

In [3]:
iter_cycl = CyclicIterator('NSWE')

In [4]:
for i in range(10):
    print(next(iter_cycl))

N
S
W
E
N
S
W
E
N
S


Of course, there's an easy alternative way to do this as well, using:
* repetition
* zip
* a list comprehension

With repetition, we would repeat the list `['N', 'S', 'W', 'E']` at least as many times as needed to cover our range of integers - we can even create way more than we need - because when we `zip` it up with the range of integers, the shortest iterable determines the length of the result. For the same reason, we can zip the range with our infinite `CyclicIterator`:

In [6]:
numbers = range(1, 10)
iter_cycl = CyclicIterator('NSWE')

sol = zip(numbers, iter_cycl)

list(sol)

[(1, 'N'),
 (2, 'S'),
 (3, 'W'),
 (4, 'E'),
 (5, 'N'),
 (6, 'S'),
 (7, 'W'),
 (8, 'E'),
 (9, 'N')]
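
For comparison, the repetition idea (no custom iterator at all) can be sketched as follows. The names `repeats` and `repeated` are only illustrative, and the number of repetitions is deliberately a slight overestimate since `zip` stops at the shortest input:

```python
numbers = range(1, 10)
directions = ['N', 'S', 'W', 'E']

# Repeat the list enough times to cover every number (one extra copy is harmless).
repeats = len(numbers) // len(directions) + 1
repeated = directions * repeats

# zip stops as soon as `numbers` runs out, so the surplus directions are ignored.
result = [str(number) + direction for number, direction in zip(numbers, repeated)]

print(result)
```

The drawback is that the repeated list is fully materialized in memory, whereas a cyclic iterator produces each direction only when asked.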

Here's a solution that doesn't use zip:

In [23]:
n = 10
iter_cycl = CyclicIterator('NSWE')

[str(i) + next(iter_cycl) for i in range(1, n)]

['1N', '2S', '3W', '4E', '5N', '6S', '7W', '8E', '9N']

Here's one that does:

In [24]:
n = 10
iter_cycl = CyclicIterator('NSWE')

[str(number)+direction for number, direction in zip(range(1, n), iter_cycl)]

['1N', '2S', '3W', '4E', '5N', '6S', '7W', '8E', '9N']

There's an even easier way: instead of building our own `CyclicIterator`, we can simply use the one provided by Python in the standard library!

In [25]:
import itertools

n = 10
iter_cycl = itertools.cycle('NSWE')
[str(number)+direction for number, direction in zip(range(1, n), iter_cycl)]

['1N', '2S', '3W', '4E', '5N', '6S', '7W', '8E', '9N']

`itertools.cycle` takes any iterable (not just a sequence, as our `CyclicIterator` requires) and returns an iterator that can be looped through indefinitely using `next()`.
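
The difference shows up when the input is not a sequence. In the sketch below, the input is a plain iterator (no `len()`, no indexing): `itertools.cycle` handles it, while our index-based `CyclicIterator` (repeated here so the block runs on its own) fails. The variable names are only illustrative:

```python
import itertools


class CyclicIterator:
    """Index-based version from above: needs len() and indexing."""

    def __init__(self, lst):
        self.lst = lst
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        result = self.lst[self.i % len(self.lst)]
        self.i += 1
        return result


# itertools.cycle only needs to be able to iterate over its input.
cycled = itertools.cycle(iter('NSWE'))
first_six = list(itertools.islice(cycled, 6))

# Our version breaks on the same kind of input.
try:
    next(CyclicIterator(iter('NSWE')))
    custom_worked = True
except TypeError:
    custom_worked = False

print(first_six)
print(custom_worked)
```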

# 06 - Lazy Iterables

An iterable is an object that can return an iterator (`__iter__`).

In turn an iterator is an object that can return itself (`__iter__`), and return the next value when asked (`__next__`).

Nothing in all this says that the iterable needs to be a finite collection, or that the elements in the iterable need to be materialized (pre-created) at the time the iterable / iterator is created.

Lazy evaluation is when evaluating a value is deferred until it is actually requested.

It is not specific to iterables however.

Simple examples of lazy evaluation are often seen in classes for calculated properties. 
For example:

In [None]:
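class Actor:
    def __init__(self, actor_id):
        self.actor_id = actor_id
        self._movies = None  # cache, filled in on first access

    @property
    def movies(self):
        if self._movies is None:
            # lookup_movies_in_db is a placeholder for some expensive lookup
            self._movies = lookup_movies_in_db(self.actor_id)
        return self._movies

The movies of the actor are not looked up until they are requested via the property. Note that the cached value is stored under a different name (`_movies`) than the property (`movies`); using the same name for both would make the property call itself. Another example is that websites do not show all posts immediately; they will show a batch and only show the next batch once the first batch has been consumed.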

#### Example 1 - Circle

Let's look at a proper example of a lazy class property:

In [31]:
import math

class Circle:
    def __init__(self, r):
        self.radius = r
        
    @property
    def radius(self):
        return self._radius
    
    @radius.setter
    def radius(self, r):
        self._radius = r
        self.area = math.pi * r**2

First note:

`self.radius = r` in the `__init__` is fine; it's calling the setter method defined under `@radius.setter`. We never assign `self._radius` directly in the `__init__`, but that's okay because the setter creates it for us, and we then access it via the property.

As you can see, in this circle class, every time we set the radius, we re-calculate and store the area. When we request the area of the circle, we simply return the stored value.

In [32]:
c = Circle(1)

(c.radius, c.area)

(1, 3.141592653589793)

This is not lazy as we are storing the value of the area before we actually need it. We also don't want an approach that calculates the area every time we request it. 

So, we need a middle-ground:

In [34]:
class Circle:
    def __init__(self, r):
        self.radius = r
        
    @property
    def radius(self):
        return self._radius
    
    @radius.setter
    def radius(self, r):
        self._radius = r
        self._area = None

    @property
    def area(self):
        if self._area is None:
            print('Calculating area...')
            self._area = math.pi * self.radius ** 2
        return self._area

In [37]:
c = Circle(1)

print(c.area)
print(c.area)
print('')
c.radius = 2
print(c.area)

Calculating area...
3.141592653589793
3.141592653589793

Calculating area...
12.566370614359172


Looking at:

```
    @radius.setter
    def radius(self, r):
        self._radius = r
        self._area = None
```

What we do to the `self._area` attribute is called 'invalidating the cache'. Since the stored value of `_area` is no longer valid once a new radius has been set, we reflect that by setting it to `None`, which forces a re-calculation the next time `area` is requested.


#### Example 2 - Factorials

In [10]:
class Factorials:
    def __iter__(self):
        return self.FactIter()
    
    class FactIter:
        def __init__(self):
            self.i = 0
            
        def __iter__(self):
            return self
        
        def __next__(self):
            result = math.factorial(self.i)
            self.i += 1
            return result

In [59]:
factorials = Factorials()
fact_iter = iter(factorials)

for _ in range(10):
    print(next(fact_iter))

1
1
2
6
24
120
720
5040
40320
362880


You'll notice that the main part of the iterable code is in the iterator, and the iterable itself is nothing more than a thin shell that allows us to create and access the iterator. This is so common, that there is a better way of doing this that we'll see when we deal with generators.

Also, remember that `factorials` is an iterable, so if we loop through it using a `for` loop, Python will first call `__iter__` and expect an iterator back. Then, it will call `__next__` on that iterator until it is exhausted. We can't exhaust our iterator in this situation as the iterable is infinite, so we'll just `break` for convenience.

In [60]:
factorials = Factorials()

for _ in factorials:
    print(_)
    if _ == 120:
        break

1
1
2
6
24
120


But we can't call `next()` on `factorials` itself because it's an iterable, not an iterator - we haven't implemented `__next__`.
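
A minimal sketch of that failure, using a copy of the `Factorials` class from above. The variable names `error_message` and `first_value` are introduced here purely for illustration.

```python
import math

class Factorials:
    """Iterable only: defines __iter__ but not __next__."""
    def __iter__(self):
        return self.FactIter()

    class FactIter:
        def __init__(self):
            self.i = 0

        def __iter__(self):
            return self

        def __next__(self):
            result = math.factorial(self.i)
            self.i += 1
            return result

factorials = Factorials()

try:
    next(factorials)            # the iterable itself has no __next__
except TypeError as ex:
    error_message = str(ex)
    print('TypeError:', error_message)

# Asking the iterable for an iterator first works as expected.
first_value = next(iter(factorials))
print(first_value)
```

In other words, `next()` only works on the object returned by `iter(factorials)`, not on `factorials` itself.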

# 07 - Python's Built-In Iterables and Iterators

We should always be aware of whether we are dealing with an iterable or an iterator. Why?

- If an object is an **iterable** (but not an iterator), you can iterate over it many times, because it returns a new iterator every time.
- If an object is an **iterator**, you can only iterate over it once, because it returns itself and eventually becomes exhausted.

Here are some built-ins that we've been dealing with:

- `range(10)` -> iterable
- `zip(l1, l2)` -> iterator
- `enumerate(l1)` -> iterator
- `open('cars.csv')` -> iterator
- `my_dict.keys()` -> iterable
- `my_dict.values()` -> iterable
- `my_dict.items()` -> iterable

Here's proof on the first two:

In [88]:
my_range = range(10)
print(list(my_range))
print(list(my_range))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [89]:
l1 = [1, 2, 3]
l2 = 'abc'
my_zip = zip(l1, l2)
print(list(my_zip))
print(list(my_zip))

[(1, 'a'), (2, 'b'), (3, 'c')]
[]


We can check explicitly whether something is an iterable or an iterator by using `dir()`, which lists all the attributes of an object. Iterables have `__iter__` only, while iterators have both `__iter__` and `__next__`.

In [90]:
r = range(10)
'__iter__' in dir(r), '__next__' in dir(r)

(True, False)

Another way is to check if `iter(object) is object`. 

If `True`, then it's an iterator, because an iterator's `__iter__` method always returns the iterator itself. If `False`, it's an iterable that is not an iterator.

In [97]:
iter(r) is r

False
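
The `iter(obj) is obj` check can be wrapped in a small helper and applied to the built-ins listed earlier. This is only a sketch: `kind` is a hypothetical helper name, not a built-in, and the sample objects are made up for illustration.

```python
def kind(obj):
    """Classify obj as 'iterator', 'iterable' or 'not iterable' (hypothetical helper)."""
    try:
        it = iter(obj)
    except TypeError:
        return 'not iterable'
    # An iterator returns itself from __iter__; a plain iterable returns a new object.
    return 'iterator' if it is obj else 'iterable'

d = {'a': 1}
results = {
    'range(10)': kind(range(10)),
    'zip(...)': kind(zip([1, 2], 'ab')),
    'enumerate(...)': kind(enumerate('ab')),
    'd.keys()': kind(d.keys()),
    'd.items()': kind(d.items()),
    '100': kind(100),
}

for name, result in results.items():
    print(f'{name:15} -> {result}')
```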

#### cars csv Example

Consider this example, where we want to find out all the different origins in the file (last column of each row) - let's do this using both approaches.

Remember that `f` (i.e. `open('cars.csv')`) is an iterator, which means its elements can be retrieved one at a time with `next()`. In this case, each element is a string that looks something like:

`'Honda Civic 1500 gl;44.6;4;91.00;67.00;1850.;13.8;80;Japan\n'`

We want to extract the origin (Japan in the above case).

Here are two approaches:

In [98]:
origins = set()
with open("../Section 04 - Iterables and Iterators/cars.csv") as f:
    rows = f.readlines()
for row in rows[2:]:
    origin = row.strip('\n').split(';')[-1]
    origins.add(origin)
print(origins)

{'US', 'Japan', 'Europe'}


In [99]:
origins = set()
with open("../Section 04 - Iterables and Iterators/cars.csv") as f:
    next(f), next(f)
    for row in f:
        origin = row.strip('\n').split(';')[-1]
        origins.add(origin)
print(origins)

{'US', 'Japan', 'Europe'}


Now consider the first approach: we loaded the **entire** file into memory (a list), and then iterated through all the rows.

In the second approach, we still iterated through all the rows, but we only needed to hold **one row** in memory at a time - the overhead was therefore far smaller.

Often we can process files one row at a time and loading the entire file first, especially for huge files, is not always desirable.
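
As a sketch of the same row-at-a-time idea without needing the actual file, the block below uses `io.StringIO` as a stand-in for `open('cars.csv')`. The rows are made-up sample data with fewer columns than the real file. `itertools.islice` is used here as an alternative way of skipping the two header lines lazily.

```python
import io
from itertools import islice

# Stand-in for the open file: two header lines followed by data rows.
# (Illustrative sample data only, not the contents of cars.csv.)
fake_file = io.StringIO(
    "Car;MPG;Origin\n"
    "STRING;DOUBLE;CAT\n"
    "Honda Civic 1500 gl;44.6;Japan\n"
    "Ford Fiesta;36.1;US\n"
    "Volkswagen Rabbit;29.0;Europe\n"
    "Datsun 210;40.8;Japan\n"
)

origins = set()
# islice skips the first two lines and then yields the rest one at a time,
# so only a single row is held in memory during each loop iteration.
for row in islice(fake_file, 2, None):
    origins.add(row.strip('\n').split(';')[-1])

print(sorted(origins))
```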

# 08 - Sorting Iterables

There's nothing really new here - we have seen the `sorted()` function before when we looked at sorting sequences.

The `sorted()` function will in fact work with any iterable, not just sequences.

Let's try this by creating a custom iterable and then sorting it.

For this example, we'll create an iterable of random numbers, and then sort it.

In [17]:
import random

class RandomInts:
    def __init__(self, length, *, seed=0, lower=0, upper=10):
        self.length = length
        self.seed = seed
        self.lower = lower
        self.upper = upper
        
    def __len__(self):
        return self.length
    
    def __iter__(self):
        return self.RandomIterator(self.length, 
                                   seed = self.seed, 
                                   lower = self.lower,
                                   upper=self.upper)
    
    
    class RandomIterator:
        def __init__(self, length, *, seed, lower, upper):
            self.length = length
            self.lower = lower
            self.upper = upper
            self.num_requests = 0
            random.seed(seed)
            
        def __iter__(self):
            return self
        
        def __next__(self):
            if self.num_requests >= self.length:
                raise StopIteration
            else:
                result = random.randint(self.lower, self.upper)
                self.num_requests += 1
                return result

We have `random.seed(seed)` instead of `self.seed = random.seed(seed)` because all we want to do is set/reset the seed whenever a new iterator is created. This will take care of that.

In [18]:
randoms = RandomInts(4)

In [19]:
for num in randoms:
    print(num)

6
6
0
4


In [20]:
for num in randoms:
    print(num)

6
6
0
4


We will keep getting the same values each time because, when the `for` loop starts, `__iter__` in `RandomInts` is called. This returns a new iterator that receives the stored seed value and, upon initialisation, immediately re-seeds the random number generator with it.

Now that we have our iterable, we can either iterate through it like above, or we can sort it with `sorted()`:

In [21]:
sorted(randoms)

[0, 4, 6, 6]

# 09 - The iter() Function

When Python needs to perform iteration over an iterable, we've seen that it uses the `__iter__` method, which returns an iterator. Then, Python calls `__next__` on that newly returned iterator.

But Python uses `__iter__` **indirectly**. 

What it actually does is call the `iter(obj)` function - there's a big difference. Why? 

This is what Python does when `iter(obj)` is called:

- Looks for an `__iter__` method.
    - If it's there, it uses it.
    - If not, it looks for a `__getitem__` method.
        - If it's there, it **creates an iterator object instance** and returns that.
        - If not, it raises a `TypeError` exception saying the object is not iterable.





Because if the object passed to the `iter()` function only implements `__getitem__` (i.e. a sequence), we *still* get an **iterator** back, even though our sequence class may not have an `__iter__` method.

What does that iterator look like? It's basically an instance of the class below:

In [23]:
class SeqIterator:
    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try: 
            item = self.seq[self.index]
            self.index += 1
            return item

        except IndexError:
            raise StopIteration()
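
To see this fallback in action, here is a small sketch with a class that implements only `__getitem__`. The `Squares` class is a hypothetical example written for this note.

```python
class Squares:
    """Sequence-style class: implements __getitem__ only, no __iter__."""
    def __init__(self, length):
        self.length = length

    def __getitem__(self, index):
        if index < 0 or index >= self.length:
            raise IndexError(index)   # tells the iterator where the sequence ends
        return index ** 2

sq = Squares(5)
has_iter = hasattr(sq, '__iter__')    # no __iter__ method on the class

sq_iter = iter(sq)                    # still works, via the __getitem__ fallback
returns_itself = iter(sq_iter) is sq_iter
values = list(sq_iter)

print(has_iter, returns_itself, values)
```

The `IndexError` raised by `__getitem__` is what ends the iteration, just as in the `SeqIterator` class above.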

**Basic Error Handling**

In Python, we have two ways of dealing with ambiguous objects as inputs. 

The first is to 'ask for permission/ look before you leap' using an `if` and `else` block. For example:

```python
obj = 100
if is_iterable(obj): # not an inbuilt function
    for i in obj:
        print(i)

else:
    print('Error: obj is not iterable')
```

The other is to 'ask for forgiveness' using a `try` and `except` block. For example:

```python
obj = 100
try:
    for i in obj:
        print(i)

except TypeError:
    print('Error: obj is not iterable')
```

In Python, the second style is generally recommended, following the saying that ***it's easier to ask for forgiveness than it is to get permission***.
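
Since `is_iterable()` is not a built-in, one possible sketch of it simply asks `iter()` and catches the `TypeError`. The function name and behaviour here are assumptions for illustration only.

```python
def is_iterable(obj):
    """Hypothetical helper: True if iter(obj) succeeds."""
    try:
        iter(obj)          # works for __iter__ and for the __getitem__ fallback
    except TypeError:
        return False
    return True

for candidate in ([1, 2, 3], 'abc', {1, 2}, 100, None):
    print(repr(candidate), is_iterable(candidate))
```

One design point worth noting: wrapping only the `iter()` call in the `try` block keeps a `TypeError` raised by unrelated code inside a loop body from being mistaken for "not iterable".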

# 10 - Iterating Callables

By **iterating callables**, what we really mean is **iterating over the return values of a callable**.

Take a look at the example below:

In [5]:
num = 6

def countdown():
    global num
    num -= 1
    
    return num

while True:
    val = countdown()
    if val == 0:
        break
    else:
        print(val)

5
4
3
2
1


We could approach this in another way by using an iterator. It will need to know two things: the callable (here, the `countdown` function) and the **sentinel** value which, when returned by the callable, causes the iterator to raise `StopIteration` and become exhausted.

A **sentinel value** is, roughly speaking, a special placeholder value that is used to flag a condition, such as the end of the data. For example, a sentinel-controlled loop is one that terminates when the sentinel value is reached. `None` is commonly used as a sentinel value.

**Second form of the `iter()` function**

We saw that `iter()` returns an iterator which leverages either the `__iter__` method (if the iterable protocol is implemented) or `__getitem__` (if the sequence protocol is implemented).

We can also let it control termination with the sentinel value -> **`iter(callable, sentinel)`**.

This will return an **iterator** that will:
- call the callable when `next()` is called.
- either
  - raise `StopIteration` if the result is equal to the **sentinel** value.
  - return the result otherwise.

Before we use this, we'll make our own so we understand how it works.

##### Example 1

**Writing our own `iter()`**

In this example we are going to create a counter function (using a closure) - it's a pretty simplistic function - `counter()` will return a closure that we can then call to increment an internal counter by `1` every time it is called:

In [6]:
def counter():
    i = 0
    
    def inc():
        nonlocal i
        i += 1
        return i
    return inc

**Quick reminder**: We need the `nonlocal` keyword because the `inc()` function contains `i += 1`, which is equivalent to `i = i + 1` and is therefore an **assignment, not just a reference**. Without `nonlocal`, Python would treat `i` as a local variable of `inc()`, and evaluating `i + 1` would raise an `UnboundLocalError` (local variable referenced before assignment).

This function allows us to create a simple counter, which we can use as follows:

In [10]:
cnt = counter()
print(cnt())
print(cnt())

1
2


But this may not be the best way of doing it. We can instead use an iterator.

In [14]:
class CounterIterator:
    def __init__(self, counter_callable, sentinel):
        self.counter_callable = counter_callable
        self.sentinel = sentinel
        
    def __iter__(self):
        return self
    
    def __next__(self):
        result = self.counter_callable()
        if result == self.sentinel:
            raise StopIteration
        else:
            return result

In [17]:
cnt = counter()  # counter() returns a function, so cnt is a callable: cnt() is valid and returns a value.
cnt_iter = CounterIterator(cnt, sentinel=5)
for c in cnt_iter:
    print(c)

1
2
3
4


Everything seems good, except for the fact that the iterator is still *alive*. 

In [18]:
next(cnt_iter)

6

We need to exhaust/consume the iterator. So, let's make a flag in our iterator class to handle this. But, let's also generalise it away from just 'counters'.

In [21]:
class CallableIterator:
    def __init__(self, callable_, sentinel):
        self.callable_ = callable_
        self.sentinel = sentinel
        self.is_consumed = False
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.is_consumed:
            raise StopIteration
        else:
            result = self.callable_()
            if result == self.sentinel:
                self.is_consumed = True
                raise StopIteration
            else:
                return result

Now it should behave as a normal iterator that cannot continue iterating once the first `StopIteration` exception has been raised:

In [22]:
cnt = counter()
cnt_iter = CallableIterator(cnt, 5)
for c in cnt_iter:
    print(c)

1
2
3
4


In [23]:
next(cnt_iter)

StopIteration: 

**Using the inbuilt `iter(callable, sentinel)`**

Now, we can achieve the same functionality via the inbuilt function:

In [25]:
cnt = counter()

cnt_iter = iter(cnt, 5)

for i in cnt_iter:
    print(i)

1
2
3
4


The two-argument form of `iter()` is useful when you have a callable that returns one value after another each time it is called: you can iterate over those return values until a chosen value is reached.

For example, we can generate random values using `random.randint()` until we reach a specific value, but we'll first need to make it a callable using a `lambda`.

In [38]:
import random

rand_iter = iter(lambda: random.randint(0, 10), 5)
# random.seed(0)
for i in rand_iter:
    print(i)

7
8
1


You can keep calling the code block above to see how many times it takes to reach the sentinel value of 5. You can also uncomment the seed line to make it consistent.
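
Another place this form tends to be handy is reading a stream in fixed-size chunks until an empty string comes back. The sketch below is not part of the original lecture: it uses `io.StringIO` as a stand-in for an open file and `functools.partial` to turn `read(4)` into a zero-argument callable.

```python
import io
from functools import partial

stream = io.StringIO('abcdefghij')     # stand-in for an open text file

# read_chunk() takes no arguments and returns up to 4 characters per call;
# at the end of the stream it returns '', which we use as the sentinel.
read_chunk = partial(stream.read, 4)

chunks = list(iter(read_chunk, ''))
print(chunks)
```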

##### Example 2

Here's another quick example of the countdown that uses the closure approach:

In [56]:
def countdown(val):

    def run():
        nonlocal val
        val -= 1
        return val

    return run

takeoff = countdown(5)
takeoff_iter = iter(takeoff, -1)

for i in takeoff_iter:
    print(i)

4
3
2
1
0


**Again, a quick reminder:** We need the `nonlocal` keyword because the `run()` function contains `val -= 1`, which is equivalent to `val = val - 1` and is therefore an **assignment, not just a reference**. Without `nonlocal`, Python would treat `val` as a local variable of `run()`, and evaluating `val - 1` would raise an `UnboundLocalError` (local variable referenced before assignment).

# 11 - Delegating Iterators

Delegation, as we've seen before, saves us from implementing methods ourselves when an object we already hold implements them and can do the work for us. It's probably easiest to see this with an example:

In [57]:
from collections import namedtuple

Person = namedtuple('Person', 'first last')

class PersonNames:
    def __init__(self, persons):
        try:
            self._persons = [person.first.capitalize()
                             + ' ' + person.last.capitalize()
                            for person in persons]
        except (TypeError, AttributeError):
            self._persons = []

- `persons` in the `__init__` is meant to be an iterable/list of `Person` objects (named tuples)
- We need to catch the `TypeError` if `persons` is not an iterable.
- We need to catch an `AttributeError` if a `person` object does not have a `.first` or `.last` attribute.

If either exception occurs, we fail silently for the sake of simplicity, but this shouldn't be done in practice.

In [62]:
persons = [Person('michaeL', 'paLin'), Person('eric', 'idLe'), 
           Person('john', 'cLeese')]

In [63]:
person_names = PersonNames(persons)
person_names._persons

['Michael Palin', 'Eric Idle', 'John Cleese']

At this stage, we can iterate through these names e.g. `for name in person_names._persons` (requiring us to know about this class's pseudoprivate attributes), but what we can't do is iterate through `person_names` because the `PersonNames` class does not have any iterator/iterable protocol implemented.

In [67]:
for person in person_names:
    print(person)

TypeError: 'PersonNames' object is not iterable

Because we'll want to iterate through this list without exhaustion, we should implement the iterable protocol. Therefore, we need `__iter__` which returns a new iterator.

It would be a lot of effort to write an iterator class which implements `__next__` to get an element by index, increment the index, and raise StopIteration. See below.

```python
    def __next__(self):
        try: 
            item = self.seq[self.index]
            self.index += 1
            return item

        except IndexError:
            raise StopIteration()
```

Instead, we can just **delegate** all of iterator creation to `iter()`

Note that we can't return the `._persons` list itself from `__iter__`, because a list is an iterable, not an iterator, and `__iter__` must return an iterator. But we can obtain an iterator from any iterable very easily using `iter()` and return that.

In [72]:
class PersonNames:
    def __init__(self, persons):
        try:
            self._persons = [person.first.capitalize() + ' ' + person.last.capitalize()
                            for person in persons]
        except (TypeError, AttributeError):
            self._persons = []
    
    def __iter__(self):
        return iter(self._persons)

And now, `PersonNames` is iterable!

In [73]:
persons = [Person('michaeL', 'paLin'), Person('eric', 'idLe'), 
           Person('john', 'cLeese')]
person_names = PersonNames(persons)

In [74]:
for p in person_names:
    print(p)

Michael Palin
Eric Idle
John Cleese


`person_names` is a true iterable (because it returns a new iterator every time `iter()` is called on it), so we can iterate over it repeatedly, use it in list comprehensions, etc.
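
A short sketch of what that buys us, using a trimmed-down version of `PersonNames` that takes ready-made name strings. This simplification is an assumption made to keep the example self-contained.

```python
class PersonNames:
    """Trimmed-down version: stores names and delegates iteration to the list."""
    def __init__(self, names):
        self._persons = list(names)

    def __iter__(self):
        return iter(self._persons)   # a fresh list iterator on every call

person_names = PersonNames(['Michael Palin', 'Eric Idle', 'John Cleese'])

# Several independent passes over the same object all see every name.
upper_names = [name.upper() for name in person_names]
sorted_names = sorted(person_names)
fresh_each_time = iter(person_names) is not iter(person_names)

print(upper_names)
print(sorted_names)
print(fresh_each_time)
```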

# 12 - Reversed Iteration

Sometimes we may want to iterate through an iterable but in **reverse** order.

Of course, this means the collection being iterated must be finite.

Python has a built-in function called `reversed()` to do this, which works with any type that implements the sequence protocol. But for iterables in general it's a little more complicated.

Also note, for our own custom objects, `reversed()` doesn't automatically do the reversing for us. It's up to us to write the functionality in `__reversed__`.

The `reversed()` function works very similarly to `iter()`.

For a **sequence**:

- Looks for a `__reversed__` method.
    - If it's there, it uses it, returning an iterator.
    - If not, it looks for both a `__getitem__` and a `__len__` method (the length is needed to find the end of the sequence to work back from).
        - If they're there, it **creates an iterator object instance**, leveraging those methods.
        - If not, it raises a `TypeError` exception saying the object is not reversible.

Let's first build a custom iterable.

For this example we are going to build a custom iterable that returns cards from a 52-card deck.

The deck will be in order of suits (Spades, Hearts, Diamonds and Clubs) and card values (from 2 (lowest) to Ace (highest)).

We are going to use lazy loading - i.e. we are not going to pre-build our card deck.

We just need to recognize that each suit contains `13` cards, so an integer division of the card's index in the deck by `13` will tell us which suit it is. But of course we start indexing at 0.

**Example**

If the requested card is the `6`th in the deck (i.e. index = `5`):

`5 // 13 = 0` ==> first suit (Spades)

If the requested card is the `13`th in the deck (i.e. index = `12`):

`12 // 13 = 0` ==> first suit (Spades)

If the requested card is the `14`th in the deck (i.e. index = `13`):

`13 // 13 = 1` ==> second suit (Hearts)

To determine which card in the suit we are interested in, we simply need to use the `%` operator, again recognizing that there are `13` cards in each suit:

**Example**

If the requested card is the `6`th in the deck (i.e. index = `5`):

`5 % 13 = 5` ==> index `5` in the suit (the `6`th card)

If the requested card is the `13`th in the deck (i.e. index = `12`):

`12 % 13 = 12` ==> index `12` in the suit (the `13`th card)

If the requested card is the `14`th in the deck (i.e. index = `13`):

`13 % 13 = 0` ==> `1`st card in the suit
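
The two calculations above can be combined with `divmod()`, which returns the quotient and remainder in one call. A minimal sketch, where `card_at` is a hypothetical helper and `SUITS`/`RANKS` mirror the ordering described above:

```python
SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
RANKS = tuple(range(2, 11)) + tuple('JQKA')   # 13 ranks, 2 up to Ace

def card_at(index):
    """Return (rank, suit) for a 0-based position in the ordered deck."""
    # Quotient -> which suit, remainder -> which rank within that suit.
    suit_index, rank_index = divmod(index, len(RANKS))
    return RANKS[rank_index], SUITS[suit_index]

for index in (5, 12, 13, 51):
    print(index, card_at(index))
```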

In [95]:
_SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
_RANKS = tuple(range(2, 11) ) + tuple('JQKA')
from collections import namedtuple

Card = namedtuple('Card', 'rank suit')

class CardDeck:
    def __init__(self):
        self.length = len(_SUITS) * len(_RANKS)

    def __len__(self):
        return self.length
    
    def __iter__(self):
        return self.CardDeckIterator(self.length)
        
    class CardDeckIterator:
        def __init__(self, length):
            self.length = length
            self.i = 0
            
        def __iter__(self):
            return self
        
        def __next__(self):
            if self.i >= self.length:
                raise StopIteration
            else:
                suit = _SUITS[self.i // len(_RANKS)]
                rank = _RANKS[self.i % len(_RANKS)]
                self.i += 1
                return Card(rank, suit)

We can now iterate over a deck of cards as follows:

In [96]:
deck = CardDeck()

j = 0 # just so we don't print all 52 cards

for card in deck:
    if j < 6:
        print(card)
        j += 1
    else:
        break

Card(rank=2, suit='Spades')
Card(rank=3, suit='Spades')
Card(rank=4, suit='Spades')
Card(rank=5, suit='Spades')
Card(rank=6, suit='Spades')
Card(rank=7, suit='Spades')


Now that we have our deck, how would we obtain the last few cards in reverse order from the deck lazily, i.e., without loading the entire thing into memory?

We can't use `reversed()` just yet because we haven't implemented `__reversed__`, so let's do that.

We know it must return a new iterator, but we don't want to create a whole new iterator class just for reversing. So instead, let's modify our current iterator class to take in a reverse flag that defaults to false: `reverse=False`. This is the same pattern as `sorted()`.

In [99]:
_SUITS = ('Spades', 'Hearts', 'Diamonds', 'Clubs')
_RANKS = tuple(range(2, 11) ) + ('J', 'Q', 'K', 'A')
from collections import namedtuple

Card = namedtuple('Card', 'rank suit')

class CardDeck:
    def __init__(self):
        self.length = len(_SUITS) * len(_RANKS)

    def __len__(self):
        return self.length
    
    def __iter__(self):
        return self.CardDeckIterator(self.length)
        
    def __reversed__(self):
        return self.CardDeckIterator(self.length, reverse=True)
    
    class CardDeckIterator:
        def __init__(self, length, *, reverse=False):
            self.length = length
            self.reverse = reverse
            self.i = 0
            
        def __iter__(self):
            return self
        
        def __next__(self):
            if self.i >= self.length:
                raise StopIteration
            else:
                if self.reverse:
                    index = self.length - 1 - self.i
                else:
                    index = self.i
                suit = _SUITS[index // len(_RANKS)]
                rank = _RANKS[index % len(_RANKS)]
                self.i += 1
                return Card(rank, suit)

In [100]:
deck = reversed(CardDeck())
j = 0 # just so we don't print all 52 cards

for card in deck:
    if j < 6:
        print(card)
        j += 1
    else:
        break

Card(rank='A', suit='Clubs')
Card(rank='K', suit='Clubs')
Card(rank='Q', suit='Clubs')
Card(rank='J', suit='Clubs')
Card(rank=10, suit='Clubs')
Card(rank=9, suit='Clubs')


As mentioned earlier, this entire process was necessary because our iterable made from `CardDeck` is **not a sequence**. 

If it were a sequence, it would have `__len__` and `__getitem__` implemented, and `reversed()` would build its own iterator from those sequence protocol methods to iterate in reverse. Of course, we could still override that behaviour by implementing our own `__reversed__` in the sequence class.
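
For comparison, here is a sketch of the sequence case, where `reversed()` needs no help from us. `Tens` is a hypothetical class that implements only `__len__` and `__getitem__`.

```python
class Tens:
    """Hypothetical sequence: implements __len__ and __getitem__ only."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        if not 0 <= index < self.n:
            raise IndexError(index)
        return index * 10

# No __reversed__ defined: reversed() falls back to __len__ and __getitem__.
backwards = list(reversed(Tens(4)))
print(backwards)

# A plain iterator has neither __reversed__ nor the sequence protocol.
try:
    reversed(iter([1, 2, 3]))
except TypeError:
    iterator_reversible = False
else:
    iterator_reversible = True
print(iterator_reversible)
```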

# 13 - Caveat Using Iterators for Function Arguments

When a function requires an iterable for one of its arguments, it will also work with any iterator (since iterators are themselves iterables).

But things can go wrong if you do that!

Let's say we have an iterator that returns a collection of random numbers, and we want to find the minimum and maximum value of each such collection:

In [1]:
import random

In [2]:
class Randoms:
    def __init__(self, n):
        self.n = n
        self.i = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        else:
            self.i += 1
            return random.randint(0, 100)

In [3]:
random.seed(0)
l = list(Randoms(10))
print(l)

[49, 97, 53, 5, 33, 65, 62, 51, 100, 38]


Now we can easily find the min and max values:

In [4]:
min(l), max(l)

(5, 100)

But watch what happens if we do this:

In [5]:
random.seed(0)
l = Randoms(10)

In [6]:
min(l)

5

In [7]:
max(l)

ValueError: max() arg is an empty sequence

That's because when `min` ran, it iterated over the **iterator** `Randoms(10)`. When we called `max` on the same iterator, it had already been exhausted - i.e. the argument to max was now empty!

There are a couple of workarounds.

1. Make an iterable out of your iterator, e.g. `l = list(my_iterator)`, and use `l` from then on. This will store it all in memory though.
2. If the iterator is a file object (e.g. from `open('my_file.csv')`), create a fresh iterator by opening the file again for each pass.
3. Prevent your functions from accepting an iterator instead of an iterable as an argument, or make an iterable out of the iterator inside the function.

For the last point, it is as simple as:

```python
def my_function(data):
    if iter(data) is data:
        # Option 1: reject iterators outright
        raise ValueError('data cannot be an iterator')
        # Option 2 (use instead of the raise): materialise the iterator
        # data = list(data)

    # rest of code
```
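
As a concrete sketch of the "make an iterable out of the iterator" option, the hypothetical `min_max` function below materialises its argument only when it is an iterator, so both passes see the same data. The sample numbers are taken from the list printed earlier.

```python
def min_max(data):
    """Return (min, max) of data, accepting iterables and iterators alike."""
    if iter(data) is data:
        # data is an iterator: a second pass would find it exhausted,
        # so store its values in a list first (at the cost of memory).
        data = list(data)
    return min(data), max(data)

values = [49, 97, 53, 5, 33]
from_list = min_max(values)            # plain iterable
from_iterator = min_max(iter(values))  # iterator, consumed only once

print(from_list, from_iterator)
```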