### Idiomatic Python: The `iter` Function

This is the first of a series of videos I am planning to create on writing more Pythonic code.

Often we end up writing an algorithm using the first solution that comes to mind, especially if we have a background in another language (maybe Java, JavaScript, C, etc.), but also when we are beginners.

The thing is that each language has its strengths and unique abilities, and simply "translating" an algorithm from one language to another does not necessarily take advantage of the target language's specific features.

And so it is with Python - Python has some unique language features that, as Python developers, we should leverage. Doing so is often called writing idiomatic Python, or Pythonic code.

In this series we are going to look at various specific examples - over time I will add all these videos to the `Idiomatic Python` playlist in this channel.

Some will be extremely short, some a bit longer, and my list will likely never be exhaustive!

You will also find that some topics are more general than Python itself, but still apply to Python development.

Today, we're going to look at the `iter` function.

You've probably used the `iter` function before if you were trying to get an iterator from an iterable, so you could use `next` on it.

For example:

In [1]:
numbers = [1, 2, 3, 4, 5]

In [2]:
numbers

[1, 2, 3, 4, 5]

Now a list is not an iterator object, but it is an iterable.

So while we cannot call `next()` directly on our list:

In [3]:
try:
    next(numbers)
except TypeError as ex:
    print("TypeError:", ex)

TypeError: 'list' object is not an iterator


we can get an iterator for it first, and then call `next()` on the iterator:

In [4]:
iter_numbers = iter(numbers)

for _ in range(5):
    print(next(iter_numbers))

1
2
3
4
5


So the common way of using `iter()` is to retrieve an iterator from an iterable, which you can then call `next()` on to step through the values one at a time.
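For context, this is roughly what a `for` loop does for you behind the scenes. The following is only a minimal sketch of the idea, not the interpreter's actual implementation, and the name `collected` is just something I made up for the illustration:

```python
numbers = [1, 2, 3, 4, 5]

# Roughly what `for n in numbers: collected.append(n)` does for us.
collected = []
iterator = iter(numbers)       # ask the iterable for an iterator
while True:
    try:
        n = next(iterator)     # fetch the next value
    except StopIteration:      # raised once the iterator is exhausted
        break
    collected.append(n)

print(collected)
```

Notice that `next()` signals the end of the iteration by raising `StopIteration`, which is why the earlier example was careful to call `next()` exactly five times.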

But, `iter()` can also be called with **two** arguments.

In that case, the first argument needs to be a callable, and the second argument is a sentinel value.

When you call `iter()` with those two arguments, it builds and returns an iterator that calls the first argument (which is why it needs to be a callable) each time a value is requested, until the callable returns a value that is equal (`==`) to the second argument (called the **sentinel** value), at which point the iterator is considered "empty" (or exhausted, or fully iterated).
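One way to picture this behavior is as a small generator function. This is only an approximation under my own assumptions: `iter_until` is a hypothetical name, and the real built-in is implemented in C and differs in its details. I'm using `list.pop` as the callable so the values are predictable:

```python
def iter_until(func, sentinel):
    """Hypothetical pure-Python stand-in for iter(func, sentinel)."""
    while True:
        value = func()           # call the callable with no arguments
        if value == sentinel:    # stop as soon as we see the sentinel
            return
        yield value


stack_a = [9, 0, 4, 1, 3]
stack_b = list(stack_a)          # independent copy for the built-in

# list.pop() removes and returns the last element on each call.
from_sketch = list(iter_until(stack_a.pop, 0))
from_builtin = list(iter(stack_b.pop, 0))

print(from_sketch, from_builtin)
```

Both versions stop as soon as `pop()` returns `0`, and the sentinel itself is never yielded.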

Here's a rather silly example (we'll get to a more realistic example in a bit):

In [5]:
import random

def gen_randint():
    return random.randint(0, 10)

Now we want to generate a sequence of these random integers, until we hit `5` for the first time.

We could do it easily this way:

In [6]:
random.seed(0) 

sentinel = 5
while True:
    result = gen_randint()
    if result != sentinel:
        print(result)
    else:
        break

6
6
0
4
8
7
6
4
7


But we can leverage `iter()` to achieve the same thing in a much simpler way.

First we create an iterator object (specifically a `callable_iterator` object):

In [7]:
iterator = iter(gen_randint, 5)

print(type(iterator))

<class 'callable_iterator'>


And we can use it this way:

In [8]:
random.seed(0)

for number in iter(gen_randint, 5):
    print(number)

6
6
0
4
8
7
6
4
7


As you can see, the iterator basically keeps calling `gen_randint` until that function returns the sentinel value `5`.

As it is an iterator, you can of course call `next()` on it (you can also call `iter()` on it, and it will return the object back since it already is an iterator):

In [9]:
iterator = iter(gen_randint, 5)

iter(iterator) is iterator

True

In [10]:
random.seed(0)

for _ in range(4):
    print(next(iterator))

6
6
0
4
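One thing to keep in mind when calling `next()` by hand: once the callable returns the sentinel, the iterator raises `StopIteration` and stays exhausted from then on. Here is a small sketch using a deterministic callable instead of random numbers (`read_value` and `values` are hypothetical names, used only for this illustration):

```python
values = [2, 1, 0, 7, 8]
source = iter(values)


def read_value():
    # Hypothetical deterministic stand-in for gen_randint.
    return next(source)


iterator = iter(read_value, 0)

first = next(iterator)     # read_value() returns 2
second = next(iterator)    # read_value() returns 1

try:
    next(iterator)         # read_value() returns 0, the sentinel
except StopIteration:
    print("iterator exhausted")

# Passing a default to next() avoids the exception. The iterator stays
# exhausted even though read_value() could still produce 7 and 8.
after = next(iterator, "nothing left")
print(first, second, after)
```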


What if the function we want to call needs arguments? We can get around this with either a lambda function or a `partial` object.

For example, suppose we have this function:

In [11]:
def gen_randint(min_, max_):
    return random.randint(min_, max_)

We want to use this function as the callable in `iter()` with the values `0` and `10`.

We can use a lambda to create a new function that is callable, and returns the value of calling `gen_randint` with the specific `min_` and `max_` values:

In [12]:
gen_lambda = lambda: gen_randint(0, 10)

We can call this function normally, and it does not require any arguments:

In [13]:
random.seed(0)

for _ in range(4):
    print(gen_lambda())

6
6
0
4


Another way to do this is to use `partial`, located in the `functools` module:

In [14]:
from functools import partial

In [15]:
gen_partial = partial(gen_randint, 0, 10)

This works the same way as the lambda:

In [16]:
random.seed(0)

for _ in range(4):
    print(gen_partial())

6
6
0
4


And now we can use either of these approaches to create our callable iterator using `iter()`:

In [17]:
random.seed(0)

for number in iter(lambda: gen_randint(0, 10), 5):
    print(number)

6
6
0
4
8
7
6
4
7


Or, using the partial:

In [18]:
random.seed(0)

for number in iter(partial(gen_randint, 0, 10), 5):
    print(number)

6
6
0
4
8
7
6
4
7


#### Application

One very common application of this way of using `iter()` is when reading data from some source in "chunks" of a certain size.

For example, this is an extremely common way of reading data from a socket - this is even shown as an example in the Python docs [here](https://docs.python.org/3/library/socket.html#example).

We're not going to get into sockets here, so instead let's see how we would read a text file in chunks. The same pattern applies to any problem where you are essentially running a loop, calling the same function each time, until the function returns a specific value: the sentinel value.
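To make that general pattern a bit more concrete before we get to files, here is a small sketch that drains a `queue.Queue` until a `None` marker shows up. The task names, and the choice of `None` as the sentinel, are just assumptions I made for this illustration:

```python
from queue import Queue

tasks = Queue()
for item in ("parse", "validate", "store"):
    tasks.put(item)
tasks.put(None)    # sentinel marking the end of the work

# tasks.get is called repeatedly until it returns None.
processed = [task.upper() for task in iter(tasks.get, None)]
print(processed)
```

The loop body never has to check for the end marker itself; `iter()` takes care of that.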

First I'm going to create a local text file:

In [19]:
with open("test.txt", "w") as f:
    for _ in range(10):
        f.write("0123456789")

We can read back the file line by line this way:

In [20]:
with open("test.txt") as f:
    for line in f:
        print(line)

0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789


But what if we wanted to read the file in chunks of `12` characters at a time?

We could do it this way:

In [21]:
with open("test.txt") as f:
    while True:
        chunk = f.read(12)
        if chunk == "":
            break
        print(chunk)

012345678901
234567890123
456789012345
678901234567
890123456789
012345678901
234567890123
456789012345
6789


You'll notice that our sentinel value is an empty string - this means there's nothing left to read from the file, and we therefore break out of our infinite while loop.

A much easier way to write this is using the `iter()` function:

In [22]:
with open("test.txt") as f:
    for chunk in iter(partial(f.read, 12), ""):
        print(chunk)

012345678901
234567890123
456789012345
678901234567
890123456789
012345678901
234567890123
456789012345
6789


Or, if you particularly object to using `partial`, you could do the same thing using a lambda:

In [23]:
with open("test.txt") as f:
    for chunk in iter(lambda: f.read(12), ""):
        print(chunk)

012345678901
234567890123
456789012345
678901234567
890123456789
012345678901
234567890123
456789012345
6789


So, compare the two approaches:

```python
with open("test.txt") as f:
    while True:
        chunk = f.read(12)
        if chunk == "":
            break
        print(chunk)
```

vs

```python
with open("test.txt") as f:
    for chunk in iter(partial(f.read, 12), ""):
        print(chunk)
```

or

```python
with open("test.txt") as f:
    for chunk in iter(lambda: f.read(12), ""):
        print(chunk)
```

My view is that the second (or third) option is far more Pythonic and expressive than the first option.
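One detail to watch for if you apply this to binary data: the sentinel has to match what the callable actually returns. A file opened in binary mode returns `b""` at the end, not `""`, and since `b"" == ""` is `False` in Python 3, the wrong sentinel would make the loop run forever. Here is a minimal sketch, using an in-memory `io.BytesIO` object as a stand-in for a file opened with `"rb"` (that substitution is my assumption, just to keep the example self-contained):

```python
from functools import partial
from io import BytesIO

# In-memory stand-in for a file opened in binary mode ("rb").
data = BytesIO(b"0123456789" * 3)

# read(12) returns b"" at the end of the stream, so b"" is our sentinel.
chunks = list(iter(partial(data.read, 12), b""))
print(chunks)
```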