### Idiomatic Python - Iterating Over Collections

First, we need to understand the difference between a sequence and an iterable. A sequence is an iterable whose elements can be accessed by index.

For example, a `list` is a sequence type:

In [1]:
l = [1, 2, 3, 4, 5]
l[0], l[-1]

(1, 5)

Other sequence types include tuples and strings.

But more general than sequence types are **iterables** - for example, sets are iterable but are not sequence types.
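
As a small sketch of that distinction (the variable names here are only illustrative), a set can be looped over, but trying to index it raises a `TypeError`:

```python
s = {10, 20, 30}

# A set is iterable: a for loop visits every element
# (in no guaranteed order, so we only accumulate a total here).
total = 0
for el in s:
    total += el

# A set is not a sequence: indexing is not supported.
try:
    s[0]
    indexable = True
except TypeError:
    indexable = False

print(total, indexable)  # 60 False
```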

In many programming languages, iterating over sequence types is primarily done using indexing.

For example:

In [2]:
l = [1, 2, 3, 4, 5, 6]

In [3]:
for i in range(len(l)):
    print(l[i] ** 2)

1
4
9
16
25
36


In order to use indexing, the collection must be a sequence type (i.e. must support element access via indexing).

Python, however, prefers the more general concept of iterating over iterables, whether or not the iterable is a sequence type.

So, in Python, indexing is rarely used to iterate over sequences (it definitely has its place, but there are usually more Pythonic alternatives).

The easiest way to iterate over an iterable (in general, not just sequence types) in Python is simply to use a `for` loop - no indexing needed:

In [4]:
for el in l:
    print(el ** 2)

1
4
9
16
25
36


Sometimes, however, we do need to know the index of the element as well, for example:

In [5]:
l = [1, 2, 3, 4, 5]
for i in range(len(l)):
    l[i] = (l[i] + 2) ** 2
    
l

[9, 16, 25, 36, 49]

Here we need the index because we want to modify the element at that index.

But there is a Pythonic way of doing this as well - using the `enumerate` function.

The `enumerate` function returns an iterator over the specified iterable (so it is not restricted to sequence types either). Each element it yields is a tuple containing a counter and the value. For a sequence, the counter matches the element's index; for other iterables, just think of it as a loop counter:

In [6]:
l = [1, 2, 3, 4, 5]
iterator = enumerate(l)
print(next(iterator))
print(next(iterator))

(0, 1)
(1, 2)


So we can use `enumerate` in a loop this way:

In [7]:
for row in enumerate(l):
    print(row)

(0, 1)
(1, 2)
(2, 3)
(3, 4)
(4, 5)


Of course, we can unpack the index and the value directly in the loop:

In [8]:
l = [1, 2, 3, 4, 5]
for index, value in enumerate(l):
    l[index] = (value + 2) ** 2
    
l

[9, 16, 25, 36, 49]
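
Two details may be worth a quick illustration: `enumerate` accepts any iterable, and its optional `start` argument changes the first counter value. The sketch below uses a generator expression (an assumption chosen for illustration, since it cannot be indexed):

```python
# A generator expression is iterable but is not a sequence (no indexing).
squares = (n ** 2 for n in range(1, 4))

# `start` only changes the first counter value; it does not skip elements.
numbered = list(enumerate(squares, start=1))

print(numbered)  # [(1, 1), (2, 4), (3, 9)]
```

Keep in mind that with a non-zero `start`, the counter no longer matches the index of a sequence element.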

Sometimes we may need to iterate over multiple iterables at the same time, and again you may be tempted to use indexing to do this (assuming, of course, they are all sequence types and hence support indexing):

In [9]:
l1 = [1, 2, 3, 4, 5]
l2 = ['a', 'b', 'c', 'd', 'e']

In [10]:
for i in range(len(l1)):
    print(l1[i], l2[i])

1 a
2 b
3 c
4 d
5 e


But what happens if the two lists are not the same length? We need to account for it:

In [11]:
l1 = [1, 2, 3, 4, 5]
l2 = ['a', 'b', 'c', 'd']

for i in range(min(len(l1), len(l2))):
    print(l1[i], l2[i])


1 a
2 b
3 c
4 d


The more Pythonic way of doing this is to use the `zip` function which returns an iterator that can be used to iterate over multiple iterables in "parallel":

In [12]:
l1 = [1, 2, 3, 4, 5]
l2 = list("abcde")

for el in zip(l1, l2):
    print(el)

(1, 'a')
(2, 'b')
(3, 'c')
(4, 'd')
(5, 'e')


Of course we can also just unpack the tuples returned by `zip` directly in the loop variables:

In [13]:
for v1, v2 in zip(l1, l2):
    print(v1, v2)

1 a
2 b
3 c
4 d
5 e


This extends to more than just two iterables:

In [14]:
l1 = [1, 2, 3, 4, 5]
l2 = "abcde"
l3 = (100, 200, 300, 400, 500)

for v1, v2, v3 in zip(l1, l2, l3):
    print(v1, v2, v3)

1 a 100
2 b 200
3 c 300
4 d 400
5 e 500


`zip` stops iterating as soon as the **shortest** iterable is exhausted, so we don't have to play with the `min` function as we did when using indexing:

In [15]:
l1 = [1, 2, 3, 4, 5]
l2 = "abc"

for v1, v2 in zip(l1, l2):
    print(v1, v2)

1 a
2 b
3 c


And `zip` is not restricted to sequence types, but will work with any iterable.
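
For instance, here is a minimal sketch that zips a generator with a dictionary; neither supports positional indexing, and the names and data are made up for illustration:

```python
# A generator and a dict are both iterable, but neither supports positional indexing.
squares = (n ** 2 for n in range(1, 4))

# Iterating over a dict yields its keys, in insertion order.
ages = {"alice": 30, "bob": 25, "carol": 41}

pairs = list(zip(squares, ages))
print(pairs)  # [(1, 'alice'), (4, 'bob'), (9, 'carol')]
```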

If you want to iterate based on the **longest** iterable instead, Python covers that case too - we just need to use the `zip_longest` function in the `itertools` module. The only extra consideration is the **fill** value used for the iterables that are shorter than the longest one.

In [16]:
from itertools import zip_longest

In [17]:
l1 = [1, 2, 3, 4, 5]
l2 = "abcd"
l3 = (100, 200, 300)

Just like `zip`, `zip_longest` returns an iterator, and we can simply iterate over it using a `for` loop (as well as other mechanisms such as using `next`, passing it to a function such as `list()`, etc).

In [18]:
for el in zip_longest(l1, l2, l3):
    print(el)

(1, 'a', 100)
(2, 'b', 200)
(3, 'c', 300)
(4, 'd', None)
(5, None, None)


As you can see the **default** fill value is `None`, but we can specify a custom one if needed:

In [19]:
for el in zip_longest(l1, l2, l3, fillvalue="N/A"):
    print(el)

(1, 'a', 100)
(2, 'b', 200)
(3, 'c', 300)
(4, 'd', 'N/A')
(5, 'N/A', 'N/A')


And of course we can unpack the tuples in the iteration directly into loop variables:

In [20]:
for v1, v2, v3 in zip_longest(l1, l2, l3, fillvalue="N/A"):
    print(v1, v2, v3)

1 a 100
2 b 200
3 c 300
4 d N/A
5 N/A N/A


#### Looping in Reverse Order

Many techniques you might have learned in other languages for iterating over a sequence in reverse order require reversing the indexes you use to loop:

In [21]:
l = [1, 2, 3, 4, 5]
for i in range(len(l) - 1, -1, -1):
    print(l[i])

5
4
3
2
1


Again, using indexing for looping over collections is not very Pythonic. Moreover, it does not extend to iterables in general.

For sequence types, a better way of doing this could be to use slicing:

In [22]:
for el in l[::-1]:
    print(el)

5
4
3
2
1


This works, but `l[::-1]` actually creates a new list that contains all the elements of the original list - needless memory overhead if all we want to do is loop.
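
A quick way to see that the slice really is a separate list is the sketch below (the name `rev` is just for illustration):

```python
l = [1, 2, 3, 4, 5]
rev = l[::-1]

# The slice is a separate list object holding its own references to all five elements.
print(rev is l)   # False
print(len(rev))   # 5

# Changing the copy leaves the original untouched.
rev[0] = 99
print(l)          # [1, 2, 3, 4, 5]
```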

Instead, we can use the `reversed` function which also returns an iterator and uses lazy evaluation:

In [23]:
l

[1, 2, 3, 4, 5]

In [24]:
reversed(l)

<list_reverseiterator at 0x105af9bd0>

As you can see it is an iterator, and does not compute (and store) the entire reversed list ahead of time.

In [25]:
for el in reversed(l):
    print(el)

5
4
3
2
1
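
To make the lazy behavior a little more concrete, the following sketch pulls values from the iterator one at a time with `next`, and shows that the iterator is exhausted once it has been consumed:

```python
l = [1, 2, 3, 4, 5]
it = reversed(l)

first = next(it)     # the last element, produced only when requested
rest = list(it)      # the remaining elements, still in reverse order
leftover = list(it)  # nothing left: the iterator is now exhausted

print(first)     # 5
print(rest)      # [4, 3, 2, 1]
print(leftover)  # []
```

If you need to loop in reverse more than once, simply call `reversed(l)` again to get a fresh iterator.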


And of course, it works for more than just sequence types: any object that implements the `__reversed__` method can be passed to `reversed`. Dictionaries (Python 3.8+) are one example, whereas sets are not reversible.
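
As a rough sketch of how that works, the hypothetical `Countdown` class below is not a sequence (it has no `__getitem__`), yet it supports `reversed` because it defines `__reversed__`. A set, by contrast, raises a `TypeError`:

```python
class Countdown:
    """Hypothetical iterable that is not a sequence (no __getitem__)."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Forward order: start, start - 1, ..., 1
        return iter(range(self.start, 0, -1))

    def __reversed__(self):
        # Reverse order: 1, 2, ..., start
        return iter(range(1, self.start + 1))


print(list(Countdown(3)))            # [3, 2, 1]
print(list(reversed(Countdown(3))))  # [1, 2, 3]

# A set implements neither __reversed__ nor the sequence protocol.
try:
    reversed({1, 2, 3})
    set_reversible = True
except TypeError:
    set_reversible = False

print(set_reversible)                # False
```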

#### Conclusion

So, as a general rule of thumb, using indexing for looping over iterables is usually not considered Pythonic - instead use regular `for` loops, and leverage functions such as `enumerate`, `zip`, `zip_longest`, and `reversed`.
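
These functions also compose. As one possible illustration (the data and names are invented), wrapping `zip` in `enumerate` gives a counter alongside the paired values, with no indexing at all:

```python
names = ["alice", "bob", "carol"]  # illustrative data
scores = [91, 78, 85]

# enumerate wraps zip: each item is (counter, (name, score)),
# so the inner tuple is unpacked with parentheses.
lines = []
for rank, (name, score) in enumerate(zip(names, scores), start=1):
    lines.append(f"{rank}. {name}: {score}")

print(lines)  # ['1. alice: 91', '2. bob: 78', '3. carol: 85']
```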