### Idiomatic Python - The `itertools` Module

A lot of the code we write has to do with iteration. As I showed you in a previous video, iterating over sequence types by index is generally (though not always) considered un-Pythonic.

The `itertools` module provides us with a wealth of additional functions to help perform operations involving iteration.

As Pythonic developers we need to get to know this module really well, and use it to our advantage.

I would urge you to read the docs for this module. They are well written and concise, and the module itself does not contain hundreds of functions, just a few really useful ones.

[https://docs.python.org/3/library/itertools.html](https://docs.python.org/3/library/itertools.html)

Let's explore a few of those functions.

#### The `map` and `starmap` Functions

The `map` function is used to apply a transformation to the elements of an iterable. 

In [1]:
l = [1, 2, 3, 4, 5]

def square(x):
    return x ** 2

We can apply the `square` function to each of the elements of the list `l` using `map` as follows:

In [2]:
results = map(square, l)

Now, `results` is not a list - it is an iterator. This means that the mapping is "lazy" and does not happen until we actually iterate over the elements of `results`:

In [3]:
for result in results:
    print(result)

1
4
9
16
25
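
One consequence of this laziness is worth seeing in action: the iterator returned by `map` can only be consumed once. Here is a minimal sketch that reuses `square` and `l` from above (the names `first_pass` and `second_pass` are just for illustration):

```python
def square(x):
    return x ** 2

l = [1, 2, 3, 4, 5]
results = map(square, l)

first_pass = list(results)   # consumes the iterator
second_pass = list(results)  # nothing is left to yield

print(first_pass)   # [1, 4, 9, 16, 25]
print(second_pass)  # []
```

If you need the values more than once, either call `map` again or materialize the results into a list.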


We could easily achieve a similar thing using a list comprehension, but that builds the entire list up front, which costs both memory and CPU time. If we want an iterator (with lazy evaluation) instead, we could use a generator expression:

In [4]:
results = (square(el) for el in l)

In [5]:
results

<generator object <genexpr> at 0x1183d4f90>

In [6]:
for result in results:
    print(result)

1
4
9
16
25


So yes, using a generator expression works just fine, and is Pythonic, but many people prefer using `map` for the simpler syntax:

In [7]:
results = (square(el) for el in l)
results = map(square, l)

Related to the `map` function is the `starmap` function, located in the `itertools` module.

In [8]:
from itertools import starmap

This is used when the function that you would use in a `map` requires more than one positional argument. For example, we might use a function such as the `pow` function in the `math` module:

In [9]:
from math import pow  # note: this shadows the built-in pow

In [10]:
pow(2, 3)

8.0

Now suppose we have this list of "inputs" for the `pow` function, and we want to apply the `pow` function to each of them.

In [11]:
l = [(2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5)]

Or more simply:

In [12]:
l = [(2, x) for x in range(6)]
l

[(2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5)]

We again could use a generator expression to do this:

In [13]:
results = (pow(x, y) for x, y in l)
list(results)

[1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

Or we could unpack each tuple explicitly:

In [14]:
results = (pow(*args) for args in l)
list(results)

[1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

But we could also use the `starmap` function that will unpack each tuple in our list as positional arguments for `pow`:

In [15]:
results = starmap(pow, l)
list(results)

[1.0, 2.0, 4.0, 8.0, 16.0, 32.0]

Again, compare the two approaches side by side:

In [16]:
results = (pow(x, y) for x, y in l)
results = starmap(pow, l)

There is nothing wrong with the generator approach, but the `starmap` syntax is cleaner and more expressive.
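
To make the relationship between the two functions concrete: `map` can also accept several iterables and takes one argument from each, whereas `starmap` expects the arguments to be grouped into tuples already. This is a small sketch, and the names `bases` and `exponents` are just for illustration:

```python
from itertools import starmap
from math import pow

bases = [2, 2, 2]
exponents = [1, 2, 3]

# map pulls one argument from each iterable in parallel
via_map = list(map(pow, bases, exponents))

# starmap expects the arguments already paired up, for example by zip
via_starmap = list(starmap(pow, zip(bases, exponents)))

print(via_map)      # [2.0, 4.0, 8.0]
print(via_starmap)  # [2.0, 4.0, 8.0]
```

So if your data arrives as separate "columns", plain `map` is enough. If it arrives as "rows" of arguments, like our list of tuples, `starmap` is the natural fit.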

#### The `chain` Function

The `chain` function is very useful for iterating across multiple iterables without having to create a temporary union of all the iterables.

For example, suppose we have these three iterables:

In [17]:
l1 = [1, 2, 3, 4, 5]
l2 = "abcd"
l3 = (100, 200, 300)

We want to iterate over the elements of `l1`, `l2`, and `l3`, in that order.

We could do it by joining all the iterables together, but given the different iterable types, it takes a bit of work to get it right. We could do it this way:

In [18]:
combined = l1 + list(l2) + list(l3)

And then we can iterate over the combined list:

In [19]:
for el in combined:
    print(el, end=" ")

1 2 3 4 5 a b c d 100 200 300 

But this has two drawbacks: we had to duplicate data (so more memory), and we had to write the code to combine the iterables.

It is much simpler to use `chain`. Again, it is a "lazy" iterator, so the data is not copied, and calculations are delayed until an element is actually requested. If you do not iterate over all the elements of the combined iterable, you have saved some calculation time compared to the first approach.

In [20]:
from itertools import chain

for el in chain(l1, l2, l3):
    print(el, end=" ")

1 2 3 4 5 a b c d 100 200 300 
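
We can see the laziness of `chain` by chaining generators that record every value they produce. This is only a sketch: `numbers` is a hypothetical helper written for this demonstration, and `produced` is just a list used as a log.

```python
from itertools import chain

produced = []  # records which values have actually been computed

def numbers(start, stop):
    # hypothetical helper: a generator that logs each value it yields
    for n in range(start, stop):
        produced.append(n)
        yield n

combined = chain(numbers(0, 3), numbers(100, 103))
print(produced)  # [] - nothing has been computed yet

# request only the first four elements
first_four = [next(combined) for _ in range(4)]
print(first_four)  # [0, 1, 2, 100]
print(produced)    # [0, 1, 2, 100]
```

Only the values we actually asked for were ever generated. The remaining elements of the second generator are never computed unless we keep iterating.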

The `chain.from_iterable` function is a slight variant that you can use if, instead of having separate iterables like we had with `l1`, `l2` and `l3`, we have a single iterable that contains iterables, and we want to loop over the elements of each of the iterable sub-elements.

Suppose we had this list:

In [21]:
l = [[1, 2, 3], "abc", (10, 20, 30)]

Using `chain` would not give us quite what we are looking for:

In [22]:
for el in chain(l):
    print(el, end=" ")

[1, 2, 3] abc (10, 20, 30) 

Instead, we can use `chain.from_iterable`:

In [23]:
for el in chain.from_iterable(l):
    print(el, end=" ")

1 2 3 a b c 10 20 30 
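
You might be wondering whether we could simply unpack the list into `chain` instead. For a small list like this one, we can, as this short sketch shows (the names `unpacked` and `flattened` are just for illustration):

```python
from itertools import chain

l = [[1, 2, 3], "abc", (10, 20, 30)]

# unpacking works, but the outer iterable is fully unpacked up front
unpacked = list(chain(*l))

# from_iterable walks the outer iterable lazily, one sub-iterable at a time
flattened = list(chain.from_iterable(l))

print(flattened)              # [1, 2, 3, 'a', 'b', 'c', 10, 20, 30]
print(unpacked == flattened)  # True
```

The difference matters when the outer iterable is itself lazy or very large: `chain(*l)` has to consume all of it just to make the call, whereas `chain.from_iterable` does not.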

#### Using `islice`

As you know, slicing is reserved for sequence types, such as lists, tuples, strings, etc.

More general iterables, such as those returned by `zip` and `map`, custom generators, or generator expressions, do not support slicing, since they do not support indexing.

In [24]:
data = (el * 2 for el in range(5))

In [25]:
try:
    data[0]
except TypeError as ex:
    print(ex)

'generator' object is not subscriptable


The `islice` function is extremely useful if we need to slice an iterator:

In [26]:
from itertools import islice

for el in islice((el * 2 for el in range(10)), 3):
    print(el, end=" ")

0 2 4 

It even supports start, stop, and step, just like regular slicing. However, it does not support negative values for any of them.

In [27]:
for el in islice((el * 2 for el in range(10)), 1, None, 2):
    print(el, end=" ")

2 6 10 14 18 

In [28]:
for el in islice((el * 2 for el in range(10)), 1, 5, 2):
    print(el, end=" ")

2 6 
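
Two behaviors of `islice` are worth keeping in mind: it advances the underlying iterator as it goes, and it rejects negative arguments outright. Here is a minimal sketch (the names `source`, `head`, `next_value` and `negative_allowed` are just for illustration):

```python
from itertools import islice

source = iter(range(10))

head = list(islice(source, 3))  # takes 0, 1, 2 from the iterator
next_value = next(source)       # the source has moved on

print(head)        # [0, 1, 2]
print(next_value)  # 3

try:
    islice(range(10), -1)  # negative stop is not allowed
    negative_allowed = True
except ValueError:
    negative_allowed = False

print(negative_allowed)  # False
```

So unlike slicing a list, slicing an iterator is not repeatable: the elements that `islice` consumed are gone from `source`.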

You can even slice sets, although, since sets are not ordered in any particular way, there is probably little reason to do so.

In [29]:
s = {'a', 'b', 10, 3.2}

for el in islice(s, 0, 2):
    print(el)

b
10


Slicing the keys of a dictionary might be more meaningful, however, since insertion order is maintained. For example, we might want to determine the first key that was inserted in the dictionary:

In [30]:
d = {
    "b": 100,
    "a": 200
}

We could first convert the dictionary's keys view to a list:

In [31]:
keys = list(d.keys())
keys[0]

'b'

But this again uses more memory than we need (we only want to know about the first item), so `islice` might be a better option:

In [32]:
list(islice(d.keys(), 0, 1))

['b']
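
If all we want is the very first key, another option that avoids building a list is the built-in `next` function applied to an iterator over the dictionary. This is just a sketch of that alternative, and the variable names are for illustration only:

```python
d = {"b": 100, "a": 200}

first_key = next(iter(d))           # iterating a dict yields its keys
first_item = next(iter(d.items()))  # first (key, value) pair

# the second argument to next is a default for empty iterables
first_of_empty = next(iter({}), None)

print(first_key)       # b
print(first_item)      # ('b', 100)
print(first_of_empty)  # None
```

Use `islice` when you want the first few keys, and `next` when you only need the first one.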

If you like this style of programming with the `itertools` module, you should also consider looking at the `more-itertools` library that provides even more functions for iterables:

[https://more-itertools.readthedocs.io/en/stable/](https://more-itertools.readthedocs.io/en/stable/)

But at the very least, get to know the `itertools` module. 