# Python's Secret Weapon: itertools
`itertools` is an under-appreciated gem that deserves a spot in every Python programmer's toolkit. As one Pythonista put it, it's ["pretty much the coolest thing ever"](https://jmduke.com/2013/11/29/itertools)!


## What is itertools really?

According to [itertools docs](https://docs.python.org/3/library/itertools.html), it standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an **“iterator algebra”** making it possible to construct specialized tools succinctly and efficiently in pure Python.

To understand itertools, you need to understand iterators. Simply put, an iterator is an object that lets you traverse a collection of data, such as a list, tuple, dictionary, or set, returning one element at a time. Because elements are produced on demand instead of being stored all at once, iterators are especially useful when handling large datasets.

To understand iterators, we recommend [this article](https://diveintopython3.net/iterators.html).
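
As a quick illustration, here is a minimal sketch of the iterator protocol using the built-ins `iter()` and `next()`. The list `numbers` is just made-up sample data.

```python
numbers = [1, 2, 3]   # hypothetical sample data
it = iter(numbers)    # ask the list for an iterator

print(next(it))  # 1
print(next(it))  # 2
print(next(it))  # 3

# Once the iterator is exhausted, next() raises StopIteration.
# Passing a default value avoids the exception.
last = next(it, 'done')
print(last)  # done
```

Every function in this tutorial either consumes or returns an object that behaves like `it`.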


Fluency with itertools allows you to write beautiful code that does more with less. By composing these functions, you can tackle problems in very few lines. Itertools are the secret weapon of Pythonistas in-the-know.

In this tutorial, we'll explore essential itertools through examples. Let's level up your Python skills and add some itertools magic!

## Generate consecutive data points

Let's say you are forecasting profit for your business, and the forecast assumes that costs grow at a linear rate, say 0.05 dollars every year, starting from 10. You might want to create a list of theoretical costs to pass to your overall simulation function.

You can do this with a list comprehension:

In [2]:
costs = [10+0.05*i for i in range(100)]
# costs = [10, 10.05, 10.1, 10.15 ...]

Let's see how it's done with `itertools.count()`. 
Basically, `count` takes a start and a step, `count(start=0, step=1)`, and gives the infinite sequence `start, start + step, start + 2 * step, ...`


In [6]:
import itertools
costs = itertools.count(10, 0.05) # an infinite iterator: consume it with next(), a loop that breaks, or islice()
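
Since `count()` never stops on its own, whatever consumes it needs an exit condition. Here is a small sketch of two ways to do that. The helper `years_until` and its threshold are hypothetical, made up for this illustration.

```python
import itertools

def years_until(threshold, start=10, step=0.05):
    """Return the first year index at which the cost reaches `threshold`."""
    for year, cost in enumerate(itertools.count(start, step)):
        if cost >= threshold:  # without this exit the loop would never end
            return year

costs = itertools.count(10, 0.05)
first = next(costs)   # pull values one at a time
second = next(costs)

print(first, second)
print(years_until(10.42))  # 9
```

Note that `years_until` assumes a positive step; with a step of zero or less the threshold might never be reached.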

In some other cases, you want to create a sequence of repeating patterns.

The `cycle()` function returns an iterator that repeats the contents of the iterable it is given indefinitely.

In [48]:
# a palette of 3 colors
palette = itertools.cycle(['#377eb8', '#ff7f00', '#4daf4a'])

for i, color in zip(range(7), palette):
    print(i, color)

0 #377eb8
1 #ff7f00
2 #4daf4a
3 #377eb8
4 #ff7f00
5 #4daf4a
6 #377eb8
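
The same trick works for any round-robin assignment. A small sketch, assuming a made-up list of chart series that each need a color:

```python
import itertools

series = ['revenue', 'costs', 'profit', 'tax', 'headcount']  # hypothetical names
palette = itertools.cycle(['#377eb8', '#ff7f00', '#4daf4a'])

# zip stops at the end of `series`, so the infinite palette is safe here
color_for = dict(zip(series, palette))
print(color_for['tax'])  # #377eb8 (the palette wrapped around)
```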


The `repeat()` function returns an iterator that produces the same value each time it is accessed, either forever or, if a second argument is given, that many times.


In [49]:
list(itertools.repeat(None, 5))

[None, None, None, None, None]
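
A common use of `repeat()` is to feed a constant argument to `map()` or `zip()`. Here is a minimal sketch of that pattern:

```python
import itertools

# repeat(2) supplies the exponent for every call to pow;
# map stops when range(5) runs out, so the endless repeat is harmless
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)  # [0, 1, 4, 9, 16]
```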

## Filter data points
The `islice()` function returns an iterator which returns selected items from the input iterator, by index. Let's say we want to filter `costs` in our previous example to the first 10 elements.



In [50]:
cost_next_10years = itertools.islice(costs, 10)
list(cost_next_10years)

[10,
 10.05,
 10.100000000000001,
 10.150000000000002,
 10.200000000000003,
 10.250000000000004,
 10.300000000000004,
 10.350000000000005,
 10.400000000000006,
 10.450000000000006]
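
The trailing digits in the output above come from `count()` adding the float step over and over. The itertools docs suggest multiplicative code when better accuracy is needed. Here is a sketch of that idea, which also shows the `start, stop, step` form of `islice()`. The names `costs_by_index` and `every_fifth_year` are ours.

```python
import itertools

# Compute each value from its index rather than adding the step repeatedly,
# so rounding error does not accumulate from one element to the next.
costs_by_index = (10 + 0.05 * i for i in itertools.count())

# islice(iterable, start, stop, step): picks indices 0, 5, 10 and 15
every_fifth_year = list(itertools.islice(costs_by_index, 0, 20, 5))
print(every_fifth_year)
```

Keep in mind that `islice()` consumes the underlying iterator, so slicing the same iterator again continues from where the previous slice stopped.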

The `dropwhile()` function skips elements of the input iterator for as long as a condition holds. Once the condition becomes false _for the first time_, it returns that element and every element after it, without checking the condition again.


In [51]:
# dropwhile skips elements while they equal zero, then returns everything from the first non-zero value onward
list(itertools.dropwhile(lambda i: i == 0, [0,0,1,4,0]))

[1, 4, 0]

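Here is another example of leveraging `dropwhile()`, this time to clean a version number. Two things happen here: `int()` strips the leading zeros inside each component (`10` -> `10` and `00901` -> `901`), while `dropwhile()` drops leading components that are equal to zero and keeps any zeros that come after the first non-zero component. The version below has no leading zero components, so `dropwhile()` leaves the list unchanged.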

In [52]:
def ver_list(version):
    # int() turns '00901' into 901
    version = [int(i) for i in version.split(".")]
    # drop leading components that are zero; later zeros are kept
    return list(itertools.dropwhile(lambda i: i == 0, version))

version = '3.10.00901'
ver_list(version)

[3, 10, 901]
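
To see `dropwhile()` actually drop something here, try a version string that starts with zero components. The helper is repeated so the snippet runs on its own, and the input `'0.0.7.0'` is a made-up example.

```python
import itertools

def ver_list(version):
    # same helper as above, repeated so this snippet is self-contained
    parts = [int(p) for p in version.split(".")]
    return list(itertools.dropwhile(lambda p: p == 0, parts))

print(ver_list('0.0.7.0'))     # [7, 0]  leading zeros dropped, trailing zero kept
print(ver_list('3.10.00901'))  # [3, 10, 901]
```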

The opposite is `takewhile()`. It returns elements from the iterable _as long as_ the predicate is true, and stops at the first element for which it is false.



In [53]:
print(list(itertools.takewhile(lambda i: i < 2, [-1, 0, 3, 4, 1])))
print(list(itertools.takewhile(lambda i: i < 2, [4, 1, -1, 0])))

[-1, 0]
[]
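
`takewhile()` is also a handy way to put a bound on an infinite iterator. Here is a small sketch using squares, which also shows one gotcha: the element that fails the test is consumed as well.

```python
import itertools

squares = (n * n for n in itertools.count(1))  # 1, 4, 9, ... forever
small_squares = list(itertools.takewhile(lambda s: s < 50, squares))
print(small_squares)  # [1, 4, 9, 16, 25, 36, 49]

# takewhile had to read 64 to find out it should stop, so 64 is gone
after = next(squares)
print(after)  # 81
```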


`compress()` offers another way to filter the contents of an iterable. Instead of calling a function, it uses the values in another iterable to indicate when to accept a value and when to ignore it.


In [54]:
items = [1, 2, 3, 4, 5]
flags = [False, True, True, False, True]
list(itertools.compress(items, flags))

[2, 3, 5]
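
The flags don't have to be a hand-written list. Here is a sketch where they are computed from a second, parallel list (the names and scores are made up):

```python
import itertools

names = ['ann', 'bob', 'cy']  # hypothetical data
scores = [72, 55, 90]

# the selector can be any iterable of truthy/falsy values, here a generator
passed = list(itertools.compress(names, (s >= 60 for s in scores)))
print(passed)  # ['ann', 'cy']
```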


## Merging data points

The `chain()` function takes several iterables as arguments and returns a single iterator that produces the contents of all of the inputs as though they came from a single iterator.


In [55]:
for i in itertools.chain([1, 2, 3], ['a', 'b', 'c']):
    print(i, end=' ')

1 2 3 a b c 
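
When the inputs are already gathered in one container, `chain.from_iterable()` does the same job without unpacking. A minimal sketch that flattens a made-up nested list:

```python
import itertools

batches = [[1, 2], [3], [4, 5]]  # hypothetical nested data
flat = list(itertools.chain.from_iterable(batches))
print(flat)  # [1, 2, 3, 4, 5]
```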

You might have used the built-in `zip()` to combine the elements of several iterators into tuples.

In [56]:
for t in zip([1, 2, 3], ['a', 'b', 'c']):
    print(t)

(1, 'a')
(2, 'b')
(3, 'c')


`zip()` stops as soon as the shortest input iterator is exhausted. To process all of the inputs, even if the iterators produce different numbers of values, use `zip_longest()`.

In [57]:
r1 = range(10)
r2 = range(5)

print('zip stops early:')
print(list(zip(r1, r2)))

print('\nzip_longest processes all of the values:')
print(list(itertools.zip_longest(r1, r2)))

zip stops early:
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]

zip_longest processes all of the values:
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, None), (6, None), (7, None), (8, None), (9, None)]
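
If `None` is not a convenient placeholder, `zip_longest()` accepts a `fillvalue` keyword. A quick sketch:

```python
import itertools

pairs = list(itertools.zip_longest('ab', [1, 2, 3], fillvalue='-'))
print(pairs)  # [('a', 1), ('b', 2), ('-', 3)]
```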


## Combining in various ways

The `accumulate()` function gives you accumulated sums, or the accumulated results of another binary function (specified via the optional `func` argument).



In [26]:
sales = [300, 100, 200, 500, 400]
cum_sales = list(itertools.accumulate(sales))
cum_sales

[300, 400, 600, 1100, 1500]
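
Here is a sketch of the optional `func` argument in action, giving a running maximum and a running product. The `growth` list is made up for the example.

```python
import itertools
import operator

sales = [300, 100, 200, 500, 400]
best_so_far = list(itertools.accumulate(sales, max))
print(best_so_far)  # [300, 300, 300, 500, 500]

growth = [2, 3, 4]  # hypothetical multipliers
running_product = list(itertools.accumulate(growth, operator.mul))
print(running_product)  # [2, 6, 24]
```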

`product` gives the Cartesian product of input iterables.
It is roughly equivalent to nested for-loops in a generator expression. For example, `product(A, B)` returns the same as `((x, y) for x in A for y in B)`.

In [36]:
list(itertools.product(['hello', 'bye'], ['Jason', 'Julia']))

[('hello', 'Jason'), ('hello', 'Julia'), ('bye', 'Jason'), ('bye', 'Julia')]
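
To take the product of an iterable with itself, pass the `repeat` keyword instead of writing the same argument several times. A minimal sketch that lists all 2-bit strings:

```python
import itertools

bits = [''.join(p) for p in itertools.product('01', repeat=2)]
print(bits)  # ['00', '01', '10', '11']
```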

Let's use it to produce a deck of cards:

In [30]:
FACE_CARDS = ('J', 'Q', 'K', 'A')
SUITS = ('♠', '♥', '♦', '♣')

# first use chain to get the rank
card_ranks = itertools.chain(range(2, 11), FACE_CARDS)

DECK = list(itertools.product(card_ranks, SUITS))
print(DECK)

[(2, '♠'), (2, '♥'), (2, '♦'), (2, '♣'), (3, '♠'), (3, '♥'), (3, '♦'), (3, '♣'), (4, '♠'), (4, '♥'), (4, '♦'), (4, '♣'), (5, '♠'), (5, '♥'), (5, '♦'), (5, '♣'), (6, '♠'), (6, '♥'), (6, '♦'), (6, '♣'), (7, '♠'), (7, '♥'), (7, '♦'), (7, '♣'), (8, '♠'), (8, '♥'), (8, '♦'), (8, '♣'), (9, '♠'), (9, '♥'), (9, '♦'), (9, '♣'), (10, '♠'), (10, '♥'), (10, '♦'), (10, '♣'), ('J', '♠'), ('J', '♥'), ('J', '♦'), ('J', '♣'), ('Q', '♠'), ('Q', '♥'), ('Q', '♦'), ('Q', '♣'), ('K', '♠'), ('K', '♥'), ('K', '♦'), ('K', '♣'), ('A', '♠'), ('A', '♥'), ('A', '♦'), ('A', '♣')]
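
One thing to watch out for: an iterator such as the one returned by `chain()` can only be consumed once. Here is a small sketch of what happens if it is reused, with a shortened rank list to keep the output small:

```python
import itertools

ranks = itertools.chain(range(2, 5), ('J',))   # a one-shot iterator
first = list(itertools.product(ranks, '♠♥'))
second = list(itertools.product(ranks, '♠♥'))  # ranks is already used up

print(len(first))  # 8
print(second)      # []
```

If the ranks are needed more than once, storing them in a list or tuple first avoids the surprise.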


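Use `itertools.permutations()` to get all possible orderings of elements from an iterable (order matters) and `itertools.combinations()` to get all possible selections of elements (order does not matter). Both take an optional or required length `r` as the second argument.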


In [46]:
list(itertools.permutations('abcd', 3))

[('a', 'b', 'c'),
 ('a', 'b', 'd'),
 ('a', 'c', 'b'),
 ('a', 'c', 'd'),
 ('a', 'd', 'b'),
 ('a', 'd', 'c'),
 ('b', 'a', 'c'),
 ('b', 'a', 'd'),
 ('b', 'c', 'a'),
 ('b', 'c', 'd'),
 ('b', 'd', 'a'),
 ('b', 'd', 'c'),
 ('c', 'a', 'b'),
 ('c', 'a', 'd'),
 ('c', 'b', 'a'),
 ('c', 'b', 'd'),
 ('c', 'd', 'a'),
 ('c', 'd', 'b'),
 ('d', 'a', 'b'),
 ('d', 'a', 'c'),
 ('d', 'b', 'a'),
 ('d', 'b', 'c'),
 ('d', 'c', 'a'),
 ('d', 'c', 'b')]

In [45]:
list(itertools.combinations('abcd', 3))

[('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'c', 'd'), ('b', 'c', 'd')]
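
As a sanity check, the number of results matches the usual counting formulas. This sketch assumes Python 3.8 or later, where `math.perm()` and `math.comb()` are available.

```python
import itertools
import math

n_perms = len(list(itertools.permutations('abcd', 3)))
n_combs = len(list(itertools.combinations('abcd', 3)))

print(n_perms, math.perm(4, 3))  # 24 24
print(n_combs, math.comb(4, 3))  # 4 4
```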

Let's look at a slightly more complex example of using `permutations` and `product` together.

In [39]:
year_tokens = ['%Y']
month_tokens = ['%b', '%B']
day_tokens = ['%d']
# use product so that both types of month tokens are accepted
# this generates [('%Y', '%b', '%d'), ('%Y', '%B', '%d')]
prods = itertools.product(year_tokens, month_tokens, day_tokens)

# use permutations to allow every ordering of year, month, and day
perms = [y for x in prods for y in itertools.permutations(x)]
perms


[('%Y', '%b', '%d'),
 ('%Y', '%d', '%b'),
 ('%b', '%Y', '%d'),
 ('%b', '%d', '%Y'),
 ('%d', '%Y', '%b'),
 ('%d', '%b', '%Y'),
 ('%Y', '%B', '%d'),
 ('%Y', '%d', '%B'),
 ('%B', '%Y', '%d'),
 ('%B', '%d', '%Y'),
 ('%d', '%Y', '%B'),
 ('%d', '%B', '%Y')]
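
One possible use for these token tuples is a forgiving date parser that tries each format in turn. The sketch below is only an illustration: the helper `parse_date` is hypothetical, it assumes the parts are separated by spaces, and it assumes English month names (the default locale).

```python
import itertools
from datetime import datetime

def parse_date(text, sep=' '):
    """Try every year/month/day ordering; return a datetime or None."""
    prods = itertools.product(['%Y'], ['%b', '%B'], ['%d'])
    for tokens in prods:
        for perm in itertools.permutations(tokens):
            fmt = sep.join(perm)  # e.g. '%d %b %Y'
            try:
                return datetime.strptime(text, fmt)
            except ValueError:
                continue  # this format did not match, try the next one
    return None

print(parse_date('15 Mar 2021'))    # 2021-03-15 00:00:00
print(parse_date('March 15 2021'))  # 2021-03-15 00:00:00
print(parse_date('nonsense'))       # None
```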

Itertools provide the building blocks to efficiently tackle programming challenges. Add these tools to your Python toolkit and get creative - with the power of "iterator algebra" at your fingertips, you can combine functions in novel ways to write more Pythonic code!

Credit:
- https://pymotw.com/3/itertools/index.html
- https://jmduke.com/2013/11/29/itertools
- https://docs.python.org/3/library/itertools.html