# Itertools module

The standard-library module `itertools` provides ready-made iterators for common traversal, combination and filtering patterns.

### Infinite iterators

- `itertools.count(n)` iterates on all integers starting from n.
- `itertools.cycle(L)` iterates on all items of the iterable L and starts over from the first item when it reaches the end.
- `itertools.repeat(k, [times])` repeats the same value k, endlessly by default or exactly `times` times when that argument is given.


In [102]:
import itertools

# first multiple of 333 greater than or equal to 1000
for i in itertools.count(1000):
    if i % 333 == 0:
        print(i, '\n')
        break

# 10 first elements of the cycle
count = 0
for i in itertools.cycle([1, 2, 3]):
    count += 1
    print(i, end=' ')
    if count == 10:
        break
print('\n')

# 10 iterations of the repetition
count = 0
for i in itertools.repeat(5):
    count += 1
    print(i, end=' ')
    if count == 10:
        break
print('\n')

# repeat can be used to give an argument to a function using map
squares = map(pow, range(10), itertools.repeat(2))
print(list(squares))

1332 

1 2 3 1 2 3 1 2 3 1 

5 5 5 5 5 5 5 5 5 5 

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
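
The counter-and-break loops above can usually be avoided by bounding the infinite iterator instead. The sketch below is one way to do it, using `itertools.islice` and the optional second argument of `repeat`; the variable names are illustrative only.

```python
import itertools

# Take the first 5 values of an endless counter (start=10, step=5).
first_five = list(itertools.islice(itertools.count(10, 5), 5))
print(first_five)    # [10, 15, 20, 25, 30]

# repeat() stops on its own when a number of repetitions is given.
three_fives = list(itertools.repeat(5, 3))
print(three_fives)   # [5, 5, 5]

# The 10 first elements of a cycle, without a manual counter.
cycled = list(itertools.islice(itertools.cycle([1, 2, 3]), 10))
print(cycled)        # [1, 2, 3, 1, 2, 3, 1, 2, 3, 1]
```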


### Combinatoric iterators

- `itertools.product(A, B)` iterates on all (a, b) pairs with a in A and b in B
- `itertools.permutations(A, k)` iterates on all permutations of k elements in A (`xy` and `yx` are different permutations)
- `itertools.combinations(A, k)` iterates on all combinations of k elements in A (`xy` and `yx` are the same combination)
- `itertools.combinations_with_replacement(A, k)` iterates on all combinations of k elements in A (an element can appear multiple times)



In [92]:
# All pairs (x, y) with x in 'ABC' and y in [1, 2]
for p in itertools.product('ABC', [1, 2]):
    print(p, end=' ')
print('\n')

# All triples (x, y, z) with x, y and z in [1, 2]
for p in itertools.product([1, 2], repeat=3):
    print(p, end=' ')
print('\n')

# All permutations of 2 letters of ABC (the order does matter)
for p in itertools.permutations('ABC', 2):
    print(p, end=' ')
print('\n')

# All combinations of 2 letters of ABC (the order does not matter)
for p in itertools.combinations('ABC', 2):
    print(p, end=' ')
print('\n')

# All combinations of 2 letters of ABC (the order does not matter but repetitions are allowed)
for p in itertools.combinations_with_replacement('ABC', 2):
    print(p, end=' ')
print('\n')

('A', 1) ('A', 2) ('B', 1) ('B', 2) ('C', 1) ('C', 2) 

(1, 1, 1) (1, 1, 2) (1, 2, 1) (1, 2, 2) (2, 1, 1) (2, 1, 2) (2, 2, 1) (2, 2, 2) 

('A', 'B') ('A', 'C') ('B', 'A') ('B', 'C') ('C', 'A') ('C', 'B') 

('A', 'B') ('A', 'C') ('B', 'C') 

('A', 'A') ('A', 'B') ('A', 'C') ('B', 'B') ('B', 'C') ('C', 'C') 
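
As a sanity check, the number of tuples produced by each combinatoric iterator can be compared with the usual counting formulas. This is a small sketch that assumes Python 3.8 or later (for `math.perm` and `math.comb`); the variable names are illustrative.

```python
import itertools
import math

letters = 'ABC'
n = len(letters)   # size of the input
k = 2              # size of each generated tuple

n_product = len(list(itertools.product(letters, repeat=k)))
n_permutations = len(list(itertools.permutations(letters, k)))
n_combinations = len(list(itertools.combinations(letters, k)))
n_with_replacement = len(list(itertools.combinations_with_replacement(letters, k)))

print(n_product, n ** k)                              # 9 9
print(n_permutations, math.perm(n, k))                # 6 6
print(n_combinations, math.comb(n, k))                # 3 3
print(n_with_replacement, math.comb(n + k - 1, k))    # 6 6
```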



### Filtering iterators

Some iterators filter the items of an existing iterator:
- `itertools.islice(it, [start_idx], end_idx, [step])` iterates on a slice of an iterator, from `start_idx` (included) to `end_idx` (excluded).
- `itertools.filterfalse(fn, it)` is similar to the builtin `filter`, but iterates on the elements for which `fn` returns a false value.
- `itertools.compress(it, selectors)` is similar to the builtin `filter`, but takes an iterable of True/False selectors instead of a function: it iterates only on the elements whose selector is true.
- `itertools.dropwhile(fn, it)` skips items as long as `fn` returns true, then iterates on all remaining items, starting with the first one for which `fn` returned false.
- `itertools.takewhile(fn, it)` iterates on items as long as `fn` returns true and stops at the first item for which it returns false.

In [93]:
# iterate on a slice of an iterator (every second item from index 10 up to, but not including, index 20)
res = itertools.islice(range(100), 10, 20, 2)
print(list(res))

[10, 12, 14, 16, 18]


In [94]:
is_multiple_of_5 = lambda x: x % 5 == 0 
is_not_multiple_of_5 = lambda x: not is_multiple_of_5(x)

# iterate only on multiples of 5 using a filtering function
filtered = filter(is_multiple_of_5, range(1, 21))
print(list(filtered))

# iterate only on non multiples of 5 using a filtering function
filtered_false = itertools.filterfalse(is_multiple_of_5, range(1, 21))
print(list(filtered_false))

# iterate only on multiples of 5 using an iterator of bool
compressed = itertools.compress(range(1, 21), itertools.cycle([False, False, False, False, True]))
print(list(compressed))

# skip items until one is a multiple of 5, then iterate on the rest of the items
res = itertools.dropwhile(is_not_multiple_of_5, range(1, 21))
print(list(res))

# iterate on items until one is a multiple of 5
res = itertools.takewhile(is_not_multiple_of_5, range(1, 21))
print(list(res))


[5, 10, 15, 20]
[1, 2, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 16, 17, 18, 19]
[5, 10, 15, 20]
[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
[1, 2, 3, 4]
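
One detail worth knowing about `takewhile`: when it is fed a one-shot iterator, the first item that fails the test has already been read from that iterator and cannot be recovered. The sketch below illustrates this with a made-up list of numbers and threshold.

```python
import itertools

numbers = iter([1, 2, 3, 10, 4, 5])

# takewhile stops at 10, but it had to read 10 to find out.
head = list(itertools.takewhile(lambda x: x < 5, numbers))
# Whatever is left in the underlying iterator starts after 10.
rest = list(numbers)

print(head)   # [1, 2, 3]
print(rest)   # [4, 5]
```

If the rejected item matters, `dropwhile` on a fresh copy of the data (or a list instead of an iterator) is a simple way around it.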


### Other iterators

- `itertools.zip_longest(it1, it2)` is similar to the builtin `zip`, but does not stop when the shortest iterable is finished.
- `itertools.starmap(fn, tuple_list)` is similar to the builtin `map`, but it takes the parameters as an iterable of tuples (instead of several iterables).
- `itertools.chain(it1, it2, ...)` iterates successively on the items of each iterable.
- `itertools.accumulate(it)` iterates on the aggregation of items in the iterator (sum by default)
- `itertools.groupby(it, key_fn)` groups consecutive items that share the same key-function result and yields one (key, iterator) pair per group.  
  __Note:__ To get a single group per distinct key, the input must be sorted by key first.
- `itertools.tee(it)` splits an iterator into 2 independent iterators over the same items (by analogy with a T pipe).  
  __Note:__ The original iterator should no longer be used after the call, since advancing it would make the copies miss items.

In [95]:
# Zip the 2 iterables and pad the shorter one with 0
A = range(10)
B = [x**2 for x in range(5)]
print(list(itertools.zip_longest(A, B, fillvalue=0)))

[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0)]
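
For comparison, here is a short sketch of the builtin `zip` next to `zip_longest` without a `fillvalue`, in which case missing values are filled with `None`. The two input lists are made up for the illustration.

```python
import itertools

names = ['a', 'b', 'c']
values = [1, 2]

# zip stops as soon as the shortest iterable is exhausted.
zipped = list(zip(names, values))
# zip_longest keeps going and pads with None by default.
zipped_longest = list(itertools.zip_longest(names, values))

print(zipped)           # [('a', 1), ('b', 2)]
print(zipped_longest)   # [('a', 1), ('b', 2), ('c', None)]
```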


In [96]:
# starmap used to provide parameters grouped as a tuple
squares = map(pow, range(5), itertools.repeat(2))   # parameters are provided in 2 iterables
squares = itertools.starmap(pow, [(0, 2), (1, 2), (2, 2), (3, 2), (4, 2)]) # parameters in a single iterable
print(list(squares))

[0, 1, 4, 9, 16]
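
One way to think about `starmap` is that it unpacks each tuple into the function call. The sketch below compares it with an explicit unpacking and with a plain `map` over the transposed tuples; `pairs` is an illustrative name.

```python
import itertools

pairs = [(0, 2), (1, 2), (2, 2), (3, 2), (4, 2)]

# starmap calls pow(*pair) for each pair.
with_starmap = list(itertools.starmap(pow, pairs))
# The same thing written with explicit argument unpacking.
with_unpacking = [pow(*args) for args in pairs]
# map needs one iterable per parameter, so the pairs are transposed with zip.
with_map = list(map(pow, *zip(*pairs)))

print(with_starmap)     # [0, 1, 4, 9, 16]
print(with_unpacking)   # [0, 1, 4, 9, 16]
print(with_map)         # [0, 1, 4, 9, 16]
```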


In [97]:
# iterate on several iterables
A = range(5)
B = range(10, 15)
chain = itertools.chain(A, B)
print(list(chain))

[0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
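
When the iterables to chain are themselves stored in an iterable, for example a list of lists, `itertools.chain.from_iterable` can be used to flatten one level of nesting. A minimal sketch with made-up data:

```python
import itertools

nested = [[0, 1], [2, 3, 4], [], [5]]

# Equivalent to itertools.chain(*nested), without unpacking the outer list.
flat = list(itertools.chain.from_iterable(nested))
print(flat)   # [0, 1, 2, 3, 4, 5]
```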


In [98]:
# accumulate on the sum of integers
accumulated = itertools.accumulate(range(1, 11))
print(list(accumulated))

# accumulate on the product of the integers
accumulated = itertools.accumulate(range(1, 11), lambda x, y: x*y)
print(list(accumulated))

[1, 3, 6, 10, 15, 21, 28, 36, 45, 55]
[1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]
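
Any function of two arguments can serve as the aggregation. The sketch below uses `operator.mul` in place of the lambda, `max` for a running maximum, and the `initial` keyword (assuming Python 3.8 or later) to seed the accumulation; the sample values are arbitrary.

```python
import itertools
import operator

values = [3, 1, 4, 1, 5, 9, 2, 6]

# Running product, with operator.mul instead of a lambda.
running_product = list(itertools.accumulate(range(1, 6), operator.mul))
print(running_product)   # [1, 2, 6, 24, 120]

# Running maximum: each item is the largest value seen so far.
running_max = list(itertools.accumulate(values, max))
print(running_max)       # [3, 3, 4, 4, 5, 9, 9, 9]

# With initial=..., the output starts with that value and has one extra item.
with_initial = list(itertools.accumulate([1, 2, 3], initial=100))
print(with_initial)      # [100, 101, 103, 106]
```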


In [103]:
parity = lambda x: x % 2   # 0 for even numbers, 1 for odd numbers

# itertools.groupby() only groups consecutive items, so the input must already be sorted by key!
L = sorted(range(20), key=parity)

grouped = itertools.groupby(L, parity)
for (key, group) in grouped:
    print(key, '=>', list(group))


0 => [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
1 => [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
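
To see why the sorting step matters, the following sketch runs `groupby` on the same words with and without sorting. The word list and the `first_letter` helper are illustrative, not part of the example above.

```python
import itertools

def first_letter(word):
    """Key function: group words by their first character."""
    return word[0]

words = ['apple', 'avocado', 'banana', 'apricot', 'blueberry']

# Without sorting, a key that reappears later starts a new group.
unsorted_groups = [(key, list(group))
                   for key, group in itertools.groupby(words, first_letter)]
print(unsorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]

# After sorting by the same key, there is exactly one group per key.
sorted_groups = [(key, list(group))
                 for key, group in itertools.groupby(sorted(words, key=first_letter), first_letter)]
print(sorted_groups)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
```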


In [100]:
# duplicate an iterator to have 2 independent copies of it
original_it = iter(range(10))   # a one-shot iterator (a bare range could simply be iterated twice)
copy1, copy2 = itertools.tee(original_it)
# should no longer use original_it
print(list(copy1))
print(list(copy2))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
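
The sketch below shows what can go wrong when the original iterator is still used after `tee`: items read directly from it never reach the copies. It also shows the optional second argument, which sets the number of copies. All names are illustrative.

```python
import itertools

source = iter(range(5))
copy_a, copy_b = itertools.tee(source)

# Advancing the source directly: the copies will never see this item.
skipped = next(source)

items_a = list(copy_a)
items_b = list(copy_b)
print(skipped)   # 0
print(items_a)   # [1, 2, 3, 4]
print(items_b)   # [1, 2, 3, 4]

# tee(it, n) returns n independent copies instead of 2.
x, y, z = itertools.tee(iter('ab'), 3)
three_copies = [list(x), list(y), list(z)]
print(three_copies)   # [['a', 'b'], ['a', 'b'], ['a', 'b']]
```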
