#### The itertools module is a collection of tools for handling iterators.
This module implements a number of iterator building functions.

In [1]:
import itertools
import operator  # for demonstration

***

**itertools.accumulate(iterable[, func, initial=None])** - makes an iterator that returns accumulated sums (the default) or the accumulated results of another binary function, passed as the optional func argument. Because functions are first-class objects in Python, func can be any callable that takes two arguments, such as operator.mul, max, or a lambda. The accumulated results are produced lazily by the returned iterator.

*constructs and returns an iterator: <itertools.accumulate object>*

Usually, the number of elements output matches the input iterable. However, if the keyword argument 'initial' is provided, the accumulation starts with the initial value, so the output has one more element than the input iterable (see examples #4 and #5 below).
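
To make the accumulation rule concrete, here is a rough pure-Python sketch of the behavior described above. The name `accumulate_sketch` is hypothetical and exists only for illustration; it is not how the module is implemented internally.

```python
import itertools
import operator


def accumulate_sketch(iterable, func=operator.add, *, initial=None):
    """Illustrative generator mimicking the accumulation rule (not the real implementation)."""
    iterator = iter(iterable)
    total = initial
    if initial is None:
        # Without an initial value, the first input element seeds the running total.
        try:
            total = next(iterator)
        except StopIteration:
            return  # empty input, no initial value: nothing to yield
    yield total
    for element in iterator:
        total = func(total, element)
        yield total


sample = [2, 4, 6]
sketch_plain = list(accumulate_sketch(sample))
sketch_initial = list(accumulate_sketch(sample, initial=100))
print(sketch_plain)    # same length as the input
print(sketch_initial)  # one element longer than the input
```

With `initial` supplied, the seed value is yielded first, which is why the output grows by one element.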

In [21]:
data = [2, 4, 6, 8, 10, 0, 3, 5, 7]

lst_one = list(itertools.accumulate(data))  # running sum (default)
print(f'#1 lst_one is: {lst_one}')
lst_two = list(itertools.accumulate(data, operator.mul))  # running product
print(f'#2 lst_two is: {lst_two}')
lst_three = list(itertools.accumulate(data, max))  # running maximum
print(f'#3 lst_three is: {lst_three}')
lst_four = list(itertools.accumulate(data, initial=100))  # running sum starting from 100
print(f'#4 lst_four is: {lst_four}')

# a lambda example: amortize a 5% loan of 1000 (the initial value) with 4 annual payments of 100:
cashflows = [-100, -100, -100, -100]
lst_five = list(itertools.accumulate(cashflows, lambda balance, payment: balance*1.05 + payment, initial=1000))
print(f'#5 {lst_five}')


#1 lst_one is: [2, 6, 12, 20, 30, 30, 33, 38, 45]
#2 lst_two is: [2, 8, 48, 384, 3840, 0, 0, 0, 0]
#3 lst_three is: [2, 4, 6, 8, 10, 10, 10, 10, 10]
#4 lst_four is: [100, 102, 106, 112, 120, 130, 130, 133, 138, 145]
#5 [1000, 950.0, 897.5, 842.375, 784.4937500000001]


***

**itertools.chain(*iterables)** - makes an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. The result is a single iterator of type <class 'itertools.chain'>.

In [16]:
first_seq = ['a', 'b', 'c', 'd']
second_seq = [1, 2, 3, 4]

chain_seq = itertools.chain(first_seq, second_seq)
print(list(chain_seq))

['a', 'b', 'c', 'd', 1, 2, 3, 4]


***

**itertools.combinations(iterable, r)** - returns successive r-length combinations of elements from the input iterable, each as a tuple.

The combination tuples are emitted in lexicographic ordering according to the order of the input iterable. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.

Elements are treated as unique based on their position, not on their value. So if the input elements are unique, there will be no repeat values in each combination.

The number of items returned is n! / r! / (n-r)! when 0 <= r <= n or zero when r > n.
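
The count formula above can be checked against the iterator itself. This is a minimal sketch; the helper name `count_combinations` is hypothetical and simply transcribes the formula.

```python
import itertools
import math


def count_combinations(n, r):
    """Return n! / r! / (n-r)! when 0 <= r <= n, otherwise zero."""
    if r < 0 or r > n:
        return 0
    return math.factorial(n) // math.factorial(r) // math.factorial(n - r)


items = 'abcde'
# Compare the formula with the actual number of tuples for every r, including r > n.
comparison = []
for r in range(len(items) + 2):
    actual = len(list(itertools.combinations(items, r)))
    comparison.append((r, actual, count_combinations(len(items), r)))
    print(r, actual, count_combinations(len(items), r))
```

Note that r = 0 gives exactly one result (the empty tuple), and r > n gives none.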

In [29]:
colors = ['white', 'black', 'purple', 'green'] #1
color_comb = itertools.combinations(colors, 2)
print(f'#1 List of color combinations tuples is: {list(color_comb)}')

letters = 'abcde' #2
letters_comb = itertools.combinations(letters, 3)
print(f'#2 List of letter combinations tuples is: {list(letters_comb)}')

print(f'#3 with range: {list(itertools.combinations(range(3), 2))}') #3

#1 List of color combinations tuples is: [('white', 'black'), ('white', 'purple'), ('white', 'green'), ('black', 'purple'), ('black', 'green'), ('purple', 'green')]
#2 List of letter combinations tuples is: [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'b', 'e'), ('a', 'c', 'd'), ('a', 'c', 'e'), ('a', 'd', 'e'), ('b', 'c', 'd'), ('b', 'c', 'e'), ('b', 'd', 'e'), ('c', 'd', 'e')]
#3 with range: [(0, 1), (0, 2), (1, 2)]


***

**itertools.combinations_with_replacement(iterable, r)** - returns r length tuples of elements from the input iterable allowing individual elements to be repeated more than once.

The number of items returned is (n+r-1)! / r! / (n-1)! when n > 0.
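
As a quick sanity check of this formula, the sketch below compares it with the number of tuples actually produced. The helper name `count_with_replacement` is hypothetical and only valid for n > 0, as stated above.

```python
import itertools
import math


def count_with_replacement(n, r):
    """Return (n+r-1)! / r! / (n-1)! for n > 0."""
    return math.factorial(n + r - 1) // math.factorial(r) // math.factorial(n - 1)


pairs = list(itertools.combinations_with_replacement(range(5), 2))
print(len(pairs), count_with_replacement(5, 2))
```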

In [32]:
print(list(itertools.combinations_with_replacement(range(5), 2)))

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]


***

**itertools.compress(data, selectors)** - makes an iterator that filters elements from data, returning only those whose corresponding element in selectors (the mask) evaluates to True. Stops when either the data or the selectors iterable has been exhausted. Roughly equivalent to: (d for d, s in zip(data, selectors) if s)

In [35]:
some_data = ['one', 'two', 'three', 'four', 'five', 'six']
some_selector = (1, 0, 0, 1)
some_result = list(itertools.compress(some_data, some_selector))
print(some_result)

['one', 'four']
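
The "roughly equivalent" generator expression from the description can be written out and compared with the real function. This is only a sketch; `compress_sketch` is a hypothetical name. It also shows that selector values are tested for truthiness, so they need not be 0 and 1.

```python
import itertools


def compress_sketch(data, selectors):
    """Illustrative equivalent: zip() stops at the shorter input, `if s` tests truthiness."""
    return (d for d, s in zip(data, selectors) if s)


words = ['one', 'two', 'three', 'four', 'five', 'six']
mask = (1, 0, 0, 1)  # shorter than words, so 'five' and 'six' are never examined
from_sketch = list(compress_sketch(words, mask))
from_itertools = list(itertools.compress(words, mask))
print(from_sketch == from_itertools)

# Any truthy or falsy objects work as selectors.
mixed = list(itertools.compress('ABCDE', [True, '', 'x', None, 2]))
print(mixed)
```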


***

**itertools.count(start=0, step=1)** - makes an iterator that returns evenly spaced values starting with the start number. Often used as an argument to map() to generate consecutive data points. Also, used with zip() to add sequence numbers.

In [39]:
for i in itertools.count(10, 0.5):
    print(i)
    if i > 15:
        break

10
10.5
11.0
11.5
12.0
12.5
13.0
13.5
14.0
14.5
15.0
15.5


In [28]:
# count() and zip() - tuple generator

some_string = 'ABCDEF'
final_thing = zip(itertools.count(), some_string)
print(list(final_thing))


[(0, 'A'), (1, 'B'), (2, 'C'), (3, 'D'), (4, 'E'), (5, 'F')]


In [38]:
# count() and map() - sequence generator
# lambda and count() as arguments in map()
cubes = map(lambda x: x**3, itertools.count())
for i in cubes:
    if i > 1000:
        break
    else:
        print(i, end=' ')


0 1 8 27 64 125 216 343 512 729 1000 

***

**itertools.cycle(iterable)** - makes an iterator that returns elements from the iterable and saves a copy of each. When the iterable is exhausted, it returns elements from the saved copy and repeats indefinitely. This may require significant auxiliary storage (depending on the length of the iterable).

In [41]:
counter = 0
for i in itertools.cycle((1, 2, 3)):
    print(i)
    counter += 1
    if counter > 6:
        break

1
2
3
1
2
3
1


***

**itertools.dropwhile(predicate, iterable)** - makes an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element. Note that the iterator does not produce any output until the predicate first becomes false, so it may have a lengthy start-up time.

In [46]:
some_values = [11, 12, 13, 14, 15, 16, 17, 18, 19]
remaining_values = list(itertools.dropwhile(lambda x: x % 7 != 0, some_values))
print(remaining_values)

[14, 15, 16, 17, 18, 19]


***

**itertools.filterfalse(predicate, iterable)** - makes an iterator that filters elements from the iterable, returning only those for which the predicate is false. If predicate is None, returns the items that are themselves false.

In [7]:
only_odd = list(itertools.filterfalse(lambda x: not (x % 2), range(10)))
print(f'Only odd numbers are: {only_odd}')

none_predicate = list(itertools.filterfalse(None, range(5)))
print(f'Only zero if predicate is None: {none_predicate}')

Only odd numbers are: [1, 3, 5, 7, 9]
Only zero if predicate is None: [0]


***

**itertools.groupby(iterable, key=None)** - makes an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally (see examples below), the iterable needs to already be sorted on the same key function.

The operation of groupby() generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.
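
A small sketch of what this means in practice: on unsorted input the same key can appear more than once, because a new group starts at every change of key. The sample string is made up for illustration.

```python
import itertools

unsorted_string = 'AABBAAC'

# Without sorting, 'A' shows up twice: once for each consecutive run.
unsorted_keys = [key for key, _ in itertools.groupby(unsorted_string)]
print(unsorted_keys)

# After sorting, equal elements are adjacent, so each key appears once.
sorted_keys = [key for key, _ in itertools.groupby(sorted(unsorted_string))]
print(sorted_keys)
```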

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
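
The shared-iterator behavior can be seen by advancing the groupby() object before consuming a group. This is a minimal sketch with made-up sample data.

```python
import itertools

grouper = itertools.groupby('AAABB')
first_key, first_group = next(grouper)
second_key, second_group = next(grouper)  # advancing discards the rest of the first group

stale_items = list(first_group)    # the first group can no longer be read
fresh_items = list(second_group)   # the current group is still available
print(first_key, stale_items)
print(second_key, fresh_items)

# To keep the data, convert each group to a list while it is current.
saved_groups = {key: list(group) for key, group in itertools.groupby('AAABB')}
print(saved_groups)
```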

In [13]:
# Using groupby() with a pre-sorted sequence and no key function:
sorted_string = 'AAAABBBCCCDDEE'
print(f'Keys are: {[key for key, group in itertools.groupby(sorted_string)]}')
print(f'Groups are: {[list(group) for key, group in itertools.groupby(sorted_string)]}')

Keys are: ['A', 'B', 'C', 'D', 'E']
Groups are: [['A', 'A', 'A', 'A'], ['B', 'B', 'B'], ['C', 'C', 'C'], ['D', 'D'], ['E', 'E']]


In [19]:
# Using groupby() with a list of dictionaries (JSON-like records):
cities = [{
    'continent': 'Africa',
    'country': 'Gabon'
}, {
    'continent': 'Asia',
    'country': 'Nepal'
}, {
    'continent': 'Africa',
    'country': 'Egypt'
}, {
    'continent': 'Asia',
    'country': 'Bhutan'
}, {
    'continent': 'Africa',
    'country': 'Zambia'
}]

# pre-sort the sequence with the same key function:
sorted_cities = sorted(cities, key=lambda x: x['continent'])

# and finally:
for key, group in itertools.groupby(sorted_cities, key=lambda x: x['continent']):
    print(key)
    print(list(group))


Africa
[{'continent': 'Africa', 'country': 'Gabon'}, {'continent': 'Africa', 'country': 'Egypt'}, {'continent': 'Africa', 'country': 'Zambia'}]
Asia
[{'continent': 'Asia', 'country': 'Nepal'}, {'continent': 'Asia', 'country': 'Bhutan'}]


***

**itertools.islice(iterable[, start], stop[, step])** - makes an iterator that returns selected elements from an iterable.
It is similar to regular slicing (seq[start:stop:step]). However, regular slicing creates a copy of the selected part of the original sequence (list, tuple, string, etc.), which can take up a considerable amount of memory if the slice is large. By contrast, islice() returns an iterator that produces elements lazily, so it uses little extra memory and also works on iterators and generators that do not support indexing. Unlike regular slicing, islice() does not support negative values for start, stop, or step.

In [27]:
# Single value passed along with iterable acts as the stop parameter:
slice_a = itertools.islice(range(5), 3) #1
print(f'#1 {list(slice_a)}')

# If stop is None, then iteration continues until the iterator is exhausted:
slice_b = itertools.islice(range(5), 1, None) #2
print(f'#2 {list(slice_b)}')

# with step:
slice_c = itertools.islice((1, 2, 1, 2, 1, 2), 0, None, 2) #3
print(f'#3 {list(slice_c)}')

#1 [0, 1, 2]
#2 [1, 2, 3, 4]
#3 [1, 1, 1]
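
Two points from the description above are worth a short sketch: islice() works lazily even on an infinite iterator, and negative arguments are rejected. Using collections.deque with maxlen to get the last few items is one possible workaround, shown here as an assumption rather than the only option.

```python
import collections
import itertools

# Lazy slicing of an infinite iterator: only the first five values are ever produced.
first_evens = list(itertools.islice(itertools.count(0, 2), 5))
print(first_evens)

# Negative start, stop, or step raises ValueError.
negative_rejected = False
try:
    itertools.islice(range(10), -3, None)
except ValueError as error:
    negative_rejected = True
    print(f'ValueError: {error}')

# One way to take the last n items of an iterable without negative indices.
last_three = list(collections.deque(range(10), maxlen=3))
print(last_three)
```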
