# Welcome to the Dark Art of Coding:
## Introduction to Python
`itertools`

<img src='../../images/dark_art_logo.600px.png' width='300' style="float:right">

# Objectives
---

In this session, we should expect to:

* Understand the purpose of the itertools library
* Explore the functions available in the itertools library
* Use some of the itertools functions to work through practical data analysis/manipulation examples


# `itertools` module 
---

The `itertools` module provides functions for creating and combining iterators. Its functions include:

```
accumulate
chain
combinations
combinations_with_replacement 
compress
count
cycle
dropwhile
filterfalse
groupby
islice
product
repeat
starmap
takewhile
tee
zip_longest
```

In [1]:
import itertools as it

# Expanding generator functions
---

Some of the `itertools` functions produce more output values than they receive as input, either by expanding on the inputs or by creating new values based on them

## `itertools.count()`
---

In [119]:
# count() is a bit tricky in that it will continue running forever
#    unless you tell it to stop
# 
# You can picture count() as being much like a range() function
#    that doesn't end.

for num in it.count(10):
    print(num)
    if num > 25:
        break

10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26


In [124]:
it.count?

In [135]:
type(it.count(10))

itertools.count

In [129]:
# let's test this with a step value

for num in it.count(10, 5):
    print(num)
    if num == 35:
        break

10
15
20
25
30
35


## NOTE: **`ctrl + c`** will break infinite loops in a terminal (in a notebook, interrupt the kernel instead)
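
Rather than relying on an interrupt, one option is to bound an infinite iterator up front. The following is a minimal sketch, assuming a hypothetical sample list named `labels`; it uses `islice()` (covered later in this session) and the built-in `zip()`, both of which stop on their own.

```python
import itertools as it

# islice() stops after a fixed number of items, so no break is needed
first_five = list(it.islice(it.count(10, 5), 5))
print(first_five)

# zip() stops when its shortest input is exhausted, so pairing count()
#     with a finite sequence numbers each item, much like enumerate()
labels = ['a', 'b', 'c']   # hypothetical sample data
numbered = list(zip(it.count(1), labels))
print(numbered)
```

Either approach keeps the stopping condition next to the iterator itself, instead of inside the loop body.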

## `itertools.cycle()`
---

In [122]:
# cycle() is also a bit tricky, in that it will also continue
#     running forever
# cycle() will iterate over a sequence endlessly, until stopped.
# 

flag = 0
for num in it.cycle([1, 2, 'three', 4]):
    if flag > 15:
        break
    flag += 1    
    print(num)

1
2
three
4
1
2
three
4
1
2
three
4
1
2
three
4
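
As a possible practical use of `cycle()`, the sketch below hands out items round-robin. The names `tasks` and `workers` are hypothetical sample data, not part of the session; the built-in `zip()` ends the otherwise endless cycle once `tasks` runs out.

```python
import itertools as it

tasks = ['t1', 't2', 't3', 't4', 't5']   # hypothetical sample data
workers = ['ann', 'bob']                 # hypothetical sample data

# cycle() repeats the workers endlessly; zip() stops when tasks is exhausted
assignments = list(zip(tasks, it.cycle(workers)))

for task, worker in assignments:
    print(task, worker)
```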


In [123]:
it.cycle?

# Mapping generator functions
---

A number of the `itertools` functions map functions against inputs

## `itertools.accumulate()`
---

In [131]:
# By default, the accumulate() function produces a running total: it yields
#     the first value, then the sum of the first two values,
#     then that sum plus the third value, and so on...

for num in it.accumulate([1, 2, 3, 4, 5, 6]):
    print(num)

1
3
6
10
15
21


In [148]:
# Iterators are designed to produce a single value at a time
# 
# The itertools library produces iterators ... this saves memory
#
# On occasion, however, it is desirable to produce 
#     all the values at once: tools like the list() factory
#     function will iterate through the iterator to generate
#     all the values

list(it.accumulate([1, 2, 3, 4, 5, 6]))

[1, 3, 6, 10, 15, 21]

In [143]:
# While the accumulate() function defaults to a simple cumulative sum, 
#     it is possible to apply other functions, as well
#
# The operator library includes a series of functions that 
#     mirror the standard operators, such as:
#     '+'   >>> operator.add()
#     '*'   >>> operator.mul()
#     '-'   >>> operator.sub()
#     '**'  >>> operator.pow() 

import operator 

list(it.accumulate([1, 2, 3, 4, 5, 6], operator.mul))

[1, 2, 6, 24, 120, 720]

In [145]:
import operator 

list(it.accumulate([2, 3, 4, 5], operator.pow))

[2, 8, 4096, 1152921504606846976]

In [149]:
# Note: this session is not intended to cover all the
#     nuances of the operator module. It includes other
#     types of functions, such as concat(), which
#     concatenates two sequences

import operator 

list(it.accumulate(['a', 'bb', 'ccc', 'dddd', 'eeeee', 'ffffff'], operator.concat))

['a',
 'abb',
 'abbccc',
 'abbcccdddd',
 'abbcccddddeeeee',
 'abbcccddddeeeeeffffff']

In [151]:
# many other functions may be used in the accumulate function
# such as min() OR max()

list(it.accumulate([5, 4, 3, 2, 1, 3, 4, 5, 3, 13, 42, 5], min))

[5, 4, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1]

In [153]:
list(it.accumulate([1, 2, 3, 4, 5, 3, 13, 42, 5, 4, 3, 2, 1], max))

[1, 2, 3, 4, 5, 5, 13, 42, 42, 42, 42, 42, 42]
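
To tie the running sum and running `max()` together, here is a small sketch of a data-analysis style use. The `transactions` list is hypothetical sample data: the first call tracks a running balance, and the second tracks the highest balance seen so far.

```python
import itertools as it

transactions = [100, -20, 50, -80, 10]   # hypothetical sample data

# default behavior: cumulative sum gives the balance after each transaction
balances = list(it.accumulate(transactions))
print(balances)

# passing max gives the highest balance reached up to each point
peaks = list(it.accumulate(balances, max))
print(peaks)
```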

# Merging generator functions
---

## `itertools.chain()`
---

In [154]:
# The chain function concatenates sequences

chain1 = it.chain([1, 2, 3], [97, 98, 99])
list(chain1)

[1, 2, 3, 97, 98, 99]

In [25]:
# You can chain() the content of disparate objects, such as:
#     lists
#     tuples
#     range objects
# You may chain() more than two sequences at a time, as well


chain2 = it.chain([1, 2, 3], (97, 98, 99), range(15))
list(chain2)

[1, 2, 3, 97, 98, 99, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
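
When the sequences to merge are already stored inside another sequence, `chain.from_iterable()` may be more convenient than passing each one as a separate argument. A minimal sketch, assuming a hypothetical nested list named `nested`:

```python
import itertools as it

nested = [[1, 2], [3], [4, 5, 6]]   # hypothetical sample data

# from_iterable() takes one iterable of iterables and flattens one level
flat = list(it.chain.from_iterable(nested))
print(flat)

# unpacking with * produces the same values
same = list(it.chain(*nested))
print(same)
```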

## `itertools.zip_longest()`
---

In [157]:
# There is a built-in function that zippers sequences together
# 
# zip() pairs the first element with the first element
#             the second element with the second element
#             the third with the third and so on...
#             until the shortest sequence is exhausted

list(zip([1, 2, 3, 4, 42, 42, 42, 42], [11, 22, 33, 44]))

[(1, 11), (2, 22), (3, 33), (4, 44)]

In [158]:
# itertools has a function called zip_longest() that
#     continues until the longest sequence is exhausted,
#     filling in the missing elements with None by default

list(it.zip_longest([1, 2, 3, 4, 42, 42, 42, 42], [11, 22, 33, 44]))

[(1, 11),
 (2, 22),
 (3, 33),
 (4, 44),
 (42, None),
 (42, None),
 (42, None),
 (42, None)]

In [159]:
# The zip_longest() function allows us to replace the default
#     with our own fill value, using the fillvalue argument...

list(it.zip_longest([1, 2, 3, 4, 42, 42, 42, 42], [11, 22, 33, 44], fillvalue=9000))

[(1, 11),
 (2, 22),
 (3, 33),
 (4, 44),
 (42, 9000),
 (42, 9000),
 (42, 9000),
 (42, 9000)]

# Reorganizing generator functions
---

## `itertools.groupby()`
---

In [77]:
# groupby() groups consecutive items that share the same value, so the
#     items must already be arranged in the desired order (even if that
#     order is not truly sorted lexicographically).

for item in it.groupby('PPYYYYTTTTTHHHHHHONNNNN'):
    print(item)

('P', <itertools._grouper object at 0x1051f9b00>)
('Y', <itertools._grouper object at 0x1050b6da0>)
('T', <itertools._grouper object at 0x1050b6f98>)
('H', <itertools._grouper object at 0x1051f9b00>)
('O', <itertools._grouper object at 0x1051f94e0>)
('N', <itertools._grouper object at 0x1051f94a8>)


In [75]:
# each group is an iterable
#     we will examine the contents with list()

for item, group in it.groupby('PPYYYYTTTTTHHHHHHONNNNN'):
    print(item, list(group))

P ['P', 'P']
Y ['Y', 'Y', 'Y', 'Y']
T ['T', 'T', 'T', 'T', 'T']
H ['H', 'H', 'H', 'H', 'H', 'H']
O ['O']
N ['N', 'N', 'N', 'N', 'N']


In [82]:
# if the items you want to group are not sorted, you can
#     use the sorted() function to sort them first

for item, group in it.groupby(sorted([1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4])):
    print(item, list(group))

1 [1, 1, 1]
2 [2, 2, 2]
3 [3, 3, 3]
4 [4, 4]
5 [5]
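
`groupby()` also accepts a `key` function, which is often what practical grouping needs. The sketch below is illustrative only: `words` and `first_letter` are hypothetical names, and the unsorted run shows why sorting by the same key first usually matters.

```python
import itertools as it

# hypothetical sample data; note that 'apricot' is out of order
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apricot']

def first_letter(word):
    return word[0]

# without sorting, 'apricot' ends up in a second, separate 'a' group
unsorted_groups = [(k, list(g)) for k, g in it.groupby(words, key=first_letter)]
print(unsorted_groups)

# sorting by the same key first yields a single group per letter
by_letter = sorted(words, key=first_letter)
sorted_groups = [(k, list(g)) for k, g in it.groupby(by_letter, key=first_letter)]
print(sorted_groups)
```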


# Filtering generator functions
---

## `itertools.filterfalse()`
---

In [87]:
# filterfalse() drops items for which the predicate evaluates to True
#    i.e. it keeps only the items for which the predicate is False

ff = it.filterfalse(lambda a: len(a) == 3, ['abc', 'abd', 'abde', 'asdef'])
list(ff)

['abde', 'asdef']

## `itertools.islice()`
---

In [91]:
# islice() returns an iterator over a slice of the input
it.islice([0, 11, 22, 33, 44, 55, 66], 2)

<itertools.islice at 0x105286818>

In [88]:
# with a single number, islice() stops after that many items
list(it.islice([0, 11, 22, 33, 44, 55, 66], 2))

[0, 11]

In [89]:
# with two numbers, they are treated as start and stop
list(it.islice([0, 11, 22, 33, 44, 55, 66], 2, 4))

[22, 33]

In [95]:
# a third number is the step; a stop of None means run to the end
list(it.islice([0, 11, 22, 33, 42, 55, 66], 2, None, 2))

[22, 42, 66]

In [94]:
list(it.islice('abcdefghijklmnopqrstuvwxyz', 2, None, 3))

['c', 'f', 'i', 'l', 'o', 'r', 'u', 'x']
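
The start, stop, and step arguments behave much like ordinary slice notation, with the difference that `islice()` also works on iterators that cannot be indexed. A short sketch of both points, assuming a hypothetical generator named `squares`:

```python
import itertools as it

data = [0, 11, 22, 33, 44, 55, 66]

# on a list, islice(data, 2, 6, 2) yields the same values as data[2:6:2]
sliced = list(it.islice(data, 2, 6, 2))
print(sliced, data[2:6:2])

# a generator cannot be sliced with [], but islice() can still take from it
squares = (n * n for n in it.count(1))   # hypothetical endless generator
first_three = list(it.islice(squares, 3))
print(first_three)

# the generator keeps its position, so the next islice() continues from there
next_two = list(it.islice(squares, 2))
print(next_two)
```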

## `itertools.dropwhile()`
---

In [106]:
# dropwhile() drops items while the predicate is True, then yields every remaining item
list(it.dropwhile(lambda x: x not in 'aeiou', ['b', 'c', 'd', 'e', 'g', 'h', 'i', 'j']))

['e', 'g', 'h', 'i', 'j']

In [105]:
list(it.dropwhile(lambda x: x < 4, [1, 2, 3, 4, 5, 6, 7]))

[4, 5, 6, 7]

In [164]:
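# v() returns True for vowels; because the first item ('b') is not a vowel,
#     the predicate is False right away and dropwhile() drops nothing
def v(ltr):
    if ltr not in 'aeiou':
        return False
    return True

list(it.dropwhile(v, ['b', 'c', 'd', 'e', 'g', 'h', 'i', 'j']))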

['b', 'c', 'd', 'e', 'g', 'h', 'i', 'j']
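
The result above hints at an easy-to-miss detail: `dropwhile()` only tests the leading run of items, and once the predicate is False it stops checking. The sketch below contrasts this with `filterfalse()`, using hypothetical sample data named `readings`.

```python
import itertools as it

readings = [0, 0, 3, 0, 5, 0]   # hypothetical sample data

def is_zero(value):
    return value == 0

# dropwhile() removes only the leading zeros; later zeros are kept
trimmed = list(it.dropwhile(is_zero, readings))
print(trimmed)

# filterfalse() tests every item, so all zeros are removed
nonzero = list(it.filterfalse(is_zero, readings))
print(nonzero)
```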

## `itertools.takewhile()`
---

In [108]:
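# takewhile() yields items while the predicate is True and stops at the first item for which it is False
list(it.takewhile(lambda x: x in 'aeiou', ['a', 'e', 'e', 'a', 'b', 'c', 'd', 'e', 'g', 'h', 'i', 'j']))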

['a', 'e', 'e', 'a']

In [109]:
list(it.takewhile(lambda x: x < 13, [2, 4, 6, 8, 10, 12, 14, 16, 18]))

[2, 4, 6, 8, 10, 12]

In [111]:
def v(ltr):
    if ltr in 'aeiou':
        return True
    return False

list(it.takewhile(v, ['i', 'o', 'u', 'e', 'a', 'c', 'd', 'e', 'g', 'h', 'i', 'j']))

['i', 'o', 'u', 'e', 'a']
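
Because `takewhile()` stops at the first False result, it is one way to put a condition-based limit on an endless iterator such as `count()`. A minimal sketch, with the cutoff of 50 chosen arbitrarily for illustration:

```python
import itertools as it

# an endless stream of square numbers built on count()
squares = (n * n for n in it.count(1))

# takewhile() ends the stream at the first square that is not below 50
small_squares = list(it.takewhile(lambda sq: sq < 50, squares))
print(small_squares)
```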

## `itertools.*`
---

There are other functions in the itertools stable; the ones shown in **bold** below were not covered in this session. Exploring them is left as an exercise for the student.

* accumulate
* chain
* **combinations**
* **combinations_with_replacement** 
* **compress**
* count
* cycle
* dropwhile
* filterfalse
* groupby
* islice
* **product**
* **repeat**
* **starmap**
* takewhile
* **tee**
* zip_longest

# Resources
---



[https://pymotw.com/3/itertools/](https://pymotw.com/3/itertools/)

[https://docs.python.org/3/library/itertools.html](https://docs.python.org/3/library/itertools.html)

[https://www.blog.pythonlibrary.org/2016/04/20/python-201-an-intro-to-itertools/](https://www.blog.pythonlibrary.org/2016/04/20/python-201-an-intro-to-itertools/)