In [98]:
import itertools as it
import more_itertools as more_it
import seaborn as sns
import matplotlib.pyplot as plt

%matplotlib inline
%load_ext memory_profiler



# What is an `iterable` in Python?

Any Python object that defines an `.__iter__()` method, or a `.__getitem__()` method with sequence semantics (integer indexes starting at 0), is iterable.
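Both routes to iterability can be sketched with two small classes. The names `Countdown` and `Squares` are hypothetical, chosen only for illustration; one relies on `__iter__`, the other only on `__getitem__`.

```python
class Countdown:
    """Iterable through the iterator protocol: defines __iter__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator method returns a fresh iterator on every call.
        n = self.start
        while n > 0:
            yield n
            n -= 1


class Squares:
    """Iterable through the older sequence protocol: defines only __getitem__."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, index):
        # iter() calls this with 0, 1, 2, ... until IndexError is raised.
        if index >= self.size:
            raise IndexError(index)
        return index * index


countdown_items = list(Countdown(3))
square_items = list(Squares(4))
print(countdown_items)  # [3, 2, 1]
print(square_items)     # [0, 1, 4, 9]
```

Neither class inherits from a special base class; `list()` and `for` loops only need one of the two methods to be present.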

**E.g. #1:** `map()` is a built-in function that applies a function to each element of an iterable; here it applies `len`.

```python

    >>> list(map(len, ['cat','dogs','wombats']))
    [3, 4, 7]

```
**E.g. #2:** Iterators are themselves iterable, so they can be composed into an _iterator algebra_.

```python
    >>> import math
    >>> list(map(math.prod, zip([2.0,3.1,4], [4, 5, 6])))
    [8.0, 15.5, 24]

```

# `itertools` Module

- Python's approach to `iterator algebra` 
- fast, memory-efficient, concise code 
    - _lazy evaluation_ (call-by-need) delays the evaluation of an expression until its value is needed.

```python

    def itertools_repeat():
        for _ in it.repeat(None, 1_000_000):
            pass

    >>> %timeit itertools_repeat()
        9.4 ms ± 602 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

    def standard_loop():
        for _ in range(1_000_000):
            pass

    >>> %timeit standard_loop()
        20 ms ± 976 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
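Lazy evaluation itself can be observed directly. The sketch below uses a hypothetical helper, `noisy_square`, that records every input it is called with, so it is visible that building the pipeline computes nothing and that only the requested values are ever evaluated.

```python
import itertools as it

evaluated = []  # records which inputs were actually computed


def noisy_square(x):
    evaluated.append(x)
    return x * x


# map() is lazy: nothing is computed when the pipeline is built,
# even though it.count(1) is an infinite source.
squares = map(noisy_square, it.count(1))
print(evaluated)    # []

# islice() pulls exactly three values through the pipeline.
first_three = list(it.islice(squares, 3))
print(first_three)  # [1, 4, 9]
print(evaluated)    # [1, 2, 3]
```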

# Remarks

Refer to the Real Python example on memory efficiency
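As a rough local illustration (this is not the Real Python example, only a minimal sketch), the size of an eager list can be compared with that of a lazy generator over the same values. `sys.getsizeof` reports only the container object, not the integers it refers to, so the real gap is larger than the numbers suggest.

```python
import sys

n = 1_000_000

eager = [i * i for i in range(n)]  # all values stored at once
lazy = (i * i for i in range(n))   # values produced on demand

eager_size = sys.getsizeof(eager)  # grows with n
lazy_size = sys.getsizeof(lazy)    # small and independent of n

print(eager_size > lazy_size)      # True

# Both produce the same values; the generator just never holds them all.
lazy_total = sum(lazy)
eager_total = sum(eager)
```

Exact byte counts depend on the Python version and platform, so none are shown here.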

# Function breakdown of `itertools`


## Infinite iterators

| Iterator | Arguments     | Results                                        | Example                               |
|----------|---------------|------------------------------------------------|---------------------------------------|
| count()  | start, [step] | start, start+step, start+2*step, …             | <mark>`count(10) --> 10 11 12 13 14 ...` </mark>     |
| cycle()  | p             | p0, p1, … plast, p0, p1, …                     | <mark> `cycle('ABCD') --> A B C D A B C D ...` </mark> |
| repeat() | elem [,n]     | elem, elem, elem, … endlessly or up to n times | <mark> `repeat(10, 3) --> 10 10 10` </mark>            |
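Because these iterators never end on their own, they are usually paired with something that stops them. A minimal sketch, using `islice()`, `zip()`, and `map()` as the stopping mechanisms:

```python
import itertools as it

# count(): arithmetic progression 10, 12, 14, ... cut off by islice().
evens_from_10 = list(it.islice(it.count(10, 2), 4))

# cycle(): repeat a finite iterable forever; zip() stops at the shorter input.
labels = list(zip(range(5), it.cycle('AB')))

# repeat(): supply a constant second argument to map().
squares = list(map(pow, range(5), it.repeat(2)))

print(evens_from_10)  # [10, 12, 14, 16]
print(labels)         # [(0, 'A'), (1, 'B'), (2, 'A'), (3, 'B'), (4, 'A')]
print(squares)        # [0, 1, 4, 9, 16]
```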

## Iterators terminating on the shortest input sequence

| Iterator              | Arguments                   | Results                                    | Example                                                  |
|-----------------------|-----------------------------|--------------------------------------------|----------------------------------------------------------|
| accumulate()          | p [,func]                   | p0, p0+p1, p0+p1+p2, …                     | <mark>`accumulate([1,2,3,4,5]) --> 1 3 6 10 15 `</mark>                 |
| chain()               | p, q, …                     | p0, p1, … plast, q0, q1, …                 | <mark>`chain('ABC', 'DEF') --> A B C D E F`</mark>                      |
| chain.from_iterable() | iterable                    | p0, p1, … plast, q0, q1, …                 | <mark>`chain.from_iterable(['ABC', 'DEF']) --> A B C D E F`</mark>      |
| compress()            | data, selectors             | (d[0] if s[0]), (d[1] if s[1]), …          | <mark>`compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F`</mark>            |
| dropwhile()           | pred, seq                   | seq[n], seq[n+1], starting when pred fails | <mark>`dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1` </mark>         |
| filterfalse()         | pred, seq                   | elements of seq where pred(elem) is false  | <mark>`filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8`</mark>      |
| groupby()             | iterable[, key]             | sub-iterators grouped by value of key(v)   |                                                          |
| islice()              | seq, [start,] stop [, step] | elements from seq[start:stop:step]         | <mark>`islice('ABCDEFG', 2, None) --> C D E F G`</mark>                 |
| pairwise()            | iterable                    | (p[0], p[1]), (p[1], p[2])                 | <mark>`pairwise('ABCDEFG') --> AB BC CD DE EF FG`</mark>                |
| starmap()             | func, seq                   | func(*seq[0]), func(*seq[1]), …            | <mark>`starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000`</mark>       |
| takewhile()           | pred, seq                   | seq[0], seq[1], until pred fails           | <mark>`takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4 `</mark>           |
| tee()                 | it, n                       | it1, it2, … itn splits one iterator into n |                                                          |
| zip_longest()         | p, q, …                     | (p[0], q[0]), (p[1], q[1]), …              | <mark>`zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-`</mark> |
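The table gives no examples for `groupby()` and `tee()`, so here is a small sketch of both. The word list is made up for illustration. Note that `groupby()` only merges consecutive items with equal keys, which is why input is commonly sorted by the same key first.

```python
import itertools as it

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apricot']


def first_letter(word):
    return word[0]


# Without sorting, the trailing 'apricot' forms a second 'a' group.
unsorted_groups = [(k, list(g)) for k, g in it.groupby(words, key=first_letter)]

# Sorting by the same key first yields one group per key.
sorted_groups = [
    (k, list(g))
    for k, g in it.groupby(sorted(words, key=first_letter), key=first_letter)
]

print(unsorted_groups[-1])  # ('a', ['apricot'])
print(sorted_groups[0])     # ('a', ['apple', 'avocado', 'apricot'])

# tee(): split one iterator into two independent ones.
a, b = it.tee(iter([1, 2, 3, 4]), 2)
next(b)                     # advance the second copy by one item
pairs = list(zip(a, b))
print(pairs)                # [(1, 2), (2, 3), (3, 4)]
```

Once `tee()` has been called, the original iterator should not be used directly, since advancing it would silently skip items in the copies.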


## Combinatoric iterators

| Iterator                        | Arguments          | Results                                                       |
|---------------------------------|--------------------|---------------------------------------------------------------|
| product()                       | p, q, … [repeat=1] | cartesian product, equivalent to a nested for-loop            |
| permutations()                  | p[, r]             | r-length tuples, all possible orderings, no repeated elements |
| combinations()                  | p, r               | r-length tuples, in sorted order, no repeated elements        |
| combinations_with_replacement() | p, r               | r-length tuples, in sorted order, with repeated elements      |

Some examples:

| Example                                  | Results                                         | Description                                                   |
|------------------------------------------|-------------------------------------------------|---------------------------------------------------------------|
| <mark>`product('ABCD', repeat=2)`</mark>               | <mark>`AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD`</mark> | cartesian product, equivalent to a nested for-loop            |
| <mark>`permutations('ABCD', 2)`</mark>                  | <mark>`AB AC AD BA BC BD CA CB CD DA DB DC`</mark>             | r-length tuples, all possible orderings, no repeated elements |
| <mark>`combinations('ABCD', 2)`</mark>                  | <mark>`AB AC AD BC BD CD`</mark>                               | r-length tuples, in sorted order, no repeated elements        |
| <mark>`combinations_with_replacement('ABCD', 2)`</mark> | <mark>`AA AB AC AD BB BC BD CC CD DD`</mark>                   | r-length tuples, in sorted order, with repeated elements      |
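The claims in these tables can be checked in a few lines. The sketch below confirms that `product()` matches a nested for-loop and compares the number of results against the closed-form counts from `math.perm` and `math.comb` (Python 3.8+).

```python
import itertools as it
from math import comb, perm

letters = 'ABCD'
n, r = len(letters), 2

# product() yields the same tuples, in the same order, as a nested loop.
nested = [(a, b) for a in letters for b in letters]
product_pairs = list(it.product(letters, repeat=r))
print(product_pairs == nested)  # True

counts = {
    'product': len(product_pairs),
    'permutations': len(list(it.permutations(letters, r))),
    'combinations': len(list(it.combinations(letters, r))),
    'combinations_with_replacement': len(
        list(it.combinations_with_replacement(letters, r))
    ),
}
print(counts)
# {'product': 16, 'permutations': 12, 'combinations': 6,
#  'combinations_with_replacement': 10}

expected = {
    'product': n ** r,
    'permutations': perm(n, r),
    'combinations': comb(n, r),
    'combinations_with_replacement': comb(n + r - 1, r),
}
```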

# `itertools` recipes

Itertools Recipes [URL](https://docs.python.org/3.6/library/itertools.html#itertools-recipes)

- `itertools.zip_longest`
- `itertools.combinations`
- `itertools.combinations_with_replacement`
- `itertools.permutations`
- `itertools.count`
- `itertools.repeat`
- `itertools.cycle`
- `itertools.accumulate`
- `itertools.product`
- `itertools.tee`
- `itertools.islice`
- `itertools.chain`
- `itertools.filterfalse`
- `itertools.takewhile`
- `itertools.dropwhile`
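A recipe is a small helper built from these primitives. The three below are modeled on `take`, `flatten`, and `grouper` from the recipes section of the documentation, lightly adapted, so treat them as a sketch rather than the canonical definitions. The `more_itertools` package imported at the top of this notebook ships many of the recipes ready-made.

```python
from itertools import chain, islice, zip_longest


def take(n, iterable):
    """Return the first n items of the iterable as a list."""
    return list(islice(iterable, n))


def flatten(list_of_lists):
    """Flatten one level of nesting."""
    return chain.from_iterable(list_of_lists)


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks, padding the last one."""
    # n references to the SAME iterator: zip_longest pulls from each in turn.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


first_three = take(3, range(10))
flat = list(flatten([[1, 2], [3], [4, 5]]))
chunks = list(grouper('ABCDEFG', 3, fillvalue='x'))

print(first_three)  # [0, 1, 2]
print(flat)         # [1, 2, 3, 4, 5]
print(chunks)       # [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```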

## `itertools.compress()`

Filters one iterable using a second iterable of selectors, yielding the elements of the first whose corresponding selector is truthy.

In [114]:
dates = [
    "2020-01-01",
    "2020-02-04",
    "2020-02-01",
    "2020-01-24",
    "2020-01-08",
    "2020-02-10",
    "2020-02-15",
    "2020-02-11",
]

counts = [1, 4, 3, 8, 0, 7, 9, 2]
counts_2 = counts.copy()

from itertools import compress
bools = [n > 3 for n in counts]

print(list(compress(dates, bools)))  # Compress returns iterator!

['2020-02-04', '2020-01-24', '2020-02-10', '2020-02-15']
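To make the behavior concrete, `compress()` is roughly equivalent to a generator expression over `zip()`. The function `compress_sketch` below is a hypothetical stand-in written for illustration, not the library implementation. The selectors can also be lazy, so the intermediate list of booleans is optional.

```python
from itertools import compress


def compress_sketch(data, selectors):
    # zip() pairs each item with its selector and stops at the shorter input.
    return (d for d, s in zip(data, selectors) if s)


data = 'ABCDEF'
selectors = [1, 0, 1, 0, 1, 1]

builtin_result = list(compress(data, selectors))
sketch_result = list(compress_sketch(data, selectors))
print(builtin_result)  # ['A', 'C', 'E', 'F']
print(sketch_result)   # ['A', 'C', 'E', 'F']

# Selectors may be a generator, so no list of booleans is materialized.
counts = [1, 4, 3, 8, 0, 7, 9, 2]
lazy_selected = list(compress(counts, (n > 3 for n in counts)))
print(lazy_selected)   # [4, 8, 7, 9]
```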


In [115]:
df_mpg = sns.load_dataset('mpg')

df_mpg.head()

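|   | mpg  | cylinders | displacement | horsepower | weight | acceleration | model_year | origin | name                      |
|---|------|-----------|--------------|------------|--------|--------------|------------|--------|---------------------------|
| 0 | 18.0 | 8         | 307.0        | 130.0      | 3504   | 12.0         | 70         | usa    | chevrolet chevelle malibu |
| 1 | 15.0 | 8         | 350.0        | 165.0      | 3693   | 11.5         | 70         | usa    | buick skylark 320         |
| 2 | 18.0 | 8         | 318.0        | 150.0      | 3436   | 11.0         | 70         | usa    | plymouth satellite        |
| 3 | 16.0 | 8         | 304.0        | 150.0      | 3433   | 12.0         | 70         | usa    | amc rebel sst             |
| 4 | 17.0 | 8         | 302.0        | 140.0      | 3449   | 10.5         | 70         | usa    | ford torino               |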


In [117]:
cars = df_mpg['name']
mpg_vals = df_mpg['mpg']

efficiency_threshold = 18

matching_cars = [n > efficiency_threshold for n in mpg_vals]

print(
    list(
        it.compress(
            cars,
            matching_cars,
        )
    )
)


['toyota corona mark ii', 'plymouth duster', 'ford maverick', 'datsun pl510', 'volkswagen 1131 deluxe sedan', 'peugeot 504', 'audi 100 ls', 'saab 99e', 'bmw 2002', 'amc gremlin', 'datsun pl510', 'chevrolet vega 2300', 'toyota corona', 'ford pinto', 'amc gremlin', 'ford torino 500', 'chevrolet vega (sw)', 'pontiac firebird', 'mercury capri 2000', 'opel 1900', 'peugeot 304', 'fiat 124b', 'toyota corolla 1200', 'datsun 1200', 'volkswagen model 111', 'plymouth cricket', 'toyota corona hardtop', 'dodge colt hardtop', 'volkswagen type 3', 'chevrolet vega', 'ford pinto runabout', 'mazda rx2 coupe', 'volkswagen 411 (sw)', 'peugeot 504 (sw)', 'renault 12 (sw)', 'ford pinto (sw)', 'datsun 510 (sw)', 'toyouta corona mark ii (sw)', 'dodge colt (sw)', 'toyota corolla 1600 (sw)', 'plymouth duster', 'volkswagen super beetle', 'toyota carina', 'chevrolet vega', 'datsun 610', 'ford pinto', 'mercury capri v6', 'fiat 124 sport coupe', 'fiat 128', 'opel manta', 'audi 100ls', 'volvo 144ea', 'saab 99le', 't