
# Code examples for Python's `itertools` module
https://www.youtube.com/watch?v=Qu3dThVy6KQ

In [0]:
import itertools

## count

In [6]:
counter = itertools.count(start=5, step=10.001)
x = [3, 4, 10, 11, 19, 24]
list(zip(counter, x))

[(5, 3),
 (15.001, 4),
 (25.002, 10),
 (35.003, 11),
 (45.004, 19),
 (55.004999999999995, 24)]
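
The last value above (`55.004999999999995`) shows floating-point error building up as `count` adds the step repeatedly. One possible workaround, a minimal sketch using the multiplicative pattern, is to compute each value from an integer counter instead. The names `start`, `step`, `multiplicative` and `pairs` are illustrative.

```python
import itertools

start, step = 5, 10.001
x = [3, 4, 10, 11, 19, 24]

# Compute each value as start + step * i rather than adding step repeatedly,
# so rounding error is not carried over from one element to the next.
multiplicative = (start + step * i for i in itertools.count())
pairs = list(zip(multiplicative, x))
print(pairs)
```

Each element is computed independently here, so an error in one value does not propagate to the following ones.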

In [5]:
list(zip(itertools.count(start=5, step=10), [3, 4, 10, 11, 19, 24]))

[(5, 3), (15, 4), (25, 10), (35, 11), (45, 19), (55, 24)]

## zip_longest

In [7]:
x = list(itertools.zip_longest(range(100), x))
x[:20]

[(0, 3),
 (1, 4),
 (2, 10),
 (3, 11),
 (4, 19),
 (5, 24),
 (6, None),
 (7, None),
 (8, None),
 (9, None),
 (10, None),
 (11, None),
 (12, None),
 (13, None),
 (14, None),
 (15, None),
 (16, None),
 (17, None),
 (18, None),
 (19, None)]
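
By default the shorter iterable is padded with `None`, as in the output above. A small sketch of the `fillvalue` argument, which swaps in a different padding value (the value `-1` and the names `values` and `padded` are arbitrary choices for illustration):

```python
import itertools

values = [3, 4, 10, 11, 19, 24]

# fillvalue replaces the default None padding for the shorter iterable.
padded = list(itertools.zip_longest(range(8), values, fillvalue=-1))
print(padded)
# [(0, 3), (1, 4), (2, 10), (3, 11), (4, 19), (5, 24), (6, -1), (7, -1)]
```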

## cycle

In [9]:
counter = itertools.cycle(["on", "off"])
x = [3, 4, 10, 11, 19, 24]
list(zip(counter, x))

[('on', 3), ('off', 4), ('on', 10), ('off', 11), ('on', 19), ('off', 24)]

## repeat

In [10]:
counter = itertools.repeat(2, times=3)
print(next(counter))
print(next(counter))
print(next(counter))

print(next(counter))  # the iterator is exhausted after 3 values, so this raises StopIteration

2
2
2


StopIteration: ignored
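
The fourth `next` call above fails because the iterator only holds three values. A minimal sketch of one way to avoid the exception: pass a default to the built-in `next`, which is returned once the iterator is exhausted. The name `values` is illustrative.

```python
import itertools

counter = itertools.repeat(2, times=3)

# next(iterator, default) returns the default instead of raising
# StopIteration when the iterator has no more items.
values = [next(counter, None) for _ in range(4)]
print(values)
# [2, 2, 2, None]
```

A plain `for` loop over the iterator is another option, since it stops cleanly at exhaustion.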

## Example: square every number

In [12]:
print(list(map(pow, range(10), itertools.repeat(2))))
print(list(itertools.starmap(pow, [(0, 2), (1, 2), (2, 2), (3, 2), (4, 2), (5, 2), (6, 2), (7, 2), (8, 2), (9, 2)])))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


## Combinations and Permutations

In [17]:
x = [0, 1, 2, 3, 4]
print(list(itertools.combinations(x, 2)))
print(list(itertools.combinations_with_replacement(x, 2)))
print(list(itertools.permutations(x, 2)))
print(list(itertools.product(x, repeat=2)))

[(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]
[(0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3)]
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)]
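
The four results above differ in whether order matters and whether an element may be repeated. As a sanity check, the sketch below compares the number of tuples each function yields with the standard counting formulas. It assumes Python 3.8+ for `math.comb` and `math.perm`; the names `n`, `r` and `counts` are illustrative.

```python
import itertools
import math

x = [0, 1, 2, 3, 4]
n, r = len(x), 2

counts = {
    # order does not matter, no repeats: C(n, r)
    "combinations": len(list(itertools.combinations(x, r))),
    # order does not matter, repeats allowed: C(n + r - 1, r)
    "combinations_with_replacement": len(
        list(itertools.combinations_with_replacement(x, r))
    ),
    # order matters, no repeats: P(n, r)
    "permutations": len(list(itertools.permutations(x, r))),
    # order matters, repeats allowed: n ** r
    "product": len(list(itertools.product(x, repeat=r))),
}
print(counts)

assert counts["combinations"] == math.comb(n, r)
assert counts["combinations_with_replacement"] == math.comb(n + r - 1, r)
assert counts["permutations"] == math.perm(n, r)
assert counts["product"] == n ** r
```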


## Chain
Chains iterables together, i.e. iterates through the first iterable, then the second, and so on until every iterable in the chain is exhausted.

In [20]:
letters = ["a", "b", "c", "d"]
numbers = [0, 1, 2]
names = ["rishab", "ashm", "ravish", "rohan", "banerjee", "manav", "ankush", "kaushik"]

list(itertools.chain(letters, numbers, names))

['a',
 'b',
 'c',
 'd',
 0,
 1,
 2,
 'rishab',
 'ashm',
 'ravish',
 'rohan',
 'banerjee',
 'manav',
 'ankush',
 'kaushik']
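
When the iterables are already collected in a single list (or are themselves produced lazily), `itertools.chain.from_iterable` does the same job without unpacking them as separate arguments. A short sketch with a shortened, illustrative `groups` list:

```python
import itertools

groups = [["a", "b", "c", "d"], [0, 1, 2], ["rishab", "ashm"]]

# Equivalent to itertools.chain(*groups), but takes one iterable of iterables.
flat = list(itertools.chain.from_iterable(groups))
print(flat)
# ['a', 'b', 'c', 'd', 0, 1, 2, 'rishab', 'ashm']
```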

## islice: Slicing on an iterator
**Why is it useful?**

The dataset might be too big to fit in memory, for example 100000000000000 elements, while we only need it in intervals of, say, 100000 elements. `islice` takes such a slice lazily from an iterator, without building the whole dataset as a list first.

This is useful when, for example, we have to read a large file not line by line but in steps of 5 lines.


In [21]:
list(itertools.islice(range(10), 1, 5, 2))

[1, 3]
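
The file use case mentioned above can be sketched as follows. Here `io.StringIO` stands in for a real file object (an assumption made so the example is self-contained); with an actual file, the object returned by `open(...)` would be passed to `islice` in the same way.

```python
import io
import itertools

# Stand-in for a large file: 20 lines, "line 0" through "line 19".
fake_file = io.StringIO("".join(f"line {i}\n" for i in range(20)))

# start=0, stop=None (read to the end), step=5: keep every 5th line.
every_fifth = [
    line.rstrip("\n") for line in itertools.islice(fake_file, 0, None, 5)
]
print(every_fifth)
# ['line 0', 'line 5', 'line 10', 'line 15']
```

The skipped lines are still read from the file, but only the selected ones are kept in memory.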

## Compress and Filter
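Filter elements based on a function or on a second iterable of selectors.

The difference between `filter` and `compress` is that `filter` calls a function on each element to decide whether it should be included, whereas `compress` takes a second iterable of truthy/falsy selectors and keeps the elements whose selector is true. `itertools.filterfalse` is the inverse of `filter`: it keeps the elements for which the function returns false.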

In [26]:
names = ["rishab", "ashm", "ravish", "rohan", "banerjee", "manav", "ankush", "kaushik"]
isMarried = [False, True, False, True, False, True, True, True]
numbers = [0, 1, 6, 10, 1, 2, 5, 11]

def less_than_5(numb):
  return numb < 5

print(list(itertools.compress(names, isMarried)))
print(list(itertools.filterfalse(less_than_5, numbers)))
print(list(filter(less_than_5, numbers)))

['ashm', 'rohan', 'manav', 'ankush', 'kaushik']
[6, 10, 5, 11]
[0, 1, 1, 2]
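
To make the relationship between the two concrete, the sketch below builds the selectors for `compress` from the same condition that `filter` applies, so both produce the same result. The names `selectors`, `via_compress` and `via_filter` are illustrative.

```python
import itertools

numbers = [0, 1, 6, 10, 1, 2, 5, 11]

# Precompute one boolean per element, then let compress pick the True ones.
selectors = [n < 5 for n in numbers]
via_compress = list(itertools.compress(numbers, selectors))

# filter evaluates the condition itself, element by element.
via_filter = list(filter(lambda n: n < 5, numbers))

print(via_compress)
print(via_filter)
# both print [0, 1, 1, 2]
```

`compress` is the natural choice when the selectors already exist as data (as with `isMarried` above), and `filter` when the decision is computed from the element itself.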


## Accumulate
Yields running (accumulated) results over an iterable. By default it adds the values, producing running sums; a different two-argument function can be passed as the second argument.

In [31]:
numbers = [1, 1, 6, 10, 1, 2, 5, 11]

# add
print(list(itertools.accumulate(numbers)))

# product
import operator
print(list(itertools.accumulate(numbers, operator.mul)))


[1, 2, 8, 18, 19, 21, 26, 37]
[1, 1, 6, 60, 60, 120, 600, 6600]
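
Any two-argument function can serve as the accumulator. A brief sketch using the built-in `max` to get a running maximum, plus the `initial` keyword (available from Python 3.8) to seed the running sum with a starting value; `100` is an arbitrary value for illustration.

```python
import itertools

numbers = [1, 1, 6, 10, 1, 2, 5, 11]

# Running maximum: each output is the largest value seen so far.
running_max = list(itertools.accumulate(numbers, max))
print(running_max)
# [1, 1, 6, 10, 10, 10, 10, 11]

# initial= prepends a starting value, so the output has one extra element.
seeded_sum = list(itertools.accumulate(numbers, initial=100))
print(seeded_sum)
# [100, 101, 102, 108, 118, 119, 121, 126, 137]
```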


## Group By
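Goes through an iterable and groups consecutive values that share the same key. Because only adjacent items are grouped, the input should already be ordered by that key, as the `people` list below is by state.

Returns a stream of tuples: (the key the items were grouped on, an iterator over the items in that group).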

In [0]:
people = [
    {
        'name': 'John Doe',
        'city': 'Gotham',
        'state': 'NY'
    },
    {
        'name': 'Jane Doe',
        'city': 'Kings Landing',
        'state': 'NY'
    },
    {
        'name': 'Corey Schafer',
        'city': 'Boulder',
        'state': 'CO'
    },
    {
        'name': 'Al Einstein',
        'city': 'Denver',
        'state': 'CO'
    },
    {
        'name': 'John Henry',
        'city': 'Hinton',
        'state': 'WV'
    },
    {
        'name': 'Randy Moss',
        'city': 'Rand',
        'state': 'WV'
    },
    {
        'name': 'Nicole K',
        'city': 'Asheville',
        'state': 'NC'
    },
    {
        'name': 'Jim Doe',
        'city': 'Charlotte',
        'state': 'NC'
    },
    {
        'name': 'Jane Taylor',
        'city': 'Faketown',
        'state': 'NC'
    }
]

In [0]:
def get_state(person):
    return person['state']

In [37]:
person_group = itertools.groupby(people, get_state)
for key, group in person_group:
  print(key, group)

NY <itertools._grouper object at 0x7f142215bcf8>
CO <itertools._grouper object at 0x7f142215b630>
WV <itertools._grouper object at 0x7f142215bcf8>
NC <itertools._grouper object at 0x7f142215b630>
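
The output above only shows the `_grouper` iterator objects. The sketch below, using a shortened and deliberately unsorted `people` list (illustrative data), shows two things: how to materialize each group, and why sorting by the key first matters, since `groupby` only merges adjacent items.

```python
import itertools

people = [
    {"name": "John Doe", "state": "NY"},
    {"name": "Corey Schafer", "state": "CO"},
    {"name": "Jane Doe", "state": "NY"},
    {"name": "Al Einstein", "state": "CO"},
]

def get_state(person):
    return person["state"]

# Without sorting, the same key shows up once per run of adjacent items.
unsorted_keys = [key for key, _ in itertools.groupby(people, get_state)]
print(unsorted_keys)
# ['NY', 'CO', 'NY', 'CO']

# Sorting by the same key first puts equal keys next to each other.
sorted_people = sorted(people, key=get_state)
grouped = {
    key: [person["name"] for person in group]
    for key, group in itertools.groupby(sorted_people, get_state)
}
print(grouped)
# {'CO': ['Corey Schafer', 'Al Einstein'], 'NY': ['John Doe', 'Jane Doe']}
```

Each group iterator should be consumed before moving on to the next key, because advancing the `groupby` object invalidates the previous group.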


## tee
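Copies an iterator: `tee` splits a single iterator into several independent iterators (two by default).

After an iterator has been passed to `tee`, the original should not be used directly, because advancing it would make the copies skip values.

The copies `x` and `y` are independent of each other, i.e. `next(x)` does not affect `y`.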

In [0]:
x = [0, 1, 2, 3, 4]
combination_iterator = itertools.combinations(x, 2)
x, y = itertools.tee(combination_iterator)
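
A quick check of that independence, as a minimal sketch (the names `first` and `second` are used instead of `x` and `y` so the input list is not overwritten):

```python
import itertools

x = [0, 1, 2, 3, 4]
combination_iterator = itertools.combinations(x, 2)
first, second = itertools.tee(combination_iterator)

# Advance the first copy twice.
a = next(first)
b = next(first)

# The second copy still starts from the beginning.
c = next(second)

print(a, b, c)
# (0, 1) (0, 2) (0, 1)
```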
