## Overview of itertools groupby

Let us understand how to use `itertools.groupby` to perform aggregations by key.

* `itertools.groupby` can be used to get the data grouped by a key.
* It supports use cases such as the following, where an aggregate function is applied after grouping by key.
  * Get count by order status.
  * Get revenue for each order.
  * Get order count by month.
* `itertools.groupby` only groups consecutive elements that share a key, so the data must be pre-sorted by that key. Otherwise the same key shows up in more than one group.
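As a preview of the first use case, here is a minimal sketch of getting the count by order status. The `orders` records and the `get_status` helper are hypothetical and made up for illustration. The same key function is passed to both `sorted` and `itertools.groupby` so that the sort order matches the grouping key.

```python
import itertools

# Hypothetical (order_id, order_status) records, made up for illustration.
orders = [
    (1, "CLOSED"),
    (2, "PENDING"),
    (3, "COMPLETE"),
    (4, "CLOSED"),
    (5, "COMPLETE"),
    (6, "COMPLETE"),
]


def get_status(order):
    """Return the order status, which is the second field of the record."""
    return order[1]


# Sort by the grouping key first so that equal statuses are adjacent.
orders_sorted = sorted(orders, key=get_status)

# Count the records in each group.
count_by_status = {
    status: len(list(group))
    for status, group in itertools.groupby(orders_sorted, key=get_status)
}

print(count_by_status)  # {'CLOSED': 2, 'COMPLETE': 3, 'PENDING': 1}
```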

In [1]:
import itertools as iter  # note: this alias shadows the built-in iter() in this notebook

In [2]:
iter.groupby?

Init signature: iter.groupby(iterable, key=None)
Docstring:
make an iterator that returns consecutive keys and groups from the iterable

iterable
  Elements to divide into groups according to the key function.
key
  A function for computing the group category for each element.
  If the key function is not specified or is None, the element itself
  is used for grouping.
Type:           type
Subclasses:


In [3]:
l = [1, 1, 3, 2, 1, 3, 2]

In [4]:
l_grouped = iter.groupby(l)

In [5]:
type(l_grouped)

itertools.groupby

In [6]:
list(l_grouped)

[(1, <itertools._grouper at 0x1eb971f9dc0>),
 (3, <itertools._grouper at 0x1eb971f9ac0>),
 (2, <itertools._grouper at 0x1eb971f99a0>),
 (1, <itertools._grouper at 0x1eb971f9c40>),
 (3, <itertools._grouper at 0x1eb971ffe20>),
 (2, <itertools._grouper at 0x1eb971ffb20>)]

In [8]:
l_sorted = sorted(l)
l_sorted

[1, 1, 1, 2, 2, 3, 3]

In [9]:
ls_grouped = iter.groupby(l_sorted)

In [10]:
list(ls_grouped)

[(1, <itertools._grouper at 0x1eb971f97c0>),
 (2, <itertools._grouper at 0x1eb971f9340>),
 (3, <itertools._grouper at 0x1eb971f9940>)]

```{note}
We rebuild `l_sorted` and `ls_grouped` because `ls_grouped` is an iterator. It is exhausted once it has been read by `list(ls_grouped)`.
```
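The exhaustion can be seen with a small sketch. The names `values`, `grouped`, `first_pass` and `second_pass` are only illustrative. Reading the same `groupby` object a second time yields nothing, which is why it has to be rebuilt before each use.

```python
import itertools

values = sorted([1, 1, 3, 2, 1, 3, 2])
grouped = itertools.groupby(values)

# The first pass consumes the iterator completely.
first_pass = [key for key, _ in grouped]

# The second pass finds nothing left to read.
second_pass = [key for key, _ in grouped]

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
```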

In [11]:
l_sorted = sorted(l)

In [12]:
ls_grouped = iter.groupby(l_sorted)

In [13]:
for key, values in ls_grouped:
    print(list(values))

[1, 1, 1]
[2, 2]
[3, 3]


In [14]:
l_sorted = sorted(l)
ls_grouped = iter.groupby(l_sorted)
list(iter.starmap(lambda key, values: (key, len(list(values))), ls_grouped))

[(1, 3), (2, 2), (3, 2)]
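The same pattern extends to records with more than one field by passing a `key` function. Below is a minimal sketch of getting the revenue for each order, assuming hypothetical `(order_id, subtotal)` records. The `order_items` data is made up for illustration, and `operator.itemgetter(0)` is used as the key for both sorting and grouping.

```python
import itertools
from operator import itemgetter

# Hypothetical (order_id, subtotal) records, made up for illustration.
order_items = [
    (2, 200.0),
    (1, 300.0),
    (2, 250.0),
    (4, 50.0),
    (2, 130.0),
    (4, 300.0),
]

# Use the same key function for sorting and for grouping.
get_order_id = itemgetter(0)
items_sorted = sorted(order_items, key=get_order_id)

# Sum the subtotals within each group to get the revenue per order.
revenue_per_order = [
    (order_id, sum(subtotal for _, subtotal in items))
    for order_id, items in itertools.groupby(items_sorted, key=get_order_id)
]

print(revenue_per_order)  # [(1, 300.0), (2, 580.0), (4, 350.0)]
```

Here `sum` replaces `len` as the aggregate function. Any function that accepts an iterable, such as `min` or `max`, can be applied to each group in the same way.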