## Overview of itertools groupby

Let us understand how to use `itertools.groupby` to perform aggregations by key.

* `itertools.groupby` groups consecutive elements of an iterable that share the same key.
* Combined with an aggregate function applied to each group, it covers use cases such as the following.
  * Get count by order status.
  * Get revenue for each order.
  * Get order count by month.
* The data must be pre-sorted by the same key, because `groupby` starts a new group every time the key changes. Sorting ensures that all the values associated with each key end up in a single group.
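As a minimal sketch of the first use case, the snippet below counts orders by status. The `orders` records and the `status_of` helper are made up for illustration and are not part of any real data set. It imports the module under its full name, `itertools`, so that the built-in `iter` is not shadowed.

```python
import itertools

# Hypothetical (order_id, order_status) records, made up for illustration.
orders = [
    (1, "CLOSED"),
    (2, "PENDING"),
    (3, "COMPLETE"),
    (4, "CLOSED"),
    (5, "COMPLETE"),
    (6, "COMPLETE"),
]


def status_of(order):
    """Return the grouping key (the order status) of a record."""
    return order[1]


# Sort by the same key used for grouping so that equal statuses are adjacent.
orders_sorted = sorted(orders, key=status_of)

# Aggregate each group by counting its elements.
count_by_status = {
    status: len(list(group))
    for status, group in itertools.groupby(orders_sorted, key=status_of)
}

print(count_by_status)  # {'CLOSED': 2, 'COMPLETE': 3, 'PENDING': 1}
```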

In [1]:
import itertools as iter

In [2]:
iter.groupby?

Init signature: iter.groupby(self, /, *args, **kwargs)
Docstring:
groupby(iterable, key=None) -> make an iterator that returns consecutive
keys and groups from the iterable.  If the key function is not specified or
is None, the element itself is used for grouping.
Type:           type
Subclasses:


In [18]:
l = [1, 1, 3, 2, 1, 3, 2]

In [4]:
l_grouped = iter.groupby(l)

In [5]:
type(l_grouped)

itertools.groupby

In [6]:
list(l_grouped)

[(1, <itertools._grouper at 0x7feccc8ac6d8>),
 (3, <itertools._grouper at 0x7feccc8ac5c0>),
 (2, <itertools._grouper at 0x7feccc8ac438>),
 (1, <itertools._grouper at 0x7feccc8ac208>),
 (3, <itertools._grouper at 0x7feccc8ac668>),
 (2, <itertools._grouper at 0x7feccc8ac630>)]

In [7]:
l_sorted = sorted(l)

In [8]:
ls_grouped = iter.groupby(l_sorted)

In [9]:
list(ls_grouped)

[(1, <itertools._grouper at 0x7feccc8ac0b8>),
 (2, <itertools._grouper at 0x7feccc8acba8>),
 (3, <itertools._grouper at 0x7feccc8ac978>)]
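The two results above differ because `groupby` only merges adjacent equal elements. The following sketch puts the two cases side by side using the same list, looking only at the keys that are produced.

```python
import itertools

l = [1, 1, 3, 2, 1, 3, 2]

# Without sorting, every run of equal adjacent values becomes its own group,
# so the same key can show up more than once.
unsorted_keys = [key for key, _ in itertools.groupby(l)]

# After sorting, equal values are adjacent, so each key appears exactly once.
sorted_keys = [key for key, _ in itertools.groupby(sorted(l))]

print(unsorted_keys)  # [1, 3, 2, 1, 3, 2]
print(sorted_keys)    # [1, 2, 3]
```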

```{note}
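We rebuild `l_sorted` and `ls_grouped` here because `ls_grouped` is an iterator: once it has been read by `list(ls_grouped)` it is exhausted and yields nothing more. Strictly, only `ls_grouped` needs to be recreated, as `l_sorted` is a plain list.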
```
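The behaviour described in the note can be checked with a short sketch. It reads the same `groupby` object twice and materializes each group with `list` while iterating.

```python
import itertools

l_sorted = sorted([1, 1, 3, 2, 1, 3, 2])
ls_grouped = itertools.groupby(l_sorted)

# First pass: convert each group to a list while we are still positioned on it.
first_pass = [(key, list(values)) for key, values in ls_grouped]

# Second pass: the groupby iterator has already been consumed.
second_pass = list(ls_grouped)

print(first_pass)   # [(1, [1, 1, 1]), (2, [2, 2]), (3, [3, 3])]
print(second_pass)  # []
```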

In [10]:
l_sorted = sorted(l)

In [11]:
ls_grouped = iter.groupby(l_sorted)

In [12]:
for e in ls_grouped:
    print(list(e[1]))

[1, 1, 1]
[2, 2]
[3, 3]


In [32]:
l_sorted = sorted(l)
ls_grouped = iter.groupby(l_sorted)
list(iter.starmap(lambda key, values: (key, len(list(values))), ls_grouped))

[(1, 3), (2, 2), (3, 2)]
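The same pattern extends to other aggregations by passing a `key` function and replacing `len` with another aggregate. Below is a minimal sketch for revenue per order, assuming hypothetical `(order_id, subtotal)` records that are made up for illustration. Whole-number amounts are used to keep the sums exact.

```python
import itertools
from operator import itemgetter

# Hypothetical (order_id, subtotal) order items, made up for illustration.
order_items = [
    (2, 200),
    (1, 300),
    (2, 250),
    (4, 50),
    (2, 130),
    (4, 150),
]

# The same key function is used for both sorting and grouping.
get_order_id = itemgetter(0)
items_sorted = sorted(order_items, key=get_order_id)

# Sum the subtotals within each group to get the revenue of each order.
revenue_per_order = [
    (order_id, sum(subtotal for _, subtotal in items))
    for order_id, items in itertools.groupby(items_sorted, key=get_order_id)
]

print(revenue_per_order)  # [(1, 300), (2, 580), (4, 200)]
```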