# Grouping with Functions

Using Python functions is a more generic way of defining a group mapping than a dict or Series. Any function passed as a group key is called once per index value, and its return values are used as the group names. More concretely, consider the example DataFrame below, whose index values are the strings 'one' through 'five'. Suppose you want to group by the length of these labels. You could compute an array of string lengths, but it is simpler to pass the len function:

In [1]:
import pandas as pd
import numpy as np
from pandas import DataFrame, Series

In [2]:
people = DataFrame(np.arange(25).reshape(5, 5),
                   columns=['a', 'b', 'c', 'd', 'e'],
                   index=['one', 'two', 'three', 'four', 'five'])

In [11]:
people.groupby(len).sum()

|   | a | b | c | d | e |
|---|---|---|---|---|---|
| 3 | 5 | 7 | 9 | 11 | 13 |
| 4 | 35 | 37 | 39 | 41 | 43 |
| 5 | 10 | 11 | 12 | 13 | 14 |
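
To make the comparison with a precomputed array concrete, here is a minimal sketch that builds the array of label lengths by hand and checks that grouping by it gives the same result as passing len directly. The name name_lengths is introduced only for this illustration.

```python
import numpy as np
import pandas as pd

people = pd.DataFrame(np.arange(25).reshape(5, 5),
                      columns=['a', 'b', 'c', 'd', 'e'],
                      index=['one', 'two', 'three', 'four', 'five'])

# Explicit route: compute the length of every index label up front.
# (name_lengths is a hypothetical helper name, not part of the original.)
name_lengths = np.array([len(label) for label in people.index])

by_array = people.groupby(name_lengths).sum()

# Function route: pandas calls len once per index label for us.
by_function = people.groupby(len).sum()

# Both routes should produce the same grouped sums.
print(by_array.equals(by_function))
```

Passing the function saves the intermediate array and keeps the grouping rule visible at the call site.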


Mixing functions with arrays, lists, dicts, or Series is not a problem, because everything is converted to arrays internally:

In [19]:
key_list = ['one', 'two', 'one', 'two', 'two']

In [20]:
people.groupby([len, key_list]).min()

|   |   | a | b | c | d | e |
|---|---|---|---|---|---|---|
| 3 | one | 0 | 1 | 2 | 3 | 4 |
| 3 | two | 5 | 6 | 7 | 8 | 9 |
| 4 | two | 15 | 16 | 17 | 18 | 19 |
| 5 | one | 10 | 11 | 12 | 13 | 14 |
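
One way to picture the conversion to arrays is to build both keys explicitly and compare the result with the mixed call. This is a sketch of the observable behavior, not of the pandas internals; length_key and label_key are hypothetical names used only here.

```python
import numpy as np
import pandas as pd

people = pd.DataFrame(np.arange(25).reshape(5, 5),
                      columns=['a', 'b', 'c', 'd', 'e'],
                      index=['one', 'two', 'three', 'four', 'five'])
key_list = ['one', 'two', 'one', 'two', 'two']

# Mixed keys: a function plus a plain list.
mixed = people.groupby([len, key_list]).min()

# Explicit keys: two arrays, one value per row (illustrative names).
length_key = np.array([len(label) for label in people.index])
label_key = np.array(key_list)
explicit = people.groupby([length_key, label_key]).min()

# The two groupings should agree row for row.
print(mixed.equals(explicit))

# The result has a two-level index, so a group is addressed by a tuple.
print(mixed.loc[(4, 'two')])
```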


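The second key can be any sequence with one entry per row. Here list('12233') expands to the labels '1', '2', '2', '3', '3', so the second level of the result holds strings rather than integers:

In [27]: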
people.groupby([len, list('12233')]).sum()

|   |   | a | b | c | d | e |
|---|---|---|---|---|---|---|
| 3 | 1 | 0 | 1 | 2 | 3 | 4 |
| 3 | 2 | 5 | 6 | 7 | 8 | 9 |
| 4 | 3 | 35 | 37 | 39 | 41 | 43 |
| 5 | 2 | 10 | 11 | 12 | 13 | 14 |
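
The key function does not have to be a built-in such as len. As a small illustration of the "called once per index value" rule, the sketch below groups the same frame by the first letter of each label. The function first_letter is a hypothetical helper written for this example.

```python
import numpy as np
import pandas as pd

people = pd.DataFrame(np.arange(25).reshape(5, 5),
                      columns=['a', 'b', 'c', 'd', 'e'],
                      index=['one', 'two', 'three', 'four', 'five'])

def first_letter(label):
    """Return the first character of an index label (illustrative helper)."""
    return label[0]

# pandas calls first_letter on 'one', 'two', 'three', 'four', 'five'
# and uses the returned letters as the group names.
by_initial = people.groupby(first_letter).sum()
print(by_initial)
```

Here 'two' and 'three' fall into group 't', and 'four' and 'five' into group 'f', while 'one' forms a group of its own.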
