ENH: groupby aggregate with multi-level columns #9585

Open
jorisvandenbossche opened this issue Mar 4, 2015 · 1 comment
Comments

@jorisvandenbossche
Member

See http://stackoverflow.com/questions/28833074/aggregate-group-with-multi-level-columns

So do we want to have some built-in functionality for this?

The example:

import itertools

import numpy as np
import pandas as pd

lev1 = ['foo', 'bar', 'baz']
lev2 = list('abc')

n = 6

# One random column per (lev1, lev2) pair; the tuple keys produce multi-level columns
df = pd.DataFrame({k: np.random.randn(n) for k in itertools.product(lev1, lev2)},
                  index=pd.date_range(start='2015-01-01', periods=n, freq='11D'))
             bar               baz               foo            
               a     b     c     a     b     c     a     b     c
2015-01-01 -1.11  2.12 -1.00  0.18  0.14  1.24  0.73  0.06  3.66
2015-01-12 -1.43  0.75  0.38  0.04 -0.33 -0.42  1.00 -1.63 -1.35
2015-01-23  0.01 -1.70 -1.39  0.59 -1.10 -1.17 -1.51 -0.54 -1.11
2015-02-03  0.93  0.70 -0.12  1.07 -0.97 -0.45 -0.19  0.11 -0.79
2015-02-14  0.30  0.49  0.60 -0.28 -0.38  1.11  0.15  0.78 -0.58
2015-02-25 -0.26  0.51  0.82  0.05 -1.45  0.14  0.53 -0.33 -1.35

The question here is whether it should be possible in groupby().aggregate() to specify that you want to apply a function to all columns under a certain top-level label.
E.g. df.groupby(pd.TimeGrouper('MS')).aggregate({'bar': np.sum, 'baz': np.mean, 'foo': np.min}) does not work at the moment.

Or does this lead too far?
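
In the meantime, one possible workaround is to aggregate each top-level block on its own and concatenate the pieces. This is only a sketch: `aggregate_by_top_level` is a hypothetical helper and not part of pandas, the seeded random data is illustrative, and `pd.Grouper(freq=...)` is used as the grouper because newer pandas versions no longer provide `pd.TimeGrouper`.

```python
import itertools

import numpy as np
import pandas as pd


def aggregate_by_top_level(df, freq, spec):
    """Hypothetical helper: apply one aggregation per top-level column label.

    `spec` maps a top-level label to an aggregation name, e.g. {'bar': 'sum'}.
    """
    pieces = {
        # df[top] selects every second-level column under `top`
        top: df[top].groupby(pd.Grouper(freq=freq)).agg(func)
        for top, func in spec.items()
    }
    # The dict keys become the top level of the resulting column MultiIndex
    return pd.concat(pieces, axis=1)


lev1 = ['foo', 'bar', 'baz']
lev2 = list('abc')
n = 6

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
df = pd.DataFrame(
    {k: rng.standard_normal(n) for k in itertools.product(lev1, lev2)},
    index=pd.date_range(start='2015-01-01', periods=n, freq='11D'),
)

result = aggregate_by_top_level(
    df, 'MS', {'bar': 'sum', 'baz': 'mean', 'foo': 'min'}
)
print(result)
```

The result keeps the two-level column layout, with one row per month start. Top-level labels that are missing from `spec` are dropped, which may or may not be the behaviour a built-in version should have.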

@jorisvandenbossche changed the title from "ENH: groupby aggregate" to "ENH: groupby aggregate with multi-level columns" on Mar 4, 2015
@jreback
Contributor

jreback commented Mar 4, 2015

xref #9052, #8593
We need a way to specify how to do this, something like pd.Summary(....)

What you are doing above should work, but I think it is just NotImplemented ATM.
