BUG/ENH: groupby.rolling.agg / Column(s) low already selected #15072

Closed
Earthson opened this issue Jan 6, 2017 · 2 comments

Comments

3 participants
@Earthson
commented Jan 6, 2017

GroupBy.rolling.agg fails with "Column(s) low already selected"

import pandas as pd

data = pd.DataFrame({"stock": [1, 1, 1, 2, 2, 2, 2], "low": [10, 20, 30, 10, 30, 40, 80]})
data.set_index("stock", inplace=True)
# fails: raises "Exception: Column(s) low already selected"
data.groupby(level="stock").rolling(2).agg({"low": {"mean": "mean", "max": "max"}})
# works: the same aggregation on a plain (ungrouped) rolling window
data.rolling(2).agg({"low": {"mean": "mean", "max": "max"}})

The same error also occurs with groupby().resample().agg:

In [13]: df = pd.DataFrame({"A": pd.to_datetime(['2015', '2017']), "B": [1, 1]})

In [14]: df
Out[14]:
           A  B
0 2015-01-01  1
1 2017-01-01  1

In [15]: df.set_index("A").groupby([0, 0]).resample("AS")
Out[15]: DatetimeIndexResamplerGroupby [freq=<YearBegin: month=1>, axis=0, closed=left, label=left, convention=e, base=0]

In [16]: df.set_index("A").groupby([0, 0]).resample("AS").agg(['sum', 'count'])
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-16-5f1c18a8d4ac> in <module>()
----> 1 df.set_index("A").groupby([0, 0]).resample("AS").agg(['sum', 'count'])

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/resample.py in aggregate(self, arg, *args, **kwargs)
    339
    340         self._set_binner()
--> 341         result, how = self._aggregate(arg, *args, **kwargs)
    342         if result is None:
    343             result = self._groupby_and_aggregate(arg,

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
    538             return self._aggregate_multiple_funcs(arg,
    539                                                   _level=_level,
--> 540                                                   _axis=_axis), None
    541         else:
    542             result = None

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate_multiple_funcs(self, arg, _level, _axis)
    583                 try:
    584                     colg = self._gotitem(col, ndim=1, subset=obj[col])
--> 585                     results.append(colg.aggregate(arg))
    586                     keys.append(col)
    587                 except (TypeError, DataError):

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/resample.py in aggregate(self, arg, *args, **kwargs)
    339
    340         self._set_binner()
--> 341         result, how = self._aggregate(arg, *args, **kwargs)
    342         if result is None:
    343             result = self._groupby_and_aggregate(arg,

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate(self, arg, *args, **kwargs)
    538             return self._aggregate_multiple_funcs(arg,
    539                                                   _level=_level,
--> 540                                                   _axis=_axis), None
    541         else:
    542             result = None

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _aggregate_multiple_funcs(self, arg, _level, _axis)
    582             for col in obj:
    583                 try:
--> 584                     colg = self._gotitem(col, ndim=1, subset=obj[col])
    585                     results.append(colg.aggregate(arg))
    586                     keys.append(col)

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in _gotitem(self, key, ndim, subset)
    675                        for attr in self._attributes])
    676         self = self.__class__(subset,
--> 677                               groupby=self._groupby[key],
    678                               parent=self,
    679                               **kwargs)

~/Envs/pandas-dev/lib/python3.6/site-packages/pandas/pandas/core/base.py in __getitem__(self, key)
    241         if self._selection is not None:
    242             raise Exception('Column(s) {selection} already selected'
--> 243                             .format(selection=self._selection))
    244
    245         if isinstance(key, (list, tuple, ABCSeries, ABCIndexClass,

Exception: Column(s) B already selected

Problem description

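This does not work as expected: the groupby rolling aggregation raises an exception, while the same aggregation on a plain rolling object succeeds.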

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-53-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2+0.g825876c.dirty
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.8
boto: 2.42.0
pandas_datareader: None

@jreback

Contributor

commented Jan 6, 2017

Sophisticated .agg was never explicitly implemented on .groupby.rolling, so it is not surprising that this doesn't work. If you'd like to debug it, please do. The individual aggregations do work:

In [12]: data.groupby(level="stock").rolling(2).mean()
Out[12]:
              low
stock stock
1     1       NaN
      1      15.0
      1      25.0
2     2       NaN
      2      20.0
      2      35.0
      2      60.0

In [13]: data.groupby(level="stock").rolling(2).max()
Out[13]:
              low
stock stock
1     1       NaN
      1      20.0
      1      30.0
2     2       NaN
      2      30.0
      2      40.0
      2      80.0
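
Since the single-function calls above succeed, one possible workaround (a sketch, not a fix for the underlying bug) is to compute each statistic separately and assemble the results by hand. The names grouped_low, rolling_mean, rolling_max and result are illustrative only, and the sketch assumes that both rolling results share the same index, which should hold because they come from the same grouping.

```python
import pandas as pd

data = pd.DataFrame(
    {"stock": [1, 1, 1, 2, 2, 2, 2], "low": [10, 20, 30, 10, 30, 40, 80]}
).set_index("stock")

# Select the column on the groupby first, then apply the rolling window.
# This avoids calling .agg on the groupby rolling object at all.
grouped_low = data.groupby(level="stock")["low"]

# Compute each statistic with its own single-function call.
rolling_mean = grouped_low.rolling(2).mean()
rolling_max = grouped_low.rolling(2).max()

# Assemble the two results side by side. Raw values are used so that
# no index alignment is attempted on the (duplicated) index labels.
result = pd.DataFrame(
    {"mean": rolling_mean.to_numpy(), "max": rolling_max.to_numpy()},
    index=rolling_mean.index,
)
print(result)
```

The first row of each group is NaN because a window of 2 has only one observation there. Note also that the nested-dict renaming form used in the original report was deprecated in later pandas releases, so building the column names explicitly as above may be the more portable choice.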

@jreback jreback added this to the Next Major Release milestone Jan 6, 2017

@jreback jreback changed the title Column(s) low already selected BUG/ENH: groupby.rolling.agg / Column(s) low already selected Jan 6, 2017

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 1, 2017

@jreback jreback added the Prio-high label Mar 29, 2017

@jreback jreback modified the milestones: Next Major Release, 0.20.0, Next Minor Release Mar 29, 2017

@mahnunchik

This comment has been minimized.

commented Sep 22, 2017
