nunique + TimeGrouper error #12352

Closed
wavexx opened this Issue Feb 16, 2016 · 5 comments


wavexx commented Feb 16, 2016

This used to work in the past:

import pandas as pd

tmp = pd.DataFrame({
    'ID': {pd.Timestamp('2015-06-05 00:00:00'): '0010100903', pd.Timestamp('2015-06-08 00:00:00'): '0010150847'},
    'DATE': {pd.Timestamp('2015-06-05 00:00:00'): '2015-06-05', pd.Timestamp('2015-06-08 00:00:00'): '2015-06-08'}})
tmp.groupby(pd.TimeGrouper('D')).ID.nunique()

but now I get this obscure error:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    tmp.groupby(pd.TimeGrouper('D')).ID.nunique()
  File "/usr/lib/python3/dist-packages/pandas/core/groupby.py", line 2697, in nunique
    name=self.name)
  File "/usr/lib/python3/dist-packages/pandas/core/series.py", line 227, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 3736, in __init__
    ndim=1, fastpath=True)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 2454, in make_block
    placement=placement)
  File "/usr/lib/python3/dist-packages/pandas/core/internals.py", line 87, in __init__
    len(self.values), len(self.mgr_locs)))
ValueError: Wrong number of items passed 2, placement implies 4

wavexx commented Feb 16, 2016

The logically equivalent:

tmp.groupby(pd.TimeGrouper('D')).ID.apply(lambda x: x.nunique())

works as intended.
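For anyone who wants to compare the two calls side by side, here is a minimal sketch. It assumes `pd.Grouper(freq="D")` as a stand-in for `pd.TimeGrouper('D')` (the latter is not available in newer pandas releases), and it rebuilds the same two rows with a plain list-based constructor. The names `daily`, `via_apply` and `direct` are only illustrative. Daily binning between the two timestamps produces four bins, of which only two contain rows, which seems consistent with the "2" and "4" in the error message.

```python
import pandas as pd

# Same two rows as in the report, indexed by timestamp (three days apart).
tmp = pd.DataFrame(
    {
        "ID": ["0010100903", "0010150847"],
        "DATE": ["2015-06-05", "2015-06-08"],
    },
    index=pd.to_datetime(["2015-06-05", "2015-06-08"]),
)

# Assumption: pd.Grouper(freq="D") stands in for pd.TimeGrouper("D").
daily = tmp.groupby(pd.Grouper(freq="D"))

# Workaround from the comment above: count unique IDs per bin via apply.
via_apply = daily["ID"].apply(lambda x: x.nunique())

# Four daily bins (June 5 through June 8); the two middle ones are empty.
print(via_apply)

# Direct call; on the affected versions this is where the ValueError appeared.
try:
    direct = daily["ID"].nunique()
except ValueError as exc:
    direct = None
    print("nunique() failed:", exc)
```

On a release that includes the fix, `direct` should match `via_apply`; on an affected release the `except` branch runs and only the `apply` result is available.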

jreback added this to the 0.18.0 milestone Feb 16, 2016

Contributor

jreback commented Feb 16, 2016

hmm, does look buggy.

Contributor

jreback commented Feb 16, 2016

this last worked in 0.16.2 and started failing in 0.17.0 (it still fails in 0.18.0).

jreback modified the milestone: 0.18.1, 0.18.0 Feb 16, 2016

wavexx commented Feb 16, 2016

Thanks for tracking down the exact breaking point. I'm revisiting some code that I wrote for Python 2.7 with pandas 0.16.*, and I'm now porting it to Python 3 and pandas 0.17.1 (the version currently in Debian unstable).

Contributor

jreback commented Feb 16, 2016

yeah, there were some fixes related to this, but this one didn't take.

jreback added a commit to jreback/pandas that referenced this issue Feb 17, 2016

BUG: resample with nunique
closes #12352
d19c5fe

jreback modified the milestone: 0.18.0, 0.18.1 Feb 17, 2016

jreback closed this in f1aad46 Feb 17, 2016
