MultiIndex groupby bugs in 0.7.3, 0.8.0b1 and 0.8.0dev #1401

Closed
ruidc opened this issue Jun 5, 2012 · 3 comments

ruidc commented Jun 5, 2012

import pandas

l = [['count', 'values'], ['to filter', '']]
midx = pandas.MultiIndex.from_tuples(l)

# Case 1: one row
df = pandas.DataFrame([[1L, 'A']], columns=midx)
print(df.groupby('to filter').groups)
# Out: {'to filter': [0L]}  # was expecting 'A': [0L]
print(df.groupby([('to filter', '')]).groups)
# Out: {'to filter': [0L]}  # was expecting same as above

# Case 2: two rows, different groups
df = pandas.DataFrame([[1L, 'A'], [2L, 'B']], columns=midx)
print(df.groupby('to filter').groups)
# Out: {'A': [0L], 'B': [1L]}  # fine
print(df.groupby([('to filter', '')]).groups)
# Out: {'': [1L], 'to filter': [0L]}  # was expecting same as above

# Case 3: two rows, same group
df = pandas.DataFrame([[1L, 'A'], [2L, 'A']], columns=midx)
print(df.groupby('to filter').groups)
# Out: {'A': [0L, 1L]}  # fine
print(df.groupby([('to filter', '')]).groups)
# Out: {'': [1L], 'to filter': [0L]}  # was expecting same as above


ruidc commented Jun 5, 2012


wesm commented Jun 11, 2012

Fixed these bugs. The fix involves a few little hacks, but nothing I can't live with.

wesm closed this as completed Jun 11, 2012

ruidc commented Jun 12, 2012

Thanks!
Perhaps I'm just nitpicking, but in the tests, wouldn't it be better to check for the explicit expected results rather than only comparing the one-level groupby against the two-level groupby? If both returned the same incorrect result, a comparison alone would still pass.
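
To make the suggestion concrete, here is a rough sketch of what an explicit-result check might look like. It is not the test that was committed for this issue; the helper name `groups_as_lists` and the constant `KEY` are made up for illustration, and it is written for a current Python 3 / pandas install (so plain ints instead of `1L`). It only exercises the fully qualified two-level key, passed as a single tuple label:

```python
import pandas as pd

# Same column layout as in the report: a two-level header whose
# second column has an empty string on the inner level.
columns = pd.MultiIndex.from_tuples([("count", "values"), ("to filter", "")])

# Hypothetical constant: the fully qualified two-level column key.
KEY = ("to filter", "")


def groups_as_lists(rows):
    """Group `rows` by KEY and return {group label: [row labels]}."""
    df = pd.DataFrame(rows, columns=columns)
    groups = df.groupby(KEY).groups
    # Convert each Index of row labels to a plain list for easy comparison.
    return {label: idx.tolist() for label, idx in groups.items()}


one_row = groups_as_lists([[1, "A"]])
different_groups = groups_as_lists([[1, "A"], [2, "B"]])
same_group = groups_as_lists([[1, "A"], [2, "A"]])

# Assert against literal expected values, not against another groupby call.
assert one_row == {"A": [0]}
assert different_groups == {"A": [0], "B": [1]}
assert same_group == {"A": [0, 1]}
```

The same literal expectations could then be repeated for the one-level key `'to filter'`, so that each spelling of the key is checked against a known answer instead of against the other.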

yarikoptic added a commit to neurodebian/pandas that referenced this issue Jun 21, 2012
Version 0.8.0 beta 2

* tag 'v0.8.0b2': (37 commits)
  RLS: 0.8.0 beta 2
  BUG: bytes_to_str for read_csv
  BUG: import BytesIO for py3compat
  BUG: fix compat errors for yahoo data reader
  ENH: convert datetime.datetime ourselves, 15x speedup
  Make tox work across versions of Python from 2.5 to 3.2
  Reenable py31 and py32 in .travis.yml
  TST: test coverage
  TST: oops, delete stray line
  REF: factor out ujson extension into pandasjson for now
  TST: eliminate copies in datetime64 serialization; don't copy data in DatetimeIndex, close pandas-dev#1320
  DOC: refresh time zone docs close pandas-dev#1447
  BUG: always raise exception when concat keys aren't found in passed levels, close pandas-dev#1406
  ENH: implement passed quantile array to qcut and document that plus factors, close pandas-dev#1407
  ENH: clearer out of bounds error message in cut/qcut, close pandas-dev#1409
  ENH: allow renaming of index levels when concatenating, close pandas-dev#1419
  BUG: fix MultiIndex bugs described in pandas-dev#1401
  DOC: release notes
  BUG: implement multiple DataFrame.join / merge on non-unique indexes by multiple merges, close pandas-dev#1421
  REF: remove offset names from pandas namespace
  ...