BUG: group.apply with non-lexsorted levels and sort=True #14776

Closed
jreback opened this Issue Nov 30, 2016 · 0 comments

jreback commented Nov 30, 2016

In [1]: df = pd.DataFrame({'x': ['a', 'a', 'b', 'a'], 'y': [1, 1, 2, 2], 'z': [1, 2, 3, 4]}).set_index(['x', 'y'])

In [2]: df
Out[2]:
     z
x y
a 1  1
  1  2
b 2  3
a 2  4

In [3]: df.index
Out[3]:
MultiIndex(levels=[['a', 'b'], [1, 2]],
           labels=[[0, 0, 1, 0], [0, 0, 1, 1]],
           names=['x', 'y'])

In [5]: df.index.is_lexsorted()
Out[5]: False

In [6]: df.groupby(level=[0,1]).sum()
Out[6]:
     z
x y
a 1  3
  2  4
b 2  3

In [7]: df.groupby(level=[0,1]).apply(pd.DataFrame.drop_duplicates)
Exception: cannot handle a non-unique multi-index!

In [8]: df.sort_index().groupby(level=[0,1]).apply(pd.DataFrame.drop_duplicates)
Out[8]:
     z
x y
a 1  1
  1  2
  2  4
b 2  3

# this is ok though
In [9]: df.groupby(level=[0,1], sort=False).apply(pd.DataFrame.drop_duplicates)
Out[9]:
     z
x y
a 1  1
  1  2
b 2  3
a 2  4
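
For anyone hitting this on an affected release, the two working calls above can be collected into a standalone script. This is only a sketch of the workarounds, not the fix that went into pandas. It assumes a pandas version where `MultiIndex.is_monotonic_increasing` is available and uses it as a stand-in for `is_lexsorted()`, which newer releases deprecate; the two checks are not strictly equivalent, though both report False for this index. The variable names are illustrative.

```python
import pandas as pd

# Same frame as in the report: the (x, y) index is non-unique and not sorted.
df = pd.DataFrame(
    {"x": ["a", "a", "b", "a"], "y": [1, 1, 2, 2], "z": [1, 2, 3, 4]}
).set_index(["x", "y"])

# Assumption: is_monotonic_increasing stands in for is_lexsorted() here.
index_is_sorted = df.index.is_monotonic_increasing

# Plain aggregation works on the unsorted index.
summed = df.groupby(level=[0, 1]).sum()

# Workaround 1: sort the index before grouping.
sorted_result = (
    df.sort_index()
    .groupby(level=[0, 1])
    .apply(pd.DataFrame.drop_duplicates)
)

# Workaround 2: keep the original order and tell groupby not to sort.
unsorted_result = (
    df.groupby(level=[0, 1], sort=False)
    .apply(pd.DataFrame.drop_duplicates)
)

print(index_is_sorted)
print(summed)
print(sorted_result)
print(unsorted_result)
```

The failing call from In [7] is deliberately left out, since on releases that include the fix it should no longer raise.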

jreback added this to the Next Major Release milestone Nov 30, 2016

mrocklin added a commit to mrocklin/dask that referenced this issue Nov 30, 2016

Don't force sort on groupby levels ef363e6

jreback added a commit to jreback/pandas that referenced this issue Nov 30, 2016

BUG: Bug in a groupby of a non-lexsorted MultiIndex and multiple grouping levels

closes #14776
73bd00a

jreback added a commit to jreback/pandas that referenced this issue Nov 30, 2016

BUG: Bug in a groupby of a non-lexsorted MultiIndex and multiple grouping levels

closes #14776
7b0bf12

jreback added a commit to jreback/pandas that referenced this issue Nov 30, 2016

BUG: Bug in a groupby of a non-lexsorted MultiIndex and multiple grouping levels

closes #14776
2204602

jreback added a commit to jreback/pandas that referenced this issue Nov 30, 2016

BUG: Bug in a groupby of a non-lexsorted MultiIndex and multiple grouping levels

closes #14776
cf31905

jreback closed this in f23010a Dec 4, 2016

jreback modified the milestone: 0.19.2, Next Major Release Dec 4, 2016

jorisvandenbossche added a commit that referenced this issue Dec 15, 2016

jreback + jorisvandenbossche [Backport #14777] BUG: Bug in a groupby of a non-lexsorted MultiIndex
closes #14776

Author: Jeff Reback <jeff@reback.net>

Closes #14777 from jreback/mi_sort and squashes the following commits:

cf31905 [Jeff Reback] BUG: Bug in a groupby of a non-lexsorted MultiIndex and multiple grouping levels

(cherry picked from commit f23010a)
04b83e0