Commit

sort grouper if series/dataframe, sort df in key/name
mrocklin committed Feb 1, 2018
1 parent 5a90cca commit a3e5b4d
Showing 2 changed files with 8 additions and 1 deletion.
8 changes: 7 additions & 1 deletion dask/dataframe/groupby.py
@@ -141,7 +141,13 @@ def _groupby_raise_unaligned(df, **kwargs):
 def _groupby_slice_apply(df, grouper, key, func):
     # No need to use raise if unaligned here - this is only called after
     # shuffling, which makes everything aligned already
-    df = df.sort_values(grouper)
+    if isinstance(grouper, (pd.DataFrame, pd.Series, pd.Index)):
+        grouper = grouper.sort_values()
+    else:
+        try:
+            df = df.sort_values(grouper)
+        except KeyError:  # this fails when the grouper includes the index
+            pass
     g = df.groupby(grouper)
     if key:
         g = g[key]
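
The sorting branch in this hunk can be sketched outside dask roughly as follows. `sort_before_groupby` is a hypothetical helper, not a dask function. The sketch assumes plain pandas, handles only Series and Index groupers in its first branch, and uses a made-up column name `missing` to exercise the `KeyError` path.

```python
import pandas as pd


def sort_before_groupby(df, grouper):
    """Hypothetical helper mirroring the sorting branch in the hunk above."""
    if isinstance(grouper, (pd.Series, pd.Index)):
        # Series.sort_values() and Index.sort_values() need no arguments.
        grouper = grouper.sort_values()
    else:
        try:
            # grouper is a column name or a list of column names.
            df = df.sort_values(grouper)
        except KeyError:
            # sort_values could not resolve the key; leave df untouched.
            pass
    return df, grouper


df = pd.DataFrame({"a": [2, 1, 2, 1], "b": [10, 20, 30, 40]})

# Case 1: grouping by a column name sorts the frame itself.
sorted_df, key = sort_before_groupby(df, "a")
by_name = sorted_df.groupby(key)["b"].sum()

# Case 2: grouping by a Series sorts the grouper and leaves the frame alone.
same_df, sorted_key = sort_before_groupby(df, df["a"])
by_series = same_df.groupby(sorted_key)["b"].sum()

# Case 3: a key that sort_values cannot find is skipped silently.
unchanged_df, _ = sort_before_groupby(df, "missing")
```

In both working cases the per-group sums are the same; only the object that gets sorted differs. Whether a named index level raises `KeyError` in `sort_values` depends on the pandas version, so the third case uses a key that is absent altogether.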
1 change: 1 addition & 0 deletions docs/source/changelog.rst
@@ -18,6 +18,7 @@ DataFrame
 +++++++++
 
 - Support month timedeltas in repartition(freq=...) (:pr:`3110`) `Matthew Rocklin`_
+- Sort grouper values prior to groupby-apply (:pr:`3118`) `Matthew Rocklin`_
 
 Bag
 +++