Skip to content

Commit

Permalink
REGR: Fixed handling of Categorical in cython ops (#31668)
Browse files Browse the repository at this point in the history
Fixed by falling back to the wrapped Python version when the cython
version raises NotImplementedError.

Closes #31450
  • Loading branch information
TomAugspurger committed Feb 5, 2020
1 parent 1996b17 commit d3b7e64
Show file tree
Hide file tree
Showing 3 changed files with 20 additions and 1 deletion.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.1.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ Fixed regressions
- Fixed regression when indexing a ``Series`` or ``DataFrame`` indexed by ``DatetimeIndex`` with a slice containg a :class:`datetime.date` (:issue:`31501`)
- Fixed regression in ``DataFrame.__setitem__`` raising an ``AttributeError`` with a :class:`MultiIndex` and a non-monotonic indexer (:issue:`31449`)
- Fixed regression in :class:`Series` multiplication when multiplying a numeric :class:`Series` with >10000 elements with a timedelta-like scalar (:issue:`31457`)
- Fixed regression in ``.groupby()`` aggregations with categorical dtype using Cythonized reduction functions (e.g. ``first``) (:issue:`31450`)
- Fixed regression in :meth:`GroupBy.apply` if called with a function which returned a non-pandas non-scalar object (e.g. a list or numpy array) (:issue:`31441`)
- Fixed regression in :meth:`DataFrame.groupby` whereby taking the minimum or maximum of a column with period dtype would raise a ``TypeError``. (:issue:`31471`)
- Fixed regression in :meth:`to_datetime` when parsing non-nanosecond resolution datetimes (:issue:`31491`)
Expand Down
4 changes: 3 additions & 1 deletion pandas/core/groupby/groupby.py
Original file line number Diff line number Diff line change
Expand Up @@ -1394,7 +1394,9 @@ def func(self, numeric_only=numeric_only, min_count=min_count):
except DataError:
pass
except NotImplementedError as err:
if "function is not implemented for this dtype" in str(err):
if "function is not implemented for this dtype" in str(
err
) or "category dtype not supported" in str(err):
# raised in _get_cython_function, in some cases can
# be trimmed by implementing cython funcs for more dtypes
pass
Expand Down
16 changes: 16 additions & 0 deletions pandas/tests/groupby/aggregate/test_aggregate.py
Original file line number Diff line number Diff line change
Expand Up @@ -377,6 +377,22 @@ def test_agg_index_has_complex_internals(index):
tm.assert_frame_equal(result, expected)


def test_agg_cython_category_not_implemented_fallback():
# https://github.com/pandas-dev/pandas/issues/31450
df = pd.DataFrame({"col_num": [1, 1, 2, 3]})
df["col_cat"] = df["col_num"].astype("category")

result = df.groupby("col_num").col_cat.first()
expected = pd.Series(
[1, 2, 3], index=pd.Index([1, 2, 3], name="col_num"), name="col_cat"
)
tm.assert_series_equal(result, expected)

result = df.groupby("col_num").agg({"col_cat": "first"})
expected = expected.to_frame()
tm.assert_frame_equal(result, expected)


class TestNamedAggregationSeries:
def test_series_named_agg(self):
df = pd.Series([1, 2, 3, 4])
Expand Down

0 comments on commit d3b7e64

Please sign in to comment.