PERF: groupby.first and groupby.last fallback to apply for EAs #57591

Open
rhshadrach opened this issue Feb 23, 2024 · 3 comments
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Groupby Performance Memory or execution speed performance Reduction Operations sum, mean, min, max, etc.

Comments

@rhshadrach
Member

#57589 showed that groupby.first and groupby.last fall back to apply for certain EAs (extension arrays). They should not fall back.
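To make the concern concrete, here is a minimal sketch, assuming an interval-backed column (one of the dtypes that shows up in the CI failures for the linked PR). It calls groupby.first through the public API, then computes the same result by grouping integer positions and calling take on the extension array, so the per-group work never touches the extension dtype. The helper first_via_positions is hypothetical and written only for illustration; it is not how pandas implements first, and not necessarily how this issue would be fixed.

```python
import numpy as np
import pandas as pd

# Interval dtype is one of the extension dtypes named in the CI failures.
df = pd.DataFrame(
    {
        "key": ["a", "b", "a", "b"],
        "val": pd.arrays.IntervalArray.from_breaks([0, 1, 2, 3, 4]),
    }
)

# The public API call under discussion.
first_builtin = df.groupby("key")["val"].first()


def first_via_positions(frame, key, col):
    """Hypothetical helper: first non-null value per group via integer positions."""
    values = frame[col].array
    mask = np.asarray(frame[col].notna())
    # Keep only the positions of non-null values, since first() skips nulls.
    positions = pd.Series(np.arange(len(frame))[mask])
    keys = frame[key].to_numpy()[mask]
    # Grouping an int64 Series does not involve the extension dtype at all.
    first_pos = positions.groupby(keys).first()
    # A single take on the extension array gathers the selected values.
    out = values.take(first_pos.to_numpy())
    return pd.Series(out, index=pd.Index(first_pos.index, name=key), name=col)


first_manual = first_via_positions(df, "key", "val")
print(first_builtin)
print(first_manual)
```

Both Series should hold the same interval per group. The point of the sketch is only that the selection can be expressed as one grouped operation on integers followed by one take, rather than a Python-level call per group.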

@rhshadrach rhshadrach added Groupby Performance Memory or execution speed performance ExtensionArray Extending pandas with custom dtypes or arrays. Reduction Operations sum, mean, min, max, etc. labels Feb 23, 2024
@phofl
Member

phofl commented Feb 25, 2024

Do you know which types of EAs fall back?

@rhshadrach
Member Author

From the CI run in the linked PR:

FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[decimal128(7, 3)] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[string] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[binary] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[time32[s]] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[time32[ms]] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[time64[us]] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[time64[ns]] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[date32[day]] - NotImplementedError
FAILED pandas/tests/extension/test_arrow.py::TestArrowArray::test_groupby_agg_extension[date64[ms]] - NotImplementedError
FAILED pandas/tests/extension/test_numpy.py::TestNumpyExtensionArray::test_groupby_agg_extension[float] - NotImplementedError: function is not implemented for this dtype: float64
FAILED pandas/tests/extension/test_numpy.py::TestNumpyExtensionArray::test_groupby_agg_extension[object] - NotImplementedError: function is not implemented for this dtype: object
FAILED pandas/tests/extension/test_sparse.py::TestSparseArray::test_groupby_agg_extension[0] - NotImplementedError: function is not implemented for this dtype: Sparse[float64, 0]
FAILED pandas/tests/extension/test_sparse.py::TestSparseArray::test_groupby_agg_extension[nan] - NotImplementedError: function is not implemented for this dtype: Sparse[float64, nan]
FAILED pandas/tests/extension/decimal/test_decimal.py::TestDecimalArray::test_groupby_agg_extension - NotImplementedError: function is not implemented for this dtype: decimal
FAILED pandas/tests/extension/test_interval.py::TestIntervalArray::test_groupby_agg_extension - NotImplementedError: function is not implemented for this dtype: interval[float64, right]
FAILED pandas/tests/extension/json/test_json.py::TestJSONArray::test_groupby_agg_extension - NotImplementedError: function is not implemented for this dtype: json
= 16 failed, 221228 passed, 6664 skipped, 1811 xfailed, 94 xpassed, 40 warnings in 1525.24s (0:25:25) =

@phofl
Member

phofl commented Feb 25, 2024

Thanks. None of those are really surprising, unfortunately.
