BUG: Inconsistent behavior of Groupby with None values with filter (#… #63178

koskampt · 2025-11-23T15:13:32Z

…62501)

closes BUG: Inconsistent behavior of Groupby with None values with filter #62501
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added an entry in the latest doc/source/whatsnew/v2.3.4.rst file if fixing a bug or adding a new feature.

rhshadrach

Thanks for the PR! Please always add tests. Does this also handle the tuple case on L667?

rhshadrach · 2025-11-23T17:52:15Z

pandas/core/groupby/groupby.py

                for name in names
            )

+        elif any(isna(k) for k in self.indices.keys()):


This check is expensive: it iterates over every key in self.indices, whereas the rest of the method is O(1) in terms of self.indices. The function is currently only ever called with names being a list of length 1, and it is called from the inner loop of DataFrameGroupBy.filter as we iterate over each group. This seems avoidable.

I believe we could change this function to just accept a single name (rather than a list) and then have a special case:

if isna(name): return self.indices.get(np.nan, [])

I think self.indices.get(np.nan, []) won't work, as the NaN value in self.indices cannot be accessed reliably before changing the keys from NaN to np.nan. I think I have a working solution though. I will supply the updated version of the PR tomorrow.

I think self.indices.get(np.nan, []) won't work, as the NaN value in self.indices cannot be accessed reliably before changing the keys from NaN to np.nan.

Isn't this what I suggested to do in #63178 (comment)?

I think I misread your first comment from two days ago. To make sure we are on the same page: we can change the function _get_indices(self, names) to _get_indices(self, name), that is, replace the list with a single name?

Yes! It is only ever used with a single name today.
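For readers following this thread, here is a minimal sketch of why the lookup can miss and how a single-name special case could look. It uses only the standard library: is_missing is a hypothetical stand-in for pandas' isna on scalar keys, NAN plays the role of np.nan, and get_index is an illustrative helper, not the actual _get_indices.

```python
import math

# Canonical NaN object; stands in for np.nan in this sketch.
NAN = float("nan")


def is_missing(key):
    """Hypothetical stand-in for pandas.isna on a scalar key."""
    return key is None or (isinstance(key, float) and math.isnan(key))


# A mapping whose missing-value key is a *different* NaN object.
other_nan = float("nan")
indices = {"a": [0, 1], other_nan: [2]}

# dict lookup matches on identity first, then equality. NaN != NaN,
# so a distinct NaN object never matches the stored key.
miss = indices.get(NAN, [])
hit = indices.get(other_nan, [])


def get_index(indices, name):
    """Look up a single group name, routing any missing value to NAN."""
    if is_missing(name):
        return indices.get(NAN, [])
    return indices.get(name, [])


# Normalizing the keys to the canonical NAN makes the lookup reliable.
normalized = {NAN if is_missing(k) else k: v for k, v in indices.items()}
```

With the normalized mapping, get_index(normalized, None) and get_index(normalized, float("nan")) both return the positions stored under the missing-value key, while the same calls against the unnormalized mapping return an empty list.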

rhshadrach · 2025-11-23T17:58:08Z

pandas/core/groupby/groupby.py

            names = (converter(name) for name in names)

-        return [self.indices.get(name, []) for name in names]
+        indices = {np.nan if isna(k) else k: v for k, v in self.indices.items()}


It seems better to do this directly in the indices cached property, and only in the case where there is a NaN value, guarded by a check such as: if not self.dropna and self.result_index.hasnans.

Good point, will adjust.
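A rough sketch of the idea in this thread, under stated assumptions: ToyGroupBy is a hypothetical toy class (not the pandas GroupBy), is_missing stands in for isna, and the key scan stands in for the result_index.hasnans check. The point it illustrates is that the key normalization runs once inside the cached property, so each per-group lookup stays O(1).

```python
import math
from functools import cached_property

# Canonical NaN object; stands in for np.nan in this sketch.
NAN = float("nan")


def is_missing(key):
    """Hypothetical stand-in for pandas.isna on a scalar key."""
    return key is None or (isinstance(key, float) and math.isnan(key))


class ToyGroupBy:
    """Toy illustration only; names and structure are assumptions."""

    def __init__(self, raw_indices, dropna=True):
        self._raw_indices = raw_indices
        self.dropna = dropna
        self.rebuilds = 0  # counts how often the keys were normalized

    @cached_property
    def indices(self):
        # Stand-in for `self.result_index.hasnans`; evaluated once.
        has_missing = any(is_missing(k) for k in self._raw_indices)
        if self.dropna or not has_missing:
            return self._raw_indices
        self.rebuilds += 1
        return {
            NAN if is_missing(k) else k: v
            for k, v in self._raw_indices.items()
        }

    def get_index(self, name):
        # O(1) per call: no iteration over the mapping here.
        key = NAN if is_missing(name) else name
        return self.indices.get(key, [])


gb = ToyGroupBy({"a": [0], None: [1, 2]}, dropna=False)
results = [gb.get_index(name) for name in ["a", None, float("nan"), "zzz"]]
```

However many groups are looked up, gb.rebuilds stays at 1 because cached_property stores the normalized mapping on first access.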

…andas-dev#62501)

…andas-dev#62501) - Add test cases - Add tuple support - Incorporate feedback

rhshadrach · 2025-11-25T22:42:50Z

@koskampt - I opened #63202 to give some idea of what I'm thinking. If you like that, you can incorporate it here. But I'm still open to alternative solutions that do not iterate through indices within _get_indices, for the reasons provided.

Even with such a solution, we will still want to see the results of running the groupby ASVs to evaluate the performance impact. I can also help here if desired.

koskampt requested a review from rhshadrach as a code owner November 23, 2025 15:13

rhshadrach reviewed Nov 23, 2025


T. Koskamp added 2 commits November 25, 2025 20:57

BUG: Inconsistent behavior of Groupby with None values with filter (p…

b2aed4a

…andas-dev#62501)

BUG: Inconsistent behavior of Groupby with None values with filter (p…

8d2126a

…andas-dev#62501) - Add test cases - Add tuple support - Incorporate feedback

koskampt force-pushed the bug-fix-grouby-with-none-values-with-filter branch from edd8a1f to 8d2126a Compare November 25, 2025 19:57
