BUG: inconsistency in Index.__getitem__ for Numpy and non-numpy dtypes #55820

Closed
3 tasks done
phofl opened this issue Nov 4, 2023 · 2 comments · Fixed by #56055
Labels
API - Consistency Internal Consistency of API/Behavior Bug Deprecate Functionality to remove in pandas Index Related to the Index class or subclasses

Comments

@phofl
Member

phofl commented Nov 4, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd
ser = pd.Index([1, 2], dtype="Int64")
ser[np.array([], dtype=np.bool_)]

This raises, whereas the same operation with a NumPy dtype, e.g. "int64", works.

Traceback (most recent call last):
  File "/Users/patrick/Library/Application Support/JetBrains/PyCharm2023.2/scratches/scratch.py", line 477, in <module>
    ser[np.array([], dtype=np.bool_)]
  File "/Users/patrick/PycharmProjects/pandas/pandas/core/indexes/base.py", line 5360, in __getitem__
    result = getitem(key)
  File "/Users/patrick/PycharmProjects/pandas/pandas/core/arrays/masked.py", line 182, in __getitem__
    item = check_array_indexer(self, item)
  File "/Users/patrick/PycharmProjects/pandas/pandas/core/indexers/utils.py", line 539, in check_array_indexer
    raise IndexError(
IndexError: Boolean index has wrong length: 0 instead of 2
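The traceback ends in check_array_indexer, which is also exposed publicly as pandas.api.indexers.check_array_indexer. The following is a minimal sketch, assuming the public function validates mask length the same way as the internal call in the traceback; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import check_array_indexer

# A masked (non-NumPy) array like the one backing the "Int64" Index above.
values = pd.array([1, 2], dtype="Int64")
empty_mask = np.array([], dtype=np.bool_)

# Validate the empty boolean mask against the two-element array.
try:
    check_array_indexer(values, empty_mask)
    outcome = "accepted"
except IndexError as err:
    outcome = f"IndexError: {err}"

# A mask whose length matches the array passes validation.
checked = check_array_indexer(values, np.array([True, False]))

print(outcome)
print(checked)
```

This isolates the length check from Index.__getitem__, so the error can be reproduced without going through the Index at all.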

Issue Description

Not sure what we want to expect here. The Series case raises for all dtypes.

cc @jbrockmendel

Expected Behavior

See above

Installed Versions

Replace this line with the output of pd.show_versions()

@phofl phofl added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 4, 2023
@jbrockmendel
Member

requiring matching length for a mask seems right to me, so the numpy version looks wrong

@rhshadrach rhshadrach added Index Related to the Index class or subclasses API - Consistency Internal Consistency of API/Behavior labels Nov 5, 2023
@jorisvandenbossche
Member

I suppose the underlying reason this works for the numpy version is that numpy arrays seem to allow this:

>>> arr = np.array([1, 2], dtype="int64")
>>> arr[np.array([], dtype=np.bool_)]
array([], dtype=int64)

I would have expected numpy to raise here, though. And I agree we can be more consistent here.
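To make the contrast concrete, here is a small sketch that tries masks of several lengths on the same array. The helper describe_mask_indexing is hypothetical, and the outcome for the empty mask is recorded rather than assumed, since it may depend on the NumPy version.

```python
import numpy as np

arr = np.array([1, 2], dtype="int64")


def describe_mask_indexing(mask):
    """Return the selected values as a list, or the error text if indexing fails."""
    try:
        return arr[mask].tolist()
    except IndexError as err:
        return f"IndexError: {err}"


results = {
    # The zero-length mask from the example above.
    "empty": describe_mask_indexing(np.array([], dtype=np.bool_)),
    # A non-empty mask that is shorter than the array.
    "too_short": describe_mask_indexing(np.array([True])),
    # A mask whose length matches the array.
    "matching": describe_mask_indexing(np.array([True, False])),
}

for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

If the empty case prints an empty list while the too-short case prints an IndexError, then only the zero-length mask escapes the length check.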

If this is longstanding behaviour (and matching numpy), we might want to deprecate it first.
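One way such a deprecation could look is sketched below. This is not the pandas implementation: take_with_mask is a hypothetical helper operating on plain NumPy arrays, and the warning text is made up for illustration.

```python
import warnings

import numpy as np


def take_with_mask(values, mask):
    """Hypothetical helper: boolean-mask selection with a deprecation path."""
    values = np.asarray(values)
    mask = np.asarray(mask)
    if mask.dtype != np.bool_:
        raise TypeError("mask must have boolean dtype")

    if len(mask) == len(values):
        return values[mask]

    if len(mask) == 0:
        # Longstanding behaviour: keep returning an empty result for now,
        # but warn that a length mismatch will raise in the future.
        warnings.warn(
            "Indexing with a zero-length boolean mask is deprecated and "
            "will raise in a future version.",
            FutureWarning,
            stacklevel=2,
        )
        return values[:0]

    raise IndexError(
        f"Boolean index has wrong length: {len(mask)} instead of {len(values)}"
    )
```

Calling take_with_mask(np.array([1, 2]), np.array([], dtype=np.bool_)) still returns an empty array but emits a FutureWarning, while any other mismatched length raises immediately.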

@phofl phofl added Deprecate Functionality to remove in pandas and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 19, 2023