BUG: Fix return type of loc/iloc #61054
Co-authored-by: Parthi <parthimyself90@gmail.com>
Thanks for the PR!
pandas/core/indexing.py
Outdated
        and isinstance(key, int)
        and isinstance(new_key, (list, slice))
    ):
        out = out.infer_objects()
This will infer non-object dtype even on data that is object-dtype, no? I do not think this is the right change.
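A minimal sketch of the concern above, run against plain pandas rather than this branch. It assumes an illustrative object-dtype Series holding Python integers; infer_objects() converts it to int64, so calling it unconditionally would discard a dtype the user chose on purpose.

```python
import pandas as pd

# Illustrative data: integers stored deliberately as object dtype.
s = pd.Series([11, 33], index=[1, 2], dtype=object)

# infer_objects() soft-converts object columns to a more specific dtype.
inferred = s.infer_objects()

print(s.dtype)         # object
print(inferred.dtype)  # int64
```

The values are unchanged; only the dtype differs, which is exactly the information an object-dtype frame is expected to keep.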
Thanks @rhshadrach!
I've also confirmed what you've mentioned.
>>> import pandas as pd
>>> df = pd.DataFrame({1: [11, 22], 2: [33, 44], "a": [55, 66]}, dtype=object)
>>> df.loc[0, [1, 2]]
1    11
2    33
Name: 0, dtype: int64
>>> df[[1, 2]].loc[0]
1    11
2    33
Name: 0, dtype: object
I'll do some further investigation to fix this.
I think the way to do it would be by changing _getitem_lowerdim to reverse the order in which we do the indexing (I think the loop on L1069). Would need to try it to see if that breaks anything. Best guess would be MultiIndex cases (possibly their perf?). I think a bunch of _getitem_lowerdim dates back to when Panel and PanelND existed, so it may be more general than it needs to be.
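Why the order of indexing matters can be sketched with public API only. This is not what _getitem_lowerdim does internally; the frame below is a hypothetical mixed-dtype example, and the two chained selections simply stand in for "row first" and "columns first".

```python
import pandas as pd

# Hypothetical mixed-dtype frame: two integer columns and one string column.
df = pd.DataFrame({1: [11, 22], 2: [33, 44], "a": ["x", "y"]})

# Row first: the full row spans integer and string columns, so it is object
# dtype, and the later column selection does not re-infer it.
row_first = df.loc[0].loc[[1, 2]]

# Columns first: only the integer columns remain when the row is taken.
cols_first = df[[1, 2]].loc[0]

print(row_first.dtype)   # object
print(cols_first.dtype)  # int64
```

Both results hold the same values; selecting columns before taking the row is what keeps the narrower dtype without any extra inference step.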
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Description of Linked Issue
loc/iloc returns an inconsistent dtype. This behaviour seems to arise from the following sequence:

1. BlockManager.fast_xs() returns a cross-section of df, determining the dtype as object, since df.loc[0, :] is supposed to include 'a'.
2. NDFrame._reindex_with_indexers() returns the result without additionally inferring its dtype.

Proposed Solution
Based on the above examples, we can conclude that this issue only appears where axis[0]=int and axis[1]=list/slice - loc[int/slice]. Therefore, I'd like to propose adding code that additionally infers the dtype after the column selection.
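The sequence described under Description of Linked Issue can be approximated from the outside, as a rough sketch. It assumes a hypothetical frame with a string column "a", and uses df.loc[0, :] and Series.reindex as public stand-ins for the internal fast_xs and _reindex_with_indexers steps rather than calling them directly.

```python
import pandas as pd

# Hypothetical frame; column "a" forces a full-row cross-section to object.
df = pd.DataFrame({1: [11, 22], 2: [33, 44], "a": ["x", "y"]})

row = df.loc[0, :]              # cross-section over every column, including "a"
selected = row.reindex([1, 2])  # column selection afterwards; dtype is carried over

print(row.dtype)       # object
print(selected.dtype)  # object
```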
Thanks!