ENH: specify missing keys in KeyError when passing list-like to .loc #34272
Comments
Does this need to be fixed? The user guide (https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike) says this behavior is deprecated.
Well, I'm not proposing that the behavior be reinstated, just that the KeyError that is thrown be more verbose about what keys are actually missing.
I agree. What I meant is that this was purposely deprecated in favor of `.reindex()`, as per the docs.
I may have expressed myself poorly, sorry if that's the case. When I've run into this error, I thought all the columns I was requesting were present, but that actually wasn't the case. Many times I had DataFrames with hundreds of columns where maybe just one was missing from the list-like indexer, generally because of a typo in the original CSV I had read. Since the error gives no information about which columns were actually missing from my DataFrame, I'd then have to do something like: `missing_cols = [c for c in my_list if c not in my_df.columns]`. What I wish is that this search for the missing columns happened out of the box when raising the KeyError (the missing columns would be listed in the KeyError message). If that were not possible due to performance, at least the first missing column could be mentioned, without hindering performance.
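The manual check described above can be wrapped in a small helper. This is only a sketch in plain Python, not pandas code: `find_missing_labels` is a hypothetical name, and the plain lists below stand in for `my_list` and `my_df.columns`.

```python
def find_missing_labels(requested, available):
    """Return the requested labels that are absent from `available`, in order.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # Build a set once so each membership test is O(1) on average.
    available_set = set(available)
    return [label for label in requested if label not in available_set]


# Plain lists stand in for `my_list` and `my_df.columns`.
columns = ["id", "name", "price"]
wanted = ["id", "nmae", "price"]  # "nmae" is a typo for "name"
print(find_missing_labels(wanted, columns))
```

With a real DataFrame the call would presumably look like `find_missing_labels(my_list, my_df.columns)`, assuming the column labels are hashable.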
I also think this feature would be helpful, and a useful enhancement for debugging missing or misspelled columns. Since the missing labels would only be looked up once the missing-columns condition has already been met (pandas/core/indexing.py:1288), I don't think this would impact performance in most cases. I would be happy to try and make the PR.
Yep, PRs welcome here.
I'm a big fan of the strong limitation when passing list-likes to `.loc[]`. However, when a key is missing and the KeyError is raised, it'd be helpful to know which key or keys are missing. I understand this could have performance implications, since each key of the list-like object would have to be searched, but I also think the UX would be improved.
At least, the first missing key could be reported, without hindering performance. I'm happy to provide a PR for this if provided with some guidelines (I suppose this affects both Series and DataFrames, as well as maybe some other parts of the code I'm not aware of).
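To make the two options concrete (list every missing key, or only the first one), here is a rough sketch in plain Python. `check_labels`, its `report_all` parameter, and the message wording are all hypothetical and do not reflect the actual pandas implementation.

```python
def check_labels(requested, available, report_all=True):
    """Raise KeyError naming the missing labels; return None if all are present.

    Hypothetical sketch, not pandas code. With report_all=False the scan
    stops at the first missing label, so less work is done on failure.
    """
    available_set = set(available)
    missing = []
    for label in requested:
        if label not in available_set:
            missing.append(label)
            if not report_all:
                break  # only the first missing label is reported
    if missing:
        raise KeyError(f"labels not found: {missing}")


try:
    check_labels(["a", "x", "y"], ["a", "b", "c"])
except KeyError as err:
    # err.args[0] holds the message that names the missing labels
    print(err.args[0])
```

Passing `report_all=False` trades completeness for an early exit, which matches the "at least the first missing key" fallback.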