KeyError on slicing with datetime or pandas.Timestamp #5821
Your index is not monotonic (i.e. not sorted). This is more of an incorrect error report. If you sort it, it works (with exact index labels or not).
This should be a ValueError, I think; so it's a 'bug' in the error report.
@jtratner agree?
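A minimal sketch of the point above, using made-up data rather than the reporter's file: check whether the index is monotonic, and sort it before slicing with a label that is not present. The series `s` and the `start` label are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: a DatetimeIndex whose labels are deliberately out of order.
idx = pd.DatetimeIndex(['2014-01-03', '2014-01-01', '2014-01-02', '2014-01-05'])
s = pd.Series([3, 1, 2, 5], index=idx)

# A quick way to detect the situation described in this thread.
print(s.index.is_monotonic_increasing)

# A label that does not occur in the index.
start = pd.Timestamp('2014-01-04')

# On the unsorted index this is the kind of slice the thread reports as
# failing with KeyError; it is wrapped so the script keeps running either way.
try:
    s.loc[start:]
except KeyError as exc:
    print('KeyError on unsorted index:', exc)

# After sorting, slicing with the absent label is well defined.
result = s.sort_index().loc[start:]
print(result)
```

Checking `is_monotonic_increasing` up front gives a clearer signal than the KeyError itself, which only names the missing label.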
@jreback thanks for pointing out the problem with my dataset! I was not aware of it.
closing as not a bug
Hi there, I know this issue is closed, since it's not a bug. I would argue, though, that the error message could point the user in the right direction (I googled the error message and it led me here).
I second Kristian. And I'll provide a bit more context which can be helpful:
Which throws
If you have a nice reproducible example, please open a new issue (and xref this one). This is well-defined behavior. I closed this because it's not a bug, though it could/should be a
As requested by @jreback, I hereby create a nice reproducible example:
import pandas as pd
index = pd.date_range('2016-10-29 23:00', '2016-10-30 3:00', freq='15T', tz='UTC')
index = index.tz_convert('Europe/Brussels').tz_localize(None)
ts = pd.Series(1, index=index)
ts.truncate(before='2016-10-30 2:10')
which raises:
In the above example, the index is not sorted because the naive local time spans the switch from summer to winter time. Slicing the data as:
ts[pd.Timestamp('2016-10-30 2:10'):]
raises the same KeyError. Truncating with a time that is contained in the index works just fine (unless you choose a non-unique label):
ts.truncate(before='2016-10-30 1:30')
Interestingly,
ts['2016-10-30 2:10':]
raises no error and returns the correct (non-sorted!) result, even when choosing a duplicate label. All of these operations work on the sorted series, e.g.:
ts.sort_index().truncate(before='2016-10-30 2:10')
though the result is then sorted as well, of course. I'm using pandas version 0.19.1.
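The following sketch rebuilds the index from the example above and inspects why it is unsorted. It assumes a pandas version that accepts the `'15min'` frequency alias, used here in place of `'15T'` because newer releases deprecate the latter; the variable names `naive` and `truncated` are illustrative.

```python
import pandas as pd

# Same construction as in the thread, with '15min' instead of '15T'.
index = pd.date_range('2016-10-29 23:00', '2016-10-30 03:00',
                      freq='15min', tz='UTC')

# Dropping the timezone after converting keeps the local wall-clock times,
# so the hour that is repeated when DST ends shows up twice.
naive = index.tz_convert('Europe/Brussels').tz_localize(None)
ts = pd.Series(1, index=naive)

# Both properties explain the behaviour discussed above.
print(ts.index.is_monotonic_increasing)
print(ts.index.is_unique)

# The workaround from the thread: sort first, then truncate.
truncated = ts.sort_index().truncate(before='2016-10-30 2:10')
print(truncated.index.min())
```

If the original ordering matters, keeping the index timezone-aware (skipping `tz_localize(None)`) avoids the repeated hour altogether, at the cost of working with aware timestamps.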
@kdebrab the only thing that would be nicer would be the actual KeyError message for a not-found label (it should show it as a Timestamp). So I would take a fix for that. You can open a new issue, or push a PR if that works for you.
The
So which raises an error, because you are slicing with a non-present value on a non-sorted index.
I think it would really be nice to have a better error message for this, as the difference between sorted and not sorted can be very subtle, and with a sorted index, slicing with a non-present label is perfectly fine. @jreback The only thing I don't find directly clear is why the same example works with strings.
Shouldn't this be equivalent to
This works with strings because a string turns into partial timestamp indexing and is thus a slice, and hence works. So maybe we ought to always make truncate a slice; then this will just work (internally a scalar is being passed and NOT a slice), so this is pretty easy to 'fix'. And I agree that this should work. So let's create 2 new issues for this.
@jreback I thought partial strings only turned into a slice when the string has lower resolution than the series. In the case of
So I think it's always treated as a slice (in this example).
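For readers who want the order-preserving result regardless of how a given pandas version treats string versus Timestamp slices, a boolean mask on the index is one alternative. This is a sketch with made-up data, not a description of what pandas does internally; `s`, `cutoff`, and `selected` are hypothetical names.

```python
import pandas as pd

# Hypothetical unsorted index, loosely modelled on the DST example.
idx = pd.DatetimeIndex([
    '2016-10-30 02:30', '2016-10-30 02:45',
    '2016-10-30 02:00', '2016-10-30 02:15',
    '2016-10-30 03:00',
])
s = pd.Series(range(5), index=idx)

# The cutoff does not need to be present in the index.
cutoff = pd.Timestamp('2016-10-30 02:10')

# An element-wise comparison needs no sorted index and keeps the
# original row order of the series.
selected = s[s.index >= cutoff]
print(selected)
```

The mask compares every label, so it scans the whole index instead of using the binary search that a sorted index allows.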
This is essentially a workaround around pandas-dev/pandas#5821.
So this was never settled?
related #6066 (on Float64Index too)
The 'KeyError' on slicing has been discussed multiple times, but I'm still not sure whether the issue below is a bug or just my misunderstanding.
I'm experiencing 'KeyError' from time to time when I try to slice my dataframes with datetime or Timestamp objects; slicing with strings, however, works perfectly. I was unable to construct a synthetic example with pandas.date_range, so I had to upload a piece of real data where the issue appears:
https://www.dropbox.com/s/ibzbwqs35tiydyc/tmp.h5
When I try to slice it with the pandas.Timestamp objects it results in 'KeyError':
Same for datetime objects:
However this slicing works perfectly:
Numpy version 1.8.0
Pandas version 0.13.0