Should DatetimeIndex indexing with strings ever raise KeyError? #25803

shoyer · 2019-03-20T16:18:35Z

With pandas 0.24:

In [1]: import pandas as pd

In [2]: s = pd.Series([1, 2, 3], pd.to_datetime(['2018-01-01', '2018-02-02T01:01', '2018-02-02T02:02']))

In [3]: s.loc['2018-01-01'].size
Out[3]: 1

In [4]: s.loc['2018-01-02'].size
Out[4]: 0

In [5]: s.loc['2018-02-02'].size
Out[5]: 2

In [6]: s.loc['2018-03-03'].size
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2601             try:
-> 2602                 return self._engine.get_loc(key)
   2603             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine._date_check_type()

KeyError: '2018-03-03'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
    998         try:
--> 999             return Index.get_loc(self, key, method, tolerance)
   1000         except (KeyError, ValueError, TypeError):

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2603             except KeyError:
-> 2604                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2605         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine._date_check_type()

KeyError: '2018-03-03'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1520035200000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2601             try:
-> 2602                 return self._engine.get_loc(key)
   2603             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2018-03-03 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 1520035200000000000

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
   1011                     stamp = stamp.tz_localize(self.tz)
-> 1012                 return Index.get_loc(self, stamp, method, tolerance)
   1013             except KeyError:

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2603             except KeyError:
-> 2604                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2605         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.DatetimeEngine.get_loc()

KeyError: Timestamp('2018-03-03 00:00:00')

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-6-92239fa9614e> in <module>()
----> 1 s.loc['2018-03-03'].size

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1498
   1499             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1500             return self._getitem_axis(maybe_callable, axis=axis)
   1501
   1502     def _is_scalar_access(self, key):

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1911         # fall thru to straight lookup
   1912         self._validate_key(key, axis)
-> 1913         return self._get_label(key, axis=axis)
   1914
   1915

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexing.py in _get_label(self, label, axis)
    135             # but will fail when the index is not present
    136             # see GH5667
--> 137             return self.obj._xs(label, axis=axis)
    138         elif isinstance(label, tuple) and isinstance(label[axis], slice):
    139             raise IndexingError('no slices here, handle elsewhere')

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   3573                                                       drop_level=drop_level)
   3574         else:
-> 3575             loc = self.index.get_loc(key)
   3576
   3577             if isinstance(loc, np.ndarray):

~/miniconda3/envs/xarray-py37/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py in get_loc(self, key, method, tolerance)
   1012                 return Index.get_loc(self, stamp, method, tolerance)
   1013             except KeyError:
-> 1014                 raise KeyError(key)
   1015             except ValueError as e:
   1016                 # list-like tolerance size must match target index size

KeyError: '2018-03-03'

(side note: this is quite the traceback!)

Bizarrely, whether indexing with a string raises a KeyError or returns an array of size 0 depends upon the value.

But more generally, does it ever make sense to raise an error? It's arguably more consistent to only return size 0 arrays.

xref #7827 and pydata/xarray#2825
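For illustration, the "only return size 0 arrays" behavior can be approximated in user code. This is a minimal sketch, not a pandas API: `loc_as_series` is a hypothetical helper name, and the index is built from uniformly formatted strings so that parsing does not rely on mixed-format inference.

```python
import pandas as pd


def loc_as_series(s, key):
    """Hypothetical helper: label lookup that always returns a Series."""
    try:
        result = s.loc[key]
    except KeyError:
        # No matching label: return an empty Series of the same dtype.
        return s.iloc[0:0]
    if isinstance(result, pd.Series):
        return result
    # An exact match on a unique label yields a scalar; wrap it.
    index = pd.DatetimeIndex([pd.Timestamp(key)])
    return pd.Series([result], index=index, dtype=s.dtype)


s = pd.Series(
    [1, 2, 3],
    index=pd.to_datetime(
        ["2018-01-01T00:00", "2018-02-02T01:01", "2018-02-02T02:02"]
    ),
)

keys = ["2018-01-01", "2018-01-02", "2018-02-02", "2018-03-03"]
sizes = [loc_as_series(s, key).size for key in keys]
print(sizes)
```

With this wrapper the size of the result equals the number of matching index entries for every key, including the out-of-range one.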


WillAyd · 2019-03-22T02:53:48Z

Why wouldn't you want to raise a KeyError here? Due to assumed alignment on daily precision?

IMO it's unexpected for line 4 to return a 0 sized array

shoyer · 2019-03-22T05:04:10Z

I guess this is similar to indexing with an index with duplicate values (which is probably a separate issue). It's nice to be able to rely on invariants, like the size of the result matching the number of matching values in the index.

KeyError makes sense for indexes without duplicates, because the alternative is returning a scalar, which isn't possible if there isn't a match.
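One way to get that invariant with existing behavior is to spell the lookup as an explicit slice. This is a sketch under the assumption that the index is sorted (label slicing with bounds that are not in the index relies on monotonicity); `select_day` is a hypothetical name used only for this example.

```python
import pandas as pd

s = pd.Series(
    [1, 2, 3],
    index=pd.to_datetime(
        ["2018-01-01T00:00", "2018-02-02T01:01", "2018-02-02T02:02"]
    ),
)

# Assumption: the index is sorted, so out-of-range bounds are allowed.
assert s.index.is_monotonic_increasing


def select_day(s, day):
    """Hypothetical helper: select one day via an explicit label slice."""
    # Both bounds are the same day string, so the slice covers that whole day.
    return s.loc[day:day]


days = ["2018-01-01", "2018-01-02", "2018-02-02", "2018-03-03"]
day_sizes = [select_day(s, day).size for day in days]
print(day_sizes)
```

Here the result is always a Series, so its size can be checked without a try/except around the lookup.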

jorisvandenbossche · 2019-03-22T08:24:19Z

IMO it's unexpected for line 4 to return a 0 sized array

The explanation here is that the string '2018-01-02' is to be considered as a slice, because the resolution of the string is lower (coarser) than the resolution of the index:

In [30]: from pandas._libs.tslibs import parsing

In [31]: s.index.resolution
Out[31]: 'minute'

In [32]: _, _, resolution = parsing.parse_time_string('2018-01-02', freq=None)

In [33]: resolution 
Out[33]: 'day'

So if the strings '2018-01-01', '2018-02-02', etc. are considered as slices, why not '2018-03-03'?
The only difference is that it is "out of range" for the index. But with normal slicing, out-of-range slice bounds return an empty object and don't raise an error:

In [34]: s.iloc[0:2]
Out[34]: 
2018-01-01 00:00:00    1
2018-02-02 01:01:00    2
dtype: int64

In [35]: s.iloc[10:12]
Out[35]: Series([], dtype: int64)

So given that, I agree with @shoyer that it would be more consistent (and reliable) to return an empty object here instead of raising an error.
Although, Stephan, note that it would still depend on the resolution of the passed string. So the behavior would still depend to a certain extent on the value of the key, and you can't be sure that an arbitrary string will not raise an error. But at least for datetime strings of the same resolution, it would no longer depend on the exact value.
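To make the "string as a slice" interpretation concrete, here is a small sketch that computes the implied window by hand and selects with a boolean mask. It is not how pandas implements the lookup; `select_period` and its `freq` parameter are names assumed for this example, and the resolution is passed explicitly rather than inferred from the string.

```python
import pandas as pd


def select_period(s, key, freq="D"):
    """Hypothetical helper: select all rows falling inside the period of `key`."""
    period = pd.Period(key, freq=freq)
    # start_time and end_time are the first and last instants of the period.
    mask = (s.index >= period.start_time) & (s.index <= period.end_time)
    return s[mask]


s = pd.Series(
    [1, 2, 3],
    index=pd.to_datetime(
        ["2018-01-01T00:00", "2018-02-02T01:01", "2018-02-02T02:02"]
    ),
)

keys = ["2018-01-01", "2018-01-02", "2018-02-02", "2018-03-03"]
period_sizes = [select_period(s, key).size for key in keys]
print(period_sizes)
```

Because the mask is computed from comparisons, an out-of-range key simply yields an empty selection, and the approach does not require a sorted index. Passing a finer `freq` narrows the window, which mirrors the dependence on string resolution described above.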

WillAyd added the Timeseries label Mar 22, 2019

jbrockmendel added the Indexing label ("Related to indexing on series/frames, not to indexes themselves") Feb 22, 2020

mroeschke added the Bug label Apr 1, 2020
