Fix to _get_nearest_indexer for pydata/xarray#3751 #32905

Merged: 18 commits, Mar 26, 2020
Changes from 16 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -310,6 +310,7 @@ Indexing
 - Bug in :meth:`DataFrame.iloc.__setitem__` on a :class:`DataFrame` with duplicate columns incorrectly setting values for all matching columns (:issue:`15686`, :issue:`22036`)
 - Bug in :meth:`DataFrame.loc:` and :meth:`Series.loc` with a :class:`DatetimeIndex`, :class:`TimedeltaIndex`, or :class:`PeriodIndex` incorrectly allowing lookups of non-matching datetime-like dtypes (:issue:`32650`)
 - Bug in :meth:`Series.__getitem__` indexing with non-standard scalars, e.g. ``np.dtype`` (:issue:`32684`)
+- :meth:`Index._get_nearest_indexer` now internally uses NumPy arrays or ExtensionArrays to compute left and right distances. This preserves the fix to :issue:`26683` while maintaining compatibility with downstream :class:`Index` subclasses, e.g. xarray's CFTimeIndex (:issue:`32905`). See also `pydata/xarray#3751 <https://github.com/pydata/xarray/issues/3751>`_.
Review comment (Contributor): Don't reference an internal routine here; mention only user-visible things. You can just mention xarray.


Missing
^^^^^^^
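
The changelog entry above says that distances are now computed on the underlying arrays so that downstream `Index` subclasses keep working. The following is a minimal sketch of that idea in plain Python. `ToyIndex`, `PickySubclass`, and both helper functions are hypothetical names invented for illustration; they are not pandas or xarray APIs, and the subclass's failure mode is an assumption chosen to make the point, not a description of what CFTimeIndex does.

```python
class ToyIndex:
    """Hypothetical, minimal Index-like container (not the pandas API)."""

    def __init__(self, values):
        self._values = list(values)

    def take(self, positions):
        # Return a new container of the same type holding the selected items.
        return type(self)(self._values[p] for p in positions)

    def __sub__(self, other):
        return [a - b for a, b in zip(self._values, other._values)]


class PickySubclass(ToyIndex):
    """Hypothetical downstream subclass with stricter subtraction rules."""

    def __sub__(self, other):
        if not isinstance(other, PickySubclass):
            raise TypeError("can only subtract another PickySubclass")
        return super().__sub__(other)


def distances_via_containers(index, positions, target):
    # Goes through the container's own __sub__, so subclass overrides apply.
    return [abs(d) for d in index.take(positions) - target]


def distances_via_values(index, positions, target):
    # Works on the raw stored values, bypassing any __sub__ override.
    picked = [index._values[p] for p in positions]
    return [abs(a - b) for a, b in zip(picked, target._values)]


index = PickySubclass([10, 20, 30])
target = ToyIndex([12, 29])
positions = [0, 2]

try:
    distances_via_containers(index, positions, target)
    container_path_ok = True
except TypeError:
    container_path_ok = False

value_distances = distances_via_values(index, positions, target)
print(container_path_ok, value_distances)
```

In this toy setup the container-level subtraction raises, while the value-level computation succeeds, which is the general reason for preferring raw values inside an internal routine that subclasses inherit.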
1 change: 1 addition & 0 deletions environment.yml
@@ -101,6 +101,7 @@ dependencies:
   - s3fs  # pandas.read_csv... when using 's3://...' path
   - sqlalchemy  # pandas.read_sql, DataFrame.to_sql
   - xarray  # DataFrame.to_xarray
+  - cftime  # Needed for downstream xarray.CFTimeIndex test
   - pyreadstat  # pandas.read_spss
   - tabulate>=0.8.3  # DataFrame.to_markdown
   - pip:
14 changes: 9 additions & 5 deletions pandas/core/indexes/base.py
@@ -3049,8 +3049,9 @@ def _get_nearest_indexer(self, target: "Index", limit, tolerance) -> np.ndarray:
         left_indexer = self.get_indexer(target, "pad", limit=limit)
         right_indexer = self.get_indexer(target, "backfill", limit=limit)
 
-        left_distances = np.abs(self[left_indexer] - target)
-        right_distances = np.abs(self[right_indexer] - target)
+        target_values = target._values
+        left_distances = np.abs(self._values[left_indexer] - target_values)
+        right_distances = np.abs(self._values[right_indexer] - target_values)
 
         op = operator.lt if self.is_monotonic_increasing else operator.le
         indexer = np.where(
@@ -3059,13 +3060,16 @@ def _get_nearest_indexer(self, target: "Index", limit, tolerance) -> np.ndarray:
             right_indexer,
         )
         if tolerance is not None:
-            indexer = self._filter_indexer_tolerance(target, indexer, tolerance)
+            indexer = self._filter_indexer_tolerance(target_values, indexer, tolerance)
         return indexer
 
     def _filter_indexer_tolerance(
-        self, target: "Index", indexer: np.ndarray, tolerance
+        self,
+        target: Union["Index", np.ndarray, ExtensionArray],
+        indexer: np.ndarray,
+        tolerance,
     ) -> np.ndarray:
-        distance = abs(self.values[indexer] - target)
+        distance = abs(self._values[indexer] - target)
         indexer = np.where(distance <= tolerance, indexer, -1)
         return indexer
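
The logic in the hunk above (take the "pad" and "backfill" candidates, compare their distances to the target, then optionally apply a tolerance) can be sketched without pandas or NumPy. This is an illustrative stand-in, not the pandas implementation: it assumes a sorted, strictly increasing list of numbers, and `pad_indexer`, `backfill_indexer`, and `nearest_indexer` are hypothetical helper names. Unlike the vectorized original, it handles a missing left candidate with an explicit branch.

```python
from bisect import bisect_left, bisect_right


def pad_indexer(values, targets):
    # Position of the last element <= target, or -1 if there is none.
    return [bisect_right(values, t) - 1 for t in targets]


def backfill_indexer(values, targets):
    # Position of the first element >= target, or -1 if there is none.
    out = []
    for t in targets:
        pos = bisect_left(values, t)
        out.append(pos if pos < len(values) else -1)
    return out


def nearest_indexer(values, targets, tolerance=None):
    """Sketch for a sorted, strictly increasing list (not pandas code)."""
    left = pad_indexer(values, targets)
    right = backfill_indexer(values, targets)
    result = []
    for t, lo, hi in zip(targets, left, right):
        if lo == -1:
            choice = hi
        elif hi == -1:
            choice = lo
        else:
            # Strict "<" mirrors the operator.lt branch used for increasing
            # input: on a tie the right-hand (larger) candidate is chosen.
            choice = lo if abs(values[lo] - t) < abs(values[hi] - t) else hi
        # Tolerance filter: drop matches that are farther away than allowed.
        if tolerance is not None and choice != -1:
            if abs(values[choice] - t) > tolerance:
                choice = -1
        result.append(choice)
    return result


values = [10, 20, 30]
targets = [4, 14, 15, 26, 99]
print(nearest_indexer(values, targets))               # [0, 0, 1, 2, 2]
print(nearest_indexer(values, targets, tolerance=5))  # [-1, 0, 1, 2, -1]
```

The target 15 is equidistant from 10 and 20 and resolves to position 1, and with `tolerance=5` the targets 4 and 99 are rejected because their nearest values are 6 and 69 away.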

15 changes: 15 additions & 0 deletions pandas/tests/test_downstream.py
@@ -8,6 +8,8 @@
 import numpy as np  # noqa
 import pytest
 
+import pandas.util._test_decorators as td
+
 from pandas import DataFrame
 import pandas._testing as tm
 
@@ -47,6 +49,19 @@ def test_xarray(df):
     assert df.to_xarray() is not None
 
 
+@td.skip_if_no("cftime")
+@td.skip_if_no("xarray", "0.10.4")
+def test_xarray_cftimeindex_nearest():
+    # https://github.com/pydata/xarray/issues/3751
+    import cftime
+    import xarray
+
+    times = xarray.cftime_range("0001", periods=2)
+    result = times.get_loc(cftime.DatetimeGregorian(2000, 1, 1), method="nearest")
+    expected = 1
+    assert result == expected
+
+
 def test_oo_optimizable():
     # GH 21071
     subprocess.check_call([sys.executable, "-OO", "-c", "import pandas"])
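
The expected value of 1 in `test_xarray_cftimeindex_nearest` above can be sanity-checked by brute force without xarray or cftime. This sketch makes two assumptions: that `xarray.cftime_range("0001", periods=2)` yields the first two days of year 1 (daily frequency), and that the standard library's `datetime.date` is an acceptable stand-in for `cftime.DatetimeGregorian`. The two may follow different calendar conventions, but that should not change which of the two dates is closer to the year 2000.

```python
from datetime import date

# Stand-ins for the two entries of the CFTimeIndex built in the test.
times = [date(1, 1, 1), date(1, 1, 2)]
target = date(2000, 1, 1)

# Brute-force nearest lookup: the position with the smallest absolute gap.
nearest_position = min(range(len(times)), key=lambda i: abs(times[i] - target))
print(nearest_position)
```

Both candidates lie almost two thousand years before the target, so the later one (position 1) is the nearest.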
1 change: 1 addition & 0 deletions requirements-dev.txt
@@ -68,6 +68,7 @@ tables>=3.4.2
 s3fs
 sqlalchemy
 xarray
+cftime
 pyreadstat
 tabulate>=0.8.3
 git+https://github.com/pandas-dev/pydata-sphinx-theme.git@master