BUG: incorrect usage of pchip for timedelta #26189

Closed
jreback opened this issue Apr 22, 2019 · 4 comments · Fixed by scipy/scipy#10090 or #26231
Labels
Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Milestone

Comments


jreback commented Apr 22, 2019

This just started breaking on scipy master:
https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=10858

xfailing in #26190

> /Users/jreback/pandas/pandas/tests/series/test_missing.py(1446)test_interpolate_timedelta_index()
-> result = df[0].interpolate(method=method, **kwargs)
(Pdb) n
> /Users/jreback/pandas/pandas/tests/series/test_missing.py(1447)test_interpolate_timedelta_index()
-> expected = pd.Series([0.0, 1.0, 2.0, 3.0], name=0, index=ind)
(Pdb) p result
0 days 00:00:00.000000    0.0
1 days 00:00:00.000000    1.0
2 days 00:00:00.000000    2.0
3 days 00:00:00.000000    3.0
Freq: D, Name: 0, dtype: float64
(Pdb) p df[0]
0 days 00:00:00.000000    0.0
1 days 00:00:00.000000    1.0
2 days 00:00:00.000000    NaN
3 days 00:00:00.000000    3.0
Freq: D, Name: 0, dtype: float64

I think we are passing invalid inputs

> /Users/jreback/pandas/pandas/core/missing.py(311)_interpolate_scipy_wrapper()
-> new_y = method(x, y, new_x, **kwargs)
(Pdb) p method
<function pchip_interpolate at 0x81bb9e9d8>
(Pdb) p x
array([              1,  86400000000001, 259200000000001],
      dtype='timedelta64[ns]')
(Pdb) p y
array([0., 1., 3.])
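
To illustrate the "invalid inputs" point, here is a minimal sketch of handing `pchip_interpolate` plain float nanosecond counts instead of a `timedelta64[ns]` array. The helper `to_float_ns` is hypothetical and exists only for this example; it is not what pandas does internally, and the actual fix in #26231 may differ.

```python
import numpy as np
from scipy.interpolate import pchip_interpolate

# Known points, mirroring the arrays printed in the pdb session above.
x = np.array([1, 86400000000001, 259200000000001], dtype="timedelta64[ns]")
y = np.array([0.0, 1.0, 3.0])

# Position of the missing value: 2 days + 1 ns.
new_x = np.array([172800000000001], dtype="timedelta64[ns]")


def to_float_ns(td):
    """Hypothetical helper: timedelta64 array -> float64 nanosecond counts."""
    return td.astype("timedelta64[ns]").astype("int64").astype("float64")


# scipy only ever sees float arrays here, so no timedelta comparison happens.
new_y = pchip_interpolate(to_float_ns(x), y, to_float_ns(new_x))
```

Since the three known points lie on a straight line, the interpolated value should come out very close to 2.0, matching the `expected` series in the test.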
@jreback jreback added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Apr 22, 2019
@jreback jreback added this to the 0.25.0 milestone Apr 22, 2019

jreback commented Apr 22, 2019

cc @rgommers

@rgommers
Contributor

thanks @jreback looking into it now


rgommers commented Apr 23, 2019

I'll add the full failure from the log here to save some clicking:

The key part is

>       if np.any(dx <= 0):
E       numpy.core._exceptions.UFuncTypeError: Cannot cast ufunc 'less_equal' input 0 from dtype('<m8[ns]') to dtype('<m8') with casting rule 'same_kind'

> I think we are passing invalid inputs

It's one of those things: we never test with timedelta or datetime anywhere in scipy, and in principle we'd indeed consider it invalid input. However, it is a numpy builtin dtype and it mostly just works. This failure can perhaps be easily worked around.
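
One possible shape for such a workaround, sketched here as an assumption rather than as what scipy/scipy#10090 actually does, is to run the monotonicity check on integer tick counts so that no comparison between a `timedelta64` array and a plain `0` is needed. The name `is_strictly_increasing` is hypothetical.

```python
import numpy as np


def is_strictly_increasing(x):
    """Hypothetical helper (not scipy's code): dtype-robust monotonicity check."""
    x = np.asarray(x)
    is_time_like = (np.issubdtype(x.dtype, np.timedelta64)
                    or np.issubdtype(x.dtype, np.datetime64))
    if is_time_like:
        # Compare integer tick counts, avoiding any timedelta-vs-int cast.
        x = x.astype("int64")
    return bool(np.all(np.diff(x) > 0))


x = np.array([1, 86400000000001, 259200000000001], dtype="timedelta64[ns]")
ok = is_strictly_increasing(x)
```

This sketch ignores `NaT`, which maps to the minimum int64 value after the cast and would need separate handling.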

```
[gw1] linux -- Python 3.7.3 /home/vsts/miniconda3/envs/pandas-dev/bin/python

self = <pandas.tests.series.test_missing.TestSeriesInterpolateData object at 0x7f4a444c1ba8>
interp_methods_ind = ('pchip', {})

def test_interpolate_timedelta_index(self, interp_methods_ind):
    """
    Tests for non numerical index types  - object, period, timedelta
    Note that all methods except time, index, nearest and values
    are tested here.
    """
    # gh 21662
    ind = pd.timedelta_range(start=1, periods=4)
    df = pd.DataFrame([0, 1, np.nan, 3], index=ind)

    method, kwargs = interp_methods_ind
    if method == "pchip":
        _skip_if_no_pchip()

    if method in {"linear", "pchip"}:
>       result = df[0].interpolate(method=method, **kwargs)

pandas/tests/series/test_missing.py:1445:


pandas/core/generic.py:6884: in interpolate
**kwargs)
pandas/core/internals/managers.py:520: in interpolate
return self.apply('interpolate', **kwargs)
pandas/core/internals/managers.py:396: in apply
applied = getattr(b, f)(**kwargs)
pandas/core/internals/blocks.py:1127: in interpolate
downcast=downcast, **kwargs)
pandas/core/internals/blocks.py:1191: in _interpolate
interp_values = np.apply_along_axis(func, axis, data)
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/numpy/core/overrides.py:150: in public_api
implementation, public_api, relevant_args, args, kwargs)
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/numpy/lib/shape_base.py:379: in apply_along_axis
res = asanyarray(func1d(inarr_view[ind0], *args, **kwargs))
pandas/core/internals/blocks.py:1188: in func
bounds_error=False, **kwargs)
pandas/core/missing.py:237: in interpolate_1d
order=order, **kwargs)
pandas/core/missing.py:310: in _interpolate_scipy_wrapper
new_y = method(x, y, new_x, **kwargs)
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/scipy/interpolate/_cubic.py:343: in pchip_interpolate
P = PchipInterpolator(xi, yi, axis=axis)
../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/scipy/interpolate/_cubic.py:235: in __init__
x, _, y, axis, _ = prepare_input(x, y, axis)


x = array([ 1, 86400000000001, 259200000000001],
dtype='timedelta64[ns]')
y = array([0., 1., 3.]), axis = 0, dydx = None

def prepare_input(x, y, axis, dydx=None):
    """Prepare input for cubic spline interpolators.

    All data are converted to numpy arrays and checked for correctness.
    Axes equal to `axis` of arrays `y` and `dydx` are rolled to be the 0-th
    axis. The value of `axis` is converted to lie in
    [0, number of dimensions of `y`).
    """

    x, y = map(np.asarray, (x, y))

    if np.issubdtype(x.dtype, np.complexfloating):
        raise ValueError("`x` must contain real values.")

    if np.issubdtype(y.dtype, np.complexfloating):
        dtype = complex
    else:
        dtype = float

    if dydx is not None:
        dydx = np.asarray(dydx)
        if y.shape != dydx.shape:
            raise ValueError("The shapes of `y` and `dydx` must be identical.")
        if np.issubdtype(dydx.dtype, np.complexfloating):
            dtype = complex
        dydx = dydx.astype(dtype, copy=False)

    y = y.astype(dtype, copy=False)
    axis = axis % y.ndim
    if x.ndim != 1:
        raise ValueError("`x` must be 1-dimensional.")
    if x.shape[0] < 2:
        raise ValueError("`x` must contain at least 2 elements.")
    if x.shape[0] != y.shape[axis]:
        raise ValueError("The length of `y` along `axis`={0} doesn't "
                         "match the length of `x`".format(axis))

    if not np.all(np.isfinite(x)):
        raise ValueError("`x` must contain only finite values.")
    if not np.all(np.isfinite(y)):
        raise ValueError("`y` must contain only finite values.")

    if dydx is not None and not np.all(np.isfinite(dydx)):
        raise ValueError("`dydx` must contain only finite values.")

    dx = np.diff(x)
>   if np.any(dx <= 0):

E numpy.core._exceptions.UFuncTypeError: Cannot cast ufunc 'less_equal' input 0 from dtype('<m8[ns]') to dtype('<m8') with casting rule 'same_kind'

../../../miniconda3/envs/pandas-dev/lib/python3.7/site-packages/scipy/interpolate/_cubic.py:65: UFuncTypeError

```

@rgommers
Contributor

Should work again with scipy master after scipy/scipy#10090
