[0.24.0rc1] passing dtype='M8' to Index raises #24753

Closed
TomAugspurger opened this issue Jan 13, 2019 · 0 comments

commented Jan 13, 2019

In 0.23.4

In [1]: import pandas as pd

In [2]: pd.Index([], dtype='M8')
Out[2]: DatetimeIndex([], dtype='datetime64[ns]', freq=None)

In 0.24.0rc1

In [2]: pd.Index([], dtype='M8')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-09844a8ae29c> in <module>
----> 1 pd.Index([], dtype='M8')

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    306             else:
    307                 result = DatetimeIndex(data, copy=copy, name=name,
--> 308                                        dtype=dtype, **kwargs)
    309                 return result
    310

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py in __new__(cls, data, freq, start, end, periods, tz, normalize, closed, ambiguous, dayfirst, yearfirst, dtype, copy, name, verify_integrity)
    301             data, dtype=dtype, copy=copy, tz=tz, freq=freq,
    302             dayfirst=dayfirst, yearfirst=yearfirst, ambiguous=ambiguous,
--> 303             int_as_wall_time=True)
    304
    305         subarr = cls._simple_new(dtarr, name=name,

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in _from_sequence(cls, data, dtype, copy, tz, freq, dayfirst, yearfirst, ambiguous, int_as_wall_time)
    366             data, dtype=dtype, copy=copy, tz=tz,
    367             dayfirst=dayfirst, yearfirst=yearfirst,
--> 368             ambiguous=ambiguous, int_as_wall_time=int_as_wall_time)
    369
    370         freq, freq_infer = dtl.validate_inferred_freq(freq, inferred_freq,

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in sequence_to_dt64ns(data, dtype, copy, tz, dayfirst, yearfirst, ambiguous, int_as_wall_time)
   1704     inferred_freq = None
   1705
-> 1706     dtype = _validate_dt64_dtype(dtype)
   1707
   1708     if not hasattr(data, "dtype"):

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in _validate_dt64_dtype(dtype)
   1991             raise ValueError("Unexpected value for 'dtype': '{dtype}'. "
   1992                              "Must be 'datetime64[ns]' or DatetimeTZDtype'."
-> 1993                              .format(dtype=dtype))
   1994     return dtype
   1995

ValueError: Unexpected value for 'dtype': 'datetime64'. Must be 'datetime64[ns]' or DatetimeTZDtype'.

We want that to raise eventually, since ns precision should be specified explicitly. But should we deprecate the old behavior first?

The best place to do it is probably before we get to arrays, so in DatetimeIndex.__new__ we can check for M8 specifically, warn, then pass through M8[ns].

We should also check

  • np.dtype("M8")
  • 'm8'
  • np.dtype('m8')
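The check, warn, and pass-through idea above could be sketched roughly as follows. This is only an illustration, not the pandas implementation: `normalize_unitless_dtype` is a hypothetical helper name, the warning text is borrowed from the commit message referenced later in this issue, and the sketch assumes only NumPy-compatible dtypes are passed (a `DatetimeTZDtype` would need separate handling).

```python
import warnings

import numpy as np


def normalize_unitless_dtype(dtype):
    """Hypothetical helper, not part of pandas.

    Warn when a datetime64/timedelta64 dtype has no precision and
    return the nanosecond equivalent. Any other dtype is returned
    as an np.dtype object without a warning.
    """
    if dtype is None:
        return None
    # np.dtype accepts 'M8', 'm8', and existing np.dtype objects alike.
    dtype = np.dtype(dtype)
    for unitless, with_ns in (("M8", "M8[ns]"), ("m8", "m8[ns]")):
        # A unit-less dtype does not compare equal to its [ns] variant.
        if dtype == np.dtype(unitless):
            name = dtype.name  # 'datetime64' or 'timedelta64'
            warnings.warn(
                "Passing in '{name}' dtype with no precision is deprecated "
                "and will raise in a future version. Please pass in "
                "'{name}[ns]' instead.".format(name=name),
                FutureWarning,
                stacklevel=2,
            )
            return np.dtype(with_ns)
    return dtype
```

A constructor such as `DatetimeIndex.__new__` could call a helper like this on `dtype` before handing the data to the array layer, so all four spellings listed above go through the same path.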

It's less clear what we should do for something like M8[us]. In the past, we ignored the precision:

In [15]: pd.Index(list(pd.date_range('2000', periods=4).asi8), dtype='M8[us]')
Out[15]: DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04'], dtype='datetime64[ns]', freq=None)

this should arguably raise now...
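For the non-ns case, one way to detect an explicit precision other than nanoseconds is `np.datetime_data`, which reports the unit of a datetime64 or timedelta64 dtype. The sketch below is an assumption about how such a check might look; `has_non_ns_unit` is a hypothetical name, and whether the caller should warn or raise on a `True` result is exactly the open question here.

```python
import numpy as np


def has_non_ns_unit(dtype):
    """Hypothetical check, not part of pandas.

    Return True for a datetime64/timedelta64 dtype whose explicit
    unit is something other than nanoseconds (for example 'M8[us]').
    """
    dtype = np.dtype(dtype)
    if dtype.kind not in "mM":
        # Not a datetime64 or timedelta64 dtype at all.
        return False
    unit, _count = np.datetime_data(dtype)
    # 'generic' means no unit was given; that case is handled separately.
    return unit not in ("generic", "ns")
```

With a check like this, `dtype='M8[us]'` would be flagged while `'M8[ns]'` and the unit-less `'M8'` would not.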

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Jan 16, 2019

DEPR/API: Non-ns precision in Index constructors
This deprecates passing dtypes without a precision to DatetimeIndex
and TimedeltaIndex

```python
In [2]: pd.DatetimeIndex(['2000'], dtype='datetime64')
/Users/taugspurger/.virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: Passing in 'datetime64' dtype with no precision is deprecated
and will raise in a future version. Please pass in
'datetime64[ns]' instead.
  #!/Users/taugspurger/Envs/pandas-dev/bin/python3
Out[2]: DatetimeIndex(['2000-01-01'], dtype='datetime64[ns]', freq=None)
```

Previously, we ignored the precision, so that things like

```python
In [3]: pd.DatetimeIndex(['2000'], dtype='datetime64[us]')
Out[3]: DatetimeIndex(['2000-01-01'], dtype='datetime64[ns]', freq=None)
```

worked. That is deprecated as well.

Closes pandas-dev#24739
Closes pandas-dev#24753

@jreback jreback added this to the 0.24.0 milestone Jan 21, 2019
