Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Assorted DatetimeIndex bugfixes #24157

Merged
merged 32 commits into from Dec 14, 2018

Conversation

Projects
None yet
5 participants
@jbrockmendel
Copy link
Member

commented Dec 8, 2018

Bit of a hodge-podge.

  • Deprecate time_rule in offsets.generate_range
  • Improve error messages if NaT gets passed to date_range, or in validate_frequency (closes #11587)
  • Preserve timezone in DatetimeIndex->CategoricalIndex->DatetimeIndex (closes #18664)
  • Preserve freq when indexing DTI/TDI with Ellipsis (closes #21282)
  • closes #11587

@TomAugspurger nothing here is urgent, so if this overlaps with DTA it can stay on the backburner.

@pep8speaks

This comment has been minimized.

Copy link

commented Dec 8, 2018

Hello @jbrockmendel! Thanks for submitting the PR.

jbrockmendel added some commits Dec 8, 2018

@codecov

This comment has been minimized.

Copy link

commented Dec 8, 2018

Codecov Report

Merging #24157 into master will decrease coverage by 49.17%.
The diff coverage is 50%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #24157       +/-   ##
===========================================
- Coverage    92.2%   43.02%   -49.18%     
===========================================
  Files         162      162               
  Lines       51700    51714       +14     
===========================================
- Hits        47670    22251    -25419     
- Misses       4030    29463    +25433
Flag Coverage Δ
#multiple ?
#single 43.02% <50%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/arrays/datetimes.py 65.01% <25%> (-33.55%) ⬇️
pandas/tseries/offsets.py 46.43% <50%> (-50.41%) ⬇️
pandas/core/arrays/datetimelike.py 48.98% <83.33%> (-47.38%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/core/categorical.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-98.62%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-95.46%) ⬇️
... and 122 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c911151...06d0a8e. Read the comment docs.

@codecov

This comment has been minimized.

Copy link

commented Dec 8, 2018

Codecov Report

Merging #24157 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24157      +/-   ##
==========================================
- Coverage   92.22%   92.22%   -0.01%     
==========================================
  Files         162      162              
  Lines       51785    51796      +11     
==========================================
+ Hits        47758    47767       +9     
- Misses       4027     4029       +2
Flag Coverage Δ
#multiple 90.62% <100%> (-0.01%) ⬇️
#single 43.01% <70.58%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/datetimelike.py 96.41% <100%> (+0.05%) ⬆️
pandas/tseries/offsets.py 96.84% <100%> (-0.01%) ⬇️
pandas/core/arrays/datetimes.py 98.22% <100%> (-0.35%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 01afc42...b927925. Read the comment docs.

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Dec 8, 2018

Azure fail is in developer.rst, unrelated.

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Dec 8, 2018

@datapythonista is the azure fail caused by something in this PR? It looks unrelated but re-pushing didn’t fix it.

@datapythonista

This comment has been minimized.

Copy link
Member

commented Dec 8, 2018

master was broken (I merged an old PR that didn't have some linting), but it's been fixed in #24160

If you merge master and push again, the CI should pass now.

@jreback jreback added the Timeseries label Dec 9, 2018

@jreback
Copy link
Contributor

left a comment

pls add to #6581 for the deprecation

@@ -327,6 +327,10 @@ def __getitem__(self, key):
"numpy.newaxis (`None`) and integer or boolean "
"arrays are valid indices")

if key is Ellipsis:

This comment has been minimized.

Copy link
@jreback

jreback Dec 9, 2018

Contributor

can you make a more explicit comment. This is likely a more general fix (IOW this is likely broken for all __getitem__ on EA)

This comment has been minimized.

Copy link
@jreback

jreback Dec 12, 2018

Contributor

why don't this just work as is? why is this not a view (and not a copy)?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 12, 2018

Author Member

why don't this just work as is?

as-is DatetimeIndex[...] loses its freq attribute

why is this not a view (and not a copy)?

Since the copy doesn't have deep=True, self.copy() effectively a view?

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

can you be more explicit in your comment, this is not obvious

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

Moved to a hopefully clearer spot a few lines down

Show resolved Hide resolved pandas/core/arrays/datetimelike.py Outdated
Show resolved Hide resolved pandas/core/arrays/datetimes.py Outdated
Show resolved Hide resolved pandas/core/arrays/datetimes.py Outdated
ci = pd.CategoricalIndex(dti)
carr = pd.Categorical(dti)
cser = pd.Series(ci)

This comment has been minimized.

Copy link
@jreback

jreback Dec 9, 2018

Contributor

can you parametrize

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 9, 2018

Author Member

will take a look

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 9, 2018

Author Member

Not without adding casting code that makes the verbosity a wash.

Once Categorical(DatetimeArray) works (currently raises bc DatetimeArray doesn't have _constructor) then this test can be parametrized over DatetimeIndex/DatetimeArray

@jreback jreback added the Deprecate label Dec 9, 2018

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Dec 9, 2018

pls add to #6581 for the deprecation

Will do. Actually, is this private enough that a deprecation cycle is unnecessary? If so we should just remove the arg now and move the whole func to arrays.datetimes, since that is the only place it is used

@jsexauer jsexauer referenced this pull request Dec 11, 2018

Open

DEPR: deprecations from prior versions #6581

0 of 94 tasks complete
@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Dec 11, 2018

@TomAugspurger does this step on your toes?

@TomAugspurger
Copy link
Contributor

left a comment

LGTM.

I think it'll apply cleanly aside from imports.

@@ -327,6 +327,10 @@ def __getitem__(self, key):
"numpy.newaxis (`None`) and integer or boolean "
"arrays are valid indices")

if key is Ellipsis:

This comment has been minimized.

Copy link
@jreback

jreback Dec 12, 2018

Contributor

why don't this just work as is? why is this not a view (and not a copy)?

@@ -327,6 +327,10 @@ def __getitem__(self, key):
"numpy.newaxis (`None`) and integer or boolean "
"arrays are valid indices")

if key is Ellipsis:

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

can you be more explicit in your comment, this is not obvious

Show resolved Hide resolved pandas/core/arrays/datetimelike.py Outdated
Show resolved Hide resolved pandas/core/arrays/datetimes.py Outdated
@@ -2487,6 +2488,9 @@ def generate_range(start=None, end=None, periods=None,
"""
if time_rule is not None:

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

did you need to change any existing tests to account for this?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

yes, tests.indexes.datetimes.test_date_range test_generate and test_generate_cday

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

is this sufficiently internal that we don't need to do a deprecation cycle?

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

yeah, let's just drop it entirely (fix the tests). I don't think generate_range is exposed, though maybe add in a note in api breaking changes just to cover.

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

Will do. OK with moving this to arrays.datetimes while we're at it?

@@ -1133,7 +1133,7 @@ Deprecations
- Passing a string alias like ``'datetime64[ns, UTC]'`` as the `unit` parameter to :class:`DatetimeTZDtype` is deprecated. Use :class:`DatetimeTZDtype.construct_from_string` instead (:issue:`23990`).
- In :meth:`Series.where` with Categorical data, providing an ``other`` that is not present in the categories is deprecated. Convert the categorical to a different dtype or add the ``other`` to the categories first (:issue:`24077`).
- :meth:`Series.clip_lower`, :meth:`Series.clip_upper`, :meth:`DataFrame.clip_lower` and :meth:`DataFrame.clip_upper` are deprecated and will be removed in a future version. Use ``Series.clip(lower=threshold)``, ``Series.clip(upper=threshold)`` and the equivalent ``DataFrame`` methods (:issue:`24203`)

- Passing a `time_rule` to `pandas.tseries.offsets.generate_range` is deprecated and will raise a ``TypeError`` in a future version. Pass an ``offset`` instead (:issue:`24157`)

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

these don't jive with what's at the top of the PR. missing 2 notes?

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

what about #11587 ?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

Does it merit a note? All that is changed is an error message. But yes, this does close 11587, I'll update this above.

Show resolved Hide resolved pandas/core/arrays/datetimelike.py
# non-fixed frequencies are not meaningful for timedelta64;
# we retain that error message
raise e
# GH#11587 if index[0] is NaT _generate_range will raise

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

this is too cryptic, pls make more readable if you don't actually know what's happening here

@@ -1652,6 +1654,13 @@ def maybe_convert_dtype(data, copy):
raise TypeError("Passing PeriodDtype data is invalid. "
"Use `data.to_timestamp()` instead")

elif is_categorical_dtype(data):
# GH#18664 preserve tz in going DTI->Categorical->DTI
# TODO: cases where we need to do another pass through this func,

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

why is the TODO case here, what is an example? do you have an xfailed test?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

example is Categorical(TimedeltaIndex) where we should do another pass through maybe_convert_dtype in order to issue the FutureWarning. I'm assuming there are other corner cases here that will need to be caught/tested.

@jreback jreback added this to the 0.24.0 milestone Dec 13, 2018

@jreback
Copy link
Contributor

left a comment

small comments, ping on green.

@@ -380,6 +380,7 @@ Backwards incompatible API changes
- ``max_rows`` and ``max_cols`` parameters removed from :class:`HTMLFormatter` since truncation is handled by :class:`DataFrameFormatter` (:issue:`23818`)
- :meth:`read_csv` will now raise a ``ValueError`` if a column with missing values is declared as having dtype ``bool`` (:issue:`20591`)
- The column order of the resultant :class:`DataFrame` from :meth:`MultiIndex.to_frame` is now guaranteed to match the :attr:`MultiIndex.names` order. (:issue:`22420`)
- :func:`pd.offsets.generate_range` argument ``time_rule`` has been removed; use ``offset`` instead (:issue:`24157`)

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

this probably won't render as its not in the api.rst but ok

@@ -1310,6 +1310,8 @@ Datetimelike
- Bug in :class:`Index` where calling ``np.array(dtindex, dtype=object)`` on a timezone-naive :class:`DatetimeIndex` would return an array of ``datetime`` objects instead of :class:`Timestamp` objects, potentially losing nanosecond portions of the timestamps (:issue:`23524`)
- Bug in :class:`Categorical.__setitem__` not allowing setting with another ``Categorical`` when both are undordered and have the same categories, but in a different order (:issue:`24142`)
- Bug in :func:`date_range` where using dates with millisecond resolution or higher could return incorrect values or the wrong number of values in the index (:issue:`24110`)
- Bug in :class:`DatetimeIndex` where constructing a :class:`DatetimeIndex` from a :class:`Categorical` or :class:`CategoricalIndex` would incorrectly drop timezone information (:issue:`18664`)

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

where is the note for #11587?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Dec 13, 2018

Author Member

Just pushed, should be just below this line.

@@ -2,6 +2,7 @@
from datetime import date, datetime, timedelta
import functools
import operator
import warnings

This comment has been minimized.

Copy link
@jreback

jreback Dec 13, 2018

Contributor

prob going to fail the build as this is not needed

@jreback

This comment has been minimized.

Copy link
Contributor

commented Dec 13, 2018

linting issue. ping on green.

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Dec 13, 2018

Ping

@jreback jreback merged commit a1cf3f6 into pandas-dev:master Dec 14, 2018

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
pandas-dev.pandas Build #20181213.47 succeeded
Details
@jreback

This comment has been minimized.

Copy link
Contributor

commented Dec 14, 2018

thanks!

@jbrockmendel jbrockmendel deleted the jbrockmendel:deets branch Dec 14, 2018

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.