Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DatetimeIndex get_loc validates date object #35478

Closed
wants to merge 1 commit into from

Conversation

knabben
Copy link
Contributor

@knabben knabben commented Jul 30, 2020

Adding the date object try cast in get_loc.

@pep8speaks
Copy link

pep8speaks commented Jul 30, 2020

Hello @knabben! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-11-04 15:43:34 UTC

@knabben knabben force-pushed the date-dtindex branch 2 times, most recently from 87c400b to 1d6f0ca Compare July 30, 2020 15:41
pandas/core/indexes/datetimes.py Outdated Show resolved Hide resolved
@@ -152,7 +152,7 @@ class DatetimeArray(dtl.DatetimeLikeArrayMixin, dtl.TimelikeOps, dtl.DatelikeOps

_typ = "datetimearray"
_scalar_type = Timestamp
_recognized_scalars = (datetime, np.datetime64)
_recognized_scalars = (datetime, np.datetime64, date)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the right way to address #35466, but you also need to look at the other places where _recognized_scalars is used and make sure those tests are updated appropriately

@jbrockmendel
Copy link
Member

Do we want dti == date to do partial-matching the same way we would with a date-only string? (i guess Id be OK punting on this)(

@knabben
Copy link
Contributor Author

knabben commented Jul 30, 2020

Do we want dti == date to do partial-matching the same way we would with a date-only string? (i guess Id be OK punting on this)(

@jbrockmendel feel free to catch this, looks like you have been working with the flow on GH#19800 and GH#19301.

But having dti == date, looks ok, since the date is timestamped after.

@simonjayhawkins simonjayhawkins added Index Related to the Index class or subclasses Timeseries labels Jul 30, 2020
@simonjayhawkins simonjayhawkins added this to the 1.1.1 milestone Jul 30, 2020
@simonjayhawkins
Copy link
Member

@knabben can you add a release note to doc\source\whatsnew\v1.1.1.rst

@TomAugspurger
Copy link
Contributor

It's not clear to me that the behavior on master is incorrect. Shouldn't we be following the behavior of Python's dict, which doesn't consider dates to be datetimes at midnight?

In [18]: d = {datetime.datetime(2000, 1, 1): 1}

In [19]: datetime.date(2000, 1, 1) in d
Out[19]: False

@knabben
Copy link
Contributor Author

knabben commented Jul 31, 2020

It's not clear to me that the behavior on master is incorrect. Shouldn't we be following the behavior of Python's dict, which doesn't consider dates to be datetimes at midnight?

In [18]: d = {datetime.datetime(2000, 1, 1): 1}

In [19]: datetime.date(2000, 1, 1) in d
Out[19]: False

I had the same question in the issue, maybe a date in DateTimeIndex should never be considered and this now should be the correct behavior, this is clear after adding, date in the recognized_scalars, started to break a lot of tests, that were not expecting this to be there (like blocks, internals, etc):

_recognized_scalars = (datetime, np.datetime64, date)

It was passing before because get_loc, was using a fallback for this translation.

cc @jbrockmendel

@jbrockmendel
Copy link
Member

jbrockmendel commented Aug 1, 2020 via email

@knabben
Copy link
Contributor Author

knabben commented Aug 1, 2020

closing the PR for the lack of time to investigate it further, for those going on:

  1. It should be keep the behavior of date in DatetimeIndex equivalence, at least while in the deprecation cycle.
  2. For the long run assess if this behavior still makes sense, deprecating otherwise.

@knabben knabben closed this Aug 1, 2020
@jorisvandenbossche
Copy link
Member

It's not clear to me that the behavior on master is incorrect. Shouldn't we be following the behavior of Python's dict, which doesn't consider dates to be datetimes at midnight?

I think get_loc needs to match __eq__, which needs to match Timestamp.__eq__.

Indeed, right now, Timestamp considers a date to be equal:

In [33]: pd.Timestamp("2012-01-01") == datetime.date(2012, 1, 1)  
Out[33]: True

So I think if we want to change this to False, that should go through a deprecation cycle, and the same for the indexing behaviour.

@jorisvandenbossche
Copy link
Member

Although that for Index/Array, we return False ...

In [34]: pd.array([pd.Timestamp("2012-01-01")]) == datetime.date(2012, 1, 1) 
Out[34]: array([False])

In [35]: pd.Index([pd.Timestamp("2012-01-01")]) == datetime.date(2012, 1, 1) 
Out[35]: array([False])

@jreback jreback removed this from the 1.1.1 milestone Aug 14, 2020
@dwardzinski-jsf
Copy link

dwardzinski-jsf commented Aug 21, 2020

Just tested this PR to see if it would fix the original issue, it looks like it still doesn't :(

>>> datetime.date(2000,1,1) in pd.DatetimeIndex(['2000-01-01'])
False

@jbrockmendel
Copy link
Member

jbrockmendel commented Aug 24, 2020

Indeed, right now, Timestamp considers a date to be equal:

It also looks like Timestamp fails to raise TypeError when doing checking Timestamp[aware] < date.

Also looks like Timestamp(not_midnight) == date returns True because of NotImplemented semantics

@arw2019 arw2019 added Needs Discussion Requires discussion from core team before further action and removed Stale labels Nov 4, 2020
@jreback
Copy link
Contributor

jreback commented Nov 4, 2020

@knabben can you merge master

@jbrockmendel
Copy link
Member

-1, DTI.get_loc should raise on a date object. xref #36131

@jreback
Copy link
Contributor

jreback commented Nov 4, 2020

-1, DTI.get_loc should raise on a date object. xref #36131

good point

thanks for the effort @knabben love to have you contribute on another issue!

@jreback jreback closed this Nov 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Index Related to the Index class or subclasses Needs Discussion Requires discussion from core team before further action Timeseries
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Date objects cannot be compared against a DatetimeIndex
9 participants