Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DISC] DatetimeIndex/TimedeltaIndex ops with integers should deprecated #21939

Closed
jbrockmendel opened this issue Jul 17, 2018 · 3 comments

Comments

Projects
None yet
4 participants
@jbrockmendel
Copy link
Member

commented Jul 17, 2018

An example of the behavior that should be deprecated:

In [2]: dti = pd.date_range('2016-01-01', periods=3)
In [3]: dti + 1
Out[3]: DatetimeIndex(['2016-01-02', '2016-01-03', '2016-01-04'], dtype='datetime64[ns]', freq='D')
In [4]: dti[1] + 100
Out[4]: Timestamp('2016-04-11 00:00:00', freq='D')

In [5]: tdi = pd.timedelta_range('1D', periods=3)
In [6]: tdi - 1
Out[6]: TimedeltaIndex(['0 days', '1 days', '2 days'], dtype='timedelta64[ns]', freq='D')
In [7]: tdi[-1] - 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-ac830ffbe976> in <module>()
----> 1 tdi[-1] - 1

TypeError: unsupported operand type(s) for -: 'Timedelta' and 'int'

This TypeError occurs because TimedeltaIndex._box_func does not pass self.freq to Timedelta, and is one of many ways that this behavior introduces inconsistencies into the code.

  • Integer add/sub for DTI/DTI just passes through to self.shift(n), so there is a ready alternative.

  • This behavior is inconsistent between DatetimeIndex and Series[datetime64] (#20937) (same for TDI)

  • freq is not really a useful part of the Timestamp or Timedelta abstraction (#15146).

  • The existence of a Timestamp.freq gives many users the incorrect impression that Period.freq behaves the same way (#4591, #4878, #5091, #12379)

  • to_datetime fails to preserve freq in some cases (need to track down the issue)

  • Timestamp and Timedelta __eq__ ignores freq

Bottom line: DTI/TDI/Timestamp/Timedelta arithmetic with integers should be deprecated to a) cut out a bunch of hard-to-maintain code and b) avoid internal inconsistencies.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jul 17, 2018

you are made the counter-argument above though. this is just broken. i think its more confusing that we support this on DTI with a freq actually. IOW I would be in favor of only allowing this on Periods. but we right now have mostly good support on DTI, pretty good on Peridos, and a mishmash on TD/TDI. so I agree fix or remove.

why do you think freq is not useful here and but useful for Timestamp/DTI? (e.g. #15146)

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jul 17, 2018

you are made the counter-argument above though

Not sure what you mean, but it sounds like I communicated poorly. I think we should (subject to a deprecation cycle)

  • get rid of Timestamp.freq attribute entirely
  • get rid of Timedelta.freq attribute entirely
  • disable DTI.__add__(int), DTI.__sub__(int), DTI.__add__(arraylike[int]), DTI.__sub__(arraylike[int])
  • disable TDI.__add__(int), TDI.__sub__(int), TDI.__add__(arraylike[int]), TDI.__sub__(arraylike[int])
@mroeschke

This comment has been minimized.

Copy link
Member

commented Jul 17, 2018

+1 on this depreciation. This behavior is not currently documented and is not explicit compared to adding datetime-like objects.

For the same reason, I am also not a big fan of integer arithmetic with periods as well but it seems we have documented it as such.
https://pandas.pydata.org/pandas-docs/stable/timeseries.html#period

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.