Inconsistent Timestamp arithmetic #8554

Closed
Tracked by #18824
willpan opened this issue Oct 14, 2014 · 4 comments · Fixed by #22163
Labels
API Design, Bug, Error Reporting (incorrect or improved errors from pandas), Timedelta (Timedelta data type)
Comments


willpan commented Oct 14, 2014

Subtracting a scalar datetime from a Series returns timedeltas, but subtracting it from a DataFrame does not.

import pandas as pd
series = pd.Series(pd.date_range('2000-1-1', periods=100), index=range(0, 100))
df = pd.DataFrame(series)
date = pd.Timestamp('1995-1-1')

series - date
#0    1826 days
#1    1827 days
#2    1828 days
# ....

df - date
#             0
#0  1975-01-01
#1  1975-01-02
#2  1975-01-03
# ....

It seems like both should return timedeltas.
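
For reference, a minimal sketch of the expected result, obtained by applying the Series operation to each column of the frame. It reuses the names from the report above; `series_result` and `frame_result` are names introduced here for illustration, and the column-wise `apply` is a workaround, not necessarily how pandas resolves this internally.

```python
import pandas as pd

series = pd.Series(pd.date_range("2000-1-1", periods=100), index=range(0, 100))
df = pd.DataFrame(series)
date = pd.Timestamp("1995-1-1")

# Series - Timestamp already yields timedeltas
series_result = series - date

# Workaround for the frame: run the same Series op on every column
frame_result = df.apply(lambda col: col - date)

print(series_result.dtype)
print(frame_result.dtypes)
```

With this approach both results hold timedeltas, which is the behaviour the report asks `df - date` to have directly.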

Contributor

jreback commented Oct 14, 2014

hmm, this should work, marking as a bug

jreback added the Bug and Timedelta labels Oct 14, 2014
jreback added this to the 0.15.1 milestone Oct 14, 2014
Contributor

jreback commented Oct 28, 2014

this seems trivial, but is actually complicated because of this:

In [10]: import numpy as np; import pandas as pd

In [11]: df = pd.DataFrame({'A': np.arange(5), 'B': pd.date_range('20130101', periods=5), 'C': 'foo'})

In [12]: df
Out[12]: 
   A          B    C
0  0 2013-01-01  foo
1  1 2013-01-02  foo
2  2 2013-01-03  foo
3  3 2013-01-04  foo
4  4 2013-01-05  foo

In [13]: df.dtypes
Out[13]: 
A             int64
B    datetime64[ns]
C            object
dtype: object

Practically no arithmetic operations make sense with this frame, e.g.
df+1 and df-Timestamp('20130101') both blow up.

That said, I agree that an op on compatible dtypes should work,

e.g.

df.select_dtypes(include=['datetime'])-Timestamp('20130101') should work
(and .... + Timestamp('20130101') should raise).

If these were Series, all would be good and the ops would all work appropriately (and that is easy to achieve
by applying these ops).
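
A rough sketch of the "compatible dtype" idea from this comment, assuming the mixed frame shown earlier: select only the datetime columns, then apply the Series subtraction to each of them. The names `ts`, `dt_only`, `deltas` and `add_raised` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": np.arange(5),
    "B": pd.date_range("20130101", periods=5),
    "C": "foo",
})
ts = pd.Timestamp("20130101")

# Keep only the datetime columns so every remaining column supports the op
dt_only = df.select_dtypes(include=["datetime"])

# Apply the Series-level subtraction column by column
deltas = dt_only.apply(lambda col: col - ts)

# Adding two datetimes is not meaningful and raises at the Series level
try:
    df["B"] + ts
    add_raised = False
except TypeError:
    add_raised = True

print(deltas)
print("addition raised:", add_raised)
```

Filtering by dtype first sidesteps the int64 and object columns, which is why the Series-level ops can be reused unchanged.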

@jbrockmendel
Member

Looked into this briefly, threw up my hands somewhere between core.internals and core.ops. Is the spaghetti in ops carefully optimized or the result of years of frankensteining?

Contributor

jreback commented Oct 31, 2017

> Looked into this briefly, threw up my hands somewhere between core.internals and core.ops. Is the spaghetti in ops carefully optimized or the result of years of frankensteining?

the code in ops is mostly to handle multiple dispatch (before it became a thing)
