Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: TimeStamp looses frequency info on arithmetic ops #4547

Closed
mvds314 opened this issue Aug 13, 2013 · 23 comments · Fixed by #6560
Closed

BUG: TimeStamp looses frequency info on arithmetic ops #4547

mvds314 opened this issue Aug 13, 2013 · 23 comments · Fixed by #6560
Labels
Bug Frequency DateOffsets
Milestone

Comments

@mvds314
Copy link

mvds314 commented Aug 13, 2013

Running the follow code generates an error using pandas 0.11 (with winpython 2.7.5.2 64 bit):
import pandas as pd
ts1=pd.date_range('1/1/2000',periods=1,freq='Q')[0]
ts1-1-1

Error:
Traceback (most recent call last):
File "", line 1, in
File "tslib.pyx", line 566, in pandas.tslib._Timestamp.sub (pandas\tslib.c:10775)
File "tslib.pyx", line 550, in pandas.tslib._Timestamp.add (pandas\tslib.c:10477)
ValueError: Cannot add integral value to Timestamp without offset.

Reason:
ts1.freq has value:
1 QuarterEnd: startingMonth=12, kwds={'startingMonth': 12}, offset=3 MonthEnds
but
(ts1-1).freq has no value

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

You are selecting a single timestamp out, by definition it does not have a frequency; try selecting with a slice (even a len 1 slice)

In [30]: ts1=pd.date_range('1/1/2000',periods=1,freq='Q')

In [31]: ts1
Out[31]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-03-31 00:00:00]
Length: 1, Freq: Q-DEC, Timezone: None

In [32]: ts1[0:1]
Out[32]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-03-31 00:00:00]
Length: 1, Freq: Q-DEC, Timezone: None

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

Thnx for the reply. I understand what you mean. However, Timestamp seems to have a frequency.
Since, define ts1 as:
ts1=pd.date_range('1/1/2000',periods=1,freq='Q')[0]
Now ts1 is a timestamp. And
ts1-1
gives 30/09/1999.
Furthermore: ts1-2 gives 30/06/1999
But: ts1-1-1 does not.

Also, if you ask me, it makes sense to increment timestamps with +1 and -1, since that allows you to easily index DatetimeIndex objects

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

what does ts1-1-1 mean?

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

ts1-1-1=(ts1-1)-1
That is, substract 1 from the timestamp ts1 to obtain 30/09/1999 and then subtract 1 again to obtain 30/06/1999
However, the frequency information is lost along the way and this generates an error.

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

A single key selection simply has different semantics (across all of pandas), rather than a slice,

The following does what you want (note that I leave the index as is), then select the element at the end

In [15]: ts1=pd.date_range('1/1/2000',periods=1,freq='Q')

In [16]: (ts1-1)-1
Out[16]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[1999-09-30 00:00:00]
Length: 1, Freq: Q-DEC, Timezone: None

In [17]: ((ts1-1)-1)[0]
Out[17]: Timestamp('1999-09-30 00:00:00', tz=None)

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

Thnx for the solution, I did something like to as a "dirty hack" to fix the code.

However, still, if TimeStamp has an attribute freq, as it does. See for example:
ts1=pd.date_range('1/1/2000',periods=1,freq='Q')[0]
print ts[0].freq
And incremental operators are defined for TimeStamps: that is
ts1-1
Then frequency information should not be lost.

The whole purpose of have have timestamps with a frequency attribute is to use incremental operators on them right?

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

Ok this is a bug (but not where I thought);

The timestamp has the freq when its taken from the datetimeindex (its not printing it, which may be a separate issue), but its there

In [1]: ts1=pd.date_range('1/1/2000',periods=1,freq='Q')[0]

In [2]: ts1.freq
Out[2]: <1 QuarterEnd: startingMonth=12, kwds={'startingMonth': 12}, offset=<3 MonthEnds>>

The sub operation on a Timestamp is treated correctly, but the new freq is not propogated to the new date

In [12]: ts1.__sub__(1).freq
<This returns None>

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

I completely agree! :)

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

this is a very straightforward change in pandas/tslib.pyx/Timestamp/__add__, just need to pass the offset when creating the new Timestamp objects; but will need some tests (and MAYBE modification of some existing tests)...

up for a PR?

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

Good to hear that the change is straightforward!

What's a PR? :)

@cpcloud
Copy link
Member

cpcloud commented Aug 13, 2013

PR == pull request

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

have a look here http://pandas.pydata.org/developers.html

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

Thanks for submitting your pull request so quickly. If I understand you correctly, you want me to pull you pull request and then we can discuss whether the issue is solved right?
I'm really used to working with git or any other version control system, so I might need some time to get get it working correctly. I'm going on vacation for a several day as of Thursday. So, it might take untill the weekend of the 24th until we can discuss your pull request. Would that be ok?

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

that is all fine
we r here to help
it is pretty nice when u find a bug them submit a patch in order to fix it!

@jreback
Copy link
Contributor

jreback commented Aug 13, 2013

Oh, i just reread your comment. I would like you to actually do the patch (by submitting a pull request), and add tests as needed. It helps you understand the codebase and even how to use pandas better!

@mvds314
Copy link
Author

mvds314 commented Aug 13, 2013

Sure, I'll try. I think it will work out, hardest part will be getting used to the conventions and git interface ;-)

@cpcloud
Copy link
Member

cpcloud commented Aug 13, 2013

i'm happy to help you with git. i used to be pretty terrible at it, now i'm less terrible at it. i know it can seem like a mountain of complexity but once you get the hang of it you'll wonder why anyone would ever use anything else especially once you get in the habit of branching all the time..

@cpcloud
Copy link
Member

cpcloud commented Aug 13, 2013

also see #3156 for some tools that will make git/github integration easier, esp. hub.

@jreback
Copy link
Contributor

jreback commented Sep 27, 2013

@Martin31415926 doing a PR for this?

@mvds314
Copy link
Author

mvds314 commented Sep 28, 2013

Hi jreback, I would like too... but I'm finishing my PhD thesis at the moment, so unfortunately I can't make time for it at the moment. Sorry :(

@rosnfeld
Copy link
Contributor

rosnfeld commented Mar 4, 2014

You can assign this to me - I'm in this code already looking at another minor Timestamp addition/subtraction issue. (which I haven't created an issue for, I figured I would just submit the PR)

@jreback
Copy link
Contributor

jreback commented Mar 4, 2014

gr8! thanks @rosnfeld

@rosnfeld
Copy link
Contributor

rosnfeld commented Mar 5, 2014

Design question: what if somebody adds/subtracts a timedelta (or timedelta64) that is not in keeping with the Timestamp's frequency?

timestamp = date_range('2014-03-05', periods=10, freq='D')[0]
new_timestamp = timestamp + datetime.timedelta(seconds=1)

Now it's a little weird to "mix" new_timestamp and timestamp, as they are no longer "aligned". One could say they have the same frequency or period, but different "phase".

Maybe we just copy the frequency over to the new object and trust that users know what they are doing? Or we don't support copying frequency with anything but integer addition/subtraction. I'm inclined to do the former. I feel we can do better than just not copying it, but doing any checking is going to be expensive; I'm not aware of a way to check that a new timestamp "fits" a given frequency-driven series of Timestamps other than by constructing something like a DatetimeIndex and testing for inclusion.

jreback added a commit that referenced this issue Mar 7, 2014
BUG: preserve frequency across Timestamp addition/subtraction (#4547)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Frequency DateOffsets
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants