nonexistent=shift in tz_localize not precise #24466

Closed
sdementen opened this issue Dec 28, 2018 · 1 comment

@sdementen
Contributor

commented Dec 28, 2018

Problem description

As of today (2018-12-28), I read on http://pandas-docs.github.io/pandas-docs-travis/timeseries.html#nonexistent-times-when-localizing that

A DST transition may also shift the local time ahead by 1 hour creating nonexistent local times. The behavior of localizing a timeseries with nonexistent times can be controlled by the nonexistent argument. The following options are available:

raise: Raises a pytz.NonExistentTimeError (the default behavior)
NaT: Replaces nonexistent times with NaT
shift: Shifts nonexistent times forward to the closest real time

This is a great new feature (i.e. having leeway to manage NonExistentTimeError explicitly)!
To be sure I understand the problem of NonExistentTimeError correctly: is it correct to state that it appears if and only if the time falls within a DST change (jumping one hour ahead)? E.g. if I take tz=CET, DST started on 2018-03-25, so the hour [02:00, 03:00) does not exist and localizing any time in that hour raises a NonExistentTimeError:

Timestamp("2018-03-25T02:33:00").tz_localize("CET")
# pytz.exceptions.NonExistentTimeError: 2018-03-25 02:33:00

Or are there other cases?
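
As a rough illustration of the question above, here is a minimal standard-library sketch (Python 3.9+, independent of pandas) that tests whether a naive wall-clock time falls in a gap by round-tripping it through UTC. The helper name `is_nonexistent` is hypothetical. As far as the IANA tz database is concerned, a gap appears whenever a zone's UTC offset jumps forward, and DST is only the most common cause: Pacific/Apia, for example, skipped the whole of 2011-12-30 when Samoa moved across the date line.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_nonexistent(naive: datetime, zone: str) -> bool:
    """Return True if the naive wall-clock time falls in a gap for `zone`.

    Hypothetical helper: a time inside a gap does not survive a round trip
    through UTC, because no UTC instant maps back to that wall-clock time.
    """
    tz = ZoneInfo(zone)
    aware = naive.replace(tzinfo=tz)
    round_trip = aware.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) != naive


# Inside the CET spring-forward hour from the example above.
print(is_nonexistent(datetime(2018, 3, 25, 2, 33), "CET"))
# First valid wall-clock time after the gap.
print(is_nonexistent(datetime(2018, 3, 25, 3, 0), "CET"))
# Autumn change: the time is ambiguous, but it does exist.
print(is_nonexistent(datetime(2018, 10, 28, 2, 33), "CET"))
# A gap that is not caused by DST.
print(is_nonexistent(datetime(2011, 12, 30, 12, 0), "Pacific/Apia"))
```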

If so, I see the following behaviors that would be useful besides 'shift':

  • 'shift_backward' / 'shift_forward' => Shifts nonexistent times backward/forward by one hour. Example use case for shift_backward: I do some calculation on a local timestamp without tz, which I shift 2 hours backward (to say "take it two hours before"), and then I localize it and get a NonExistentTimeError (e.g. Timestamp("2018-03-25T04:33:00") - DateOffset(hours=2)). As a result of tz_localize('CET') I would like to get the time "2018-03-25T01:33:00+0100" or "2018-03-25T03:33:00+0200" (and not "2018-03-25T01:59:59.99999+0100" or "2018-03-25T03:00:00+0200" as with 'snap_*').
  • 'snap_backward' / 'snap_forward' => Shifts nonexistent times backward/forward to the closest real time (i.e. makes the direction of the shift explicit). Example use case for snap_backward: if I want the value of something known at some instant T and T does not exist, I would prefer the value at some instant T* < T (= 'backward') to avoid using forward-looking information. However, 'snap_backward' is ill-defined for DST, as the closest earlier time for 2018-03-25T02:33:00 in CET is 2018-03-25 01:59:59.999999999... = 2018-03-25 02:00:00, which does not exist. I guess this is the reason why the current 'shift' only offers the forward version, correct?

What I mainly miss in my day-to-day cases is the 'shift_backward' ability.
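
For reference, a minimal sketch of the two "snap" directions discussed above. It assumes a pandas release (0.24 or later, to my knowledge) in which `nonexistent` accepts the spellings 'shift_forward' and 'shift_backward' rather than the single 'shift' quoted from the development docs. The exact sub-second value of the backward result may depend on the timestamp's resolution, so the comments only describe it loosely.

```python
import pandas as pd

# A wall-clock time inside the CET spring-forward gap [02:00, 03:00).
ts = pd.Timestamp("2018-03-25T02:33:00")

# Snap to the first valid instant after the gap (03:00 at +02:00).
snapped_forward = ts.tz_localize("CET", nonexistent="shift_forward")

# Snap to the last representable instant before the gap (just under 02:00 at +01:00).
snapped_backward = ts.tz_localize("CET", nonexistent="shift_backward")

# Replace the nonexistent time with NaT instead of raising.
as_nat = ts.tz_localize("CET", nonexistent="NaT")

print(snapped_forward)
print(snapped_backward)
print(as_nat)
```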

@mroeschke

Member

commented Dec 28, 2018

The reason "snap_backward" behavior was not included is that I didn't think shifting back to '[time]:59:59.999999999' would be particularly useful. If it would be, it could be added alongside shift ("snap_forward").

'shift_backward' / 'shift_forward' behavior could be added as well. nonexistent could take a timedelta-like object, and the nonexistent times would then be shifted by that amount instead.
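
A minimal sketch of the timedelta idea, applied to the use case from the original post. It assumes a pandas release in which `nonexistent` accepts a timedelta-like value (0.24 or later, as far as I know); the variable names are illustrative only.

```python
import pandas as pd

# Naive arithmetic lands inside the gap: 04:33 minus two hours is 02:33.
naive = pd.Timestamp("2018-03-25T04:33:00") - pd.DateOffset(hours=2)

# Shift the nonexistent time by a fixed amount instead of snapping to the gap edge.
one_hour_back = naive.tz_localize("CET", nonexistent=pd.Timedelta(hours=-1))
one_hour_ahead = naive.tz_localize("CET", nonexistent=pd.Timedelta(hours=1))

print(one_hour_back)   # 01:33 at +01:00
print(one_hour_ahead)  # 03:33 at +02:00
```

A shift that is too small to leave the gap (a few seconds, say) would land on another nonexistent time, which pandas rejects with an error rather than guessing.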
