nonexistent=shift in tz_localize not precise #24466

Closed
sdementen opened this issue Dec 28, 2018 · 1 comment · Fixed by #24493
Labels
Enhancement Timezones Timezone data dtype

@sdementen
Contributor

Problem description

As of today (2018-12-28), I read on http://pandas-docs.github.io/pandas-docs-travis/timeseries.html#nonexistent-times-when-localizing that

A DST transition may also shift the local time ahead by 1 hour creating nonexistent local times. The behavior of localizing a timeseries with nonexistent times can be controlled by the nonexistent argument. The following options are available:

raise: Raises a pytz.NonExistentTimeError (the default behavior)
NaT: Replaces nonexistent times with NaT
shift: Shifts nonexistent times forward to the closest real time

This is a great new feature (i.e. having leeway to manage NonExistentTimeError explicitly)!
To be sure I understand NonExistentTimeError correctly: is it right to say that it occurs if and only if the time falls within a DST change (a jump one hour ahead)? E.g. with tz=CET, DST started on 2018-03-25, so the hour [02:00, 03:00) does not exist and localizing any time in it raises a NonExistentTimeError:

from pandas import Timestamp
Timestamp("2018-03-25T02:33:00").tz_localize("CET")
# pytz.exceptions.NonExistentTimeError: 2018-03-25 02:33:00

Or are there other cases?
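
One way to probe this question without pandas is a round trip through UTC with the standard library: a wall-clock time that does not survive the round trip falls in a gap. This is a minimal sketch, not how pandas or pytz detect such times. The helper name `is_nonexistent` is hypothetical, and the code assumes Python 3.9+ with time zone data available to `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def is_nonexistent(naive: datetime, tz_name: str) -> bool:
    """Return True if the naive wall-clock time falls in a gap of tz_name.

    Hypothetical helper: attach the zone, convert to UTC and back, and
    check whether the wall-clock reading changed on the way.
    """
    tz = ZoneInfo(tz_name)
    local = naive.replace(tzinfo=tz)
    round_trip = local.astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) != naive


print(is_nonexistent(datetime(2018, 3, 25, 2, 33), "CET"))  # True
print(is_nonexistent(datetime(2018, 3, 25, 3, 33), "CET"))  # False
```

The same helper can be pointed at other zones and dates to look for gaps that are not tied to the spring DST change.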

If so, I see the following behaviors that would be useful in addition to 'shift':

  • 'shift_backward' / 'shift_forward': shift nonexistent times backward/forward by one hour. Example use case for shift_backward: I do some calculation on a local timestamp without tz, shifting it 2 hours backward (to say "take it two hours before"), and then I localize and get a NonExistentTimeError (e.g. Timestamp("2018-03-25T04:33:00") - DateOffset(hours=2)). As the result of tz_localize('CET') I would like to get "2018-03-25T01:33:00+0100" or "2018-03-25T03:33:00+0200" (and not "2018-03-25T01:59:59.99999+0100" or "2018-03-25T03:00:00+0200" as with 'snap_*').
  • 'snap_backward' / 'snap_forward': shift nonexistent times backward/forward to the closest real time (i.e. make the direction of the shift explicit). Example use case for snap_backward: if I want the value of something known at some instant T and T does not exist, I would prefer the value at some instant T* < T ('backward') to avoid "forward-looking information". However, 'snap_backward' is ill-defined for DST: the closest earlier time for 2018-03-25T02:33:00 in CET is 2018-03-25 01:59:59.999999999... = 2018-03-25 02:00:00, which does not exist. I guess this is why the current 'shift' only offers the forward version, correct?

What I mainly miss in my day-to-day cases is the 'shift_backward' ability.
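
The one-hour 'shift_backward' / 'shift_forward' results described above can be reproduced with the standard library's `fold` attribute (PEP 495). This is only a sketch of the requested semantics, not a pandas API. The helper name `shift_nonexistent` is hypothetical, and the code assumes Python 3.9+ with time zone data available to `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def shift_nonexistent(naive: datetime, tz_name: str, forward: bool) -> datetime:
    """Localize naive; if it falls in a gap, move it by the width of the gap.

    Hypothetical helper. Inside a gap, fold=0 applies the UTC offset in
    force before the transition and fold=1 the offset in force after it
    (PEP 495), so a round trip through UTC lands after or before the gap.
    Times that exist and are unambiguous come back unchanged for both folds.
    """
    tz = ZoneInfo(tz_name)
    local = naive.replace(tzinfo=tz, fold=0 if forward else 1)
    return local.astimezone(timezone.utc).astimezone(tz)


ts = datetime(2018, 3, 25, 2, 33)
print(shift_nonexistent(ts, "CET", forward=True).isoformat())   # 2018-03-25T03:33:00+02:00
print(shift_nonexistent(ts, "CET", forward=False).isoformat())  # 2018-03-25T01:33:00+01:00
```

Both results keep the minutes of the original timestamp, which is the difference from the 'snap_*' variants that land on the edge of the gap.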

@mroeschke
Member

The reason "snap_backward" behavior was not included is that I didn't think shifting back to '[time]:59:59.999999999' would be especially useful. If it would be, it could be added alongside shift ("snap_forward").

'shift_backward' / 'shift_forward' behavior could be added as well: nonexistent could take a timedelta-like object, and the nonexistent times would be shifted by that amount instead.
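
For reference, released pandas versions (0.24 and later) document `nonexistent` values along these lines: 'shift_forward', 'shift_backward', 'NaT', 'raise', or a timedelta-like object. The sketch below assumes such a version is installed, and the option names may differ from what was available when this thread was written. The exact sub-second value returned by 'shift_backward' is not shown because it can depend on the timestamp's resolution.

```python
import pandas as pd

ts = pd.Timestamp("2018-03-25T02:33:00")  # falls in the CET spring-forward gap

# Snap to the edges of the gap.
fwd = ts.tz_localize("CET", nonexistent="shift_forward")   # first valid time after the gap
bwd = ts.tz_localize("CET", nonexistent="shift_backward")  # last valid time before the gap

# Shift by an explicit timedelta-like amount, keeping the minutes.
plus = ts.tz_localize("CET", nonexistent=pd.Timedelta(hours=1))
minus = ts.tz_localize("CET", nonexistent=pd.Timedelta(hours=-1))

# Replace the nonexistent time with NaT instead of raising.
nat = ts.tz_localize("CET", nonexistent="NaT")

print(fwd, bwd, plus, minus, nat, sep="\n")
```

With a one-hour timedelta in either direction, the results correspond to the "03:33+0200" and "01:33+0100" timestamps requested in the issue.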
