Inconsistent equality test around DST change with shared tzfile #338

Closed
jamesblackburn opened this Issue Dec 22, 2016 · 4 comments

jamesblackburn commented Dec 22, 2016

Reproduced in dateutil 2.6.0 and 2.4.2 : Python 2.7.11 |Anaconda custom (x86_64)

import copy
from datetime import datetime as dt

from dateutil.tz import gettz

tz = gettz('Europe/London')
x = dt(2007, 3, 25, 1, tzinfo=tz)
# x: datetime.datetime(2007, 3, 25, 1, 0, tzinfo=tzfile('/usr/share/zoneinfo/Europe/London'))

y = dt.fromtimestamp(int(x.strftime('%s')), tz)
# y:  datetime.datetime(2007, 3, 25, 0, 0, tzinfo=tzfile('/usr/share/zoneinfo/Europe/London'))
assert x == y # This fails

However if you use a new tz: gettz('Europe/London') or copy the existing one above, the equality comparison works:

z = dt.fromtimestamp(int(x.strftime('%s')), copy.copy(tz))
# z: datetime.datetime(2007, 3, 25, 0, 0, tzinfo=tzfile('/usr/share/zoneinfo/Europe/London'))
assert x == z  # This passes

This is surprising, as the only difference is the re-use of the tzfile in the first instance. I would have thought tzfile objects are immutable?

>>> x == y
False
>>> x == z
True
>>> y == z
True

Given both datetimes represent the same ms-since-epoch instant I would expect both assertions to pass.

@jamesblackburn jamesblackburn changed the title from Inconsistent equality test around DST change to Inconsistent equality test around DST change with shared tzfile Dec 22, 2016

jamesblackburn added a commit to manahl/arctic that referenced this issue Dec 22, 2016

Copy tzfile to prevent equality issues.
Sharing tzfile objects doesn't appear to be safe particularly on DST
change:
dateutil/dateutil#338
@pganssle

Member

pganssle commented Dec 22, 2016

Hm, it's very surprising that this happens both in 2.4.2 and in 2.6.0, and in Python 2.7.x. I would think that it might be related to this issue with PEP 495. Technically, this may be "correct" behavior, with PEP 495.

@pganssle

Member

pganssle commented Dec 22, 2016

Hm, actually, looking more closely, it seems that this may be the exact opposite of the correct behavior. Very surprising. You definitely should not be making copies of tzinfo objects, which are supposed to be shared. I'll play around with it a bit. If it's only happening with Europe/London, it might be related to #321.

@pganssle pganssle added this to the 2.6.1 milestone Dec 22, 2016

@jamesblackburn

jamesblackburn commented Dec 22, 2016

Thanks for looking @pganssle. I also verified that it is an issue in Python 3.5.1, in case Python's datetime is implicated.

I saw #321 and thought it could be related, but I couldn't come up with a theory as to why the copy / tz reload should make a difference.

@pganssle pganssle modified the milestones: wontfix, 2.6.1 Jun 3, 2017

@pganssle pganssle added the wontfix label Jun 3, 2017

@pganssle

Member

pganssle commented Jun 3, 2017

@jamesblackburn OK, I've found the problem - March 25, 2007 at 1 AM is not a valid time in Europe/London, which is why the semantics are wonky:

from datetime import datetime as dt
from dateutil import tz
LON = tz.gettz('Europe/London')
tz.datetime_exists(dt(2007, 3, 25, 1, tzinfo=LON))  # False
tz.datetime_exists(dt(2007, 3, 25, 0, tzinfo=LON))  # True

The reason x == y and x == z give different results is that x == y is treated as an in-zone comparison (x and y share the same tzinfo object), so only the wall clock (midnight vs. 1 AM) is compared, whereas x == z is treated as an inter-zone comparison (z has a different tzinfo object), so both are converted to UTC and the UTC values are compared. For the y == z comparison, that works well because neither y nor z is an imaginary time, and they indeed both represent the same UTC time, being the same wall time in the same zone.

However, x doesn't exist, so it does not map to any UTC time. When you compare it to something in another zone, it is erroneously mapped to the point in UTC that is one DST offset before the beginning of the gap, which is the same value you got when you created y and z by passing them through a stage where they were represented as UTC.
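
To make the in-zone vs. inter-zone distinction concrete, here is a minimal sketch that uses only the standard library. ToyLondon is a hypothetical stand-in for the real tzfile: it models a single one-hour gap and nothing else, so this illustrates the comparison rule in datetime rather than dateutil's actual behavior.

```python
from datetime import datetime, timedelta, tzinfo


class ToyLondon(tzinfo):
    """Hypothetical toy zone: UTC+0 before 01:00 wall time on 2007-03-25, UTC+1 after."""

    def dst(self, dt):
        # Decide the offset from the wall-clock reading alone.
        if dt.replace(tzinfo=None) >= datetime(2007, 3, 25, 1):
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        return self.dst(dt)

    def tzname(self, dt):
        return "BST" if self.dst(dt) else "GMT"


shared = ToyLondon()
other = ToyLondon()  # plays the role of copy.copy(tz)

x = datetime(2007, 3, 25, 1, tzinfo=shared)  # wall time inside the gap
y = datetime(2007, 3, 25, 0, tzinfo=shared)  # same tzinfo object as x
z = datetime(2007, 3, 25, 0, tzinfo=other)   # same wall time as y, distinct tzinfo object

x_eq_y = (x == y)  # same tzinfo object: wall clocks compared, 01:00 vs 00:00
x_eq_z = (x == z)  # different tzinfo objects: both sides converted to UTC first
y_eq_z = (y == z)  # different tzinfo objects, same UTC instant

print(x_eq_y, x_eq_z, y_eq_z)
```

With this toy zone the three comparisons reproduce the pattern from the report: only the comparison that shares a tzinfo object looks at the wall clock, which is why x and y differ while x and z do not.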

So, the lessons are:

  1. Check to see if the datetimes you construct are imaginary when you create them if you're working in zones with gaps.
  2. Use tzinfo singletons wherever possible. In this case y had what I would consider to be closer to the right answer (though it's not clear that there is a right answer since x does not actually represent any real time).
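
As a rough illustration of the first lesson, the check can be done at construction time. This is only a sketch: make_aware is a hypothetical helper name, not part of dateutil, and it assumes a dateutil version that provides tz.datetime_exists as used above.

```python
from datetime import datetime

from dateutil import tz

LON = tz.gettz('Europe/London')


def make_aware(naive, zone):
    """Hypothetical helper: attach `zone`, rejecting wall times that fall in a gap."""
    aware = naive.replace(tzinfo=zone)
    if not tz.datetime_exists(aware):
        raise ValueError('%s does not exist in this zone' % naive)
    return aware


ok = make_aware(datetime(2007, 3, 25, 0), LON)  # midnight exists

try:
    make_aware(datetime(2007, 3, 25, 1), LON)   # 1 AM is inside the gap
    rejected = False
except ValueError:
    rejected = True
```

Raising on imaginary times is just one possible policy; shifting the value forward out of the gap is another, and which one is appropriate depends on the application.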

@pganssle pganssle closed this Jun 3, 2017
