dateutil mishandles today's dates on NetBSD 9.0 etc. #1059

Open
eggert opened this issue Jun 22, 2020 · 6 comments

@eggert

eggert commented Jun 22, 2020

#462 talks about a longstanding bug in dateutil. In 1995, tzcode 95f introduced a 64-bit extension to TZif files, designed to make them work after the year 2038. dateutil's TZif parser does not use this data, causing it to mishandle timestamps after the last explicit 32-bit transition in a TZif file, which means timestamps after 2038 in (say) Los Angeles are mishandled by dateutil. I'm sure you've been meaning to fix this before 2038 rolls around.

Something new has happened, though. A year ago, tzdb 2019b introduced the '-b slim' flag to the zic command, causing it to omit unnecessary data, the idea being to speed up the parsing of TZif files and make them smaller. Since the 32-bit data entries are always unnecessary, they're omitted. NetBSD 9.0, dated 2020-02-14, is using the '-b slim' flag to generate its TZif files, which means dateutil does not work even for today's timestamps on NetBSD. I assume other platforms will follow suit.
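To make the difference concrete, here is a minimal sketch of how one might inspect the two headers of a TZif file and see whether the 32-bit block carries any transitions. The field layout follows the published TZif format (RFC 8536); the helper names (`read_header`, `describe`, `is_slim`) and the demo bytes are hypothetical and are not part of dateutil.

```python
import struct

# TZif header: magic, version byte, 15 reserved bytes, six 32-bit counts.
HEADER = struct.Struct(">4sc15x6L")
COUNT_NAMES = ("isutcnt", "isstdcnt", "leapcnt", "timecnt", "typecnt", "charcnt")


def read_header(data, offset=0):
    """Return (version, counts) for the TZif header at the given offset."""
    magic, raw_version, *counts = HEADER.unpack_from(data, offset)
    if magic != b"TZif":
        raise ValueError("not a TZif header")
    version = 1 if raw_version == b"\x00" else int(raw_version.decode("ascii"))
    return version, dict(zip(COUNT_NAMES, counts))


def v1_block_size(c):
    """Size in bytes of the 32-bit data block that follows the first header."""
    return (c["timecnt"] * 5 + c["typecnt"] * 6 + c["charcnt"]
            + c["leapcnt"] * 8 + c["isstdcnt"] + c["isutcnt"])


def describe(data):
    """Summarize how many transitions each data block announces."""
    version, v1 = read_header(data)
    info = {"version": version, "v1_timecnt": v1["timecnt"]}
    if version >= 2:
        _, v2 = read_header(data, HEADER.size + v1_block_size(v1))
        info["v2_timecnt"] = v2["timecnt"]
    return info


def is_slim(data):
    """Heuristic only: version 2+ with transitions solely in the 64-bit block."""
    info = describe(data)
    return (info["version"] >= 2 and info["v1_timecnt"] == 0
            and info.get("v2_timecnt", 0) > 0)


def make_header(timecnt, typecnt, charcnt):
    return HEADER.pack(b"TZif", b"2", 0, 0, 0, timecnt, typecnt, charcnt)


# Illustrative bytes only: a placeholder 32-bit block (one type, one
# abbreviation byte) followed by a 64-bit header announcing 125 transitions.
slim = make_header(0, 1, 1) + bytes(7) + make_header(125, 5, 20)
print(describe(slim))
print(is_slim(slim))
```

A parser that stops after the first header sees zero transitions in such a file, which is consistent with the UTC result reported below.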

This change means it's time to boost the priority of fixing this bug.

To reproduce the problem, run the shell script datetime-slim-bug.sh (see below) in a writable directory. It creates a slim TZif file 'Los_Angeles' and uses it to convert the time_t value 1592782632 into Los Angeles time. GNU 'date' handles this correctly, but dateutil gets confused and thinks that Los Angeles is observing UTC. On my platform (Fedora 31 x86-64, Python 3.7.7, dateutil 2.8.0) this script outputs:

2020-06-21 16:37:12-07:00
2020-06-21 23:37:12+00:00

The first line (from GNU 'date') is correct; the second line (from dateutil) is wrong.
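As a cross-check that does not depend on any TZif file, the expected value can be recomputed with the standard library alone. This sketch assumes Los Angeles was on daylight time (UTC-7) at that instant, as the GNU 'date' output indicates; the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

now = 1592782632

# Assumption: PDT is UTC-7, matching the first output line above.
pdt = timezone(timedelta(hours=-7), "PDT")

local = datetime.fromtimestamp(now, pdt)
utc = datetime.fromtimestamp(now, timezone.utc)

print(local)  # 2020-06-21 16:37:12-07:00
print(utc)    # 2020-06-21 23:37:12+00:00
```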

Here are the contents of datetime-slim-bug.sh:

#!/bin/sh
export LC_ALL=C
PWD=`pwd`

base64 -d >Los_Angeles <<'EOF'
VFppZjIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAAEAAAAAAAAAVFppZjIA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB9AAAABQAAABT/////XgQawP////+epkig////
/5+7FZD/////oIYqoP////+hmveQ/////8uJGqD/////0iP0cP/////SYSYQ/////9b+dFz/////
2ICtkP/////a/sOQ/////9vAkBD/////3N6lkP/////dqayQ/////96+h5D/////34mOkP/////g
nmmQ/////+FpcJD/////4n5LkP/////jSVKQ/////+ReLZD/////5Sk0kP/////mR0oQ/////+cS
URD/////6CcsEP/////o8jMQ/////+oHDhD/////6tIVEP/////r5vAQ/////+yx9xD/////7cbS
EP/////ukdkQ/////++v7pD/////8HG7EP/////xj9CQ//////J/wZD/////82+ykP/////0X6OQ
//////VPlJD/////9j+FkP/////3L3aQ//////goohD/////+Q9YkP/////6CIQQ//////r4gyD/
////++hmEP/////82GUg//////3ISBD//////rhHIP//////qCoQAAAAAACYKSAAAAAAAYgMEAAA
AAACeAsgAAAAAANxKJAAAAAABGEnoAAAAAAFUQqQAAAAAAZBCaAAAAAABzDskAAAAAAHjUOgAAAA
AAkQzpAAAAAACa2/IAAAAAAK8LCQAAAAAAvgr6AAAAAADNnNEAAAAAANwJGgAAAAAA65rxAAAAAA
D6muIAAAAAAQmZEQAAAAABGJkCAAAAAAEnlzEAAAAAATaXIgAAAAABRZVRAAAAAAFUlUIAAAAAAW
OTcQAAAAABcpNiAAAAAAGCJTkAAAAAAZCRggAAAAABoCNZAAAAAAGvI0oAAAAAAb4heQAAAAABzS
FqAAAAAAHcH5kAAAAAAesfigAAAAAB+h25AAAAAAIHYrIAAAAAAhgb2QAAAAACJWDSAAAAAAI2ra
EAAAAAAkNe8gAAAAACVKvBAAAAAAJhXRIAAAAAAnKp4QAAAAACf+7aAAAAAAKQqAEAAAAAAp3s+g
AAAAACrqYhAAAAAAK76xoAAAAAAs036QAAAAAC2ek6AAAAAALrNgkAAAAAAvfnWgAAAAADCTQpAA
AAAAMWeSIAAAAAAycySQAAAAADNHdCAAAAAANFMGkAAAAAA1J1YgAAAAADYy6JAAAAAANwc4IAAA
AAA4HAUQAAAAADjnGiAAAAAAOfvnEAAAAAA6xvwgAAAAADvbyRAAAAAAPLAYoAAAAAA9u6sQAAAA
AD6P+qAAAAAAP5uNEAAAAABAb9ygAAAAAEGEqZAAAAAAQk++oAAAAABDZIuQAAAAAEQvoKAAAAAA
RURtkAAAAABF89MgAgECAQIDBAIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIB
AgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgECAQIBAgEC
AQIBAgECAQIBAgECAQIBAgECAQIBAgH//5EmAAD//52QAQT//4+AAAj//52QAQz//52QARBMTVQA
UERUAFBTVABQV1QAUFBUAApQU1Q4UERULE0zLjIuMCxNMTEuMS4wCg==
EOF

now=1592782632
TZ=$PWD/Los_Angeles date -d@$now +'%Y-%m-%d %H:%M:%S%:z'
python -c 'if 1:
       import datetime, dateutil.tz
       tz = dateutil.tz.gettz("'"$PWD"'/Los_Angeles");
       print(datetime.datetime.fromtimestamp('$now', tz))
'
@pganssle
Member

Thanks. Now that zoneinfo is in the standard library, and considering I wrote it (and can thus relicense it freely), I expect this to be a reasonably quick fix.

At the same time I'm planning to switch our "fall-back" zones to using the tzdata package, which also uses zic -b slim, so it's good to know that fixing this bug should be a prerequisite to that.
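For comparison, here is a rough sketch of how the standard library's zoneinfo (Python 3.9+) reads a slim-style file through `ZoneInfo.from_file`. The bytes below are a hand-built synthetic file, not the one from the reproduction script: it has no explicit transitions at all and relies entirely on the footer TZ string, which is the same string found at the end of the issue's file. The `make_header` helper and the key name are hypothetical.

```python
import io
import struct
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+


def make_header(timecnt, typecnt, charcnt):
    # magic, version "2", 15 reserved bytes, then the six counts
    return struct.pack(">4sc15x6L", b"TZif", b"2", 0, 0, 0,
                       timecnt, typecnt, charcnt)


# One local time type: UTC-8, not DST, abbreviation index 0.
ttinfo = struct.pack(">lBB", -8 * 3600, 0, 0)

slim = (
    make_header(0, 1, 1) + ttinfo + b"\x00"        # placeholder 32-bit block
    + make_header(0, 1, 4) + ttinfo + b"PST\x00"   # 64-bit block, no transitions
    + b"\nPST8PDT,M3.2.0,M11.1.0\n"                # footer TZ string holds the rules
)

tz = ZoneInfo.from_file(io.BytesIO(slim), key="Synthetic/Los_Angeles")
print(datetime.fromtimestamp(1592782632, tz))  # 2020-06-21 16:37:12-07:00
```

Because the footer rules apply to every timestamp after the last explicit transition, a reader that honors the version 2+ data gets the daylight-time offset even from a file this small.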

pganssle added a commit to pganssle/dateutil that referenced this issue Aug 28, 2020
This should be reverted when support for version 2+ TZif files is added
(see dateutilGH-1059).
@sbraz

sbraz commented Oct 16, 2020

Now that the slim format has become the default, tests fail on distros which packaged tzdata 2020b as-is. It's easy to reproduce on Arch before they reverted to fat bloats in archlinux/svntogit-packages@7d8fb81:

FROM archlinux
RUN pacman --noconfirm -Syu
RUN pacman --noconfirm -S git python-tox
RUN pacman --noconfirm -U https://archive.archlinux.org/packages/t/tzdata/tzdata-2020b-1-x86_64.pkg.tar.zst
RUN git clone --depth 1 https://github.com/dateutil/dateutil/
WORKDIR dateutil
RUN tox -e py38

This yields 30 failed, 1985 passed, 47 skipped, 19 xfailed.

Example error:

___________________ test_resolve_imaginary[tzi4-dt4-dt_exp4] ___________________

tzi = tzfile('/usr/share/zoneinfo/Africa/Monrovia')
dt = datetime.datetime(1972, 1, 7, 0, 30, tzinfo=tzfile('/usr/share/zoneinfo/Africa/Monrovia'))
dt_exp = datetime.datetime(1972, 1, 7, 1, 14, 30, tzinfo=tzfile('/usr/share/zoneinfo/Africa/Monrovia'))

    @pytest.mark.tz_resolve_imaginary
    @pytest.mark.parametrize('tzi, dt, dt_exp', resolve_imaginary_tests)
    def test_resolve_imaginary(tzi, dt, dt_exp):
        dt = dt.replace(tzinfo=tzi)
        dt_exp = dt_exp.replace(tzinfo=tzi)

        dt_r = tz.resolve_imaginary(dt)
>       assert dt_r == dt_exp
E       AssertionError: assert datetime.datetime(1972, 1, 7, 0, 30, tzinfo=tzfile('/usr/share/zoneinfo/Africa/Monrovia')) == datetime.datetime(1972, 1, 7, 1, 14, 30, tzinfo=tzfile('/usr/share/zoneinfo/Africa/Monrovia'))

dateutil/test/test_tz.py:2809: AssertionError
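The expected value in that assertion can be reproduced by hand. The sketch below assumes, per tzdb as far as I know, that Monrovia moved from UTC-0:44:30 to UTC+0 on 1972-01-07, so the wall-clock time 00:30 never occurred and should be shifted forward by the size of the gap. If no transitions are read from the slim file, no gap is detected, which would presumably explain why the datetime comes back unchanged.

```python
from datetime import datetime, timedelta

# Assumed offsets around the 1972-01-07 transition in Africa/Monrovia.
old_offset = timedelta(minutes=-44, seconds=-30)
new_offset = timedelta(0)

gap = new_offset - old_offset          # 0:44:30 of skipped wall-clock time
imaginary = datetime(1972, 1, 7, 0, 30)

resolved = imaginary + gap
print(resolved)  # 1972-01-07 01:14:30
```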

@eggert
Author

eggert commented Oct 31, 2020

I have created PR#1091, which should fix this issue.

@smy2748

smy2748 commented Nov 10, 2020

Just for some more context: it looks like Alpine is now using the newest zic, so time zones aren't working out of the box in those containers.

@eggert
Author

eggert commented Nov 10, 2020

This problem should go away once PR#1091 gets merged into dateutil. I don't know what the schedule for that is.

@pganssle
Member

I might be able to look at it this weekend. Last time I tried to get some stuff merged to prepare a release, the CI was pretty broken and annoyingly hard to fix, which ate up all my OSS time for that day.


4 participants