.to() fails with large dates due to dateutil timestamp overflow #991

Open
matthuisman opened this issue Jun 14, 2021 · 9 comments

@matthuisman

matthuisman commented Jun 14, 2021

Issue Description

(I know the issue is actually inside dateutil, but dateutil isn't written to support larger timestamps like arrow is)

64-bit

import arrow

date = arrow.get('3100-01-01T07:00:00Z')
print(date)
print(date.to('local'))
Traceback (most recent call last):
  File "C:\Users\Matt\Desktop\test2.py", line 4, in <module>
    print(date.to("local"))
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\arrow\arrow.py", line 1076, in to
    dt = self._datetime.astimezone(tz)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\_common.py", line 144, in fromutc
    return f(self, dt)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\_common.py", line 258, in fromutc
    dt_wall = self._fromutc(dt)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\_common.py", line 222, in _fromutc
    dtoff = dt.utcoffset()
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\tz.py", line 222, in utcoffset
    if self._isdst(dt):
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\tz.py", line 291, in _isdst
    dstval = self._naive_is_dst(dt)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\tz.py", line 260, in _naive_is_dst
    return time.localtime(timestamp + time.timezone).tm_isdst
OSError: [Errno 22] Invalid argument

32-bit (a user with a Raspberry Pi 3B+ initially reported this to me; 2050 isn't that far away)

import arrow

date = arrow.get('2050-01-01T07:00:00Z')
print(date)
print(date.to('local'))
  File "/home/osmc/.kodi/addons/slyguy.disney.plus/resources/lib/plugin.py", line 484, in _parse_video
    available = available.to('local')
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/arrow/arrow.py", line 722, in to
    dt = self._datetime.astimezone(tz)
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/dateutil/tz/_common.py", line 144, in fromutc
    return f(self, dt)
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/dateutil/tz/_common.py", line 258, in fromutc
    dt_wall = self._fromutc(dt)
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/dateutil/tz/_common.py", line 222, in _fromutc
    dtoff = dt.utcoffset()
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/dateutil/tz/tz.py", line 222, in utcoffset
    if self._isdst(dt):
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/dateutil/tz/tz.py", line 291, in _isdst
    dstval = self._naive_is_dst(dt)
  File "/home/osmc/.kodi/addons/script.module.slyguy/resources/modules/dateutil/tz/tz.py", line 260, in _naive_is_dst
    return time.localtime(timestamp + time.timezone).tm_isdst
ValueError: timestamp out of range for platform time_t

Due to the timestamp being too large here:
https://github.com/dateutil/dateutil/blob/master/dateutil/tz/tz.py#L259
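
To make the 32-bit case concrete, here is a minimal sketch of the arithmetic. It assumes the platform's time_t is a signed 32-bit integer, which is an assumption about the Raspberry Pi setup and is not confirmed in this report; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: time_t is a signed 32-bit count of seconds since the Unix epoch.
INT32_MAX = 2**31 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Last instant representable under that assumption.
limit = EPOCH + timedelta(seconds=INT32_MAX)

# The date from the 32-bit report above.
target = datetime(2050, 1, 1, 7, tzinfo=timezone.utc)
target_seconds = (target - EPOCH).total_seconds()
exceeds_limit = target_seconds > INT32_MAX

print(limit.isoformat())  # 2038-01-19T03:14:07+00:00
print(exceeds_limit)      # True
```

Under that assumption any date after January 2038 would be out of range for time.localtime, which would be consistent with the ValueError above.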

If you hack _naive_is_dst to something like below

def _naive_is_dst(self, dt):
    timestamp = _datetime_to_timestamp(dt) + time.timezone

    MAX_TIMESTAMP = 32503719599.0
    MAX_TIMESTAMP_MS = MAX_TIMESTAMP * 1000
    MAX_TIMESTAMP_US = MAX_TIMESTAMP * 1000000

    if timestamp > MAX_TIMESTAMP:
        if timestamp < MAX_TIMESTAMP_MS:
            timestamp /= 1e3
        elif timestamp < MAX_TIMESTAMP_US:
            timestamp /= 1e6

    return time.localtime(timestamp).tm_isdst

it works as intended.
So maybe arrow needs to implement its own astimezone() so that it can use its normalize_timestamp function.

For my workaround, I simply updated the dateutil code to use arrow's normalize_timestamp:
matthuisman/slyguy.addons@cccc92a
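
One caveat about the hack, shown as a minimal standalone sketch. The constants are copied from the snippet in this comment; `normalize` is a hypothetical stand-in that mirrors its branching (it is not dateutil's or arrow's actual code), and the `time.timezone` adjustment is left out for simplicity. Under those assumptions, a genuine seconds timestamp for the year 3100 is rescaled as if it were in milliseconds, so `tm_isdst` would be looked up for a date in 1971 instead of 3100. That avoids the exception, but the DST flag may not belong to the requested date.

```python
import calendar
import time

# Constants copied from the hack above.
MAX_TIMESTAMP = 32503719599.0
MAX_TIMESTAMP_MS = MAX_TIMESTAMP * 1000
MAX_TIMESTAMP_US = MAX_TIMESTAMP * 1000000


def normalize(timestamp):
    """Hypothetical helper mirroring the branching of the hack."""
    if timestamp > MAX_TIMESTAMP:
        if timestamp < MAX_TIMESTAMP_MS:
            timestamp /= 1e3
        elif timestamp < MAX_TIMESTAMP_US:
            timestamp /= 1e6
    return timestamp


# Seconds since the epoch for 3100-01-01T07:00:00Z.
original = calendar.timegm((3100, 1, 1, 7, 0, 0, 0, 0, 0))
rescaled = normalize(original)

# The rescaled value is a thousand times smaller and falls in 1971.
print(original, rescaled, time.gmtime(rescaled).tm_year)
```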

System Info

  • 🖥 OS name and version:
  • 🐍 Python version:
  • 🏹 Arrow version:
@anishnya
Member

Thanks @matthuisman for the report. @jadchaar, @krisfremen and I will take a look at this soon and see what we can do.

@jadchaar
Member

I am unable to reproduce this on my 64-bit macOS machine, so I think it is limited to Linux and Windows. I think this is where the exception is triggered (our to() wrapper calls the Arrow constructor):

arrow/arrow/arrow.py

Lines 176 to 178 in 7c9632c

self._datetime = dt_datetime(
    year, month, day, hour, minute, second, microsecond, tzinfo, fold=fold
)

Therefore, we may be able to wrap this in a try/except and in the except block, get the timestamp, normalize it (using normalize_timestamp), and then extract an arrow object from that.
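
The try/except shape can be illustrated with the standard library alone. This is a minimal sketch and not arrow's code: `astimezone_with_fallback` is a hypothetical helper, it wraps `datetime.astimezone` (the call that appears in the tracebacks) instead of the constructor, and it falls back to UTC instead of a normalized timestamp.

```python
from datetime import datetime, timedelta, timezone


def astimezone_with_fallback(dt, tz, fallback=timezone.utc):
    """Convert dt to tz; if the tzinfo lookup fails, convert to fallback.

    Hypothetical helper for illustration. The exception types are the ones
    seen in this thread (OSError, ValueError) plus OverflowError.
    """
    try:
        return dt.astimezone(tz)
    except (OSError, OverflowError, ValueError):
        return dt.astimezone(fallback)


# A fixed-offset zone never consults time.localtime, so this conversion
# succeeds without using the fallback.
dt = datetime(3100, 1, 1, 7, tzinfo=timezone.utc)
print(astimezone_with_fallback(dt, timezone(timedelta(hours=12))))
```

Falling back to UTC silently changes the displayed offset, so a real fix would probably need to surface that to the caller in some way.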

Are you able to reproduce this on your end @anishnya? We should probably get this reproduced either on a local machine or on the CI builds so we can ensure our patch works as expected once we attempt a fix.

@matthuisman
Author

matthuisman commented Jun 22, 2021

C:\Users\Matt\Desktop>py -3 test.py
3100-01-01T07:00:00+00:00
Traceback (most recent call last):
  File "C:\Users\Matt\Desktop\test.py", line 4, in <module>
    print(date.to('local'))
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\arrow\arrow.py", line 1076, in to
    dt = self._datetime.astimezone(tz)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\_common.py", line 144, in fromutc
    return f(self, dt)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\_common.py", line 258, in fromutc
    dt_wall = self._fromutc(dt)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\_common.py", line 222, in _fromutc
    dtoff = dt.utcoffset()
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\tz.py", line 222, in utcoffset
    if self._isdst(dt):
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\tz.py", line 291, in _isdst
    dstval = self._naive_is_dst(dt)
  File "C:\Users\Matt\AppData\Local\Programs\Python\Python39\lib\site-packages\dateutil\tz\tz.py", line 260, in _naive_is_dst
    return time.localtime(timestamp + time.timezone).tm_isdst
OSError: [Errno 22] Invalid argument

The first print works fine; it's only the to() call that fails.
The line it fails on is dt = self._datetime.astimezone(tz)
I'm pretty confident the issue is inside dateutil, so maybe arrow will need to create its own astimezone using normalize_timestamp? Or possibly dateutil will fix it eventually?

@anishnya
Member

@matthuisman dateutil has seen some recent work, but there was a stretch of about a year with zero commits. I'm not sure how likely it is that dateutil will fix this issue or, if they do, when the fix will be merged. We've had some internal discussions (well before this issue came up) about dropping the dateutil dependency from Arrow because of the inconsistency of dateutil's maintenance, but we haven't made a decision yet.

@anishnya
Member

anishnya commented Jun 23, 2021

I've also tried on my local machine (64-bit macOS) and haven't been able to reproduce the issue either. I'll try on a Windows machine and post an update.

@matthuisman
Author

matthuisman commented Jun 23, 2021

I think the issue is the same timestamp bug that requires the below lines in constants.py:
https://github.com/arrow-py/arrow/blob/master/arrow/constants.py#L16

You should be able to reproduce it on macOS with something like the following:

import arrow
date = arrow.get(arrow.constants.MAX_TIMESTAMP).shift(years=100)
print(date)
print(date.to("local"))

@anishnya anishnya added this to To do in Release 1.2.0 Jun 24, 2021
@jadchaar
Member

Finding the max timestamp in a platform-agnostic manner has proven to be very difficult (there is no standard way to do this in the Python standard library or in dateutil). So MAX_TIMESTAMP is used as a rough guide for when to compute the normalized timestamp, but it is imperfect because datetime seems to have trouble with its own max timestamp (which is what we use in constants.py):

>>> datetime.fromtimestamp(datetime.max.timestamp())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: year 0 is out of range

@systemcatch
Collaborator

Can't reproduce on my machine.

>>> import arrow
>>> date = arrow.get('3100-01-01T07:00:00Z')
>>> date
<Arrow [3100-01-01T07:00:00+00:00]>
>>> print(date.to('local'))
3100-01-01T07:00:00+00:00
>>> import platform
>>> platform.uname()
uname_result(system='Linux', node='Z490', release='5.8.0-63-generic', version='#71-Ubuntu SMP Tue Jul 13 15:59:12 UTC 2021', machine='x86_64')

@matthuisman
Author

How about this:

import arrow
date = arrow.get('1966-08-24T00:00:00Z')
print(date)
print(date.to('local'))

or

import arrow
date = arrow.get(arrow.constants.MAX_TIMESTAMP).shift(years=100)
print(date)
print(date.to("local"))
