Times are parsed inconsistently with respect to local time when they should be UTC #700

@ghost

Description

I noticed that the following tests fail

  • tests/messages/test_frontend.py::InitCatalogTestCase::test_correct_init_more_than_2_plurals
  • tests/messages/test_frontend.py::InitCatalogTestCase::test_correct_init_singular_plural_forms
  • tests/messages/test_frontend.py::InitCatalogTestCase::test_keeps_catalog_non_fuzzy
  • tests/messages/test_frontend.py::InitCatalogTestCase::test_supports_no_wrap
  • tests/messages/test_frontend.py::InitCatalogTestCase::test_supports_width
  • tests/messages/test_frontend.py::InitCatalogTestCase::test_with_output_dir
  • tests/messages/test_frontend.py::CommandLineInterfaceTestCase::test_init_more_than_2_plural_forms
  • tests/messages/test_frontend.py::CommandLineInterfaceTestCase::test_init_singular_plural_forms
  • tests/messages/test_frontend.py::CommandLineInterfaceTestCase::test_init_with_output_dir

when run with a timezone other than UTC, for example:

TZ=UTC+6 pytest -vv -k test_init_more_than_2_plural_forms

The assertion output shows a difference in POT-Creation-Date: the values are 2007-04-01 15:30+0200 and 2007-04-01 9:30+0200. (Ignore the +0200 offset; it is not the system's timezone but is more or less copied from input to output.)
The date and time are hardcoded as 2007-04-01 15:30+0200 in both the expected string and the source POT file.

I dug into the source code a little, since at some point something (i.e. Babel; who else?) must convert that string into a (wrong) time and back into a string. I think I found the place, in babel/messages/catalog.py:

def _parse_datetime_header(value):
    match = re.match(r'^(?P<datetime>.*?)(?P<tzoffset>[+-]\d{4})?$', value)

    tt = time.strptime(match.group('datetime'), '%Y-%m-%d %H:%M')
    ts = time.mktime(tt)
    dt = datetime.fromtimestamp(ts)

    # ... tzoffset handling ...

which, as you can see, goes back and forth through several conversions to turn the string into a datetime object. The part to point out is the time.mktime call.
From the Python documentation [0]:

From                         To                         Use
struct_time in UTC           seconds since the epoch    calendar.timegm()
struct_time in local time    seconds since the epoch    mktime()

So that is the function that turns the string "15:30" into the time 9:30: it interprets the value as local time, i.e. it applies the local timezone offset.
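To make the difference between the two functions concrete, here is a minimal sketch that feeds the same parsed struct_time to both and reports the gap. The helper names `epoch_both_ways` and `offset_hours_under_tz` are hypothetical and not part of Babel, and the TZ switching relies on `time.tzset()`, which is only available on Unix.

```python
import calendar
import os
import time

HEADER_FORMAT = "%Y-%m-%d %H:%M"  # layout of the date part of the header


def epoch_both_ways(text):
    """Return (mktime_result, timegm_result) for a naive date string.

    Hypothetical helper for illustration; not part of Babel.
    """
    tt = time.strptime(text, HEADER_FORMAT)
    # mktime() treats tt as local time; timegm() treats it as UTC.
    return time.mktime(tt), calendar.timegm(tt)


def offset_hours_under_tz(text, tz):
    """Temporarily set TZ (Unix only) and return the mktime/timegm gap in hours."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz
    time.tzset()
    try:
        local_ts, utc_ts = epoch_both_ways(text)
        return (local_ts - utc_ts) / 3600
    finally:
        # Restore whatever timezone setting was active before.
        if previous is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = previous
        time.tzset()
```

Under a POSIX TZ value such as UTC+6, `offset_hours_under_tz("2007-04-01 15:30", "UTC+6")` should report a six-hour gap, the same size as the 15:30 vs 9:30 difference in the failing assertion.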

The issue can be fixed by using the UTC counterpart, calendar.timegm(), since the string should be treated independently of the local time (if I understand the code correctly).
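A rough sketch of what the calendar.timegm() variant could look like follows. The function name `parse_naive_via_timegm` is hypothetical. Note one assumption: the sketch also converts the timestamp back in UTC, because `datetime.fromtimestamp(ts)` without a tz argument would reintroduce a local-time shift. Whether the eventual fix takes this exact shape is not claimed here.

```python
import calendar
import time
from datetime import datetime, timezone


def parse_naive_via_timegm(text):
    """Parse 'YYYY-MM-DD HH:MM' through epoch seconds without a local-time shift.

    Hypothetical helper for illustration; not Babel's actual code.
    """
    tt = time.strptime(text, "%Y-%m-%d %H:%M")
    ts = calendar.timegm(tt)  # interpret the parsed fields as UTC
    # Assumption: convert back in UTC too, then drop tzinfo so the result
    # is a naive datetime carrying exactly the fields from the string.
    return datetime.fromtimestamp(ts, tz=timezone.utc).replace(tzinfo=None)
```

With this pairing the parsed fields should come back unchanged regardless of the TZ the process runs under.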

Alternatively, skip the detour through epoch time and use datetime.strptime() [1] to parse the string directly into a datetime object.
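A minimal sketch of the datetime.strptime() route, reusing the regular expression from the snippet above. The function name `parse_header_datetime` is hypothetical, and since the tzoffset handling is elided in the quoted code, attaching the offset as a fixed-offset tzinfo (via Python 3's `datetime.timezone`) is an assumption made only for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

_HEADER_RE = re.compile(r'^(?P<datetime>.*?)(?P<tzoffset>[+-]\d{4})?$')


def parse_header_datetime(value):
    """Parse 'YYYY-MM-DD HH:MM[+-HHMM]' without consulting the local timezone.

    Hypothetical stand-in for illustration; not Babel's actual code.
    """
    match = _HEADER_RE.match(value)
    # strptime builds the datetime straight from the fields,
    # so there is no epoch round trip and no local-time interpretation.
    dt = datetime.strptime(match.group('datetime'), '%Y-%m-%d %H:%M')

    tzoffset = match.group('tzoffset')
    if tzoffset is not None:
        sign = -1 if tzoffset[0] == '-' else 1
        delta = timedelta(hours=int(tzoffset[1:3]), minutes=int(tzoffset[3:5]))
        # Assumption: the offset is attached as a fixed-offset tzinfo.
        dt = dt.replace(tzinfo=timezone(sign * delta))
    return dt
```

For the header value from the failing tests, `parse_header_datetime("2007-04-01 15:30+0200")` keeps the hour at 15 whatever TZ the process runs under, because the local timezone is never consulted.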

Pull request follows in a moment.

[0] https://docs.python.org/2/library/time.html
[1] https://docs.python.org/2/library/datetime.html#datetime.datetime.strptime
