proposal: time: handle time zones better in Parse and LoadLocation #63345

@robpike

Consider this unedited typescript running on my Mac with macOS and a recent Go distribution:

% cat x.go
package main

import (
	"fmt"
	"log"
	"time"
)

func main() {
	t, err := time.Parse("2006-01-02 15:04:05 MST", "2023-09-27 18:00:00 CEST")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(t)
}
% go run x.go
2023-09-27 18:00:00 +0000 CEST
% TZ=CEST go run x.go
2023-09-27 18:00:00 +0000 CEST
% TZ=Europe/Berlin go run x.go
2023-09-27 18:00:00 +0200 CEST
% 

There are a few mysteries here for the uninitiated.

First, the time zone in the result - which is printed by time.Time.String so is actually what is held in the time value - always reads CEST. That seems good for starters.

Yet second, only the last of the three runs prints a correct time zone offset for CEST. The first two have it at zero, which is always wrong for CEST.

Third, even when I explicitly set the time zone to CEST through the TZ variable, as in the second run, the result still has the wrong offset.

Fourth, even in the last run, the time zone prints as CEST although one might expect Europe/Berlin.

Fifth, perhaps strangest of all, this is working correctly according to the documentation in the time package.

This is a proposal to address, if not completely clear up, these mysteries.

Allow me to explain. (If you don't know much about time zones and how they work in modern computing, I suggest reading this IETF document first.)

The concept of time zone is not a first-class citizen in the time package. Instead, "location" is paramount. To get the correct rendering of a time, we must in effect ask for the time in Berlin, not in Central European Summer Time. In library terms, we call LoadLocation, not LoadTimeZone. There are a number of reasons for this, and it is confusing, but the fundamental one is that this is how the IANA time zone (ha!) database works. That is why the only run above that gets everything right is the one where we set the location, even though that is ironically, incorrectly and confusingly done through the legacy TZ environment variable, whose initials of course stand for "time zone".

What happens in that third run is that because our location is Berlin, we have loaded the data for Berlin, and in that data is a description of the time zone called CEST. There is no time zone file for CEST itself. The only description of CEST is inside the data for Berlin and for the other locations that observe it. (Have a look for yourself; the data is in /usr/share/zoneinfo/Europe/Berlin on Unix, although it's in binary; you need a program called zdump to examine it.)

In the first run, my location is not in central Europe, the data for CEST is never loaded, and time.Parse humors me but delivers an unsatisfactory result. I'll explain why in a minute. On the other hand, if you happen to live in central Europe, that first run will in fact work properly for you.

In the second run, a reasonable attempt to address this issue through the time-zone-looking TZ variable still fails, because CEST is not a location and therefore cannot be LoadLocationed.
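The three runs can be approximated in a single program, as a rough sketch. ParseInLocation stands in for "the location the program is running in", UTC stands in for my non-European local zone, and the helper name zoneIn is made up for illustration. The blank import of time/tzdata (Go 1.15 and later) is only there so the example does not depend on the host's zoneinfo files.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the zone database so the host's files do not matter
)

const layout = "2006-01-02 15:04:05 MST"

// zoneIn (a hypothetical helper) parses value as if the program were
// running in the named location, and reports the zone abbreviation and
// the offset in seconds east of UTC that the resulting time carries.
func zoneIn(locName, value string) (string, int, error) {
	loc, err := time.LoadLocation(locName)
	if err != nil {
		return "", 0, err
	}
	t, err := time.ParseInLocation(layout, value, loc)
	if err != nil {
		return "", 0, err
	}
	name, offset := t.Zone()
	return name, offset, nil
}

func main() {
	for _, locName := range []string{"UTC", "CEST", "Europe/Berlin"} {
		name, offset, err := zoneIn(locName, "2023-09-27 18:00:00 CEST")
		if err != nil {
			fmt.Printf("%-14s error: %v\n", locName, err)
			continue
		}
		fmt.Printf("%-14s %s, %d seconds east of UTC\n", locName, name, offset)
	}
}
```

With UTC the abbreviation is kept but the offset is zero, "CEST" fails to load as a location at all, and only Europe/Berlin yields the 7200-second offset.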

You might ask, "Why doesn't time.Parse give an error then, if CEST is not a valid location?" It's because of this paragraph, the last in the doc comment for time.Parse:

When parsing a time with a zone abbreviation like MST, if the
zone abbreviation has a defined offset in the current location,
then that offset is used. The zone abbreviation "UTC" is recognized
as UTC regardless of location. If the zone abbreviation is unknown,
Parse records the time as being in a fabricated location with the given
zone abbreviation and a zero offset. This choice means that such a time can
be parsed and reformatted with the same layout losslessly, but the exact
instant used in the representation will differ by the actual zone offset.
To avoid such problems, prefer time layouts that use a numeric zone offset,
or use ParseInLocation.
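To make that paragraph concrete, here is a minimal sketch of the "lossless but wrong instant" behavior. It parses in UTC so that CEST is guaranteed to be unknown, then compares the result with the same wall-clock time written using a numeric offset. The function name roundTrip is invented for the example.

```go
package main

import (
	"fmt"
	"time"
)

// roundTrip (a hypothetical helper) returns the reformatted text of a time
// parsed with an unknown abbreviation, and how far its instant is from the
// instant obtained by parsing the same wall-clock time with a numeric offset.
func roundTrip() (string, time.Duration, error) {
	const abbrevLayout = "2006-01-02 15:04:05 MST"
	const numericLayout = "2006-01-02 15:04:05 -0700"

	// UTC has no zone named CEST, so Parse fabricates one with a zero offset.
	fabricated, err := time.ParseInLocation(abbrevLayout, "2023-09-27 18:00:00 CEST", time.UTC)
	if err != nil {
		return "", 0, err
	}
	// A numeric offset leaves nothing to guess.
	exact, err := time.Parse(numericLayout, "2023-09-27 18:00:00 +0200")
	if err != nil {
		return "", 0, err
	}
	return fabricated.Format(abbrevLayout), fabricated.Sub(exact), nil
}

func main() {
	text, gap, err := roundTrip()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("reformatted:", text) // same text that was parsed
	fmt.Println("instant is off by:", gap)
}
```

The text survives the round trip unchanged, but the instant is two hours later than the one the numeric-offset layout produces.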

I claim that this is an unfortunate design decision, albeit one made in good faith with a tinge of ignorance back in the early days of Go's development.

(You might also ask, "Why does time.Parse accept time zone strings and not locations?" There the answer is simpler: That's what time values look like, confusing though it is in this context.)

Compatibility concerns might argue we can't fix this behavior, but I believe it's confusing and produces apparently correct but actually wrong results, and should be fixed. We can't just switch the whole thing to time zones, for the very reason that locations were chosen as the fundamental unit. But we can do something about time zones.

Here is the proposal.

If you look in the time zone database manually, you'll find that all the time zone abbreviations like EST and AEST and CEST are unique, or at least nearly all are. In the data we can see the offset, which is a fixed value. For instance, EST is always offset by 5 hours from UTC, while EDT is always offset by 4 hours. I propose:

  1. When building the (misnamed) time zone data into the Go library, we actually include the time zone information as well as the location data for all unique time zone strings. It could just be a small map of names to offsets. There is a historical component we could use, but the data is actually stable enough that I wouldn't worry about it, and just use the most recent offset values.
  2. If Parse or LoadLocation is given a location/time zone that is not loadable, instead of incorrectly using a zero offset, we look up the name in that table. If it exists, we build a nonce Location with the provided name and fixed offset.
  3. If it does not exist, for compatibility we will need to keep the original incorrect behavior by assuming a zero offset. If we are willing to break compatibility, we could give an error. I'm not certain which is the best answer here.
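The three steps above can be approximated in user code today, as a sketch only. The table abbrevOffsets and the function parseWithAbbrev are hypothetical names, the handful of entries is hand-written rather than generated from the database, and the real change would live inside the time package rather than wrap it.

```go
package main

import (
	"fmt"
	"time"
)

// abbrevOffsets is a hypothetical, hand-written stand-in for the proposed
// table of zone abbreviations to fixed offsets (seconds east of UTC).
var abbrevOffsets = map[string]int{
	"EST":  -5 * 3600,
	"EDT":  -4 * 3600,
	"CET":  1 * 3600,
	"CEST": 2 * 3600,
	"AEST": 10 * 3600,
}

// parseWithAbbrev (hypothetical) parses value in UTC. If Parse fabricated a
// zero-offset zone for an abbreviation found in the table, the same wall
// clock is rebuilt in a fixed-offset nonce location. Anything else,
// including unknown abbreviations, is returned exactly as Parse produced it.
func parseWithAbbrev(layout, value string) (time.Time, error) {
	t, err := time.ParseInLocation(layout, value, time.UTC)
	if err != nil {
		return time.Time{}, err
	}
	name, offset := t.Zone()
	fixed, ok := abbrevOffsets[name]
	if !ok || offset != 0 {
		return t, nil
	}
	y, mo, d := t.Date()
	h, mi, s := t.Clock()
	nonce := time.FixedZone(name, fixed)
	return time.Date(y, mo, d, h, mi, s, t.Nanosecond(), nonce), nil
}

func main() {
	t, err := parseWithAbbrev("2006-01-02 15:04:05 MST", "2023-09-27 18:00:00 CEST")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(t) // 2023-09-27 18:00:00 +0200 CEST
}
```

Unlike the proposal, this wrapper never consults the current location first, so it shows only the table fallback of step 2 and the compatibility behavior of step 3.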

The properties of these nonce locations are worth enumerating:

  1. They will not correctly handle daylight saving time switching. They are a fixed offset. If you ask for EST in the US summer, you will get the five-hour offset, not the four-hour one you would get if you used America/New_York as your location. But what gets printed will be logical, consistent, and reasonable. It will be a valid time with a matching time zone name and offset.
  2. If you ask for EST while in the location America/New_York in the summer, the time will come back with the correct location that prints as EDT. In other words, the behavior will still depend on your location, just not so confusingly.
  3. All three runs of the program above would always give a two-hour offset for CEST (daylight saving time aside), regardless of the location running the program.
  4. Because they are not correct locations, they may actually make things worse in some cases by having people depend more on these nonce locations than they should, but that may be a tradeoff worth making.
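The first property is easy to see with the existing API. In this sketch, time.FixedZone plays the role of a nonce location for EST, the helper name summerZones is invented for the example, and time/tzdata is imported only so that America/New_York loads regardless of the host.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the zone database so the host's files do not matter
)

// summerZones (a hypothetical helper) renders one summer instant twice:
// in a fixed-offset stand-in for a nonce EST location, and in the real
// America/New_York location.
func summerZones() (nonceText, realText string, err error) {
	ny, err := time.LoadLocation("America/New_York")
	if err != nil {
		return "", "", err
	}
	nonce := time.FixedZone("EST", -5*3600) // fixed offset, no DST rules
	instant := time.Date(2023, time.July, 4, 16, 0, 0, 0, time.UTC)

	const layout = "15:04 -0700 MST"
	return instant.In(nonce).Format(layout), instant.In(ny).Format(layout), nil
}

func main() {
	nonceText, realText, err := summerZones()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("nonce EST:       ", nonceText) // 11:00 -0500 EST
	fmt.Println("America/New_York:", realText)  // 12:00 -0400 EDT
}
```

Both lines are internally consistent, with name and offset agreeing, but only the real location knows that New York is on daylight time in July.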

I believe this approach is easy, reasonable, less confusing, and worth doing.

(Developed in conversation with @rsc).
