Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Some of the timezone facts, as shown via pg_timezone_names, are out of date #8550

Open
bllewell opened this issue May 21, 2021 · 1 comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@bllewell
Copy link
Contributor

bllewell commented May 21, 2021

Jira Link: DB-2401
Using YB YB-2.4.0.0 on macOS Version 11.3.1 .

Some of the timezone facts, as shown via pg_timezone_names, are out of date w.r.t what this shows:

https://en.wikipedia.org/wiki/List_of_tz_database_time_zones

I did a simple copy-and-paste from the browser view of this data and used the \copy command in ysqlsh to ingest it to table stage. The unique timezone names that fall outside the intersection of stage and pg_timezone_names is tiny. (Each relation has about 600 rows. )

select name from stage
where name not in (select name from pg_timezone_names)
order by 1;

has this result:

     name     
--------------
 America/Nuuk
 Factory

And the opposite:

select name from pg_timezone_names
where name not in (select name from stage)
order by 1;

Has this result:

    name    
------------
 posixrules

So I joined pg_timezone_names to stage using name to populate the table extended_pg_timezone_names. This allows interesting queries that the shipped pg_timezone_names doesn't support.

I did this:

select name, utc_offset, std_offset, dst_offset
from extended_pg_timezone_names
where not (utc_offset = std_offset or utc_offset = dst_offset);

(std_offset is the offset from UTC when Daylight Savings Time is not in force. And dst_offset is the offset from UTC when it is in force.)

This is the result:

       name       | utc_offset | std_offset | dst_offset 
------------------+------------+------------+------------
 Europe/Volgograd | 04:00:00   | 03:00:00   | 03:00:00
 Africa/Juba      | 03:00:00   | 02:00:00   | 02:00:00
 Antarctica/Casey | 08:00:00   | 11:00:00   | 11:00:00

However, in Vanilla PG Version 13.2, the query gets no rows. As I understand it, the timezone facts are linked into the PG executable—or at least are accessible in its env for it to read when needed. Obviously, these facts have been updated between 11.2 and 13.2. Who knows if it was a bug in 11.2 or if these three timezones changed their rules. That doesn't matter.

YB needs to remain current with data like this.

I've attached my code kit. It's trivial. Just start copy-from-tz-database.sql at the ysqlsh prompt. You'll see what I report here.

ingest-tz-database-timezones.zip

@bllewell bllewell added the area/ysql Yugabyte SQL (YSQL) label May 21, 2021
@bllewell
Copy link
Contributor Author

Here are some more differences between pg_timzone_names in PG 13.2 and YB 2.4. To see some of them, you need to look at the abbreviations for both Standard Time and Daylight Savings Time. You need this function do do that:

create function jan_and_july_tz_abbrevs()
  returns table(name text, jan_abbrev text, july_abbrev text)
  language plpgsql
as $body$
declare
  set_timzone constant text := $$set timezone = '%s'$$;
  tz_set               text not null := '';
begin
  for tz_set in (
    select pg_timezone_names.name as a
    from pg_timezone_names
  ) loop
    execute format(set_timzone, tz_set);
    select
      current_setting('timezone'),
      to_char('2021-01-01 12:00:00 UTC'::timestamptz, 'TZ'),
      to_char('2021-07-01 12:00:00 UTC'::timestamptz, 'TZ')
    into name, jan_abbrev, july_abbrev;                      return next;
  end loop;
end;
$body$;

Now do this query in each env:

select
  name,
  p.abbrev,
  f.jan_abbrev,
  f.july_abbrev,
  lpad(p.utc_offset::text, 9) as "UTC offset",
  p.is_dst::text
from pg_timezone_names as p
inner join
jan_and_july_tz_abbrevs() as f using (name)
where name in (
  'America/Dawson',
  'America/Whitehorse',
  'Canada/Yukon',
  'America/Campo_Grande',
  'America/Cuiaba',
  'America/Sao_Paulo',
  'Brazil/East',
  'Etc/UCT'
  'UCT',
  'Africa/Juba',
  'Europe/Volgograd',
  'Antarctica/Macquarie',
  'Antarctica/Casey',
  'Pacific/Norfolk')
order by utc_offset, name;

Here is the YB 2.4 result:

         name         | abbrev | jan_abbrev | july_abbrev | UTC offset | is_dst 
----------------------+--------+------------+-------------+------------+--------
 America/Dawson       | PDT    | PST        | PDT         | -07:00:00  | true
 America/Whitehorse   | PDT    | PST        | PDT         | -07:00:00  | true
 Canada/Yukon         | PDT    | PST        | PDT         | -07:00:00  | true
 America/Campo_Grande | -04    | -03        | -04         | -04:00:00  | false
 America/Cuiaba       | -04    | -03        | -04         | -04:00:00  | false
 America/Sao_Paulo    | -03    | -02        | -03         | -03:00:00  | false
 Brazil/East          | -03    | -02        | -03         | -03:00:00  | false
 Africa/Juba          | EAT    | EAT        | EAT         |  03:00:00  | false
 Europe/Volgograd     | +04    | +04        | +04         |  04:00:00  | false
 Antarctica/Casey     | +08    | +08        | +08         |  08:00:00  | false
 Antarctica/Macquarie | +11    | +11        | +11         |  11:00:00  | false
 Pacific/Norfolk      | +11    | +11        | +11         |  11:00:00  | false

And here is the PG 13.2 result:

 America/Dawson       | MST    | MST        | MST         | -07:00:00  | false
 America/Whitehorse   | MST    | MST        | MST         | -07:00:00  | false
 Canada/Yukon         | MST    | MST        | MST         | -07:00:00  | false
 America/Campo_Grande | -04    | -04        | -04         | -04:00:00  | false
 America/Cuiaba       | -04    | -04        | -04         | -04:00:00  | false
 America/Sao_Paulo    | -03    | -03        | -03         | -03:00:00  | false
 Brazil/East          | -03    | -03        | -03         | -03:00:00  | false
 Africa/Juba          | CAT    | EAT        | CAT         |  02:00:00  | false
 Europe/Volgograd     | +03    | +03        | +03         |  03:00:00  | false
 Antarctica/Macquarie | AEST   | AEDT       | AEST        |  10:00:00  | false
 Antarctica/Casey     | +11    | +11        | +11         |  11:00:00  | false
 Pacific/Norfolk      | +11    | +12        | +11         |  11:00:00  | false

There are 14 rows. Each row is pairwise different. They were discovered by various ad hoc methods. You could test systematically by copying out the PG results to a file and copying them into a table called, say, PG_pg_timezone_names in YB. Then use EXCEPT to find the differences.

Notice this Wikipedia article: Daylight saving time in Canada. It says this:

In 2020, the territory of Yukon abandoned seasonal time change to permanently observe year-round Mountain Standard Time (MST).

PosgreSQL ensures that the pg_timezone_names view is kept current with each successive release. YB must do the same.

@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue
Projects
Status: No status
Development

No branches or pull requests

2 participants