Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to time zone conversions due to MS updates. #2598

Merged
merged 1 commit into from Oct 10, 2014

Conversation

jmdavis
Copy link
Member

@jmdavis jmdavis commented Oct 9, 2014

Microsoft updated Windows' time zone information again, creating time zones, renaming them, etc. So, the semi-official conversions between those and the TZ database need to be updated. If the autotesters are up-to-date, then this may fail on the first go round, and I'll have to readd some of the now defunct time zones, but what's here at the moment should be what's on an up-to-date Windows box.

@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

Okay. It looks like the autotester is still out-of-date, so I restored the time zones that aren't really supposed to be around anymore. Hopefully, the tests will pass now.

@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

Drat. Now, it's failing because the windows auto-tester boxes are missing the new Russian time zones. Switching the conversions back so that they don't use the new time zones would be wrong, because the new times zones are different from the old, but if the machine hasn't been updated, then it doesn't have the new ones... bleh. The TZ database's system of naming time zones after cities avoids this whole problem. Why oh why must MS keep renaming time zones like this... :(

I guess that I'll have to add some sort of list of defunct conversions to try when a newer name fails so that Windows boxes that aren't properly updated don't have this problem. Though why anyone would fail to update their Windows box...

@jmdavis jmdavis force-pushed the 13592 branch 4 times, most recently from 74be821 to aece5e6 Compare October 9, 2014 06:10
@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

Yay! It looks like the tests are finally passing.

@monarchdodra
Copy link
Collaborator

Hum... These new unittests, will they still fail whenever a change is made on windows end?

@@ -28949,7 +28981,6 @@ string tzDatabaseNameToWindowsTZName(string tzName) @safe pure nothrow @nogc
case "Africa/Banjul": return "Greenwich Standard Time";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method of using a switch statement for a lookup table results in a lot of code bloat. Please consider using an associative array literal instead, or use two parallel arrays of strings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at doing that at one point and decided against it for some reason, though I can't recall why. I do remember that issues with pure and immutable made it a bit of a pain, but that could probably be worked around, especially since things keep getting better in that department thanks to improvements to inference and implicit conversions from pure functions. Regardless, I don't really care which way it is beyond which is more efficient, so I'm fine with changing it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a binary search lookup + 2 arrays be more appropriate?
I would expect the compiler to do something similar internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler doesn't convert this code into 2 tables. (The cases are, but not the return statements.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, that makes sense, I don't know why I thought it would do the return statements.

@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

Hum... These new unittests, will they still fail whenever a change is made on windows end?

I didn't change the unit tests at all. And yes, they will fail most of the time when Microsoft makes a change to their time zone names. And that's on purpose, because the test that fails is making sure that the conversion from TZ database names to the Windows time zone names and vice versa are up-to-date. That's not going to change. Unfortunately though, it usually gets caught because one of the D developers who uses Windows runs into the test failures after a windows update, and since I use Linux almost exclusively, I'm not one of them. The autotester never catches it, because for some reason, those boxes aren't getting updated. What I should probably do is start making an effort to track the updates that Microsoft does and watch for updates to the time zone info so that I can catch it faster. Or maybe I could write a script to grab the xml file with the mappings on a regular basis and compare it so that it can alert me... In either case, I should do a better job of anticipating it so that it causes fewer problems, but without those tests, we wouldn't catch when MS makes the updates, and the code would be wrong.

@monarchdodra
Copy link
Collaborator

I didn't change the unit tests at all.

Ah, nevermind, I read the code all wrong. Carry on then :)

@schveiguy
Copy link
Member

I'm not sure it's a good idea to remove time zones based on windows updates. We have no control over which updates the user has applied, and I wouldn't want someone using XP which receives no updates to have to deal with this.

Can the function which translates the sane names to the windows names return an array of possible names, and then have the function which looks up the timezone metadata try them in order? Or is there some mechanism to look up what version of the name database the OS is using?

Looking around the net, I see that there is possibly an alternative mechanism of just processing all the time zones in the registry and then using the id (which I'm assuming doesn't change?). The time zones are also localized, so a non-english version of Windows will have the localized names. Has anyone tested this on a non-english version of Windows?

For reference: http://stackoverflow.com/questions/4967903/linux-windows-timezone-mapping

@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

I'm not sure it's a good idea to remove time zones based on windows updates. We have no control over which updates the user has applied, and I wouldn't want someone using XP which receives no updates to have to deal with this.

Well, this PR doesn't remove any, if nothing else, because the autotester hasn't been properly updated, so that would break it. And I don't know if I've ever actually removed any because of that. Certainly I didn't this time or the last time. But the fact that a machine might not be properly updated (e.g. and an XP box can't be at this point) definitely poses a problem, particularly for new time zones, since it won't have the correct information. That's why this PR makes it so that TimeZone.getTimeZone attempts a different time zone name for the new time zones on Windows if they fail. I'm not quite sure how else to handle it.

Can the function which translates the sane names to the windows names return an array of possible names, and then have the function which looks up the timezone metadata try them in order? Or is there some mechanism to look up what version of the name database the OS is using?

Well, returning an array would break the API regardless of whether that would theoretically be a good approach or not. Also, even if we did return an array, there's the problem that some of those values would then be outright incorrect on an up-to-date system because of actual changes to the time zone or time zone information beyond the name - e.g. if they switched time zones for an area but the old one is still used for other areas (though I suppose that the approach could be taken that the earlier in the array a value is, the more likely it is to be correct). So, I don't know if the array approach would be a good idea or not. But since it would break the API, we're kind of stuck unless we create new function names and evolve the API that way (though I'm willing to bet that the amount of code that would be broken by outright changing the functions to return arrays would be very small, since the few that use the functionality likely do so via TimeZone.getTimeZone rather than directly, and even those folks would likely be in the minority).

And considering that these updates are done via patches which show up as separately installed packages rather than it being tied to the version of Windows, I don't know how you could look up what version of the TZ info a Windows box is using.

Honestly, while I think that being able to give a TZ database name like "America/Los_Angeles" or "Europe/Paris" on any machine and have it work to get that time zone is quite nice, I do kind of wonder at this point if the better approach would have been to say that you just had to know what system you were on and what the time zone name was for that system (so have getTimeZone for PosixTimeZone and WindowsTimeZone but not TimeZone) rather than try and make it cross-platform, because clearly being able to have it be properly cross-platform requires that your Windows box be fully up-to-date, and we obviously can't guarantee that. And from a maintenance standpoint, even if we could guarantee that, we're continually forced to update the conversions in std.datetime when Windows does get an update that affects the time zone names. And on top of that, since the Windows TZ info is missing many time zones and is frequently wrong, I'd generally argue that the the main reason to use WindowsTimeZone rather than providing the TZ database files yourself for your Windows app and using PosixTimeZone is so that you get information that matches other Windows boxes rather than to get correct info, and if you provided the TZ database files for your app, you'd be able to use the TZ database names on all platforms anyway.

So, I actually wonder if the correct approach after this fix would be to simply deprecate TimeZone.getTimeZone (leaving PosixTimeZone.getTimeZone and WindowsTimeZone.getTimeZone) and deprecate the conversion functions and stop trying to be cross-platform with the time zone names. As useful as this functionality is for those who need it, I doubt that very many folks are using it, and if you really need it and need to get specific time zones in a cross-platform manner, maybe it should be up to you to figure out what conversions would be appropriate for the environments that you're targeting. And those who want to be cross-platform with the time zone names without that effort can just provide the TZ database files themselves, which is arguably the better approach anyway.

Looking around the net, I see that there is possibly an alternative mechanism of just processing all the time zones in the registry and then using the id (which I'm assuming doesn't change?). The time zones are also localized, so a non-english version of Windows will have the localized names. Has anyone tested this on a non-english version of Windows?

The ID is what we're using. It's the English name. And Windows.getInstalledTZNames works by grabbing all of those from the registery, and then TimeZone.getInstalledTZNames calls windowsTZNameToTZDatabaseName on each of them in order to be able to return an array of TZ database names instead of the Windows names so that it can be cross-platform. Originally, I'd used the field from the registry with the standard time zone name (which matches the ID in English), because wine uses the TZ database name for the ID, but that doesn't work on non-English systems, so Kenji changed it to use the ID quite some time ago (which unfortunately means that it doesn't work on wine, which I keep meaning to complain to them about, but that's their fault from differing from Windows on that point). So, it definitely does work on non-English systems thanks to Kenji.

@@ -29306,7 +29338,7 @@ string tzDatabaseNameToWindowsTZName(string tzName) @safe pure nothrow @nogc
case "Europe/Prague": return "Central Europe Standard Time";
case "Europe/Riga": return "FLE Standard Time";
case "Europe/Rome": return "W. Europe Standard Time";
case "Europe/Samara": return "Russian Standard Time";
case "Europe/Samara": return "Russia Time Zone 3";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, this looks like you have removed "Russian Standard Time."

I think this means on an older system, someone trying to specify Europe/Samara, would get the wrong id (or no id?). Am I not understanding this correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still there. Other conversions use it. And that's why I added the bit in TimeZone.getTimeZone where it attempts the older conversion if a newer time zone name isn't on the system. That's the whole point of the new (private) _getOldName function.

@schveiguy
Copy link
Member

The ID is what we're using. It's the English name.

yeah, I looked in the registry. There is no numerical unchanging ID. What a sucky system!

However, I did notice in the parent reg key for time zones, (\HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones), there is a value called TzVersion, with an identifier. No idea if this is documented anywhere, but maybe this could be the "database version" of the time zones?

@schveiguy
Copy link
Member

So, I actually wonder if the correct approach after this fix would be to simply deprecate TimeZone.getTimeZone (leaving PosixTimeZone.getTimeZone and WindowsTimeZone.getTimeZone) and deprecate the conversion functions and stop trying to be cross-platform with the time zone names.

This might be the right answer. The likely process for using time zone ids is to:

  1. Look up the time zones on the system
  2. Display a list of them to the user
  3. User selects the one he wants.

All without the program having any idea what the ids actually mean. Another nicety would be to be able to look up all time zones that have certain properties. I.e. give me all time zones that are UTC-5 hours.

@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

All without the program having any idea what the ids actually mean. Another nicety would be to be able to look up all time zones that have certain properties. I.e. give me all time zones that are UTC-5 hours.

This would work with just WindowsTimeZone.getTimeZone and getting the utcOffset from the resulting time zones, though querying more complicating information could get interesting.

@jmdavis
Copy link
Member Author

jmdavis commented Oct 9, 2014

In any case, if the current changes are acceptable, this should probably be merged sooner rather than later so that folks like Walter aren't having the unit tests fail on their local boxes. We can look at redoing the switch statement and/or deprecating these conversion functions in a separate PR.

@schveiguy
Copy link
Member

Not sure why the FreeBSD 32 bit failed, had a weird "timeout" message.

I deprecated the test so it can be re-run.

@monarchdodra
Copy link
Collaborator

Not sure why the FreeBSD 32 bit failed, had a weird "timeout" message.

It's a regular failure on FreeBSD 32 release. It's relatively new though.

@braddr : Any idea what is causing this system to hang?

@schveiguy
Copy link
Member

Well, given the current state of affairs, I can't see anything wrong with this PR.

I think as a future direction, we should distance ourselves from arbitrary OS decisions. I would be OK with deprecating the posix to windows time zone translation.

@braddr
Copy link
Member

braddr commented Oct 10, 2014

re: freebsd hang: https://issues.dlang.org/show_bug.cgi?id=13416

@dnadlinger
Copy link
Member

Auto-merge toggled on

dnadlinger added a commit that referenced this pull request Oct 10, 2014
Update to time zone conversions due to MS updates.
@dnadlinger dnadlinger merged commit 42cd114 into dlang:master Oct 10, 2014
@dnadlinger
Copy link
Member

Yep, this should merely fix the current state. Whether we want to redo the API or the implementation at some other point is a different story.

@jmdavis jmdavis deleted the 13592 branch May 5, 2017 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants