Skip to content

time: clock drift on Windows 2008r2 w/ version >= 1.9 #24489

@dennisdupont

Description

@dennisdupont

What version of Go are you using (go version)?

Occurs on v1.9 and above

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

windows/amd64
Issue occurs on win2008r2, but not on win2012 (tested) or win2016 (according to consul forum comments)
Issue occurs on domain attached or standalone server.

What did you do?

Clock drift was noticed on a deployed consul cluster (see hashicorp/consul#3925 for excruciating details). Determined it started with consul v0.9.3 and existed in the latest. That was when they switched to go v1.9. So we downgraded to v0.9.2 and problem disappeared.

The major applicable change in go 1.9 seemed to be the monotonic clock changes, so I experimented with the go version. If consul v0.9.3 is built with go 1.8 the problem also does not exist.

With help from a consul contributor we were able to create a small snippet to reproduce the issue:
https://play.golang.org/p/4y79262HSrJ

The clock drift is measured the same way as with the production servers, running w32tm:
>w32tm /stripchart /computer:10.60.1.25 /dataonly /samples:100
As soon as the test starts running you can see drift.

What did you expect to see?

A stable clock

What did you see instead?

Significant clock drift
Here is an example run:

C:\Users\Administrator>w32tm /stripchart /computer:208.88.126.235 /dataonly /samples:100
Tracking 208.88.126.235 [208.88.126.235:123].
Collecting 100 samples.
The current time is 3/22/2018 9:45:25 AM.
09:45:25, +00.0104974s
09:45:27, +00.0085572s
09:45:29, +00.0080007s
09:45:31, +00.0022288s
09:45:33, +00.0070934s
09:45:35, -00.0778244s <== test started
09:45:38, -00.1392391s
09:45:40, -00.3150037s
09:45:42, -00.4225186s
09:45:44, -00.4935759s
09:45:46, -00.6112448s
09:45:48, -00.7180814s
09:45:50, -00.8264958s
09:45:52, -00.9447071s
09:45:54, -01.0553810s <== over 1 second offset in ~20 seconds
09:45:56, -01.1570893s
09:45:58, -01.2324556s

This keeps growing until (S)NTP starts fighting the drift, but we have seen it as high as ~180 seconds, enough to cause kerberos auth failures.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions