-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve time precision on Windows #80
Conversation
Switch from time.Now to GetSystemTimePreciseAsFileTime to improve the precision of the timestamps and duration computations.
|
||
// This method is more precise than the go1.8 time.Now on Windows | ||
// See https://msdn.microsoft.com/en-us/library/windows/desktop/hh706895(v=vs.85).aspx | ||
// It is however ~10x slower and requires Windows 8+. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me the overhead is fine only if the issue is a "real" issue. What I mean, is that I can accept having the instrumentation a bit slower (still in the order of nanoseconds) if the access to the system shared memory to get the time is not optimal to users, especially if it provides bad results. It doesn't impact non-Windows users, so it's something we can try and improve later.
@LotharSee do you agree or we should tackle better this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a note, I suggested the same approach for a DNS Check in the Datadog agent: https://github.com/DataDog/integrations-core/blob/2209b0c316378fcde2563b49c9200e0ffef18d91/dns_check/check.py#L15-L26
Even if it's a different problem (a polling check that is processed in another process), it's still required because it gets wrong time on Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this imprecision is currently an issue.
@bmermet did you check that GetSystemTimePreciseAsFileTime
raises much more precise results? This initial issue was that time.Now()
could easily return 0
— maybe that's something we could easily add a test for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree a test would ne nice. Possibly tricky as all those timing tests are hard to achieve... Maybe only check that if we sleep a microsecond, the time elapsed is non-zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that indeed testing time related stuff is tricky. Since the time.Sleep() function uses the same clocks as the time.Now() timer, any use of time.Sleep() will result in a non 0 difference between two calls to time.Now(). I chose to poll the value of the high precision timer until it changes, and assert that the low precision one didn't change in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIce catch, indeed, that Sleep()
pb is annoying.
tracer/time_windows.go
Outdated
// precise implementation based on time.Now() | ||
func init() { | ||
if err := windows.LoadGetSystemTimePreciseAsFileTime(); err != nil { | ||
now = lowPrecisonNow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be nice to have some info about this, I mean, a log message (dumping err with a surrounding message explaining we are in fallback mode and timing can be imprecise). This just to help support / us when trying to dig into timing issues.
|
||
// This method is more precise than the go1.8 time.Now on Windows | ||
// See https://msdn.microsoft.com/en-us/library/windows/desktop/hh706895(v=vs.85).aspx | ||
// It is however ~10x slower and requires Windows 8+. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree a test would ne nice. Possibly tricky as all those timing tests are hard to achieve... Maybe only check that if we sleep a microsecond, the time elapsed is non-zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little nitpick about test flakiness, but fine for me.
|
||
// This method is more precise than the go1.8 time.Now on Windows | ||
// See https://msdn.microsoft.com/en-us/library/windows/desktop/hh706895(v=vs.85).aspx | ||
// It is however ~10x slower and requires Windows 8+. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIce catch, indeed, that Sleep()
pb is annoying.
} | ||
|
||
func TestHighPrecisionTimerIsMoreAccurate(t *testing.T) { | ||
startLow := lowPrecisionNow() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is flaky, but as you say above, it's hard to avoid. I think you could reduce the flakiness by something like :
for startLow == lowPrecisionNow() {} // make sure we're just after a "low precision boundary"
startLow = lowPrecisionNow()
This switches from time.Now to GetSystemTimePreciseAsFileTime to improve the precision of the timestamps and duration computations from ~1ms to ~1µs. This comes at a performance cost since this call is much slower than the original one. I've run benchmarks in my VM and it makes a ~10x difference:
BenchmarkNormalTimeNow-2 100000000 12.7 ns/op
BenchmarkHighPrecisionTime-2 20000000 101 ns/op
This remain very fast, so I think the precision gain is worth the performance cost.