runtime: time.Sleep takes more time than expected #44343

Open

egonelbre opened this issue Feb 17, 2021 · 66 comments
@egonelbre (Contributor) commented Feb 17, 2021

This seems to be a regression with Go 1.16 time.Sleep.

What version of Go are you using (go version)?

$ go version
go version go1.16 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=Z:\gocache
set GOENV=C:\Users\egone\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=F:\Go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=F:\Go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.16
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=f:\temp\sleep\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=Z:\Temp\go-build3014714148=/tmp/go-build -gno-record-gcc-switches

What did you do?

package main

import (
	"fmt"
	"time"

	"github.com/loov/hrtime"
)

func main() {
	b := hrtime.NewBenchmark(100)
	for b.Next() {
		time.Sleep(50 * time.Microsecond)
	}
	fmt.Println(b.Histogram(10))
}

hrtime is a package that uses RDTSC for time measurements.

Output on Windows 10:

> go1.16 run .
  avg 15.2ms;  min 55.3µs;  p50 15.5ms;  max 16.3ms;
  p90 16.1ms;  p99 16.3ms;  p999 16.3ms;  p9999 16.3ms;
     55.3µs [  2] █
        2ms [  0]
        4ms [  0]
        6ms [  0]
        8ms [  0]
       10ms [  1] ▌
       12ms [  0]
       14ms [ 75] ████████████████████████████████████████
       16ms [ 22] ███████████▌
       18ms [  0]

> go1.15.8 run .
  avg 1.03ms;  min 63.9µs;  p50 1ms;  max 2.3ms;
  p90 1.29ms;  p99 2.3ms;  p999 2.3ms;  p9999 2.3ms;
     63.9µs [  1] ▌
      500µs [ 47] ███████████████████████████████████████
        1ms [ 48] ████████████████████████████████████████
      1.5ms [  1] ▌
        2ms [  3] ██
      2.5ms [  0]
        3ms [  0]
      3.5ms [  0]
        4ms [  0]
      4.5ms [  0]

Output on Linux (Debian 10):

$ go1.16 run test.go
  avg 1.06ms;  min 1.06ms;  p50 1.06ms;  max 1.08ms;
  p90 1.07ms;  p99 1.08ms;  p999 1.08ms;  p9999 1.08ms;
     1.06ms [  4] █▌
     1.07ms [ 84] ████████████████████████████████████████
     1.07ms [  7] ███
     1.08ms [  3] █
     1.08ms [  1]
     1.09ms [  1]
     1.09ms [  0]
      1.1ms [  0]
      1.1ms [  0]
     1.11ms [  0]

$ go1.15.8 run test.go
  avg 86.7µs;  min 57.3µs;  p50 83.6µs;  max 132µs;
  p90 98.3µs;  p99 132µs;  p999 132µs;  p9999 132µs;
     57.3µs [  2] █
       60µs [  1] ▌
       70µs [ 13] ████████
       80µs [ 64] ████████████████████████████████████████
       90µs [ 11] ██████▌
      100µs [  2] █
      110µs [  1] ▌
      120µs [  3] █▌
      130µs [  3] █▌
      140µs [  0]

The timer granularity shouldn't be that coarse, even on Windows. So something might be going wrong somewhere.

@seankhliao changed the title time.Sleep takes more time than expected → runtime: time.Sleep takes more time than expected Feb 17, 2021
@bcmills (Member) commented Feb 17, 2021

That same CL (golang.org/cl/232298) shook out a number of kernel and runtime bugs in various configurations. (See previously #43067, #42515, #42237; cc @prattmic.)

@ianlancetaylor added this to the Go1.17 milestone Feb 17, 2021
@prattmic (Member) commented Feb 17, 2021

This is reproducible with a trivial benchmark in the time package:

func BenchmarkSimpleSleep(b *testing.B) {
	for i := 0; i < b.N; i++ {
		Sleep(50 * Microsecond)
	}
}

amd64/linux, before/after http://golang.org/cl/232298:

name            old time/op  new time/op   delta
SimpleSleep-12  86.9µs ± 0%  609.8µs ± 5%  +601.73%  (p=0.000 n=10+9)
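
For reference, a standalone version of the same benchmark that can be run from any module with go test -bench=SimpleSleep (a sketch; the in-tree benchmark above lives in the time package itself, hence its unqualified Sleep):

package sleep_test

import (
	"testing"
	"time"
)

// BenchmarkSimpleSleep reports how long time.Sleep(50µs) actually takes
// per iteration. On an unaffected system the per-op time should be within
// roughly 100µs of the request, not hundreds of microseconds.
func BenchmarkSimpleSleep(b *testing.B) {
	for i := 0; i < b.N; i++ {
		time.Sleep(50 * time.Microsecond)
	}
}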

@prattmic (Member) commented Feb 17, 2021

For reference, across different sleep times:

name                  old time/op  new time/op   delta
SimpleSleep/1ns-12     460ns ± 3%    479ns ± 1%    +4.03%  (p=0.000 n=10+9)
SimpleSleep/100ns-12   466ns ± 3%    476ns ± 2%    +2.35%  (p=0.001 n=10+9)
SimpleSleep/500ns-12  6.47µs ±11%   6.70µs ± 5%      ~     (p=0.105 n=10+10)
SimpleSleep/1µs-12    10.3µs ±10%   12.2µs ±13%   +18.23%  (p=0.000 n=10+10)
SimpleSleep/10µs-12   81.9µs ± 1%  502.5µs ± 4%  +513.45%  (p=0.000 n=10+10)
SimpleSleep/50µs-12   87.0µs ± 0%  622.9µs ±18%  +615.69%  (p=0.000 n=8+10)
SimpleSleep/100µs-12   179µs ± 0%   1133µs ± 1%  +533.52%  (p=0.000 n=8+10)
SimpleSleep/500µs-12   592µs ± 0%   1137µs ± 1%   +91.97%  (p=0.000 n=10+10)
SimpleSleep/1ms-12    1.12ms ± 2%   1.14ms ± 1%    +1.36%  (p=0.000 n=9+10)
SimpleSleep/10ms-12   10.2ms ± 0%   10.3ms ± 0%    +0.79%  (p=0.000 n=9+9)

@prattmic (Member) commented Feb 17, 2021

Looking at the 100µs case, the immediate problem is the delay resolution in netpoll.

Prior to http://golang.org/cl/232298, 95% of timer expirations in the 100µs case are detected by sysmon, which calls startm to wake an M to handle the timer (though this is not a particularly efficient approach).

After http://golang.org/cl/232298, this path is gone and the wakeup must come from netpoll (assuming all Ms are parked/blocked). netpoll on Linux only has 1ms resolution, so it must sleep at least that long before detecting the timer.

I'm not sure why I'm seeing ~500µs on the 10µs and 50µs benchmarks, but it may be a bimodal distribution where in ~50% of cases a spinning M is still awake long enough to detect the timer before entering netpoll.

I'm also not sure why @egonelbre is seeing ~14ms on Windows, as that also appears to have 1ms resolution on netpoll.
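
The 1ms floor comes from the timeout conversion in the Linux netpoller, which must express the wait in epoll_wait's millisecond units. A simplified paraphrase of the go1.16 runtime/netpoll_epoll.go logic (a sketch, so treat the exact bounds as approximate):

// epollWaitMillis mirrors how netpoll converts its nanosecond delay into
// the millisecond timeout epoll_wait accepts: any sub-millisecond delay
// rounds up to 1ms, which is the minimum blocking wait.
func epollWaitMillis(delay int64) int32 {
	switch {
	case delay < 0:
		return -1 // block indefinitely
	case delay == 0:
		return 0 // non-blocking poll
	case delay < 1e6:
		return 1 // < 1ms rounds up to 1ms
	case delay < 1e15:
		return int32(delay / 1e6)
	default:
		return 1e9 // arbitrary cap, ~11.5 days
	}
}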

@prattmic (Member) commented Feb 17, 2021

I think the ideal fix to this would be to increase the resolution of netpoll. Even for longer timers, this limited resolution will cause slight skew to timer delivery (though of course there are no real-time guarantees).

As it happens, Linux v5.11 includes epoll_pwait2 which switches the timeout argument to a timespec for nanosecond resolution. Unfortunately, Linux v5.11 was released ... 3 days ago, so availability is not widespread to say the least.

In the past, I've also prototyped changing the netpoll timeout to being controlled by a timerfd (with the intention of being able to adjust the timer earlier without a full netpollBreak). That could be an option as well.

Both of these are Linux-specific solutions, I'd have to research other platforms more to get a sense of the options there.

We also may just want to bring the sysmon wakeup back, perhaps with slight overrun allowed to avoid excessive M wakeups.
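
For illustration, a user-space sketch of the epoll_pwait2 variant mentioned above (assumptions: linux/amd64, where the syscall number is 441 per the kernel's syscall table, a >= 5.11 kernel, and a non-empty events slice; this is not how the runtime would actually wire it in):

package epollsketch

import (
	"syscall"
	"time"
	"unsafe"
)

// sysEpollPwait2 is the linux/amd64 syscall number for epoll_pwait2.
const sysEpollPwait2 = 441

type timespec struct {
	sec, nsec int64
}

// epollPwait2 waits on epfd with a nanosecond-resolution timeout instead
// of epoll_wait's millisecond granularity.
func epollPwait2(epfd int, events []syscall.EpollEvent, timeout time.Duration) (int, error) {
	ts := timespec{
		sec:  int64(timeout / time.Second),
		nsec: int64(timeout % time.Second),
	}
	n, _, errno := syscall.Syscall6(sysEpollPwait2,
		uintptr(epfd),
		uintptr(unsafe.Pointer(&events[0])),
		uintptr(len(events)),
		uintptr(unsafe.Pointer(&ts)),
		0, 0) // NULL sigmask, sigsetsize 0
	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}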

@ianlancetaylor (Contributor) commented Feb 18, 2021

I guess that wakeNetPoller doesn't help, because there is no poller sleeping at the point of calling time.Sleep.

Perhaps when netpoll sees a delay that is shorter than the poller resolution it should just do a non-blocking poll. That will effectively turn findrunnable into a busy wait when the next timer is very soon.

@ChrisHines (Contributor) commented Feb 18, 2021

While working on CL232298 I definitely observed anecdotal evidence that the netpoller has more latency than other ways of sleeping. From #38860 (comment):

> My anecdotal observation here is that it appears the Linux netpoller implementation has more latency when waking after a timeout than the Go 1.13 timerproc implementation. Most of these benchmark numbers replicate on my Mac laptop, but the darwin netpoller seems to suffer less of that particular latency by comparison, and is also worse in other ways. So it may not be possible to close the gap with Go 1.13 purely in the scheduler code. Relying on the netpoller for timers changes the behavior as well, but these new numbers are at least in the same order of magnitude as Go 1.13.

I didn't try to address that in CL232298 primarily because it was already risky enough that I didn't want to make bigger changes. But an idea for something to try occurred to me back then. Maybe we could improve the latency of non-network timers by having one M block on a notesleep call instead of the netpoller. That would require findrunnable to compute two wakeup times, one for net timers to pass to the netpoller and one for all other timers to use with notesleep depending on which role it takes on (if any) when it cannot find any other work.

I haven't fully gauged how messy that would get.

Questions and concerns:

  • Coordinating two sleeping M's probably has complicating edge cases to figure out.
  • I haven't tested the latency of notesleep wakeups to know if it would actually help.
  • Would it require duplicating all the timer fields on each P, one for net timers and one for the others?

One other oddity that I noticed when testing CL232298: The linux netpoller sometimes wakes up from the timeout early by several microseconds. When that happens findrunnable usually does not find any expired timers, since they haven't actually expired yet. A new--very short--pollUntil value gets computed and the M reenters the netpoller. The subsequent wakeup is then typically rather late, maybe up to 1ms, but I am going from memory here. I might be able to dig up some trace logs showing this behavior if I still have them and people are interested.

@prattmic (Member) commented Feb 18, 2021

> I guess that wakeNetPoller doesn't help, because there is no poller sleeping at the point of calling time.Sleep.

wakeNetPoller shouldn't matter either way, because even if we wake the netpoller, it will just sleep again with a new timeout of 1ms, which is too long. (Unless wakeNetPoller happens to take so long that the timer has expired by the time the woken M gets to checkTimers).

> Perhaps when netpoll sees a delay that is shorter than the poller resolution it should just do a non-blocking poll. That will effectively turn findrunnable into a busy wait when the next timer is very soon.

> Maybe we could improve the latency of non-network timers by having one M block on a notesleep call instead of the netpoller.

As somewhat of a combination of these, one potential option would be to make netpoll with a short timeout do non-blocking netpoll, short notetsleep, non-blocking netpoll. Though this has the disadvantage of slightly increasing latency of network events from netpoll.

> One other oddity that I noticed when testing CL232298: The linux netpoller sometimes wakes up from the timeout early by several microseconds. When that happens findrunnable usually does not find any expired timers since they haven't actually expired yet. A new--very short--pollUntil value gets computed and the M reenters the netpoller. The subsequent wakeup is then typically rather late, maybe up to 1ms, but I am going from memory here. I might be able to dig up some trace logs showing this behavior if I still have them and people are interested.

Hm, this sounds like another bug, or perhaps a spurious netpollBreak from another M.

@prattmic (Member) commented Feb 18, 2021

It seems that on Windows, notetsleep has 1ms precision in addition to netpoll, so the explanation in #44343 (comment) doesn't explain the increase in latency on Windows.

@ChrisHines (Contributor) commented Feb 18, 2021

My first thought on the Windows behavior is that somehow osRelax is being mismanaged, allowing the timer resolution to fall back to its resting mode. That thought is driven by the spike in the above histograms at 15ms. I haven't yet thought through how that might happen.

@ChrisHines (Contributor) commented Feb 18, 2021

> Hm, this sounds like another bug, or perhaps a spurious netpollBreak from another M.

That could be, but I was logging at least some of the calls to netpollBreak as well and don't recall seeing that happen. I saved my logging code in case it can help: https://github.com/ChrisHines/go/tree/dlog-backup

@vellotis commented Feb 25, 2021

For reference, output on my Windows 10:

> go1.16 run .
  avg 1.06ms;  min 475µs;  p50 1.01ms;  max 1.99ms;
  p90 1.13ms;  p99 1.99ms;  p999 1.99ms;  p9999 1.99ms;
      475µs [  1] ▌
      600µs [  1] ▌
      800µs [ 36] ██████████████████████████
        1ms [ 55] ████████████████████████████████████████
      1.2ms [  0]
      1.4ms [  0]
      1.6ms [  0]
      1.8ms [  7] █████
        2ms [  0]
      2.2ms [  0]

Totally different results from #44343 (comment).

go env Output
$ go env
set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\user\AppData\Local\go-build
set GOENV=C:\Users\user\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Projects\Go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Projects\Go
set GOPRIVATE=
set GOPROXY=
set GOROOT=C:\Tools\Go\go1.16
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\Tools\Go\go1.16\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.16
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=C:\Users\user\AppData\Local\Temp\go-build2862485594=/tmp/go-build -gno-record-gcc-switches

@egonelbre (Contributor, Author) commented Feb 26, 2021

@vellotis this could be because there's something running in the background changing the Windows timer resolution. This could be some other Go service/binary built using an older Go version. Of course, there can be plenty of other programs that may change it.

You can use https://github.com/tebjan/TimerTool to see what the current value is. There's some more detail in https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted.

@zobkiw commented Mar 19, 2021

For what it's worth, go1.16.2 darwin/amd64 is also exhibiting this. In a program I'm running, the line time.Sleep(1 * time.Hour) takes roughly an hour and 3 minutes each time.

@prattmic (Member) commented Mar 19, 2021

@zobkiw It sounds like this is more likely related to #44868. I'm also curious whether the failure is consistent and it really is always 1hr 3min, not 1hr 2min (or rather, ranging from 2-4min, since alignment with the 2min forcegc will vary)?

@zobkiw commented Mar 19, 2021

@prattmic Strangely, I just checked the output of the script now and (although I just restarted it a few hours before) it was spot on at one hour twice in a row. However, yesterday it was "generally" 3 minutes. I don't have an exact time since we were only seeing the minute logged. It was always 3 minutes (rounded) from the previous run. 1:00pm, 2:03pm, 3:06pm, 4:09pm, etc.

Some things to note about this loop I'm running: it is calling a shell script using exec.Command before sleeping for an hour, then it does it again. The script takes about 10 seconds to execute and completes its job. It is also running in a screen session, but so was the one this morning that was doing fine, so I don't think that was it. The machine it was running on, a mac mini, was basically idle other than this lightweight job.

If you have some code you would like me to run I would be happy to - otherwise I will keep an eye on it here as well and report anything else I see. Hopefully some of this is helpful in narrowing down the cause if you haven't already.

UPDATE 3-20-2021 10am ET: I just ran my program for about the last 24 hours and sure enough, it would run fine for the first few iterations and then start to sleep longer based on the logging. Mostly it would hover between 3-5 minutes late, but once it was 9 minutes! I've (this morning) written another program using the same logic but with simulated activity (since the actual tasks did not seem to be part of the problem) and much more detailed logging. I have it running now in 3 environments, two using screen and one not. It is running on an M1 Big Sur 11.2.3 and an Intel mac mini running the same. One thing that struck me this morning was wondering if the machine being asleep at all caused the delay, since it runs OK for a while (while I was using the machine, presumably) and then had the delays overnight. This morning, the latest reading (while I was using the machine again to install the new test code) was back to 1 hour spot on, no delay. Once I get some more data at the end of the day or tomorrow morning I will report back and share the code if there remains a problem.

@tandr commented Mar 19, 2021

3 minutes per hour is rather worrisome, especially if some cron-like functionality is required in a long-running service...

@networkimprov commented Mar 19, 2021

In case it's related; on Windows, timers tick while the system is in standby/suspend, but they pause on other platforms.

This is supposed to have been addressed already; see #36141.

@zobkiw commented Mar 21, 2021

@prattmic et al, I took the liberty of writing a testbed to experiment with this issue. See all the details of test runs on Linux, Intel, and M1, along with source code here:

https://github.com/zobkiw/sleeptest

Comments welcome - especially if you find a bug in the code. Also, see the results directory for the output of the tests. Regardless, these results seem strange to me. I look forward to other input on this issue.

@networkimprov commented Mar 21, 2021

cc @zx2c4 :-)

@egonelbre (Contributor, Author) commented Mar 21, 2021

@zobkiw could it be something due to CPU frequency scaling? Can you try locking the CPU to a specific frequency (something low) and see whether the problem disappears?

@zx2c4 (Contributor) commented Jun 27, 2021

> Seems like a straightforward fix? If so, can we milestone for 1.17 and backport?

Not sure I see (yet) what the "straightforward" fix is. Going back to timeBeginPeriod isn't so appealing, for example.

@tandr commented Jul 19, 2021

(Sorry if it was suggested and rejected already)

Folks, how would it sound if we introduced a new call, something like NanoSleep() or HighResSleep() - a name that identifies that this sleep function is based on higher-precision logic?
On systems that don't have this problem, HighResSleep will just call Sleep and be done with it. On systems that have hi-res timers, it will add a bit of a dance around which low-level function to call. If there is a limitation that the system does not provide "long-shot" sleeps for high-resolution timers, we might have to add internal logic to invoke a low-res sleep for the majority of the requested time, and then fire up the hi-res one for the rest.

My reasoning is very simple - a lot of calls to Sleep are in the minutes and hours range, and quite often, even if they are smaller, the programming logic does not require 1ms precision and should work fine with 100ms (or even larger) imprecision.
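
As a rough user-level illustration of that idea (a hypothetical sketch, not a proposed API; the 2ms handoff margin is an arbitrary assumption based on the granularities reported in this thread):

package main

import (
	"runtime"
	"time"
)

// sleepHiRes sleeps coarsely for most of d, then polls the clock for the
// final stretch, trading CPU for precision only on the tail end.
func sleepHiRes(d time.Duration) {
	deadline := time.Now().Add(d)
	if coarse := d - 2*time.Millisecond; coarse > 0 {
		time.Sleep(coarse) // cheap, but only ~1-15ms accurate
	}
	for time.Now().Before(deadline) {
		runtime.Gosched() // yield so the spin doesn't hog the P
	}
}

func main() {
	start := time.Now()
	sleepHiRes(50 * time.Microsecond)
	println(time.Since(start).String())
}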

@coder543 commented Jul 19, 2021

In general, I would expect the precision of the sleep to be proportional to the value passed in.

As a hypothetical:

If you are writing a game engine, you may need to serve a new frame every 16 milliseconds. After issuing the frame, you may want to sleep until the next frame, which is (16ms - the time it took to prepare this frame). On your powerful personal computer, this is often 8ms to 10ms! So then you choose time.Sleep and ship the game. Then the game runs poorly on a customer’s machine, and it turns out that processing is taking most of the 16ms on that machine, and then it is oversleeping. Clearly you should have chosen time.NanoSleep even though you sometimes sleep for 10ms!

Clearly, this whole outcome would have been avoidable if time.Sleep chose NanoSleep automatically for small values.

There are probably three general levels of sleep precision:

  1. the current sleep precision
  2. the more precise sleeping used previously in Go
  3. busy waiting.

If the sleep value is small enough and nothing else is runnable, it would likely make sense for the Go runtime to simply busy wait until the sleeping Goroutine is runnable again.

Is there any downside to automatically selecting more precise sleep modes? If the time to sleep is more than 5ms, the current approach should definitely be precise enough, and it allows the runtime and the OS to take more liberties with task scheduling to achieve efficiency. But, if the value is smaller… selecting a more precise sleep mechanism seems unlikely to cause any harm that wasn’t directly requested.

If someone wants to sleep less precisely and coalesce timers further, they could round up to the nearest second, minute, or whatever themselves. But, if the runtime decided to sleep for an extra, say, 1% as an acceptable margin of error, this would give the runtime a lot of opportunity to coalesce large sleeps automatically, while still enabling high precision sleeps to be serviced appropriately.

The current API doesn’t expose a way to be more precise, and I’m not sure adding more APIs is the best solution here, since you’ll never be able to guarantee a given level of service — the whole computer could hibernate before the task becomes runnable again, and then there’s nothing the runtime can do.

But, those are just my thoughts.

@alexbrainman (Member) commented Jul 20, 2021

I just built https://github.com/loov/hrtime/blob/master/_example/basic/main.go on current tip and ran it on one of my Windows 10 laptops. Here is the output:

C:\Users\Alex>test
1.6846ms
  avg 2.05ms;  min 43.3µs;  p50 1.99ms;  max 1s;
  p90 2.03ms;  p99 2.11ms;  p999 2.26ms;  p9999 1s;
     43.3µs [   2]
      500µs [ 114] █▌
        1ms [ 731] ████████████
      1.5ms [2356] ████████████████████████████████████████
        2ms [ 892] ███████████████
      2.5ms [   0]
        3ms [   0]
      3.5ms [   0]
        4ms [   0]
      4.5ms+[   1]


C:\Users\Alex>test
1.3457ms
  avg 1.82ms;  min 5.2µs;  p50 1.83ms;  max 1s;
  p90 2.01ms;  p99 2.11ms;  p999 2.31ms;  p9999 1s;
      5.2µs [  34] ▌
      500µs [ 133] ███
        1ms [1698] ████████████████████████████████████████
      1.5ms [1618] ██████████████████████████████████████
        2ms [ 611] ██████████████
      2.5ms [   1]
        3ms [   0]
      3.5ms [   0]
        4ms [   0]
      4.5ms+[   1]


C:\Users\Alex>test
1.0907ms
  avg 1.91ms;  min 3.4µs;  p50 1.96ms;  max 1s;
  p90 2.01ms;  p99 2.13ms;  p999 2.25ms;  p9999 1s;
      3.4µs [  47] ▌
      500µs [ 174] ███▌
        1ms [1143] ███████████████████████
      1.5ms [1973] ████████████████████████████████████████
        2ms [ 757] ███████████████
      2.5ms [   1]
        3ms [   0]
      3.5ms [   0]
        4ms [   0]
      4.5ms+[   1]


C:\Users\Alex>test
1.4356ms
  avg 1.97ms;  min 14µs;  p50 1.97ms;  max 1s;
  p90 2.02ms;  p99 2.15ms;  p999 2.31ms;  p9999 1s;
       14µs [  17]
      500µs [ 218] ████
        1ms [ 902] █████████████████
      1.5ms [2088] ████████████████████████████████████████
        2ms [ 869] ████████████████▌
      2.5ms [   0]
        3ms [   0]
      3.5ms [   0]
        4ms [   0]
      4.5ms+[   2]


C:\Users\Alex>test
157µs
  avg 1.95ms;  min 11.8µs;  p50 1.98ms;  max 1s;
  p90 2.02ms;  p99 2.08ms;  p999 2.27ms;  p9999 1s;
     11.8µs [   5]
      500µs [  95] █▌
        1ms [1227] ████████████████████████▌
      1.5ms [1989] ████████████████████████████████████████
        2ms [ 778] ███████████████▌
      2.5ms [   0]
        3ms [   0]
      3.5ms [   0]
        4ms [   0]
      4.5ms+[   2]


C:\Users\Alex>

The output looks reasonable to me. Most of the sleep times are around 1-2ms. That is what I would expect on Windows.

Alex

@ChrisHines (Contributor) commented Jul 20, 2021

@alexbrainman:

> I just built https://github.com/loov/hrtime/blob/master/_example/basic/main.go on current tip and ran it on one of my Windows 10 laptops. Here is the output:

> The output looks reasonable to me. Most of the sleep times are around 1-2ms. That is what I would expect on Windows.

I don't see any changes to the runtime since go1.17beta1 that I would expect to change the behavior of time.Sleep on Windows or in general. Do you have any theories why your results differ from what I got on go1.17beta1 in #44343 (comment) or #44343 (comment)?

@egonelbre (Contributor, Author) commented Jul 20, 2021

For me, Go tip still has the same problem, output from code in #44343 (comment):

  avg 15.2ms;  min 2.1µs;  p50 15.4ms;  max 15.9ms;
  p90 15.6ms;  p99 15.7ms;  p999 15.9ms;  p9999 15.9ms;
      2.1µs [ 15] ▌
        2ms [  1]
        4ms [  0]
        6ms [  0]
        8ms [  0]
       10ms [  0]
       12ms [  0]
       14ms [984] ████████████████████████████████████████
       16ms [  0]
       18ms [  0]

And the _example/basic:

5.018ms
  avg 15.3ms;  min 2.2µs;  p50 15.4ms;  max 1.01s;
  p90 15.5ms;  p99 15.7ms;  p999 15.9ms;  p9999 1.01s;
      2.2µs [  90] ▌
        2ms [   0]
        4ms [   0]
        6ms [   0]
        8ms [   0]
       10ms [   0]
       12ms [   0]
       14ms [4004] ████████████████████████████████████████
       16ms [   1]
       18ms+[   1]

@alexbrainman (Member) commented Jul 22, 2021

> I don't see any changes to the runtime since go1.17beta1 that I would expect to change the behavior of time.Sleep on Windows or in general.

I don't doubt that you and I are running the same code. I think the problem is that the Windows versions we run are different.

> Do you have any theories why your results differ from what I got on go1.17beta1 in #44343 (comment) or #44343 (comment)?

I managed to reproduce your problem on another PC of mine.

CL 248699 introduced the use of CreateWaitableTimerExW with CREATE_WAITABLE_TIMER_HIGH_RESOLUTION (see #8687 (comment), where this idea came from). If that call succeeds, the Go runtime uses this approach for waiting in usleep.

In my repro, I can see the runtime is successfully calling and using CreateWaitableTimerExW with CREATE_WAITABLE_TIMER_HIGH_RESOLUTION, but unfortunately the real wait is long, about 10ms.

Perhaps @jstarks has an explanation of what is wrong here.

I managed to disable the effects of CL 248699 on my repro PC by setting highResTimerSupported in the runtime to false. If I do that, my wait times are back to around 1ms. But then timeBeginPeriod is back with a vengeance, and that was the whole point of CL 248699 - to stop using timeBeginPeriod.

No idea what to do here.

> For me, Go tip still has the same problem, ...

I believe you. See my reply above. I can reproduce it too.

Alex
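
For anyone who wants to observe the high-resolution timer path in isolation, a hedged user-space sketch of the mechanism CL 248699 adopted (CreateWaitableTimerExW, SetWaitableTimer, and CREATE_WAITABLE_TIMER_HIGH_RESOLUTION = 0x2 are documented Windows APIs; the lazy-DLL plumbing is my assumption, since x/sys/windows may not export all of them):

//go:build windows

package main

import (
	"time"
	"unsafe"

	"golang.org/x/sys/windows"
)

const createWaitableTimerHighRes = 0x00000002 // CREATE_WAITABLE_TIMER_HIGH_RESOLUTION

var (
	kernel32              = windows.NewLazySystemDLL("kernel32.dll")
	createWaitableTimerEx = kernel32.NewProc("CreateWaitableTimerExW")
	setWaitableTimer      = kernel32.NewProc("SetWaitableTimer")
)

// sleepHighRes blocks on a high-resolution waitable timer for d.
func sleepHighRes(d time.Duration) error {
	h, _, err := createWaitableTimerEx.Call(0, 0, createWaitableTimerHighRes, 0x1F0003 /* TIMER_ALL_ACCESS */)
	if h == 0 {
		return err
	}
	defer windows.CloseHandle(windows.Handle(h))
	due := -int64(d / 100) // negative = relative due time, in 100ns units
	if r, _, err := setWaitableTimer.Call(h, uintptr(unsafe.Pointer(&due)), 0, 0, 0, 0); r == 0 {
		return err
	}
	_, werr := windows.WaitForSingleObject(windows.Handle(h), windows.INFINITE)
	return werr
}

func main() {
	start := time.Now()
	sleepHighRes(50 * time.Microsecond)
	println(time.Since(start).String())
}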

@ChrisHines (Contributor) commented Jul 22, 2021

OK, @alexbrainman, it looks like we understand the problem the same way then. I agree with your analysis.

@jstarks commented Jul 22, 2021

Hmm. This is unfortunate. I can look into the Windows timer code some more to see what might be going wrong. @alexbrainman, since you have seen both modes of this behavior, could you report your Windows OS build numbers for the two different machines?

@ChrisHines (Contributor) commented Jul 22, 2021

@jstarks Did you see #44343 (comment)? I think I explained what's going on in detail there. What else do you think we need to look at?

@jstarks commented Jul 22, 2021

Ah, sorry, I didn't. That analysis sounds correct to me.

Unfortunately there is no way to opt into a high-resolution synchronous wait (for GetQueuedCompletionStatusEx or otherwise).

I can think of a few different ways to address this:

  1. Always enable 1ms timer granularity. This isn't so bad for recent Windows versions since this has become a per-process property (at least in some configurations--not certain on the details here right now).
  2. A local fix--wait on a high-resolution timer locally in netpoll. With only local changes, you'd probably have to add a second wait, so that you first wait on the completion port + timer via WaitForMultipleObjects, then call GetQueuedCompletionStatusEx if the completion port wait succeeded. Of course, there's a race here and you'd have to loop. And this adds in a least two more syscalls per short netpoll, which may be prohibitive.
  3. A non-local fix--maintain one or more high-resolution timers to represent the next sleep wakeup time(s). Associate these with the netpoll completion port via NtCreateWaitCompletionPacket + NtAssociateWaitCompletionPacket. Keep these up to date with the next wakeup time. Then you'll get the timeout via one of the threads at random, since the IOCP is global. I don't know enough about the Go scheduler to know if this is a good idea.

None of these are particularly nice.

@networkimprov commented Jul 22, 2021

cc @ianlancetaylor re these suggestions...

@aclements (Member) commented Jul 22, 2021

@jstarks, is there documentation on NtCreateWaitCompletionPacket and NtAssociateWaitCompletionPacket?

We do keep a high-res timer per M (minit creates it). Does already having those timers around simplify anything?

@jstarks commented Jul 23, 2021

No, these APIs are currently undocumented. I'd like to fix that, but it may take some time. These are the APIs that back the CreateThreadpoolWait and SetThreadpoolWait functions, which are already documented but don't give you the level of control you want over your threads.

The basic idea is that a wait completion packet allows associating a wait operation with an IOCP. You create one per concurrent wait you'd like to support (so one per M in this case, I guess), and you start the wait by associating the packet with the IOCP and a wait object. If the object becomes signaled, the signal is consumed (for auto-reset objects), the packet's information is posted to the IOCP, and the packet is disassociated. You must reassociate the packet again to arm it for another wait (similar to EPOLLONESHOT).

You can use these with any waitable object, including timers.

@alexbrainman (Member) commented Jul 25, 2021

> @alexbrainman, since you have seen both modes of this behavior, could you report your Windows OS build numbers for the two different machines?

@jstarks I cannot reproduce this (#44343 (comment)) behaviour anymore. Makes me look like a liar. I can only see now average of 10-15 ms like everyone else.

I thought perhaps my odd result was due to some programs running on the system at the time of my test. So I built another Go program that just calls timeBeginPeriod and then goes to sleep, and ran that program alongside my test. But that still gives me 10-15ms sleeps.

You should be able to easily reproduce what we see with the instructions at #44343 (comment). If you have access to the Windows kernel, perhaps you can figure out why high-performance timers give us minimum sleeps of 10-15ms instead of 1ms.

Alex

@ChrisHines (Contributor) commented Jul 25, 2021

> If you have access to the Windows kernel, perhaps you can figure out why high-performance timers give us minimum sleeps of 10-15ms instead of 1ms.

I am not sure that's the right question given what we know. High performance timers are not involved in time.Sleep processing in Go 1.16+. Since CL232298, time.Sleep and other runtime timers rely completely on the netpoller, which on Windows relies on GetQueuedCompletionStatusEx. In #44343 (comment) @jstarks tells us:

> Unfortunately there is no way to opt into a high-resolution synchronous wait (for GetQueuedCompletionStatusEx or otherwise).

Which I think leaves us with either changing something in how the netpoller waits for short durations or finding a way not to rely on the netpoller for short duration sleeps or non-network related timers.

@alexbrainman (Member) commented Jul 26, 2021

> High performance timers are not involved in time.Sleep processing in Go 1.16+. Since CL232298, time.Sleep and other runtime timers rely completely on the netpoller, which on Windows relies on GetQueuedCompletionStatusEx.

I was not aware of CL 232298. Indeed I can see that now time.Sleep is implemented by using GetQueuedCompletionStatusEx.

@jstarks I take back my complaint about high-res timers. They work as expected.

The problem is that GetQueuedCompletionStatusEx will sleep for a minimum of 15ms, if timeBeginPeriod is not called.

I suspect that, if we switch timeBeginPeriod back on again, GetQueuedCompletionStatusEx will sleep for a minimum of 1ms as before. I am not sure we want to do that.

Alex
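
For completeness, a hedged sketch of that timeBeginPeriod workaround at the application level (winmm.dll's timeBeginPeriod/timeEndPeriod are documented Windows APIs; whether the power cost is acceptable is exactly the trade-off CL 248699 tried to avoid):

//go:build windows

package main

import (
	"time"

	"golang.org/x/sys/windows"
)

var (
	winmm           = windows.NewLazySystemDLL("winmm.dll")
	timeBeginPeriod = winmm.NewProc("timeBeginPeriod")
	timeEndPeriod   = winmm.NewProc("timeEndPeriod")
)

func main() {
	// Request 1ms timer granularity process-wide, as the Go runtime itself
	// did before CL 248699, and release it on exit.
	timeBeginPeriod.Call(1)
	defer timeEndPeriod.Call(1)

	start := time.Now()
	time.Sleep(50 * time.Microsecond)
	println(time.Since(start).String())
}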

Sean-Der added commits to pion/webrtc that referenced this issue Aug 13, 2021

time.Sleep is not precise, instead use a ticker.

Relates to golang/go#44343

Co-authored-by: Twometer <twometer@outlook.de>
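
A sketch of the shape of that workaround (sendFrame below is a hypothetical stand-in for the per-interval work): a time.Ticker's schedule is anchored when it is created, so one late wakeup does not delay every subsequent iteration the way back-to-back time.Sleep calls in a loop do.

package main

import "time"

// runEvery calls work at the given interval until stop is closed. Late
// ticks may be dropped, but the cadence does not drift the way a
// sleep-in-a-loop does.
func runEvery(interval time.Duration, stop <-chan struct{}, work func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			work()
		case <-stop:
			return
		}
	}
}

func main() {
	stop := make(chan struct{})
	time.AfterFunc(time.Second, func() { close(stop) })
	runEvery(20*time.Millisecond, stop, func() { /* sendFrame() */ })
}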
@AndrewSav commented Sep 9, 2021

The code from the OP gives the same results as everyone else is seeing on my Windows 10 laptop, but on a Windows 2019 VM it runs like this:

  avg 1.57ms;  min 114µs;  p50 1.93ms;  max 2.53ms;
  p90 2.03ms;  p99 2.53ms;  p999 2.53ms;  p9999 2.53ms;
      115µs [  1] █
      500µs [ 12] █████████████▌
        1ms [ 29] █████████████████████████████████
      1.5ms [ 35] ████████████████████████████████████████
        2ms [ 22] █████████████████████████
      2.5ms [  1] █
        3ms [  0]
      3.5ms [  0]
        4ms [  0]
      4.5ms [  0]

Does anyone know why the difference? The executable is the same, built with the 1.17 release.
