runtime: windows system clock resolution issue #8687

Open
defia opened this issue Sep 9, 2014 · 75 comments

@defia defia commented Sep 9, 2014

What does 'go version' print?

go version go1.3.1 windows/amd64

Actually, it's the same as this Chrome issue:
https://code.google.com/p/chromium/issues/detail?id=153139

What steps reproduce the problem?

1. Download and run the system clock resolution tool linked from
https://code.google.com/p/chromium/issues/detail?id=153139
It will show your current timer interval.
2. If the current timer interval is smaller than 15.625ms, close running programs until it
goes back to 15.625ms.
3. Run any program compiled by Go.

What happened?
The current timer interval goes to 1ms while a Go-compiled binary is running, and goes back to 15.625ms
after it quits.

What should have happened instead?
The current timer interval should not change.

At least, with no time package used, the timer setting should not change, or there should
be a method to control this.
Contributor

@ianlancetaylor ianlancetaylor commented Sep 9, 2014

Comment 1:

Labels changed: added repo-main, release-go1.4, os-windows.

Member

@alexbrainman alexbrainman commented Sep 10, 2014

Comment 2:

I confirm what DefiaQ said.
The problem is that we call the timeBeginPeriod Windows API. It was introduced as part of
the pprof implementation:
changeset:   9786:9c5c0cbadb4d
user:        Hector Chu <hectorchu@gmail.com>
date:        Sat Sep 17 17:57:59 2011 +1000
summary:     runtime: implement pprof support for windows
But I view it as "Windows timer precision is very, very low (about 15ms) compared to
other OSes, so let's get the best precision whenever we can (1ms)".
timeBeginPeriod has many problems. It does not work on some systems (it does not work on
my oldish Windows XP). I suspect it fails if you don't have the correct access rights on
the system (it works for administrator). When successfully set, timeBeginPeriod will
drain laptop batteries faster (I have read about it everywhere, but haven't had the chance to
experience it in practice). But I suspect many other applications call timeBeginPeriod
too, so Go is not the only one at fault here.
Personally, I don't care if timeBeginPeriod is set, but others might feel differently.
Perhaps there is a case here for some way to control it from within a Go program.
Alternatively, people who care can just remove the timeBeginPeriod call in their own
runtime.
Alex
Author

@defia defia commented Sep 10, 2014

Comment 3:

It will succeed even without administrator privileges.
So far I haven't found any applications other than Chrome that change the timer precision.
Member

@alexbrainman alexbrainman commented Sep 10, 2014

Comment 4:

Well, I am not running Chrome, but my timer interval is set to 1ms.
Also, why did you say "... if current timer interval smaller than 15.625ms,try to close
running programs until it goes to 15.625ms .. "? You should have said " ... close chrome
...".
Alex
Author

@defia defia commented Sep 10, 2014

Comment 5:

I also suspect other applications will change the setting, though I haven't found any so
far.
Contributor

@rsc rsc commented Sep 18, 2014

Comment 6:

Too late for 1.4. Maybe for a future release, if someone wants to do the work to prove
that it matters and that removing this is safe. I think without a clear report about bad
implications we won't do that work ourselves.

Labels changed: added release-none, removed release-go1.4.

Status changed to Accepted.

@defia defia added accepted labels Sep 18, 2014
Author

@defia defia commented Dec 9, 2014

I made a package that calls timeEndPeriod on initialization to reset the timer to normal.

If the program doesn't need much performance but runs on a Windows laptop for a long time, just import the package.
I've been using this for some days, and everything seems to be working well so far.

Author

@defia defia commented Dec 11, 2014

I've done some benchmarks today.
At first I ran a CPU-heavy program and modified it to use hundreds of goroutines. To my surprise, it actually ran ~5% faster when I set the system clock resolution to 15.6ms (the default value).

Then I ran this benchmark.
I didn't look into the code; I guess it uses only one goroutine. It is still a pure CPU workload, and I still got the same result.

Then I wrote an HTTP benchmark here.
This time it ran a lot faster (~50%) with the 1ms system clock resolution.

So I guess that if your program doesn't do much I/O (which causes fewer goroutine context switches), you can discard the system clock resolution change.

Member

@minux minux commented Dec 11, 2014

Could we use QueryPerformanceCounter for pprof support?

Member

@alexbrainman alexbrainman commented Dec 11, 2014

And how are you going to use QueryPerformanceCounter in pprof support?

Alex

Member

@mattn mattn commented Dec 12, 2014

cyclesPerSecond uses systime in os1_windows.go

https://github.com/golang/go/blob/master/src/runtime/os1_windows.go#L314-L316

https://github.com/golang/go/blob/master/src/runtime/os1_windows.go#L286-L306

This part could be replaced with QueryUnbiasedInterruptTime (not QueryPerformanceCounter).

@rsc rsc removed the os-windows label Apr 10, 2015
@rsc rsc added this to the Unplanned milestone Apr 10, 2015
@rsc rsc added OS-Windows and removed release-none labels Apr 10, 2015

@randomascii randomascii commented Jun 12, 2015

I've blogged about this issue in the past (https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/) and I wanted to share what I have learned:

As defia pointed out, having a higher timer frequency wastes CPU time and can make some CPU heavy programs run more slowly. This is because the OS is waking up ~15 times more frequently, consuming CPU time each time it does. It also means that other programs may wake up more frequently and waste CPU time - some software sits in Sleep(1) loops and will wake up ~15 times more frequently when the timer resolution is changed. So, raising the timer frequency can waste performance.

Raising the timer frequency will almost always waste some power and will harm battery life. This is the kicker. This is why no program should ever raise the timer frequency unless it needs the timer frequency raised. The timer frequency is a global setting and raising it without a good reason is poor citizenship.

If you have a work loop that relies on Sleep(1) or timeouts from WaitForSingleObject(handle, 1) then raising the timer frequency will make your loop run faster. Presumably that is why the http test ran faster. Most software should not be relying on timeouts, it should be running based on events. But, it is reasonable for specific go programs to raise the timer frequency if needed.

It is not reasonable for the go runtime to unilaterally raise timer frequency.


@randomascii randomascii commented Jun 12, 2015

With the background information out of the way...

I discovered this issue because I noticed (using clockres.exe) that the timer frequency on my computer was always raised. I used "powercfg /energy /duration 5" to identify the culprit, a go program that was running in the background. This go program was monitoring system status and was idle about 99.9% of the time but because of the go runtime's decision to raise the timer frequency for all go programs this background program was measurably affecting power consumption and battery life. This 99.9% idle go program was affecting my computer more than a program that was actively running.

This was clearly a case where the program did not need a higher timer frequency. Raising the timer frequency is a reasonable option for the go runtime, but it is absolutely not a reasonable default for the go runtime.

Chrome has been fixed to not leave the timer frequency high, so has SDL, and so have many other products. Go needs to do likewise. Please.

Member

@alexbrainman alexbrainman commented Sep 15, 2015

@randomascii

I appreciate and share your concerns. So I tried to measure the impact of removing the timeBeginPeriod call on my Windows 7 amd64 computer (see bench.txt in https://gist.github.com/alexbrainman/914885b57518cb4a6e39 ). Unfortunately, I can see quite significant slowdowns here and there. The change looks like a no-go to me. Maybe gophers who know more about the Go scheduler can explain the slowdowns and suggest ways to improve the situation.

For the moment, if someone cares about this, they can comment out the call to timeBeginPeriod.

Alex


@randomascii randomascii commented Sep 15, 2015

I know that some video applications (games, Chrome, a few others) have to leave the frequency high in order to accurately schedule tasks and meet frame rates. For other programs I don't understand the need to have the timer frequency high. I would imagine that tasks would typically be happening as quickly as possible (reading files, doing calculations, passing messages) and I'm curious as to what sort of timeouts, or waits, or sleeps are being affected by the timer frequency.

Anyway, if some Go programs need the timer frequency then those Go programs should raise the timer frequency. I have no problem with that. But if the Go runtime raises the timer frequency then that is a problem. We could debate whether raising the timer frequency should be opt-in (my strong vote) or opt-out, but it should be a choice. Otherwise long-running low-activity Go programs will be blacklisted by anybody who wants good battery life.
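As a rough sketch of what opt-in could look like in an application, the helper below raises the timer resolution only around the code that needs it. `withTimerPeriod` and `periodFunc` are hypothetical names. The two function values stand in for winmm.dll's timeBeginPeriod and timeEndPeriod, which take a period in milliseconds; they are injected here so the sketch builds on any platform.

```go
package main

import "fmt"

// periodFunc stands in for timeBeginPeriod or timeEndPeriod. On Windows these
// could be bound to winmm.dll (for example through syscall.NewLazyDLL); that
// binding is left out of this sketch on purpose.
type periodFunc func(ms uint32)

// withTimerPeriod raises the timer resolution only while fn runs. The
// deferred end call uses the same period as begin and runs even if fn panics,
// so every begin is matched by exactly one end.
func withTimerPeriod(ms uint32, begin, end periodFunc, fn func()) {
	begin(ms)
	defer end(ms)
	fn()
}

func main() {
	begin := func(ms uint32) { fmt.Printf("timeBeginPeriod(%d)\n", ms) }
	end := func(ms uint32) { fmt.Printf("timeEndPeriod(%d)\n", ms) }

	withTimerPeriod(1, begin, end, func() {
		fmt.Println("latency-sensitive work runs here")
	})
}
```

Scoping the request this way keeps the cost visible at the call site, so a mostly idle program never raises the system-wide frequency by accident.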

Go deserves better than that.

Member

@alexbrainman alexbrainman commented Sep 15, 2015

> I would imagine that tasks would typically be happening as quickly as possible ...

Things are never simple; it is always about trade-offs. But I don't know enough about the Go runtime to give you an explanation, and I am hopeful that other gophers will comment. All I can say is that Windows makes the task much harder when everything is measured in 15 milliseconds (compared to nanoseconds on Linux) - imagine running a scheduler that performs equally well on both. But maybe the Go runtime can be improved on Windows, and we can disable the timeBeginPeriod call.

Alex


@randomascii randomascii commented Sep 15, 2015

FWIW, an ETW trace (see https://github.com/google/UIforETW/releases for the easiest way to record one) can be used to record, among other things, all context switches. This makes it possible to determine where applications are waiting.

The Windows timer resolution is a continuing source of frustration to be sure, but it only slows down code that calls Sleep(n) or WaitForSingleObject(handle, n) (where 'n' is not INFINITE).

Looking at bench.txt I see that *OneShotTimeout-2 runs slower with a lower timer frequency. That sounds like it is by-design and shouldn't be used to justify all Go programs running with a high timer frequency. Ditto with BenchmarkGoroutineIdle-2.

BenchmarkAppendGrowByte-2 runs 368% slower which suggests a benchmark bug. Either the code being tested is waiting (which it should not do) or the code that times is using timeGetTime and is stopping and starting the timer frequently and is therefore hitting aliasing errors.

I strongly suspect that the latter is the problem. timeGetTime's precision is increased when the timer frequency is increased. The benchmarks with the lower timer precision are mostly not slower; they are just less accurate. The correct fix is not to raise the timer frequency (the timers will still be inaccurate) but to use accurate timers such as QueryPerformanceCounter.

It is a frustrating problem I'll agree, but raising the timer frequency so that benchmarks can measure the execution time better is not a great solution.

If it turns out that the benchmarks such as BenchmarkAppendGrowByte-2 are actually slower when the timer frequency is lower then I will be surprised and might actually investigate. Fixing slow benchmarks requires understanding why they are slow.

gopherbot pushed a commit that referenced this issue Apr 29, 2017
Currently Go sets the system-wide timer resolution to 1ms the whole
time it's running. This has negative effects on system performance and
power consumption. Unfortunately, simply reducing the timer resolution
to the default 15ms interferes with several sleeps in the runtime
itself, including sysmon's ability to interrupt goroutines.

This commit takes a hybrid approach: it only reduces the timer
resolution when the Go process is entirely idle. When the process is
idle, nothing needs a high resolution timer. When the process is
non-idle, it's already consuming CPU so it doesn't really matter if
the OS also takes timer interrupts more frequently.

Updates #8687.

Change-Id: I0652564b4a36d61a80e045040094a39c19da3b06
Reviewed-on: https://go-review.googlesource.com/38403
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
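
The hybrid approach in this commit message can be pictured as a small state machine. The sketch below is not the runtime's code: `relaxer`, `workStarted`, and `workFinished` are invented names, and the `raise` and `restore` callbacks stand in for whatever platform calls change the timer resolution. It is also not safe for concurrent use; it only shows when the transitions happen.

```go
package main

import "fmt"

// relaxer keeps the high-resolution timer while any work is running and
// drops back to the OS default only when the whole process is idle.
type relaxer struct {
	running int
	relaxed bool
	raise   func()
	restore func()
}

// newRelaxer starts in the relaxed (idle, default resolution) state.
func newRelaxer(raise, restore func()) *relaxer {
	return &relaxer{relaxed: true, raise: raise, restore: restore}
}

// workStarted is called when a unit of work becomes runnable.
func (r *relaxer) workStarted() {
	r.running++
	if r.relaxed {
		r.raise()
		r.relaxed = false
	}
}

// workFinished is called when a unit of work blocks or exits.
func (r *relaxer) workFinished() {
	if r.running == 0 {
		return // unbalanced call; ignore
	}
	r.running--
	if r.running == 0 && !r.relaxed {
		r.restore()
		r.relaxed = true
	}
}

func main() {
	r := newRelaxer(
		func() { fmt.Println("raise resolution") },
		func() { fmt.Println("restore default") },
	)
	r.workStarted()  // idle to busy: raise
	r.workStarted()  // already busy: no call
	r.workFinished() // one unit still running: no call
	r.workFinished() // busy to idle: restore
}
```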

@prasannavl prasannavl commented Aug 6, 2017

Linking in a relevant comment here:
https://go-review.googlesource.com/c/38403#message-96d2d5bb23216c271f8f8c7bd7d471e0ce999fae

Windows does use separate linked lists internally for its active threads, making the scheduler quite efficient - but running at a higher resolution does take a toll on the kernel's bookkeeping. It's evident from Chrome, which does the same trick, while Edge uses a 4ms resolution only when needed, making it over 30-40% more energy efficient as per Microsoft's claims. On the other hand, Chrome simply raises the resolution to 1ms every time - even if you just move the mouse around.

I think the goal should be to remove the reliance on multimedia timers entirely. While this likely may not be possible until UMS itself is implemented, it can be made consistent with other Windows programs on how it uses time.

The multimedia timers are for programs to decide on the resolution so they can choose effectively, not for the language runtime itself to make a decision. In the Windows world - where you don't have a very reliable nanosleep - it's generally accepted that time resolution is 15ms (unless you're doing realtime/multimedia application, in which case the program explicitly asks for it). So, it's not generally a surprise to the programmers that this is as low as you go, and it's consistent with the behavior of the platform.

And it's really not hard for programs that want to support windows to manually do the resolution switch if at all they need. So, I think Go tried too hard, and is now breaking things by forcing every Go program to not be a good citizen on the platform.


@randomascii randomascii commented Dec 12, 2017

Any progress on this? The previous comment did a good job of explaining why the Go runtime should not raise the timer frequency by default. It needs to be an opt-in option.


@mkdthanga mkdthanga commented Apr 7, 2018

I'm using Windows 10, how can I see my System Timer interval? Is there any application for it?


@randomascii randomascii commented Apr 7, 2018

The sysinternals' tool clockres is the easiest way to display the current timer interval, if you're comfortable with using the command prompt. Another alternative is this GUI tool:
https://github.com/tebjan/TimerTool

More resources are available here:
https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/

But the best option is running this batch file from an admin prompt (after installing WPT, downloading and running the latest UIforETW release will do it):
https://github.com/google/UIforETW/blob/master/bin/trace_timer_intervals.bat

This will tell you every program that has raised the timer interrupt frequency over the time period recorded. All of the other tools just give you a snapshot so they can give severely misleading results for programs (like recent Go programs, or Chrome) which frequently change the timer interrupt frequency.


@randomascii randomascii commented Jun 19, 2018

If you can open an administrator command prompt then you can run this command:

powercfg /energy /duration 5

This will create a file called energy-report.html. If you load that into a web browser and search for "Outstanding Timer Request" you can see who has raised the timer interrupt frequency. That is how I initially found that the Go runtime was doing this.

danhhz pushed a commit to danhhz/cockroach that referenced this issue May 28, 2019
time.Now() has a precision of about 1ms on Windows[0], which is coarse
enough to fail some of our tests. This implementation has a precision
of about 10µs, which is fine enough to trip a hastily-written test in
storage; that test is also fixed here.

[0] golang/go#8687

Note that this implementation is significantly slower than time.Now:
```
name   old time/op  new time/op   delta
Now-2  14.0ns ± 3%  339.4ns ± 4%  +2332.32%  (p=0.000 n=9+9)
```
I suspect this is because of the explicit syscall required. time.Now
avoids the syscall by reading the system clock directly[1][2], so if
we need better performance, we can do something similar.

[1] golang/go@74b62b4
[2] https://github.com/golang/go/blob/go1.8/src/runtime/os_windows.go#L581:L624

(sapling split of 3fc5c81)
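
A quick way to see what precision a clock actually delivers on a given machine is to spin until its reading changes. The sketch below uses only the standard time package; `clockStep` and `smallestStep` are names invented for this example. The result includes the cost of the calls themselves, so it is an upper bound on the step size rather than an exact figure.

```go
package main

import (
	"fmt"
	"time"
)

// clockStep spins until the clock reports that some time has passed since
// start and returns that first non-zero reading.
func clockStep() time.Duration {
	start := time.Now()
	for {
		if d := time.Since(start); d > 0 {
			return d
		}
	}
}

// smallestStep repeats clockStep and keeps the smallest value seen, which is
// a rough estimate of the clock's granularity.
func smallestStep(samples int) time.Duration {
	best := clockStep()
	for i := 1; i < samples; i++ {
		if d := clockStep(); d < best {
			best = d
		}
	}
	return best
}

func main() {
	fmt.Println("smallest observed clock step:", smallestStep(100))
}
```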

@randomascii randomascii commented Nov 28, 2019

I am a Windows developer working at Google. I checked recently and I found ten (10!) different programs on my work machine that were all raising the Windows system-global timer interrupt frequency to 1 kHz. I raised this with our winops team who is responsible for creating and pushing these tools and they said that it is likely that these are all Go tools.

This means that my machine is essentially always running with a raised timer interrupt frequency. This then makes it quite difficult to do any testing of Chrome's behavior with the timer interrupt frequency not raised.

The winops team is not thrilled about having to apply mitigations to every tool which they write in order to stop the Go Runtime from doing this. In addition, all that they can do is revert the timer frequency change after it happens, which means that it still happens briefly whenever a Go program is launched.

Can we please get the Go Runtime to stop doing this? It's a bad default. If Go ever becomes popular on Windows then it will need to be fixed to avoid globally increasing power usage and it would be better to fix it now.

Any regressions from fixing it necessarily come from inappropriately designed benchmarks which are inserting delays and hoping that they will be short.

If specific Go programs need the timer frequency raised then they should raise it themselves.

The internal Google bug number that I filed is 133354039.

This is the script that I have written in order to record these timer frequency transitions:
https://github.com/google/UIforETW/blob/master/bin/trace_timer_intervals.bat


@randomascii randomascii commented Nov 28, 2019

It was just pointed out to me that the Go Runtime was changed to only raise the timer interrupt frequency on demand.

11eaf42

This is an improvement, to be sure, but it is still not ideal. Because the Windows timer interrupt is a system global resource, any changes which Go makes to the timer interrupt frequency will affect other programs. That is, due to the many Go programs which our winops team runs on my workstation I can never test what Chrome behaves like when the timer interrupt frequency is "normal". Our winops team doesn't want or need the raised timer frequency and would appreciate a way to opt-out (or, better yet if the timer adjustments were opt-in).

Note that if the osRelax() implementation was behind a flag then this would also give an easy fix to this issue:

#28255

TL;DR - automatically changing the system-global timer interrupt frequency is not universally desirable and should be opt-in (ideally) or opt-out (next best). Either solution, but especially opt-in, would fix both this issue and issue 28255.

Thanks!

Contributor

@ianlancetaylor ianlancetaylor commented Dec 2, 2019

Let's revisit in the 1.15 release cycle.

We started using a higher resolution in https://golang.org/cl/4960057, which added support for profiling on Windows. Perhaps we could at least switch to the higher resolution only while profiling, not all the time.

Member

@aclements aclements commented Dec 3, 2019

I think we're all in agreement that the runtime shouldn't be raising the system timer frequency. The question is what we should be doing instead.

We tried removing the timeBeginPeriod in 2016 and it caused severe regressions in some benchmarks (#14790) because it affected sysmon's ability to promptly retake Ps from system calls and C calls. We're not doing this because specific Go applications depend on the higher frequency. If that were the case, I'd be happy to give them the API to ask for it. We're doing this because the Go runtime itself depends on the higher frequency. This doesn't affect all Go applications, but it's hard to predict what it will affect; it's clearly not just the games and multimedia applications that typically depend on a higher timer frequency.

I'd be happy to be told this is no longer the case, but as far as I know the solution is still to use UMS to detect blocked system calls rather than short timers (this would be great, in fact; UMS is a far better solution than short timers, just nobody has tackled that complexity yet). Until that's implemented, my understanding is that simply removing the timeBeginPeriod calls isn't viable.

Contributor

@ianlancetaylor ianlancetaylor commented Dec 3, 2019

Have we tried changing Windows so that entersyscall acts like entersyscallblock? It seems that then the sysmon timer frequency would be less important, though of course there would be a performance penalty for every non-blocking system call.


@johnrs johnrs commented Dec 3, 2019

I've been out of the loop for a year or two, so forgive me for not being up to date on recent changes.

> If specific Go programs need the timer frequency raised then they should raise it themselves.

It depends on your priorities. Some need the higher resolution a faster clock offers and aren't as concerned about the extra overhead it costs.

Some other languages do this, too. Perl comes to mind (last time I checked it was about 5 years ago). And there are some other programs which set it fast, sometimes dynamically (as Go does now), for example Chrome.

As far as I can remember: Initially Go set it fast, always. But the user could slow it down. Most were happy.

To make the others happy, Go set it slow after startup. This made them happy, but it upset the other camp. For example, this broke 2 of my programs. But the user could force it fast, so it was easy to patch.

Then the Go runtime decided it needed to be fast from time to time for internal reasons, so it dynamically changed it. But this wasn't coordinated with any user i/f, so the user lost the ability to slow it. Some from both camps ( fast and slow) were upset.

Then Austin came up with the great idea to set it dynamically, when sleep and some other time-related functions were called. This made most happy. The slow camp only saw occasional unwanted speedups which automatically reverted about 50ms later. The first time called, however, there could be a long (2 x 15.6ms, if memory serves) delay. This troubled the fast group but they found ways to keep it always fast (my favorite is to make a goroutine that does a 10ms sleep in a tight loop). Almost everyone was happy.

If a stable environment for testing is what you need, then one way is to run a go "utility" which stays in fast mode (as I just described) for the duration of the test, because if ANYONE on the system asks for fast then EVERYONE is affected. On the other hand, if your program runs in a pristine environment and is certain that nobody will do this, then I'd think there is no problem at test time either, as long as you test in the same environment. :)

A work-around might be to give the user the ability to enable (default) or disable the dynamic speedup done based on time-related calls. This wouldn't block what the runtime does for itself, however.

Side note: Why do some want the slow clock? Embedded systems trying to save power and high performance systems who don't want the extra OS overhead. Am I missing any? Have you done an estimate of what percentage effect is incurred by fast vs slow? I realize that situations vary, but some idea of the average and worst case would be interesting, I think.

Another side note: An additional consideration is a problem I mentioned a few years ago. Portable code might require more resolution than Windows is capable of (1ms). The greater the disparity (15.6ms), the higher the odds that something bad might happen. The 2 programs I mentioned above both exhibited this. They ran fine on a Linux system, and fine on Windows fast (1ms) - mainly because I had allowed for a clock as slow as 1ms. But they both failed on Windows slow (15.6ms). So the Go compatibility goals are a consideration also.

Side note: The Go runtime does internal timing much faster than 1ms in a number of places. I asked then if they were safe with a slower (either 1 or 15.6ms) clock. I don't recall anyone ever looking into that.

And then there's the other issue that "always longer than requested" is incorrect. It should say "from 1 tick (whatever that is on your system) less to something longer". This is most troublesome at the 1 tick boundary, of course. For example, on a fast (1ms) Windows system, if you ask for a 1ms sleep, it might sleep anywhere from 0 to 2ms (or longer if it's a busy system). It's the 0ms side that scares me.


@randomascii randomascii commented Dec 10, 2019


@ToadKing ToadKing commented Dec 10, 2019

I second the power saving/battery life issue. We ship a Windows program built in Go that is long-running and in the background most of the time. It spends most of the time idle or only waking up and doing very small and quick processing before going idle again. Our use-case would 100% take the power saving over faster performance that a faster clock might bring.


@johnrs johnrs commented Dec 10, 2019

I like the idea of an environment/runtime variable. Some (all?) of the problems between the Go runtime and the Go user could be resolved this way. It seems like the best we can hope for.


@randomascii randomascii commented Feb 5, 2020

I found another data point to answer the question of why this matters. I just found that the performance measurement test machines used at Google to test Chromium have long-running Go programs on them. This means that the timer interrupt frequency is sometimes/frequently/always raised, which means that Chrome behaves slightly differently on those machines compared to normal consumer machines. This then makes it more difficult for us to validate changes that we make to Chrome to make it less dependent on a raised timer interrupt frequency, because the timer interrupt frequency will still be raised.

FWIW.
