Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: time.Sleep bug #61456

Closed
johnrs opened this issue Jul 19, 2023 · 18 comments
Closed

runtime: time.Sleep bug #61456

johnrs opened this issue Jul 19, 2023 · 18 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.

Comments

@johnrs
Copy link

johnrs commented Jul 19, 2023

What version of Go are you using (go version)?

go1.20.6 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

windows_amd64 - Windows 10 Pro, 22H2, Build 19045.2965

What did you do?

Please see Issues #44343, #44476, #44608, and #61042.

What did you expect to see?

Please see Issues #44343, #44476, #44608, and #61042.

What did you see instead?

Please see Issues #44343, #44476, #44608, and #61042.

Discussion

A timing bug identified 2.5 years ago affects all versions of Go since go1.16. It causes some programs to not run correctly when compiled with go1.16 or after. It prevents some Windows users from upgrading past go1.15 for certain programs. Hence this proposed bug mitigation: Enable osRelax() until the timing bug is fixed.

In go1.16.0 - go1.16.3 changes were made to improve timing operations in runtime. These changes included disabling osRelax() to prevent it from switching the Windows system clock from the default 15.6ms to the more precise 1ms.

The resulting problem isn't always apparent on older versions of Windows, because if any other program running requested the 1ms switch it affected all programs. But in Windows 10 version 2004 (April 2020) and after, this was changed by Microsoft: timeBeginPeriod is no longer global. Only the requesting program sees the switched clock.

A quick test with a user function setting timeBeginPeriod to 1 ms shows that this avoids the timing problem. But this is not a production fix. The user function is not coordinated with runtime's uses of timeBeginPeriod in other places. In the past this was shown to be unacceptable. The solution was osRelax().

My proposal is to reactivate osRelax() which is in runtime/os_windows.go, line 426. Changing the const highResTimerSupported on line 464 would possibly be a way to do it.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 19, 2023
@seankhliao
Copy link
Member

we should keep the discussion in one place, please use the original issue

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Jul 19, 2023
@johnrs
Copy link
Author

johnrs commented Jul 19, 2023

@seankhliao

Sorry, I was advised to open a new issue. Could you please re-open #61042 and mark it as a Bug?

@ianlancetaylor
Copy link
Contributor

Where were you advised to open a new issue?

It seems to me that the issue for this problem is #44343, which is still open. Do we need another issue besides that one?

@johnrs
Copy link
Author

johnrs commented Jul 20, 2023

Yes, if you check the last few messages at the very bottom of #61042. I apologize if I misunderstood, and didn't do it correctly.

#44343 was more about finding out what was causing the problem than getting it fixed.

What do you recommend?

@ianlancetaylor
Copy link
Contributor

What I see in #61402 is that @ChrisHines suggested that you post on #44343, or perhaps open a new proposal issue with new API. Perhaps this issue is intended to be that proposal? But it's not written as a proposal. A proposal should be about the proposed change. In this issue I guess you are proposing os.Relax? But that doesn't exist today, and I don't know what it is supposed to do, and you don't explain it.

I'm not trying to be difficult here. If you want to talk about the bug, do that on #44343. If you want to make a specific proposal, then do that, as described at https://go.dev/s/proposal (this issue is formatted as a bug report, not a proposal). If you want to propose that we add os.Relax, describe what it does as it would appear on https://pkg.go.dev/os.

@johnrs
Copy link
Author

johnrs commented Jul 21, 2023

Yes, and I did post on #44343 about getting it fixed. Over a week later, after no response, I moved on and opened this issue. I started to write it as a proposal, but as I read the guidelines, I realized that bug might be a better option. It sort of seems that proposals were for new features, while bugs were for fixing old ones.

Either way, for this issue it's the same info. I tried to describe the problem, the effects, the cause, and a possible mitigation. As I read what I wrote, I see that the proposal to fix it is somewhat implied, not stated. I will append a line making the proposal clear.

osRelax is already in runtime/os_windows.go. It just isn't being used currently. Sorry, I see that I referred to it as os.Relax by mistake.

What next? Can this issue be the proposal?

@ianlancetaylor
Copy link
Contributor

Just to set expectations, for the Go project it's not surprising when an issue that's been open for over two years doesn't get a response within a week. We have many open issues.

For clarity, when you write os.Relax I think you mean adding a new function to the os package. I now understand that you are referring to the existing internal runtime function osRelax, which has nothing to do with the os package. We do currently use osRelax in the runtime package. It's called from the system monitor thread, which runs the function sysmon in the file runtime/proc.go.

It sounds like you understand how to fix the problem, at least for newer Windows systems, which suggests that you should send in a patch, referencing #44343. Per https://go.dev/wiki/MinimumRequirements we now only support Windows 10 and later, which may make fixing this on HEAD simpler.

@johnrs
Copy link
Author

johnrs commented Jul 21, 2023

Am I correct in viewing this as a bug? Indeed, it would seem to violate the Go Compatibility Guarantee goal. That's why I thought that classifying it as a bug would gain it more attention than simply a proposal. Sorry if I got that wrong.

No, I don't understand how to fix the problem correctly. After reading the various issues about it, I sense that the consensus is to make a change to use a new Windows call. Unfortunately it isn't a documented call, hence the wait till it is.

That's why I refered to my idea, as a mediation, not a solution, of the problem. It would be temporary, until the desired Windows call was reliably available.

I'm not even sure of the best way to accomplish reactivting osRelax. I can think of 2 ways offhand, but either might possibly produce undesireable side effects. I don't know anywhere near enough about the runtime to say. Thus I don't feel that I'm qualified to propose the exact patch.

The 2 ways are: Negating the const highResTimerSupported, or removing the "if" in osRelax which tests the const.

What next? Can this issue be the proposal or bug - whichever is more appropriate?

@ianlancetaylor
Copy link
Contributor

This seems like a bug to me.

I don't see how changing this behavior would break the compatibility guarantee.

I don't see any API change here, so I don't see any reason for a proposal.

The place to discuss this bug is #44343. Duplicating the bug report into a different issue will just confuse matters, it won't help anything, and in particular it won't cause the bug to be fixed any faster.

Go is an open source project, which means that bugs get fixed by the people who care about them the most. I personally don't know enough about Windows to know what should be done here.

@johnrs
Copy link
Author

johnrs commented Jul 21, 2023

We agree that it's a bug. When #44343 was opened it wasn't know to be a bug. Now that it is, can #44343 be changed to bug status?

This bug breaks some programs such that they no longer run correctly. I believe this is covered by the compatibility guarantee.

@johnrs
Copy link
Author

johnrs commented Jul 21, 2023

P.S. You might be interested in a comment on #44343 by mmdriley about 3 months ago:

It seems relevant to note that @rsc cited this issue in https://research.swtch.com/telemetry-uses as a surprising instance of a bug that might keep Windows users from upgrading from Go _1.15.

@ianlancetaylor
Copy link
Contributor

can #44343 be changed to bug status?

#44343 is already considered to be a bug.

This bug breaks some programs such that they no longer run correctly. I believe this is covered by the compatibility guarantee.

Guaranteeing that all programs continue to run correctly is much stronger than the compatibility guarantee. If that were the guarantee, almost nothing could ever be changed.

@johnrs
Copy link
Author

johnrs commented Jul 21, 2023

Great! How do you tell? When I look at the "labels" I don't see it.

Guaranteeing that all programs continue to run correctly is much stronger than the compatibility guarantee.

Really? What I see is: "It is intended that programs written to the Go 1 specification will continue to compile and run correctly, unchanged, over the lifetime of that specification."

@ChrisHines
Copy link
Contributor

Here are some links to help connect the dots. The comment from mmrdiley that @johnrs is referring to is:

The reason Russ became aware of the issue and mentioned it in the telemetry uses article was due to this comment I made on the proposal to extend forwards compatibility:

His reply to my comment in that discussion is here:

@ianlancetaylor
Copy link
Contributor

Great! How do you tell? When I look at the "labels" I don't see it.

We don't have a label for "this is a bug."

What I see is: "It is intended that programs written to the Go 1 specification will continue to compile and run correctly, unchanged, over the lifetime of that specification."

See also the exceptions for "unspecified behavior" and "bugs".

@johnrs
Copy link
Author

johnrs commented Jul 21, 2023

We don't have a label for "this is a bug."

Ok. For some reason I was thinking there would be a "Bug" label like there is a "Proposal" label. I don't see it now, but I thought that I read something like that.

See also the exceptions for "unspecified behavior" and "bugs".

Are you suggesting that how long the Sleep function sleeps is "unspecified behavior"?

The only mention of "bug" relates to one which violates the specification initially, and might break programs if fixed. This issue is the exact opposite of that.

@ianlancetaylor
Copy link
Contributor

To be honest, I feel like we are just engaging in unproductive word splitting at this point. And I probably misunderstood your original comment. We all agree that we should fix this bug. Is there anything else to discuss?

@johnrs
Copy link
Author

johnrs commented Jul 22, 2023

I think we've covered everything. I've already added the pertinent info to #44343.

I think that I have a better understand of the process now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.
Projects
None yet
Development

No branches or pull requests

5 participants