time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns #37196

bcmills · 2020-02-12T16:32:11Z

There is an argument for making
tr := time.NewTimer(0)
tr.Stop()
// There may or may not be a value in the channel now. But definitely
// one should not be added after we receive it.
select {
        case <-tr.C:
        default:
}
work, but it doesn't today, and we're certainly not going to do that for Go 1.7. Maybe early in Go 1.8 if someone wants to file a different bug.

As far as I can tell, no different bug was filed: the documentation for (*Timer).Stop still says:

go/src/time/sleep.go

Lines 57 to 66 in 7d2473d

    
           // To ensure the channel is empty after a call to Stop, check the 
        
           // return value and drain the channel. 
        
           // For example, assuming the program has not received from t.C already: 
        
           // 
        
           // 	if !t.Stop() { 
        
           // 		<-t.C 
        
           // 	} 
        
           // 
        
           // This cannot be done concurrent to other receives from the Timer's 
        
           // channel.

and that behavior is still resulting in subtle bugs (#27169). (Note that @rsc himself assumed that a select should work in this way in #14038 (comment).)

I think we should tighten up the invariants of (*Timer).Stop to eliminate this subtlety.

CC @empijei @matttproud @the80srobot @ianlancetaylor

The text was updated successfully, but these errors were encountered:

bcmills · 2020-02-12T16:33:49Z

For an example program illustrating why a user might reasonably expect (and attempt to make use of) this behavior, see https://play.golang.org/p/mDkMG67ehAI.

empijei · 2020-02-12T17:18:04Z

Thanks for filing this.

I also found this behavior puzzling and apparently I got it wrong too as I was expecting no values to be sent after t.Stop() returned.

If, as I understand now, a value might be sent after the call to Stop has returned, is there any way to correctly re-use a timer?

bcmills · 2020-02-12T17:44:16Z

If, as I understand now, a value might be sent after the call to Stop has returned, is there any way to correctly re-use a timer?

If you know that no other goroutine is receiving on the C channel, then you can receive unconditionally after the call to Stop returns false, and it won't actually matter whether the value was delivered before or after the call to Stop returned.

(The select-with-default after Stop is where the bug occurs, and none of the examples in your post do that — they don't need to, because in your examples the reads on t.C are all on the same goroutine as the calls to t.Stop.)

ianlancetaylor · 2020-02-12T18:31:52Z

In the current implementation, this can probably be done by changing runOneTimer to move the Cas away from timerRunning until after the invocation of f.

rsc · 2020-05-20T17:50:08Z

Added to proposal process to resolve whether to do this, after mention in #38945.

rsc · 2020-05-27T17:36:39Z

This made its first appearance in the proposal minutes last week.
You can imagine ways to break Go programs with this, such as by

t := time.NewTimer(1*time.Second)
time.Sleep(2*time.Second)
t.Stop()
<-t.C

This would succeed today, but if we clear the channel during Stop it would start blocking.
We would have to define that in the case where Stop pulls the element back out of the channel buffer, it returns true, meaning it "stopped" the timer (false means the timer already expired; expired would now mean "expired and was observed by a channel receive on t.C").

On the other hand, there is technically a race here, and it's not guaranteed that this snippet never blocks, especially on a loaded system. The correct snippet, even today, is:

t := time.NewTimer(1*time.Second)
time.Sleep(2*time.Second)
if !t.Stop() {
    <-t.C
}

That example would be unaffected, because Stop would pull the buffered element back out of the channel and then return true, signaling to the caller that there is nothing to receive.

We could make the change at the start of a release cycle and be ready to roll it back if real (as opposed to hypothetical) problems arose.

Thoughts?

bcmills · 2020-05-27T19:32:25Z

While I like the behavior of pulling the value back out of the channel and having Stop return true, I don't think it is actually backward-compatible, depending on how the len builtin for channels interacts with the memory model.

For example, this program today is not reported as a race and the receive is guaranteed to never block, but would deadlock if Stop clears the channel:

package main

import (
        "fmt"
        "time"
)

func main() {
        t := time.NewTimer(1 * time.Second)
        for len(t.C) == 0 {
                time.Sleep(time.Millisecond)
        }
        t.Stop()
        <-t.C
        fmt.Println("ok")
}

That is why I had restricted my original proposal to guarantee only that no value is sent after Stop returns.

ianlancetaylor · 2020-05-27T22:18:47Z

Sorry, I'm not sure I understand what it means to guarantee that no value is sent on the channel after Stop returns. As far as I can see, that is already guaranteed--maybe not in the documentation, but certainly in the implementation. What is the actual change that you are suggesting?

bcmills · 2020-05-28T02:19:33Z

@ianlancetaylor, the program given in https://play.golang.org/p/Wm1x8DmYoQo should run indefinitely without reporting any delayed times. Instead, it reports a nonzero rate of delayed timer sends.

On my Xeon workstation:

example.com$ gotip version
go version devel +b2ce3931 Tue May 26 22:28:45 2020 +0000 linux/amd64

example.com$ gotip run main.go
Timer stopped at 2020-05-27 22:16:46.951252699 -0400 EDT m=+2.886904949 subsequently received time
                 2020-05-27 22:16:46.951252454 -0400 EDT m=+2.886904697.
Timer stopped at 2020-05-27 22:17:00.794561424 -0400 EDT m=+16.730213661 subsequently received time
                 2020-05-27 22:17:00.794559263 -0400 EDT m=+16.730211512.
Timer stopped at 2020-05-27 22:17:00.774413697 -0400 EDT m=+16.710065937 subsequently received time
                 2020-05-27 22:17:00.77440917 -0400 EDT m=+16.710061421.
Timer stopped at 2020-05-27 22:17:03.679525664 -0400 EDT m=+19.615177943 subsequently received time
                 2020-05-27 22:17:03.679522832 -0400 EDT m=+19.615178663.
Timer stopped at 2020-05-27 22:17:06.024962313 -0400 EDT m=+21.960614549 subsequently received time
                 2020-05-27 22:17:06.024960025 -0400 EDT m=+21.960612270.
Timer stopped at 2020-05-27 22:17:06.692670731 -0400 EDT m=+22.628322974 subsequently received time
                 2020-05-27 22:17:06.692669848 -0400 EDT m=+22.628322092.
Timer stopped at 2020-05-27 22:17:22.942118175 -0400 EDT m=+38.877770415 subsequently received time
                 2020-05-27 22:17:22.942117468 -0400 EDT m=+38.877769713.
Timer stopped at 2020-05-27 22:17:22.901857123 -0400 EDT m=+38.837509358 subsequently received time
                 2020-05-27 22:17:22.901855958 -0400 EDT m=+38.837508205.
Timer stopped at 2020-05-27 22:17:27.342696328 -0400 EDT m=+43.278348567 subsequently received time
                 2020-05-27 22:17:27.342693224 -0400 EDT m=+43.278345463.
Timer stopped at 2020-05-27 22:17:28.993568136 -0400 EDT m=+44.929220391 subsequently received time
                 2020-05-27 22:17:28.993571601 -0400 EDT m=+44.929223839.
Received 10 delayed times after 128517449 iterations.

In all cases, the failure mode is the same: at some point after Stop returns, the channel receives a very-slightly-earlier timestamp. That is: the return from Stop is not synchronized with the actual send to the channel.

Instead, I propose that the send to the channel, if it occurs at all, ought to happen before the return from Stop.

ianlancetaylor · 2020-05-28T04:22:40Z

Thanks for the example. If the code is written per the documentation of time.Timer.Stop, it works correctly (https://play.golang.org/p/KlC__ujuE5M).

And now I see how this happens. It's because in runOneTimer we set the status to timerNoStatus before we call the function that sends a value on the channel, which permits deltimer and therefore time.Timer.Stop to return false before we send the value. The return of false does reliably mean that a value will be sent on the channel, but the value may not have been sent when Stop returns.

I don't see an easy fix in the current implementation. We can't leave the timer in timerRunning state while we unlock pp.timersLock in order to run the function.

bcmills · 2020-05-28T20:18:03Z

Would it make sense to change the timer to the timerModifying state while running the callback (which should be short), and then to timerNoStatus after the callback completes?

(Or would that potentially induce starvation for a time.Ticker with a very short period?)

bcmills · 2020-05-28T20:41:57Z

The same send-after-Stop problem appears to also applies to time.Ticker, which I suppose is not a surprise because they share the same underlying timer implementation.

(Code in https://play.golang.org/p/Y_Hz4xkYr07, but it doesn't reproduce there due to the Playground's faketime patch.)

ianlancetaylor · 2020-05-28T20:43:39Z

It's actually a bit more complicated than I thought. If we leave the timer status as timerRunning (or timerModifying), then if the timer function does anything that touches the timer, it can hang. For example, the scavenger uses a timer with a function of wakeScavenger that expects to stop the timer. That would hang waiting for the timer to leave timerRunning status. The netpoll code also manipulates the timer in the timer function, though I think it wouldn't cause any problems here.

rsc · 2020-06-03T17:33:07Z

Let's try to separate the discussion of semantics from implementation.

The docs say:

To ensure the channel is empty after a call to Stop, check the return value
and drain the channel. For example, assuming the program has not received
from t.C already:

    if !t.Stop() {
    	<-t.C
    }

The proposed change in semantics is to make it that no receive can ever happen after t.Stop returns. That would mean that under the assumption - "assuming the program has not received from t.C already" - t.Stop would never return false anymore. So the above code would no longer be required (and if left alone would just never execute the if body).

This would avoid the problem of people not knowing to write that fixup code or not understanding the subtlety involved. Certainly we've all confused ourselves enough just thinking through this over the years.

That's the proposed semantic change.

There are also questions about how to implement that - it's not just the easy "pull the item back out of the buffer" that I suggested before - but certainly it is possible. Let's figure out if we agree about the semantic change.

(Another side-effect would probably be that len(t.C) would always return 0, meaning the channel would appear to be unbuffered. But we've never promised buffering in the docs. The buffering is only there because I was trying to avoid blocking the timer thread. It's a hack.)

bcmills · 2020-06-03T18:21:13Z

I agree that if we could change the semantics to ensure that no receive is possible after Stop, that would be preferable to both the current behavior and the originally-proposed change to only disallow a send-after-Stop.

However, I don't see any way to do that without breaking the example in https://play.golang.org/p/r77N1PfXuu5.

That program explicitly relies on the buffering behaviors of t.C: specifically, that len(t.C) becomes nonzero when the timer fires, and that the buffered value remains receivable after t.Stop(). Neither of those behaviors is strictly guaranteed by the documentation, but nor does anything in the documentation explicitly reserve them as subject to change — and they have been observable since CL 4063043 over nine(!) years ago.

Disallowing a receive after Stop must necessarily break one of those two behaviors. If we marked the buffer as non-empty but then retracted the buffered value in t.Stop(), then the program would block forever on the subsequent receive. If we instead made the channel unbuffered — or, equivalently, hooked the runtime to avoid sending to it until some goroutine is ready to receive — then the program would loop forever waiting for len(t.C) to become nonzero.

So the only way I see to make that change in semantics would be to declare that this example program is already incorrect, because it is relying on undocumented behavior regardless of the stability of that behavior. I could be wrong, but given how long this behavior has been in place I suspect that changing it would break real programs.

ianlancetaylor · 2020-06-04T00:43:04Z

Do you know of any programs that call len on a timer channel?

bcmills · 2020-06-04T21:26:36Z

I do not know of any specifically, but given the 9-year interval and Hyrum's law I would be surprised if they did not exist.

If we are willing to assume that such programs do not exist (or declare that they are invalid if they do exist, because the buffering behavior was never documented), then preventing the receive seems ok.

gopherbot · 2024-03-01T05:04:33Z

Change https://go.dev/cl/568341 mentions this issue: time: avoid stale receives after Timer/Ticker Stop/Reset return

gopherbot · 2024-03-01T05:04:33Z

Change https://go.dev/cl/568375 mentions this issue: cmd/compile: compile cap(ch) as call to runtime.chancap

An upcoming CL will give this call more to do. For now, separate out the compiler change that stops inlining the computation. For #37196. Change-Id: I965426d446964b9b4958e4613246002a7660e7eb Reviewed-on: https://go-review.googlesource.com/c/go/+/568375 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Auto-Submit: Russ Cox <rsc@golang.org>

rsc · 2024-03-11T13:40:24Z

Looks like this may happen for Go 1.23, finally.

In the discussion above #37196 (comment) I said the channel would "appear to be unbuffered", but I don't think I realized at the time (or I forgot) that that's literally the entire semantic change: timer channels will now be synchronous aka unbuffered. All the simplification follows from that. (The implementation is more subtle, including a hidden buffer as an optimization, but the semantic effect is simply that timer channels will now behave as if unbuffered.)

For a graceful rollout, this change and the change for #8898 can be disabled by GODEBUG=asynctimerchan=1.

kevinburkesegment · 2024-03-19T23:14:44Z

Just found this thread while trying to track down a test flake - the test was trying to drain the channel with a <-timer.C but also provided a default: flag which caused the old event to fail to be read. I am glad that the behavior here is changing to make it more difficult to do the subtly wrong thing.

zigo101 · 2024-04-02T08:51:21Z

With tip, the finalizer of a timer will not get executed. Is it intended?

package main

import "time"
import "runtime"

func main() {
	timer := time.NewTimer(time.Second)
	runtime.SetFinalizer(timer, func(c *time.Timer){
		println("done")
	})
	timer.Stop()
	runtime.GC()
	time.Sleep(time.Second)
}

$ gotv 1.22. run main.go
[Run]: $HOME/.cache/gotv/tag_go1.22.1/bin/go run main.go
done
$ gotv :tip run main.go
[Run]: $HOME/.cache/gotv/bra_master/bin/go run main.go
$

zigo101 · 2024-04-02T08:54:48Z

BTW, since Go 1.23, does the Stop method of a Timer still need to be called to make the Timer collectable as earlier as possible?

kscooo · 2024-04-02T10:49:28Z

BTW, since Go 1.23, does the Stop method of a Timer still need to be called to make the Timer collectable as earlier as possible?

https://go-review.googlesource.com/c/go/+/512355#:~:text=This%20CL%20fixes,be%20GC%27ed%20immediately.

ianlancetaylor · 2024-04-02T17:41:22Z

@go101 The finalizer change sounds like a bug to me. Please open a new issue for that. Thanks.

zigo101 · 2024-04-02T18:10:18Z

@ianlancetaylor I'm not sure whether or not it is a bug, because the comment in the PR mentions a Timer object might be not allocated on heap, Anyway, I created the issue.

ianlancetaylor · 2024-04-02T18:46:07Z

Thanks for creating the issue. Seems to me it is a bug either in the implementation or in the documentation.

prashantv · 2024-06-21T23:00:18Z

With this change, I'm not sure if it will be possible for fake clocks to be used for deterministic tests and behave similarly to the stdlib timer methods.

Given some production code, say:

func delayedPush(clock clock.Clock) {
  c := clock.Timer(time.Second)
  <-c
  push()
}

And the following test:

func TestDelayedPush(t *testing.T) {
	fake := clock.NewFake()
	go delayedPush(fake)

	fake.Advance(time.Second)
	if !pushed {
		t.Errorf("expected push")
	}
}

This test is flaky: it passes if the background goroutine has created the timer before the fake advances time, but fails otherwise since. This is a really common issue so tests often use runtime.Gosched() or time.Sleep(...) to work around these, and while that reduces the likelihood of failure, it can still occur -- it's not fully deterministic.

This problem can be solved with fakes that support blocking until the timer is created, e.g.,BlockUntil,

func TestDelayedPush(t *testing.T) {
	fake := clock.NewFake()
	go delayedPush(fake)

	fake.BlockUntil(1) // wait for background Timer call
	fake.Advance(time.Second)
	if !pushed {
		t.Errorf("expected push")
	}
}

Prior to this change, since the channel was buffered, the Advance would write to the channel regardless of whether the background delayedPush goroutine was waiting on the channel or not. With this change, assuming the fake behaves consistently with the stdlib (non-blocking write to an unbuffered channel), then the test is flaky again: it passes if the background goroutine is reading from the channel, otherwise, the write is dropped, and the test fails.

I'm currently working on a fake clock library, primarily to help with writing deterministic, non-flaky tests, and with this change, I'm not sure that'll be possible within the current API surface.

ianlancetaylor · 2024-06-21T23:15:38Z

The code that uses the timers shouldn't care whether the channel is buffered or not. I don't see why a fake clock library would have to use an unbuffered channel.

Please see #67434 for some recent ideas on how to test timers and the like.

prashantv · 2024-06-22T00:12:50Z

The fake can use a buffered channel, but it's behavior will not be able to match the stdlib:

len and cap would return different values. Based on Russ's analysis above, I think this is probably OK.
Stop will behave differently since it can't unsend values like the stdlib.

I was hoping to build a fake that matches the stdlib behavior closely for high fidelity testing.

In this case, it's not a huge issue since the new stdlib behavior is less error-prone, and continues to work with code written against the previous model.

After reading the time PR more closely, it's likely that the fake can emulate the stdlib more closely by using the expensive approach that the change emulates -- use a separate goroutine to write to the channel, cancelling it when Stop is called, etc.

powerman · 2024-06-22T13:46:50Z

func delayedPush(clock clock.Clock) {
  c := clock.Timer(time.Second)
  <-c
  push()
}

To me it looks like test for this code will be racy in any case - this just can't be solved by doing something with timers, because all you can get as result is sync with <-c line, but not push() line, while test actually needs sync with push() line.

bcmills added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. early-in-cycle A change that should be done early in the 3 month dev cycle. labels Feb 12, 2020

bcmills added this to the Backlog milestone Feb 12, 2020

bcmills mentioned this issue Feb 25, 2020

time: racy Timer access should either work or throw, not panic #37400

Closed

bcmills mentioned this issue May 11, 2020

proposal: time: Timer.StopAndClear method #38945

Closed

rsc changed the title ~~time: a Timer should not send to its channel after its Stop method returns~~ proposal: time: a Timer should not send to its channel after its Stop method returns May 20, 2020

gopherbot added Proposal and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels May 20, 2020

rsc modified the milestones: Backlog, Proposal May 20, 2020

rsc added this to Active in Proposals (old) May 20, 2020

rsc mentioned this issue May 20, 2020

proposal: review meeting minutes #33502

Open

rsc changed the title ~~proposal: time: a Timer should not send to its channel after its Stop method returns~~ proposal: time: a Timer channel should not be receivable after its Stop method returns May 27, 2020

rsc moved this from Likely Accept to Accepted in Proposals (old) Jun 24, 2020

rsc modified the milestones: Proposal, Backlog Jun 24, 2020

rsc changed the title ~~proposal: time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns~~ time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns Jun 24, 2020

rsc added Proposal-Accepted and removed Proposal-FinalCommentPeriod labels Jun 24, 2020

mstoykov mentioned this issue Oct 28, 2020

Immediately shutdown k6 cloud on second sig-c signal grafana/k6#1647

Merged

ash2k mentioned this issue Nov 12, 2020

Make RetryWatcher sleep interruptible kubernetes/kubernetes#93709

Merged

bcmills mentioned this issue Apr 22, 2021

proposal: time: context-aware time Ticker and Timer #45571

Closed

bcmills mentioned this issue Aug 17, 2023

time: stop requiring Timer/Ticker.Stop for prompt GC #61542

Closed

gopherbot closed this as completed in 966609a Mar 14, 2024

zigo101 mentioned this issue Apr 2, 2024

time: the finalizers of Timer objects will not get executed #66650

Open

dmitshur modified the milestones: Backlog, Go1.23 May 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns #37196

time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns #37196

bcmills commented Feb 12, 2020

bcmills commented Feb 12, 2020 •

edited

Loading

empijei commented Feb 12, 2020

bcmills commented Feb 12, 2020 •

edited

Loading

ianlancetaylor commented Feb 12, 2020

rsc commented May 20, 2020

rsc commented May 27, 2020

bcmills commented May 27, 2020 •

edited

Loading

ianlancetaylor commented May 27, 2020

bcmills commented May 28, 2020 •

edited

Loading

ianlancetaylor commented May 28, 2020

bcmills commented May 28, 2020

bcmills commented May 28, 2020

ianlancetaylor commented May 28, 2020

rsc commented Jun 3, 2020

bcmills commented Jun 3, 2020

ianlancetaylor commented Jun 4, 2020

bcmills commented Jun 4, 2020

gopherbot commented Mar 1, 2024

gopherbot commented Mar 1, 2024

rsc commented Mar 11, 2024

kevinburkesegment commented Mar 19, 2024

zigo101 commented Apr 2, 2024

zigo101 commented Apr 2, 2024

kscooo commented Apr 2, 2024 •

edited

Loading

ianlancetaylor commented Apr 2, 2024

zigo101 commented Apr 2, 2024

ianlancetaylor commented Apr 2, 2024

prashantv commented Jun 21, 2024

ianlancetaylor commented Jun 21, 2024

prashantv commented Jun 22, 2024 •

edited

Loading

powerman commented Jun 22, 2024

time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns #37196

time: make Timer/Ticker channels not receivable with old values after Stop or Reset returns #37196

Comments

bcmills commented Feb 12, 2020

bcmills commented Feb 12, 2020 • edited Loading

empijei commented Feb 12, 2020

bcmills commented Feb 12, 2020 • edited Loading

ianlancetaylor commented Feb 12, 2020

rsc commented May 20, 2020

rsc commented May 27, 2020

bcmills commented May 27, 2020 • edited Loading

ianlancetaylor commented May 27, 2020

bcmills commented May 28, 2020 • edited Loading

ianlancetaylor commented May 28, 2020

bcmills commented May 28, 2020

bcmills commented May 28, 2020

ianlancetaylor commented May 28, 2020

rsc commented Jun 3, 2020

bcmills commented Jun 3, 2020

ianlancetaylor commented Jun 4, 2020

bcmills commented Jun 4, 2020

gopherbot commented Mar 1, 2024

gopherbot commented Mar 1, 2024

rsc commented Mar 11, 2024

kevinburkesegment commented Mar 19, 2024

zigo101 commented Apr 2, 2024

zigo101 commented Apr 2, 2024

kscooo commented Apr 2, 2024 • edited Loading

ianlancetaylor commented Apr 2, 2024

zigo101 commented Apr 2, 2024

ianlancetaylor commented Apr 2, 2024

prashantv commented Jun 21, 2024

ianlancetaylor commented Jun 21, 2024

prashantv commented Jun 22, 2024 • edited Loading

powerman commented Jun 22, 2024

bcmills commented Feb 12, 2020 •

edited

Loading

bcmills commented Feb 12, 2020 •

edited

Loading

bcmills commented May 27, 2020 •

edited

Loading

bcmills commented May 28, 2020 •

edited

Loading

kscooo commented Apr 2, 2024 •

edited

Loading

prashantv commented Jun 22, 2024 •

edited

Loading