Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: sync: remove the Cond type #21165

Closed
bcmills opened this issue Jul 25, 2017 · 72 comments
Closed

proposal: sync: remove the Cond type #21165

bcmills opened this issue Jul 25, 2017 · 72 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Jul 25, 2017

In the discussion on #16620, I've noticed that the majority of use-cases folks describe for sync.Cond turn out to work fine with a channel instead: (*sync.Cond).Broadcast corresponds to calling close on the channel, and (*sync.Cond).Signal corresponds to sending on the channel.

The existence of sync.Cond suggests that people might want to use it, but it is currently under-documented (#20491), incompatible with other Go synchronization patterns (e.g. select statements; see #16620), and unlike most other types in the sync package, does not have a valid zero-value (its Wait method requires a non-nil L field). It has an additional "no copy" invariant enforced through both a run-time dynamic check and special-case code in the vet tool.

On top of that, condition variables are fiendishly difficult to use: they are prone to either missed or spurious signals [citation needed — experience reports welcome].

An audit of the Go standard library shows only a handful of uses:

  • io/pipe.go: The use of sync.Cond was added in https://golang.org/cl/4252057. The previous implementation used channels and had no known bugs.
  • syscall/net_nacl.go: The comment there says "We do not use channels because we need to be able to handle writes after and during close, and because a chan byte would require too many send and receive operations in real use." It does not explain why these considerations preclude the use of channels (e.g. a separate 'chan struct{}' to signal closing, and channels of tokens or slices rather than channels of tokens or bytes).
  • net/http/h2_bundle.go: There are two sync.Conds in this file. One is is in a struct "like io.Pipe". The other is only used with Broadcast (never Signal), and can thus be replaced with a channel that is closed to broadcast readiness.
  • net/http/server.go: This sync.Cond is again only used with Broadcast, and thus easy to replace with a channel.
  • crypto/tls/conn.go: Again only used with Broadcast.

Of the above uses, only the one in syscall/net_nacl.go does not have an obvious channel equivalent. However, it appears to be used to limit the size of a buffer, and I know of at least one similar "limiter" API (x/sync/semaphore) that is implemented in terms of channels in order to support interoperation with the standard context package. (I did the Cond-to-channel conversion myself on a prototype of that package before it was open-sourced.)

In light of the above observations, I propose that we remove the Cond type from the sync package in Go 2.

@ianlancetaylor
Copy link
Contributor

The main thing that sync.Cond provides that (as far as I can see) channels do not is the ability to use both Signal and Broadcast on the same synchronization point. But if nobody does that then I agree that sync.Cond is not especially useful.

cc @bradfitz since he decided to use sync.Cond in x/net/http2 in https://golang.org/cl/16310 .

@cespare
Copy link
Contributor

cespare commented Jul 25, 2017

I don't use sync.Cond often, but when I have, I've been happy it exists. I agree that it's hard to use in other languages where you only have, say, locks and condition variables as concurrency primitives. But in Go, where we have channels, one can reserve use of sync.Cond for those specific cases where it's a good fit.

Besides the signal+broadcast case mentioned, another case where it's not obvious to me how to replace a Cond with a channel is where the mutex linked to the Cond protects some state. I typically want to lock, modify some state, and then cond.Wait -- which releases the lock. If I try to replace the Cond with a channel but keep the mutex, now I have two independent synchronization primitives, and that seems trickier to reason about.

@bcmills
Copy link
Contributor Author

bcmills commented Jul 25, 2017

@cespare

another case where it's not obvious to me how to replace a Cond with a channel is where the mutex linked to the Cond protects some state.

Indeed, that's the trickier type of usage I've seen in practice. (It's also, in my experience, the type more likely to have deadlocks, or spurious or missed signals.) That kind of usage is illustrated in syscall/net_nacl.go, where the Mutex guards two monotonic variables (r and m) and triggers based on the difference between them.

I know of at least two techniques that can apply for that type of usage.

  1. If the state is always associated with a single condition, we can apply the technique illustrated in proposal: sync: mechanism to select on condition variables #16620 (comment) and store the data in the channel payload itself.

  2. Otherwise, we can replace the Cond with a list of channels instead of a single channel, and ensure that edits to that list only occur with the Mutex held. To cancel a wait, we acquire the Mutex, check whether the channel was signaled, and (if it wasn't) remove the entry from the list. This is the technique used in (*semaphore.Weighted).Acquire.

(For the latter technique, there may be something we could factor out into a library that would harmonize with the rest of Go better than sync.Cond does today, but I'm not sure that it's a common enough use case to warrant being in the standard library.)

@cznic
Copy link
Contributor

cznic commented Jul 25, 2017

sync.Cond makes emulating pthreads simple. It seems to me semantics of sync.Cond is actually heavily inspired by pthreads.

@bcmills
Copy link
Contributor Author

bcmills commented Jul 25, 2017

@cznic

sync.Cond makes emulating pthreads simple.

Agreed, but is pthread really a good example for Go's concurrency model to emulate?

@dsnet dsnet added the v2 An incompatible library change label Jul 25, 2017
@dsnet dsnet changed the title proposal (Go 2): sync: remove the Cond type proposal: Go 2: sync: remove the Cond type Jul 25, 2017
@gopherbot gopherbot added this to the Proposal milestone Jul 25, 2017
@bradfitz
Copy link
Contributor

I think Go2 could improve Cond but I think removing Cond is kinda crazy. Using channels is currently a super heavy replacement.

@kr
Copy link
Contributor

kr commented Jul 26, 2017

Note that a single Cond lets multiple waiters each wait on a slightly different condition, such as an incrementing counter reaching various thresholds. I mentioned a real-life example of this in #16620 (comment). Channels seem to be a poor fit for this.

@reusee
Copy link

reusee commented Jul 26, 2017

Channels cannot be reopened to broadcast twice.

@tombergan
Copy link
Contributor

Since x/net/http2 was mentioned: I recently wrote a CL for x/net/http2 that was complicated by the use of Cond, because I wanted to select on set of channels or a Cond. I had to implement this by spinning a goroutine to wait on the channels, then broadcast to the Cond if one of those channels fired.
https://go-review.googlesource.com/c/53250/8/http2/transport.go#955

And FWIW, x/net/http2 uses Broadcast exclusively, never Signal.

I don't have a position on removing Cond, per se, but more than once I have been annoyed by not being able to select on a Cond and a channel simultaneously. If that could be fixed I would be happy.

@glycerine
Copy link

glycerine commented Aug 28, 2017

I find channels easier to use than sync.Cond, but channels don't provide for efficient, repeated broadcasting of more than one bit from 1:N receivers. I suggest adding a new type of channel, a broadcast channel. It would be very similar to a channel with buffer size 1, with some extra logic:

a) if the channel has a value, then receiving on a broadcast channel doesn't consume the value in it. This allows an unlimited number of receivers to receive the same value.

b) delete on the broadcast channel removes any value in it. This allows the sender to stop broadcasting any value at all.

Notes

  1. an empty broadcast channel would still cause receivers to block, as usual.

  2. Senders then broadcast by sending into the channel, as usual. But each new send replaces the old value, and any receive after a replacement sees the new value. Just like a condition variable, receivers may miss values if a new value replaces the old before they awake. And just like a condition variable, a receiver may not receive a value at all if somebody deletes it from the broadcast channel before receipt.

Alternatives:

A2) a more general approach would be to allow the sending of values tagged as "sticky", and let sticky values be consumed an infinite number of times. This could be nice if you want the buffer of the channel to contain both sticky and non-sticky values. You would need a new operation to consume and eliminate a sticky value.

A3) allow the size of the buffer attached to a channel to be resized without being re-allocated. Then, assuming you know how many subscribers you have, the sender can send exactly that many values on a regular buffered channel. Unfortunately this requires that the sender have "register a new client" logic so it knows when to expand the queue. Corresponding de-registration logic required as well. Hence I prefer the broadcast type channels suggested first.

edit: A2 may have the advantage of being the most backwards compatible, and the advantage of having obviously novel syntax for sending sticky values. Suppose for instance that the operator <~ was chosen for sticky send. Example ch := make(chan int); ch <~ 1; would send the value 1 as a sticky value, that could be received many times. The new syntax would make the use of the sticky value very visually distinct.

@cespare
Copy link
Contributor

cespare commented Sep 5, 2017

Over in #16620 (comment) I described taking some of my sync.Cond-using code and altering it to use a channel instead. I found it to be fairly tricky to get right (though of course it's possible that I overlooked some simpler way).

@as
Copy link
Contributor

as commented Jan 4, 2018

On top of that, condition variables are fiendishly difficult to use: they are prone to either missed or spurious signals

This couldn't be more true, I would gladly take performance loss in favor of debugging someone else's condition variable usage. People see it in stdlib, copy it, get it wrong, and then spend late nights debugging it.

sync.Cond is a keyword in my bug comb

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/94138 mentions this issue: shiny/driver/internal/event: use a channel in Deque instead of a sync.Cond

@ianlancetaylor
Copy link
Contributor

First I'll note that Go 2 should be largely if not entirely backward compatible with Go 1. sync.Cond is not actively buggy, so I don't see sufficient justification to remove it from "sync".

We could remove it from a new version of the sync package, whether that is called "sync/v2" or something else. But I don't think that removing condition variables is a sufficient reason to create a new version of the sync package.

@ianlancetaylor ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Mar 7, 2018
@gdamore
Copy link

gdamore commented Oct 3, 2018

I have used sync.Cond in a number of situations. Channels are really heavy weight, and its significant that a number of constructs that would be easy to implement as a relatively straight-forward condition variable followed by checks of several conditions can be implemented in terms of select {}, but at relatively large performance costs.

sync.Cond is harder to use than channels, and possibly more error prone, but for highly performance sensitive code there are so many constructs that are vastly more efficient with sync.Cond than select {} multiple conditions that removing sync.Cond would be devastating for projects that I work on.

Removing this construct would be devastating to me, and would be a reason to abandon to the language entirely for certain of my projects.

@gdamore
Copy link

gdamore commented Oct 3, 2018

Let me be clear here. If an implementation using channels, and multiple channels, can be show to be roughly equivalent in terms of performance with a using a single condition variable followed by multiple if {} statements, then I'd be willing to accept that channels are an equivalent replacement. My own experience is that this is vehemently not the case, and I believe that this proposal stems from a naive understanding of the documented semantics of channels and condition variables, without sufficient experience in using either in performance critical code to make assertions about the lack of utility of one vs. the other.

@bcmills
Copy link
Contributor Author

bcmills commented Nov 16, 2018

Channels are really heavy weight

It is true that a channel is currently larger than a sync.Cond, and requires more allocations. However, I don't believe that that is inherent to the use of channels. (#28366 describes one option to streamline both the API and implementation.)

for highly performance sensitive code there are so many constructs that are vastly more efficient with sync.Cond

Can you provide some benchmarks to demonstrate that? I'd be curious to see how well the channel alternatives can be optimized.

In my experience, even today the performance cost of untargeted wakeups can often swamp out the other performance advantages of condition variables.

@bcmills
Copy link
Contributor Author

bcmills commented Dec 20, 2018

I don't think that removing condition variables is a sufficient reason to create a new version of the sync package.

That's fair enough. If the benefit of removing sync.Cond isn't worth the code churn, can we at least mark it Deprecated?

We had a pretty clear education gap between use of sync.Cond and equivalent use of channels, but hopefully that's getting better: in particular, Rethinking Classical Concurrency Patterns illustrates a lot of alternatives starting around slide 37.

@gdamore
Copy link

gdamore commented Dec 21, 2018

@bcmills What is your beef here? Why do you so want to take away a tool that has proven itself quite useful many many times over? I know that sync.Cond() has resulted in cleaner and simpler code over attempts to make this work with channels, and faster code too! in many of my projects. Yes, there are sharp edges here, but these are hardly the only such. They are no worse than sync.Mutex in terms of dangerous tooling.

@andig
Copy link
Contributor

andig commented Jun 12, 2022

Experience report on

@andig, I explicitly addressed that pattern in my GC'18 talk; see specifically slides 70 and 102–105.

In #21165 (comment) it turned out trivial to replace the sync.Cond with a channel, exploiting the fact that a closed channel can always be read: evcc-io/evcc/pull/3629. The code becomes much clearer and has less dependencies. I'd still not use this as an argument to force replacement of sync.Cond though.

@wiyantotan
Copy link

@kevinburke
Take an example of your code. When 100 goroutine do WriteEvent, each of them is in race condition to acquire Lock().

The concept of sharing variable with mutex is once 1 of the goroutine acquired the Lock(), the rest of 99 goroutine have to wait until the goroutine that previously acquired the Lock() to Unlock().

The concept of Wait() is :

w.flushCond.L.Lock() // -> Try to get Lock()
if condition {
  // -> This Wait() call will automatically calls Unlock(), allowing other process to get the Lock().
  // -> And at the same time this process will goes to "blocking state"
  // -> Until the condition get notified (Signal() or Broadcast()), this process will "unblock" and "try to race together with other process to get the Lock()"
 // -> If you run your code and seems that Wait() didn't work, it is because of when "this process is lose when race to get the Lock()"
  w.flushCond.Wait() 
}
// do stuff
w.flushCond.L.Unlock()

I'd run your code multiple times, sometimes it is PASS and sometimes it FAIL. It's not always FAIL.

If you add time.Sleep after wg.Wait() , you will see that the Wait() is running.

wg.Wait()
time.Sleep(2 * time.Second) // ->add this

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/412237 mentions this issue: sync: add more notes about Cond behavior

@firelizzard18
Copy link
Contributor

If sync.Cond is removed from the language, ISTM there are enough use cases (especially for low-memory or high-performance situations) that someone will inevitably be motivated to create a replacement. And because that replacement would necessarily not be part of the standard library and thus not have access to runtime scheduler functions, it would be a lesser version than what we have now. So instead of eliminating sync.Cond, it would just be replaced by not as good versions of it.

gopherbot pushed a commit that referenced this issue Jun 17, 2022
Cond is difficult to use correctly (I was just bitten by it in
a production app that I inherited). While several proposals have come
up to improve or remove sync.Cond, no action has so far been taken.

Update the documentation to discourage use of sync.Cond, and point
people in the direction of preferred alternatives. I believe this will
help encourage behavior we want (less use of sync.Cond and more use of
channels), while also paving the way for, potentially, removing Cond
in a future version of the language.

Thanks very much to Bryan Mills and Sean Liao for discussion and
recommendations.

Updates #20491.
Updates #21165.

Change-Id: Ib4d0631c79d4c4d0a30027255cd43bc47cddebd3
Reviewed-on: https://go-review.googlesource.com/c/go/+/412237
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
@Acebond
Copy link

Acebond commented Feb 17, 2023

I think there are uses cases that do require sync.Cond and sync.Cond makes the code much smaller and easier to understand once you know how sync.Cond works.

I wanted a Golang stream multiplexer, and after writing my own using channels, I found sync.Cond from https://github.com/SiaFoundation/mux and forked that project into https://github.com/Acebond/gomux.

The gomux project is like 1/3 the code of my messy channels version and performs 2x as fast. Using sync.Cond is the only way to achieve optimal throughput in a stream multiplexer.

@hrissan
Copy link

hrissan commented Apr 7, 2023

It would be a disaster for us if Cond removed from language.

We are writing highly loaded services in golang using bunch of highly optimized components, like custom API RPC server.

We process million+ requests per second with almost no allocations and maintaining fixed memory ceiling, or in other words regardless of load server uses predefined amount of memory and never more. If requests come faster than they can be processed, they are not read from their sockets until enough memory is released.

We tried to use channels first when designed our components, but gradually got rid of almost of them, as a result removing most allocations, simplifying invariants and greatly speeding code up.

Problem with channels for us is they are hard to use with cancellation semantic and they have data transfer and signalling mixed. Most APIs with channels are hard to use and lead to various races which cannot be easily solved (good example is standard Timer).

Often we need to make operation on channel and external variables atomic, because channel content forms invariant with those. Also we often want explicit clear or peek operations.

Mutexes and Conds plus normal variables always give us easy inspection, manipulation and cancellation. When we need some state machine switch, we simple lock everything we need, then assign new state, then wake everybody who can be interested in changes. With channels it is simply impossible to do.

It is true Cond does not mix well with other sync primitives. But if WaitTimeout is added to Cond, then we'd have no need to mix Cond with anything, because we'd then get rid of the rest channels in our code and remove tons of remaining inefficiency and wrappers around fact that we now need goroutine with timer which will wake up Cond only to implement Timeout.

@ianlancetaylor ianlancetaylor changed the title proposal: Go 2: sync: remove the Cond type proposal: sync: remove the Cond type Jun 21, 2023
@ianlancetaylor ianlancetaylor removed the v2 An incompatible library change label Jun 21, 2023
@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jun 21, 2023
@rsc rsc moved this from Incoming to Active in Proposals Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Even in a v2 of sync, I find it very difficult to believe we would delete Cond. Condition variables are a fundamental building block for many abstractions. It's true that most users should use higher-level abstractions, but sync also exists to provide these building blocks, both Cond and Mutex.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Worth noting that the sync.Cond doc comment is very good at pointing people at other things.

@rsc rsc moved this from Active to Likely Decline in Proposals Jul 5, 2023
@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

For what it's worth I wrote this code over the weekend. Maybe there is a nice way to write it with a channel instead of a sync.Cond, but the sync.Cond seems to say exactly what I mean. I started out trying to use a channel but got frustrated by having to think about channel synchronization despite the fact that I was already holding a mutex.


// A WorkQ is a deduplicating unbounded work queue with a fixed number of workers.
type WorkQ struct {
	mu      sync.Mutex
	queued  map[any]bool
	work    map[any]func()
	closed  bool
	ready   sync.Cond // len(w.work) > 0 || w.closed && w.running == 0
	done    int
	running int
	workers sync.WaitGroup
	start   time.Time
	next    time.Time
}

const reportPeriod = 1 * time.Minute

// lock locks the queue and ensures that w.ready.L is initialized.
func (w *WorkQ) lock() {
	w.mu.Lock()
	if w.queued == nil {
		w.queued = make(map[any]bool)
	}
	if w.work == nil {
		w.work = make(map[any]func())
	}
	if w.ready.L == nil {
		w.ready.L = &w.mu
	}
	if w.start.IsZero() {
		w.start = time.Now()
		w.next = w.start
	}
}

// unlock unlocks the queue and signals w.ready if a worker should be woken up.
func (w *WorkQ) unlock() {
	if len(w.work) > 0 || w.closed && w.running == 0 {
		w.ready.Signal()
	}
	w.mu.Unlock()
}

// Start starts n workers.
func (w *WorkQ) Start(n int) {
	for i := 0; i < n; i++ {
		w.workers.Add(1)
		go w.run()
	}
}

// run runs a single worker loop.
func (w *WorkQ) run() {
	defer w.workers.Done()
	w.lock()
	for {
		for len(w.work) == 0 {
			if w.closed && w.running == 0 {
				w.unlock()
				return
			}
			w.ready.Wait()
		}
		var do func()
		for name, f := range w.work {
			delete(w.work, name)
			do = f
			break
		}

		w.running++
		w.unlock()
		do()
		w.lock()
		w.running--

		w.done++
		now := time.Now()
		if now.After(w.next) {
			log.Printf("%v %d/%d done", time.Since(w.start).Round(1*time.Second), w.done, len(w.queued))
			for w.next.Before(now) {
				w.next = w.next.Add(reportPeriod)
			}
		}
	}
}

// Wait closes the work queue and waits for all work to be completed.
func (w *WorkQ) Wait() {
	w.lock()
	w.closed = true
	w.unlock()
	w.workers.Wait()
}

// Add adds work with the given name and function.
// If work with the given id has been queued before, Add does nothing.
// The id must be comparable so it can be used as a map key.
func (w *WorkQ) Add(id any, f func()) {
	w.lock()
	if !w.queued[id] {
		w.queued[id] = true
		w.work[id] = f
	}
	w.unlock()
}

@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc moved this from Likely Decline to Declined in Proposals Jul 12, 2023
@rsc rsc closed this as completed Jul 12, 2023
@golang golang locked and limited conversation to collaborators Jul 11, 2024
@rsc rsc removed this from Proposals Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal
Projects
None yet
Development

No branches or pull requests