proposal: x/sync/semaphore: make semaphore resizable #29721

sherifabdlnaby · 2019-01-13T19:12:04Z

Hello,

Semaphores are often used to bound concurrency, for example I was implementing a goroutines pool pattern with weighted jobs, so a weighted semaphore is better and more performant than channel-based semaphore. a common functionality in goroutines pool is to be resizable, however the current sync/x/semaphore doesn't allow resizing the semaphore.

There is another implementation of a non-channel-based semaphores that supports resizing, but I found many bugs/deadlocks using them and they're all less performant than x/sync/semaphore.

The current implementation of x/sync/semaphore can be easily extended to be resizable, without affecting performance by any means.

The text was updated successfully, but these errors were encountered:

bcmills · 2019-01-14T13:57:32Z

CC @jba @Sajmani

bcmills · 2019-01-14T14:01:56Z

This proposal needs more detail.

What are some concrete use-cases for resizable semaphores? (Links to existing code that could be replaced or improved by this change would be ideal.)
What is the specific API being proposed?
- Does it add and remove capacity, or set the capacity to an absolute quantity?
- If it sets the absolute quantity, what happens if that capacity is lower than the number of outstanding tokens?

bcmills · 2019-01-14T14:06:05Z

Note that if you have some upper bound, you can already “resize” a semaphore.Weighted by acquiring and releasing surplus tokens:

const (
	maxCap = […]
	initialCap = […]  // ≤ maxCap
)
sem := semaphore.NewWeighted(maxCap)
sem.Acquire(ctx, maxCap - initialCap)
// Effective capacity of sem is now initialCap,
// but it can be “resized” up to maxCap by releasing tokens.

sherifabdlnaby · 2019-01-14T15:26:44Z

@bcmills

Example:

cockroachdb uses a semaphore for rate limiting, I found a bug (cockroachdb/cockroach#33554) in the semaphore package they were using that has potential to deadlock.
They decided to keep using this semaphore implementation because the bug was not currently reachable in their code and there aren't many alternative resizable semaphores. cockroachdb/cockroach#33554 (comment)

fixing the bug in that mentioned semaphore implementation halved its performance (300% slower than x/sync/semaphore) according to benchmarks.

What is the specific API being proposed?

Does it add and remove capacity, or set the capacity to an absolute quantity?

It sets the capacity to an absolute quantity.

If it sets the absolute quantity, what happens if that capacity is lower than the number of outstanding tokens?

If semaphore is resized to a number lower than the current acquired tokens it will not acquire any new tokens until a release that brings down current capacity to be lower or equal to the new semaphore capacity.

More details

What happens to the Aquire of N that was marked impossible and not added to waiters list ? and what happens if an Aquire of N that was already in the waiting list when a resize lower the capacity to be < N ?

In the current Implementation an Aquire of N where N is > capacity will not be added to the waiters' linked list because It is impossible to get acquired. and will block forever unless cancelled by its ctx.

In the PR, I've added a new linked-list for the impossible waiters, when Acquire(N) is impossible it will be added to the impossible waiters list, Impossible waiters can become possible if a resize of the semaphore was big enough.

So now when Resize(N) is called this is what happens:

Set a new size for the semaphore.
Will loop through the waiters list removing the now-impossible waiters to the impossible waiters.
Will loop through the Impossible Waiters list adding the now-possible ones to the waiters list.
Will signal possible waiters from the front of the waiters list to unblock ( same as Release() mechanism).

A resize is O(W+I), W = length of waiters list, I = length of impossible waiters list.

sherifabdlnaby · 2019-01-14T15:30:46Z

Note that if you have some upper bound, you can already “resize” a semaphore.Weighted by acquiring and releasing surplus tokens:
const (
	maxCap = […]
	initialCap = […]  // ≤ maxCap
)
sem := semaphore.NewWeighted(maxCap)
sem.Acquire(ctx, maxCap - initialCap)
// Effective capacity of sem is now initialCap,
// but it can be “resized” up to maxCap by releasing tokens.

Yes, this will work effectively if you have an upper-bound.
I believe a simple .Resize(N) will be simpler to use.

bcmills · 2019-01-14T21:35:45Z

math.MaxInt64 is always a valid upper bound.

So it seems the only behavioral difference today is the FIFO behavior of “impossible” tasks. Those are not generally something you want to have in a well-behaved system anyway, since they consume resources while waiting for cancellation; so, how often does that difference matter?

sherifabdlnaby · 2019-01-17T01:45:55Z

@bcmills

math.MaxInt64 is always a valid upper bound.

Yes it's, but this will allow impossible tasks to be added to waiters list and will block other waiters when that impossible task is at the front of the list thus causing a deadlock which is unexpected semaphore behavior.

I understand your point that it is not generally something you want to have in a well-behaved system anyway.

But my opinion is a Resize() would be simpler to use than an ad-hoc way to resize the semaphore that may have unexpected deadlocks because of the semaphore implementation.

rsc · 2021-02-24T18:49:30Z

/cc @jba

rsc · 2021-02-24T19:46:19Z

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

sherifabdlnaby · 2021-02-24T20:39:19Z

@rsc This is great news, I've been using my fork for a long time, and having it part of the official/x/sync/semaphore` will be very welcomed.
I updated the PR golang/sync#3 and sync it with upstream changes.

rsc · 2021-03-10T18:48:16Z

@sherifabdlnaby Can you say something about why you are maintaining a fork instead of using a wrapper that does the trick @bcmills mentioned with allocating a very large-sized semaphore but then using Acquire and Release calls to take capacity away / bring it back?

sherifabdlnaby · 2021-03-10T23:49:30Z

@rsc
Using the approach suggested by @bcmills was not ideal in my usecase, in func (s *Weighted) notifyWaiters() #L117 you see that when a resource is released; only the front of the waiters is tried for the acquiring and if no tokens are available it explicitly avoids looking up another possible smaller acquires to avoid starvation.

Hence, when trying to acquire the semaphore with weight > the effective size of the semaphore it blocks the semaphore until a resize that makes the weight <= effective size; it will not let smaller semaphores get acquired despite blocking forever until the semaphore is resized again.

In my implementation the acquires that their weight >= the size of the semaphore are moved to a separate impossibleWaiters queue and is only added to the semaphore queue when it's possible for them to be acquired eventually (checked at every invocation of resize() function).

In my use-case, this was very important to the performance of the semaphore. I use it for dynamically bounding concurrency and needed a more performant method than channel-based semaphores.

jba · 2021-03-11T13:10:40Z

I don't see anything wrong with this proposal technically. It would be different if Acquire rejected impossible requests, but instead it lets them wait indefinitely. The only difference here (assuming Resize is never called) is some extra space, negligible compared to the goroutine's memory.

The effect of Resize cannot be obtained with the existing semaphore, for the reason explained above: when you resize down, you can deadlock because the waiter at the front of the list can never proceed.

The use case of weighted tasks with dynamic concurrency seems like a reasonable one.

ajwerner · 2021-03-17T00:01:09Z

cockroachdb uses a semaphore for rate limiting, I found a bug (cockroachdb/cockroach#33554) in the semaphore package they were using that has potential to deadlock.
They decided to keep using this semaphore implementation because the bug was not currently reachable in their code and there aren't many alternative resizable semaphores. cockroachdb/cockroach#33554 (comment)

fixing the bug in that mentioned semaphore implementation halved its performance (300% slower than x/sync/semaphore) according to benchmarks.

Thanks for the reminder on this. In other use cases, CockroachDB has generally decided to stop using the semaphore library altogether. That library is not very good from a fairness or allocation perspective. The quotapool library provides fairness, resizing, and dramatically more flexibility. This library has proven to be quite good. I'll pick it up in our internal limit package and I'll try to work on getting it spun out into an Apache license somewhere.

rsc · 2021-03-24T17:37:50Z

@sherifabdlnaby thanks for the reply, specifically:

In my implementation the acquires that their weight >= the size of the semaphore are moved to a separate impossibleWaiters queue and is only added to the semaphore queue when it's possible for them to be acquired eventually (checked at every invocation of resize() function).

It sounds like your semaphore is not first come, first served, while the Go implementation is. So in addition to making it resizable you are also asking for the policy to be changed about how requests are satisfied. I am less sure about making that change. Thoughts on that part, @bcmills and @jba?

sherifabdlnaby · 2021-03-25T02:35:04Z

@rsc

It sounds like your semaphore is not first come, first served, while the Go implementation is. So in addition to making it resizable you are also asking for the policy to be changed about how requests are satisfied.

The policy stays as first come, first served as long the requests are satisfiable, when a request is impossible to be acquired given the current size of the semaphore, it will be moved to the impossibleWaiters queue; now when It is possible to be acquired (after a resize) it will be put back at the end of the main queue.

So essentially the policy is the same as long as you have not introduced resize() (and having a request that is > the size of the semaphore). making the change backward compatible and non-breaking to existing behavior.

jba · 2021-03-25T13:51:51Z

@sherifabdlnaby, did you check out the quotapool package mentioned above? That sounds like an improvement on what you're proposing here.

sherifabdlnaby · 2021-03-31T05:14:59Z

@jba I didn't know about the quotapool package at the time I needed the resizable semaphore. It looks pretty sophisticated and I would definitely try it.
I don't know how much overhead is added to achieve such sophistication, I'll have to benchmark it. The reason I needed a resizable semaphore is I needed a more performant worker pool than the one that uses channels as a semaphore.

ajwerner · 2021-03-31T13:48:11Z

I don't know how much overhead is added to achieve such sophistication, I'll have to benchmark it. The reason I needed a resizable semaphore is I needed a more performant worker pool than the one that uses channels as a semaphore.

Let me know what you discover! I tuned the thing somewhat heavily. It has a fast path which avoids using channels if there is sufficient quota. Any allocations are pooled so generally there are no allocations to interact with it. The channels are also pooled as it is designed to not close them but rather recycle them.

rsc · 2021-03-31T17:25:39Z

Given that @sherifabdlnaby has a customized fork and also @ajwerner has written another package to serve the specific need, and given that to serve this need in x/sync/semaphore would require changing the queueing order for pending requests, it sounds like we probably should leave x/sync/semaphore's semantics alone and encourage the use of alternate packages when they are a better fit.

rsc · 2021-04-01T03:16:25Z

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

sherifabdlnaby · 2021-04-03T08:44:52Z

Thanks all for your time discussing this 🙏🏻

rsc · 2021-04-07T18:02:05Z

No change in consensus, so declined.
— rsc for the proposal review group

gopherbot added this to the Unreleased milestone Jan 13, 2019

sherifabdlnaby mentioned this issue Jan 13, 2019

semaphore: make semaphore resizable golang/sync#3

Open

bcmills added the Proposal label Jan 14, 2019

FelixSeptem mentioned this issue Aug 29, 2019

【Go 夜读】第 57 期 semaphore源码浅析 talkgo/night#456

Closed

ALTree changed the title ~~x/sync/semaphore: make semaphore resizable.~~ x/sync/semaphore: make semaphore resizable Jan 7, 2020

ianlancetaylor added this to Incoming in Proposals (old) Feb 24, 2021

rsc moved this from Incoming to Active in Proposals (old) Feb 24, 2021

rsc mentioned this issue Feb 24, 2021

proposal: review meeting minutes #33502

Open

ninevra mentioned this issue Mar 21, 2021

Implement semaphores avajs/cooperate#5

Merged

rsc changed the title ~~x/sync/semaphore: make semaphore resizable~~ proposal: x/sync/semaphore: make semaphore resizable Mar 31, 2021

rsc moved this from Active to Likely Decline in Proposals (old) Apr 1, 2021

rsc added the Proposal-FinalCommentPeriod label Apr 1, 2021

sherifabdlnaby closed this as completed Apr 3, 2021

rsc moved this from Likely Decline to Declined in Proposals (old) Apr 7, 2021

rsc removed the Proposal-FinalCommentPeriod label Apr 7, 2021

golang locked and limited conversation to collaborators Apr 7, 2022

gopherbot added the FrozenDueToAge label Apr 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: x/sync/semaphore: make semaphore resizable #29721

proposal: x/sync/semaphore: make semaphore resizable #29721

sherifabdlnaby commented Jan 13, 2019

bcmills commented Jan 14, 2019

bcmills commented Jan 14, 2019

bcmills commented Jan 14, 2019 •

edited

sherifabdlnaby commented Jan 14, 2019 •

edited

sherifabdlnaby commented Jan 14, 2019

bcmills commented Jan 14, 2019

sherifabdlnaby commented Jan 17, 2019 •

edited

rsc commented Feb 24, 2021

rsc commented Feb 24, 2021

sherifabdlnaby commented Feb 24, 2021 •

edited

rsc commented Mar 10, 2021

sherifabdlnaby commented Mar 10, 2021 •

edited

jba commented Mar 11, 2021

ajwerner commented Mar 17, 2021

rsc commented Mar 24, 2021

sherifabdlnaby commented Mar 25, 2021

jba commented Mar 25, 2021

sherifabdlnaby commented Mar 31, 2021

ajwerner commented Mar 31, 2021

rsc commented Mar 31, 2021

rsc commented Apr 1, 2021

sherifabdlnaby commented Apr 3, 2021

rsc commented Apr 7, 2021

proposal: x/sync/semaphore: make semaphore resizable #29721

proposal: x/sync/semaphore: make semaphore resizable #29721

Comments

sherifabdlnaby commented Jan 13, 2019

bcmills commented Jan 14, 2019

bcmills commented Jan 14, 2019

bcmills commented Jan 14, 2019 • edited

sherifabdlnaby commented Jan 14, 2019 • edited

Example:

What is the specific API being proposed?

More details

What happens to the Aquire of N that was marked impossible and not added to waiters list ? and what happens if an Aquire of N that was already in the waiting list when a resize lower the capacity to be < N ?

sherifabdlnaby commented Jan 14, 2019

bcmills commented Jan 14, 2019

sherifabdlnaby commented Jan 17, 2019 • edited

rsc commented Feb 24, 2021

rsc commented Feb 24, 2021

sherifabdlnaby commented Feb 24, 2021 • edited

rsc commented Mar 10, 2021

sherifabdlnaby commented Mar 10, 2021 • edited

jba commented Mar 11, 2021

ajwerner commented Mar 17, 2021

rsc commented Mar 24, 2021

sherifabdlnaby commented Mar 25, 2021

jba commented Mar 25, 2021

sherifabdlnaby commented Mar 31, 2021

ajwerner commented Mar 31, 2021

rsc commented Mar 31, 2021

rsc commented Apr 1, 2021

sherifabdlnaby commented Apr 3, 2021

rsc commented Apr 7, 2021

bcmills commented Jan 14, 2019 •

edited

sherifabdlnaby commented Jan 14, 2019 •

edited

sherifabdlnaby commented Jan 17, 2019 •

edited

sherifabdlnaby commented Feb 24, 2021 •

edited

sherifabdlnaby commented Mar 10, 2021 •

edited