sync: add WaitGroup.Go #63796

Closed
dolmen opened this issue Oct 28, 2023 · 51 comments
@dolmen
Contributor

dolmen commented Oct 28, 2023

TL;DR

Add a func (*WaitGroup) Go(task func()) method to launch a task in a goroutine tracked with a sync.WaitGroup.

Combined with the loopvar change (#60078), writing parallel code would be much less error prone.

Rationale

A very common use case for sync.WaitGroup is to track the termination of tasks launched in goroutines.

Here is the classic example:

	var wg sync.WaitGroup
	for i := 1; i <= 5; i++ {
		i := i
		wg.Add(1)
		go func() {
			defer wg.Done()
			work(i)
		}()
	}
	wg.Wait()

I propose to add a func (*WaitGroup) Go(func()) method that would wrap:

  • wg.Add(1): the 1 looks like a magic value
  • launching the goroutine: the go keyword and the () after the func body are magic for Go beginners
  • defer wg.Done(): the defer keyword and the appearance of Done before the call to the worker are magic

A simple implementation:

func (wg *WaitGroup) Go(task func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		task()
	}()
}

The example above would be much reduced and many footguns avoided (the last remaining footgun is being addressed by #60078):

	var wg sync.WaitGroup
	for i := 1; i <= 5; i++ {
		i := i  // avoid loopvar footgun for go < 1.22
		wg.Go(func() {
			work(i)
		})
	}
	wg.Wait()

The full modified example, including an extended implementation of sync.WaitGroup, is available on the Go playground.

(to handle the case of a task with arguments, I would recommend rewriting the work to use the builder pattern: https://go.dev/play/p/g1Um_GhQOyc)
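
A minimal, self-contained sketch of the proposal as described above. The `group` type and the `squares` function are hypothetical names introduced only for illustration; `group` embeds `sync.WaitGroup` and adds a `Go` method with the body given in the "simple implementation". On a toolchain where `sync.WaitGroup` already has `Go`, the wrapper is unnecessary.

```go
package main

import (
	"fmt"
	"sync"
)

// group is a hypothetical wrapper used only for this sketch.
// It embeds sync.WaitGroup and adds the proposed Go method.
type group struct {
	sync.WaitGroup
}

// Go runs task in a new goroutine tracked by the embedded WaitGroup.
func (g *group) Go(task func()) {
	g.Add(1) // must happen before the goroutine starts
	go func() {
		defer g.Done()
		task()
	}()
}

// squares computes i*i for i in 1..n, one task per value.
func squares(n int) []int {
	out := make([]int, n)
	var g group
	for i := 1; i <= n; i++ {
		i := i // per-iteration copy; redundant since Go 1.22
		g.Go(func() {
			out[i-1] = i * i // each task writes only its own element
		})
	}
	g.Wait()
	return out
}

func main() {
	fmt.Println(squares(5))
}
```

Each task writes a distinct slice element, so no mutex is needed; the caller reads the slice only after `Wait` returns.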

@gopherbot gopherbot added this to the Proposal milestone Oct 28, 2023
@dsnet
Member

dsnet commented Oct 28, 2023

This is a repeat of #18022, which was re-oriented to become a vet check, but I'm still highly in support of a WaitGroup.Go method. The vet check has never been implemented so it's still easy for users to write the racy version where they increment the WaitGroup within the goroutine rather than before it.

The tailscale codebase has an extension to sync.WaitGroup that adds the Go method. In our usages of it, I have found the Go method to be cleaner.

As an additional data point, the errgroup.Group type also has a Go method.

@seankhliao
Member

There's also #57534 to just add errgroup in std

@dolmen
Contributor Author

dolmen commented Oct 28, 2023

@dsnet Thanks for the link. GitHub search just doesn't work and didn't help me to (re)discover prior proposals.

@dolmen
Contributor Author

dolmen commented Oct 28, 2023

See also golang.org/x/sync/errgroup.Group.Go(func() error)

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Oct 28, 2023
@timothy-king
Contributor

Maybe Go could be added to a different type in sync other than sync.WaitGroup? The real benefit of adding Go to sync.WaitGroup is to mix the old paradigm of Add and Done with the new paradigm of Go on the same object. Mixed usage sounds more confusing than keeping each type narrower and clearer. (My suggestion is essentially the same as WaitGroupX in the playground example.)

@earthboundkid
Contributor

A WaitGroup with a Go method is not very different from https://pkg.go.dev/cmd/go/internal/par#Queue. I would kind of rather see that moved out of internal and into sync/par. It's a much higher level library than the sync primitives.

@ConradIrwin
Contributor

I do think this would be nice to have on wait group.

A very relevant piece of writing is this: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/. The author argues that goroutines are too flexible, and a more restrictive primitive is usually what you want.

At a previous employer we used https://github.com/ConradIrwin/parallel instead of either WaitGroup or go statements. This let us do nice things like ensure that integration tests wait for all spawned goroutines before completing.

I'm not sure that WaitGroup should gain Do too; but I do like the idea of having a shared implementation of structured concurrency in Go. (On the flip side, the advantage of structured concurrency is that a correct implementation is transparent to callers).

@earthboundkid
Contributor

I was also inspired by the structured concurrency post when I wrote my helper library https://github.com/carlmjohnson/flowmatic. It's a little tricky because the standard library needs to have low level primitives so that people can build higher level abstractions like Map and whatnot, but it would be good to have a standard set of high level abstractions too.

@Merovius
Contributor

Merovius commented Nov 1, 2023

I've also wanted (*WaitGroup).Go a few times. To me, errgroup serves a slightly different purpose with slightly different semantics and I still use WaitGroups all the time. Having a convenient Go method would simplify things and I don't really see a whole lot of downsides, to be honest.

@dsnet
Member

dsnet commented Nov 1, 2023

I don't really see a whole lot of downsides, to be honest.

One possible argument against it is that using it requires the function to close over variables that may be mutated over the lifetime of the asynchronously running goroutine. As it stands today, people can write the safer:

wg.Add(1)
go func(a int, b string) {
    defer wg.Done()
    ...
}(a, b)

where a and b are passed in as explicit arguments, and avoid racy mutations.

However, my personal experience is that I've encountered more bugs due to calling wg.Add(1) in the wrong place:

go func(a int, b string) {
    wg.Add(1) // BUG: runs inside the goroutine, so Wait may return before it executes
    defer wg.Done()
    ...
}(a, b)
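
A hedged sketch of why the misplaced `wg.Add(1)` is a bug. To make the outcome deterministic, a channel (`gate`, a name invented for this sketch) stands in for "the scheduler has not run the goroutine yet"; in real code the timing is up to the scheduler, which is what makes the bug intermittent.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// waitReturnedEarly reports whether Wait returned before the task ran.
func waitReturnedEarly() bool {
	var wg sync.WaitGroup
	var ran atomic.Bool
	gate := make(chan struct{})
	done := make(chan struct{})

	go func() {
		defer close(done)
		<-gate    // stand-in for a goroutine that has not been scheduled yet
		wg.Add(1) // too late: Wait has already returned
		defer wg.Done()
		ran.Store(true)
	}()

	wg.Wait() // the counter is still zero, so this does not block
	early := !ran.Load()

	close(gate) // let the goroutine proceed
	<-done      // wait for it so the sketch does not leak a goroutine
	return early
}

func main() {
	fmt.Println(waitReturnedEarly())
}
```

Moving `wg.Add(1)` to before the `go` statement, or using a `Go` method that does so internally, removes the window in which `Wait` can observe a zero counter.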

@baryluk

baryluk commented Nov 4, 2023

Similar and related: #63941

@kscooo

kscooo commented Nov 6, 2023

Very useful feature. #63941 asks for the same thing: a less error-prone API for developers. Third-party community libraries have already implemented it.

@cgang

cgang commented Feb 19, 2024

Very useful and straightforward. I wonder whether Add()/Done() are really needed once Go() is available.

@dsnet
Member

dsnet commented Sep 26, 2024

I'd like to get this back on the active proposal track, as I believe the main concern with the Go method is no longer valid since Go 1.22 (or at least significantly alleviated).

At Tailscale we have the syncs.WaitGroup type that is essentially an experiment that provides the Go method.

I have discovered the vast majority of existing sync.WaitGroup usages could be migrated to use the Go method safely because they were usually of the following construction:

var wg sync.WaitGroup
defer wg.Wait()
for i, v := range ... {
    wg.Add(1)
    go func(i int, v T) {
        defer wg.Done()
        ... // use i and v
    }(i, v)
}

However, this extra level of parameter passing is no longer necessary in modern Go.
The above could just be simplified as:

var wg syncs.WaitGroup
defer wg.Wait()
for i, v := range ... {
    wg.Go(func() {
        ... // use i and v
    })
}

because i and v are newly declared variables in each iteration.

@dsnet
Member

dsnet commented Sep 26, 2024

Also, I should note that the bug that Go was trying to solve still lacks a vet check 8 years later (#18022).

@zigo101

zigo101 commented Sep 26, 2024

However, this extra level of parameter passing is no longer necessary in modern Go.

Just be careful when the passed loop-var parameters are modified in the Go method function body.
The extra level is still needed for such cases.

[edit]: this note is actually only valid for traditional 3-clause for-loops.

for i := 0; condition; postStatement {
    wg.Go(func() {
        ... // Since Go 1.22, DON'T modify i here!!! No matter how the modification is synchronized.
            // Before Go 1.22, the modification might be okay if it is synchronized well.
    })
}

@adonovan
Member

I ran staticcheck's SA2000 analyzer across the module mirror corpus, and it turned up a number of bugs with a very low rate of false positives.

Adding this checker (or something similar) to vet might be a worthwhile move if we don't pursue this proposal. Or even if we do.

@seankhliao
Member

A naive search gets us 109k hits on GitHub for the correct use (wg.Add/go/wg.Done),
and 1.1k hits for possibly doing it wrong (go/wg.Add/wg.Done).

#39863 was also a dupe, which was rejected on the grounds that Go would only take func(), but given how prevalent the simple inlined functions are, I don't think that should be a concern.

@gopherbot
Contributor

Change https://go.dev/cl/632915 mentions this issue: go/analysis/passes/waitgroup: report WaitGroup.Add in goroutine

gopherbot pushed a commit to golang/tools that referenced this issue Dec 2, 2024
This CL defines a new analyzer, "waitgroup", that reports a
common mistake with sync.WaitGroup: calling wg.Add(1) inside
the new goroutine, instead of before starting it.

This is a port of Dominik Honnef's SA2000 algorithm,
which uses tree-based pattern matching, to elementary
go/{ast,types} + inspector operations.

Fixes golang/go#18022
Updates golang/go#63796

Change-Id: I9d6d3b602ce963912422ee0459bb1f9522fc51f9
Reviewed-on: https://go-review.googlesource.com/c/tools/+/632915
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
@gopherbot
Contributor

Change https://go.dev/cl/633704 mentions this issue: go/analysis/passes/waitgroup: report WaitGroup.Add in goroutine

@seankhliao seankhliao changed the title proposal: sync: add Go method to WaitGroup to launch a tracked goroutine proposal: sync: add WaitGroup.Go Feb 11, 2025
@aclements aclements moved this from Incoming to Active in Proposals Mar 19, 2025
@aclements aclements moved this from Active to Likely Accept in Proposals Apr 2, 2025
@aclements
Member

Based on the discussion above, this proposal seems like a likely accept.
— aclements for the proposal review group

The proposal is to add a Go method to sync.WaitGroup:

// A WaitGroup waits for a collection of tasks to finish.
type WaitGroup struct { ... }

// Go calls f on a new goroutine and adds that task to the WaitGroup.
// When f returns, the task is removed from the WaitGroup.
//
// If the WaitGroup is empty, Go must happen before a [WaitGroup.Wait].
// Typically, this simply means Go is called to start tasks before Wait is called.
// If the WaitGroup is not empty, Go may happen at any time.
// This means a goroutine started by Go may itself call Go.
// If a WaitGroup is reused to wait for several independent sets of tasks,
// new Go calls must happen after all previous Wait calls have returned.
//
// In the terminology of [the Go memory model](https://go.dev/ref/mem),
// the return from f "synchronizes before" the return of any Wait call that it unblocks.
func (*WaitGroup) Go(f func())
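
The documented rule that "a goroutine started by Go may itself call Go" can be sketched as follows. The `group` wrapper and `countLeaves` are hypothetical names for illustration; the wrapper mirrors the simple Add/go/Done implementation from the original proposal.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// group is a hypothetical wrapper providing the proposed Go method.
type group struct {
	sync.WaitGroup
}

func (g *group) Go(f func()) {
	g.Add(1)
	go func() {
		defer g.Done()
		f()
	}()
}

// countLeaves spawns a binary tree of tasks of the given depth.
// Every non-leaf task starts two more tasks on the same group.
func countLeaves(depth int) int64 {
	var g group
	var leaves atomic.Int64

	var visit func(d int)
	visit = func(d int) {
		if d == 0 {
			leaves.Add(1)
			return
		}
		// Safe: the group is not empty while this task is still running,
		// so these calls may race with the Wait below.
		g.Go(func() { visit(d - 1) })
		g.Go(func() { visit(d - 1) })
	}

	g.Go(func() { visit(depth) }) // the first Go happens before Wait
	g.Wait()
	return leaves.Load()
}

func main() {
	fmt.Println(countLeaves(3))
}
```

Only the first `Go` call has to be ordered before `Wait`; the nested calls are covered by the "not empty" clause because their parent task has not yet returned.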

@gopherbot
Contributor

Change https://go.dev/cl/662635 mentions this issue: sync: add WaitGroup.Go

@earthboundkid
Contributor

Should Go propagate panics to Wait? If not, do we need to document that uncaught panics are fatal?

@seankhliao
Member

I think that's out of scope. WaitGroup.Go is a literal translation of the existing behaviour.

@earthboundkid
Contributor

I think on balance it should probably just be a literal translation of the existing behavior, but it's worth pointing out the danger in the documentation.

@aclements
Member

gopherbot closed this as completed in 822031d 38 minutes ago

This got committed prematurely. Given that this is in likely accept, I don't think we need to back it out right now, but we should if this doesn't go to accept next week. I'll reopen to track the proposal, with the intent to close it again if it goes to accept next week.

@aclements aclements reopened this Apr 4, 2025
@aclements
Member

I think on balance it should probably just be a literal translation of the existing behavior, but it's worth pointing out the danger in the documentation.

I agree.

I think if we were designing a new mechanism (cough errgroup), then we should strongly consider panic propagation, but since this is part of WaitGroup I think we need to allow the obvious rewrite from Add/Done to Go without introducing unintended consequences.
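
Since panics are not propagated to Wait, a caller who wants to survive a panicking task has to recover inside the task itself. The following is only a sketch of one possible approach, not part of the proposal; `runAll` is a hypothetical helper that records each task's recovered value.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll runs every task in its own goroutine and returns, per task,
// the value recovered from a panic (nil if the task returned normally).
func runAll(tasks []func()) []any {
	panics := make([]any, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func()) {
			defer wg.Done()
			// Deferred functions run last-in, first-out, so the recovered
			// value is stored before Done is called.
			defer func() { panics[i] = recover() }()
			task()
		}(i, task)
	}
	wg.Wait()
	return panics
}

func main() {
	results := runAll([]func(){
		func() {},
		func() { panic("boom") },
	})
	fmt.Println(results)
}
```

Without the deferred recover, the panic in the second task would terminate the whole program, which is the existing behaviour of a bare go statement.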

@gopherbot
Contributor

Change https://go.dev/cl/662975 mentions this issue: sync: tidy WaitGroup documentation, add WaitGroup.Go example

@adonovan
Member

adonovan commented Apr 4, 2025

Speaking of panic propagation, see #73159.

This got committed prematurely.

My bad; sorry for complicating things.

@mrclmr
Contributor

mrclmr commented Apr 7, 2025

First of all, apologies for joining the discussion a bit late. I just have a few questions and thoughts about the proposal. Hopefully, a decision can be reached soon.

Magic number wg.Add(1)

wg.Add(1): the 1 looks like a magic value

See #63796 (comment)

var wg sync.WaitGroup
for i := 1; i <= 5; i++ {
	i := i
	wg.Add(1)
	go func() {
		defer wg.Done()
		work(i)
	}()
}
wg.Wait()

Couldn't this example be written like this, calling wg.Add() only once so that no magic number is needed?

var wg sync.WaitGroup
jobs := 5
wg.Add(jobs) // register all jobs up front
for i := range jobs {
	go func() {
		defer wg.Done()
		work(i + 1) // range yields 0..4; the original example used 1..5
	}()
}
wg.Wait()

Maybe jobs := 5 is not good style.

errgroup proposal

See #63796 (comment)

Wouldn't it be better to first advance this proposal and bring errgroup to the sync package? That would feel cleaner and with this order we would see any thoughts about the wg.Go() API. But this could also be a worse approach.

vet

See #18022

The newly merged vet check is great! Does the vet check catch the wrong usage of the API that is mentioned in the proposal? And does this mean that wrong usages are still possible? I am missing an overview of what vet covers and what mistakes users can still make.

Is wrong usage discoverable with proper testing?

Wrong usage found on GitHub

See #63796 (comment)

Thanks a lot for formulating the search query.

Does this mean that >99% of users understand and use the API correctly and <1% use the API wrongly?

Thoughts

I read the constructive course of the conversation, and I am aware that a lot of people are in favor of this proposal. I have some questions and thoughts myself, and perhaps with a deeper understanding, I'll come to share the same confidence that adding wg.Go() is a good idea.
At the moment, I feel like I'm missing some of the underlying context. From my current perspective, the benefits of adding wg.Go() to the sync.WaitGroup API seem relatively minor, while the added complexity might introduce long-term drawbacks.

  • Code fragmentation: It's likely that many will continue using the existing wg.Add() / wg.Done() pattern, especially if there's no clear incentive to migrate. This could fragment codebases and lead to inconsistent usage.
  • Introducing wg.Go() creates two distinct ways to use sync.WaitGroup, which could make the API surface feel less cohesive or intuitive.
  • wg.Add() and wg.Done() have been around for 14+ years (commit 2011-02-03) and seem to serve >99% of users effectively (please correct me if I'm misreading usage data).
  • wg.Go() devalues the keyword go and adds a layer of abstraction. Imagine somebody new to the language exploring the go keyword and then finding out that you shouldn't use it with a sync.WaitGroup.
  • wg.Go() would exist for the <1% of users who use the API wrongly, while >99% of users would keep using the old API (did I understand the usage data correctly?).
  • The new vet checks find wrong usage of the API (here I am not sure how much the new vet checks cover regarding wg.Add() and wg.Done()).
  • Using wg.Add() and wg.Done() is a more self-explanatory API. wg.Go() needs more explanation and moves complexity to the documentation.

Looking forward to hearing others' thoughts - especially if I'm misunderstanding some of the motivations or benefits here.

@adonovan
Member

adonovan commented Apr 7, 2025

Couldn't this example be written like this, calling wg.Add() only once so that no magic number is needed?

That's a reasonable alternative way to express the code. Sometimes you see Add(n) followed by n operations; sometimes you see Add(1) before each of n operations; both are fine. But both are simpler using the new Go method.
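
For comparison, a small sketch of the two existing forms side by side. The function names `sumAddOnce` and `sumAddEach` are invented for illustration; both add the integers 1..n concurrently and differ only in where Add is called.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// sumAddOnce calls Add(n) once, then starts n goroutines.
func sumAddOnce(n int) int64 {
	var wg sync.WaitGroup
	var total atomic.Int64
	wg.Add(n)
	for i := 1; i <= n; i++ {
		go func(i int) {
			defer wg.Done()
			total.Add(int64(i))
		}(i)
	}
	wg.Wait()
	return total.Load()
}

// sumAddEach calls Add(1) before each of the n goroutines.
func sumAddEach(n int) int64 {
	var wg sync.WaitGroup
	var total atomic.Int64
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			total.Add(int64(i))
		}(i)
	}
	wg.Wait()
	return total.Load()
}

func main() {
	fmt.Println(sumAddOnce(5), sumAddEach(5))
}
```

In both forms the count passed to Add has to match the number of Done calls exactly; a Go method removes that bookkeeping from the call site.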

Wouldn't it be better to first advance this proposal [#63796] and bring errgroup to the sync package?

In hindsight one could imagine a cleaner decomposition of the various parts: for example, sync.CountingSemaphore (functionally identical to WaitGroup) for occasional low-level tasks, plus sync.WaitGroup (functionally identical to errgroup.Group) for everyday structured concurrency. The fact is that WaitGroup is already widely used for the latter role, and we can't change that, but we can at least make it more concise and less error-prone.

Does the vet check catch wrong usage of the API that is mentioned in the proposal?
And does this mean that there are still wrong usages possible?

The vet check catches the specific mistake where the Add call is done inside the child goroutine. There are certainly other ways to make mistakes.

Is wrong usage discoverable with proper testing?

Sometimes, especially if it leads to a data race and you run your tests under the race detector. But not always.

Does this mean that >99% of users understand and use the API correctly and <1% use the API wrongly?

Exactly. But WaitGroup is a very widely used type, so that 1% still leads to over a thousand mistakes in the corpus. In general, the criteria for vet checks do not include the relative frequency of mistakes, but the absolute number of mistakes, their severity, the precision with which we can detect them, and the cost of running the analyzer. In this case, it is very cheap to identify mistakes with almost perfect reliability, and they are (absolutely) quite numerous in the corpus.

many will continue using the existing wg.Add() / wg.Done() pattern... lead to inconsistent usage.

That's true, but we already have a "modernizer" to automate the code transformation to use Go when it is appropriate.

Introducing wg.Go() creates two distinct ways to use sync.WaitGroup, which could make the API surface feel less cohesive or intuitive. [etc]

Yes, it's a mere convenience function, and convenience always must be weighed against cognitive burden. I think the trade-off is reasonable in this case. I've certainly made the mistake of calling Add in the wrong place, and it took me a while to notice.

More generally, what I've learned from sync.WaitGroup and especially errgroup.Group is that the paradigm of structured concurrency (for example, creating n tasks with lifetimes bounded by the parent) is more often than not what I want. Unstructured concurrency is less often needed. This blog post is relevant: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

wg.Go() would exist for the <1% of users who use the API wrongly, while >99% of users would keep using the old API (is this thought correct?)

No. I expect wg.Go would become the usual form in new code (not least because your IDE will remind you about it).

@earthboundkid
Contributor

That's true, but we already have a "modernizer" to automate the code transformation to use Go when it is appropriate.

Yes, and as existing uses are modernized, if you do see wg.Add, you would expect it to be some kind of advanced usage that would be impractical to do with .Go, and so scrutinize it more closely in case it's just an erroneous use.

@mrclmr
Contributor

mrclmr commented Apr 7, 2025

@adonovan Thank you very much for the answers to my numerous questions.

In hindsight one could imagine a cleaner decomposition of the various parts: for example, sync.CountingSemaphore (functionally identical to WaitGroup) for occasional low-level tasks, plus sync.WaitGroup (functionally identical to errgroup.Group) for everyday structured concurrency. The fact is that WaitGroup is already widely used for the latter role, and we can't change that, but we can at least make it more concise and less error-prone.

Okay, I see. The current API is too open and only in rare cases special calls of wg.Add() and wg.Done() are needed.

More generally, what I've learned from sync.WaitGroup and especially errgroup.Group is that the paradigm of structured concurrency (for example, creating n tasks with lifetimes bounded by the parent) is more often than not what I want. Unstructured concurrency is less often needed. This blog post is relevant: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

I see where the design journey is headed.

Okay, I am not entirely convinced by the arguments. But that's okay. Practically, the majority (looking at the thumbs-up emojis) wants this proposal.

Let's assume that wg.Go() will be accepted. Then from an API design perspective sync.WaitGroup feels half-baked if there is not a distinct type (like e.g. sync.CountingSemaphore or other name) for low-level tasks. If wg.Go() is accepted and wg.Add() and wg.Done() do not point to a new API the sync.WaitGroup API will have an awkward surface. But maybe there are plans to add follow-up proposals and I don't see the whole plan or other things?

@aclements
Member

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— aclements for the proposal review group

The proposal is to add a Go method to sync.WaitGroup:

// A WaitGroup waits for a collection of tasks to finish.
type WaitGroup struct { ... }

// Go calls f on a new goroutine and adds that task to the WaitGroup.
// When f returns, the task is removed from the WaitGroup.
//
// If the WaitGroup is empty, Go must happen before a [WaitGroup.Wait].
// Typically, this simply means Go is called to start tasks before Wait is called.
// If the WaitGroup is not empty, Go may happen at any time.
// This means a goroutine started by Go may itself call Go.
// If a WaitGroup is reused to wait for several independent sets of tasks,
// new Go calls must happen after all previous Wait calls have returned.
//
// In the terminology of [the Go memory model](https://go.dev/ref/mem),
// the return from f "synchronizes before" the return of any Wait call that it unblocks.
func (*WaitGroup) Go(f func())
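
A sketch of the last two documented rules: reusing one WaitGroup for independent sets of tasks, and relying on the "synchronizes before" guarantee to read plain variables after Wait. It uses Add/Done directly so that it runs on toolchains without the new method; `twoPhases` is a hypothetical name.

```go
package main

import (
	"fmt"
	"sync"
)

// twoPhases computes squares in a first set of tasks, waits, and then
// sums them in a second task that reuses the same WaitGroup.
func twoPhases(n int) int {
	squares := make([]int, n)
	var wg sync.WaitGroup

	// Phase 1: each task writes its own element, with no mutex or atomics.
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			squares[i] = i * i
		}(i)
	}
	wg.Wait() // every write above is visible once this returns

	// Phase 2: the group is reused only after the previous Wait returned.
	sum := 0
	wg.Add(1)
	go func() {
		defer wg.Done()
		for _, s := range squares {
			sum += s
		}
	}()
	wg.Wait()
	return sum // plain read; ordered after the task's return
}

func main() {
	fmt.Println(twoPhases(4))
}
```

The plain reads of `squares` and `sum` are race-free only because each one happens after a Wait call that the writing task unblocked.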

@aclements aclements moved this from Likely Accept to Accepted in Proposals Apr 9, 2025
@aclements aclements changed the title proposal: sync: add WaitGroup.Go sync: add WaitGroup.Go Apr 9, 2025
@aclements aclements modified the milestones: Proposal, Backlog Apr 9, 2025
@adonovan
Member

adonovan commented Apr 9, 2025

This issue now tracks the work of implementing the proposal.

By a stroke of good fortune / carelessness, the implementation is already complete.
