-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix data race on listener.connWG #260
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #260 +/- ##
==========================================
- Coverage 82.21% 82.11% -0.11%
==========================================
Files 37 38 +1
Lines 3217 3243 +26
==========================================
+ Hits 2645 2663 +18
- Misses 456 461 +5
- Partials 116 119 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
Function waitGroup.Add cannot be called when another goroutine calls waitGroup.Wait as is mentioned in note: https://pkg.go.dev/sync#WaitGroup.Add Co-authored-by: Jozef Kralik <jojo.lwin@gmail.com>
a557404
to
e209826
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @stv0g! Left a few minor comments below.
// | ||
// A WaitGroup must not be copied after first use. | ||
// | ||
// In the terminology of the Go memory model, a call to Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this was truncated when this package was introduced in pion/udp
. Perhaps we could fix it to match https://cs.opensource.google/go/go/+/refs/tags/go1.21.0:src/sync/waitgroup.go;l=23?
// In the terminology of the Go memory model, a call to Done
// “synchronizes before” the return of any Wait call that it unblocks.
// WaitGroups in the sync package do not allow adding or | ||
// subtracting from the counter while another goroutine is | ||
// waiting, while this one does. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at https://pkg.go.dev/sync#WaitGroup.Add and I don't believe this is strictly accurate (I know you are just migrating this package and didn't initially introduce it). Per https://pkg.go.dev/sync#WaitGroup.Add,
Note that calls with a positive delta that occur when the counter is zero must happen before a Wait. Calls with a negative delta, or calls with a positive delta that start when the counter is greater than zero, may happen at any time. Typically this means the calls to Add should execute before the statement creating the goroutine or other event to be waited for.
I believe the problem that was being encountered in the initial race condition was the rare case in which the wait group counter went to zero, but then Accept()
was called on the listener, and its Add()
raced with `Wait() in the goroutine that is started when listening beings.
I think the longer term solution is to consider some slight redesign in the UDP listener, but in the mean time, perhaps we could update this comment to mention that it guards against race conditions when adding to a wait group with counter 0 that is concurrently being waited on. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for investigation. I was also sceptical about this comment, but then didnt go into the details as I was not the original author.
Looking again at this, I dont believe it would be a good decission to merge this as is.
In my opinion the race should be fixed in the listener not by patching a commonly understood synchronization primitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stv0g sounds good to me -- are you planning to address in this PR or another?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @hasheddan,
Sorry, I am currently too busy to look into this in detail.
Would you have the capacity to take this over?
Or should we go with the old fix instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @stv0g! I can take a look, thanks for the ping!
It looks like sync.WaitGroup is in use in transport for connWG and there's no current issues around it being used, so I'll close this issue for now. |
Fixes #259