Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/trace: data race when freeing traces #39790

Open
jrockway opened this issue Jun 23, 2020 · 1 comment
Open

x/net/trace: data race when freeing traces #39790

jrockway opened this issue Jun 23, 2020 · 1 comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@jrockway
Copy link

jrockway commented Jun 23, 2020

This bug is on go version go1.14.4 linux/amd64 with x/net from master (techincally 627f9648deb96c27737b83199d44bb5c1010cbcf).

There is a race condition between event recycling and trace clearing when the trace has a recycler set with SetRecycler and the number of events in tr.events is <= 4 (len(tr.eventBuf)). When a trace is initialized, tr.events is set to tr.eventsBuf[:0]. In unref, the freeing logic starts a recycling function over each event in a separate goroutine by retaining a reference to the tr.events slice, and then calls freeTrace and then reset to write a zero value to each element of the eventsBuf array. When the number of events added to the trace fits in the space reserved by tr.eventsBuf, tr.events just aliases that buffer, and so the order of reads and writes is non-deterministic. The freeing could run first, and thus the recycler function sees the zero-valued events written by reset, or the recycling goroutine could run first, in which case the recycling performs as expected.

golang/net#75 (https://go-review.googlesource.com/c/net/+/238302) adds a test and a fix for this. If you patch in only the test and run "go test -race ." in the trace directory, you'll see that the race detector detects this data race. My fix makes a copy of tr.events if the number of events is less than or equal to the length of tr.eventsBuf, so that the recycler can run at its leisure. Obviously this is not ideal, as the whole point of eventsBuf and the trace pool is to avoid allocations. However, I get the feeling that people don't actually use event recyclers, or they would have already noticed that events don't get recycled and filed a bug like this ;)

Other options that could be considered are doing the entire free asynchronously (letting the refcount hit 0, doing recycling and freeing in the background, and then pushing the trace into the pool that newTrace draws from), or just doing the recycling synchronously (which probably breaks existing code that did something silly like tr.SetRecycler(func (e interface{}) { ch <- e }); tr.Finish(); <-ch). Let me know which approach you prefer, and I'll update the PR accordingly.

@gopherbot gopherbot added this to the Unreleased milestone Jun 23, 2020
@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Jun 23, 2020

CC @pjweinb

@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants