runtime/race: potential false positive from race detector #39052
Comments
That looks like a race between reading the length of the channel and writing to the channel. That looks like a correct report to me.
@randall77 Thanks for looking at this. Commenting out the assertion on the channel length doesn't change things, so I don't think that's what is causing the report. I could be wrong, obviously, but considering what statements are being referred to in the report,
It would help if you could provide complete stand-alone code for the problem, and if the race report that you show corresponds to that code. That said, I see that
The |
@ianlancetaylor The mock
The channel is closed after the wg is done, so after the channel was returned to the 3 routines. The routines run sequentially by definition (each routine requires a mutex lock), so only the last call to
@randall77 As you can see in the repo I created: the code to reproduce the issue is indeed identical to what I posted here. the
Checking the length and capacity of a channel doesn't require synchronisation, so no data race is possible there. Closing the channel is what the race detector takes issue with, unless I remove the mock from the broker first. The issue I have is that the behaviour of the unit tests, and indeed of the broker, is 100% deterministic. I use the waitgroup for synchronisation, and
Try as I like, to me there are only 2 ways around this issue:
You're right, this is not a race. Why then did it report the line number of the len call? I agree with Ian. The compiler converts:
To
In between the assignment and the
I've gone over this many times, thinking I must have missed something, but I believe I have found a case where the race detector returns a false positive (i.e. it reports a data race where there really isn't one). It seems to happen when writing to a channel directly in a `select-case` statement.
The unit tests trigger the race detector, even though I'm ensuring all calls accessing the channel have been made, using a callback and a waitgroup.
I have the channels in a map, which I access through a mutex. The data race vanishes the moment I explicitly remove the type that holds the channel from this map. The only reason I am able to do so is that the mutex has been released, so once again: I'm certain everything behaves correctly. Code below.
What version of Go are you using (`go version`)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (`go env`)?
What did you do?
I'm writing a simple message event bus where the broker pushes data onto a channel of its subscribers/consumers. If the channel buffer is full, I don't want the broker to block, so I'm using routines and a `select` statement to skip writes to a channel with a full buffer. To make life easier WRT testing, I'm mocking a subscriber interface, and I'm exposing the channels through functions (similar to `context.Context.Done()` and the like).
My tests all pass, and everything behaves as expected. However, running the same tests with the race detector, I'm getting what I believe to be a false positive. I have a test where I send data to a subscriber that isn't consuming the messages. The channel buffer is full, and I want to ensure that the broker doesn't block. To make sure I've tried to send all data, I'm using a waitgroup to check that the subscriber has indeed been accessed N times (where N is the number of events I'm sending). Once the waitgroup is done, I validate what data is on the channel, make sure it's empty, and then close it. The statement where I close the channel is marked as a data race.
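For readers without the repro at hand, the non-blocking send described above can be sketched roughly as follows. This is a minimal illustration and not the code from the issue: `trySend` is a hypothetical helper name, and the buffer size of 1 is an assumption chosen only to make the skipped write visible.

```go
package main

import "fmt"

// trySend is a hypothetical helper (not part of the issue's code).
// It attempts a send without blocking and reports whether the value
// was placed on the channel.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// Buffer is full (or nobody is receiving): skip instead of blocking.
		return false
	}
}

func main() {
	ch := make(chan int, 1) // assumed buffer size, for illustration only

	fmt.Println(trySend(ch, 1)) // true: the buffer had room
	fmt.Println(trySend(ch, 2)) // false: the buffer is full, write skipped
	fmt.Println(len(ch), cap(ch)) // 1 1
}
```

The `default` branch is what keeps the sender from blocking: when no case can proceed immediately, `select` takes `default` right away.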
If I do the exact same thing, but remove the subscriber from the broker first, the data race is no longer reported. Here's the code to reproduce the issue:
broker.go
broker_test.go
See the data race by running:
go test -v -race ./broker/... -run TestRace
What did you expect to see?
I expect to see log output showing that the subscriber was skipped twice (output I do indeed see), and no data race.
What did you see instead?
The code still behaved as expected, but a data race was reported:
Though I'm not certain, my guess is that the expression `s.C() <- v`, because it's a case expression, is what trips the race detector up here. The channel buffer is full, so any write would block if I'd put the channel write in the `default` case. As it stands, the write cannot possibly be executed, so instead my code logs the fact that a subscriber is being skipped, the routine ends (a deferred func unlocks the mutex), and the mock callback decrements the waitgroup. Once the waitgroup is empty, all calls to my mock subscriber have been made, and the channel can be safely closed.
It seems, however, that I need to add the additional call, removing the mock from the broker, to "reset" the race detector state. I'll try and have a look at the source, maybe something jumps out.
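One language detail that may be relevant to the ordering argument above: per the Go specification, the channel expression of a send case is evaluated once, on entering the `select`, before any send is attempted. The sketch below only demonstrates that ordering. The `sub` type, its `events` log, and `push` are hypothetical stand-ins, not the mock or broker from the issue, and whether this ordering explains the report is not established here.

```go
package main

import "fmt"

// sub is a hypothetical stand-in for the mocked subscriber.
// events records the order in which things happen.
type sub struct {
	ch     chan int
	events []string
}

// C mimics an accessor with a side effect (in the issue, the mock
// callback decrements a waitgroup when the accessor is called).
func (s *sub) C() chan<- int {
	s.events = append(s.events, "C() evaluated")
	return s.ch
}

// push performs the non-blocking send with the accessor call inside
// the case expression, as in `case s.C() <- v:`.
func push(s *sub, v int) {
	select {
	case s.C() <- v:
		s.events = append(s.events, "sent")
	default:
		s.events = append(s.events, "skipped")
	}
}

func main() {
	s := &sub{ch: make(chan int, 1)} // assumed buffer size of 1
	push(s, 1)
	push(s, 2)
	for _, e := range s.events {
		fmt.Println(e)
	}
	// Prints:
	// C() evaluated
	// sent
	// C() evaluated
	// skipped
}
```

In this sketch the accessor's side effect always lands before the send is tried, so anything signalled from inside `C()` is observable before the goroutine has finished touching the channel.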