Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix data race issue with calling wg.Add(1) after wg.Wait() #28

Merged
merged 3 commits into from
Feb 23, 2023

Conversation

alovak
Copy link
Contributor

@alovak alovak commented Jan 4, 2023

In certain situations, it is possible that the Close() method could be called at the same time as a callback function like ReadTimeoutHandler. This callback function could then call the Send method, which would attempt to increment the counter of a WaitGroup that has already had Wait called on it (inside Close). This scenario can lead to a data race, according to the data race detector.

To fix this issue, we should ensure that wg.Add(1) is not called after or during wg.Wait(). This is achieved by putting wg.Add(1) into the mutex that controls the closing state of the connection.

Some more context:

From the go documentation:

Note that calls with a positive delta that occur when the counter is zero must happen before a Wait.

Explanation from Rob Pike:

From the stack traces alone, this looks like a bug that showed up
today in some Google code. The issue is that there is no
happens-before relation between Waitgroup.Add and Waitgroup.Wait, so
your code does an Add after a Wait, the race detector will tell you.
You need to establish a synchronization to guarantee the Add does not
pass the Wait.

@@ -746,7 +746,7 @@ func TestClient_Send(t *testing.T) {
}

c, err := connection.New(server.Addr, testSpec, readMessageLength, writeMessageLength,
connection.ReadTimeout(100*time.Millisecond),
connection.ReadTimeout(50*time.Millisecond),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to reproduce the data race, I reduced the timeout here

@alovak alovak requested a review from adamdecaf January 4, 2023 14:59
@adamdecaf
Copy link
Member

Can this get merged or tested more?

@alovak
Copy link
Contributor Author

alovak commented Jan 10, 2023

Yeah, going to test it with using test environment with real brand + test harness.

@alovak alovak merged commit 3092837 into master Feb 23, 2023
@alovak alovak deleted the fix-datarace-with-wait-group branch September 1, 2023 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants