Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eliminate AddConnection/RemoveConnection race #530

Merged

Conversation

rade
Copy link
Member

@rade rade commented Apr 9, 2015

We were invoking AddConnection in a separate goroutine, and this could result in RemoveConnection getting invoked first, resulting in bogus entries in our connection map, which would effectively prevent
connectivity to the other peer until the stale entry got removed through a lost tie break. And if, per chance, the remote end ran into exactly the same race condition, then connectivity between that peer and ourself was broken permanently.

Hitting this race condition required a combination of a) no UDP heartbeat being received for HeartbeatTimeout(30s) - this causes the actorLoop to terminate and RemoveConnection to be invoked, and b) the AddConnection invocation getting delayed for at least that long.

Fixes #529.

We were invoking AddConnection in a separate goroutine, and this could
result in RemoveConnection getting invoked first, resulting in bogus
entries in our connection map, which would effectively prevent
connectivity to the other peer until the stale entry got removed
through a lost tie break. And if, per chance, the remote end ran into
exactly the same race condition, then connectivity between that peer
and ourself was broken permanently.

Hitting this race condition required a combination of a) no UDP
heartbeat being received for HeartbeatTimeout(30s) - this causes the
actorLoop to terminate and RemoveConnection to be invoked, and b) the
AddConnection invocation getting delayed for at least that long.
bboreham added a commit that referenced this pull request Apr 9, 2015
…_race

Very subtle, but it looks ok to me.  Fixes #529
@bboreham bboreham merged commit cec1efb into weaveworks:master Apr 9, 2015
@rade rade modified the milestone: 0.10.0 Apr 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bogus unestablished connections due to add/remove race condition
2 participants