wireguard: Fix timeout in unit test #16001
As this PR only touches unit test code, I will not run the full CI, just the Travis and runtime tests.
7e797e8 to cc4f15c
test-runtime
LGTM! Just a small nit.
This commit fixes a deadlock in a unit test which, ironically, tests for deadlocks. The unit test in question ensures that the `wireguard.Agent` `UpdatePeer` method does not create a deadlock if a concurrent IPCache update is performed.

The previous version of this test tried to ensure this by taking a read-lock on the IPCache, so that only `UpdatePeer` would make progress (as it also just takes an `RLock`). However, that approach could lead to a timeout when `ipCache.Upsert` was invoked before `wgAgent.UpdatePeer`: due to the FIFO nature of the underlying mutex implementation, `UpdatePeer` will never obtain an `RLock` if there is a waiting writer.

This commit addresses this by taking the `wgAgent` lock instead. This means that `UpdatePeer` will lock the IPCache and then wait for the `wgAgent` lock to become available. Any concurrent IPCache updates will also be blocked until `UpdatePeer` has finished, as before.

This commit also introduces some additional checks to ensure the spawned goroutines have actually been scheduled. This is still best effort, as there is no easy way to ensure that a certain method is blocked on a particular mutex.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
cc4f15c to fc3a3a0
Travis is green, I'm marking this ready to merge. As mentioned above, this commit only changes unit tests run by Travis, so there is no impact on the Jenkins/GitHub Actions CI or the Cilium code itself.
Fixes: #15937