Deadlock on HA leadership transfer when standby was actively forwarding a request #12601
Comments
Nice find, and thanks for the really great bug report. Concise yet contains everything we could ask for. I think the fix is easy, but I'm not going to PR it because we should repro this ourselves so we can make sure this solves it, and I haven't the time myself right now. We'll make sure someone gets to this soon, hopefully in time for the next minor release. Proposed fix:
* Fix a Deadlock on HA leadership transfer when standby was actively forwarding a request fixes GH #12601 * adding the changelog
Thanks for submitting this issue. The issue has been addressed in the above PRs on main and three released branches. I am going to close this issue.
Thanks for the quick resolution, we really appreciate it!
…2691) * Fix a Deadlock on HA leadership transfer when standby was actively forwarding a request fixes GH hashicorp#12601 * adding the changelog
…#12721) * Fix a Deadlock on HA leadership transfer when standby was actively forwarding a request fixes GH hashicorp#12601 * adding the changelog
…#12722) * Fix a Deadlock on HA leadership transfer when standby was actively forwarding a request fixes GH hashicorp#12601 * adding the changelog
Describe the bug
A deadlock can occur between the `stateLock` and `requestForwardingConnectionLock` of the standby core when assuming HA leadership while actively forwarding a request. In particular, the following goroutines will hang indefinitely:
To Reproduce
`vault operator step-down`) -- nothing unusual in the leader logs.
Expected behavior
Standby should proceed with unseal strategy and start handling requests as HA leader.
Environment:
Vault server version (`vault status`): 1.7.3
Vault CLI version (`vault version`): N/A
Vault server configuration file(s):
Additional context
Full standby server logs, including goroutine trace.
I believe this is the same issue as previously reported:
Bug was likely introduced by #9496.
Thanks also to @scotje for his help uncovering this bug.