Don't say we've shut down cluster listener before having done so #13679

Merged
ncabatoff merged 2 commits into main from fix-cluster-listener-stop-race
Jan 19, 2022

ncabatoff
Contributor

@ncabatoff ncabatoff commented Jan 17, 2022

This change was motivated by the following output from a test on ent:

2022-01-14T09:09:34.006-0500 [DEBUG] core0.core: starting cluster listeners
2022-01-14T09:09:34.006-0500 [INFO]  core0.core.cluster-listener.tcp: starting listener: listener_address=127.0.0.1:0
2022-01-14T09:09:34.007-0500 [INFO]  core0.core.cluster-listener: serving cluster requests: cluster_listen_address=127.0.0.1:52477
2022-01-14T09:09:34.008-0500 [TRACE] core0.raft: setting up raft cluster
...
2022-01-14T09:09:52.591-0500 [INFO]  core0.core: stopping cluster listeners
2022-01-14T09:09:52.591-0500 [INFO]  core0.core.cluster-listener: forwarding rpc listeners stopped
2022-01-14T09:09:53.071-0500 [INFO]  core0.core.cluster-listener: rpc listeners successfully shut down
2022-01-14T09:09:53.071-0500 [INFO]  core0.core: cluster listeners successfully shut down
...
2022-01-14T09:09:53.072-0500 [INFO]  core0.core.cluster-listener.tcp: starting listener: listener_address=127.0.0.1:52477
2022-01-14T09:09:53.072-0500 [ERROR] core0.core.cluster-listener.tcp: error starting listener: error="listen tcp 127.0.0.1:52477: bind: address already in use"

The test in question, TestRaft_SnapshotAPI_RekeyRotate_Backward/rekey-with-perf-standby, was still in its initial setup stage. We had completed the initial single-node raft cluster setup, won the first election (needed to store the raft configuration in the first log entry), and were restarting raft with the real listener before joining the other nodes.

I have been unable to reproduce this failure even before making this change, so it's a speculative fix.

@ncabatoff ncabatoff requested a review from a team January 17, 2022 14:42
@ncabatoff ncabatoff merged commit 90ce935 into main Jan 19, 2022
@ncabatoff ncabatoff deleted the fix-cluster-listener-stop-race branch January 19, 2022 15:51
ncabatoff added a commit that referenced this pull request Jan 19, 2022
ncabatoff added a commit that referenced this pull request Jan 20, 2022