make bootstrap exit early on RAFT store reporting ready #4871

Merged
reyreaud-l merged 1 commit into stable/v1.25 from lre/fix-bootstrap-timeout-restart
May 8, 2024

Conversation

reyreaud-l (Contributor) commented May 8, 2024

What's being changed:

Our current bootstrap routine doesn't return if the cluster recovers from the RAFT log config. As a result, the bootstrap process gets stuck in a failure loop: it keeps trying to join and is denied because the node is already a member of the cluster.
Once the timeout is hit, the node crashes and restarts.

Change the bootstrap process to check whether the RAFT store reports ready, which means we have joined a cluster and the bootstrap can stop.

Also improve the logs so they carry the common action field and are more explicit about failure/success.

Review checklist

  • Documentation has been updated, if necessary. Link to changed documentation:
  • Chaos pipeline run or not necessary. Link to pipeline:
  • All new code is covered by tests where it is reasonable.
  • Performance tests have been run or not necessary.

reyreaud-l force-pushed the lre/fix-bootstrap-timeout-restart branch from 45216b1 to 3e19b1e May 8, 2024 07:42
antas-marcin previously approved these changes May 8, 2024
cluster/store/bootstrap.go (outdated, resolved)
reyreaud-l closed this May 8, 2024
reyreaud-l reopened this May 8, 2024
reyreaud-l force-pushed the lre/fix-bootstrap-timeout-restart branch from 3e19b1e to a9ebe84 May 8, 2024 13:53
reyreaud-l force-pushed the lre/fix-bootstrap-timeout-restart branch 2 times, most recently from 9a6f169 to 2200a77 May 8, 2024 15:00
reyreaud-l changed the title from "make bootstrap return on successful notify" to "make bootstrap exit early on RAFT store reporting ready" May 8, 2024
reyreaud-l force-pushed the lre/fix-bootstrap-timeout-restart branch from 2200a77 to 0e7e616 May 8, 2024 15:04
moogacs (Contributor) previously approved these changes May 8, 2024

moogacs left a comment:

LGTM

cluster/store/bootstrap.go (resolved)
cluster/store/bootstrap.go (outdated, resolved)
cluster/store/bootstrap.go (outdated, resolved)
Signed-off-by: Loic Reyreaud <loic@weaviate.io>

sonarcloud bot commented May 8, 2024

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

reyreaud-l merged commit 50b6b87 into stable/v1.25 May 8, 2024
40 checks passed
reyreaud-l deleted the lre/fix-bootstrap-timeout-restart branch May 8, 2024 16:29
3 participants