Switch to use NODE_OPS_CMD for bootstrap operation #8472

Closed
asias opened this issue Apr 13, 2021 · 0 comments · Fixed by #8481
asias commented Apr 13, 2021

In commit 323f72e (repair: Switch to use NODE_OPS_CMD for replace
operation), we switched the replace operation to the new NODE_OPS_CMD
infrastructure.

We should continue by switching the bootstrap operation to
NODE_OPS_CMD as well.

The benefits:

  • Pending node operations are detected more reliably than with gossip
    status, which helps avoid multiple topology changes at the same time.

  • If an error occurs, the cluster automatically reverts to its state
    before the bootstrap operation, much faster than with gossip.

  • Users can explicitly pass a list of dead nodes to ignore during
    bootstrap.

  • The BOOTSTRAP gossip status is no longer needed. This is one step
    closer to achieving gossip-less topology changes.
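
To make the first three benefits concrete, here is a minimal toy model of a command-driven node operation. It is only a sketch under stated assumptions, not Scylla's implementation: the issue does not describe the NODE_OPS_CMD protocol itself, so the `Node` class, the `run_node_op` function, the `"prepare"` / `"done"` / `"abort"` command names, and the `ignore_nodes` parameter are all hypothetical names chosen for illustration, and Python is used purely for brevity.

```python
import uuid
from dataclasses import dataclass, field


class NodeOpsError(Exception):
    """Raised when a node refuses or cannot process a command."""


@dataclass
class Node:
    """Toy cluster member that tracks pending topology operations (hypothetical)."""
    name: str
    alive: bool = True
    pending_ops: dict = field(default_factory=dict)  # ops_uuid -> operation name

    def handle(self, cmd, ops_uuid, op):
        if not self.alive:
            raise NodeOpsError(f"{self.name} is down")
        if cmd == "prepare":
            # Explicit bookkeeping: a second topology change is refused outright.
            if self.pending_ops:
                raise NodeOpsError(f"{self.name} already has a pending operation")
            self.pending_ops[ops_uuid] = op
        elif cmd in ("done", "abort"):
            # Both commands clear the pending entry; abort reverts the node.
            self.pending_ops.pop(ops_uuid, None)
        else:
            raise NodeOpsError(f"unknown command {cmd!r}")


def run_node_op(op, nodes, ignore_nodes=(), stream_data=lambda: None):
    """Coordinate one operation; on any error, abort on every prepared node."""
    ops_uuid = uuid.uuid4()
    # Nodes the user explicitly listed as dead are never contacted.
    targets = [n for n in nodes if n.name not in ignore_nodes]
    prepared = []
    try:
        for node in targets:
            node.handle("prepare", ops_uuid, op)
            prepared.append(node)
        stream_data()  # placeholder for the actual data movement
        for node in targets:
            node.handle("done", ops_uuid, op)
    except Exception:
        # Revert immediately instead of waiting for state to propagate.
        for node in prepared:
            if node.alive:
                node.handle("abort", ops_uuid, op)
        raise
    return ops_uuid
```

In this model, calling `run_node_op("bootstrap", nodes)` on a cluster with a dead node fails and leaves no pending entries behind, while passing that node's name in `ignore_nodes` lets the operation complete. A second operation started while one is in flight is rejected at the prepare step.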

asias added a commit to asias/scylla that referenced this issue Apr 14, 2021
In commit 323f72e (repair: Switch to use NODE_OPS_CMD for replace
operation), we switched the replace operation to the new NODE_OPS_CMD
infrastructure.

In this patch, we continue the work by switching the bootstrap operation
to NODE_OPS_CMD.

The benefits:

- Pending node operations are detected more reliably than with gossip
  status, which helps avoid multiple topology changes at the same time.

- If an error occurs, the cluster automatically reverts to its state
  before the bootstrap operation, much faster than with gossip.

- Users can explicitly pass a list of dead nodes to ignore during
  bootstrap.

- The BOOTSTRAP gossip status is no longer needed. This is one step
  closer to achieving gossip-less topology changes.

Fixes scylladb#8472
asias added a commit to asias/scylla that referenced this issue Apr 21, 2021
asias added a commit to asias/scylla that referenced this issue Apr 21, 2021
@slivne slivne added this to the 4.x milestone Apr 24, 2021
asias added a commit to asias/scylla that referenced this issue Apr 28, 2021
avikivity added a commit that referenced this issue May 6, 2021
…ation' from Asias He

In commit 323f72e (repair: Switch to use NODE_OPS_CMD for replace
operation), we switched the replace operation to the new NODE_OPS_CMD
infrastructure.

In this patch set, we continue the work by switching the decommission and
bootstrap operations to NODE_OPS_CMD.

Fixes #8472
Fixes #8471

Closes #8481

* github.com:scylladb/scylla:
  repair: Switch to use NODE_OPS_CMD for bootstrap operation
  repair: Switch to use NODE_OPS_CMD for decommission operation
@DoronArazii DoronArazii modified the milestones: 5.x, 4.6 Jul 14, 2022