Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-22.1: server: react to decommissioning nodes by proactively enqueuing their replicas #82680

Merged

Conversation

aayushshah15
Copy link
Contributor

@aayushshah15 aayushshah15 commented Jun 9, 2022

Backport 2/2 commits from #80993 and 1/1 commit from #82683.

/cc @cockroachdb/release


Note: This patch implements a subset of #80836

Previously, when a node was marked DECOMMISSIONING, other nodes in the
system would learn about it via gossip but wouldn't do much in the way
of reacting to it. They'd rely on their replicaScanner to gradually
run into the decommissioning node's ranges and rely on their
replicateQueue to then rebalance them.

This meant that even when decommissioning a mostly empty node, our worst
case lower bound for marking that node fully decommissioned was one
full scanner interval
(which is 10 minutes by default).

This patch improves this behavior by installing an idempotent callback
that is invoked every time a node is detected to be DECOMMISSIONING.
When it is run, the callback enqueues all the replicas on the local
stores that are on ranges that also have replicas on the decommissioning
node. Note that when nodes in the system restart, they'll re-invoke this callback
for any already DECOMMISSIONING node.

Resolves #79453

Release note (performance improvement): Decommissioning should now be
substantially faster, particularly for small to moderately loaded nodes.

Release justification: non-invasive performance improvement for node decommissioning

… replicas

Note: This patch implements a subset of cockroachdb#80836

Previously, when a node was marked `DECOMMISSIONING`, other nodes in the
system would learn about it via gossip but wouldn't do much in the way
of reacting to it. They'd rely on their `replicaScanner` to gradually
run into the decommissioning node's ranges and rely on their
`replicateQueue` to then rebalance them.

This meant that even when decommissioning a mostly empty node, our worst
case lower bound for marking that node fully decommissioned was _one
full scanner interval_ (which is 10 minutes by default).

This patch improves this behavior by installing an idempotent callback
that is invoked every time a node is detected to be `DECOMMISSIONING`.
When it is run, the callback enqueues all the replicas on the local
stores that are on ranges that also have replicas on the decommissioning
node.

Release note (performance improvement): Decommissioning should now be
substantially faster, particularly for small to moderately loaded nodes.
@blathers-crl
Copy link

blathers-crl bot commented Jun 9, 2022

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues or test-only changes.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@aayushshah15
Copy link
Contributor Author

Don't stamp yet, there's a bug in the original patch that I need to fix first.

@aayushshah15 aayushshah15 marked this pull request as ready for review June 9, 2022 20:12
@aayushshah15 aayushshah15 requested review from a team as code owners June 9, 2022 20:12
@aayushshah15 aayushshah15 requested a review from a team June 9, 2022 20:12
@aayushshah15 aayushshah15 requested a review from a team as a code owner June 9, 2022 20:12
@aayushshah15 aayushshah15 requested review from shermanCRL and removed request for a team and shermanCRL June 9, 2022 20:12
This commit fixes a bug from cockroachdb#80993. Without this commit, nodes
might re-run the callback to enqueue a decommissioning node's ranges
into their replicate queues if they received a gossip update from that
decommissioning node that was perceived to be newer. Re-running this
callback on every newer gossip update from a decommissioning node will
be too expensive for nodes with a lot of replicas.

Release note: None
@aayushshah15 aayushshah15 requested a review from a team as a code owner June 10, 2022 21:52
@aayushshah15 aayushshah15 requested review from miretskiy and removed request for a team and miretskiy June 10, 2022 21:52
@aayushshah15
Copy link
Contributor Author

@kvoli and / or @AlexTalks: could I get a stamp on this? This patch has been baking on master for over a week and we haven't seen any fallout related to it.

Copy link
Collaborator

@kvoli kvoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aayushshah15 aayushshah15 merged commit 6882d94 into cockroachdb:release-22.1 Jul 5, 2022
@aayushshah15 aayushshah15 deleted the backport22.1-80993 branch July 5, 2022 18:00
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Jul 6, 2022
This patch fixes a merge skew introduced by cockroachdb#82680 and 82800

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants