Fixed tracking decommissioned update revision in cluster::members_backend #8245

Merged 3 commits on Jan 16, 2023

@mmaslankaprv mmaslankaprv commented Jan 16, 2023

The members backend tracked the revision of the last node decommission
so that it could cancel all related partition movements. This tracking
was broken: the revision map could be updated by the next decommission
update while the previous recommission was still being processed. To fix
this and simplify tracking of the last decommission revision id, the
previous decommission revision is now stored inside the recommission
update metadata object. This way, each recommission action is always
tied to the correct decommission revision.
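
The race described above can be illustrated with a minimal sketch. All types and function names here (update_meta, shared_map_tracker, make_recommission, revision_to_cancel) are hypothetical and chosen for illustration; they are not the actual cluster::members_backend definitions. The sketch only assumes that a recommission is created before it is processed, and that another decommission may arrive in between.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical types for illustration only; not Redpanda's real definitions.
using node_id = int32_t;
using revision_id = int64_t;

enum class update_type { decommissioned, recommissioned };

struct update_meta {
    node_id node;
    update_type type;
    revision_id revision;
    // Set only for recommission updates: the revision of the decommission
    // that this recommission is undoing.
    std::optional<revision_id> decommission_revision;
};

// Old approach (as described above): one shared map, keyed by node, that
// every new decommission overwrites.
struct shared_map_tracker {
    std::map<node_id, revision_id> last_decommission;

    void on_decommission(node_id n, revision_id r) { last_decommission[n] = r; }

    // Looked up when the recommission is processed, which may happen after
    // a newer decommission has already overwritten the entry.
    revision_id revision_to_cancel(node_id n) const {
        return last_decommission.at(n);
    }
};

// New approach: capture the previous decommission revision when the
// recommission update is created, so later updates cannot change it.
update_meta make_recommission(
  node_id n, revision_id r, revision_id prev_decommission) {
    return update_meta{n, update_type::recommissioned, r, prev_decommission};
}

revision_id revision_to_cancel(const update_meta& u) {
    assert(u.type == update_type::recommissioned);
    assert(u.decommission_revision.has_value());
    return *u.decommission_revision;
}
```

In this sketch, if node 1 is decommissioned at revision 10, recommissioned at revision 11, and decommissioned again at revision 12 before the recommission is processed, the shared map yields 12 while the recommission's own metadata still yields 10.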

Fixes: #8218

Backports Required

  • none - not a bug fix
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

UX Changes

Release Notes

Bug Fixes

  • Fixed incorrect tracking of the previous decommission update, which in rare situations could trigger an assertion in Redpanda.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
When the operation has already finished because the node was removed, we
can exit early instead of going through the whole members_backend
reconciliation logic.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv mmaslankaprv merged commit e0bbd4c into redpanda-data:dev Jan 16, 2023
@mmaslankaprv
Member Author

/backport v22.3.x

@vbotbuildovich
Collaborator

The pull request is not merged yet. Cancelling backport...

Workflow run logs.

@piyushredpanda
Contributor

/backport v22.3.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x 9bb6ac4b30c0a0903cdc2681bd64bb844e9c54a2 8c33fb8d3d7f1637a92553e67820255bc2db908b 51d721b9fad42b90c2538d0834c490a55d1b0a43

Workflow run logs.

Development

Successfully merging this pull request may close these issues.

CI Failure (internal redpanda assert!) in NodesDecommissioningTest.test_flipping_decommission_recommission