CI Failure (internal redpanda assert!) in NodesDecommissioningTest.test_flipping_decommission_recommission
#8218
Labels
area/controller
ci-failure
kind/bug
Something isn't working
sev/high
loss of availability, pathological performance degradation, recoverable corruption
Comments
@mmaslankaprv an assert you added is failing.
Will look into this. Thanks for reporting.
Seen here: #8092
This was referenced Jan 14, 2023
jcsp added the area/controller and sev/high labels Jan 16, 2023
mmaslankaprv added a commit to mmaslankaprv/redpanda that referenced this issue Jan 17, 2023
Members backend was tracking the last node decommission revision in order to cancel all related partition movements. The tracking was broken because the revision map could be updated by the next decommission update while the previous recommission was still being processed. To fix this and to simplify tracking of the last decommission revision id, the previous decommission revision is now tracked inside the recommission update metadata object. This way a given recommission action is always associated with the correct decommission revision.
Fixes: redpanda-data#8218
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 51d721b)
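The race described in the commit message can be illustrated with a minimal Python sketch. It is not taken from the Redpanda source: `Update`, `SharedMapTracker`, and `revision_to_cancel` are hypothetical names, and the revision numbers are made up. It only models the idea that a shared "last decommission revision" map can be overwritten before a pending recommission is processed, while a revision carried inside the recommission update cannot.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Update:
    """Hypothetical stand-in for a members update; not Redpanda's real type."""
    node_id: int
    kind: str        # "decommission" or "recommission"
    revision: int    # revision at which the update was applied
    # Only set on a recommission: the decommission it is meant to undo.
    decommission_revision: Optional[int] = None


class SharedMapTracker:
    """Assumed model of the old approach: node id -> last decommission revision."""

    def __init__(self) -> None:
        self.last_decommission: Dict[int, int] = {}

    def apply(self, update: Update) -> None:
        if update.kind == "decommission":
            self.last_decommission[update.node_id] = update.revision

    def revision_to_cancel(self, recommission: Update) -> int:
        # Reads shared state at processing time, which may already be newer.
        return self.last_decommission[recommission.node_id]


def revision_to_cancel(recommission: Update) -> int:
    """Assumed model of the fix: use the revision carried by the update itself."""
    assert recommission.decommission_revision is not None
    return recommission.decommission_revision


tracker = SharedMapTracker()
first = Update(node_id=1, kind="decommission", revision=10)
tracker.apply(first)

# The recommission records which decommission it undoes.
recommission = Update(1, "recommission", 11, decommission_revision=first.revision)
tracker.apply(recommission)

# A second decommission arrives before the recommission has been processed.
tracker.apply(Update(node_id=1, kind="decommission", revision=12))

from_map = tracker.revision_to_cancel(recommission)  # points at the newer decommission
from_update = revision_to_cancel(recommission)       # points at the one being undone
```

In this model the shared map resolves the pending recommission to the second decommission, whereas the embedded field still points at the first one, which is the association the commit message says the fix guarantees.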
https://buildkite.com/redpanda/redpanda/builds/21131#0185aa2a-f6ee-4d97-a593-ba8122fc740d