
cluster: enable controller replay if last_applied is ahead of log #5703

Merged
4 commits merged into redpanda-data:dev on Nov 28, 2022

Conversation

@jcsp (Contributor) commented Jul 28, 2022

Cover letter

This situation happens if someone deletes controller log
segments while leaving the kvstore in place.

Previously, the kvstore last_applied would cause the
node to hang waiting for the controller log to replay
to that offset. This is not a Redpanda bug per se, as it only happens
when the underlying system violates invariants about storage,
but it is a case where we can be more helpful.

Now, we log an error about the apparent inconsistency,
and proceed.

In general we do not want to ignore data inconsistency, but
this is a special case: deleting the controller log is something
a user might legitimately do in order to work around another
issue and force Redpanda to rebuild the local copy of the
controller log.
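The recovery behavior described above can be sketched as a small decision function. This is a hypothetical illustration, not the actual Redpanda code (which is C++): the function name, the `log_dirty_offset` parameter, and the message text are all invented for the sketch.

```python
# Hypothetical sketch of the startup decision described in the cover letter.
# If the kvstore's saved last_applied offset is ahead of the highest offset
# actually present in the controller log, segments were deleted externally;
# rather than hang waiting for replay to reach last_applied, log an error
# and replay only what the log actually contains.

def choose_replay_target(kvstore_last_applied: int, log_dirty_offset: int) -> int:
    """Return the offset that controller replay should wait for."""
    if kvstore_last_applied > log_dirty_offset:
        print(
            f"Inconsistency: kvstore last_applied {kvstore_last_applied} is "
            f"ahead of controller log end {log_dirty_offset}; controller log "
            "segments were probably deleted. Proceeding with remaining log."
        )
        return log_dirty_offset
    return kvstore_last_applied

# Normal case: the log contains everything the kvstore has applied.
assert choose_replay_target(100, 150) == 100
# Damaged case: segments deleted, so the replay target is clamped.
assert choose_replay_target(100, 40) == 40
```

Before this change, the equivalent of the damaged case would wait indefinitely for offset 100 to appear in a log that ends at 40.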

Fixes #4950

UX changes

None

Release notes

Improvements

  • Controller log replay is more resilient to unexpected removal of log on disk.

@jcsp jcsp added kind/enhance New feature or request area/controller labels Jul 28, 2022
@jcsp jcsp force-pushed the controller-replay-after-dataloss branch 2 times, most recently from 30887d4 to 2089526 Compare August 2, 2022 13:08
@jcsp jcsp marked this pull request as ready for review August 2, 2022 13:09
@jcsp jcsp force-pushed the controller-replay-after-dataloss branch from 2089526 to d1e3ff2 Compare November 24, 2022 20:04
@jcsp jcsp requested review from mmaslankaprv and removed request for dotnwat, ztlpn, NyaliaLui and VadimPlh November 24, 2022 20:55
Noticed this while writing a test with an off-by-one on
a node id. Internally the error handling is safe, but
in the API we're returning a 500 instead of a 400.

For tests that want to know "did node X log message Y" rather
than just "was message Y logged anywhere".

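The per-node log assertion that commit describes could look like the following. This is a hypothetical sketch, not the actual test helper from the PR: `node_logged` and its arguments are invented names.

```python
# Hypothetical test helper: scan one specific node's captured log lines
# for a pattern, so a test can assert "node X logged Y" instead of only
# "Y was logged somewhere in the cluster".
import re

def node_logged(log_lines, pattern):
    """Return True if any line of this node's log matches `pattern`."""
    rx = re.compile(pattern)
    return any(rx.search(line) for line in log_lines)

node_a = ["INFO boot complete", "ERROR apparent log inconsistency"]
node_b = ["INFO boot complete"]
assert node_logged(node_a, r"inconsistency")
assert not node_logged(node_b, r"inconsistency")
```

A per-node check matters here because only the node whose controller log was deleted should emit the inconsistency error.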
This test validates that it is possible to reset the
controller log on a single node by removing it, a procedure
occasionally used in the field when a cluster ends up in a
split-brain situation resulting from interference with a
node's storage.
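The reset the test exercises can be sketched as follows. This is a simulation against a scratch directory, not a live broker, and the on-disk partition paths (`redpanda/controller/0_0`, `redpanda/kvstore/0_0`) are assumptions about the data-directory layout that may differ between deployments.

```python
# Hypothetical sketch of the field procedure: with the broker stopped,
# remove only the controller log and leave the kvstore in place, then
# restart; the broker logs the inconsistency and rebuilds its local
# controller log. Simulated here on a throwaway directory tree.
import shutil
import tempfile
from pathlib import Path

data_dir = Path(tempfile.mkdtemp())  # stand-in for the broker's data dir
controller = data_dir / "redpanda" / "controller" / "0_0"  # assumed path
kvstore = data_dir / "redpanda" / "kvstore" / "0_0"        # assumed path
controller.mkdir(parents=True)
kvstore.mkdir(parents=True)
(controller / "0-1-v1.log").touch()  # placeholder for a log segment

# The procedure itself: delete the controller log, keep the kvstore.
shutil.rmtree(data_dir / "redpanda" / "controller")

assert not (data_dir / "redpanda" / "controller").exists()
assert kvstore.is_dir()  # kvstore intact: the state this PR now tolerates
```

Leaving the kvstore behind is exactly what produces the "last_applied ahead of log" condition this PR handles.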
@jcsp jcsp force-pushed the controller-replay-after-dataloss branch from d1e3ff2 to 0eda42c Compare November 25, 2022 15:29
@mmaslankaprv mmaslankaprv self-requested a review November 28, 2022 10:33
@jcsp jcsp merged commit 79a8e3e into redpanda-data:dev Nov 28, 2022
Merging this pull request may close: "Simulating disk corruption - broker stuck on startup" (#4950)