node wise recovery improvements #15394

bharathv · 2023-12-11T14:27:47Z

Some changes from the previous defunct nodes implementation:

Defunct nodes are no longer a thing; they will be reimplemented as offline nodes in a future PR.
Node wise recovery and offlining nodes are now two separate user workflows. Earlier, recovering partitions node wise and marking nodes as defunct were combined into a single command.
Node wise recovery no longer touches partitions that still have a majority; the balancer should recover them automatically after the unavailability timeout.
The corresponding controller command for node wise recovery is idempotent (it can be issued as many times as needed).
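
To illustrate which partitions node wise recovery is meant to act on, here is a minimal sketch of the "lost majority" rule. It is a simplified model and not the code from this PR: the function names and the sample replica assignments are hypothetical, and it assumes a partition has lost majority when fewer than floor(n/2) + 1 of its n replicas are on live nodes.

```python
def has_lost_majority(replicas, dead_nodes):
    """Return True if the surviving replicas can no longer form a majority."""
    dead = set(dead_nodes)
    alive = [node for node in replicas if node not in dead]
    majority = len(replicas) // 2 + 1
    return len(alive) < majority


def partitions_to_recover(assignments, dead_nodes):
    """Select only partitions that lost majority; all others are left alone."""
    return sorted(
        ntp
        for ntp, replicas in assignments.items()
        if has_lost_majority(replicas, dead_nodes)
    )


# Hypothetical replica assignments (partition -> node ids), for illustration only.
assignments = {
    "kafka/topic-a/0": [5, 6, 7],  # every replica is on a dead node
    "kafka/topic-b/0": [1, 5, 6],  # two of three replicas are dead
    "kafka/topic-c/0": [1, 2, 5],  # one of three replicas is dead, majority intact
}

print(partitions_to_recover(assignments, dead_nodes=[5, 6, 7]))
```

In this model only topic-a and topic-b are selected; topic-c keeps its majority, so it would be left for the balancer.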

API examples:

Fetch the list of partitions that lost majority: GET http://:9644/v1/partitions/majority_lost?dead_nodes=5,6,7
Force recover them (node wise recovery): POST http://:9644/v1/partitions/force_recover_from_nodes

Example payload:

{
  'dead_nodes': [5],
  'partitions_to_force_recover': [
    {'ntp': {'ns': 'kafka', 'topic': 'topic-3', 'partition': 0}, 'topic_revision': 42, 'replicas': [{'node_id': 5, 'core': 0}], 'dead_nodes': [5]},
    {'ntp': {'ns': 'kafka', 'topic': 'topic-7', 'partition': 0}, 'topic_revision': 49, 'replicas': [{'node_id': 5, 'core': 1}], 'dead_nodes': [5]},
    {'ntp': {'ns': 'kafka', 'topic': 'topic-5', 'partition': 0}, 'topic_revision': 47, 'replicas': [{'node_id': 5, 'core': 1}], 'dead_nodes': [5]},
    {'ntp': {'ns': 'kafka', 'topic': 'topic-1', 'partition': 0}, 'topic_revision': 39, 'replicas': [{'node_id': 5, 'core': 1}], 'dead_nodes': [5]},
    {'ntp': {'ns': 'kafka', 'topic': 'topic-17', 'partition': 1}, 'topic_revision': 59, 'replicas': [{'node_id': 5, 'core': 1}], 'dead_nodes': [5]}
  ]
}
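
The two calls above could be prepared from a script roughly as follows. This is a sketch under assumptions: the `host` argument is hypothetical (the host is elided in the URLs above), the helper names are made up for illustration, and the entries passed as `partitions` are assumed to already have the shape shown in the example payload. The code only builds the URL and the request object; sending it (for example with `urllib.request.urlopen`) is left to the caller.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ADMIN_PORT = 9644  # port used in the API examples above


def majority_lost_url(host, dead_nodes):
    """Build the GET URL that lists partitions losing majority."""
    # Keep commas unescaped so the query reads dead_nodes=5,6,7 as in the example.
    query = urlencode({"dead_nodes": ",".join(str(n) for n in dead_nodes)}, safe=",")
    return f"http://{host}:{ADMIN_PORT}/v1/partitions/majority_lost?{query}"


def build_recovery_request(host, dead_nodes, partitions):
    """Build (but do not send) the POST request for node wise recovery."""
    payload = {
        "dead_nodes": list(dead_nodes),
        "partitions_to_force_recover": list(partitions),
    }
    return Request(
        f"http://{host}:{ADMIN_PORT}/v1/partitions/force_recover_from_nodes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# One entry copied from the example payload above.
partition = {
    "ntp": {"ns": "kafka", "topic": "topic-3", "partition": 0},
    "topic_revision": 42,
    "replicas": [{"node_id": 5, "core": 0}],
    "dead_nodes": [5],
}

request = build_recovery_request("localhost", [5], [partition])
print(majority_lost_url("localhost", [5, 6, 7]))
print(request.get_method(), request.full_url)
```

Because the controller command is described as idempotent, re-sending the same request after a timeout should be safe.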

Backports Required

Release Notes

Features

Adds the ability to force recover all partitions from a set of nodes (aka node wise recovery). This bulk recovers all partitions that lost majority because the nodes holding the majority replicas are no longer available.

bharathv · 2023-12-12T04:12:37Z

/dt

vbotbuildovich · 2023-12-12T06:35:15Z

new failures in https://buildkite.com/redpanda/redpanda/builds/42572#018c5c79-699d-4e7f-95f6-317755a710a1:

"rptest.tests.cluster_config_test.ClusterConfigAliasTest.test_aliasing_with_upgrade.wipe_cache=False.prop_set=PropertyAliasData.primary_name=.log_retention_ms.aliased_name=.delete_retention_ms.redpanda_version=.23.3.test_values=.1000000.300000.500000"

new failures in https://buildkite.com/redpanda/redpanda/builds/42572#018c5c79-69a4-4294-9b51-0d811376a80f:

"rptest.tests.cluster_config_test.ClusterConfigAliasTest.test_aliasing_with_upgrade.wipe_cache=True.prop_set=PropertyAliasData.primary_name=.log_retention_ms.aliased_name=.delete_retention_ms.redpanda_version=.23.3.test_values=.1000000.300000.500000"

bharathv · 2023-12-13T04:00:44Z

/dt

mmaslankaprv · 2023-12-13T08:13:28Z

src/v/cluster/topics_frontend.cc

+        auto validation_err
+          = _topics.local().validate_force_reconfigurable_partitions(result);
+        if (validation_err) {
+            co_return errc::concurrent_modification_error;


Maybe we can return validation_err here?

This was intentional: a user calling get_majority_lost_partitions() doesn't expect an error unless something concurrently modified the state while the request was being processed (hence concurrent_modification_error). This validation error is only possible in that situation.
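
The reasoning in this reply can be modeled with a small sketch. It is an illustrative Python model, not the C++ code under review: `Errc`, `validate`, `get_majority_lost` and the dict-based topic table are hypothetical stand-ins. The result is computed, other work may run at a scheduling point, and the result is then re-validated; a failed validation can only mean the state changed in between, so it is reported as a concurrent modification.

```python
from enum import Enum


class Errc(Enum):
    SUCCESS = "success"
    CONCURRENT_MODIFICATION_ERROR = "concurrent_modification_error"


def validate(topic_table, result):
    """Return a description of the first stale entry, or None if all are current."""
    for ntp, revision in result:
        if topic_table.get(ntp) != revision:
            return f"{ntp} changed since it was read"
    return None


def get_majority_lost(topic_table, compute, yield_point=lambda: None):
    result = compute(topic_table)
    yield_point()  # stands in for a scheduling point where other work may run
    if validate(topic_table, result) is not None:
        # The caller did nothing wrong; the state moved underneath the request.
        return Errc.CONCURRENT_MODIFICATION_ERROR, []
    return Errc.SUCCESS, result


def snapshot(table):
    """Hypothetical computation: capture (partition, revision) pairs."""
    return sorted(table.items())


table = {"kafka/topic-3/0": 42}
print(get_majority_lost(table, snapshot))
print(get_majority_lost(table, snapshot, lambda: table.update({"kafka/topic-3/0": 43})))
```

The first call succeeds because nothing changes between the computation and the validation; the second reports the concurrent modification error because the revision is bumped at the yield point.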

mmaslankaprv · 2023-12-13T08:14:14Z

just two comments, otherwise looks good

bharathv · 2023-12-13T11:33:38Z

/dt

vbotbuildovich · 2023-12-13T14:04:07Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42670#018c6338-c294-4847-8836-7c0c7258f6b6

github-actions bot added the area/redpanda label Dec 11, 2023

bharathv force-pushed the defunct_node_fixes branch from e8d907b to 293887a Compare December 12, 2023 08:32

bharathv marked this pull request as ready for review December 12, 2023 08:33

bharathv requested review from ztlpn and mmaslankaprv December 12, 2023 08:33

mmaslankaprv reviewed Dec 12, 2023

src/v/cluster/topic_table.cc (outdated, resolved)

mmaslankaprv reviewed Dec 12, 2023

src/v/cluster/topic_table.h (outdated, resolved)

mmaslankaprv reviewed Dec 12, 2023

src/v/cluster/topic_table.cc (resolved)

mmaslankaprv reviewed Dec 12, 2023

src/v/cluster/topic_table.cc (outdated, resolved)

bharathv force-pushed the defunct_node_fixes branch from 293887a to 4ebcf8a Compare December 12, 2023 11:25

bharathv requested a review from mmaslankaprv December 12, 2023 11:25

bharathv added this to the v23.3.1-rc3 milestone Dec 13, 2023

mmaslankaprv reviewed Dec 13, 2023

bharathv added 8 commits December 13, 2023 00:21

topics_frontend: check for iterator stability after scheduling point

ae076df

missed this in the original PR.

topic_table: Add bulk_force_reconfiguration_cmd

b298710

This command will replace defunct_nodes_cmd in the subsequent commits.

topic_table: remove unused variable

5984c57

topic_table: state tracking for bulk_force_reconfiguration_cmd

8017f4b

partition_balancer: bulk force reconfiguration updates

991538a

Adds logic for processing force reconfiguration of partitions. Additionally undoes most of the defunct_node_cmd state processing.

members: remove defunct_node_cmd and related code

3c4891f

Additionally renames defunct_node -> dead_node in most APIs to clear up terminology.

topics_frontend: issue bulk_force_reconfiguration_cmd

f40d1df

.. for node-wise-recovery APIs

ducktape: update node_wise_recovery test

f43c399

bharathv force-pushed the defunct_node_fixes branch from 4ebcf8a to f43c399 Compare December 13, 2023 08:58

bharathv requested a review from mmaslankaprv December 13, 2023 09:00

mmaslankaprv approved these changes Dec 13, 2023

bharathv merged commit a2a2dd4 into redpanda-data:dev Dec 13, 2023
20 checks passed

bharathv deleted the defunct_node_fixes branch December 13, 2023 14:51

r-vasquez mentioned this pull request Dec 17, 2023

rpk: add 'rpk cluster partitions unsafe-recover' #15300

Merged

7 tasks

github-actions bot mentioned this pull request Dec 22, 2023

update redpanda appVersion from v23.2.21 to v23.3.1 redpanda-data/helm-charts#950

Merged
