Controller backend: reconcile each ntp in individual fibers #16055

ztlpn · 2024-01-10T16:26:26Z

In the future, when partitions are assigned to shards by a node-level component (as opposed to the topic table), the controller backend will have to accept notifications from several sources. To support this, it is more convenient to have per-ntp fibers doing reconciliation than a single housekeeping fiber that periodically launches reconciliation for all pending ntps and waits for all reconciliation attempts to finish.
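To illustrate the per-ntp idea, here is a minimal single-threaded sketch that uses only the standard library. It is not the code in this PR: `per_ntp_backend_sketch`, `ntp_state_sketch`, `notify` and `run_one` are hypothetical names, and a ready queue stands in for Seastar fibers. It only shows that notifications from any source wake the affected ntp, and that each ntp is reconciled on its own rather than as part of a batch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical per-ntp bookkeeping; the real controller_backend state differs.
struct ntp_state_sketch {
    bool queued = false;    // true while this ntp is waiting to be reconciled
    int notifications = 0;  // notifications received from any source
    int reconcile_runs = 0; // completed reconciliation passes for this ntp
};

class per_ntp_backend_sketch {
public:
    // Called by any notification source. State is created on first sight,
    // and repeated notifications coalesce into a single pending wakeup.
    void notify(const std::string& ntp) {
        ntp_state_sketch& st = _states[ntp];
        ++st.notifications;
        if (!st.queued) {
            st.queued = true;
            _ready.push_back(ntp);
        }
    }

    // Models one ntp's fiber waking up: reconcile exactly one ntp,
    // without waiting for any other ntp to finish.
    bool run_one() {
        if (_ready.empty()) {
            return false;
        }
        std::string ntp = _ready.front();
        _ready.pop_front();
        ntp_state_sketch& st = _states.at(ntp);
        assert(st.queued);
        st.queued = false;
        ++st.reconcile_runs;
        return true;
    }

    std::size_t ready_count() const { return _ready.size(); }

    int reconcile_runs(const std::string& ntp) const {
        std::map<std::string, ntp_state_sketch>::const_iterator it
          = _states.find(ntp);
        return it == _states.end() ? 0 : it->second.reconcile_runs;
    }

private:
    std::map<std::string, ntp_state_sketch> _states;
    std::deque<std::string> _ready;
};
```

In this model, two notifications for the same ntp from different sources result in one reconciliation pass, and a pass for one ntp never waits on another ntp.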

Backports Required

Release Notes

none

vbotbuildovich · 2024-01-10T18:53:09Z

new failures in https://buildkite.com/redpanda/redpanda/builds/43647#018cf476-8100-43a4-9e7d-67a03af05ae4:

"rptest.tests.recovery_mode_test.RecoveryModeTest.test_recovery_mode"

ztlpn · 2024-01-10T23:18:28Z

Hmm, the above failure may actually be semi-related due to slightly changed timings of partition creation. In the test run, ntp kafka/__consumer_offsets/1 never got a leader because all replicas reported:

TRACE 2024-01-10 18:19:47,779 [shard 0:main] raft - [group_id:13, {kafka/__consumer_offsets/1}] consensus.cc:953 - current node priority 1 is lower than target 6538824 (next vote 4359216)

Looking at the code, I see that before the raft group manager is marked "ready", consensus instances are created with priority=1. When set_ready is called, priority is reset to a normal value for all raft instances existing at that point. But I think there is a small race window: set_ready can be called exactly between the moment a consensus instance is created and the moment it is added to the group_manager map.

@mmaslankaprv WDYT?
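The suspected interleaving can be modelled deterministically. The sketch below is a simplified stand-in, not the real raft code: `group_manager_sketch`, `consensus_sketch`, the two priority constants and `priority_after_startup` are all hypothetical. It only shows that an instance created before `set_ready` but registered after it keeps the low startup priority.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical values; the real priorities are computed differently.
constexpr int min_priority_sketch = 1;
constexpr int normal_priority_sketch = 100;

struct consensus_sketch {
    int priority = min_priority_sketch;
};

class group_manager_sketch {
public:
    // Step 1: create an instance. Until the manager is ready it starts
    // with the minimum priority.
    std::shared_ptr<consensus_sketch> create() const {
        std::shared_ptr<consensus_sketch> c
          = std::make_shared<consensus_sketch>();
        c->priority = _ready ? normal_priority_sketch : min_priority_sketch;
        return c;
    }

    // Step 2: register the instance so that set_ready() can see it.
    void add(std::shared_ptr<consensus_sketch> c) {
        assert(c != nullptr);
        _groups.push_back(std::move(c));
    }

    // Resets priority only for instances that are already registered.
    void set_ready() {
        _ready = true;
        for (std::shared_ptr<consensus_sketch>& g : _groups) {
            g->priority = normal_priority_sketch;
        }
    }

private:
    bool _ready = false;
    std::vector<std::shared_ptr<consensus_sketch>> _groups;
};

// Returns the instance's priority once startup has finished.
// set_ready_in_gap == true places set_ready() between create() and add().
int priority_after_startup(bool set_ready_in_gap) {
    group_manager_sketch mgr;
    std::shared_ptr<consensus_sketch> c = mgr.create();
    if (set_ready_in_gap) {
        mgr.set_ready(); // runs before the instance is registered
    }
    mgr.add(c);
    if (!set_ready_in_gap) {
        mgr.set_ready(); // normal order: the instance is already registered
    }
    return c->priority;
}
```

With `set_ready_in_gap` set to true, the instance ends up with the minimum priority even though the manager is ready, which matches the symptom in the trace above.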

ztlpn · 2024-01-11T00:16:17Z

ok, this is very easy to reproduce with a well-placed sleep

mmaslankaprv · 2024-01-11T08:37:31Z

src/v/cluster/controller_backend.cc

+              auto [rs_it, inserted] = _states.try_emplace(d.ntp);
+              if (inserted) {
+                  rs_it->second
+                    = ss::make_lw_shared<ntp_reconciliation_state>();
+              }


ztlpn · 2024-01-11T12:54:51Z

Fix for the test_recovery_mode failure: #16068

bharathv

looks pretty good to me.

Event is a simple notification mechanism. One fiber can set() the event and another can wait() for it.

vbotbuildovich · 2024-01-12T23:09:59Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/43737#018cffa6-0d05-4f85-8ffb-774fb45f569c

ztlpn requested review from bharathv and mmaslankaprv January 10, 2024 16:26

github-actions bot added the area/redpanda label Jan 10, 2024

mmaslankaprv reviewed Jan 11, 2024

mmaslankaprv reviewed Jan 11, 2024

bharathv previously approved these changes Jan 12, 2024

ztlpn added 6 commits January 12, 2024 20:59

ssx: add event class

00b0c46

Event is a simple notification mechanism. One fiber can set() the event and another can wait() for it.

c/controller_backend: reconcile each ntp in separate fiber

f2d9d1f

c/controller_backend: move internal structs to .cc

db37089

c/controller_backend: add binding for housekeeping_interval property

8109422

c/controller_backend: add housekeeping jitter

4c6452f

c/controller_backend: add stuck ntp watchdog fiber

fadddd2
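The event described in the `ssx: add event class` commit (one fiber calls set(), another calls wait()) can be sketched in a cooperative, single-threaded style. This is not the `ssx::event` implementation: `event_sketch` is a hypothetical class, a `std::function` continuation stands in for a suspended fiber, and both the single-waiter limit and the auto-reset on wakeup are assumptions of this sketch.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for an event shared by two cooperative fibers.
class event_sketch {
public:
    // Signal the event. A parked waiter is resumed immediately and consumes
    // the signal; otherwise the signal is remembered for the next wait().
    void set() {
        if (_waiter) {
            std::function<void()> resume = std::move(_waiter);
            _waiter = nullptr;
            resume();
            return;
        }
        _is_set = true;
    }

    // Wait for the event. If it is already set, consume it and continue at
    // once; otherwise park the continuation until set() is called.
    void wait(std::function<void()> on_set) {
        assert(!_waiter && "this sketch supports a single waiter");
        if (_is_set) {
            _is_set = false;
            on_set();
            return;
        }
        _waiter = std::move(on_set);
    }

    bool is_set() const { return _is_set; }

private:
    bool _is_set = false;
    std::function<void()> _waiter;
};
```

A reconciliation loop in this style would call wait() again at the end of each pass, so a set() that arrives while a pass is running is not lost.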

ztlpn dismissed bharathv’s stale review via fadddd2 January 12, 2024 20:14

ztlpn force-pushed the controller-backend-fiber-per-ntp branch from 1e32c2b to fadddd2 Compare January 12, 2024 20:14

ztlpn requested review from bharathv and mmaslankaprv January 12, 2024 20:15

mmaslankaprv approved these changes Jan 15, 2024

ztlpn merged commit c9eb5cb into redpanda-data:dev Jan 15, 2024
19 checks passed

ztlpn deleted the controller-backend-fiber-per-ntp branch April 19, 2024 09:56
