cluster: set feature_table version during bootstrap #8225

jcsp · 2023-01-13T18:01:13Z

cluster: set feature_table version during bootstrap

In some circumstances, a cluster can have two nodes up+bootstrapped,
with an active controller raft group, but a third founding node
trying to join the cluster by requesting an ID (because this node
reported in earlier, but had trouble following through with the
bootstrap process).

Because handing out node IDs is behind a feature gate, this
would fail, and the cluster logical version would not advance
because there was a node in the members_table who had not
reported its logical version.

Avoid this class of issues by including logical version in
the controller bootstrap event, so that as soon as there is
a bootstrap message in the log, we are certain to have activated
all the features that a node might need to proceed.

Fixes #8203
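
For readers less familiar with the failure mode, here is a minimal sketch of the stall described above. It assumes the cluster-wide version can only be derived once every member has reported, and that it is then the lowest reported value; that rule is an assumption made for illustration. The names `member_versions`, `agreed_version` and `effective_version` are hypothetical and are not Redpanda code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical stand-ins for illustration only.
using node_id = int;
using cluster_version = std::int64_t;
constexpr cluster_version invalid_version = -1; // assumed sentinel value

// Each known member maps to the logical version it has reported, if any.
using member_versions = std::map<node_id, std::optional<cluster_version>>;

// Assumed rule: the cluster version is the lowest version reported by all
// members. A single member that has not reported blocks any conclusion.
inline cluster_version agreed_version(const member_versions& members) {
    cluster_version lowest = invalid_version;
    for (const auto& entry : members) {
        const auto& reported = entry.second;
        if (!reported.has_value()) {
            return invalid_version; // the stall: one silent member
        }
        assert(*reported >= 0);
        if (lowest == invalid_version || *reported < lowest) {
            lowest = *reported;
        }
    }
    return lowest;
}

// With a founding version carried in the bootstrap record, a node can use it
// as a floor even while a member is still silent.
inline cluster_version effective_version(
  const member_versions& members,
  std::optional<cluster_version> founding_version) {
    const cluster_version agreed = agreed_version(members);
    if (founding_version.has_value() && *founding_version > agreed) {
        return *founding_version;
    }
    return agreed;
}
```

With two members reporting and a third silent, `agreed_version` stays at the invalid sentinel, while `effective_version` with a founding version returns that founding version.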

Backports Required

UX Changes

None

Release Notes

Bug Fixes

An issue is fixed that could prevent clusters forming successfully when network disruptions happened during initial startup of the founding nodes.

andrwng · 2023-01-13T18:25:04Z

src/v/cluster/bootstrap_backend.cc

+        co_await _feature_table.invoke_on_all(
+          [v = cmd.value.founding_version](features::feature_table& ft) {
+              ft.set_active_version(v);


I think we need to avoid doing this if the version is lower than the current active version, e.g. if we've already loaded a feature table snapshot and are now replaying the controller log that was started on an older version.

Good catch, you're right. I think we can be even stricter and only set the active version if the current active version is invalid_version: that should only happen when we have a totally default-constructed feature table.
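
A minimal sketch of the stricter guard discussed in this thread. `feature_table_sketch` and `apply_founding_version` are hypothetical stand-ins rather than the real `features::feature_table` API, and the value `-1` for `invalid_version` is an assumption.

```cpp
#include <cassert>
#include <cstdint>

using cluster_version = std::int64_t;
constexpr cluster_version invalid_version = -1; // assumed sentinel value

// Hypothetical, heavily simplified stand-in for features::feature_table.
class feature_table_sketch {
public:
    cluster_version get_active_version() const { return _active; }

    void set_active_version(cluster_version v) {
        assert(v != invalid_version);
        _active = v;
    }

private:
    // A default-constructed table has no version yet.
    cluster_version _active = invalid_version;
};

// Apply the founding version from a bootstrap record only if the table is
// still untouched. A table that was already populated (for example from a
// snapshot) keeps its version, so replaying an old controller log cannot
// move it backwards. Returns true if the version was applied.
inline bool apply_founding_version(
  feature_table_sketch& ft, cluster_version founding_version) {
    if (ft.get_active_version() != invalid_version) {
        return false;
    }
    ft.set_active_version(founding_version);
    return true;
}
```

Checking for `invalid_version` rather than comparing against the founding version covers the snapshot-then-replay case without needing any ordering logic.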

jcsp · 2023-01-13T21:39:57Z

This caused a failure in test_feature_table_snapshots

The "normal" set_active_version method increments the version but does not activate features (though it may make them available). This is because activating features on upgrade is optional, and that optionality is respected at time of writing to the controller log, not replaying it. The reason for making auto-activation optional is to enable cautious upgrades, but during cluster bootstrap this motivation goes away, so we will unconditionally switch on all the available features when we see the bootstrap message's cluster version in the controller log. That is what this function does, as well as calling through to set_active_version.

This enables nodes reading the record to fast-forward their feature table to this version, without waiting for the feature_manager logic to see all nodes' versions and activate that way. Usually this is superfluous, because a founding triplet of nodes will be up promptly, but in some cases we might run with just two nodes for some time, and those two nodes would otherwise wait for the third to report its version before advancing the feature table version.
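
The difference between the two code paths can be sketched roughly as follows. This is an illustrative model only: the type names, the three-state enum, the method name `bootstrap_active_version` and the feature names used in the usage note are assumptions, not the real feature table implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using cluster_version = std::int64_t;

enum class feature_state { unavailable, available, active };

// Hypothetical feature descriptor: usable once the cluster reaches
// require_version.
struct feature_sketch {
    std::string name;
    cluster_version require_version;
    feature_state state = feature_state::unavailable;
};

class feature_table_sketch {
public:
    explicit feature_table_sketch(std::vector<feature_sketch> features)
      : _features(std::move(features)) {}

    // Upgrade path: raise the version and make features available, but
    // leave activation to a separate, optional step.
    void set_active_version(cluster_version v) {
        _active_version = v;
        for (auto& f : _features) {
            if (f.state == feature_state::unavailable
                && f.require_version <= v) {
                f.state = feature_state::available;
            }
        }
    }

    // Bootstrap path: same as above, then switch on everything available.
    void bootstrap_active_version(cluster_version v) {
        set_active_version(v);
        for (auto& f : _features) {
            if (f.state == feature_state::available) {
                f.state = feature_state::active;
            }
        }
    }

    cluster_version get_active_version() const { return _active_version; }

    feature_state state_of(const std::string& name) const {
        for (const auto& f : _features) {
            if (f.name == name) {
                return f.state;
            }
        }
        assert(false && "unknown feature name");
        return feature_state::unavailable;
    }

private:
    cluster_version _active_version = -1; // assumed "no version" sentinel
    std::vector<feature_sketch> _features;
};
```

In this model, a table holding a feature that requires version 7 reports it as `available` after `set_active_version(7)` and as `active` after `bootstrap_active_version(7)`; a feature that requires version 9 stays `unavailable` in both cases.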

It will use this for setting the cluster logical version proactively when it sees one in a bootstrap message.


jcsp · 2023-01-17T17:20:14Z

This one snowballed a bit...

I realized that bootstrap_backend also needs to write a snapshot, because once it has applied the latest version, there is no subsequent feature_manager-driven update that would prompt a snapshot.

While looking at this, I was also reminded that we go through this awkward phase pre-cluster-join where we have no logical cluster version. I've added a commit that fast-forwards the cluster version prior to joining the cluster. This is kind of orthogonal to the issue originally fixed in this PR, but made sense while I had my mind on it.

One thing I haven't done here is to make that node join totally safe by refusing to let newer-version nodes join the cluster (because they would have set their own feature_table to a higher one than the cluster is using). Currently we validate the version of joining nodes, but only to refuse to let too-old nodes join. Maybe some discussion needed on how to make this check, or whether we should instead try to tolerate newer nodes and teach them to rewind their feature table when they join an older version cluster.
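
To make the open question concrete, here is a sketch of one possible shape for the join check. Nothing here is decided: `validate_join`, `join_result` and the `reject_newer` flag are hypothetical, and comparing a single version number per node is a simplifying assumption.

```cpp
#include <cassert>
#include <cstdint>

using cluster_version = std::int64_t;

enum class join_result { accepted, rejected_too_old, rejected_too_new };

// cluster_active: the logical version the cluster is currently using.
// node_version:   the logical version the joining node reports.
// reject_newer:   false models the current behaviour described above (only
//                 too-old nodes are refused); true models the stricter
//                 option of also refusing newer nodes.
inline join_result validate_join(
  cluster_version cluster_active,
  cluster_version node_version,
  bool reject_newer) {
    assert(cluster_active >= 0 && node_version >= 0);
    if (node_version < cluster_active) {
        return join_result::rejected_too_old;
    }
    if (reject_newer && node_version > cluster_active) {
        return join_result::rejected_too_new;
    }
    return join_result::accepted;
}
```

The alternative raised above, teaching newer nodes to rewind their feature table on join, would keep `reject_newer` off and handle the mismatch on the joining node instead.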

jcsp · 2023-01-17T22:47:28Z

The early population of cluster version prior to node join was getting hairy, so I split it off in #8282

andrwng

LGTM, though I agree for the story to be complete we need #8282, which I also think makes sense since a brand new node should expect to only be able to join a cluster of the same version.

jcsp · 2023-01-18T10:15:28Z

CI failure is:

CI Failure (assert in persisted_stm) in EndToEndTopicRecovery.test_restore #8293

jcsp · 2023-01-18T17:58:08Z

/ci-repeat 5

jcsp · 2023-01-19T11:46:14Z

1 failure in those repeats, which was:

CI Failure (stuck consumer causes timeout) in PartitionMovementUpgradeTest.test_basic_upgrade #8207

jcsp · 2023-01-19T11:46:30Z

/backport v22.3.x

vbotbuildovich · 2023-01-19T11:47:39Z

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x 98ba9430855e1d8b0db09b9b1d18cdc16fe3b1e1 9d9539c911d80560b0622e72dc468164523f8420 a4500c8409984da8f5e42f10222f5546c0ed2a59 54120c9e1f982a36963749ea15799eaa9b43c11c 57c70f3e496a4eb7299c1ca844bf3c2552118429


jcsp · 2023-01-20T17:00:26Z

#8340

[v22.3.x] Backport #8225 (cluster: set feature_table version during bootstrap)

jcsp requested review from andrwng and dlex January 13, 2023 18:01

github-actions bot added the area/redpanda label Jan 13, 2023

andrwng reviewed Jan 13, 2023


jcsp force-pushed the issue-8203 branch 2 times, most recently from d7e1be2 to 4368bf8 Compare January 13, 2023 19:35

jcsp force-pushed the issue-8203 branch 2 times, most recently from 5d71e97 to a3de4b1 Compare January 16, 2023 22:52

jcsp added 5 commits January 17, 2023 17:12

cluster: pass a feature table ref into bootstrap_backend

a4500c8

It will use this for setting the cluster logical version proactively when it sees one in a bootstrap message.

cluster: write a feature table snapshot when bootstrapping

57c70f3

jcsp force-pushed the issue-8203 branch from a3de4b1 to ad7f348 Compare January 17, 2023 17:14

jcsp force-pushed the issue-8203 branch from ad7f348 to 57c70f3 Compare January 17, 2023 22:46

jcsp mentioned this pull request Jan 17, 2023

cluster: fast-forward feature table before joining cluster #8282

Merged

6 tasks

jcsp marked this pull request as ready for review January 17, 2023 23:09

jcsp requested a review from andrwng January 17, 2023 23:09

andrwng approved these changes Jan 18, 2023


jcsp merged commit 9e6ede0 into redpanda-data:dev Jan 19, 2023

jcsp deleted the issue-8203 branch January 19, 2023 11:46

jcsp mentioned this pull request Jan 20, 2023

[v22.3.x] Backport #8225 (cluster: set feature_table version during bootstrap) #8340

Merged

6 tasks

andrwng added a commit that referenced this pull request Jan 20, 2023

Merge pull request #8340 from jcsp/pr-8225-v22.3.x

c896fae

[v22.3.x] Backport #8225 (cluster: set feature_table version during bootstrap)

joejulian mentioned this pull request Jan 31, 2023

Helm install fails to form cluster with Got request to assign node ID, but feature not active redpanda-data/helm-charts#290

Closed
