
k8s: Wait for restarted broker to catch up #7594

Conversation

@RafalKorepta (Contributor) commented Dec 1, 2022

UnderReplicatedPartitionThreshold regulates when a rolling update will continue with
restarts. The procedure can be described as follows:

  1. The rolling update checks whether the Pod specification needs to be replaced, and deletes the Pod
  2. The deleted Redpanda Pod is put into maintenance mode (a postStart hook will disable
    maintenance mode when the new Pod starts)
  3. The rolling update waits for the Pod to be in the Ready state
  4. The rolling update checks whether the cluster is in a healthy state
  5. The rolling update checks whether the restarted Redpanda Pod's admin API Ready endpoint returns an HTTP 200 response
  6. Each under-replicated partition metric is compared against UnderReplicatedPartitionThreshold (see the sketch after this section)
  7. The rolling update moves on to the next Redpanda Pod

The metric vectorized_cluster_partition_under_replicated_replicas is used in the comparison.

The mentioned metric has the following help description:
vectorized_cluster_partition_under_replicated_replicas Number of under replicated replicas

By default, UnderReplicatedPartitionThreshold is 0, which means all partitions need to catch up without any lag.
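As a rough sketch of the check in steps 5 and 6 (not the PR's exact code): fetch the restarted broker's /metrics endpoint and parse it with the Prometheus text-format parser, then compare every gauge of the metric against the threshold. The function name, URL, and error wording below are illustrative assumptions; the parser is github.com/prometheus/common/expfmt, which matches the `m.Gauge`/`metricFamily` usage visible in the quoted review snippets.

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/prometheus/common/expfmt"
)

// underReplicatedWithinThreshold fetches a broker's /metrics endpoint and
// returns an error if any under-replicated-replicas gauge exceeds threshold.
func underReplicatedWithinThreshold(metricsURL string, threshold float64) error {
	resp, err := http.Get(metricsURL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var parser expfmt.TextParser
	families, err := parser.TextToMetricFamilies(resp.Body)
	if err != nil {
		return err
	}

	mf, ok := families["vectorized_cluster_partition_under_replicated_replicas"]
	if !ok {
		return nil // metric not exposed; nothing to compare
	}
	for _, m := range mf.GetMetric() {
		if m.GetGauge() == nil {
			continue // no gauge value recorded for this label set
		}
		if v := m.GetGauge().GetValue(); v > threshold {
			return fmt.Errorf("broker reported under replicated partitions: %v > threshold %v", v, threshold)
		}
	}
	return nil
}

func main() {
	// Hypothetical per-Pod address; 9644 is Redpanda's default admin API port.
	if err := underReplicatedWithinThreshold("http://redpanda-0.redpanda:9644/metrics", 0); err != nil {
		fmt.Println("broker still catching up:", err)
	}
}
```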

Backports Required

  • none - not a bug fix
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

UX Changes

Release Notes

Features

  • Wait for under-replicated partitions to catch up

REF

#3023 (comment)

@RafalKorepta RafalKorepta requested a review from a team as a code owner December 1, 2022 15:50
@RafalKorepta RafalKorepta force-pushed the rk/gh-3023/check-under-replicated-partitions-in-upgrade-procedure branch from c5fe25d to 38e0c58 on January 2, 2023 10:21
@nicolaferraro (Member) left a comment

Just a minor issue on the logic. Other than that, it looks good.

		continue
	}
	if m.Gauge == nil {
		r.logger.Info("cluster_partition_under_replicated_replicas metric does not have value", "labels", m.Label)
Member:
continue missing

Contributor (Author):
thx added

@@ -139,6 +140,7 @@ func NewStatefulSet(
	decommissionWaitInterval,
	logger.WithValues("Kind", statefulSetKind()),
	nil,
	0,
Member:
It seems there's no way to customize this. Is this expected?

Contributor (Author):
Yes and no. In an ideal world, the Redpanda operator would have a closed loop for rolling updates: when it detects the need for a restart, Redpanda would be throttled to the point where the operator can observe no lag on any Redpanda instance. That is the ideal, but in a real deployment we cannot throttle producers, so there are two options:

  1. a threshold in the Cluster custom resource definition
  2. a threshold in the operator configuration (currently command line flags)

I would lean towards a threshold in the Cluster custom resource definition.

What are your thoughts?

Contributor:
Yeah, I would also feel better if this were customizable - in case the cluster never catches up, we can at least tweak it and let it move on...

I am fine with both options you presented.

Member:
There's a restartConfig section in the CRD where this can fit. I'm thinking of cases where we do an upgrade between slightly incompatible versions and the cluster is unable to replicate partitions until all nodes are at the same version. The flag allows us to unlock the upgrade.
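A hypothetical sketch of how that could look in the CRD types; the field name matches the PR description, but the exact type, JSON tag, and doc comment here are assumptions (the actual diff appears, truncated, later in this conversation):

```go
// RestartConfig configures the rolling-restart behaviour of the cluster.
type RestartConfig struct {
	// DisableMaintenanceModeHooks deactivates the preStop and postStart hooks.
	DisableMaintenanceModeHooks *bool `json:"disableMaintenanceModeHooks,omitempty"`

	// UnderReplicatedPartitionThreshold controls when the rolling update may
	// continue: a restarted broker must report at most this many
	// under-replicated replicas before the next Pod is restarted.
	UnderReplicatedPartitionThreshold int `json:"underReplicatedPartitionThreshold,omitempty"`
}
```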

if err = r.queryRedpandaUnderReplicatedPartition(ctx, &adminURL); err != nil {
	return &RequeueAfterError{
		RequeueAfter: RequeueDuration,
		Msg:          fmt.Sprintf("under replicated partition is not ready: %v", err),
Member:
(non-blocking) Maybe a better description of the error...

Contributor (Author):
I will try with `broker reported under replicated partitions: %v`.

@RafalKorepta RafalKorepta force-pushed the rk/gh-3023/check-under-replicated-partitions-in-upgrade-procedure branch from 38e0c58 to c5bb116 on January 2, 2023 11:26
@alenkacz (Contributor) left a comment
Thanks, great addition to make the rollout safer! I left some comments though...

src/go/k8s/pkg/resources/statefulset.go (outdated; resolved)

src/go/k8s/pkg/resources/statefulset_update.go (outdated; resolved)
client := &http.Client{Timeout: adminAPITimeout}

// TODO right now we support TLS only on one listener so if external
// connectivity is enabled, TLS is enabled only on external listener. This
Contributor:
I don't think this is true; TLS could be enabled on the internal listener as well. I think the behavior you're describing is true only for the schema registry.

Contributor (Author):
To be brutally honest, I copied it from your previous implementation:

// TODO right now we support TLS only on one listener so if external
// connectivity is enabled, TLS is enabled only on external listener. This
// will be fixed by https://github.com/redpanda-data/redpanda/issues/1084

6235e96#diff-5619b1cd9bdb790f53d380d84b884e952da245749cb78145fdf8f6b328b1be2cR182-R184

}

for name, metricFamily := range metrics {
	if name != "vectorized_cluster_partition_under_replicated_replicas" {
Contributor:
Is this a per-node metric? Don't we have to look at the metric for this particular node?

Contributor (Author):
I will ask core how to discover under-replicated replicas.

Contributor (Author):
To be honest, I forgot how I implemented this function. We query the current Pod. You can see that the headless service with the exact Pod name is used:

Host: fmt.Sprintf("%s.%s", pod.Name, headlessServiceWithPort),

That means the metrics come from a single Pod.

Contributor:
oh I see, thanks
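A minimal sketch of the per-Pod addressing described above; brokerMetricsURL and headlessServiceWithPort are illustrative names, not the operator's exact identifiers:

```go
package main

import (
	"fmt"
	"net/url"
)

// brokerMetricsURL targets one specific broker Pod through the StatefulSet's
// headless service, so the scraped metrics come from that Pod only.
func brokerMetricsURL(podName, headlessServiceWithPort string) string {
	u := url.URL{
		Scheme: "http",
		Host:   fmt.Sprintf("%s.%s", podName, headlessServiceWithPort),
		Path:   "/metrics",
	}
	return u.String()
}

func main() {
	// Hypothetical names: Pod redpanda-0 behind the redpanda headless service.
	fmt.Println(brokerMetricsURL("redpanda-0", "redpanda.default.svc.cluster.local.:9644"))
	// Output: http://redpanda-0.redpanda.default.svc.cluster.local.:9644/metrics
}
```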

@RafalKorepta RafalKorepta force-pushed the rk/gh-3023/check-under-replicated-partitions-in-upgrade-procedure branch 3 times, most recently from 12aac25 to 4515dc8, on January 4, 2023 15:20
@RafalKorepta (Contributor, Author) commented Jan 5, 2023

@RafalKorepta RafalKorepta force-pushed the rk/gh-3023/check-under-replicated-partitions-in-upgrade-procedure branch 3 times, most recently from af460c4 to 77934ba, on January 5, 2023 13:46
alenkacz previously approved these changes Jan 6, 2023

@alenkacz (Contributor) left a comment
LGTM, no blockers

@@ -177,6 +177,26 @@ type ClusterSpec struct {
type RestartConfig struct {
	// DisableMaintenanceModeHooks deactivates the preStop and postStart hooks that force nodes to enter maintenance mode when stopping and exit maintenance mode when up again
	DisableMaintenanceModeHooks *bool `json:"disableMaintenanceModeHooks,omitempty"`

	// UnderReplicatedPartitionThreshold regulate when rolling update will continue with
Contributor:
nit: regulates - or rather controls

Contributor (Author):
Fixed

@@ -217,6 +217,19 @@ func TestDefault(t *testing.T) {
		redpandaCluster.Default()
		assert.Equal(t, v1alpha1.DefaultLicenseSecretKey, redpandaCluster.Spec.LicenseRef.Key)
	})

	t.Run("", func(t *testing.T) {
Contributor:
nit: should the test have a name?

Contributor (Author):
Test name added


func TestEvaluateRedpandaUnderReplicatedPartition(t *testing.T) {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		f, err := os.Open("testdata/metrics.gold.txt")
Contributor:
nit: I think the files are supposed to be called golden, not gold

Contributor (Author):
Thank you, name changed.
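A minimal sketch of the golden-file test pattern discussed in this thread, reusing the hypothetical underReplicatedWithinThreshold helper from the earlier sketch; the path testdata/metrics.golden.txt and the test name are illustrative:

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

func TestUnderReplicatedWithinThreshold(t *testing.T) {
	// Replay a recorded /metrics payload from a golden file so the check can
	// run against realistic broker output without a live cluster.
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeFile(w, r, "testdata/metrics.golden.txt")
	}))
	defer ts.Close()

	if err := underReplicatedWithinThreshold(ts.URL, 0); err != nil {
		t.Errorf("expected all partitions caught up at threshold 0: %v", err)
	}
}
```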

Rafal Korepta added 2 commits January 6, 2023 11:13
When a rolling restart is performed, the broker might need to catch up with
other replicas. This change parses the /metrics endpoint to see if
`vectorized_cluster_partition_under_replicated_replicas` is under the threshold.

Kminion will create artificial load so that there are values for the
`vectorized_cluster_partition_under_replicated_replicas` metric.
@RafalKorepta RafalKorepta force-pushed the rk/gh-3023/check-under-replicated-partitions-in-upgrade-procedure branch from 77934ba to f40b53c on January 6, 2023 10:14
@RafalKorepta RafalKorepta merged commit 5edf3ad into redpanda-data:dev Jan 6, 2023
joejulian pushed a commit to joejulian/redpanda that referenced this pull request Mar 10, 2023
joejulian pushed a commit to joejulian/redpanda that referenced this pull request Mar 24, 2023