
Enable etcd cluster scaling for userclusters #5571

Merged: 11 commits merged into kubermatic:master on Jul 16, 2020

Conversation

@moelsayed (Contributor) commented Jun 25, 2020:

What this PR does / why we need it:

  • Enables configurable etcd cluster size in the user cluster spec.
  • Managed scale up/down for the etcd StatefulSet.
  • Automatic etcd member join and reconcile.
  • Adds an HTTP-based liveness probe for etcd pods.

Which issue(s) this PR fixes (will close the issue(s) when the PR gets merged):
Fixes #5545 (Support configurable user cluster etcd cluster size)

Special notes for your reviewer:

Documentation:

Does this PR introduce a user-facing change?:

Adds configurable etcd cluster size to user cluster spec. 
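For illustration, a minimal sketch of what the spec change amounts to, assuming a flat field as discussed later in the review (type name and JSON tag are illustrative, not copied from the PR diff):

	// Hypothetical sketch of the user cluster spec field (names assumed).
	type ClusterSpec struct {
		// EtcdClusterSize is the desired number of etcd members; when left
		// unset it is defaulted to 3 (the minimum supported size).
		EtcdClusterSize int `json:"etcdClusterSize,omitempty"`
		// ... other spec fields elided
	}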

@kubermatic-bot added the do-not-merge/work-in-progress, release-note, dco-signoff: yes, team/lifecycle, size/L, approved, needs-rebase and size/XXL labels, and removed the size/L label (Jun 25, 2020)
@kubermatic-bot removed the needs-rebase label (Jul 1, 2020)
@moelsayed changed the title from "[WIP] Enable etcd cluster scaling for userclusters" to "Enable etcd cluster scaling for userclusters" (Jul 1, 2020)
@kubermatic-bot removed the do-not-merge/work-in-progress label (Jul 1, 2020)
if etcdClusterSize > replicas {
return replicas + 1
}
return replicas - 1
@multi-io (Contributor) commented Jul 1, 2020:

Tested with a pre-existing, running cluster with 3 replicas. The statefulset ended up being reduced to 2 replicas, I assume here (.Spec.EtcdClusterSize not set => etcdClusterSize:=0; replicas:=3; isEtcdHealthy:=true => return 2). The etcd launcher then refuses to launch a 2-node cluster because that's smaller than the defaultClusterSize (3), leading to a permanently non-running cluster. Is the idea that the admin must always set .Spec.EtcdClusterSize?

@moelsayed (Contributor, Author) replied:

No, it's defaulted here, which runs as part of the API server. I am not 100% sure if it will default existing clusters as well. But it is a good point, we should be more defensive here.

@multi-io (Contributor) commented Jul 1, 2020:

Confirmed that an existing cluster is migrated to the new launcher properly if .spec.etcdClusterSize is set to 3 first. Maybe we should just treat an unspecified (==0) value for the field as if the field was set to 3.
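A minimal sketch of the defensive defaulting being suggested here, assuming a helper around the replica computation (function and variable names are illustrative, not the PR's actual code):

	// Treat an unset (0) spec value as the default cluster size before
	// deciding whether to scale the StatefulSet up or down.
	func nextReplicaCount(etcdClusterSize, replicas int, isEtcdHealthy bool) int {
		if etcdClusterSize == 0 {
			etcdClusterSize = defaultClusterSize // assumed to be 3
		}
		if !isEtcdHealthy || etcdClusterSize == replicas {
			return replicas // never resize an unhealthy or already-sized cluster
		}
		if etcdClusterSize > replicas {
			return replicas + 1 // scale up one member at a time
		}
		return replicas - 1 // scale down one member at a time
	}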

@multi-io (Contributor) replied:

> No, it's defaulted here, which runs as part of the API server. I am not 100% sure if it will default existing clusters as well. But it is a good point, we should be more defensive here.

Yeah, I saw the defaulting too, but I'm pretty sure that's only called when new clusters are created.

@moelsayed (Contributor, Author) replied:

@multi-io I just pushed a fix for it. Can you please confirm it's working?

@multi-io (Contributor) replied:

Confirmed. Existing etcd cluster was migrated without having to manually set .spec.etcdClusterSize or anything else in the cluster resource.

@moelsayed (Contributor, Author) replied:

// not required, will leave it for now.
os.Setenv(initialStateEnvName, "new")
os.Setenv(initialClusterEnvName, initialMemberList(config.clusterSize, config.namespace))
os.Setenv(initialStateEnvName, e.config.initialState)
Reviewer (Member) commented:

This line and the line underneath it have no error check. I know the original code had no error check either, but it would make sense to add some checks here.
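A sketch of the kind of check being asked for, mirroring the quoted lines (os.Setenv does return an error in Go; the error wrapping shown is just one option, not the PR's actual fix):

	if err := os.Setenv(initialStateEnvName, "new"); err != nil {
		return fmt.Errorf("failed to set %s: %v", initialStateEnvName, err)
	}
	if err := os.Setenv(initialClusterEnvName, initialMemberList(config.clusterSize, config.namespace)); err != nil {
		return fmt.Errorf("failed to set %s: %v", initialClusterEnvName, err)
	}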

os.Setenv(initialStateEnvName, "new")
os.Setenv(initialClusterEnvName, initialMemberList(config.clusterSize, config.namespace))
os.Setenv(initialStateEnvName, e.config.initialState)
os.Setenv(initialClusterEnvName, strings.Join(initialMembers, ","))
Reviewer (Member) commented:

One other thing regarding this line and the one above: are you using these for debugging? Normally env vars are populated via the job instead of being set explicitly in the code.

return fmt.Sprintf("%s.etcd.%s.svc.cluster.local:2380", e.config.podName, e.config.namespace)
}

func (e *etcdCluster) getConfigFromEnv() error {
Reviewer (Member) commented:

The way we actually pass configs to the bin/cmd is via the args/options pattern. In other words, we pass those configs as flags to the cmd instead of env vars. It works as follows (see the sketch below):

  • From the outside (e.g. the job runner), env vars are created and passed to the container.
  • Inside the container/job, the configs are passed as flags.

For more information, take a look at the user-cluster-controller-manager and how those args are read via the hack script.
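A small, self-contained sketch of that flags/options pattern; the flag names and defaults here are assumptions for illustration, not the launcher's actual interface:

	package main

	import (
		"flag"
		"log"
	)

	func main() {
		var (
			namespace   = flag.String("namespace", "", "user cluster namespace")
			podName     = flag.String("pod-name", "", "name of this etcd pod")
			clusterSize = flag.Int("cluster-size", 3, "desired etcd cluster size")
		)
		flag.Parse()
		if *namespace == "" || *podName == "" {
			log.Fatal("both -namespace and -pod-name are required")
		}
		log.Printf("starting etcd launcher: namespace=%s pod=%s size=%d", *namespace, *podName, *clusterSize)
	}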

cmd/etcd-launcher/main.go (resolved review thread)
etcdCmd(config),
os.Environ())
// setup and start etcd command
cmd := exec.Command("/usr/local/bin/etcd", etcdCmd(e.config)...)
Reviewer (Member) commented:

Shouldn't we make sure that the etcd cmd exists before doing any calls or preparation? E.g. if os.Stat(...) says it does not exist, return an error as early as possible.

cmd/etcd-launcher/main.go (resolved review thread)
// just get a key from etcd, this is how `etcdctl endpoint health` works!
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
_, err = client.Get(ctx, "healthy")
cancel()
Reviewer (Member) commented:

Nit: maybe check the error and then cancel? Or use defer cancel().

// just get a key from etcd, this is how `etcdctl endpoint health` works!
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
_, err := e.client.Get(ctx, "healthy")
cancel()
Reviewer (Member) commented:

Nit: maybe check the error and then cancel? Or use defer cancel().
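A sketch of the deferred-cancel variant being suggested, reusing the names from the quoted code (the method name isHealthy is an assumption):

	func (e *etcdCluster) isHealthy() (bool, error) {
		// just get a key from etcd, this is how `etcdctl endpoint health` works!
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		if _, err := e.client.Get(ctx, "healthy"); err != nil {
			return false, err
		}
		return true, nil
	}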

return cmd
}

func (e *etcdCluster) getClient() error {
if e.client != nil {
Reviewer (Member) commented:

I don't understand this one: when the client is not nil you return nil. Shouldn't you return the client here?

@moelsayed (Contributor, Author) replied:

client is set in the etcdCluster struct. If it's nil, I create it and assign it to e.client instead of passing it around.
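A sketch of the lazy initialization described here; the clientv3 config details and the endpoints helper are assumptions, not the PR's actual code:

	func (e *etcdCluster) getClient() error {
		if e.client != nil {
			return nil // already initialized, nothing to do
		}
		client, err := clientv3.New(clientv3.Config{
			Endpoints:   e.endpoints(), // assumed helper returning member URLs
			DialTimeout: 5 * time.Second,
		})
		if err != nil {
			return err
		}
		e.client = client
		return nil
	}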


// if existing, we need to reconcile
var healthy bool
for i := 0; i < 5; i++ {
Reviewer (Member) commented:

So here you are trying to make sure that the storage is healthy by retrying up to 5 times. In general that's valid of course, but what about adding some semantics, something similar to this:

	var (
		healthy  bool
		err      error
		tries    int
		maxTries = 5
	)

	for tries < maxTries {
		if healthy, err = e.isHealthy(); !healthy || err != nil {
			log.Error("...")
			time.Sleep(500 * time.Millisecond)
			tries++
			continue
		}
		log.Info("...")
		break
	}

	if !healthy || err != nil {
		log.Fatal("never became healthy")
	}

Reviewer (Member) commented:

Or you can try out this lovely package:
https://godoc.org/k8s.io/apimachinery/pkg/util/wait#Poll
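A sketch of what the wait.Poll variant could look like, reusing e.isHealthy() and the sugared zap logger from the launcher (interval and timeout values are illustrative):

	if err := wait.Poll(500*time.Millisecond, 30*time.Second, func() (bool, error) {
		healthy, err := e.isHealthy()
		if err != nil {
			log.Warnw("failed health check", zap.Error(err))
			return false, nil // keep polling instead of aborting on a transient error
		}
		return healthy, nil
	}); err != nil {
		log.Fatalw("etcd never became healthy", zap.Error(err))
	}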

@multi-io (Contributor) commented Jul 3, 2020:

I think the cluster size field should be .spec.etcd.clusterSize instead of .spec.etcdClusterSize, i.e. have a spec.etcd map in there that can accommodate future etcd-related settings.
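For illustration, a sketch of the nested shape being proposed (hypothetical; the field names follow this comment, not the merged PR):

	type ClusterSpec struct {
		Etcd *EtcdSettings `json:"etcd,omitempty"`
		// ... other spec fields elided
	}

	type EtcdSettings struct {
		ClusterSize int `json:"clusterSize,omitempty"`
	}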

for { // reconcile dead members
members, err := e.listMembers()
if err != nil {
time.Sleep(10 * time.Second)
Reviewer (Member) commented:

Shouldn't we log the error in the for loop? On this line and the lines underneath it?

continue
}
}
} else { // new etcd member, need to join hte cluster
Reviewer (Member) commented:

typo: s/hte/the

}

config.clusterSize = defaultClusterSize
if s := os.Getenv("ECTD_CLUSTER_SIZE"); s != "" {
if config.clusterSize, err = strconv.Atoi(s); err != nil {
return nil, fmt.Errorf("failed to read ECTD_CLUSTER_SIZE: %v", err)
return fmt.Errorf("failed to read ECTD_CLUSTER_SIZE: %v", err)
Reviewer (Member) commented:

typo: s/ECTD/ETCD

if config.clusterSize > defaultClusterSize {
return nil, fmt.Errorf("ECTD_CLUSTER_SIZE is smaller then %d", defaultClusterSize)
if config.clusterSize < defaultClusterSize {
return fmt.Errorf("ECTD_CLUSTER_SIZE is smaller then %d", defaultClusterSize)
Reviewer (Member) commented:

typo: s/ECTD/ETCD

typo: s/then/than

if err = e.getLocalClient(); err != nil {
return false, err
}
var resp *clientv3.StatusResponse
Reviewer (Member) commented:

This can be declared inside the for loop instead, since you are not using it outside of it:

resp, err = e.localClient.Status(context.Background(), e.endpoint())

continue
}
if !healthy {
if _, err := e.client.MemberRemove(context.Background(), member.ID); err != nil {
Reviewer (Member) commented:

You don't need the check here, and you are not wrapping the error, so please remove the if statement and return the call directly:

return e.client.MemberRemove(context.Background(), member.ID)

@moelsayed force-pushed the resize_etcd branch 2 times, most recently from e95e6f8 to 5780ffd (July 7, 2020 16:23)
@moelsayed (Contributor, Author):

/retest

1 similar comment
@moelsayed (Contributor, Author):

/retest

"strings"
"time"

kubermaticlog "github.com/kubermatic/kubermatic/pkg/log"
Reviewer (Member) commented:

Please regroup the imports respectively (see the example after this list):

  • go sdk imports("context")
  • third party imports("go.uber.org/zap")
  • kubermatic/machine-controller imports
  • k8s imports
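An example of that grouping, using packages that appear elsewhere in this PR (the exact set of imports is illustrative):

	import (
		// go sdk
		"context"
		"strings"
		"time"

		// third party
		"go.uber.org/zap"

		// kubermatic
		kubermaticlog "github.com/kubermatic/kubermatic/pkg/log"

		// k8s
		"k8s.io/apimachinery/pkg/util/wait"
	)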

@@ -88,30 +89,40 @@ func main() {
for { // reconcile dead members
members, err := e.listMembers()
if err != nil {
log.Warnf("failed to list memebers: %v ", err)
Reviewer (Member) commented:

Please use logging with additional context and error wrapping:
log.Warnw("failed to list memebers ", zap.Error(err))


if _, err := os.Stat(etcdCommandPath); os.IsNotExist(err) {
log.Fatalf("can't find etcd command [%s]: %v", etcdCommandPath, err)
Reviewer (Member) commented:

Please follow the pattern that we use regarding contextual logs:
log.Fatalw("can't find command","command-path", etcdCommandPath, zap.Error(err))

break
}
// to avoide race conditions, we will run only on the cluster leader
leader, err := e.isLeader()
if err != nil || !leader {
if err != nil {
Reviewer (Member) commented:

You don't need to check the error twice; instead, make the log message more general:
log.Warnw("failed to remove member, error occurred or didn't get the current leader", zap.Error(err))

In this case, if there was an error it will be printed out as an error

time.Sleep(10 * time.Second)
continue
}
if err := e.removeDeadMembers(); err != nil {
if err := e.removeDeadMembers(log); err != nil {
log.Warnf("failed to remove member: %v", err)
Reviewer (Member) commented:

Please adjust.
log.Warnw("failed to remove member", zap.Error(err))

if err = wait.Poll(1*time.Second, 30*time.Second, func() (bool, error) {
return e.isEndpointHealthy(member.PeerURLs[0])
}); err != nil {
log.Infof("member [%s] is not responding, removing from cluster", member.Name)
Reviewer (Member) commented:

Please use:
log.Infow("member is not responding, removing from cluster", "member-name", member.Name)

if err != nil {
log.Fatalf("failed to get launcher configuration: %v", err)
}

logOpts := kubermaticlog.NewDefaultOptions()
rawLog := kubermaticlog.New(logOpts.Debug, logOpts.Format)
log := rawLog.Sugar()
Reviewer (Member) commented:

This variable collides with the "log" imported package

@moelsayed (Contributor, Author) replied:

The imported package is aliased, so it should be ok?

@@ -82,6 +82,21 @@ func (r *Reconciler) syncHealth(ctx context.Context, cluster *kubermaticv1.Clust
if err != nil {
return err
}
// set ClusterConditionEtcdClusterInitialized, this should be don't only once
Reviewer (Member) commented:

I don't quite understand the comment, can you please make it clearer :-)

},
// {
Reviewer (Member) commented:

If this is not needed can you please remove it?

return etcdClusterSize
}
replicas := int(*set.Spec.Replicas)
isEtcdHealthy := data.Cluster().Status.ExtendedHealth.Etcd == kubermaticv1.HealthStatusUp
Reviewer (Member) commented:

Can you please move this line to line 341? This line is evaluated but not used if etcdClusterSize == replicas.

@moelsayed (Contributor, Author):

/test pre-kubermatic-e2e-aws-flatcar-1.18

@moelsayed (Contributor, Author):

/retest

1 similar comment
@moelsayed (Contributor, Author):

/retest

@moadqassem (Member):

/approve
/lgtm

@kubermatic-bot added the lgtm label (Jul 16, 2020)
@kubermatic-bot (Contributor):

LGTM label has been added.

Git tree hash: 95798e4abc12a40a47b037e6c9551941f4531c5c

@kubermatic-bot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: moadqassem, moelsayed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubermatic-triage-bot:

/retest
This bot automatically retries jobs that failed/flaked on approved PRs

Review the full test history

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

Also, here is a cat.
/meow

@kubermatic-bot (Contributor):

@kubermatic-triage-bot: cat image


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@moelsayed (Contributor, Author):

/retest

@kubermatic-bot merged commit 77fb2bd into kubermatic:master (Jul 16, 2020)
@kubermatic-bot added this to the v2.15 milestone (Jul 16, 2020)
Labels
approved, dco-signoff: yes, lgtm, release-note, size/XXL
Development

Successfully merging this pull request may close these issues.

Support configurable user cluster etcd cluster size
5 participants