introduces KubeAPIReadinessChecker used by startup monitor to assess Kube API server readiness/health condition #1180

p0lyn0mial · 2021-07-16T14:43:12Z

this PR implements the following checks

checks if we are not dealing with the old kas
checks kube-apiserver /healthz/etcd endpoint
checks kube-apiserver /healthz endpoint
checks kube-apiserver /readyz endpoint
checks if the kas pod is running at the expected revision
checks that kubelet has reported readiness for the new pod

p0lyn0mial · 2021-07-16T14:43:54Z

sttts · 2021-07-19T15:51:22Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+			return doKubeReadyCheck(ctx, ch.client, ch.baseRawURL, 3, 5*time.Second)
+		},
+
+		// doRevisionCheck in "strict" mode


what is strict? Better to describe it here

sttts · 2021-07-19T15:52:10Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+}
+
+// doRevisionCheck checks if the kas pod is running at the expected revision
+func doRevisionCheck(ctx context.Context, podClient corev1client.PodInterface, monitorRevision int, strictMode bool) (bool, string, string) {


be more descriptive with strictMode. What does it mean?

sttts · 2021-07-19T15:57:56Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		return !strictMode, "NotReady", "waiting for Kube API server pod to show up"
+	}
+	if len(apiServerPods.Items) != 1 {
+		return !strictMode, "NotReady", fmt.Sprintf("unexpected number of Kube API server pods %d, expected only one pod", len(apiServerPods.Items))


do we change the name?

MultiplePods

sttts · 2021-07-19T16:00:20Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+	}
+
+	if revision != monitorRevision {
+		return false, "NotReady", fmt.Sprintf("the running Kube API is at unexpected revision %d, expected %d", revision, monitorRevision)


use better reasons for all of these. NotReady should be reserved for the time we see the pod in the API, but it is not ready yet.

WrongRevision
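
To make the point about distinct reasons concrete, here is a rough sketch of a revision check in which every failure path has its own reason. It is not the code from this PR: `pod` is a hypothetical stand-in for `corev1.Pod`, `revisionCheck` and `tolerateMissing` are made-up names (the latter models the non-strict mode, where the diff returns `!strictMode`), and the reason strings are the ones floated in this review thread.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// pod is a hypothetical, minimal stand-in for corev1.Pod.
type pod struct {
	Name   string
	Labels map[string]string
}

// revisionCheck classifies the result of a single pod List call.
// When tolerateMissing is true (the non-strict mode), list errors and an
// unexpected pod count do not fail the check; revision problems always do.
func revisionCheck(pods []pod, listErr error, want int, tolerateMissing bool) (bool, string, string) {
	if listErr != nil {
		return tolerateMissing, "PodListError", fmt.Sprintf("failed to list Kube API server pods: %v", listErr)
	}
	if len(pods) == 0 {
		return tolerateMissing, "PodNotRunning", "waiting for Kube API server pod to show up"
	}
	if len(pods) > 1 {
		return tolerateMissing, "MultiplePods", fmt.Sprintf("found %d Kube API server pods, expected one", len(pods))
	}
	// A missing or empty label yields "" and fails the conversion below.
	rev, err := strconv.Atoi(pods[0].Labels["revision"])
	if err != nil || rev < 0 {
		return false, "InvalidPod", fmt.Sprintf("pod %s has no usable revision label", pods[0].Name)
	}
	if rev != want {
		return false, "UnexpectedRevision", fmt.Sprintf("pod %s is at revision %d, expected %d", pods[0].Name, rev, want)
	}
	return true, "", ""
}

func main() {
	old := []pod{{Name: "kube-apiserver", Labels: map[string]string{"revision": "2"}}}
	fmt.Println(revisionCheck(old, nil, 3, false))
	fmt.Println(revisionCheck(nil, errors.New("connection refused"), 3, true))
}
```

With separate reasons, `NotReady` stays free for the case the comment above describes: the pod is visible in the API but has not become ready yet.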

sttts · 2021-07-19T16:05:56Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+func doRevisionCheck(ctx context.Context, podClient corev1client.PodInterface, monitorRevision int, strictMode bool) (bool, string, string) {
+	apiServerPods, err := podClient.List(ctx, metav1.ListOptions{LabelSelector: "apiserver=true"})
+	if err != nil {
+		return !strictMode, "NotReadyError", fmt.Sprintf("falied to get Kube API server pod due to %v", err)


PodListError

sttts · 2021-07-19T16:06:08Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		return !strictMode, "NotReadyError", fmt.Sprintf("falied to get Kube API server pod due to %v", err)
+	}
+	if len(apiServerPods.Items) == 0 {
+		return !strictMode, "NotReady", "waiting for Kube API server pod to show up"


PodNotRunning

sttts · 2021-07-19T16:06:36Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+	kasPod := apiServerPods.Items[0]
+	revisionString, found := kasPod.Labels["revision"]
+	if !found {
+		return false, "NotReady", fmt.Sprintf("pod %s doesn't have revision label", kasPod.Name)


PodNotRunning

sttts · 2021-07-19T16:06:40Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		return false, "NotReady", fmt.Sprintf("pod %s doesn't have revision label", kasPod.Name)
+	}
+	if len(revisionString) == 0 {
+		return false, "NotReady", fmt.Sprintf("empty revision label on %s pod", kasPod.Name)


sttts · 2021-07-19T16:06:50Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		return false, "NotReady", fmt.Sprintf("empty revision label on %s pod", kasPod.Name)
+	}
+	revision, err := strconv.Atoi(revisionString)
+	if err != nil || revision < 0 {


aojea · 2021-07-19T16:48:36Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		Transport: utilnet.SetTransportDefaults(&http.Transport{
+			TLSClientConfig: tlsConfig,
+		}),
+		Timeout: responseTimeout,


http.Client.Timeout may time out even if the connection was successful

https://cs.opensource.google/go/go/+/refs/tags/go1.16.6:src/net/http/client.go;l=90-94

I fixed recently a bug in kops because of this, https://github.com/kubernetes/kops/pull/11884/files#diff-8a8f23d88ae96f6029ac0558e279ed8850568082c913852bfd45c2b54e26166fR81-R93

We are operating in a 2-second timeframe, which seems too short, and it can still race.
What about having more granularity (I'm making up the values here)?

httpClient := &http.Client{
	Transport: &http.Transport{
		DialContext: (&net.Dialer{
			Timeout:   5 * time.Second,
			KeepAlive: 5 * time.Second,
		}).DialContext,
		TLSHandshakeTimeout:   5 * time.Second,
		ResponseHeaderTimeout: 5 * time.Second,
		IdleConnTimeout:       5 * time.Second,
	},
}

This http client will be talking to Kube API over localhost. In case of an error/timeout, requests will be retried for 5 min.

Do we still think 2 seconds is too short?

ah, ok, this is always localhost, maybe I'm being too paranoid 😄
these requests don't go through the apf thing, right? there is no chance the server can take more than 2 seconds to reply? The different timeouts help to know which part timed out exactly, but as you say this may not apply here

these requests don't go through the apf thing, right?

it does, we will authN as "system:master" which always is privileged.

there is no chance the server can take more than 2 seconds to reply?

there is, for example getting a pod might take much longer if etcd is slow. I can increase the timeout to 5 seconds.

The purpose of these checks is to assess the readiness of the Kube API server and fallback to the previous version in case of any issues. If the API server is unable to serve requests because the connection to etcd is "slow" we should fail.

I saw the Pod.List() and other api calls take 2 seconds on busy CIs, can this be a concern here?

ahh, this uses the kubeclient not the http.client

aojea · 2021-07-20T08:52:39Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+
+// SetRestConfig called by startup monitor to provide a valid configuration for authN/authZ against Kube API server
+func (ch *KubeAPIReadinessChecker) SetRestConfig(restConfig *rest.Config) {
+	ch.restConfig = restConfig


is this restConfig limited by the client-go parameters QPS, burst, RateLimiter?
if affirmative, should we override it?

the default values QPS=5, Burst=10 should be enough, we are going to send just one request per second.
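
As a back-of-the-envelope check of that claim, here is a simplified token-bucket simulation with simulated time. It is only a model of how QPS and Burst interact, not client-go's actual rate limiter; `bucket`, `throttled`, and the request pattern (all requests issued at the start of each second) are assumptions made for this sketch.

```go
package main

import "fmt"

// bucket is a simplified token bucket driven by simulated time.
// It is NOT client-go's rate limiter, only a model of QPS and Burst.
type bucket struct {
	qps    float64 // tokens added per second
	burst  float64 // maximum number of tokens held
	tokens float64
}

func newBucket(qps, burst float64) *bucket {
	return &bucket{qps: qps, burst: burst, tokens: burst}
}

// advance refills the bucket for the given number of elapsed seconds.
func (b *bucket) advance(seconds float64) {
	b.tokens += b.qps * seconds
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
}

// tryTake consumes one token and reports whether the request may proceed immediately.
func (b *bucket) tryTake() bool {
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// throttled sends perSecond requests at the start of each simulated second
// and counts how many of them would have had to wait.
func throttled(qps, burst float64, perSecond, seconds int) int {
	b := newBucket(qps, burst)
	delayed := 0
	for s := 0; s < seconds; s++ {
		for i := 0; i < perSecond; i++ {
			if !b.tryTake() {
				delayed++
			}
		}
		b.advance(1)
	}
	return delayed
}

func main() {
	fmt.Println("1 req/s for 5 min:", throttled(5, 10, 1, 300))
	fmt.Println("12 req/s for 5 min:", throttled(5, 10, 12, 300))
}
```

Under this model a caller at one request per second never waits with QPS=5 and Burst=10, while a caller that exceeds the refill rate is throttled once the initial burst is spent.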

aojea · 2021-07-20T08:59:43Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+
+	}
+	if ch.client == nil {
+		client, err := createHTTPClient(2*time.Second, ch.restConfig)


nit, should this be a constant?

const httpClientTimeout = 2 * time.Second

aojea · 2021-07-20T09:02:57Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+	}
+
+	if ch.kubeClient == nil {
+		kubeClient, err := kubernetes.NewForConfig(ch.restConfig)


should we add a timeout here too? I mean, like in the http.client{}

yes we should, I slightly reworked the PR

p0lyn0mial · 2021-07-20T10:24:03Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+
+	// note that we will be talking to Kube API over localhost and in case of an error/timeout requests will be retired for 5 min.
+	// setting the global timeout to a short value seems to be fine
+	ch.restConfig.Timeout = 4 * time.Second


@aojea increased the timeout to 4s and set QPS to 10 and Burst to 15

aojea · 2021-07-20T14:16:07Z

pkg/operator/startupmonitorreadiness/readiness_checks_test.go

+			healthy: false,
+			reason:  "UnhealthyError",
+			// we don't check the entire rsp from the server
+			msg: "falied while performing the check due to",


nit: you are carrying over this typo "falied", I see it in 4 places

aojea · 2021-07-20T14:29:05Z

LGTM

sttts · 2021-07-21T09:23:16Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		//	TODO: watch /var/log/kube-apiserver/termination.log for the first start-up attempt (beware of the race of startup-monitor startup and kube-apiserver startup). Set Reason=NeverStartedUp when this times out.
+		//	TODO: watch /var/log/kube-apiserver/termination.log for more than one start-up attempt. Set Reason=CrashLooping if more than one is found and the monitor times out.
+
+		// doRevisionCheck in "lazy" mode to avoid false positive - failing readyz check when the previous instance hasn't terminated


what is lazy mode?

sttts · 2021-07-21T09:24:03Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+			return doRevisionCheck(ctx, ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision, false)
+		},
+
+		// checks etcd health condition


be precise: this is "check etcd health as kube-apiserver sees it"

sttts · 2021-07-21T09:25:07Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+
+		// doRevisionCheck in "lazy" mode to avoid false positive - failing readyz check when the previous instance hasn't terminated
+		func() (bool, string, string) {
+			return doRevisionCheck(ctx, ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision, false)


what is a revision check if the revision does not matter?! versus line 107

sttts · 2021-07-21T09:25:42Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+			return doRevisionCheck(ctx, ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision, true)
+		},
+
+		// checks if kas pod is in PodRunning phase and has PodReady condition set to true


check that kubelet has reported readiness of the pod

sttts · 2021-07-21T13:12:57Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+	}
+
+	// loop through a list of ordered checks for assessing Kube API readiness condition
+	for _, checkFn := range []func(context.Context) (bool, string, string){


😍 – nice and short

sttts · 2021-07-21T13:13:55Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		noOldRevisionPodExists(ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision),
+
+		// check kube-apiserver /healthz/etcd endpoint
+		goodETCDEndpoint(ch.client, ch.baseRawURL),


nit: good(KubeApiserver)HealthzEtcdEndpoint – this sounds like we check etcd directly.

sttts · 2021-07-21T13:20:00Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+func doHTTPCheckAndTransform(ctx context.Context, client *http.Client, rawURL string, checkName string, httpCheckFn func(ctx context.Context, client *http.Client, rawURL string) (int, string, error)) (bool, string, string) {
+	statusCode, response, err := httpCheckFn(ctx, client, rawURL)
+	if err != nil {
+		errMsg := fmt.Sprintf("failed while performing the check due to %v", err)


is this helpful? does err include e.g. the path? I would also follow the normal format with a colon: "... the check: %v"

sttts · 2021-07-22T09:50:36Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+		newRevisionPodExists(ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision),
+
+		// check that kubelet has reporting readiness of the pod
+		newPodHasStateRunning(ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace)),


I would like to see the last two linked into one check. It's the final check and it should ensure consistency. Now there is a slight race because we do two client calls.
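
One way to picture the suggestion: make a single List call and evaluate revision and readiness against that one snapshot, so the pod cannot change between the two decisions. This is a sketch under stated assumptions, not the implementation that was merged; `pod`, `lister`, and `newRevisionPodReady` are hypothetical, and the reason strings are borrowed from elsewhere in this review.

```go
package main

import "fmt"

// pod is a hypothetical, flattened view of the fields the final check needs.
type pod struct {
	Name     string
	Revision int
	Running  bool // pod phase is Running
	Ready    bool // PodReady condition is true
}

// lister models one List call against the API.
type lister func() ([]pod, error)

// newRevisionPodReady makes exactly one List call and evaluates both the
// revision and the readiness of the pod on the same snapshot.
func newRevisionPodReady(list lister, want int) (bool, string, string) {
	pods, err := list()
	if err != nil {
		return false, "PodListError", fmt.Sprintf("failed to list Kube API server pods: %v", err)
	}
	if len(pods) == 0 {
		return false, "PodNotRunning", "waiting for Kube API server pod to show up"
	}
	if len(pods) > 1 {
		return false, "MultiplePods", fmt.Sprintf("found %d Kube API server pods, expected one", len(pods))
	}
	p := pods[0]
	if p.Revision != want {
		return false, "UnexpectedRevision", fmt.Sprintf("pod %s is at revision %d, expected %d", p.Name, p.Revision, want)
	}
	if !p.Running || !p.Ready {
		return false, "NotReady", fmt.Sprintf("pod %s at revision %d is not ready yet", p.Name, p.Revision)
	}
	return true, "", ""
}

func main() {
	calls := 0
	list := func() ([]pod, error) {
		calls++
		return []pod{{Name: "kube-apiserver", Revision: 3, Running: true, Ready: false}}, nil
	}
	ready, reason, msg := newRevisionPodReady(list, 3)
	fmt.Printf("ready=%t reason=%s message=%q list calls=%d\n", ready, reason, msg, calls)
}
```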

sttts · 2021-07-22T09:52:03Z

pkg/operator/startupmonitorreadiness/readiness_checks.go

+	}
+
+	if revision != monitorRevision {
+		return false, "InvalidPod", fmt.Sprintf("the running Kube API is at unexpected revision %d, expected %d", revision, monitorRevision)


UnexpectedRevision

sttts · 2021-07-22T15:33:58Z

/retest
/lgtm
/approve

openshift-ci · 2021-07-22T15:35:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: p0lyn0mial, sttts

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [sttts]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sttts · 2021-07-22T20:44:28Z

/retest

openshift-ci · 2021-07-22T21:31:52Z

@p0lyn0mial: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e-aws-single-node	`abb0257`	link	`/test e2e-aws-single-node`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-ci bot requested review from mfojtik and sttts July 16, 2021 14:43

p0lyn0mial mentioned this pull request Jul 16, 2021

introduces KubeAPIReadinessChecker used by startup monitor to assess Kube API server readiness/health condition #1179

Closed

openshift-ci bot assigned sttts Jul 16, 2021

p0lyn0mial force-pushed the sno-readiness-checks branch 2 times, most recently from f4601e9 to 92c73fc Compare July 19, 2021 14:23

sttts reviewed Jul 19, 2021

aojea reviewed Jul 19, 2021

aojea reviewed Jul 20, 2021

p0lyn0mial force-pushed the sno-readiness-checks branch from 92c73fc to ef30a07 Compare July 20, 2021 10:22

p0lyn0mial commented Jul 20, 2021

aojea reviewed Jul 20, 2021

sttts reviewed Jul 21, 2021

p0lyn0mial force-pushed the sno-readiness-checks branch from ef30a07 to 62f2b19 Compare July 21, 2021 10:55

sttts reviewed Jul 21, 2021

p0lyn0mial force-pushed the sno-readiness-checks branch from 62f2b19 to c34d869 Compare July 22, 2021 07:13

sttts reviewed Jul 22, 2021

p0lyn0mial force-pushed the sno-readiness-checks branch from c34d869 to 8982eb6 Compare July 22, 2021 12:10

p0lyn0mial added 3 commits July 22, 2021 14:13

introduces KubeAPIReadinessChecker used by startup monitor to assess …

b935349

…Kube API server readiness/health condition

pin library-go

4b12802

bump (library-go)

abb0257

p0lyn0mial force-pushed the sno-readiness-checks branch from 8982eb6 to abb0257 Compare July 22, 2021 12:13

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 22, 2021

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 22, 2021

openshift-merge-robot merged commit e1b7cc3 into openshift:master Jul 22, 2021
