New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduces KubeAPIReadinessChecker used by startup monitor to assess Kube API server readiness/health condition #1180
introduces KubeAPIReadinessChecker used by startup monitor to assess Kube API server readiness/health condition #1180
Conversation
/assign @sttts |
f4601e9
to
92c73fc
Compare
return doKubeReadyCheck(ctx, ch.client, ch.baseRawURL, 3, 5*time.Second) | ||
}, | ||
|
||
// doRevisionCheck in "strict" mode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is strict? Better describe here
} | ||
|
||
// doRevisionCheck checks if the kas pod is running at the expected revision | ||
func doRevisionCheck(ctx context.Context, podClient corev1client.PodInterface, monitorRevision int, strictMode bool) (bool, string, string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be more descriptive with strictMode. What does it mean?
return !strictMode, "NotReady", "waiting for Kube API server pod to show up" | ||
} | ||
if len(apiServerPods.Items) != 1 { | ||
return !strictMode, "NotReady", fmt.Sprintf("unexpected number of Kube API server pods %d, expected only one pod", len(apiServerPods.Items)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we change the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MultiplePods
} | ||
|
||
if revision != monitorRevision { | ||
return false, "NotReady", fmt.Sprintf("the running Kube API is at unexpected revision %d, expected %d", revision, monitorRevision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use better reasons for all of these. NotReady should be reserved for the time we see the pod in the API, but it is not ready yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WrongRevision
func doRevisionCheck(ctx context.Context, podClient corev1client.PodInterface, monitorRevision int, strictMode bool) (bool, string, string) { | ||
apiServerPods, err := podClient.List(ctx, metav1.ListOptions{LabelSelector: "apiserver=true"}) | ||
if err != nil { | ||
return !strictMode, "NotReadyError", fmt.Sprintf("falied to get Kube API server pod due to %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PodListError
return !strictMode, "NotReadyError", fmt.Sprintf("falied to get Kube API server pod due to %v", err) | ||
} | ||
if len(apiServerPods.Items) == 0 { | ||
return !strictMode, "NotReady", "waiting for Kube API server pod to show up" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PodNotRunning
kasPod := apiServerPods.Items[0] | ||
revisionString, found := kasPod.Labels["revision"] | ||
if !found { | ||
return false, "NotReady", fmt.Sprintf("pod %s doesn't have revision label", kasPod.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PodNotRunning
return false, "NotReady", fmt.Sprintf("pod %s doesn't have revision label", kasPod.Name) | ||
} | ||
if len(revisionString) == 0 { | ||
return false, "NotReady", fmt.Sprintf("empty revision label on %s pod", kasPod.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InvalidPod
return false, "NotReady", fmt.Sprintf("empty revision label on %s pod", kasPod.Name) | ||
} | ||
revision, err := strconv.Atoi(revisionString) | ||
if err != nil || revision < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InvalidPod
Transport: utilnet.SetTransportDefaults(&http.Transport{ | ||
TLSClientConfig: tlsConfig, | ||
}), | ||
Timeout: responseTimeout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
http.Client.Timeout may timeout even if the connection was sucesfull
https://cs.opensource.google/go/go/+/refs/tags/go1.16.6:src/net/http/client.go;l=90-94
I fixed recently a bug in kops because of this, https://github.com/kubernetes/kops/pull/11884/files#diff-8a8f23d88ae96f6029ac0558e279ed8850568082c913852bfd45c2b54e26166fR81-R93
We are operating in a 2 seconds timeframe that seems to short, but still can race.
What about having more granularity (I'm making up the values here)?
httpClient := &http.Client{
Transport: &http.Transport{
DialContext: (&net.Dialer{
Timeout: 5 * time.Second,
KeepAlive: 5 * time.Second,
}).DialContext,
TLSHandshakeTimeout: 5 * time.Second,
ResponseHeaderTimeout: 5 * time.Second,
IdleConnTimeout: 5 * time.Second,
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This http client will be talking to Kube API over localhost. In case of an error/timeout requests will be retired for 5 min.
Do we still think 2 seconds is too short?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, ok, this is always localhost, maybe I'm being too paranoid 😄
these requests doesn't go through the apf thing, right? there is no chance the server can take more than 2 seconds to reply? The different timeouts help to known which part timed out exactly, but as you say this may not apply here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these requests doesn't go through the apf thing, right?
it does, we will authN as "system:master" which always is privileged.
there is no chance the server can take more than 2 seconds to reply?
there is, for example getting a pod might take much longer if etcd is slow for example. I can increase the timeout to 5 seconds.
The purpose of these checks is to assess the readiness of the Kube API server and fallback to the previous version in case of any issues. If the API server is unable to serve requests because the connection to etcd is "slow" we should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the Pod.List() and other api calls take 2 seconds on busy CIs, can this be a concern here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, this uses the kubeclient not the http.client
|
||
// SetRestConfig called by startup monitor to provide a valid configuration for authN/authZ against Kube API server | ||
func (ch *KubeAPIReadinessChecker) SetRestConfig(restConfig *rest.Config) { | ||
ch.restConfig = restConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this restConfig limited by the client-go parameters QPS, burst, RateLimiter
?
if affirmative, should we override it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default values QPS=5, Burst=10 should be enough, we are going to send just request per second.
|
||
} | ||
if ch.client == nil { | ||
client, err := createHTTPClient(2*time.Second, ch.restConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, should this be a constant?
const httpClientTimeout = 2* time.Second
} | ||
|
||
if ch.kubeClient == nil { | ||
kubeClient, err := kubernetes.NewForConfig(ch.restConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add a timeout here too? I mean, like in the http.client{}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we should, I slightly reworked the PR
92c73fc
to
ef30a07
Compare
|
||
// note that we will be talking to Kube API over localhost and in case of an error/timeout requests will be retired for 5 min. | ||
// setting the global timeout to a short value seems to be fine | ||
ch.restConfig.Timeout = 4 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aojea increased the timeout to 4s and set QPS to 10 and Burst to 15
healthy: false, | ||
reason: "UnhealthyError", | ||
// we don't check the entire rsp from the server | ||
msg: "falied while performing the check due to", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit , you were carrying over this typo falied
, I see it in 4 places
LGTM |
// TODO: watch /var/log/kube-apiserver/termination.log for the first start-up attempt (beware of the race of startup-monitor startup and kube-apiserver startup). Set Reason=NeverStartedUp when this times out. | ||
// TODO: watch /var/log/kube-apiserver/termination.log for more than one start-up attempt. Set Reason=CrashLooping if more than one is found and the monitor times out. | ||
|
||
// doRevisionCheck in "lazy" mode to avoid false positive - failing readyz check when the previous instance hasn't terminated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is lazy mode?
return doRevisionCheck(ctx, ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision, false) | ||
}, | ||
|
||
// checks etcd health condition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
be precise: this is "check etcd health how kube-apiserver sees it"
|
||
// doRevisionCheck in "lazy" mode to avoid false positive - failing readyz check when the previous instance hasn't terminated | ||
func() (bool, string, string) { | ||
return doRevisionCheck(ctx, ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is a revision check if the revision does not matter?! versus line 107
return doRevisionCheck(ctx, ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision, true) | ||
}, | ||
|
||
// checks if kas pod is in PodRunning phase and has PodReady condition set to true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check that kubelet has reporting readiness of the pod
ef30a07
to
62f2b19
Compare
} | ||
|
||
// loop through a list of ordered checks for assessing Kube API readiness condition | ||
for _, checkFn := range []func(context.Context) (bool, string, string){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😍 – nice and short
noOldRevisionPodExists(ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision), | ||
|
||
// check kube-apiserver /healthz/etcd endpoint | ||
goodETCDEndpoint(ch.client, ch.baseRawURL), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: good(KubeApiserver)HealthzEtcdEndpoint
– this sounds like we check etcd directly.
func doHTTPCheckAndTransform(ctx context.Context, client *http.Client, rawURL string, checkName string, httpCheckFn func(ctx context.Context, client *http.Client, rawURL string) (int, string, error)) (bool, string, string) { | ||
statusCode, response, err := httpCheckFn(ctx, client, rawURL) | ||
if err != nil { | ||
errMsg := fmt.Sprintf("failed while performing the check due to %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this helpful. does err have e.g. the path? Would also follow the normal format with colon: .... the check: %v
62f2b19
to
c34d869
Compare
newRevisionPodExists(ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace), revision), | ||
|
||
// check that kubelet has reporting readiness of the pod | ||
newPodHasStateRunning(ch.kubeClient.CoreV1().Pods(operatorclient.TargetNamespace)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see the last two linked into one check. It's the final check and it should ensure consistency. Now there is a slight race because we do two client calls.
} | ||
|
||
if revision != monitorRevision { | ||
return false, "InvalidPod", fmt.Sprintf("the running Kube API is at unexpected revision %d, expected %d", revision, monitorRevision) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnexpectedRevision
c34d869
to
8982eb6
Compare
…Kube API server readiness/health condition
8982eb6
to
abb0257
Compare
/retest |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: p0lyn0mial, sttts The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
@p0lyn0mial: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
this PR implements the following checks
xref: openshift/enhancements#833