etcd container fails healthcheck probe due to context deadline exceeded #12755

Closed
liiri opened this issue Mar 9, 2021 · 3 comments

@liiri

liiri commented Mar 9, 2021

I have etcd 3.4.13-0 running in my bare metal Kubernetes cluster.

I'm seeing many of the infamous "error:context deadline exceeded" / "took too long (2.000051514s) to execute" warnings, and I'm fairly sure there is an actual disk issue behind them.

However, I'm more concerned with the fact that my container gets shut down after running for some time, due to Kubernetes health check failures. What I see in the logs:

...
2021-03-09 08:16:54.404418 W | etcdserver: read-only range request "key:\"/registry/health\" " with result "error:context deadline exceeded" took too long (2.000051514s) to execute
WARNING: 2021/03/09 08:16:54 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2021-03-09 08:16:55.901302 W | etcdserver: failed to revoke 74867816086e2e90 ("etcdserver: request timed out")

What actually makes the request time out here? Is the "took too long" warning actually an error, or is it in the logs but unrelated to the gRPC error? Where is the context deadline value configured?

@ptabor
Contributor

ptabor commented Mar 9, 2021

error:context deadline exceeded means that the deadline was reached and the request failed.

Since a user can set arbitrarily short deadlines, from etcd's perspective it's just a 'warning' that one of the callers was left unsatisfied.
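
For illustration, a minimal Go clientv3 sketch of the kind of call that produces this pattern; the endpoint, key, and 2-second timeout are assumptions for the sketch (roughly matching the ~2.000051514s in the log), not the actual kube-apiserver code:

package main

import (
	"context"
	"fmt"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	// Assumed endpoint; TLS configuration omitted for brevity.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// The deadline lives in the caller's context, not in etcd's configuration.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// A read of the health key similar to the one in the log. If the backend
	// (e.g. a slow disk) cannot serve it before the deadline, the client sees
	// "context deadline exceeded" and etcd logs the "took too long" warning.
	if _, err := cli.Get(ctx, "/registry/health"); err != nil {
		fmt.Println("read failed:", err)
	}
}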

k8s has recent changes that attach fewer objects to individual 'leases', so lease revocation failures should become less likely.
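
A rough sketch of the lease mechanism behind the "failed to revoke 74867816086e2e90" line, using the same assumed client setup as above; the key name and TTL are made up for illustration:

package main

import (
	"context"
	"fmt"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func demoLease(cli *clientv3.Client) error {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Grant a lease with a 15-second TTL (arbitrary for this sketch).
	lease, err := cli.Grant(ctx, 15)
	if err != nil {
		return err
	}

	// Attach a key to the lease; the kube-apiserver attaches TTL'd objects
	// such as Events to leases in a similar way, so one lease can carry many keys.
	if _, err := cli.Put(ctx, "/registry/example", "v", clientv3.WithLease(lease.ID)); err != nil {
		return err
	}

	// Revoking the lease deletes every attached key in a single write. On a
	// slow disk that write can time out, which is what "failed to revoke ...
	// (etcdserver: request timed out)" reports.
	_, err = cli.Revoke(ctx, lease.ID)
	return err
}

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"}, // assumed endpoint, TLS omitted
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	if err := demoLease(cli); err != nil {
		fmt.Println("lease demo failed:", err)
	}
}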

@liiri
Author

liiri commented Mar 9, 2021

So in this case, the user is kube-apiserver?

Can you please explain (or refer me to something explaining) how leases affect the health API?

@ptabor
Contributor

ptabor commented Mar 9, 2021

ptabor closed this as completed Mar 9, 2021