
KubeAPIErrorBudgetBurn Alert Reason #615

Open

d-m opened this issue May 29, 2021 · 5 comments


d-m commented May 29, 2021

Hello all,

I was hoping that someone might be able to help me understand why the KubeAPIErrorBudgetBurn alert (long: 3d, short: 6h) was firing.
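For reference, my understanding of the alert condition for that long=3d / short=6h window pair is roughly the following (an approximation based on the kubernetes-mixin rules shipped with kube-prometheus; the recording-rule names and factors may differ in my version):

```
# Approximate alert condition for the 3d/6h KubeAPIErrorBudgetBurn pair
# (reconstructed from the kubernetes-mixin rules; names and factors are assumptions).
sum(apiserver_request:burnrate3d) > (1.00 * 0.01)
and
sum(apiserver_request:burnrate6h) > (1.00 * 0.01)
```

i.e. both the 3d and the 6h error-budget burn rates have to exceed the threshold at the same time.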

I reviewed the API Server dashboard and noticed that there were large spikes for an entry with no resource label:

[Screenshot: API Server dashboard showing latency spikes for an entry with no resource label]

The dashboard uses the query cluster_quantile:apiserver_request_duration_seconds:histogram_quantile{verb="read"}.
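As far as I can tell, that recording rule boils down to something like this (an approximation taken from the kubernetes-mixin rules; the exact definition may differ in my version):

```
# Approximate definition of the "read" quantile recording rule used by the
# dashboard (quantile 0.99 shown; 0.9 and 0.5 variants exist as well).
histogram_quantile(0.99,
  sum(rate(apiserver_request_duration_seconds_bucket{job="apiserver", verb=~"LIST|GET"}[5m])) without (instance, pod)
)
```

so the resource label should normally survive that aggregation.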

I also read through #464 and the very helpful runbook mentioned in a comment in that ticket. The only example query in the runbook that returned any results was the resource-scoped slow read request query, but it didn't have a resource name either:

[Screenshot: resource-scoped slow read request query results, with an empty resource label]

Any suggestions for next steps would be appreciated.

Thanks.

metalmatze (Member) commented:

You should be able to figure out the slow resource by removing the sum() so that you only have the rate, which won't be aggregated anymore.
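For example, something roughly like this (just a sketch; the selectors and the 1s threshold are taken from the kubernetes-mixin SLO rules and may differ in your version):

```
# Per-series rate of slow (>1s) resource-scoped read requests, with the
# outer sum() removed so the resource/verb labels are preserved.
  rate(apiserver_request_duration_seconds_count{job="apiserver", verb=~"LIST|GET", scope="resource"}[6h])
- ignoring(le)
  rate(apiserver_request_duration_seconds_bucket{job="apiserver", verb=~"LIST|GET", scope="resource", le="1"}[6h])
```

If that produces too many series, wrapping both sides in sum by (resource, verb) is a middle ground that still keeps the interesting labels.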

paulfantom (Member) commented:

We might want to link to https://github.com/prometheus-operator/kube-prometheus/wiki/KubeAPIErrorBudgetBurn somewhere and/or improve it.

mihail-velikov commented:

Hello everyone,

Since last week I have also been getting this alert, and I am pretty much clueless about how to proceed.

Our cluster is deployed using kubespray on:
k8s 1.19.2
OS: Ubuntu 20.04
3 masters - 4 CPU/16 GB RAM
20 workers - 8 CPU/64 GB RAM
All of this is hosted on-premise with VMware as the underlying hypervisor and Calico as the network plugin, with VXLAN and IPinIP disabled. The master nodes are disabled for scheduling and thus run only the cluster components + etcd.

Looking at the API dashboard, I noticed that we have slow write SLI requests:
[Screenshot: API dashboard write SLI panel, 2021-08-26]

The two slow queries seem to be related to "ingresses" and "pods". I checked the API server logs and saw that some "Patch" requests for ingresses take a very long time. Example:
I0826 10:17:07.864302 1 trace.go:205] Trace[162001560]: "Patch" url:/apis/extensions/v1beta1/namespaces/ews-int/ingresses/ews-int-redis-commander-generic,user-agent:kubectl/v1.21.0 (linux/amd64) kubernetes/cb303e6,client:172.17.42.247 (26-Aug-2021 10:16:59.314) (total time: 8549ms):

I suspect that this is related to the old API endpoint "apis/extensions/v1beta1/", and I will double-check that by removing these specific ingresses. I have already checked node CPU/RAM usage on the masters and it is very low. I have also checked the etcd logs and they don't show any obvious issues - no slow queries, disk sync problems, etc.

Regarding the slow pod write requests: I have no idea how to investigate this further besides enabling "profiling" for the API server.
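Is something along these lines a reasonable way to at least break the slow write rate out by resource and verb (a rough sketch on my side; the 1s threshold and the selectors are assumptions taken from the mixin's write SLO rules), or is there a better approach?

```
# Rough sketch: rate of slow (>1s) write requests broken out by resource
# and verb; threshold and selectors assumed from the kubernetes-mixin rules.
  sum by (resource, verb) (rate(apiserver_request_duration_seconds_count{job="apiserver", verb=~"POST|PUT|PATCH|DELETE"}[6h]))
-
  sum by (resource, verb) (rate(apiserver_request_duration_seconds_bucket{job="apiserver", verb=~"POST|PUT|PATCH|DELETE", le="1"}[6h]))
```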

Any hints will be greatly appreciated.

Kind Regards,
Mihail Velikov

mihail-velikov commented:

Update:
It seems that my suspicion was incorrect. We updated all ingresses to the latest API version, but the problem persists.
Additionally, I tried enabling profiling on the API server, but not much more information about the slow requests showed up in the logs.

povilasv (Contributor) commented:

One approach would be to use tracing, if you are running a newer k8s version -> https://kubernetes.io/blog/2021/09/03/api-server-tracing/
