etcd timeout #9320

My Kubernetes master and etcd run on the same VM, which is hitting intermittent "etcd timing out" errors and "apiserver received an error that is not an metav1.Status: etcdserver: request timed out".

Attached are my etcd metrics. Can somebody point out any obvious problems in them? For example, do I need to switch etcd to a faster disk?

Comments
Is this using HDD? etcd is I/O intensive, persisting Raft entries with application data on disk.
Yeah, something doesn't look right with the sync duration.
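(For anyone following along: a quick way to eyeball the sync-duration metrics discussed here is to scrape etcd's Prometheus endpoint and filter for the disk histograms. The sketch below assumes etcd serves `/metrics` on the default client URL `http://127.0.0.1:2379`; adjust the address for your deployment. The two metric names are standard etcd metrics.)

```go
// check_sync.go: dump etcd's disk-sync histograms from its Prometheus endpoint.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Assumes etcd serves metrics on the default client port; adjust as needed.
	resp, err := http.Get("http://127.0.0.1:2379/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// etcd_disk_wal_fsync_duration_seconds tracks Raft WAL fsync latency;
		// etcd_disk_backend_commit_duration_seconds tracks backend commits.
		if strings.Contains(line, "etcd_disk_wal_fsync_duration_seconds") ||
			strings.Contains(line, "etcd_disk_backend_commit_duration_seconds") {
			fmt.Println(line)
		}
	}
}
```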
Thank you for getting back so promptly. Is disk the only thing that causes high sync duration and etcd timeouts? Why would etcd time out so early, right at the beginning of cluster creation? Is Kubernetes stressing etcd too much? Do we have CPU/memory guidelines for etcd/kube-apiserver VMs? Thanks a lot.
I am using either a t2.medium or an m4.large instance on AWS, and according to the AWS documentation, those are EBS-only instances. I think that is slower than a local SSD.
We have a general hardware setup guide at https://github.com/coreos/etcd/blob/master/Documentation/op-guide/hardware.md, but nothing Kubernetes-specific. The metrics indicate a bunch of v3 requests timed out, but these workloads are not that intensive. etcd with a better disk should be able to handle much heavier workloads.
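(To check whether the disk itself is the problem before resizing anything, a rough latency probe that mimics etcd's WAL pattern, append a small entry, fsync, repeat, can help. This is a minimal sketch, not etcd's actual WAL code; the 512-byte entry size and 500 rounds are arbitrary assumptions. Run it with the working directory on the same disk that holds the etcd data dir.)

```go
// fsync_bench.go: rough disk-latency probe mimicking etcd's WAL write pattern.
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	// Create the scratch file on the etcd data disk so the numbers are representative.
	f, err := os.CreateTemp(".", "fsync-bench-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	buf := make([]byte, 512) // roughly the size of a small Raft entry
	var worst time.Duration
	const rounds = 500

	start := time.Now()
	for i := 0; i < rounds; i++ {
		if _, err := f.Write(buf); err != nil {
			panic(err)
		}
		t := time.Now()
		// etcd fsyncs after appending Raft entries; this is the latency that matters.
		if err := f.Sync(); err != nil {
			panic(err)
		}
		if d := time.Since(t); d > worst {
			worst = d
		}
	}
	fmt.Printf("avg %v per write+fsync, worst fsync %v\n",
		time.Since(start)/rounds, worst)
}
```

The etcd tuning docs suggest keeping the 99th percentile of WAL fsync latency below roughly 10ms; if the worst-case numbers from a probe like this sit far above that, the disk, not etcd, is the bottleneck.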
For an EBS-backed instance, you might consider using …