API server creates one etcd client per CRD #111622
Comments
/sig api-machinery
I have a draft fix for this open in #111559. It needs a little work, but I'm interested in finishing it up and hope to get time to work on it this week.
/triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
We had similar issues with our etcd operator, which caused high CPU usage due to the TLS handshakes. I wrote a little client pool for the v3 client back then: it helped a lot, is fairly simple to reason about, and has been somewhat battle-tested in OpenShift for close to a year now.
PRs are, as ever, welcome.
Folks, please read through the full comment thread in #111559.
FYI, #111559 is superseded by #114458.
#114458 has looked stale for a couple of months, though.
Thanks @chaochn47!! We can poke enj next week.
What happened?
Currently, the API server creates one etcd client per CRD. This causes it to use more memory and more TCP connections than it should need. In #111477 we significantly reduced the amount of memory those clients use by updating them to share a single logger, but we should address the underlying issue.
I've heard various folks, including @aojea and @lavalamp, mention that they've looked into addressing this, but I couldn't immediately find a tracking issue. We were previously discussing this in #111476, which was closed by #111477, but there seems to be agreement that the "correct fix" is what is tracked by this issue.
What did you expect to happen?
Presuming that a single etcd client can handle multiple requests at once, I'd expect the API server to use approximately one etcd client (i.e. one https://pkg.go.dev/go.etcd.io/etcd/client/v3#Client) per etcd cluster it's connected to. I say approximately because I see that additional clients are also used for things like health probes.
How can we reproduce it (as minimally and precisely as possible)?
Create a kind cluster and deploy a few CRDs.
Note that 1,937 - 60 = 1,877, which matches the number of CRDs we have loaded (the 1,878 above includes the header line).
Anything else we need to know?
Based on my read of the code, I see roughly:
Kubernetes version
Commit e000a2e (current master)
Cloud provider
OS version
N/A
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)