
API server creates one etcd client per CRD #111622

Open
negz opened this issue Aug 2, 2022 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@negz
Contributor

negz commented Aug 2, 2022

What happened?

The API server currently creates one etcd client per CRD, which causes it to use more memory and more TCP connections than it needs. In #111477 we significantly reduced the amount of memory those clients use by updating them to share a single logger, but we should address the underlying issue.

I've heard various folks, including @aojea and @lavalamp mention that they've looked into addressing this but I couldn't immediately find a tracking issue. We were previously discussing this in #111476 which was closed by #111477, but there seems to be agreement that the "correct fix" is what is tracked by this issue.

What did you expect to happen?

Presuming that a single etcd client can handle multiple requests at once, I'd expect the API server to use approximately one etcd client (i.e. one https://pkg.go.dev/go.etcd.io/etcd/client/v3#Client) per etcd cluster it's connected to. I say approximately because I see that additional clients are also used for things like health probes.

How can we reproduce it (as minimally and precisely as possible)?

Create a kind cluster and deploy a few CRDs.

$ k get crd|wc -l
1878

# Enter the kind cluster's control plane container.
$ docker exec -it e092b6d27bf-withlogger-control-plane bash

# Find the API server's PID
root@e092b6d27bf-withlogger-control-plane:/# ps -C kube-apiserver
    PID TTY          TIME CMD
    493 ?        00:29:03 kube-apiserver

# Count how many etcd connections exist with the 1,878 CRDs loaded
# Column 3 is the remote host and port per https://www.kernel.org/doc/Documentation/networking/proc_net_tcp.txt
# 0100007F:094B is hex for 127.0.0.1:2379 (the etcd client port)
root@e092b6d27bf-withlogger-control-plane:/# cat /proc/493/net/tcp|awk '{ print $3 }'|grep '0100007F:094B'|wc -l
1937

# Back outside of the kind control plane container, delete all your CRDs
$ k delete crds --all

# Then back inside the container re-count the connections
root@e092b6d27bf-withlogger-control-plane:/# cat /proc/493/net/tcp|awk '{ print $3 }'|grep '0100007F:094B'|wc -l
60

Note that 1,937 - 60 = 1,877, exactly the number of CRDs we had loaded (the 1,878 above includes the header line that wc counts).

Anything else we need to know?

Based on my read of the code, I see roughly:

  • All of /apis is handled by a *crdHandler
  • When a request comes in it gets an instance of the crd (from an informer)
  • Gets serving info from a map of CRDs - if there is none it...
  • Calls customresource.NewStorage which creates a genericregistry.Store
  • Calls its CompleteWithOptions method
  • That calls its “Decorator”, which is a genericregistry.StorageWithCacher
  • That calls generic.NewRawStorage which just wraps factory.Create
  • factory.Create calls newETCD3Storage which calls newETCD3Client

Kubernetes version

Commit e000a2e (current master)

Cloud provider

N/A

OS version

N/A

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@negz negz added the kind/bug Categorizes issue or PR as related to a bug. label Aug 2, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 2, 2022
@negz
Contributor Author

negz commented Aug 2, 2022

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 2, 2022
@negz
Contributor Author

negz commented Aug 2, 2022

I have a draft fix for this open in #111559. It needs a little work but I'm interested in finishing it up and hope to get time to work on it this week.

@cici37
Contributor

cici37 commented Aug 4, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 4, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 2, 2022
@vaibhav2107
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 13, 2022
@tjungblu

tjungblu commented Mar 3, 2023

We had similar issues with our etcd operator, which caused high CPU usage due to repeated TLS handshakes. I wrote a little client pool for the v3 client back then:
https://github.com/openshift/cluster-etcd-operator/blob/master/pkg/etcdcli/etcdcli_pool.go

It helped a lot, is fairly simple to reason about, and has been somewhat battle-tested in OpenShift for close to a year now.

@sftim
Contributor

sftim commented Mar 14, 2023

PRs are, as ever, welcome.

@dims
Member

dims commented Jan 19, 2024

Folks, please read through all the traffic of comments in #111559

@chaochn47

chaochn47 commented Jan 20, 2024

fyi, #111559 is superseded by #114458

@dims

@enj - Actually, I was just chatting with Jordan about it and he came up with a nice idea: instead of reusing the etcd client across everything, maybe we should consider reusing the etcd client across all CRDs (since they are theoretically unbounded) and leave built-in resources as-is (with one etcd client per built-in resource).

While it still carries some risk, I think it's a nice proposal that eliminates the inherent risk of breaking built-in functionality. So I would be supportive of it (and Jordan would be too, as it was his idea :) )

It may require non-trivial changes to this PR, but maybe we should consider it?

#114458 (comment)

#114458 has looked stale for a couple of months, though.

@dims
Member

dims commented Jan 20, 2024

thanks @chaochn47 !! we can poke enj next week
