
etcd3 client logger uses significant memory with many CRDs #111476

Closed
negz opened this issue Jul 27, 2022 · 8 comments · Fixed by #111477
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@negz
Contributor

negz commented Jul 27, 2022

What happened?

The @crossplane project uses a lot of CRDs - in some cases almost 2,000. I do mean kind: CustomResourceDefinition here, not custom resources. We've noticed that API server memory consumption balloons when many CRDs are installed - for example with ~1,900 CRDs we see the API server using up to ~8Gi RSS.

A little profiling shows that around 1.5Gi of that memory is being used by the etcd3 client's logger.

With the Logger

With Kubernetes built from e092b6d, and the CRDs at https://github.com/negz/crossplane-scale/tree/dc25bef/etoomanycrds/latest-crds loaded, I'm seeing the API server using up to ~7.5GiB RSS. I see it dip to ~5.5GiB, presumably when garbage collection runs.

$ k get nodes
NAME                                   STATUS   ROLES           AGE     VERSION
e092b6d27bf-withlogger-control-plane   Ready    control-plane   9m47s   v1.25.0-alpha.3.190+e092b6d27bff00

$ k get crds|wc -l
1878

You can see zapCore.newCounters taking up a ton of memory below. Zap is the logger used by the etcd client. Note that this is a sample, so it's actually using more than the indicated ~800MiB.

profile-with-logger
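A back-of-the-envelope check makes that number plausible. Each zap logger built with sampling enabled allocates its own fixed-size counter table. The constants below are assumptions from my reading of zapcore/sampler.go in zap v1.x (a [7][4096] table of counters, each counter two 8-byte atomics), not something the profile reports directly:

```go
package main

import "fmt"

// Rough sketch of the memory zap's sampling core allocates per logger.
// Assumed constants, from zapcore/sampler.go in zap v1.x.
func samplerBytesPerLogger() int {
	const (
		numLevels        = 7    // Debug through Fatal
		countersPerLevel = 4096 // _countersPerLevel
		bytesPerCounter  = 16   // two 8-byte atomic integers
	)
	return numLevels * countersPerLevel * bytesPerCounter
}

func main() {
	perLogger := samplerBytesPerLogger()
	fmt.Printf("per logger: %d KiB\n", perLogger/1024)
	// One logger per etcd client, one client per CRD (see below).
	fmt.Printf("1878 loggers: %d MiB\n", perLogger*1878/(1024*1024))
}
```

Under those assumptions that's ~448KiB per logger, and ~1,878 loggers comes to roughly 820MiB — in the same ballpark as what the profile samples attribute to newCounters.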

Without the Logger

I'm seeing the API server using up to ~6GiB RSS. I see it dip to ~4.5GiB, presumably due to GC runs.

$ k get nodes
NAME                                      STATUS   ROLES           AGE   VERSION
e092b6d27bf-withoutlogger-control-plane   Ready    control-plane   31m   v1.25.0-alpha.3.190+e092b6d27bff00-dirty

$ k get crds|wc -l
1878

You can see zapCore.newCounters no longer appears below.

profile-without-logger

What did you expect to happen?

I would not expect a logger to be responsible for such a significant amount of the API server's memory consumption.

How can we reproduce it (as minimally and precisely as possible)?

You can get the above profile by:

  1. Checking out e092b6d.
  2. kind build node-image . to build a kind node.
  3. kind create cluster --image kindest/node:latest to start a control plane for the above build.
  4. kubectl apply -f https://raw.githubusercontent.com/negz/crossplane-scale/dc25bef/profile.yaml to allow anonymous pprof access
  5. kubectl apply -f the CRDs from https://github.com/negz/crossplane-scale/tree/dc25bef/etoomanycrds/latest-crds
  6. go tool pprof -png https+insecure://localhost:$(kind get kubeconfig|grep server|cut -d: -f4)/debug/pprof/heap

I used https://github.com/nicolargo/glances to see the RSS but I imagine top would show it too. See #111477 for details on how I disabled the logger to get the "without logger" numbers.

Anything else we need to know?

No response

Kubernetes version

v1.25.0-alpha.3.190+e092b6d27bff00

Cloud provider

N/A

OS version

# On Linux:
$ cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="22.05.20220713.365e1b3"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 22.05 (Quokka)"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="22.05 (Quokka)"
VERSION_CODENAME=quokka
VERSION_ID="22.05"

$ uname -a
Linux mael 5.18.7 #1-NixOS SMP Sat Jun 25 13:29:48 UTC 2022 aarch64 GNU/Linux

Install tools

`kind`

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@negz negz added the kind/bug Categorizes issue or PR as related to a bug. label Jul 27, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 27, 2022
@negz
Contributor Author

negz commented Jul 27, 2022

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 27, 2022
@negz negz changed the title API server etcd3 client logger uses significant amounts of memory when many CRDs are installed API server etcd3 client logger uses significant memory with many CRDs Jul 27, 2022
@negz negz changed the title API server etcd3 client logger uses significant memory with many CRDs etcd3 client logger uses significant memory with many CRDs Jul 27, 2022
negz added a commit to negz/kubernetes that referenced this issue Jul 27, 2022
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.

kubernetes#111476

Signed-off-by: Nic Cope <nicc@rk0n.org>
@apelisse
Member

Since you looked at the code (and I'd rather not spend the time doing that), any chance you can tell whether it's being misused?

@jpbetz
Contributor

jpbetz commented Jul 27, 2022

This is a scale dimension where the apiserver has historically scaled poorly. With OpenAPI discovery performance being improved, this looks like it might be a hint at the next bottleneck.

Any idea how many etcd clients were instantiated? I suspect it's quite a lot.

@negz
Contributor Author

negz commented Jul 27, 2022

Since you looked at the code (and I'd rather not spend the time doing that), any chance you can tell whether it's being misused?

Sadly I haven't yet looked any deeper than turning it off. I was slightly optimistic that we might just be able to disable it without any meaningful impact. What I do know is that the offending code seems to be on the etcd3 client side - i.e. it defaults to creating its own logger when one isn't supplied. You can see that code at https://github.com/etcd-io/etcd/blob/v3.5.4/client/pkg/logutil/zap.go#L24.
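For what it's worth, supplying a logger explicitly looks roughly like this on the client side - a minimal configuration sketch assuming etcd client v3.5, where clientv3.Config has a Logger field and the client only falls back to logutil's default when that field is nil:

```go
package main

import (
	clientv3 "go.etcd.io/etcd/client/v3"
	"go.uber.org/zap"
)

func main() {
	// Passing an explicit logger stops the client from building its
	// own. zap.NewNop() discards everything; a single logger shared
	// across clients would also avoid a counter table per client.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"127.0.0.1:2379"},
		Logger:    zap.NewNop(),
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()
}
```

This is just a sketch of the knob, not the exact change made in #111477.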

@negz
Contributor Author

negz commented Jul 27, 2022

Any idea how many etcd clients were instantiated? I suspect it's quite a lot.

I don't - will try to find out.

@jpbetz
Contributor

jpbetz commented Jul 27, 2022

Any idea how many etcd clients were instantiated? I suspect it's quite a lot.

I don't - will try to find out.

The zap logs might tell you if you have the data. If my memory of the code is right, you should get 1 etcd client per CRD right now (plus some for builtin types).

@leilajal
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 28, 2022
@negz
Contributor Author

negz commented Jul 29, 2022

you should get 1 etcd client per CRD right now (plus some for builtin types).

Based on my read of the code, you're right. I see roughly:

  • All of /apis is handled by a *crdHandler
  • When a request comes in, it gets an instance of the CRD (from an informer)
  • It gets serving info from a map of CRDs - if there is none, it...
  • Calls customresource.NewStorage, which creates a genericregistry.Store
  • Calls its CompleteWithOptions method
  • That calls its “Decorator”, which is a genericregistry.StorageWithCacher
  • That calls generic.NewRawStorage, which just wraps factory.Create
  • factory.Create calls newETCD3Storage, which calls newETCD3Client

I'm experiencing technical difficulties getting lsof running in my kind container, but /proc/<pid>/net/tcp confirms we have a connection to etcd per CRD:

$ k get crd|wc -l
1878

$ docker exec -it e092b6d27bf-withlogger-control-plane bash

root@e092b6d27bf-withlogger-control-plane:/# ps -C kube-apiserver
    PID TTY          TIME CMD
    493 ?        00:29:03 kube-apiserver

# With the 1,878 CRDs loaded
# Column 3 is the remote host and port per https://www.kernel.org/doc/Documentation/networking/proc_net_tcp.txt
# 0100007F:094B is hex for 127.0.0.1:2379 (the etcd client port)
root@e092b6d27bf-withlogger-control-plane:/# cat /proc/493/net/tcp|awk '{ print $3 }'|grep '0100007F:094B'|wc -l
1937

# After running kubectl delete crds --all
root@e092b6d27bf-withlogger-control-plane:/# cat /proc/493/net/tcp|awk '{ print $3 }'|grep '0100007F:094B'|wc -l
60

Note that 1,937 - 60 = 1,877 - the number of CRDs we have loaded (the 1,878 above includes the header line).
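The hex matching in that grep can be sanity-checked in code. Here's a small self-contained sketch of the /proc/net/tcp address encoding (IPv4 bytes rendered as a little-endian 32-bit hex value, port in big-endian hex, per the kernel doc linked above):

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// procNetTCPAddr renders ip:port the way /proc/net/tcp does: the four
// IPv4 address bytes as a little-endian 32-bit hex value, then the
// port as big-endian hex.
func procNetTCPAddr(ip string, port uint16) string {
	v4 := net.ParseIP(ip).To4()
	return fmt.Sprintf("%08X:%04X", binary.LittleEndian.Uint32(v4), port)
}

func main() {
	fmt.Println(procNetTCPAddr("127.0.0.1", 2379)) // 0100007F:094B
}
```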

k8s-publishing-bot pushed a commit to kubernetes/apiserver that referenced this issue Aug 2, 2022
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.

kubernetes/kubernetes#111476

Signed-off-by: Nic Cope <nicc@rk0n.org>

Kubernetes-commit: 0e5401c93940126beac45264aa056507b0950075
negz added a commit to negz/kubernetes that referenced this issue Aug 2, 2022
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.

kubernetes#111476

Signed-off-by: Nic Cope <nicc@rk0n.org>
k8s-publishing-bot pushed a commit to kubernetes/apiserver that referenced this issue Aug 10, 2022
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.

kubernetes/kubernetes#111476

Signed-off-by: Nic Cope <nicc@rk0n.org>

Kubernetes-commit: 6204b37b180f597840d5cdf687ab5411cac6ee3c
k8s-publishing-bot pushed a commit to kubernetes/apiserver that referenced this issue Aug 10, 2022
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.

kubernetes/kubernetes#111476

Signed-off-by: Nic Cope <nicc@rk0n.org>

Kubernetes-commit: 9a60b0c5f4b50bf63c21af27bd379394ddee4413
k8s-publishing-bot pushed a commit to kubernetes/apiserver that referenced this issue Aug 10, 2022
This logger is responsible for 20% of the API server's memory usage when
many CRDs are installed. See the below issue for more context.

kubernetes/kubernetes#111476

Signed-off-by: Nic Cope <nicc@rk0n.org>

Kubernetes-commit: c1c18295893282ae32e7be0d09f607680241712f
5 participants