kube-apiserver oom, list resource consume too much memory cause json decode #125580

Open
V0idk opened this issue Jun 19, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@V0idk

V0idk commented Jun 19, 2024

What happened?

kube-apiserver handles concurrent requests poorly. In our test, 20 concurrent requests drove memory usage up to 12 GB, even though the output of "kubectl get crd -A -o yaml" is only 20 MB.

  1. Why does serialization consume so much memory? Is there any optimization mechanism?
  2. Another point to note: when I start 100 watches concurrently (a watch initially delivers all existing items as ADDED events, which is equivalent to a list), kube-apiserver only goes up to 3 GB, while 10 concurrent LIST operations go up to 8 GB. Why is the memory usage difference between watches and lists so large?
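
To put the reported numbers side by side, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes all of the reported memory is attributable to the in-flight requests (ignoring the apiserver's baseline usage), takes 1 GB as 1024 MB, and uses the 20 MB YAML output size as a stand-in for the payload, which may differ from what the server actually encodes. The function names are illustrative only.

```python
# Figures quoted in the report above; sizes in MB (1 GB taken as 1024 MB).
PAYLOAD_MB = 20  # size of the `kubectl get crd -A -o yaml` output

# label -> (number of concurrent requests, observed total memory in MB)
observations = {
    "20 concurrent LISTs": (20, 12 * 1024),
    "10 concurrent LISTs": (10, 8 * 1024),
    "100 concurrent WATCHes": (100, 3 * 1024),
}


def per_request_mb(requests, total_mb):
    """Average memory per request, assuming zero baseline usage."""
    return total_mb / requests


def amplification(requests, total_mb, payload_mb=PAYLOAD_MB):
    """How many times larger the per-request memory is than the payload."""
    return per_request_mb(requests, total_mb) / payload_mb


for label, (n, total) in observations.items():
    print(f"{label}: {per_request_mb(n, total):.1f} MB/request, "
          f"{amplification(n, total):.1f}x the payload")
```

Under these assumptions each LIST costs several hundred MB (tens of times the payload), while each WATCH costs on the order of the payload itself.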

(4 images)

After I specify resourceVersion=0, memory usage does not improve:

(2 images)

What did you expect to happen?

Why does serialization consume so much memory? Is there any optimization mechanism?

How can we reproduce it (as minimally and precisely as possible)?

Just run `kubectl get crd -A -o yaml &` several times so the requests execute concurrently.
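
One possible way to script the reproduction is sketched below. It assumes `kubectl` is on the PATH and already configured for the target cluster; the helper names `run_once` and `run_concurrently` are hypothetical and not part of any Kubernetes tooling. Output is discarded because only the load on the server matters here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# The command from the report; adjust if your setup differs.
KUBECTL_LIST_CRDS = ["kubectl", "get", "crd", "-A", "-o", "yaml"]


def run_once(cmd):
    """Run one command, discard its output, and return its exit code."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def run_concurrently(cmd, concurrency):
    """Start `concurrency` copies of cmd at once and collect their exit codes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(run_once, [cmd] * concurrency))
```

For example, `run_concurrently(KUBECTL_LIST_CRDS, 20)` would issue the 20 concurrent LIST requests described in the report while you watch kube-apiserver memory separately.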

Anything else we need to know?

No response


@V0idk V0idk added the kind/bug Categorizes issue or PR as related to a bug. label Jun 19, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 19, 2024
@V0idk
Author

V0idk commented Jun 19, 2024

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 19, 2024
@nayihz
Contributor

nayihz commented Jun 19, 2024

Can this issue be reproduced? What's your k8s cluster version? Did you try to reproduce it in another cluster?

@mauri870
Member

Additionally, can you share the actual profile file? Thanks.

@seans3
Contributor

seans3 commented Jun 20, 2024

/triage accepted

/assign @benluddy

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 20, 2024
@benluddy
Contributor

/cc @p0lyn0mial

Each watch event is serialized to a buffer and flushed to the response one at a time, whereas for lists, complete list objects are serialized to a buffer before writing to the response. I don't think truly streaming encoders are coming soon, but https://kep.k8s.io/3157 is targeting beta in v1.31 and will allow "lists" to be performed using (and have approximately the same cost as) the watch mechanism.
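
The difference described above can be illustrated with a toy model. This is not kube-apiserver code: it is a minimal Python sketch that only models the size of the encode buffer, with hypothetical helper names (`make_items`, `write_list`, `write_watch`) and made-up item sizes. It ignores decoded in-memory objects, garbage collection, and everything else that contributes to real memory usage.

```python
import io
import json


def make_items(count, payload_bytes):
    """Build `count` fake objects, each carrying `payload_bytes` of data."""
    return [
        {"metadata": {"name": f"item-{i}"}, "data": "x" * payload_bytes}
        for i in range(count)
    ]


def write_list(items, response):
    """LIST style: encode the complete list into one buffer, then write it.

    Returns the peak buffer size, which is the size of the whole response.
    """
    buf = json.dumps({"kind": "List", "items": items}).encode()
    response.write(buf)
    return len(buf)


def write_watch(items, response):
    """WATCH style: encode and flush one event at a time.

    Returns the peak buffer size, which is the largest single event.
    """
    peak = 0
    for item in items:
        buf = json.dumps({"type": "ADDED", "object": item}).encode() + b"\n"
        response.write(buf)
        peak = max(peak, len(buf))
    return peak


items = make_items(1000, 1024)
list_peak = write_list(items, io.BytesIO())
watch_peak = write_watch(items, io.BytesIO())
print(f"list peak buffer:  {list_peak} bytes")
print(f"watch peak buffer: {watch_peak} bytes")
```

In this model the list buffer grows with the number of items, while the watch buffer stays at roughly the size of one item, which is consistent with the gap between LIST and WATCH memory usage reported in this issue.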

@DrAuYueng
Contributor

/cc
