Add metrics_server_resizer option #8018

Merged
1 commit merged into kubernetes-sigs:master on Sep 28, 2021

oomichi
Contributor

@oomichi oomichi commented Sep 24, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

The addon-resizer container can reduce the CPU and memory resource limits of the metrics-server container in the pod, which caused the container to be OOMKilled.
In addition, the upstream metrics-server manifest [1] doesn't include the addon-resizer container.
So this PR adds a metrics_server_resizer option to control whether the addon-resizer container is deployed. The default value is false, which keeps metrics-server stable in most environments.

Which issue(s) this PR fixes:

Fixes #8010

Does this PR introduce a user-facing change?:

Add a new option `metrics_server_resizer` (defaults to false) to control whether the addon-resizer container is deployed in the metrics-server pod
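
For illustration, a minimal sketch of how the option might be set in a Kubespray inventory. `metrics_server_resizer` is the option added by this PR; `metrics_server_enabled` and the file location mentioned in the comment are assumptions about the surrounding inventory, not something stated in this thread.

```yaml
# Hypothetical location: a group_vars file in your inventory (assumption).
metrics_server_enabled: true    # assumption: the metrics-server addon is turned on

# Option added by this PR. false (the default) leaves the addon-resizer
# container out of the metrics-server pod, so the configured limits stay as set.
metrics_server_resizer: false
```

Setting the option to true would presumably deploy the addon-resizer container again, which can then adjust the limits as described above.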

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 24, 2021
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Sep 24, 2021
@oomichi
Contributor Author

oomichi commented Sep 24, 2021

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 24, 2021
@oomichi oomichi changed the title from "Add metrics_server_resizer_enabled option" to "Add metrics_server_resizer option" Sep 24, 2021
The addon-resizer container can reduce resource limits of cpu and
memory of metrics-server container in the pod, and that caused
OOMKilled.
In addition, the original metrics-server manifest doesn't contain
the addon-resizer container as [1].
So this adds metrics_server_resizer option to control the addon-resizer
container deployment and the default value is false to make it stable
for most environments.

[1]: https://github.com/kubernetes-sigs/metrics-server/blob/527679e5e8a103919c935d0575c20741796bc25d/manifests/base/deployment.yaml
@oomichi
Contributor Author

oomichi commented Sep 24, 2021

Tested and confirmed that this works: the resource limits are unchanged after 10 minutes:

$ kubectl -n kube-system get pods metrics-server-c57c76cf4-db582
NAME                             READY   STATUS    RESTARTS   AGE
metrics-server-c57c76cf4-db582   1/1     Running   0          10m
$ kubectl get pod -l app.kubernetes.io/name=metrics-server -n kube-system -o custom-columns="Name:metadata.name,Containers:spec.containers[*].name,CPU-limit:spec.containers[*].resources.limits.cpu,MEM-limit:spec.containers[*].resources.limits.memory"
Name                             Containers       CPU-limit   MEM-limit
metrics-server-c57c76cf4-db582   metrics-server   100m        200Mi

addon-resizer does actually update the resource limits, as seen in https://github.com/kubernetes/autoscaler/blob/068af5bf7e2bafe558654f9ef5fa0468db077cb9/addon-resizer/nanny/nanny_lib.go#L162-L163

It would be better to make it optional.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 24, 2021
@floryut floryut added kind/feature Categorizes issue or PR as related to a new feature. and removed kind/bug Categorizes issue or PR as related to a bug. labels Sep 24, 2021
Member

@floryut floryut left a comment

Neat 👍

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: floryut, oomichi


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 24, 2021
@ledroide
Contributor

Tested and confirmed.
This PR solves issue #8010 on our side; we rolled it out to 4 different clusters 1 hour ago.
CPU and memory limits are now preserved, as defined in the metrics_server_limits_cpu and metrics_server_limits_memory variables.
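
As a sketch, the two variables named above could be pinned as follows. The values mirror the 100m / 200Mi limits shown in the earlier kubectl output in this thread and are otherwise illustrative assumptions, not the settings used on the clusters mentioned here.

```yaml
# Illustrative values only, taken from the kubectl output earlier in this thread.
metrics_server_limits_cpu: 100m
metrics_server_limits_memory: 200Mi

# With the resizer disabled (the default after this PR), nothing rewrites
# the limits above once the pod is running.
metrics_server_resizer: false
```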

@oomichi
Contributor Author

oomichi commented Sep 27, 2021

Thank you so much for testing this @ledroide :-)

@floryut
Member

floryut commented Sep 28, 2021

/lgtm
If already tested, thanks @oomichi @ledroide

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 28, 2021
@k8s-ci-robot k8s-ci-robot merged commit 8d3961e into kubernetes-sigs:master Sep 28, 2021
@oomichi
Contributor Author

oomichi commented Sep 28, 2021

@floryut Thanks for approving this.
I think this also needs to be backported to the release-2.17 branch.

oomichi added a commit to oomichi/kubespray that referenced this pull request Sep 28, 2021
The addon-resizer container can reduce resource limits of cpu and
memory of metrics-server container in the pod, and that caused
OOMKilled.
In addition, the original metrics-server manifest doesn't contain
the addon-resizer container as [1].
So this adds metrics_server_resizer option to control the addon-resizer
container deployment and the default value is false to make it stable
for most environments.

This is a cherry-pick of 8d3961e

[1]: https://github.com/kubernetes-sigs/metrics-server/blob/527679e5e8a103919c935d0575c20741796bc25d/manifests/base/deployment.yaml
k8s-ci-robot pushed a commit that referenced this pull request Sep 28, 2021
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Oct 23, 2021
scervantes-stratio pushed a commit to scervantes-stratio/kubespray that referenced this pull request Nov 11, 2021
…sigs#8031)

unai-ttxu pushed a commit to Stratio/kubespray that referenced this pull request Nov 11, 2021
…sigs#8031) (#22)


Co-authored-by: Kenichi Omichi <ken1ohmichi@gmail.com>
@floryut floryut mentioned this pull request Dec 21, 2021
otani88 pushed a commit to velas/kubespray that referenced this pull request Mar 5, 2022
…sigs#8031)

sakuraiyuta pushed a commit to sakuraiyuta/kubespray that referenced this pull request Apr 16, 2022
huangkevin404 pushed a commit to wiremind/kubespray that referenced this pull request Sep 15, 2022
…sigs#8031)

skw0823 pushed a commit to skw0823/kubespray that referenced this pull request Mar 2, 2023
…sigs#8031)

skw0823 pushed a commit to skw0823/kubespray that referenced this pull request Mar 6, 2023
…sigs#8031)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

metrics-server gets OOMKilled - low memory limit
4 participants