
InPlacePodVerticalScaling does not meet the requirement of qosClass being equal to Guaranteed after shrinking the memory #124786

Open
itonyli opened this issue May 10, 2024 · 9 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments


itonyli commented May 10, 2024

What happened?

After shrinking the memory, InPlacePodVerticalScaling no longer satisfies the requirement that qosClass remain Guaranteed.

What did you expect to happen?

InPlacePodVerticalScaling should maintain the same qosClass for the Pod before and after scaling.

How can we reproduce it (as minimally and precisely as possible)?

After enabling the InPlacePodVerticalScaling feature gate, patch the container's resource request and limit to a value smaller than the container's current usage.
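
A minimal sketch of such a patch (hypothetical pod and container names; it assumes a Guaranteed pod whose current memory usage is already above the new value):

$ kubectl patch pod demo-pod --patch '{"spec":{"containers":[{"name":"app","resources":{"requests":{"memory":"10Mi"},"limits":{"memory":"10Mi"}}}]}}'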

Anything else we need to know?

No response

Kubernetes version

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.1", GitCommit:"3ddd0f45aa91e2f30c70734b175631bec5b5825a", GitTreeState:"clean", BuildDate:"2022-05-24T12:17:11Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.2", GitCommit:"4b8e819355d791d96b7e9d9efe4cbafae2311c88", GitTreeState:"clean", BuildDate:"2024-02-14T22:24:00Z", GoVersion:"go1.21.7", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version


Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@itonyli itonyli added the kind/bug Categorizes issue or PR as related to a bug. label May 10, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 10, 2024
@k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@tamilselvan1102

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 10, 2024

chengjoey commented May 11, 2024

I can't reproduce this issue on the same 1.29 version; my pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: test-in-place-vpa-pod
spec:
  containers:
  - name: nginx
    image: nginx:1.15.4
    resizePolicy:
    - resourceName: "cpu"
      restartPolicy: "NotRequired"
    - resourceName: "memory"
      restartPolicy: "NotRequired"
    resources:
      limits:
        cpu: "0.1"
        memory: "100M"
      requests:
        cpu: "0.1"
        memory: "100M"

Then patch the pod:

kubectl patch pod test-in-place-vpa-pod --patch '{"spec":{"containers":[{"name":"nginx","resources":{"limits":{"cpu":"0.1","memory":"100m"},"requests":{"cpu":"0.1","memory":"50m"}}}]}}'

I got this error:

The Pod "test-in-place-vpa-pod" is invalid:
* spec.containers[0].resources.requests: Invalid value: "100m": must be less than or equal to memory limit of 50m
* metadata: Invalid value: "Guaranteed": Pod QoS is immutable


itonyli commented May 13, 2024

> I can't reproduce this issue on the same 1.29 version; my pod.yaml: […]

You can test with the memory resource.

@chengjoey

> You can test with the memory resource.

Could you please provide your patch command?

@pacoxu pacoxu added this to Triage in SIG Node Bugs May 14, 2024
@haircommander haircommander moved this from Triage to Needs Information in SIG Node Bugs May 22, 2024
@haircommander

/cc @esotsal


hshiina commented May 27, 2024

I reproduced this issue on v1.30.

  1. Create a namespace and a pod following the documentation:

    $ kubectl create namespace qos-example
    $ kubectl create -f https://k8s.io/examples/pods/qos/qos-pod-5.yaml
    
  2. Update the memory limit and request with a quite low value (1Mi):

    $ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"1Mi"}, "limits":{"memory":"1Mi"}}}]}}'
    
  3. Confirm the container status:

    $ kubectl -n qos-example get pod qos-demo-5 -o json | jq ".status.containerStatuses[0].resources"
    {
      "limits": {
        "cpu": "700m",
        "memory": "200Mi"
      },
      "requests": {
        "cpu": "700m",
        "memory": "1Mi"
      }
    }
    

Then, I noticed the status.resize is "InProgress":

$ kubectl -n qos-example get pod qos-demo-5 -o json | jq ".status.resize"
"InProgress"

Though I'm not familiar with this feature, I guess the runtime is still trying to resize the pod. This issue seems to be caused by the updated value being too small to be practical.
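
One way to confirm this would be to compare the container's actual memory usage with the new limit (a sketch, assuming cgroup v2, where memory.current reports the cgroup's current usage):

$ kubectl -n qos-example exec qos-demo-5 -- cat /sys/fs/cgroup/memory.current
# if this is well above 1Mi (1048576 bytes), the new limit cannot be applied without risking an OOM kill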

After that, I applied another patch with a practical value:

$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"100Mi"}, "limits":{"memory":"100Mi"}}}]}}'
pod/qos-demo-5 patched

Then, the pod was resized with the later patch:

$ kubectl -n qos-example get pod qos-demo-5 -o json | jq ".status.containerStatuses[0].resources"
{
  "limits": {
    "cpu": "700m",
    "memory": "100Mi"
  },
  "requests": {
    "cpu": "700m",
    "memory": "100Mi"
  }
}

This issue can be worked around by applying another patch, so I don't think it causes a big problem.
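
For completeness, one could also verify that the QoS class stayed Guaranteed and that the pending resize status cleared after the corrective patch (a sketch; qosClass and resize are fields of the pod status):

$ kubectl -n qos-example get pod qos-demo-5 -o jsonpath='{.status.qosClass}{" "}{.status.resize}{"\n"}'
# expected: "Guaranteed" with an empty resize status once the resize has been actuated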


esotsal commented May 27, 2024

Nice catch. I also tried with the latest K8s; interestingly, anything lower than 14Mi failed in my tests as well. I'm not sure whether the bug is in K8s or outside K8s (container runtime).

Definitely worth checking more deeply. Thanks for sharing, @hshiina; it seems there is a bug somewhere.

$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"15Mi"}, "limits":{"memory":"15Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
15728640 <-- success has 15Mi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"20Mi"}, "limits":{"memory":"20Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
20971520 < -- success has 20Mi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"15Mi"}, "limits":{"memory":"15Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
15728640 < -- success has 15Mi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"14Mi"}, "limits":{"memory":"14Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
14680064 < -- success
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"13Mi"}, "limits":{"memory":"13Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
14680064 < -- failed still has 14Mi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"15Mi"}, "limits":{"memory":"15Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
15728640 < -- success has 15Mi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"14Mi"}, "limits":{"memory":"14Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
14680064 < -- success has 14 Mi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"1Gi"}, "limits":{"memory":"1Gi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
1073741824 < -- success has 1Gi
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"13Mi"}, "limits":{"memory":"13Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
1073741824 <-- failed still has 1Gi 
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"12Mi"}, "limits":{"memory":"12Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
1073741824 <-- failed still has 1Gi 
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"15Mi"}, "limits":{"memory":"15Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
15728640 <-- success
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"14.5Mi"}, "limits":{"memory":"14.5Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
15204352  < -- success
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"14.1Mi"}, "limits":{"memory":"14.1Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
14782464 < -- success
$ kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"memory":"13.1Mi"}, "limits":{"memory":"13.1Mi"}}}]}}'
pod/qos-demo-5 patched
$ kubectl exec qos-demo-5 --namespace=qos-example -- cat /sys/fs/cgroup/memory.max
14782464 < -- failed still has 14.1Mi


hshiina commented May 27, 2024

As I noted in a comment here, the resizing failed in kubelet:

klog.ErrorS(nil, "Aborting attempt to set pod memory limit less than current memory usage", "pod", pod.Name)
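
On the node hosting the pod, that message should be visible in the kubelet logs when the downward resize is attempted (a sketch, assuming kubelet runs as a systemd service):

$ journalctl -u kubelet | grep "Aborting attempt to set pod memory limit"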
