
[Sidecar Containers] Consider restartable init containers in eviction message #124947

Open

wants to merge 2 commits into master
Conversation

@toVersus toVersus commented May 19, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

When a Pod is evicted due to resource pressure on the node, the kubelet creates an eviction event and reports it to the user. The event message includes which containers within the Pod exceeded their resource requests, along with their actual usage. Similar information is also included in the event annotations.

Currently, the evictionMessage function, which generates the message and annotations for the eviction event, does not account for restartable init containers (sidecar containers).
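The gist of the fix is that the set of containers checked against their requests must include init containers, not just spec.Containers. The following is a minimal self-contained sketch of that idea; the types and the containersToCheck function name are illustrative stand-ins, not the actual kubelet code (the real types live in k8s.io/api/core/v1).

```go
package main

import "fmt"

// Container is a stand-in for the relevant pod spec fields.
type Container struct {
	Name          string
	RestartPolicy string // "Always" marks a restartable init (sidecar) container
}

// PodSpec is a stand-in for v1.PodSpec.
type PodSpec struct {
	InitContainers []Container
	Containers     []Container
}

// containersToCheck returns the containers whose usage should be compared
// against their requests when building the eviction message. Before this PR,
// only spec.Containers were walked; the fix also walks init containers, since
// restartable init containers keep running alongside the main containers.
func containersToCheck(spec PodSpec) []Container {
	out := make([]Container, 0, len(spec.InitContainers)+len(spec.Containers))
	out = append(out, spec.InitContainers...)
	out = append(out, spec.Containers...)
	return out
}

func main() {
	spec := PodSpec{
		InitContainers: []Container{{Name: "el-sidecar1", RestartPolicy: "Always"}},
		Containers:     []Container{{Name: "el1"}},
	}
	for _, c := range containersToCheck(spec) {
		// prints el-sidecar1, then el1
		fmt.Println(c.Name)
	}
}
```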

This PR improves the event to account for restartable init containers as follows:

apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-05-19T08:14:14Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: alloc-lm9sq
  namespace: default
  resourceVersion: "19979"
  uid: 786af1f6-3b04-4385-b61f-acfca78ddfd6
kind: Event
lastTimestamp: "2024-05-19T08:14:14Z"
message: 'The node was low on resource: memory. Threshold quantity: 1536Mi, available:
  1208704Ki. Container el1 was using 44Ki, request is 0, has larger consumption
  of memory. Container el2 was using 48Ki, request is 0, has larger consumption
  of memory. Container el3 was using 48Ki, request is 0, has larger consumption
  of memory. Container el-sidecar1 was using 3818464Ki, request is 0, has larger
  consumption of memory. Container el-sidecar2 was using 1452Ki, request is 0, has
  larger consumption of memory. Container el-sidecar3 was using 48Ki, request is
  0, has larger consumption of memory. '
metadata:
  annotations:
    offending_containers: el1,el2,el3,el-sidecar1,el-sidecar2,el-sidecar3
    offending_containers_usage: 44Ki,48Ki,48Ki,3818464Ki,1452Ki,48Ki
    starved_resource: memory
  creationTimestamp: "2024-05-19T08:14:14Z"
  name: alloc-lm9sq.17d0d5ba642ef84c
  namespace: default
  resourceVersion: "20072"
  uid: 521c02b7-4935-4def-be2c-53a76c32f4f7
reason: Evicted
reportingComponent: kubelet
reportingInstance: kind-worker
source:
  component: kubelet
  host: kind-worker
type: Warning

Which issue(s) this PR fixes:

Fixes #124938

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fixes a bug where restartable init containers were not accounted for in the message and annotations of the eviction event.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP]: https://github.com/kubernetes/enhancements/issues/753

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. labels May 19, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: toVersus
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 19, 2024
@k8s-ci-robot
Contributor

Welcome @toVersus!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 19, 2024
@k8s-ci-robot
Contributor

Hi @toVersus. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@toVersus
Author

/cc @SergeyKanzhelev @matthyx @gjkim42

Signed-off-by: Tsubasa Nagasawa <toversus2357@gmail.com>
Member

@gjkim42 gjkim42 left a comment


I am not sure it is ok to add restartable init containers to the eviction messages without actually accounting for them in the eviction logic.

@toVersus
Author

toVersus commented May 20, 2024

Checking the eviction behavior in a local Kubernetes 1.30.0 cluster, I confirmed that eviction occurs even when restartable init containers cause memory pressure on the node.

cat <<EOF | kind create cluster --config -
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: KubeletConfiguration
  evictionHard:
    memory.available: "1.5Gi"
nodes:
- role: control-plane
  image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
- role: worker
  image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
EOF

# Create Pod causing eviction by memory pressure
cat <<'EOF' | kubectl create -f -
apiVersion: v1
kind: Pod
metadata: 
  generateName: alloc-
spec:
  terminationGracePeriodSeconds: 1
  restartPolicy: Never
  initContainers:
  - name: el-sidecar1
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    args: ["stress", "--mem-alloc-size", "100Mi", "--mem-alloc-sleep", "1s", "--mem-total", "4000Mi"]
    restartPolicy: Always
  - name: el-sidecar2
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    command: ["sleep", "infinity"]
    restartPolicy: Always
  - name: el-sidecar3
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    command: ["sleep", "infinity"]
    restartPolicy: Always
  containers:
  - name: el1
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    command: ["sleep", "infinity"]
  - name: el2
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    command: ["sleep", "infinity"]
  - name: el3
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    command: ["sleep", "infinity"]
EOF

Details of the evicted Pod:

kubectl describe pods alloc-5ffbx
Name:             alloc-5ffbx
Namespace:        default
Priority:         0
Service Account:  default
Node:             kind-worker/192.168.228.2
Start Time:       Mon, 20 May 2024 21:43:52 +0900
Labels:           <none>
Annotations:      <none>
Status:           Failed
Reason:           Evicted
Message:          The node was low on resource: memory. Threshold quantity: 1536Mi, available: 1248032Ki. Container el3 was using 48Ki, request is 0, has larger consumption of memory. Container el1 was using 44Ki, request is 0, has larger consumption of memory. Container el2 was using 44Ki, request is 0, has larger consumption of memory.
IP:               10.244.1.2
IPs:
  IP:  10.244.1.2
Init Containers:
  el-sidecar1:
    Container ID:
    Image:         registry.k8s.io/e2e-test-images/agnhost:2.52
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Args:
      stress
      --mem-alloc-size
      100Mi
      --mem-alloc-sleep
      1s
      --mem-total
      4000Mi
    State:          Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was terminated
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Last State:     Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was deleted.  The container used to be Running
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  1
    Environment:
      MEMORY_LIMIT:  node allocatable (limits.memory)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4f6hc (ro)
  el-sidecar2:
    Container ID:  containerd://bf59fcb6ef501a02c929c11b8b702603cb063ec60c60670ade25cbec83881d96
    Image:         registry.k8s.io/e2e-test-images/agnhost:2.52
    Image ID:      registry.k8s.io/e2e-test-images/agnhost@sha256:b173c7d0ffe3d805d49f4dfe48375169b7b8d2e1feb81783efd61eb9d08042e6
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      infinity
    State:          Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Mon, 20 May 2024 21:44:05 +0900
      Finished:     Mon, 20 May 2024 21:44:59 +0900
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4f6hc (ro)
  el-sidecar3:
    Container ID:  containerd://1394ee2639ff685ed41da6e1d145432f44051dddcb3ae4fd794c9ae16c4750ba
    Image:         registry.k8s.io/e2e-test-images/agnhost:2.52
    Image ID:      registry.k8s.io/e2e-test-images/agnhost@sha256:b173c7d0ffe3d805d49f4dfe48375169b7b8d2e1feb81783efd61eb9d08042e6
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      infinity
    State:          Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Mon, 20 May 2024 21:44:06 +0900
      Finished:     Mon, 20 May 2024 21:44:59 +0900
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4f6hc (ro)
Containers:
  el1:
    Container ID:
    Image:         registry.k8s.io/e2e-test-images/agnhost:2.52
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      infinity
    State:          Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was terminated
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Last State:     Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was deleted.  The container used to be Running
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  1
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4f6hc (ro)
  el2:
    Container ID:
    Image:         registry.k8s.io/e2e-test-images/agnhost:2.52
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      infinity
    State:          Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was terminated
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Last State:     Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was deleted.  The container used to be Running
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  1
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4f6hc (ro)
  el3:
    Container ID:
    Image:         registry.k8s.io/e2e-test-images/agnhost:2.52
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      sleep
      infinity
    State:          Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was terminated
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Last State:     Terminated
      Reason:       ContainerStatusUnknown
      Message:      The container could not be located when the pod was deleted.  The container used to be Running
      Exit Code:    137
      Started:      Mon, 01 Jan 0001 00:00:00 +0000
      Finished:     Mon, 01 Jan 0001 00:00:00 +0000
    Ready:          False
    Restart Count:  1
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4f6hc (ro)
Conditions:
  Type                        Status
  DisruptionTarget            True
  PodReadyToStartContainers   False
  Initialized                 False
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  kube-api-access-4f6hc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  82s   default-scheduler  Successfully assigned default/alloc-5ffbx to kind-worker
  Normal   Pulling    82s   kubelet            Pulling image "registry.k8s.io/e2e-test-images/agnhost:2.52"
  Normal   Pulled     69s   kubelet            Successfully pulled image "registry.k8s.io/e2e-test-images/agnhost:2.52" in 12.457s (12.457s including waiting). Image size: 50208832 bytes.
  Normal   Created    69s   kubelet            Created container el-sidecar1
  Normal   Started    69s   kubelet            Started container el-sidecar1
  Normal   Pulled     69s   kubelet            Container image "registry.k8s.io/e2e-test-images/agnhost:2.52" already present on machine
  Normal   Created    69s   kubelet            Created container el-sidecar2
  Normal   Started    69s   kubelet            Started container el-sidecar2
  Normal   Pulled     68s   kubelet            Container image "registry.k8s.io/e2e-test-images/agnhost:2.52" already present on machine
  Normal   Created    68s   kubelet            Created container el-sidecar3
  Normal   Started    68s   kubelet            Started container el-sidecar3
  Normal   Pulled     67s   kubelet            Container image "registry.k8s.io/e2e-test-images/agnhost:2.52" already present on machine
  Normal   Created    67s   kubelet            Created container el1
  Normal   Started    67s   kubelet            Started container el1
  Normal   Pulled     67s   kubelet            Container image "registry.k8s.io/e2e-test-images/agnhost:2.52" already present on machine
  Normal   Created    67s   kubelet            Created container el2
  Normal   Started    67s   kubelet            Started container el2
  Normal   Pulled     67s   kubelet            Container image "registry.k8s.io/e2e-test-images/agnhost:2.52" already present on machine
  Normal   Created    67s   kubelet            Created container el3
  Normal   Started    67s   kubelet            Started container el3
  Warning  Evicted    18s   kubelet            The node was low on resource: memory. Threshold quantity: 1536Mi, available: 1248032Ki. Container el3 was using 48Ki, request is 0, has larger consumption of memory. Container el1 was using 44Ki, request is 0, has larger consumption of memory. Container el2 was using 44Ki, request is 0, has larger consumption of memory.
  Normal   Killing    18s   kubelet            Stopping container el-sidecar1
  Normal   Killing    18s   kubelet            Stopping container el1
  Normal   Killing    18s   kubelet            Stopping container el2
  Normal   Killing    18s   kubelet            Stopping container el3
  Normal   Killing    18s   kubelet            Stopping container el-sidecar3

@kannon92
Contributor

/ok-to-test

can we add some e2e tests confirming eviction works for sidecars? Memory and storage would be good. A manual test is good, but it would be ideal to make sure eviction is tested with sidecars.

it can be a follow up if necessary.

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 20, 2024
@toVersus
Author

I agree with that. I initially considered adding E2E tests, but I was reluctant because E2E tests related to Pod eviction are expensive (marked as Slow, Serial, and Disruptive). This is especially true since there is no special handling in the eviction logic for sidecar containers. However, if necessary, I am willing to work on adding E2E tests, although I think it might be better to handle them in a separate PR.

@bart0sh
Contributor

bart0sh commented May 23, 2024

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels May 23, 2024
@k8s-ci-robot k8s-ci-robot removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 23, 2024
Signed-off-by: Tsubasa Nagasawa <toversus2357@gmail.com>
@toVersus
Author

Non-restartable init containers are also not considered in the Pod eviction messages, so I have modified the change to take init containers into account as well. If it is acceptable to expand the scope of the changes, I will update the title and description of the PR.

Steps to trigger Pod Eviction with Init Containers:

cat <<EOF | kind create cluster --config -
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: KubeletConfiguration
  evictionHard:
    memory.available: "1.5Gi"
nodes:
- role: control-plane
  image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
- role: worker
  image: kindest/node:v1.30.0@sha256:047357ac0cfea04663786a612ba1eaba9702bef25227a794b52890dd8bcd692e
EOF

# Create Pod causing eviction by memory pressure
cat <<'EOF' | kubectl create -f -
apiVersion: v1
kind: Pod
metadata: 
  generateName: alloc-
spec:
  terminationGracePeriodSeconds: 1
  restartPolicy: Never
  initContainers:
  - name: el-init1
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    args: ["stress", "--mem-alloc-size", "100Mi", "--mem-alloc-sleep", "1s", "--mem-total", "4000Mi"]
  containers:
  - name: el1
    image: registry.k8s.io/e2e-test-images/agnhost:2.52
    command: ["sleep", "infinity"]
EOF

Event before this PR:

apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-05-29T09:20:56Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: alloc-v972z
  namespace: default
  resourceVersion: "573"
  uid: a7e67c6e-4838-4ce8-a656-c31f9fb42831
kind: Event
lastTimestamp: "2024-05-29T09:20:56Z"
message: 'The node was low on resource: memory. Threshold quantity: 1536Mi, available:
  1453492Ki. '
metadata:
  annotations:
    offending_containers: ""
    offending_containers_usage: ""
    starved_resource: memory
  creationTimestamp: "2024-05-29T09:20:56Z"
  name: alloc-v972z.17d3eb2bd8f92aca
  namespace: default
  resourceVersion: "655"
  uid: b3aa8512-9383-4f8c-9794-89a77f1d6602
reason: Evicted
reportingComponent: kubelet
reportingInstance: kind-worker
source:
  component: kubelet
  host: kind-worker
type: Warning

Event after this PR:

apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2024-05-29T09:01:25Z"
involvedObject:
  apiVersion: v1
  kind: Pod
  name: alloc-tlhbv
  namespace: default
  resourceVersion: "1673"
  uid: 4bb36fc8-1dd7-4cc9-befd-9216b1fd7220
kind: Event
lastTimestamp: "2024-05-29T09:01:25Z"
message: 'The node was low on resource: memory. Threshold quantity: 1536Mi, available:
  1186072Ki. Container el-init1 was using 4133704Ki, request is 0, has larger consumption
  of memory. '
metadata:
  annotations:
    offending_containers: el-init1
    offending_containers_usage: 4133704Ki
    starved_resource: memory
  creationTimestamp: "2024-05-29T09:01:25Z"
  name: alloc-tlhbv.17d3ea1b1e10b7f6
  namespace: default
  resourceVersion: "1748"
  uid: 8f0437ba-e380-47e3-8d07-2f67854d67d3
reason: Evicted
reportingComponent: kubelet
reportingInstance: kind-worker
source:
  component: kubelet
  host: kind-worker
type: Warning

Labels
area/kubelet
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/bug Categorizes issue or PR as related to a bug.
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
release-note Denotes a PR that will be considered when it comes time to generate release notes.
sig/node Categorizes an issue or PR as relevant to SIG Node.
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Needs Reviewer
Development

Successfully merging this pull request may close these issues.

[Sidecar Containers] Eviction message should account for the sidecar containers
5 participants