
If a container doesn't have cpu limits, kube_pod_resource_limit reports the init-container limit #2295

Closed
andresm53 opened this issue Jan 2, 2024 · 4 comments
Assignees: rexagod
Labels: kind/bug, triage/accepted

@andresm53

What happened: Given the following pod:

apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    app: nginx
spec:
  containers:
    - name: nginx
      image: nginx:latest
      ports:
        - containerPort: 8080
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    resources:
      limits:
        cpu: 500m
        memory: 64Mi
      requests:
        cpu: 50m
        memory: 64Mi

kube_pod_resource_limit reports 500m as the pod's CPU limit.

What you expected to happen: As per the Init Containers documentation:

The Pod's effective request/limit for a resource is the higher of:

  • the sum of all app containers request/limit for a resource
  • the effective init request/limit for a resource

Since the app container doesn't have a cpu limit, which means "no limit", I would have expected kube_pod_resource_limit to report no limit at all.
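
To make the rule above concrete, here is a minimal, self-contained Go sketch of the computation it describes. This is not kube-state-metrics or kube-scheduler code; the container names and the 500m value come from the example pod, and everything else (including the effectiveCPULimit helper) is hypothetical, written under the reading that a missing app-container limit makes the pod effectively unlimited.

    // Sketch only: illustrates the documented effective-limit rule,
    // assuming a nil CPU limit means "no limit".
    package main

    import "fmt"

    // container holds a CPU limit in millicores; nil means no limit is set.
    type container struct {
        name          string
        cpuLimitMilli *int64
    }

    // effectiveCPULimit returns the pod's effective CPU limit in millicores
    // and whether such a limit exists at all.
    func effectiveCPULimit(appContainers, initContainers []container) (int64, bool) {
        // Sum of all app containers' limits; one missing limit makes the sum unbounded.
        var appSum int64
        for _, c := range appContainers {
            if c.cpuLimitMilli == nil {
                return 0, false // pod has no effective CPU limit
            }
            appSum += *c.cpuLimitMilli
        }

        // The effective init limit is the highest limit among init containers.
        var initMax int64
        for _, c := range initContainers {
            if c.cpuLimitMilli != nil && *c.cpuLimitMilli > initMax {
                initMax = *c.cpuLimitMilli
            }
        }

        // The pod's effective limit is the higher of the two.
        if initMax > appSum {
            return initMax, true
        }
        return appSum, true
    }

    func main() {
        initLimit := int64(500) // 500m, as in the example pod
        apps := []container{{name: "nginx"}} // no cpu limit
        inits := []container{{name: "init-myservice", cpuLimitMilli: &initLimit}}

        if v, ok := effectiveCPULimit(apps, inits); ok {
            fmt.Printf("effective cpu limit: %dm\n", v)
        } else {
            fmt.Println("effective cpu limit: none") // expected result for the example pod
        }
    }

Under this reading the sketch prints "effective cpu limit: none" for the example pod. If a missing app-container limit were instead treated as zero before taking the maximum with the init limit, the result would be 500m, which is the value the metric currently reports.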

How to reproduce it (as minimally and precisely as possible):

  1. Create a pod using the example pasted above.
  2. Query kube_pod_resource_limit for the cpu resource:
    sum(kube_pod_resource_limit{resource='cpu',pod='example',namespace='test'})

Anything else we need to know?:

Environment: OpenShift 4.12.

  • kube-state-metrics version:
sh-4.4$ ./kube-state-metrics --version
kube-state-metrics, version v2.6.0 (branch: rhaos-4.12-rhel-8, revision: 18c2653)
  build date:       2023-08-23T20:55:33Z
  go version:       go1.19.10 X:strictfipsruntime
  platform:         linux/amd64
  • Kubernetes version (use kubectl version):
    I0102 12:41:29.597851 1 server.go:254] "Run with Kubernetes cluster version" major="1" minor="25"
  • Cloud provider or hardware configuration: AWS
@andresm53 andresm53 added the kind/bug Categorizes issue or PR as related to a bug. label Jan 2, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 2, 2024
@dashpole

/assign @rexagod
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 11, 2024
@rexagod
Member

rexagod commented Jan 17, 2024

FYI: while this seems like a bug, it is recommended to use the kube-scheduler's exposed metrics for kube_pod_resource_{limit,request}.

@andresm53
Author

Thanks @rexagod. The problem with that (the kube-scheduler's exposed metrics) is that, in my particular case, I am using OpenShift (4.12), which by default uses kube_pod_resource_limit to display the CPU metrics chart. This is how it looks for the example pod I provided above. As you can see, the chart is confusing, because the pod effectively has no CPU limit, but the chart implies that it does.

(Screenshot: OpenShift console CPU metrics chart for the example pod.)

@rexagod
Member

rexagod commented Feb 25, 2024

@andresm53 Oh wow! Thank you for bringing this up, I'll ping the console folks internally to take a look. That being said, I believe https://github.com/openshift/console would be a better place to raise this.

Closing, feel free to reopen in openshift/console.

@rexagod rexagod closed this as not planned on Feb 25, 2024