
[receiver/kubeletstat] Review cpu.utilization naming #27885

Open

TylerHelmuth opened this issue Oct 20, 2023 · 20 comments

@TylerHelmuth
Member

TylerHelmuth commented Oct 20, 2023

Component(s)

receiver/kubeletstats

Is your feature request related to a problem? Please describe.

The Kubeletstats Receiver currently uses *.cpu.utilization as the name for the CPU metrics that report the CPUStats UsageNanoCores value.

I believe that UsageNanoCores reports the actual amount of CPU being used, not the ratio of the amount used to a total limit. If this is true, then our use of utilization does not meet semantic convention expectations.

I would like to have a discussion about what exactly UsageNanoCores represents and whether our metric naming needs updating.

Related to discussion that started in #24905

@TylerHelmuth TylerHelmuth added the priority:p2 (Medium), discussion needed (Community discussion needed), and receiver/kubeletstats labels, and removed the enhancement (New feature or request) and needs triage (New item requiring triage) labels Oct 20, 2023
@TylerHelmuth
Member Author

/cc @jinja2 @dmitryax @povilasv

@povilasv
Contributor

povilasv commented Oct 23, 2023

Did some digging:

Kubernetes Docs state:

```go
// Total CPU usage (sum of all cores) averaged over the sample window.
// The "core" unit can be interpreted as CPU core-nanoseconds per second.
// +optional
UsageNanoCores *uint64 `json:"usageNanoCores,omitempty"`
```

Looks like the kubelet gets these metrics from the CRI, and if the CRI doesn't have stats it computes them using this formula:

```go
nanoSeconds := newStats.Timestamp - cachedStats.Timestamp

usageNanoCores := uint64(float64(newStats.UsageCoreNanoSeconds.Value-cachedStats.UsageCoreNanoSeconds.Value) /
	float64(nanoSeconds) * float64(time.Second/time.Nanosecond))
```

Ref: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/stats/cri_stats_provider.go#L791-L800

Where:

```go
// Cumulative CPU usage (sum across all cores) since object creation.
UsageCoreNanoSeconds *UInt64Value `protobuf:"bytes,2,opt,name=usage_core_nano_seconds,json=usageCoreNanoSeconds,proto3" json:"usage_core_nano_seconds,omitempty"`
```

🤔

Playing a bit with the formula:

Limit is the total available CPU time.

Let's say we collect every 1 second, and the app uses all of the available CPU time, i.e. 1 second.

nanoSeconds := now() - (now() - 1s) = 1s = 1,000,000,000 nanoseconds

UsageNanoCores := (2,000,000,000 - 1,000,000,000) / 1,000,000,000 * 1,000,000,000 = 1,000,000,000

or simplified:

UsageNanoCores := (2s - 1s) / 1s * float64(time.Second/time.Nanosecond) = uint64(1 * float64(time.Second/time.Nanosecond)) = 1,000,000,000

Based on this example, the result is an actual usage of 1,000,000,000 nanoseconds, or 1 second.

So this metric's unit seems to be nanoseconds, not a percentage.

If my calculations are correct, I think we should rename it to cpu.usage with the proper unit (nanoseconds)?
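
For illustration, here is a minimal, runnable Go sketch of the same derivation; the sample values are the hypothetical ones from the example above, not real kubelet output:

```go
// Minimal sketch (not kubelet code) of how usageNanoCores is derived from two
// cumulative UsageCoreNanoSeconds samples taken one second apart.
package main

import (
	"fmt"
	"time"
)

func main() {
	// Cumulative CPU time (core-nanoseconds) at two consecutive scrapes (hypothetical values).
	oldUsageCoreNanoSeconds := uint64(1_000_000_000)
	newUsageCoreNanoSeconds := uint64(2_000_000_000)

	// The two scrapes are one second apart (expressed in nanoseconds).
	elapsedNanoSeconds := uint64(time.Second / time.Nanosecond) // 1,000,000,000

	// Same formula as the kubelet's cri_stats_provider fallback path.
	usageNanoCores := uint64(float64(newUsageCoreNanoSeconds-oldUsageCoreNanoSeconds) /
		float64(elapsedNanoSeconds) * float64(time.Second/time.Nanosecond))

	fmt.Println(usageNanoCores) // 1,000,000,000, i.e. one full core used over the window
}
```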

@TylerHelmuth
Member Author

@povilasv thank you!

@github-actions bot
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 25, 2023
@TylerHelmuth TylerHelmuth removed the Stale label Jan 8, 2024
dmitryax pushed a commit that referenced this issue Jan 12, 2024
**Description:** 
Starts the name change process for `*.cpu.utilization` metrics.

**Link to tracking Issue:** 
Related to
#24905
Related to
#27885
@github-actions bot
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@ChrsMark
Member

ChrsMark commented Mar 27, 2024

FYI @TylerHelmuth @povilasv: in SemConv we have merged open-telemetry/semantic-conventions#282, which adds the container.cpu.time metric for now.

For the dockerstats receiver we have #31649, which will try to align the implementation with the added SemConv.

Do we have a summary so far of what is missing from the kubeletstats receiver in terms of naming changes (like this current issue)?

Shall we try to align the kubeletstats receiver with #31649 as well? Happy to help with that.

At the moment the implementation of the receiver provides the following:

Are we planning to keep them all?
Are those all aligned with https://github.com/open-telemetry/semantic-conventions/blob/71c2e8072596fb9a4ceb68303c83f5389e0beb5f/docs/general/metrics.md#instrument-naming?

From https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/25901/files#diff-3343de7bfda986546ce7cb166e641ae88c0b0aecadd016cb253cd5a0463ff464R352-R353 I see we are going to remove/deprecate container.cpu.utilization?
Could we keep it instead as an optional metric and find a proper way to calculate it? I see it was mentioned at #25901 (comment) but I'm not sure how it was resolved. I guess that would be possible by adding a cpuNodeLimit (retrieved from the Node resource), similarly to how the resource limit is set. I drafted a very WiP implementation to illustrate the point -> ChrsMark@27ce769
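
For illustration only, a rough client-go sketch of what fetching such a cpuNodeLimit could look like; the function name and wiring here are assumptions for this sketch, not the actual draft in ChrsMark@27ce769:

```go
// Hypothetical helper: fetch the node's CPU capacity (in cores) so per-container
// usage in nanocores can later be divided by capacity*1e9 to yield a ratio.
// Clientset construction and node-name discovery are assumed to exist elsewhere.
package kubeletstatssketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func cpuNodeLimit(ctx context.Context, client kubernetes.Interface, nodeName string) (float64, error) {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return 0, err
	}
	// Status.Capacity["cpu"] is a resource.Quantity such as "8".
	cpuQuantity := node.Status.Capacity[corev1.ResourceCPU]
	return cpuQuantity.AsApproximateFloat64(), nil
}
```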

@TylerHelmuth
Member Author

TylerHelmuth commented Apr 2, 2024

@ChrsMark in my opinion yes to all questions. We want to be aligned with the spec (although I'd love to reduce the number of iterations in the receivers to gain that alignment, how long till we're stable lol).

I don't have a lot of time to dedicate to getting the kubeletstatsreceiver up to date with the non-stable spec. At this point I was planning to wait for things to stabilize before making any more changes besides the work we started in this issue.

@ChrsMark
Member

ChrsMark commented Apr 3, 2024

Thanks @TylerHelmuth, I see the point of not chasing after an unstable schema/spec.

Just to clarify regarding container.cpu.utilization though: shall we abort its deprecation and try to calculate it properly, as mentioned at #25901 (comment)? This can happen based on the cpuNodeLimit, and it seems to be doable based on some quick research I did: ChrsMark@27ce769

@TylerHelmuth
Member Author

@ChrsMark yes, I'd be fine with keeping the metric if we can calculate it correctly. We'd still need to go through some sort of feature-gate process to make it clear to users that the metric has changed and that, if they want the old value, they need to use the new .usage metric.

@ChrsMark
Member

@TylerHelmuth @povilasv I have drafted a patch to illustrate the point at #32295. My findings look promising :).

If we agree on the idea I can move the PR forward to fix the details and open it for review. Let me know what you think.

@TylerHelmuth
Member Author

Seems reasonable. @jinja2 please take a look

@povilasv
Contributor

This looks reasonable to me as well. I would also add an informer so we don't call Get Node every time we scrape data; in practice, a Node's CPU capacity doesn't change.
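
For illustration, a minimal sketch of what such a single-node informer could look like with client-go; the names and wiring are illustrative assumptions, not the actual change in #32295:

```go
// Illustrative sketch: watch only the local node via a field selector so the
// scraper can read CPU capacity from the informer cache instead of issuing a
// GET on every scrape. Clientset and node-name wiring are assumed elsewhere.
package kubeletstatssketch

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

func startNodeInformer(client kubernetes.Interface, nodeName string, stopCh <-chan struct{}) cache.SharedIndexInformer {
	factory := informers.NewSharedInformerFactoryWithOptions(
		client,
		10*time.Minute, // resync period; capacity rarely changes
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "metadata.name=" + nodeName
		}),
	)
	informer := factory.Core().V1().Nodes().Informer()
	go informer.Run(stopCh)
	cache.WaitForCacheSync(stopCh, informer.HasSynced)
	return informer
}

// capacityFromCache reads the cached node's CPU capacity in cores.
func capacityFromCache(informer cache.SharedIndexInformer, nodeName string) (float64, bool) {
	obj, exists, err := informer.GetStore().GetByKey(nodeName)
	if err != nil || !exists {
		return 0, false
	}
	node := obj.(*corev1.Node)
	q := node.Status.Capacity[corev1.ResourceCPU]
	return q.AsApproximateFloat64(), true
}
```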

@ChrsMark
Member

Thanks for the feedback, folks!
I have adjusted #32295 to use an informer instead of getting the Node on every scrape. Something I noticed is that we have several places with different informer-based implementations, for example in k8sclusterreceiver and k8sattributesprocessor. Maybe it would make sense to extract that common logic into a shared lib and re-use it from the various receivers and processors (even in resourcedetectorprocessor), but we can file a separate issue for this.
For now we can continue any discussion at #32295.

@dmitryax
Member

dmitryax commented May 2, 2024

Hey folks. Sorry, I'm late to the party.

I am not convinced about repurposing container.cpu.utilization to be a ratio of the node CPU limit:

  1. It's unclear from the metric's name what the limit is, while k8s.container.cpu_limit_utilization gives a pretty good idea. I could imagine it being the utilization of some container limit, not the node's limit.
  2. The container.cpu.utilization name is generic and can be used in the docker receiver as well. Can we always easily retrieve the number of host CPUs there? I'm not sure about it.

I think something like k8s.container.node_limit_utilization would be a better name here. It's clear and consistent with other utilization metrics.

Even if we decide to repurpose container.cpu.utilization, I would strongly suggest deprecating->disabling and maybe removing it first instead of just changing its meaning. I believe the metrics based on the node limit should be optional, and container.cpu.time or container.cpu.usage should be enabled by default instead.

cc @ChrsMark @TylerHelmuth @povilasv

@ChrsMark
Member

ChrsMark commented May 2, 2024

Thanks @dmitryax, I agree with making the metric more specific here.
There is a similar distinction in Metricbeat as well, with *.node.pct and *.limit.pct respectively.

If others agree, I can change the PR accordingly to:

  1. introduce the k8s.container.cpu.node_limit_utilization.
  2. remove the feature flag since we don't need it any more and leave container.cpu.utilization as deprecated.
  3. since we are on it, I suggest we also introduce the k8s.pod.cpu.node_limit_utilization as part of the same PR.
  4. for k8s.node.cpu.utilization, since we can use it as is, I guess we will need to go through the feature-flag path in a separate PR.

Extra: I also wonder if it would make sense to change k8s.container.cpu_limit_utilization to k8s.container.cpu.limit_utilization, so as to be consistent in having k8s.container.cpu.* as a clear namespace.

@dmitryax @TylerHelmuth @povilasv let me know what you think.

@povilasv
Contributor

povilasv commented May 2, 2024

I also agree with @dmitryax's points. I guess computing against node limits has problems and is unfair if node limits > container limits, etc., so we need to do a proper deprecation.

Regarding k8s.container.cpu.node_limit_utilization, is this useful? If you run many pods on a Node and they all start consuming CPU, Linux gives each of them an equal share, so your Pod won't report 100% k8s.container.cpu.node_limit_utilization and yet it won't be able to get more CPU.

@ChrsMark
Member

ChrsMark commented May 2, 2024

Regarding k8s.container.cpu.node_limit_utilization, is this useful

I still find this useful in order to be able to compare how much CPU a container/pod uses against the Node's capacity. This helps you understand whether a Pod/container is a really problematic workload or not: not against its own limit, but against the Node's capacity.

For example, you can see 96% limit_utilization from container A and 96% limit_utilization from container B. But if those two have different resource limits, you don't get a realistic picture of which one consumes more of the Node's capacity.

@jinja2
Contributor

jinja2 commented May 2, 2024

What does "node_limit" mean here? Is it referring to the host's capacity or the node allocatable (capacity - system/kube reserved)? I find node_limit to be ambiguous but my assumption would be that it refers to allocatable since that's the amount of resources actually available for pod scheduling. Do you think a user might want to select whether the utilization is against the capacity or the allocatable?
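
For reference, both values are exposed on the Node object; a small illustrative Go sketch of the distinction (field access only, no cluster wiring, helper name is hypothetical):

```go
package kubeletstatssketch

import corev1 "k8s.io/api/core/v1"

// nodeCPUCores returns both views of a node's CPU: capacity is the machine
// total, while allocatable subtracts system/kube reservations and is what the
// scheduler considers available for pods.
func nodeCPUCores(node *corev1.Node) (capacityCores, allocatableCores float64) {
	capQty := node.Status.Capacity[corev1.ResourceCPU]
	allocQty := node.Status.Allocatable[corev1.ResourceCPU]
	return capQty.AsApproximateFloat64(), allocQty.AsApproximateFloat64()
}
```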

@TylerHelmuth
Member Author

Even if we decide to repurpose container.cpu.utilization, I would strongly suggest deprecating->disabling and maybe removing it first instead of just changing its meaning. I believe the metrics based on the node limit should be optional, and container.cpu.time or container.cpu.usage should be enabled by default instead.

I am good with this. I care mainly about the switch from *.cpu.utilization to *.cpu.usage.

@ChrsMark
Member

ChrsMark commented May 8, 2024

Do you think a user might want to select whether the utilization is against the capacity or the allocatable?

I would go with the capacity, since the allocatable is kind of k8s-specific and is another "limit" to my mind. I think users would be fine seeing the actual utilization against the node's actual capacity. This would not require extra knowledge, such as the allocatable configuration, etc.


I will change the PR for now to use k8s.container.cpu.node_limit_utilization and we can continue the discussion there.

andrzej-stencel pushed a commit that referenced this issue May 31, 2024
…ic (#32295)

**Description:**

At the moment, we calculate the `k8s.container.cpu_limit_utilization` as
[a ratio of the container's
limits](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/kubeletstatsreceiver/documentation.md#k8scontainercpu_limit_utilization)
at
https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/867d6700c31446172e6998e602c55fbf7351831f/receiver/kubeletstatsreceiver/internal/kubelet/cpu.go#L30.

Similarly, we can calculate the CPU utilization as a ratio of the whole
node's allocatable CPU if we divide by the node's total number of cores.

We can retrieve this information from the Node's `Status.Capacity`, for
example:

```console
$ k get nodes kind-control-plane -ojsonpath='{.status.capacity}'
{"cpu":"8","ephemeral-storage":"485961008Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"32564732Ki","pods":"110"}
```
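
To make the intended arithmetic concrete, a small illustrative Go sketch; the numbers are hypothetical and match the 8-core node shown above:

```go
// Sketch of the intended calculation: divide usage (nanocores) by the node's
// CPU capacity expressed in nanocores. Sample values are hypothetical.
package main

import "fmt"

func main() {
	const nanoCoresPerCore = 1_000_000_000.0

	usageNanoCores := 4_000_000_000.0 // container using roughly 4 cores
	nodeCapacityCores := 8.0          // from the Node's Status.Capacity["cpu"]

	utilization := usageNanoCores / (nodeCapacityCores * nanoCoresPerCore)
	fmt.Println(utilization) // 0.5
}
```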




## Performance concerns

In order to get the Node's capacity, we need a call to the k8s API to
fetch the Node object.
Something to consider here is the performance impact that this extra API
call would bring. We can always choose to have this metric disabled
by default and clearly specify in the docs that this metric comes with
an extra API call to get the Node of the Pods.

The good thing is that the `kubeletstats` receiver targets only one node, so
I believe it's a safe assumption to only fetch the current node, because
all the observed Pods will belong to that single local node. Correct
me if I'm missing anything here.

In addition, instead of performing the API call explicitly on every
single `scrape`, we can use an informer and leverage its cache. I
can change this patch in that direction if we agree on this.

Would love to hear others' opinions on this.

## Todos

✅ 1) Apply this change behind a feature gate as it was indicated at
#27885 (comment)
✅  2) Use an Informer instead of direct API calls.

**Link to tracking Issue:**
ref:
#27885

**Testing:**

I experimented with this approach and the results look correct. In order
to verify this, I deployed a stress Pod on my machine to consume a target
of 4 CPU cores:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-stress
spec:
  containers:
  - name: cpu-stress
    image: polinux/stress
    command: ["stress"]
    args: ["-c", "4"]
```

And then the collected `container.cpu.utilization` for that Pod's
container was at `0.5` as expected, given that my machine's node comes
with 8 cores in total:


![cpu-stress](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/3abe4a0d-6c99-4b4e-a704-da5789dde01b)

Unit test is also included.

**Documentation:**

Added:
https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/32295/files#diff-8ad3b506fb1132c961e8da99b677abd31f0108e3f9ed6999dd96ad3297b51e08

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
TylerHelmuth added a commit that referenced this issue Jun 14, 2024
**Description:**
This PR adds the `k8s.pod.cpu.node.utilization` metric.
Follow up from
#32295 (comment)
(cc @TylerHelmuth).

**Link to tracking Issue:**
Related to
#27885.

**Testing:**
Adjusted the respective unit test to cover this metric as well.

**Documentation:** Added

Tested with a single container Pod:


![podCpu](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/9a0069c2-7077-4944-93b6-2dde00979bf3)

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>