Define semantic conventions for k8s metrics #1032

Open
ChrsMark opened this issue May 13, 2024 · 2 comments
Labels: area:k8s, enhancement, experts needed, triage:needs-triage

ChrsMark (Member) commented May 13, 2024

Area(s)

area:k8s

Is your change request related to a problem? Please describe.

At the moment there are no Semantic Conventions for k8s metrics.

Describe the solution you'd like

Even if we cannot yet consider the k8s metrics stable, we can start by adding the ones that are not controversial in order to make some progress here. This issue aims to collect the k8s metrics that currently exist in the Collector and to keep track of any related work.
Below is an initial list of metrics coming from the kubeletstats and k8scluster receivers. Note that these are subject to change over time, so we should check back with the Collector to verify the current state.

cc: @open-telemetry/semconv-k8s-approvers
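
For illustration only, a metric such as k8s.node.cpu.time could be expressed roughly as follows in the semconv model YAML; the brief, instrument, unit, and stability shown here are assumptions and would need to be verified against the Collector implementation:

```yaml
groups:
  # Hypothetical sketch of a metric group definition; field values are assumptions.
  - id: metric.k8s.node.cpu.time
    type: metric
    metric_name: k8s.node.cpu.time
    stability: experimental
    brief: "Total cumulative CPU time consumed by the node."
    instrument: counter
    unit: "s"
```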

Describe alternatives you've considered

No response

Additional context

The lists below also include some metrics from namespaces other than k8s.*. I have left them in intentionally so that they can be taken into account as well.

kubeletstats metrics

k8s.node.cpu.usage
k8s.node.cpu.utilization
k8s.node.cpu.time
k8s.node.memory.available
k8s.node.memory.usage
k8s.node.memory.rss
k8s.node.memory.working_set
k8s.node.memory.page_faults
k8s.node.memory.major_page_faults
k8s.node.filesystem.available
k8s.node.filesystem.capacity
k8s.node.filesystem.usage
k8s.node.network.io
k8s.node.network.errors
k8s.node.uptime
k8s.pod.cpu.usage
k8s.pod.cpu.utilization: Deprecated
k8s.pod.cpu.time
k8s.pod.memory.available
k8s.pod.memory.usage
k8s.pod.cpu_limit_utilization
k8s.pod.cpu_request_utilization
k8s.pod.memory_limit_utilization
k8s.pod.memory_request_utilization
k8s.pod.memory.rss
k8s.pod.memory.working_set
k8s.pod.memory.page_faults
k8s.pod.memory.major_page_faults
k8s.pod.filesystem.available
k8s.pod.filesystem.capacity
k8s.pod.filesystem.usage
k8s.pod.network.io
k8s.pod.network.errors
k8s.pod.uptime
container.cpu.usage: #1128
container.cpu.utilization: Deprecated
container.cpu.time: #282
container.memory.available
container.memory.usage: ✅ #282
k8s.container.cpu_limit_utilization
k8s.container.cpu_request_utilization
k8s.container.memory_limit_utilization
k8s.container.memory_request_utilization
container.memory.rss
container.memory.working_set
container.memory.page_faults
container.memory.major_page_faults
container.filesystem.available
container.filesystem.capacity
container.filesystem.usage
container.uptime
k8s.volume.available
k8s.volume.capacity
k8s.volume.inodes
k8s.volume.inodes.free
k8s.volume.inodes.used
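
For reference, these metrics come from the kubeletstats receiver. A minimal, purely illustrative Collector configuration to enable it could look roughly like the sketch below; the endpoint, interval, and metric groups shown are assumptions, not a recommendation:

```yaml
receivers:
  kubeletstats:
    # Illustrative values; adjust to the target cluster.
    collection_interval: 20s
    auth_type: serviceAccount
    endpoint: ${env:K8S_NODE_NAME}:10250
    metric_groups:
      - node
      - pod
      - container
      - volume
```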

k8scluster metrics

k8s.container.cpu_request
k8s.container.cpu_limit
k8s.container.memory_request
k8s.container.memory_limit
k8s.container.storage_request
k8s.container.storage_limit
k8s.container.ephemeralstorage_request
k8s.container.ephemeralstorage_limit
k8s.container.restarts
k8s.container.ready
k8s.pod.phase
k8s.pod.status_reason
k8s.deployment.desired
k8s.deployment.available
k8s.cronjob.active_jobs
k8s.daemonset.current_scheduled_nodes
k8s.daemonset.desired_scheduled_nodes
k8s.daemonset.misscheduled_nodes
k8s.daemonset.ready_nodes
k8s.hpa.max_replicas
k8s.hpa.min_replicas
k8s.hpa.current_replicas
k8s.hpa.desired_replicas
k8s.job.active_pods
k8s.job.desired_successful_pods
k8s.job.failed_pods
k8s.job.max_parallel_pods
k8s.job.successful_pods
k8s.namespace.phase
k8s.replicaset.desired
k8s.replicaset.available
k8s.replication_controller.desired
k8s.replication_controller.available
k8s.resource_quota.hard_limit
k8s.resource_quota.used
k8s.statefulset.desired_pods
k8s.statefulset.ready_pods
k8s.statefulset.current_pods
k8s.statefulset.updated_pods
openshift.clusterquota.limit
openshift.clusterquota.used
openshift.appliedclusterquota.limit
openshift.appliedclusterquota.used
k8s.node.condition
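
Similarly, an illustrative sketch of enabling the k8s_cluster receiver that emits these metrics; the options shown are assumptions and only meant to show where the metrics originate:

```yaml
receivers:
  k8s_cluster:
    # Illustrative values; the conditions and allocatable types are examples only.
    auth_type: serviceAccount
    collection_interval: 30s
    node_conditions_to_report:
      - Ready
      - MemoryPressure
    allocatable_types_to_report:
      - cpu
      - memory
```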

Related issues

TBA

TylerHelmuth (Member) commented:

I love the idea of moving forward with this work. According to the Collector end-user survey, k8s and the Collector are a big part of our end users' stack, so moving the related semantic conventions forward is a great idea.

sirianni commented:

In general, my team has been happy with the metrics collected by kubeletstatsreceiver and how they are modeled. We are struggling significantly with the "state" metrics that come from k8sclusterreceiver. We are coming from a Datadog background.
