
0.45.0 - cadvisor / malformed metrics #3162

@reefland

I have successfully deployed cAdvisor 0.45.0 (I tried v0.45.0-containerd-cri as well) as a DaemonSet on K3s Kubernetes with containerd. I applied only the cadvisor-args.yaml overlay, as the others did not seem relevant.

History
The containerd bundled with K3s (v1.24.3+k3s1) is disabled because it does not support the ZFS snapshotter. Instead I'm using the containerd from Ubuntu 22.04 (1.5.9-0ubuntu3). It works perfectly for K3s containers with the ZFS snapshotter, but it does not work properly with kubelet / cAdvisor / Prometheus, because the image= and container= labels are missing. A simple Prometheus query such as:

container_cpu_usage_seconds_total{image!=""}

Returned an empty set.

What I See Now
It was suggested that I try this cAdvisor instead, and it is better: almost, but not quite, right. Hopefully I'm just missing something. That same Prometheus query now returns 111 rows; here are 3 examples:

container_cpu_usage_seconds_total{container="cadvisor", container_label_io_kubernetes_container_name="alertmanager", container_label_io_kubernetes_pod_namespace="monitoring", cpu="total", endpoint="http", id="/kubepods/burstable/pod7c0573cd-bba4-4f94-960f-c54cce2bc50e/5ff787742594c67500f255b9926c305246807e92303b43a19c7b95ba1d13dd59", image="quay.io/prometheus/alertmanager:v0.24.0", instance="10.42.0.143:8080", job="monitoring/cadvisor-prometheus-podmonitor", name="5ff787742594c67500f255b9926c305246807e92303b43a19c7b95ba1d13dd59", namespace="cadvisor", pod="cadvisor-tqbj6"}

container_cpu_usage_seconds_total{container="cadvisor", container_label_io_kubernetes_container_name="application-controller", container_label_io_kubernetes_pod_namespace="argocd", cpu="total", endpoint="http", id="/kubepods/burstable/pod9a033e88-9e20-43ef-8632-4551484be608/cedd2605364b981d2b5ec2d5e1eb6ae23abc39d64acf984b85e4f73b8e0a2689", image="quay.io/argoproj/argocd:v2.4.11", instance="10.42.0.143:8080", job="monitoring/cadvisor-prometheus-podmonitor", name="cedd2605364b981d2b5ec2d5e1eb6ae23abc39d64acf984b85e4f73b8e0a2689", namespace="cadvisor", pod="cadvisor-tqbj6"}

container_cpu_usage_seconds_total{container="cadvisor", container_label_io_kubernetes_container_name="applicationset-controller", container_label_io_kubernetes_pod_namespace="argocd", cpu="total", endpoint="http", id="/kubepods/pod5fc900fe-c754-4fe6-a023-b132ab7b0693/6b7b4511e56a66368c210874739d34df90b229d4b69369556b2e9fcc0971abaa", image="quay.io/argoproj/argocd:v2.4.11", instance="10.42.0.143:8080", job="monitoring/cadvisor-prometheus-podmonitor", name="6b7b4511e56a66368c210874739d34df90b229d4b69369556b2e9fcc0971abaa", namespace="cadvisor", pod="cadvisor-tqbj6"}

What doesn't seem right:

  • Every container label now equals "cadvisor" instead of the value in container_label_io_kubernetes_container_name
  • Every namespace label now equals "cadvisor" instead of the value in container_label_io_kubernetes_pod_namespace
  • Every pod label now equals "cadvisor-tqbj6" instead of the value specified in id
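
To make the expected result concrete, here is a minimal Python sketch of the label values I would expect for the first sample row. It is only an illustration, not a fix: the function name expected_labels and the pod_uid key are hypothetical names introduced here, and the sketch assumes the id path always has the form /kubepods/.../pod<uid>/<container-id>. Note that id carries the pod UID rather than the pod name, so only the UID can be recovered from it.

```python
import re

# Assumption: the cgroup path in `id` contains a segment "pod<uid>" where
# <uid> is a 36-character Kubernetes pod UID, as in the sample rows.
POD_UID_RE = re.compile(r"/pod([0-9a-f-]{36})/")


def expected_labels(labels: dict) -> dict:
    """Return a copy of `labels` with container/namespace taken from the
    container_label_io_kubernetes_* labels and the pod UID parsed from id.
    (Hypothetical helper, for illustration only.)"""
    fixed = dict(labels)
    name = labels.get("container_label_io_kubernetes_container_name")
    namespace = labels.get("container_label_io_kubernetes_pod_namespace")
    if name:
        fixed["container"] = name
    if namespace:
        fixed["namespace"] = namespace
    match = POD_UID_RE.search(labels.get("id", ""))
    if match:
        fixed["pod_uid"] = match.group(1)
    return fixed


# Labels copied from the first sample row (unrelated labels omitted).
sample = {
    "container": "cadvisor",
    "container_label_io_kubernetes_container_name": "alertmanager",
    "container_label_io_kubernetes_pod_namespace": "monitoring",
    "id": "/kubepods/burstable/pod7c0573cd-bba4-4f94-960f-c54cce2bc50e/"
          "5ff787742594c67500f255b9926c305246807e92303b43a19c7b95ba1d13dd59",
    "namespace": "cadvisor",
    "pod": "cadvisor-tqbj6",
}

result = expected_labels(sample)
print(result["container"], result["namespace"], result["pod_uid"])
```

In other words, I expected container and namespace to describe the measured container, not the cAdvisor pod that was scraped.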

The Prometheus query container_cpu_usage_seconds_total{image!="",container!="cadvisor"} returns an empty set.

Suggestions?
