
cAdvisor metrics stopped working correctly in K3s 1.20 #2831

Closed
eplightning opened this issue Jan 20, 2021 · 15 comments

@eplightning

Environmental Info:
K3s Version:
k3s version v1.20.0+k3s2 (2ea6b16)
go version go1.15.5

Node(s) CPU architecture, OS, and Version:
Linux kube-master0 5.4.0-54-generic #60-Ubuntu SMP Fri Nov 6 10:37:59 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
1 master, 2 workers, embedded containerd (no docker, no custom CRI)

Describe the bug:
cAdvisor is unable to connect to containerd, resulting in mostly empty labels (container, image, name) in metrics.

Adding --kubelet-arg containerd=/run/k3s/containerd/containerd.sock to k3s launch args fixes the issue.
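
For clarity, a minimal sketch of the full launch command with the workaround applied (adjust to however you actually start k3s, e.g. a systemd unit):

# Workaround: point the cAdvisor-owned containerd flag at the socket k3s actually uses
k3s server --kubelet-arg containerd=/run/k3s/containerd/containerd.sock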

Steps To Reproduce:

k3s server

curl -k --cert /var/lib/rancher/k3s/server/tls/client-admin.crt --key /var/lib/rancher/k3s/server/tls/client-admin.key https://127.0.0.1:6443/api/v1/nodes/NODE_NAME/proxy/metrics/cadvisor
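
An equivalent scrape without juggling client certificates, assuming kubectl is already configured against the cluster (NODE_NAME is the same placeholder as in the curl above):

kubectl get --raw /api/v1/nodes/NODE_NAME/proxy/metrics/cadvisor | head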

Expected behavior:
Metrics from cAdvisor have non-empty labels:

container_cpu_load_average_10s{container="fluent-bit",id="/kubepods/burstable/pod227a7799-c04b-419a-9d96-98b5ca911666/67d16f0aa8f6e214914c9769a55e8154a9f133646b36974d03b8a3f185ae3e38",image="docker.io/fluent/fluent-bit:1.6",name="67d16f0aa8f6e214914c9769a55e8154a9f133646b36974d03b8a3f185ae3e38",namespace="logging",pod="fluent-bit-vw9cg"} 0 1611138060164

Actual behavior:
cAdvisor metric labels are empty:

container_cpu_load_average_10s{container="",id="/kubepods/burstable/pod227a7799-c04b-419a-9d96-98b5ca911666/67d16f0aa8f6e214914c9769a55e8154a9f133646b36974d03b8a3f185ae3e38",image="",name="",namespace="",pod=""} 0 1611135942449

Additional context / logs:
Most likely a regression caused by 5b318d0.

That value is used for both argsMap["container-runtime-endpoint"] and argsMap["containerd"], and it seems the containerd one cannot be a URI.
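
For illustration (a sketch of the distinction, not lifted from the k3s source), the two args expect different forms of the same socket:

# kubelet / CRI side: expects a URI
container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock
# cAdvisor side: expects a plain filesystem path, which is presumably why reusing the URI breaks it
containerd=/run/k3s/containerd/containerd.sock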

@brandond
Contributor

brandond commented Jan 20, 2021

Have you tried v1.20.2? Possibly related to kubernetes/kubernetes#97006

Although the fact that overriding the arg resolves it is interesting.

brandond self-assigned this Jan 20, 2021
brandond added the kind/bug label Jan 20, 2021
@eplightning
Author

Upgrading to v1.20.2+k3s1 (1d4adb0) didn't help; I still need that extra kubelet arg for proper labels.

The only change I noticed was the presence of additional metrics, presumably added by kubernetes/kubernetes#97006.

@mnorrsken
Contributor

Great! It's working for me as well after adding the kubelet arg. 👍
I had the same problem and thought the "kubernetes masters" wanted me to do this in Prometheus queries:
label_replace(container_cpu_load_average_10s,"container_id", "containerd://$1","id","(?:/.+){3}/(.*)") * on (container_id) group_left(container) (kube_pod_container_info)
😄 @eplightning you saved me a week of creating overcomplicated dashboards!

brandond added this to To Triage in Development [DEPRECATED] via automation Jan 25, 2021
brandond moved this from To Triage to To Test in Development [DEPRECATED] Jan 25, 2021
@HaveFun83

Same here. Thanks a lot for the workaround.

@ShylajaDevadiga
Contributor

Issue was reproducible on k3s v1.20.2+k3s1

container_spec_cpu_shares{container="",id="/kubepods/burstable/pod9ac68d71-726d-476e-a3e0-3ad9f01765a6/445915b33977b7677603af439673633fc812ed98af2d3cdc1d85d3b937f6654a",image="",name="",namespace="",pod=""}

Validated that metrics have non-empty labels using the CI build k3s version v1.20.2+k3s-c5e2676d:

container_cpu_load_average_10s{container="",id="/kubepods/burstable/podf75b023d-1a57-4228-826b-7f6e57ab978c/685fd28b35a34709ed4295841a54f2df7118813575eb12da095a43fccb92e0d9",image="docker.io/rancher/pause:3.1",name="685fd28b35a34709ed4295841a54f2df7118813575eb12da095a43fccb92e0d9",namespace="kube-system",pod="coredns-854c77959c-x42xt"} 0 1612513308508

@lackhoa

lackhoa commented Mar 18, 2021

Sorry for commenting on a closed issue, but I am using v1.20.2+k3s1 and adding the kubelet arg doesn't solve the issue for me.
Furthermore, is there any documentation on the containerd kubelet arg?

  • I can't see it anywhere in the config reported by kubectl get --raw "/api/v1/nodes/${NODE}/proxy/configz" (the other args I added are all there; see the sketch below).
  • I don't see it in the docs.
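
A likely explanation, offered as an assumption rather than a confirmed answer: configz only returns the kubelet's KubeletConfiguration object, and flags that exist only on the command line (like the cAdvisor-owned containerd flag) are not part of that object, so they never show up there. A rough check, assuming kubectl access, jq installed, and NODE set as above:

# configz exposes only KubeletConfiguration fields, not command-line-only flags
kubectl get --raw "/api/v1/nodes/${NODE}/proxy/configz" | jq '.kubeletconfig | keys'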

@brandond
Contributor

brandond commented Mar 18, 2021

@lackhoa the fix isn't in v1.20.2+k3s1. QA tested on a post-release CI build (v1.20.2+k3s-c5e2676d) off master. That commit is included in v1.20.4+k3s1; please use that.

Not all kubelet args are documented; for historical reasons, all cAdvisor args are also valid kubelet args, despite not being in the docs.

@lackhoa

lackhoa commented Mar 18, 2021

@brandond Ah thanks, I'll try that.
But given that the workaround worked for the OP, I assumed it'd work for me too.

@eplightning
Author

I think the containerd flag was never actually documented since it's consumed by cAdvisor, not directly by the kubelet. It's deprecated now, but still required if you're running containerd on a non-default socket path. More details: kubernetes/kubernetes#89903

@brandond
Contributor

Yeah, it's one of those things that make it clear that dockershim is still the only thing that upstream actually tests, despite all the big talk about deprecating it. You run into all kinds of weird issues if you actually use a different runtime.

@lackhoa

lackhoa commented Mar 19, 2021

Alright, I can confirm that my problem has been resolved after upgrading to v1.20.4+k3s1, with the added containerd arg.
However, what tripped me up is that some time series belonging to the metric container_cpu_cfs_periods_total still have an empty container field.
Don't know what to make of it, but I'm happy that the container field is back.
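
For what it's worth, series that keep an empty container label are often the pod-level (sandbox) cgroup aggregates rather than lost data; that is an assumption worth checking against your own output. Reusing the curl from the issue description to spot-check them:

# List the container_cpu_cfs_periods_total samples that still have container=""
# and inspect their id= cgroup paths
curl -sk --cert /var/lib/rancher/k3s/server/tls/client-admin.crt \
  --key /var/lib/rancher/k3s/server/tls/client-admin.key \
  https://127.0.0.1:6443/api/v1/nodes/NODE_NAME/proxy/metrics/cadvisor \
  | grep '^container_cpu_cfs_periods_total{container=""'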

@brandond
Contributor

You don't need to add the containerd arg on v1.20.4.

@zalegrala

Are there docs for how to enable these metrics? I'm looking to use a dashboard that makes use of the metric kube_pod_container_info, but I don't know how to enable it or what to scrape.

@discordianfish
Contributor

I'm running v1.24.2+k3s2 and still have to pass --kubelet-arg containerd=/run/k3s/containerd/containerd.sock to make this work.
Another regression?

@brandond
Contributor

@discordianfish No, you're just on an old version of K3s. Update to the latest 1.24 patch release.

k3s-io locked and limited conversation to collaborators Nov 22, 2022