1.25.3 cri-o - missing container_network* metrics in kubelet cAdvisor #6805

Closed
Noksa opened this issue Apr 12, 2023 · 9 comments
Labels: kind/bug, lifecycle/stale

Comments

@Noksa

Noksa commented Apr 12, 2023

What happened?

The container_network* metrics are missing after we upgraded CRI-O from 1.25.2 to 1.25.3:

❯ ~ k get --raw /api/v1/nodes/NODE_NAME/proxy/metrics/cadvisor | grep "container_network"
# HELP container_network_receive_bytes_total Cumulative count of bytes received
# TYPE container_network_receive_bytes_total counter
container_network_receive_bytes_total{container="",id="/",image="",interface="cali4d130b05435",name="",namespace="",pod=""} 3.195685419e+09 1681300536414
container_network_receive_bytes_total{container="",id="/",image="",interface="caliecaa2707004",name="",namespace="",pod=""} 3.82519493e+08 1681300536414
container_network_receive_bytes_total{container="",id="/",image="",interface="ens3",name="",namespace="",pod=""} 4.698722634e+09 1681300536414
container_network_receive_bytes_total{container="",id="/",image="",interface="tunl0",name="",namespace="",pod=""} 2.596887196e+09 1681300536414
# HELP container_network_receive_errors_total Cumulative count of errors encountered while receiving

So there are no per-container/per-pod metrics, only the series for the root cgroup (id="/").

If we use 1.25.2 cri-o, we see the metrics:

❯ ~ k get --raw "/api/v1/nodes/NODE/proxy/metrics/cadvisor" | grep "container_network_receive_bytes_total"
# HELP container_network_receive_bytes_total Cumulative count of bytes received
# TYPE container_network_receive_bytes_total counter
container_network_receive_bytes_total{container="",id="/",image="",interface="cali5488198f6fd",name="",namespace="",pod=""} 9.5952393e+07 1681300719581
container_network_receive_bytes_total{container="",id="/",image="",interface="cali5eda668cb9c",name="",namespace="",pod=""} 5.823608e+06 1681300719581
container_network_receive_bytes_total{container="",id="/",image="",interface="enp0s6",name="",namespace="",pod=""} 7.86041475e+08 1681300719581
container_network_receive_bytes_total{container="",id="/",image="",interface="tunl0",name="",namespace="",pod=""} 3.6428018e+07 1681300719581
container_network_receive_bytes_total{container="POD",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod05c79ce2_b716_4441_9882_e70507c657eb.slice/crio-c2c3ded15e26afaff185237c3e6e619fb3cc9c91ebee4de01846033572dc1941.scope",image="",interface="cali5488198f6fd",name="k8s_POD_kube-proxy-szlft_kube-system_05c79ce2-b716-4441-9882-e70507c657eb_0",namespace="kube-system",pod="kube-proxy-szlft"} 9.5919134e+07 1681300708300
container_network_receive_bytes_total{container="POD",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod05c79ce2_b716_4441_9882_e70507c657eb.slice/crio-c2c3ded15e26afaff185237c3e6e619fb3cc9c91ebee4de01846033572dc1941.scope",image="",interface="cali5eda668cb9c",name="k8s_POD_kube-proxy-szlft_kube-system_05c79ce2-b716-4441-9882-e70507c657eb_0",namespace="kube-system",pod="kube-proxy-szlft"} 5.821388e+06 1681300708300
container_network_receive_bytes_total{container="POD",id="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod05c79ce2_b716_4441_9882_e70507c657eb.slice/crio-c2c3ded15e26afaff185237c3e6e619fb3cc9c91ebee4de01846033572dc1941.scope",image="",interface="enp0s6",name="k8s_POD_kube-proxy-szlft_kube-system_05c79ce2-b716-4441-9882-e70507c657eb_0",namespace="kube-system",pod="kube-proxy-szlft"} 7.85988116e+08 1681300708300

K8s version remains the same.
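
The difference between the two outputs above can be checked mechanically. Below is a minimal Python sketch, not part of the original report, that scans cAdvisor text output and collects the (pod, interface) pairs of `container_network_*` samples that carry a non-empty `pod` label. The function name `per_pod_network_series` is hypothetical, the parsing is a simplified regex rather than a full Prometheus text-format parser, and the sample lines are shortened from the outputs in this issue (the `id` and `name` values are abbreviated for illustration).

```python
import re

# One sample line in Prometheus text format: name{labels} value [timestamp]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)'
)
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def per_pod_network_series(metrics_text, prefix="container_network_"):
    """Return the set of (pod, interface) pairs that have a non-empty pod label."""
    found = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if not match or not match.group("name").startswith(prefix):
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if labels.get("pod"):
            found.add((labels["pod"], labels.get("interface", "")))
    return found


# Shortened sample in the shape seen with CRI-O 1.25.3: root cgroup only.
BROKEN_SAMPLE = (
    '# TYPE container_network_receive_bytes_total counter\n'
    'container_network_receive_bytes_total{container="",id="/",image="",'
    'interface="ens3",name="",namespace="",pod=""} 4.698722634e+09 1681300536414\n'
)

# Shortened sample in the shape seen with CRI-O 1.25.2: a per-pod series is present.
WORKING_SAMPLE = BROKEN_SAMPLE + (
    'container_network_receive_bytes_total{container="POD",id="/kubepods.slice/x",'
    'image="",interface="enp0s6",name="k8s_POD_kube-proxy-szlft",'
    'namespace="kube-system",pod="kube-proxy-szlft"} 7.85988116e+08 1681300708300\n'
)

print(per_pod_network_series(BROKEN_SAMPLE))
print(per_pod_network_series(WORKING_SAMPLE))
```

An empty result for a node that runs pods would match the behavior reported here for 1.25.3. To use it on a real node, the text returned by the `/metrics/cadvisor` endpoint shown above can be passed in as `metrics_text`.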

What did you expect to happen?

Container network metrics should work as before, but they don't work in 1.25.3.

If we downgrade CRI-O to 1.25.2, the metrics work again.

How can we reproduce it (as minimally and precisely as possible)?

Install CRI-O 1.25.3 with Kubernetes 1.25.8 or 1.26.3.

Anything else we need to know?

No response

CRI-O and Kubernetes version

$ crio --version
crio version 1.25.3
Version:        1.25.3
GitCommit:      unknown
GitCommitDate:  unknown
GitTreeState:   clean
BuildDate:      2023-04-04T14:42:21Z
GoVersion:      go1.19
Compiler:       gc
Platform:       linux/arm64
Linkmode:       dynamic
BuildTags:
  apparmor
  seccomp
  containers_image_ostree_stub
  exclude_graphdriver_btrfs
  exclude_graphdriver_devicemapper
  containers_image_openpgp
LDFlags:          -s -w -X github.com/cri-o/cri-o/internal/pkg/criocli.DefaultsPath="" -X github.com/cri-o/cri-o/internal/version.buildDate=2023-04-04T14:42:21Z
SeccompEnabled:   true
AppArmorEnabled:  false
Dependencies:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:33:11Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.8", GitCommit:"0ce7342c984110dfc93657d64df5dc3b2c0d1fe9", GitTreeState:"clean", BuildDate:"2023-03-15T13:33:02Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/arm64"}

OS version

# On Linux:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
$ uname -a
Linux kazoo5-k8s-cirrus1-sydney-amqp-kazoo5-k8s-cirrus1-sydn-f5858c91 5.15.0-1032-oracle #38~20.04.1-Ubuntu SMP Thu Mar 23 20:47:35 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

Additional environment details (AWS, VirtualBox, physical, etc.)

Oracle self-managed k8s on ubuntu

Noksa added the kind/bug label on Apr 12, 2023.

@sohankunkerkar
Member

It's related to #6657

@rkojedzinszky
Contributor

It seems that commit 565587e on the release-1.25 branch causes the wrong behavior.

@haircommander
Member

@n4j can you PTAL?

@n4j
Contributor

n4j commented Apr 17, 2023

@haircommander Sure.

I am on paternity leave until May 10; I'll get started after that.

/assign

@github-actions

A friendly reminder that this issue had no activity for 30 days.

github-actions bot added the lifecycle/stale label on May 18, 2023.
haircommander added the stats label on Jun 21, 2023.

@haircommander
Member

I believe this is fixed now. A new 1.25 release will be cut fairly soon, and it will include the fix.

haircommander removed the stats label on Jun 21, 2023.

@atirek89

Hello everyone, we are facing the same issue with CRI-O 1.25.3. Any idea which 1.25 CRI-O release fixes it?

@sohankunkerkar
Member

I think we might need to cut a new minor release for 1.25 to get in #6930
cc @haircommander

@haircommander
Member

Would you like to begin that process @sohankunkerkar ?

6 participants