Node details dashboard has missing metrics with containerd #2800
Comments
Containerd exposes a configurable Prometheus-compatible metrics endpoint (not part of the default config.toml) that we do not set yet.
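As an illustration (not part of the original thread), enabling that endpoint on a node could look roughly like this; the listen address is only an example and the config path assumes the default location:
```
# Sketch only: append a [metrics] section to containerd's config (absent from
# the default config.toml), restart the service, and scrape the endpoint.
cat <<'EOF' | sudo tee -a /etc/containerd/config.toml
[metrics]
  address = "127.0.0.1:1338"
EOF
sudo systemctl restart containerd
curl -s http://127.0.0.1:1338/v1/metrics | head
```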
Not sure if containerd even exposes network I/O metrics - would need some more investigation.
Any plans to mitigate this issue, @wyb1 @danielfoehrKn?
Still working on other issues, but I intend to pick it up when picking up ContainerRuntimes again. Otherwise @wyb1 might take a look.
/ping @wyb1 @istvanballok |
My latest info is that the cadvisor component does not (currently) expose some metrics if the container runtime is containerd. This is the reason why the panels in the screenshot above are empty. cc @voelzmo
The Gardener project currently lacks enough active contributors to adequately respond to all issues and PRs.
/close
@gardener-ci-robot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
Looking into this issue, I found that the network-related metrics are actually exposed by cadvisor. See gardener/charts/seed-monitoring/charts/core/charts/prometheus/templates/config.yaml, lines 168 to 170 at 21ac430.
If the container label is empty, the series is dropped. This heuristic was probably introduced to keep only relevant metrics. With docker as container runtime, the container label was "POD"; with containerd, it is empty. Note that for network-related metrics the container label doesn't make sense (and hence the empty value is correct), because the containers of a pod share the same network namespace, so the network-related metrics cannot distinguish between containers.
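As a quick way to check this claim (an editorial example, not from the thread; the Prometheus address is a placeholder), one can query Prometheus for a cadvisor network series and inspect its label set:
```
# Query a network series and print its labels; with containerd the
# container label is expected to be empty on these series.
curl -sG http://localhost:9090/api/v1/query \
  --data-urlencode 'query=container_network_receive_bytes_total{container=""}' \
  | jq '.data.result[].metric'
```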
@istvanballok: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-lifecycle rotten
The runtime cgroup is the cgroup path the container runtime is expected to be isolated in.
https://github.com/kubernetes/kubernetes/blob/efa5692c0b5f01bd33d8a112ab98b386300198e7/pkg/kubelet/config/flags.go#L31
Without this flag, the cadvisor metrics exposed by the kubelet via
```
k proxy
curl -s http://localhost:8001/api/v1/nodes/<node>/proxy/metrics/cadvisor
```
in a cluster with containerd as container runtime only contain metrics for `/system.slice/kubelet.service`. With this command line flag, metrics are reported for both `/system.slice/kubelet.service` and `/system.slice/containerd.service`.
This is the expected behavior based on the experience with clusters that use docker as a container runtime: in those clusters, metrics are reported for both the kubelet.service and the docker.service. Consequently, in clusters with containerd, one would expect metrics for both the kubelet.service and the containerd.service.
See the system services panels in the issue gardener#2800
Co-authored-by: Wesley Bermbach <wesley.bermbach@sap.com>
Co-authored-by: Istvan Zoltan Ballok <istvan.zoltan.ballok@sap.com>
Co-authored-by: Jeremy Rickards <jeremy.rickards@sap.com>
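For reference, a minimal sketch of the flag described above; how it is wired into the kubelet unit on a Gardener node is deployment-specific and not shown in this thread:
```
# Illustrative kubelet flags: pin the runtime cgroup to containerd's systemd
# slice so /system.slice/containerd.service shows up in the cadvisor metrics.
kubelet \
  --container-runtime-endpoint=unix:///run/containerd/containerd.sock \
  --runtime-cgroups=/system.slice/containerd.service
```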
How to categorize this issue?
/area monitoring
/kind bug
/priority normal
What happened:
The node details dashboard has missing metrics when the shoot is configured to use containerd.
Network I/O pressure is missing and the system service usage is also missing.
What you expected to happen:
The dashboard should contain the data. Example with docker (screenshot not reproduced here):
How to reproduce it (as minimally and precisely as possible):
Create a shoot with containerd configured as the container runtime (see the sketch below), then check the node details dashboard and see the missing data.
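A hypothetical way to do this on an existing shoot (shoot name, project namespace, and worker index are placeholders; the spec path assumes Gardener's worker CRI field):
```
# Switch the first worker pool of an existing shoot to containerd, then open
# the node details dashboard for one of its nodes.
kubectl -n garden-myproject patch shoot my-shoot --type=json \
  -p='[{"op":"add","path":"/spec/provider/workers/0/cri","value":{"name":"containerd"}}]'
```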
Anything else we need to know?:
Environment:
Kubernetes version (use `kubectl version`): 1.18.5