Describe the bug
I'm trying to use the metric adv_node_apiserver_no_response to detect API server problems in my clusters:
| `adv_node_apiserver_no_response` | ***Advanced***: number of packets that did not get a response from API server |
I created a "vanilla" EKS cluster with Kubernetes 1.30 and then looked at the value reported by the Retina metric.
To my surprise, it was increasing continuously. Apart from that metric, the cluster does not appear to have any issues or API server problems, so I enabled debug log level to see what the actual problem was.
However, per https://github.com/microsoft/retina/blame/b76bbcdb0ede6665d75755eaf0d8fd0ae07edfd0/pkg/module/metrics/latency.go#L126-L130, the message logged via zap.Any is always
"unsupported value type",
which is neither informative nor helpful.
Example of the messages I get:
ts=2025-06-12T14:49:33.339Z level=debug caller=metrics/latency.go:126 msg="Evicted item" item= itemError="unsupported value type"
ts=2025-06-12T14:43:38.295Z level=debug caller=metrics/latency.go:129 msg="Incremented no response metric" metric= metricError="unsupported value type"
...
To Reproduce
Steps to reproduce the behavior:
- Create an EKS cluster with version 1.30
- Change the log level to Debug
Expected behavior
I expect the item/metric error in these log messages to be more descriptive.
Platform (please complete the following information):
- Kubernetes Version: 1.30
- Host: EKS
- Retina Version: v0.0.34