New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When Topology Aware Hints are disabled, kube-proxy shouldn't spam the logs #124341
Comments
/sig network Someone has encountered the same problem and increased the log verbosity to 2 in PR #123322. This change will be included in Kubernetes version 1.30. You can wait 1 day for the 1.30 version to be released. After version 1.30 is released, you can upgrade the cluster to version 1.30, then as long as the log verbosity is less than 2 this problem can be avoided. |
/area logging |
Any chance we can see this back-ported to 1.28 and 1.29? It seems like a relatively safe thing to do, and 1.30 is going to take a while to get upgraded to (especially because we use EKS, which of course is a little bit delayed on the release cycle). |
anyone willing to take on this ? https://github.com/kubernetes/community/blob/master/contributors/devel/sig-release/cherry-picks.md /help |
@aojea: GuidelinesPlease ensure that the issue body includes answers to the following questions:
For more details on the requirements of such an issue, please see here and ensure that they are met. If this request no longer meets these requirements, the label can be removed In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/assign |
/close cherry picks were merged |
@aojea: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks everyone! |
What happened?
We have a mix of smaller and larger clusters - and for our larger clusters it is critical that the Topology Aware Routing is used on some of our high volume services. it works great, and it's fallback mode is reasonable.
The problem is this line: https://github.com/kubernetes/kubernetes/blob/v1.28.9/pkg/proxy/topology.go#L168-L171
With this line in place, on a reasonably sized cluster, we emit hundreds of millions of these log messages daily:
A LogQL query shows that on one cluster, we had over 1.2 billion log events from this message:
Here's an hour of the data:
What did you expect to happen?
I understand that its useful to tell people that the zone aware routing isn't working - but doing it in log messages seems less than useful. I don't know anyone who operates clusters and monitors these log messages for errors, but rather would use metrics to alert on such a behavior.
I expect that the Kubernetes Service will report that Topology Aware Routing is or is not working (which it does) via the
kubectl describe
command - and that's it. Other than that, I expectkube-proxy
to consider this as some kind of a debug message and not emit it as aninfo
level message.How can we reproduce it (as minimally and precisely as possible)?
Create a service with
service.kubernetes.io/topology-mode: auto
in the annotations, and only put up one or two pods.. then start sending traffic to the service. Check logs.Anything else we need to know?
We're going to drop these messages with Promtail filtering ... but we really don't want them at all because it honestly takes time up on the hosts to write the messages to local disk, it also makes the
kubectl logs kube-proxy-...
command nearly useless, and it's just overall a waste of space I think.Kubernetes version
Cloud provider
OS version
AWS Bottlerocket 1.19.2
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)
The text was updated successfully, but these errors were encountered: