Relying on undefined behavior - parsing kubernetes resource names #716
Thank you for bringing this to our attention. We agree that relying on the random suffix of a pod or other resource is brittle and not a reliable way to extract metadata. We acknowledge that labels are the intended mechanism for this, and moving forward we will explore using labels to extract metadata instead.
@thockin It is similar for the pods and external metric types. A Deployment's pods have a ReplicaSet as their OwnerReference, and I think it is unreasonable for the process inside a pod to query the apiserver for the associated Deployment name. Currently we set the Deployment name as a custom label on the container, passed in manually through environment variables, which is not ideal. Can you offer any suggestions?
This is a good question - I am not an expert in the metrics but it seems like what you have is a feature request for more labels. Are you trying to get this information from inside the pod itself (its own parent ref), or just somewhere in the monitoring/collection path? Are you trying to extract the replicaset or the deployment, or both? What if a pod is not run under a deployment or replicaset? E.g. a StatefulSet or Job or something custom?
Yes. This project works with three kinds of metrics, and the label considerations differ for resource and pod metrics. For resource-type metrics, we would like the metrics obtained through the kubelet path to at least carry the owner reference, and ideally the top-level owner such as the Deployment. For pod-type metrics, we would like the process inside the pod to obtain this information easily (natively, rather than through manually set environment variables) so it can be used to populate custom metric labels. For pods that are not under a Deployment or ReplicaSet, such as those of a StatefulSet or DaemonSet, the pod's parent reference can still be queried. For standalone pods the name is specified at creation, so exact matching is straightforward; I think setting the reference to null in that case is acceptable.
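For context on what such a query looks like today, below is a minimal client-go sketch that walks a pod's OwnerReferences up to its top-level owner. The function name, namespace, and pod name are placeholders, and it assumes the caller has RBAC permission to get pods and replicasets.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// resolveTopOwner walks Pod -> ReplicaSet -> Deployment using OwnerReferences.
// Pods owned directly by a StatefulSet, DaemonSet, Job, or custom controller
// return that owner instead.
func resolveTopOwner(ctx context.Context, cs kubernetes.Interface, ns, podName string) (kind, name string, err error) {
	pod, err := cs.CoreV1().Pods(ns).Get(ctx, podName, metav1.GetOptions{})
	if err != nil {
		return "", "", err
	}
	for _, ref := range pod.OwnerReferences {
		if ref.Kind != "ReplicaSet" {
			return ref.Kind, ref.Name, nil
		}
		rs, err := cs.AppsV1().ReplicaSets(ns).Get(ctx, ref.Name, metav1.GetOptions{})
		if err != nil {
			return "", "", err
		}
		for _, rsRef := range rs.OwnerReferences {
			// A ReplicaSet created by a Deployment carries an owner reference to it.
			return rsRef.Kind, rsRef.Name, nil
		}
		return "ReplicaSet", rs.Name, nil
	}
	return "", "", fmt.Errorf("pod %s/%s has no owner", ns, podName)
}

func main() {
	cfg, err := rest.InClusterConfig() // requires running inside the cluster
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	kind, name, err := resolveTopOwner(context.Background(), cs, "default", "my-pod")
	if err != nil {
		panic(err)
	}
	fmt.Println(kind, name)
}
```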
Sorry, I'm still not clear on what you are already doing (perhaps in ugly ways) and what you are asking for as net-new features.
Is there a feature request here?
As far as I can tell,
If a pod's random suffix is fragile and could change at any time, then we have two new requirements. The first is to add owner or top-owner labels to kubelet metrics such as container_cpu_usage_seconds_total. The second is to let processes inside a pod easily obtain the owner or top-owner information, so it can be used to fill in monitoring metric labels. If the provided owner is a ReplicaSet instead of a Deployment, then querying historical data requires a Prometheus expression like {ownerReferences=~"test-111111|test-222222|test-3333333..."}, which becomes very ugly when the Deployment rolls out new versions frequently. An expression like {topOwnerKind="deployment",topOwnerName="test"} would be much simpler.
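To make the contrast concrete, the following sketch just builds both query strings. The ReplicaSet names are invented, and topOwnerKind/topOwnerName are the hypothetical labels proposed above, not labels that exist on these metrics today.

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Brittle form: every historical ReplicaSet behind the "test" deployment has
	// to appear in the regex, and the list grows with each rollout.
	replicaSets := []string{"test-111111", "test-222222", "test-333333"}
	byOwner := fmt.Sprintf(`container_cpu_usage_seconds_total{ownerReferences=~"%s"}`,
		strings.Join(replicaSets, "|"))

	// Simpler form: a single selector, assuming hypothetical top-owner labels existed.
	byTopOwner := `container_cpu_usage_seconds_total{topOwnerKind="deployment",topOwnerName="test"}`

	fmt.Println(byOwner)
	fmt.Println(byTopOwner)
}
```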
I'm afraid the issue is not quite so easy. There are a bunch of problems.

Deployments and ReplicaSets can orphan and adopt pods, so any additional information we provide can change over time. Is that acceptable? On Monday the pod "foobar-deadbeef93-xyz76" is owned by replicaset "foobar-deadbeef93", which is owned by deployment "foobar", but on Tuesday it is owned by replicaset "hamburger-aabbccddee" and deployment "hamburger". The pod's name only means something at the instant it is created. And even THEN, a long name can be truncated to make room for the random suffix, such that it doesn't actually match.

We could add well-known labels like workload.kubernetes.io/{deployment,replicaset,statefulset,job,...} - this works for core k8s, but not for 3rd-party workload controllers (which do exist!!). We'd have to verify that all of these APIs' object names are valid label values (hint: they are not). We could hash the deployment name if it is too long, or use the UID, but all that tells you is that 2 pods are in the same deployment, not what the deployment name actually is. We could use annotations instead, but those are not selectable. We could do a combination of labels AND annotations, I guess.

Given the adoption issue, we would have to update those on the fly, meaning EVERY workload controller needs to handle it, and they might need more RBAC access. We can't just hash the selector, because an adoption could happen with a different-but-compatible selector. Oh, and env vars are not updatable, so the downward API for this could only be through projected volumes.
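As a rough illustration of the points above about invalid label values and hashing, a controller could fall back to a hash when a workload name cannot be stored directly in a label. labelSafe below is a hypothetical helper, not an existing API.

```go
package main

import (
	"crypto/sha256"
	"fmt"

	"k8s.io/apimachinery/pkg/util/validation"
)

// labelSafe is a made-up helper: it returns the workload name unchanged when it
// is already a valid label value, and otherwise a short hash of it. The hash
// lets you tell that two pods belong to the same workload, but not recover the
// workload's name.
func labelSafe(name string) string {
	if len(validation.IsValidLabelValue(name)) == 0 {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	return fmt.Sprintf("h-%x", sum[:8])
}

func main() {
	fmt.Println(labelSafe("foobar"))                     // valid: used as-is
	fmt.Println(labelSafe("not/a/valid.label.value...")) // invalid: hashed
}
```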
Almost a year later this bug still exists. If Kubernetes changes this behavior, you will break.
https://github.com/gocrane/crane/blob/main/pkg/utils/expression_prom_deafult_test.go#L18
The random suffix of a pod or other resource is not guaranteed to be 5 characters, nor is the character set defined. Someone looked into the implementation details and took a dependency on them. This is brittle and could change at any time.
You should be using labels to extract metadata, not parsing the name. That's literally why they exist.