Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kubernetes SD: Drop __meta_kubernetes_{pod,container}_name labels #1833

Closed
jimmidyson opened this Issue Jul 20, 2016 · 13 comments

Comments

Projects
None yet
4 participants
@jimmidyson
Copy link
Member

jimmidyson commented Jul 20, 2016

Currently Prometheus pod discovery adds the __meta_kubernetes_pod_name labels but this is also inherited via Docker labels that Kuberbetes sets as io.kubernetes.pod.name (available as sanitized Prometheus label io_kubernetes_pod_name) exposed directly from cadvisor.

This is the same for pod namespace & container name (io_kubernetes_pod_namespace & io_kubernetes_container_name respectively.

Seems reasonable to not duplicate labels to reduce data storage, in this case there are at least 3 labels that are duplicated for all pods.

This has been available from 1.2.0 pre-releases, GA release was 2016-03-16. Guess we need to pick a support policy for Kubernetes SD as well?

Any thoughts on this @pdbogen?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 20, 2016

In the spirit of 1.0, I don't think we should be dropping anything like this until 2.0.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 20, 2016

Isn't this still marked as experimental? Even so, I kinda agree it should remain now I think about it. Closing...

@jimmidyson jimmidyson closed this Jul 20, 2016

@barkerd427

This comment has been minimized.

Copy link

barkerd427 commented Jul 20, 2016

Jimmi, could you explain why you think it should stay now?

On Jul 20, 2016 06:33, "Jimmi Dyson" notifications@github.com wrote:

Isn't this still marked as experimental? Even so, I kinda agree it should
remain now I think about it. Closing...


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#1833 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AF0uxc1E8VR3uw42CiTSoQx_rir9E-3Tks5qXgeGgaJpZM4JQmED
.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 20, 2016

io_kubernetes_pod_name, etc labels are only available post-scrape so can't be used to filter targets in relabelling phase, unlike the kubernetes_pod_name label that is added by discovery.

Thinking about it it seems to make more sense to use the same name in discovery as would be retrieved via scraping.

Another option is that they could be dropped in metric relabelling (post scrape, pre ingestion) phase I guess?

I'm erring towards renaming the discovered labels to match the scraped labels, but will defer to the project maintainers for what compatibility guarantees they want around this.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jul 20, 2016

Just as a side thought: Duplicated labels do not create more time series and have negligible storage overhead, only indexing becomes a bit more expensive.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 20, 2016

@beorn7 That is interesting - when I read:

CAUTION: Remember that every unique key-value label pair represents a new time series, which can dramatically increase the amount of data stored.

I assumed that reducing label duplication would be a good thing, but from what you've said it doesn't matter too much?

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 20, 2016

I assumed that reducing label duplication would be a good thing, but from what you've said it doesn't matter too much?

Performance wise it doesn't matter much, however it's good practice that each label should help with distinguishing a time series.

This really sounds like a Cadvisor issue, as Prometheus can't do anything about arbitrary instrumentation labels.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 20, 2016

cadvisor just exposes all Docker labels that are applied to the containers. Kubernetes applies the io.kubernetes.pod.name Docker label when creating the Docker container. Prometheus discovery uses kubernetes_pod_name label (no io_ prefix) & hence the duplication. If we aligned the Prometheus label with the Docker label that Kubernetes applies then we get rid of the duplication.

@brian-brazil

This comment has been minimized.

Copy link
Member

brian-brazil commented Jul 20, 2016

Prometheus discovery uses kubernetes_pod_name label (no io_ prefix) & hence the duplication.

That must be coming from your configuration of Prometheus, as Prometheus does no mappings out of the box and I don't see it in the example either.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jul 20, 2016

@beorn7 That is interesting - when I read:

CAUTION: Remember that every unique key-value label pair represents a new time series, which can dramatically increase the amount of data stored.

I assumed that reducing label duplication would be a good thing, but from what you've said it doesn't matter too much?

Yeah, the wording above is not entirely correct. It should read “... every unique combination of key-value label pairs …”. Label duplication doesn't increase the cardinality of unique label pairs.

Note that this is tangential to the semantics discussed in this issue. As Brian said, the fact that the storage deals well with it should not keep us from sane semantics.

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 20, 2016

@brian-brazil Doh! Yes that would be it... Sorry for the noise...

@jimmidyson

This comment has been minimized.

Copy link
Member Author

jimmidyson commented Jul 20, 2016

Actually this is a problem in Kubernetes - a relic from before it set Docker labels properly when creating containers. Fixing there.

@lock

This comment has been minimized.

Copy link

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.