RFE: Kubernetes Service Discovery should discover Pods #1445

Closed
pdbogen opened this Issue Mar 1, 2016 · 11 comments

pdbogen (Contributor) commented Mar 1, 2016

Current state: k8s SD discovers Services (and their Endpoints) and Nodes.

My problem: I need to scrape metrics from Pods that are not ready. k8s does not put non-ready pods into a service, so prom's k8s SD module will never discover them.

Solution: Add pods to k8s sd.

One possible issue is that Prometheus would only be able to communicate directly with pods if it is running within the cluster. That's how it is in my case, but this limitation might need to be surfaced somewhere. However, if Prometheus is in-cluster, then it can talk to individual pods using the pod's IP.

p.s., I chatted about this via IRC and was asked to tag @jimmidyson on this issue.
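
For illustration, a minimal sketch of what a pod-level scrape job could look like in prometheus.yml. The role: pod value and the job name are assumptions - pod discovery did not exist in the Kubernetes SD when this issue was filed:

```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'   # hypothetical job name
    kubernetes_sd_configs:
      - role: pod                 # assumed pod-level discovery role
    # In-cluster deployment assumed, so pod IPs are directly reachable.
```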

matthiasr (Contributor) commented Mar 2, 2016

Since it eventually scrapes the endpoints, Prometheus needs to talk to the pod IPs already. I think this is a worthwhile change; the only question I have is whether it can be made non-breaking.

jimmidyson (Member) commented Mar 2, 2016

Really happy to have other people's requirements now! When I put this together it was quite opinionated (my opinions, my use cases!) so thanks for the input.

Prometheus talks to node & pod IPs directly if you want to scrape service endpoints. That means that right now you have to deploy Prometheus in-cluster if you want to scrape pods. One way of scraping pods from outside the cluster is to go through the API server proxy, but that puts extra load on the API server that you don't necessarily want. You could also use a custom proxy service fronted by an ingress to do the same thing, but again that means routing all traffic through a couple of places, which might cause issues in your cluster depending on scale.
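
A rough sketch of that API-server-proxy route, assuming pod-level discovery and the __meta_kubernetes_* labels shown; the API server address and the metrics port are placeholders:

```yaml
scrape_configs:
  - job_name: 'pods-via-apiserver-proxy'   # hypothetical job
    scheme: https
    # (authentication/TLS options against the API server omitted for brevity)
    kubernetes_sd_configs:
      - role: pod                           # assumed pod-level discovery
    relabel_configs:
      # Point every scrape at the API server instead of the pod IP...
      - target_label: __address__
        replacement: apiserver.example.com:6443   # placeholder API server address
      # ...and rewrite the metrics path to the pod proxy subresource.
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_pod_name]
        regex: '(.+);(.+)'
        target_label: __metrics_path__
        replacement: '/api/v1/namespaces/$1/pods/$2:9102/proxy/metrics'   # port 9102 is a placeholder
```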

The current design for k8s discovery was to cover the 80-90% of use cases: scrape pods that have started up correctly (in ready state). Scraping during pod initialization (before ready state) is not that common in my experience, as you're more concerned with the endpoints that are backing a service (though, as I said, this is not always the case). The ready state also indicates when a pod is ready to receive requests: scraping a pod that hasn't properly started up & e.g. hasn't opened its Prometheus exporter socket would result in scrape failures, which isn't terrible but needs to be managed.

Adding watches on individual pods (annotated similarly to what is currently done at the service level) would complete any missing functionality, but needs to be done carefully. Watching pods would require careful filtering of events, otherwise you could be continually updating target groups. We would also need to ensure there is no duplication between service endpoint pods & individual pods: it's very possible to have overlap & hence duplicate scraping/metric ingestion.
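
As a sketch of that annotation-driven filtering at the pod level, mirroring the existing service-level convention - the annotation names and __meta_kubernetes_pod_* labels are assumptions:

```yaml
relabel_configs:
  # Only keep pods that opt in via an annotation, e.g. prometheus.io/scrape: "true".
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    regex: "true"
    action: keep
  # Let an annotation (e.g. prometheus.io/port) override the scrape port.
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    regex: '([^:]+)(?::\d+)?;(\d+)'
    target_label: __address__
    replacement: '$1:$2'
```

Keeping endpoint-backed pods and stand-alone pods in separate jobs, each filtered this way, would be one way to avoid the duplicate-scraping overlap.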

As for backwards compatibility, there is a warning about that in the docs - k8s discovery is still in beta & hence subject to change. While I wouldn't want to break it unnecessarily, if use cases demand change then I think we're OK to do that, with migration steps if possible (but that would be the call of the project maintainers, not me).

matthiasr (Contributor) commented Mar 2, 2016

Sounds good. FWIW, whether or not you can reach pod IPs from outside the cluster depends on your network setup. Only service IPs are not really reachable from outside (they require kube-proxy).

I think the scrape-before-ready use case is very valid – you may have applications that take their time getting ready, but still provide insight before they are. Also, when an instance becomes unhealthy / not ready, its metrics are still relevant. We even actively bypass all service discovery health checking (in non-Kubernetes SD) because of this, so I think I would move to discovering pods as soon as possible.

jimmidyson (Member) commented Mar 2, 2016

We can document the requirement & assume pod IPs are reachable from wherever Prometheus is deployed. That will make things simpler.

I'm happy to help review any implementation of this but unfortunately don't have time right now to do it myself. I'd be happy to switch entirely to pod-only discovery, as long as we have a way to correlate back to a service for aggregation.

brian-brazil (Member) commented Mar 2, 2016

I think the scrape-before-ready use case is very valid – you may have applications that take their time getting ready, but still provide insight before they are.

We should default to scraping this way. Scraping only when healthy misses a lot of useful information.

jimmidyson (Member) commented Mar 2, 2016

Agreed. Kubernetes behaviour is such that there's not really the idea of health checks taking pods offline & then putting them back in service again: "shoot it, start a new one" is the strategy.

Readiness checks are only performed until the pods have started properly (fulfil the readiness check). I take @matthiasr's point on the usefulness of metrics before the readiness probe passes though.

Health checks in Kubernetes are called liveness probes - if a liveness probe fails the pod is killed & a new one is started up.
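
To make the distinction concrete, a minimal pod spec sketch with both probe types (the image, paths, and port are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: app
      image: example/app:1.0    # placeholder image
      readinessProbe:           # failing: pod is removed from service endpoints but keeps running
        httpGet: {path: /ready, port: 8080}
        periodSeconds: 10
      livenessProbe:            # failing: kubelet kills the container and starts a new one
        httpGet: {path: /live, port: 8080}
        periodSeconds: 10
```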

pdbogen (Contributor, Author) commented Mar 2, 2016

Readiness checks are only performed until the pods have started properly (fulfil the readiness check).

This is not entirely accurate; readiness checks are performed continuously and can move pods in and out of the "ready" state.

jimmidyson (Member) commented Mar 2, 2016

It's not just "not entirely accurate", it's totally wrong... oops! Sorry about that - it used to work like that afaik, or maybe I've just always misunderstood it - thanks for enlightening me!

So I agree that not scraping unhealthy pods is bad. This discovery definitely needs updating to use pods rather than endpoints.

pdbogen (Contributor, Author) commented Mar 2, 2016

I was aiming to be gentle. :)

I'm working on implementing pod discovery right now - for the moment, not in place of endpoints but in addition to them. I can personally see the benefit of being able to discover either way, so I'll leave it up to you guys to strip out the endpoint discovery code or leave it in. I think as long as users judiciously filter on the discovered artifact type, it shouldn't be a big deal either way.
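
For example, a user-side relabel rule of that kind might keep only pod-type targets in a given job; the exact label exposing the discovered artifact type is an assumption here:

```yaml
relabel_configs:
  # Keep only targets that the discovery marked as pods (label name assumed).
  - source_labels: [__meta_kubernetes_role]
    regex: pod
    action: keep
```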

pdbogen (Contributor, Author) commented Mar 2, 2016

As GitHub noted, I've opened #1449 as a PR for this. It works in my testing, and I've written as many unit tests as there were for the existing Kubernetes code - zero.

I made the decision to have every (PodIP, Port) pair as a target twice - one with http and one without. My expectation is that the user will make extensive use of relabeling to pare this down to only the desired and correct targets. I'll set about determining whether this actually solves my problem now, and possibly update the PR...
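
Assuming the two variants of each (PodIP, Port) pair differ in their __scheme__ label (one reading of "one with http and one without"), a user could pare them down with a relabel rule such as:

```yaml
relabel_configs:
  # Drop the non-plain-HTTP duplicate of each target; adjust as needed.
  - source_labels: [__scheme__]
    regex: https
    action: drop
```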

fabxc added this to the v1.0.0 milestone Apr 25, 2016

fabxc closed this in #1449 May 20, 2016

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 24, 2019
