Old Kubernetes SD endpoints are still "discovered" and scraped despite no longer existing #10257
Comments
What is the Kubernetes version involved here?
I am not completely convinced this is strictly caused by Kubernetes discovery. Target groups for the endpoints role are always generated from scratch, and if Prometheus were getting stale reads from the informer, it wouldn't see the new pods at all. It is also strange that there's only one stale pod in the target group while new deletions are properly picked up. Could the problem be somewhere in the scrape pool instead?
@jutley if you still have the faulty Kubernetes endpoint, you can try checking if there are any
We're tracking similar symptoms as well in multiple cases (https://bugzilla.redhat.com/show_bug.cgi?id=1943860).
We've now discovered this on our PolarSignals GKE cluster too. In the meantime I've deleted the old Pods with:
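Roughly along these lines (the pod name and namespace are placeholders, not the actual ones from that cluster):

```sh
# Delete the stale Pod so it disappears from the Endpoints object
# (pod name and namespace are placeholders)
kubectl delete pod <stale-pod-name> -n <namespace>
```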
Alright, I looked around a bit since the issue started showing up over the weekend again. Looking at the Service of one of our Deployments I could see too many endpoints, and looking at the Endpoints object itself, I can see exactly that one Pod still showing up under
This was visible on
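For anyone who wants to check the same thing, the ready and not-ready addresses of the Endpoints object backing a Service can be inspected directly; the Service name and namespace here are placeholders:

```sh
# Show the addresses (including notReadyAddresses) of the Endpoints
# object backing a Service; name and namespace are placeholders
kubectl get endpoints <service-name> -n <namespace> -o yaml
```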
Does it seem as if Prometheus isn't properly filtering the
To me it looks like this comes from prometheus/discovery/kubernetes/endpoints.go, lines 304 to 306 in e239e3e.
Not sure if this is intended or not, though. A solution could be to exclude pods in the
In the service discovery I can see the

On my personal Scaleway cluster these unready Pods are cleaned up quite quickly by the API server and won't stay in the namespace for long, whereas it seems that GKE, for example, keeps the

Knowing that, I'm inclined to exclude all non-ready Pods in the Prometheus SD in our environment.
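A minimal sketch of how that exclusion could look in a plain Prometheus scrape config, using the `__meta_kubernetes_endpoint_ready` label exposed by the endpoints role (the job name and discovery settings are placeholders; with the prometheus-operator the same rule would end up in the generated relabel configuration):

```yaml
# Drop endpoint targets whose address is listed as not ready
# (job name and discovery settings are placeholders)
- job_name: example-endpoints
  kubernetes_sd_configs:
    - role: endpoints
  relabel_configs:
    - source_labels: [__meta_kubernetes_endpoint_ready]
      regex: "false"
      action: drop
```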
Reading kubernetes/kubernetes#99986, your assumption seems correct. GC is configurable with a flag in kube-controller-manager.
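If I'm reading that right, the knob in question is the terminated-pod GC threshold on the controller manager; the value below is just an example, not a recommendation:

```sh
# kube-controller-manager flag controlling how many terminated Pods are
# kept before the garbage collector deletes them (example value only)
kube-controller-manager --terminated-pod-gc-threshold=100
```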
This doesn't seem to be configurable on GKE, so at least there we'll probably have to filter out all unready endpoints 🤷♂️
This happens for us as well, but here
I just want to add another datapoint here; I'm seeing the same issue as the original post (Prometheus 2.32.1, only one in the HA pair showing the issue), and I can confirm it's not unready endpoints (same as @nathan-vp, the stale targets all have

SD seems to keep working though (adding and removing targets), as can be seen in the screenshot, apparently with the stale targets intact:
That is super interesting. Can we see:
Thanks!
I think the next step would be to get a goroutine dump: https://prometheus/debug/pprof/goroutine?debug=2 (you can put the output in e.g. a gist).
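For example, assuming the affected instance is reachable locally on the default port 9090 (pod name and namespace are placeholders):

```sh
# Forward the Prometheus port locally (pod name and namespace are placeholders)
kubectl port-forward pod/<prometheus-pod> 9090:9090 -n <namespace> &

# Fetch the full goroutine dump
curl -s "http://localhost:9090/debug/pprof/goroutine?debug=2" > goroutines.txt
```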
Goroutine dump for prometheus-k8s-1 (the instance that shows problems) here: https://gist.github.com/pharaujo/eb2d4697e8352883ce8c050f8666003c
Do you have an idea of how long this instance has not been updating its targets properly?
As far as I can tell, since the restart roughly 2 days ago (visible in the first screenshot I sent). |
What comes to mind is that we could hit this specific return (prometheus/discovery/kubernetes/endpoints.go, line 194 in 44fcf87) without sending an empty target group for those endpoints. I'll need to dig further. I do wonder if there is an easy way to reproduce this.
Prometheus: v2.32.1. Occasionally running into this as well, so another data point. Most recent occurrence:
Trying to trigger refresh:
FWIW (n=2): occurrences so far have been single-pod targets. Dump: https://gist.github.com/TBeijen/d99b88d0123d43a6e3b8d191bf4e34a6. In the screenshots below, the stale-target situation started ~17:30.
Happened again in another Kubernetes cluster, right after the Prometheus pod was moved to another node. Both pods in the HA pair restarted; only one shows the problem. Only 1 of 57 scrape jobs has a "phantom" target.
Experienced it too. From my tests, to force "forgetting" outdated targets:
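As a general workaround (not necessarily the exact steps used above), assuming Prometheus was started with `--web.enable-lifecycle`, a configuration reload or a restart of the affected instance forces the scrape configuration to be re-applied:

```sh
# Trigger a configuration reload (requires --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Or restart the affected instance (pod name and namespace are placeholders)
kubectl delete pod <prometheus-pod> -n <namespace>
```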
Hi, I have a similar issue. When we migrate from one version of our node exporter to another, we see in the Endpoints (via kubectl get endpoints) that the old version's pod IP is still associated with the new version's port number, and only after some time is the new pod IP updated. Do you have any idea how to resolve this? For clarity:
What did you see instead? Under which circumstances?
We have an alert that fires when a target cannot be scraped. This began firing, and upon inspection, the target did not actually exist. The target was a Kubernetes Pod that had been replaced: it no longer appeared in the Kubernetes apiserver, and its IP address was not in the corresponding Endpoints resources of the relevant Services.
We run redundant, identical Prometheuses, and this only happened in one of them.
To better understand this state, we manually deleted another Pod of the same Deployment. The affected Prometheus successfully removed the old Pod from its targets and added the new Pod. The original false Pod, however, was still in the target list.
Additionally, it's worth noting that this group of Pods is discovered several times. We accidentally had two Services pointing to these Pods' metrics endpoint, and both were discovered via a single ServiceMonitor resource from the prometheus-operator. Only one of these Services shows the problem (four targets, including the false one); the other has the appropriate targets (three targets). We also probe these Pods using the blackbox-exporter, which shows the exact same issue (seven targets, with three pairs of accidental duplicates and one false target).
I have absolutely no idea how to reproduce this.
What did you do?
Nothing. The bug occurred while we were hands off.
What did you expect to see?
When the old Pod was removed and replaced, the Prometheus instance should have updated its targets accordingly by removing the old Pod.
Environment
Here is all the job configuration that targets/probes the relevant Pods/Services.
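(For context only, and explicitly not the reporter's actual configuration: an endpoints-role scrape job generated from a ServiceMonitor typically boils down to something like the sketch below, with all names as placeholders.)

```yaml
# Illustrative endpoints-role scrape job; all names are placeholders and
# this is not the configuration from this report
- job_name: example/servicemonitor/0
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names: [example-namespace]
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      regex: example-service
      action: keep
    - source_labels: [__meta_kubernetes_endpoint_port_name]
      regex: metrics
      action: keep
```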
We did not find any interesting logs, but here are the logs surrounding the start of the bug: