
Old Kubernetes SD endpoints are still "discovered" and scraped despite no longer existing #10257

Open
jutley opened this issue Feb 3, 2022 · 24 comments


@jutley

jutley commented Feb 3, 2022

What did you see instead? Under which circumstances?

We have an alert that fires when a target cannot be scraped. It began firing, and upon inspection, the target did not actually exist. The target was for a Kubernetes Pod that had been replaced. It no longer appeared in the Kubernetes apiserver, and its IP address was not in the relevant Services' corresponding Endpoints resources.

We run redundant, identical Prometheus instances, and this only happened in one of them.

To better understand this state, we manually deleted another Pod of the same Deployment. The affected Prometheus successfully removed the old Pod from its targets and added the new Pod. The original false Pod, however, was still in the target list.

Additionally, it's worth noting that this group of Pods is discovered a few times. We accidentally had two Services pointing to these Pods' metrics endpoint, and both were discovered via a single ServiceMonitor resource from the prometheus-operator. Only one of these Services shows the problem (four targets, including the false one). The other has the appropriate targets (three targets). We also probe these Pods using the blackbox-exporter, which shows the exact same issues (seven targets, with three pairs of accidental duplicates, and one false target).

I have absolutely no idea how to reproduce this.

What did you do?

Nothing. The bug occurred while we were hands off.

What did you expect to see?

When the old Pod was removed and replaced, the Prometheus instance should have updated its targets accordingly by removing the old Pod.

Environment

  • System information:
Linux 5.4.155-flatcar x86_64
  • Prometheus version:
prometheus, version 2.32.1 (branch: HEAD, revision: 41f1a8125e664985dd30674e5bdf6b683eff5d32)
  build user:       root@54b6dbd48b97
  build date:       20211217-22:08:06
  go version:       go1.17.5
  platform:         linux/amd64
  • Prometheus configuration file:

Here is all the job configuration that targets/probes the relevant Pods/Services.

- job_name: serviceMonitor/kube-dns/core-dns/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kube-dns
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: metrics
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    namespaces:
      names:
      - kube-dns
- job_name: serviceMonitor/kube-dns/dns-server-health/0
  honor_timestamps: true
  params:
    module:
    - dns_kubernetes_svc
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  follow_redirects: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kube-dns
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: dns
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: dns
    action: replace
  - source_labels: [__meta_kubernetes_pod_ip]
    separator: ;
    regex: (.*)
    target_label: __param_target
    replacement: $1
    action: replace
  - source_labels: [__param_target]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: blackbox-exporter.prometheus:9115
    action: replace
  - separator: ;
    regex: (.*)
    target_label: job
    replacement: dns-kubernetes-svc-blackbox
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    namespaces:
      names:
      - kube-dns
- job_name: serviceMonitor/kube-dns/dns-server-health/1
  honor_timestamps: true
  params:
    module:
    - dns_route53_record
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  follow_redirects: true
  relabel_configs:
  - source_labels: [job]
    separator: ;
    regex: (.*)
    target_label: __tmp_prometheus_job_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_label_k8s_app]
    separator: ;
    regex: kube-dns
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: dns
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_container_name]
    separator: ;
    regex: (.*)
    target_label: container
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: dns
    action: replace
  - source_labels: [__meta_kubernetes_pod_ip]
    separator: ;
    regex: (.*)
    target_label: __param_target
    replacement: $1
    action: replace
  - source_labels: [__param_target]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: __address__
    replacement: blackbox-exporter.prometheus:9115
    action: replace
  - separator: ;
    regex: (.*)
    target_label: job
    replacement: dns-route53-record-blackbox
    action: replace
  - source_labels: [__address__]
    separator: ;
    regex: (.*)
    modulus: 1
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep
  kubernetes_sd_configs:
  - role: endpoints
    kubeconfig_file: ""
    follow_redirects: true
    namespaces:
      names:
      - kube-dns
  • Logs:

We did not find any interesting logs, but here are the logs surrounding the start of the bug:

ts=2022-02-03T09:00:09.464Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1643868000025 maxt=1643875200000 ulid=01FTZCZPTXY02H4J1RRGFC133J duration=9.242232921s
ts=2022-02-03T09:00:09.935Z caller=head.go:812 level=info component=tsdb msg="Head GC completed" duration=468.087779ms
ts=2022-02-03T09:00:09.999Z caller=checkpoint.go:98 level=info component=tsdb msg="Creating checkpoint" from_segment=747 to_segment=749 mint=1643875200000
ts=2022-02-03T09:00:17.901Z caller=head.go:981 level=info component=tsdb msg="WAL checkpoint complete" first=747 last=749 duration=7.902867284s
ts=2022-02-03T11:00:09.509Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1643875200145 maxt=1643882400000 ulid=01FTZKVE2Y697HQ7XP60J31F04 duration=9.286627633s
ts=2022-02-03T11:00:10.009Z caller=head.go:812 level=info component=tsdb msg="Head GC completed" duration=497.591114ms
ts=2022-02-03T11:00:10.062Z caller=checkpoint.go:98 level=info component=tsdb msg="Creating checkpoint" from_segment=750 to_segment=752 mint=1643882400000
ts=2022-02-03T11:00:18.421Z caller=head.go:981 level=info component=tsdb msg="WAL checkpoint complete" first=750 last=752 duration=8.359750705s
ts=2022-02-03T11:00:48.763Z caller=compact.go:459 level=info component=tsdb msg="compact blocks" count=3 mint=1643846400080 maxt=1643868000000 ulid=01FTZKVZVPMF7D3E9543S6MH6H sources="[01FTYRCH2Z443XHBNB2FF41DNN 01FTYZ88AYXA5K7VAPCMXBT7BB 01FTZ63ZJYAK50QS1GHT7DCXJE]" duration=30.340985514s
ts=2022-02-03T11:00:48.782Z caller=db.go:1279 level=info component=tsdb msg="Deleting obsolete block" block=01FTYRCH2Z443XHBNB2FF41DNN
ts=2022-02-03T11:00:48.797Z caller=db.go:1279 level=info component=tsdb msg="Deleting obsolete block" block=01FTYZ88AYXA5K7VAPCMXBT7BB
ts=2022-02-03T11:00:48.814Z caller=db.go:1279 level=info component=tsdb msg="Deleting obsolete block" block=01FTZ63ZJYAK50QS1GHT7DCXJE
ts=2022-02-03T13:00:14.248Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1643882400041 maxt=1643889600000 ulid=01FTZTQ5AYKX9TKQJ0STSR6VG1 duration=14.026654896s
ts=2022-02-03T13:00:14.737Z caller=head.go:812 level=info component=tsdb msg="Head GC completed" duration=485.328335ms
ts=2022-02-03T13:00:14.806Z caller=checkpoint.go:98 level=info component=tsdb msg="Creating checkpoint" from_segment=753 to_segment=755 mint=1643889600000
ts=2022-02-03T13:00:19.997Z caller=head.go:981 level=info component=tsdb msg="WAL checkpoint complete" first=753 last=755 duration=5.190818906s
ts=2022-02-03T15:00:15.284Z caller=compact.go:518 level=info component=tsdb msg="write block" mint=1643889600052 maxt=1643896800000 ulid=01FV01JWJYNX4MDBA0Q4RFZ5YR duration=15.061840545s
ts=2022-02-03T15:00:15.935Z caller=head.go:812 level=info component=tsdb msg="Head GC completed" duration=647.428299ms
ts=2022-02-03T15:00:15.998Z caller=checkpoint.go:98 level=info component=tsdb msg="Creating checkpoint" from_segment=756 to_segment=758 mint=1643896800000
ts=2022-02-03T15:00:21.498Z caller=head.go:981 level=info component=tsdb msg="WAL checkpoint complete" first=756 last=758 duration=5.499604249s
@brancz
Member

brancz commented Feb 6, 2022

What is the Kubernetes version involved here?

cc @simonpasquier @fpetkovski @philipgough

@jutley
Author

jutley commented Feb 7, 2022

Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

@fpetkovski
Contributor

I am not completely convinced this is strictly caused by kubernetes discovery. Target groups for the endpoints role are always generated from scratch, and if Prometheus is getting stale reads from the informer, it wouldn't see the new pods at all. It is also strange that there's only one stale pod in the target group while new deletions are properly picked up.

Could the problem be somewhere in the scraping pool instead?

@fpetkovski
Contributor

fpetkovski commented Feb 10, 2022

@jutley if you still have the faulty Kubernetes Endpoints object, you can try checking whether there are any notReadyAddresses in the subsets field. These IPs are not shown in the regular kubectl get endpoints output (you will need to add -o yaml), but Prometheus will still pick them up and try to scrape them.
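
For example, something like this one-liner should print them directly (a sketch with placeholder names, assuming kubectl access to the namespace):

kubectl get endpoints <endpoint-name> -n <namespace> -o jsonpath='{.subsets[*].notReadyAddresses[*].ip}'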

@jan--f
Contributor

jan--f commented Feb 10, 2022

We're tracking similar symptoms as well in multiple cases (https://bugzilla.redhat.com/show_bug.cgi?id=1943860).
I agree with @fpetkovski, the k8s SD and Informer side looks unsuspicious.
The cases we have seen seem to coincide with either heavy load on the apiserver (i.e. request throttling is in effect) or a temporary outage/churn in the apiserver (due to, say, a node shutting down). Sometimes both of these factors coincide with the bug reported here.
The bugzilla link also further links to other bugzillas and an open issue in kube-state-metrics: kubernetes/kube-state-metrics#1569

@metalmatze
Member

We've now discovered this on our PolarSignals GKE cluster too.
I'll take a closer look at it next week.

In the meantime I've deleted the old Pods with:

kubectl delete pod -n observability --field-selector=status.phase==Succeeded
kubectl delete pod -n observability --field-selector=status.phase==Failed

@metalmatze
Member

Alright, I looked around a bit since the issue started showing up again over the weekend.
It seems like Prometheus itself is doing just fine. I don't see any errors, and the Prometheus Kubernetes SD looks fine too.

[Screenshots of the Kubernetes SD status for prom1, prom2, and prom3]

Looking at the Service of one of our Deployments, I could see too many endpoints, and looking at the Endpoints object itself, I can see exactly one old Pod still showing up under notReadyAddresses:

apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2022-02-21T10:11:17Z"
  creationTimestamp: "2021-01-04T13:41:25Z"
  labels:
    api.polarsignals.com/groupcache: profile-sharing
    app.kubernetes.io/component: api
    app.kubernetes.io/instance: api
    app.kubernetes.io/name: polarsignals-api
    app.kubernetes.io/version: 514d7d2c4818546255f2ba10beba7c4e348c27264c580116a87a936f0afee08
  name: api
  namespace: api
  resourceVersion: "289624038"
  uid: 918bdc6c-5d22-4ee9-9ce9-f280db4865b2
subsets:
- addresses:
  - ip: 10.0.0.47
    nodeName: gke-europe-west3-0-e2-medium-b3fa18ca-hg90
    targetRef:
      kind: Pod
      name: api-86d654d6b8-zkpr2
      namespace: api
      resourceVersion: "289623730"
      uid: fadfd43e-1648-45f8-bcb7-accd58f0780e
  - ip: 10.0.0.48
    nodeName: gke-europe-west3-0-e2-medium-b3fa18ca-hg90
    targetRef:
      kind: Pod
      name: api-86d654d6b8-gg7lf
      namespace: api
      resourceVersion: "289624035"
      uid: 83ad9041-0ea9-4d03-965a-2eabe05f9a33
  - ip: 10.0.1.48
    nodeName: gke-europe-west3-0-e2-medium-716d4dc9-l67q
    targetRef:
      kind: Pod
      name: api-86d654d6b8-6xsc8
      namespace: api
      resourceVersion: "289623693"
      uid: f5d3c0b7-6626-4d09-b1b1-7481da3f01d5
  - ip: 10.0.2.65
    nodeName: gke-europe-west3-0-e2-medium-54ff9fe7-pz7j
    targetRef:
      kind: Pod
      name: api-86d654d6b8-jdbr9
      namespace: api
      resourceVersion: "289623942"
      uid: 57144d01-39d5-4d0c-ab60-d7c47518f8e1
  - ip: 10.0.3.13
    nodeName: gke-europe-west3-0-preemptible-e2-hig-e392ee80-6twh
    targetRef:
      kind: Pod
      name: api-86d654d6b8-bp5gj
      namespace: api
      resourceVersion: "289623662"
      uid: 5a47f714-84d2-4aba-a7e9-21337f349c32
  - ip: 10.0.3.15
    nodeName: gke-europe-west3-0-preemptible-e2-hig-e392ee80-6twh
    targetRef:
      kind: Pod
      name: api-86d654d6b8-rmwfr
      namespace: api
      resourceVersion: "289623907"
      uid: 9e2e9920-a6f5-4099-86af-9fb307ffc704
  - ip: 10.0.4.46
    nodeName: gke-europe-west3-0-preemptible-e2-hig-17a4163d-rwsq
    targetRef:
      kind: Pod
      name: api-86d654d6b8-dxl9b
      namespace: api
      resourceVersion: "289623861"
      uid: 186c285e-1c50-4f4a-a432-6aba5fccd494
  - ip: 10.0.4.47
    nodeName: gke-europe-west3-0-preemptible-e2-hig-17a4163d-rwsq
    targetRef:
      kind: Pod
      name: api-86d654d6b8-bbhq7
      namespace: api
      resourceVersion: "289623886"
      uid: af3aa13e-7a2c-41d9-a096-e3d4a42ed52a
  - ip: 10.0.5.21
    nodeName: gke-europe-west3-0-preemptible-e2-hig-4e3c6d35-5xkh
    targetRef:
      kind: Pod
      name: api-86d654d6b8-t68c4
      namespace: api
      resourceVersion: "289623769"
      uid: cfbc1849-f472-41e8-93a6-0f94723b7435
  - ip: 10.0.5.22
    nodeName: gke-europe-west3-0-preemptible-e2-hig-4e3c6d35-5xkh
    targetRef:
      kind: Pod
      name: api-86d654d6b8-8lm6q
      namespace: api
      resourceVersion: "289624016"
      uid: 292823ae-ea56-4150-87bb-8717dc414313
  notReadyAddresses:
  - ip: 10.0.0.34
    nodeName: gke-europe-west3-0-e2-medium-b3fa18ca-hg90
    targetRef:
      kind: Pod
      name: api-674477959-pkrf2
      namespace: api
      resourceVersion: "288605097"
      uid: 1c46c3a7-7665-4997-b180-59bf4bbc8115
  ports:
  - name: grpc
    port: 10901
    protocol: TCP
  - name: http
    port: 8080
    protocol: TCP

[Screenshot of the infra.polarsignals.com Prometheus targets page]

This was visible on

Prometheus v2.33.1
Kubernetes: v1.22.4-gke.1501

Does it seem as if Prometheus isn't properly filtering the notReadyAddresses if some condition is met?

@fpetkovski
Contributor

To me it looks like notReadyAddresses are treated the same way as regular addresses, with the exception that the __meta_kubernetes_endpoint_ready label is set to "false":

for _, addr := range ss.NotReadyAddresses {
	add(addr, port, "false")
}

Not sure if this is intended or not though.

A solution could be to exclude Pods in the Succeeded phase from being discovered, but this changes the assumption that all Pods backing an endpoint are expected to be running at all times.
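
As a purely user-side workaround (not the SD change discussed above), a relabel rule along these lines could drop targets whose backing Pod is in a terminal phase, assuming the __meta_kubernetes_pod_phase meta label is attached for endpoints backed by Pods:

- source_labels: [__meta_kubernetes_pod_phase]
  separator: ;
  regex: Succeeded|Failed
  action: drop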

@metalmatze
Member

In the service discovery page I can see __meta_kubernetes_endpoint_ready="false" being set correctly.
Reading through prometheus-operator/prometheus-operator#3965, I'm of the impression that it is a semantic decision about what makes sense for a given environment.

On my personal Scaleway cluster these unready Pods are cleaned up quite quickly by the API server and don't stay in the namespace for long, whereas it seems that GKE, for example, keeps the Status: Failed Pods around so they can be inspected later on (at least that's what I imagine the reasoning to be).

Knowing that, I'm inclined to exclude all non-ready Pods in the Prometheus SD in our environment.
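
For reference, a minimal relabeling sketch for that (assuming the standard __meta_kubernetes_endpoint_ready meta label; with the prometheus-operator this would go into the ServiceMonitor's relabelings):

- source_labels: [__meta_kubernetes_endpoint_ready]
  separator: ;
  regex: "true"
  action: keep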

@philipgough
Contributor

whereas it seems that GKE, for example, keeps the Status: Failed Pods around so they can be inspected later on

Reading kubernetes/kubernetes#99986 your assumption seems correct.

GC is configurable with a flag in kube-controller-manager
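
For reference, the flag in question is kube-controller-manager's --terminated-pod-gc-threshold, which sets how many terminated Pods may exist before the Pod garbage collector starts deleting them (the upstream default is 12500, if I remember correctly), e.g.:

kube-controller-manager --terminated-pod-gc-threshold=100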

@metalmatze
Member

This doesn't seem to be configurable on GKE, so at least there we'll probably have to filter out all unready endpoints 🤷‍♂️
https://issuetracker.google.com/issues/172663707?pli=1

@nathan-vp

In the service discovery I can see the __meta_kubernetes_endpoint_ready="false" being set to false correctly.

This happens for us as well, but in our case __meta_kubernetes_endpoint_ready is incorrectly set to "true" (the Pod is not there anymore).

@pharaujo
Contributor

pharaujo commented Jul 6, 2022

I just want to add another data point here; I'm seeing the same issue as the original post (Prometheus 2.32.1, only one instance in the HA pair showing the issue), and I can confirm it's not unready endpoints (same as @nathan-vp, the stale targets all have __meta_kubernetes_endpoint_ready="true"). Of the 55 scrape jobs in the affected Prometheus instance, 5 are showing the issue. This started happening after we rotated the nodes in the cluster (upgrading EKS from 1.21 to 1.22); both pods had a restart.

SD seems to keep working though (adding and removing targets), as can be seen in the screenshot, apparently with the stale targets intact:
[Screenshot from 2022-07-06 12:17]

@roidelapluie
Member

That is super interesting. Can we see:

  • prometheus_target_scrape_pools_failed_total
  • prometheus_target_scrape_pool_sync_total

Thanks!
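
For example, queries along these lines should do (assuming the scrape_job label is present on the sync metric):

sum by (scrape_job) (prometheus_target_scrape_pool_sync_total)
rate(prometheus_target_scrape_pools_failed_total[5m])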

@pharaujo
Contributor

pharaujo commented Jul 6, 2022

Let me know if you want different aggregations!

[Screenshots from 2022-07-06 14:27]

@roidelapluie
Member

I think the next step would be to get a goroutine dump.

https://prometheus/debug/pprof/goroutine?debug=2 (You can put the output in e.g. a gist).
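
For example (assuming the default port 9090 and no auth proxy in front of Prometheus):

curl -s 'http://<prometheus-host>:9090/debug/pprof/goroutine?debug=2' > goroutines.txt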

@pharaujo
Contributor

pharaujo commented Jul 6, 2022

Goroutine dump for prometheus-k8s-1 (the instance that shows problems) here: https://gist.github.com/pharaujo/eb2d4697e8352883ce8c050f8666003c

@roidelapluie
Member

Do you have an idea of how long this instance has not been updating its targets properly?

@pharaujo
Contributor

pharaujo commented Jul 6, 2022

As far as I can tell, since the restart roughly 2 days ago (visible in the first screenshot I sent).

@roidelapluie
Member

What comes to mind is that we could get to this specific return:

without sending an empty target group for those endpoints. I'll need to dig further. I do wonder if there is an easily reproducible way to trigger this.

@TBeijen

TBeijen commented Aug 1, 2022

Prometheus: v2.32.1
EKS 1.22

Occasionally running into this as well, so another data point.

Most recent occurrence:

  • Started when the Prometheus pod itself was replaced (node replacement)
  • It looks like the target got replaced at that moment as well (in this case the prometheus-operator pod)
  • kubectl get endpoints holds just the new pod, not the old pod (which no longer exists); the old pod is also not in notReadyAddresses via -o yaml
  • The old target is visible in /service-discovery, having __meta_kubernetes_endpoint_ready="true"
  • It is the only target present for that particular node, which is to be expected since the node no longer exists (via __meta_kubernetes_pod_node_name)
  • Persists for ~14h at this moment.

Trying to trigger refresh:

  • Restarting the target pod that incorrectly has its old target in the list: no effect; the 'up' target gets replaced as it should, but the stale target persists.
  • Restarting Prometheus itself: this 'fixes' the problem. The target list now correctly shows the single pod.

Fwiw (n=2): Occurrences so far have been single-pod targets.

Dump: https://gist.github.com/TBeijen/d99b88d0123d43a6e3b8d191bf4e34a6

In the screenshots below, the stale target situation started around 17:30.

[Screenshots from 2022-08-01 07:55]

@pharaujo
Contributor

pharaujo commented Aug 9, 2022

Happened again in another Kubernetes cluster, right after the Prometheus pod was moved to another node. Both pods in the HA pair restarted, but only one shows the problem. Only 1 of 57 scrape jobs has a "phantom" target.

@hervenicol

hervenicol commented Dec 9, 2022

Experienced it too.

From my tests, to force "forgetting" outdated targets:

@Bhargavram3468

Hi,

I have a similar issue. When we migrate from one version of our node-exporter to another, we have observed in kubectl get endpoints that the old version's pod IP is tagged with the new version's port number, and only after some time is the new pod IP updated successfully. Do you have any idea how to resolve this?

For clear understanding:
old pod IP starts with 10.xx.xx.xx:9690
new pod IP starts with 192.XX.XX.XX:9100
but I am getting 10.XX.XX.XX:9100 for some time.
