Kubernetes SD: Unrelated new pod recycling old IP shows up as old pod of configured job #2266

Closed
beorn7 opened this Issue Dec 8, 2016 · 6 comments

beorn7 commented Dec 8, 2016

What did you do?

We are running a large-scale K8s cluster with many Prometheus servers, each monitoring individual jobs.

What did you expect to see?

If a pod goes away and another pod with the same IP address but completely different labels gets created, Prometheus SD should show the new labels, not the old ones.

What did you see instead? Under which circumstances?

What likely happened: a pod of job A was removed. Shortly after, a pod for job B was created that happened to get the same IP address as the just-removed pod. On the Prometheus server for job B, all was fine. On the Prometheus server for job A, however, the old pod stayed around with exactly the same __meta_kubernetes_… labels as before. As a result, the Prometheus for job A kept scraping that pod as if it belonged to job A, although the pod was obviously exposing metrics for job B. That created a huge confusion in the resulting metrics.

A restart of the job-A Prometheus fixed the issue.

This is either a problem with the caching in the K8s client code within Prometheus, or a problem with the way Prometheus detects that the labels for a pod IP have changed.
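For illustration only, a minimal Go sketch of the second suspicion, under the purely hypothetical assumption that discovered pod targets are cached in a map keyed by pod IP and never overwritten for an IP that is already known; none of the names below exist in the Prometheus code base:

// Hypothetical sketch, not actual Prometheus code: if discovered targets were
// kept in a map keyed by pod IP and updates for an already-known IP were
// skipped, a recycled IP would keep its stale labels forever.
package main

import "fmt"

type podTarget struct {
    labels map[string]string // e.g. __meta_kubernetes_pod_label_system
}

type targetCache struct {
    byIP map[string]podTarget
}

// buggyUpsert only inserts unknown IPs; label changes for a recycled IP are lost.
func (c *targetCache) buggyUpsert(ip string, t podTarget) {
    if _, ok := c.byIP[ip]; ok {
        return
    }
    c.byIP[ip] = t
}

// correctUpsert always overwrites, so a recycled IP picks up the new pod's labels.
func (c *targetCache) correctUpsert(ip string, t podTarget) {
    c.byIP[ip] = t
}

func main() {
    c := &targetCache{byIP: map[string]podTarget{}}
    c.buggyUpsert("10.2.3.4", podTarget{labels: map[string]string{"__meta_kubernetes_pod_label_system": "api-mobile"}})
    // The pod is deleted and a pod of a different job reuses the IP:
    c.buggyUpsert("10.2.3.4", podTarget{labels: map[string]string{"__meta_kubernetes_pod_label_system": "something-else"}})
    fmt.Println(c.byIP["10.2.3.4"].labels) // still reports api-mobile
}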

Environment

  • System information:

Linux 4.4.10+soundcloud x86_64

  • Prometheus version:

    prometheus, version 1.4.1 (branch: master, revision: 2a89e8733f240d3cd57a6520b52c36ac4744ce12)
    build user:       root@e685d23d8809
    build date:       20161128-09:59:22
    go version:       go1.7.3

  • Prometheus configuration file:

Relevant portions:

scrape_configs:
- job_name: api-mobile-k2
  kubernetes_sd_configs:
  - api_server: https://api.k2.k8s.s-cloud.net
    role: pod
    tls_config:
      ca_file: /srv/prometheus/k8s-certificates/k2/ca.crt
      cert_file: /srv/prometheus/k8s-certificates/k2/client.crt
      key_file: /srv/prometheus/k8s-certificates/k2/client.key
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    regex: .+;[0-9]+|;
  - action: keep
    source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_pod_label_system
    regex: .+;api-mobile|;

Many more relabel configs follow, but the important one is here: if the pod is not labeled as api-mobile, it is dropped. The new pod recycling the IP address is labeled differently, but the api-mobile Prometheus sees it with the old pod's labels, as can be seen in the tooltip on the targets list.
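As a side note on how that keep rule evaluates, relying only on documented relabeling behaviour (the source_labels values are joined with ";" and the regex is fully anchored), here is a small self-contained Go check; the sample namespace and label values are made up:

// The keep rule joins __meta_kubernetes_namespace and
// __meta_kubernetes_pod_label_system with ";" and keeps the target only if
// the anchored regex .+;api-mobile|; matches the joined string.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`^(?:.+;api-mobile|;)$`)

    cases := []string{
        "some-namespace;api-mobile", // pod labeled system=api-mobile -> kept
        "some-namespace;other-job",  // the pod that recycled the IP -> should be dropped
        ";",                         // both source labels empty -> kept
    }
    for _, joined := range cases {
        fmt.Printf("%-30q keep=%v\n", joined, re.MatchString(joined))
    }
}

With the stale SD labels described above, the joined string for the new pod still looks like the first case, which is why the keep rule never gets a chance to drop it.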

  • Logs:

Only checkpointing messages in the logs, no error messages at all.

beorn7 commented Dec 8, 2016

@fabxc Assigning to you as you probably know best what might be happening here. Something with the way targets are hashed/identified, so that if the new pod is seen as the old one, the labels are never updated?
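Purely as an illustration of that question (this is not the actual Prometheus implementation, and the function name is made up): if targets are identified by a hash over their full label set, then as long as discovery keeps handing out the stale labels for the recycled IP, the hash never changes and nothing ever triggers an update.

// Illustrative target identity by label-set hash.
package main

import (
    "fmt"
    "hash/fnv"
    "sort"
)

func targetHash(labels map[string]string) uint64 {
    keys := make([]string, 0, len(labels))
    for k := range labels {
        keys = append(keys, k)
    }
    sort.Strings(keys)

    h := fnv.New64a()
    for _, k := range keys {
        h.Write([]byte(k))
        h.Write([]byte{0xff}) // separator to avoid ambiguous concatenations
        h.Write([]byte(labels[k]))
        h.Write([]byte{0xff})
    }
    return h.Sum64()
}

func main() {
    stale := map[string]string{"__address__": "10.2.3.4:8080", "system": "api-mobile"}
    fresh := map[string]string{"__address__": "10.2.3.4:8080", "system": "other-job"}

    fmt.Println(targetHash(stale) == targetHash(stale)) // true: identical labels, no update seen
    fmt.Println(targetHash(stale) == targetHash(fresh)) // false: a label change would be noticed
}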

beorn7 commented Dec 8, 2016

#2262 smells like a related issue, for a completely different SD mechanism.

beorn7 commented Dec 8, 2016

This has happened at least twice. It is kind of worrying because it can go unnoticed, so we might have severe underreporting. (In the two detected cases, we had a per-endpoint alert, and that one pod from a completely different service happened to have a similarly named metric that tickled the alert.)

beorn7 added a commit that referenced this issue Jan 6, 2017

Retrieval: Avoid copying Target
retrieval.Target contains a mutex. It was copied in the Targets()
call. This potentially can wreak a lot of havoc.

It might even have caused the issues reported as #2266 and #2262.
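For background, a minimal sketch of why such a copy is dangerous; the type below is made up and is not the real retrieval.Target:

// Copying a struct that contains a sync.Mutex gives every copy its own lock,
// while reference fields such as maps stay shared, so the lock no longer
// protects anything consistently.
package main

import (
    "fmt"
    "sync"
)

type target struct {
    mtx    sync.Mutex
    labels map[string]string
}

func main() {
    orig := &target{labels: map[string]string{"job": "api-mobile"}}

    cp := *orig // copies the mutex value; `go vet` (copylocks) warns about copies like this

    cp.mtx.Lock()   // lock the copy ...
    orig.mtx.Lock() // ... and the original still locks immediately: the mutexes are independent

    cp.labels["owner"] = "copy"       // but the labels map is shared between both structs
    fmt.Println(orig.labels["owner"]) // prints "copy"

    orig.mtx.Unlock()
    cp.mtx.Unlock()
}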

beorn7 assigned beorn7 and unassigned fabxc Feb 16, 2017

beorn7 commented Feb 16, 2017

I'm investigating whether #2323 fixed it (by detecting re-used pod IPs in SoundCloud production and checking whether any Prometheus server still monitors them under their old identity).
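Not part of the issue, but a sketch of the kind of check described here, assuming one can periodically snapshot the pod-IP-to-pod-name mapping (how that snapshot is obtained is left out): an IP whose pod name changes between snapshots has been re-used and can then be cross-checked against every Prometheus server's targets.

// Compare successive snapshots of the pod-IP -> pod-name mapping and report
// IPs that now belong to a different pod.
package main

import "fmt"

func reusedIPs(prev, curr map[string]string) map[string][2]string {
    reused := map[string][2]string{}
    for ip, oldPod := range prev {
        if newPod, ok := curr[ip]; ok && newPod != oldPod {
            reused[ip] = [2]string{oldPod, newPod}
        }
    }
    return reused
}

func main() {
    prev := map[string]string{"10.2.3.4": "api-mobile-1234", "10.2.3.5": "search-42"}
    curr := map[string]string{"10.2.3.4": "other-job-9876", "10.2.3.5": "search-42"}
    for ip, pods := range reusedIPs(prev, curr) {
        fmt.Printf("IP %s moved from pod %s to pod %s\n", ip, pods[0], pods[1])
    }
}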

beorn7 commented Feb 16, 2017

Investigated a number of cases, and they all look good. Closing until proven otherwise.

beorn7 closed this Feb 16, 2017

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 24, 2019
