
[Kubernetes] Delete targets for evicted pods #3837

Closed
seleznev opened this Issue Feb 14, 2018 · 7 comments

seleznev commented Feb 14, 2018

Currently Prometheus keeps trying to scrape metrics from evicted pods (which have no IP and are no longer running in the cluster). This can cause false alerts or even scraping metrics from another pod (for example, if you use hostNetwork+hostPort for your deployment and a new pod is created on the same node as the evicted pod).

What did you do?

Steps to reproduce:

  1. Create a deployment with 1 replica and annotations like this:

     annotations:
       prometheus.io/scrape: "true"
       prometheus.io/path: "/metrics"
       prometheus.io/port: "80"

  2. On the /targets page you will see a new target for the created pod.

  3. Do something to trigger eviction of pods from the node (for example, change the --eviction-hard settings for the kubelet).

  4. Verify that the pod is evicted with kubectl. The output will look like:

     $ kubectl get pods -o wide -a
     NAME                        READY     STATUS      RESTARTS   AGE       IP             NODE
     test-app-7ccb8fbb85-8b4rz   0/1       Evicted     0          20h       <none>         node2
     test-app-7ccb8fbb85-bmnrf   1/1       Running     0          18h       10.244.3.6     node2

  5. Check the /targets page. The target for the evicted pod will still be there.

What did you expect to see?

No target for the evicted pod at step 5.

What did you see instead? Under which circumstances?

The evicted pod still has a target on the /targets page (in the down state in most cases).

Environment

  • System information:

    Linux 4.4.0-104-generic x86_64

  • Prometheus version:

prometheus, version 2.1.0 (branch: HEAD, revision: 85f23d82a045d103ea7f3c89a91fba4a93e6367a)
  build user:       root@6e784304d3ff
  build date:       20180119-12:01:23
  go version:       go1.9.2
  • Prometheus configuration file:

    Relevant part:

global:
  scrape_interval: 30s
  scrape_timeout: 30s
  evaluation_interval: 1m
scrape_configs:

  [...]

- job_name: kubernetes-pods
  scrape_interval: 30s
  scrape_timeout: 30s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - api_server: null
    role: pod
    namespaces:
      names: []
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    separator: ;
    regex: "true"
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    separator: ;
    regex: (.+)
    target_label: __metrics_path__
    replacement: $1
    action: replace
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    separator: ;
    regex: (.+):(?:\d+);(\d+)
    target_label: __address__
    replacement: ${1}:${2}
    action: replace
  - separator: ;
    regex: __meta_kubernetes_pod_label_(.+)
    replacement: $1
    action: labelmap
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: kubernetes_namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_pod_name
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_node_name]
    separator: ;
    regex: (.*)
    target_label: kubernetes_node_name
    replacement: $1
    action: replace
  • Logs:
n/a
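As an aside, the __address__ rewrite rule in the config above (the one with regex `(.+):(?:\d+);(\d+)`) can be exercised in isolation. The sketch below uses hypothetical example values, not data from this issue; `rewriteAddress` is an illustrative helper, not a Prometheus function:

```go
package main

import (
	"fmt"
	"regexp"
)

// addrRe is the regex from the relabel rule above. Prometheus joins the
// source labels __address__ and the prometheus.io/port annotation with the
// configured separator ";" before matching.
var addrRe = regexp.MustCompile(`(.+):(?:\d+);(\d+)`)

// rewriteAddress mimics the rule's effect: keep the host from __address__,
// but replace the port with the one from the annotation.
func rewriteAddress(address, annotationPort string) string {
	return addrRe.ReplaceAllString(address+";"+annotationPort, "${1}:${2}")
}

func main() {
	// Hypothetical values: the pod was discovered on port 80, but the
	// annotation asks Prometheus to scrape port 9102 instead.
	fmt.Println(rewriteAddress("10.244.3.6:80", "9102"))
}
```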
seleznev commented Feb 14, 2018

Related sources: https://github.com/prometheus/prometheus/blob/master/discovery/kubernetes/pod.go

How it works (apparently):

  1. On startup, Prometheus gets all pods from the Kubernetes cluster and creates targets. If a pod doesn't have an IP address, it is skipped.

  2. Prometheus subscribes to events about pods being created/updated/deleted in the cluster.

  3. When a pod is evicted, Prometheus receives an update event and tries to update the target based on the new pod information. But the update is skipped because the evicted pod doesn't have an IP address.
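The flow above can be simulated with stand-in types (a sketch only; `pod`, `targetGroup`, and `buildPod` here are simplified stand-ins for the real types in discovery/kubernetes/pod.go, not the actual Prometheus code):

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes pod object and targetgroup.Group.
type pod struct {
	name string
	ip   string
}

type targetGroup struct {
	source  string
	targets []string
}

// buildPod mirrors the pre-patch behaviour: a pod without an IP yields no
// group at all, so the discovery manager never hears about the eviction.
func buildPod(p pod) *targetGroup {
	if p.ip == "" {
		return nil
	}
	return &targetGroup{source: p.name, targets: []string{p.ip + ":80"}}
}

// apply updates the manager's view of targets; a nil group means "no update",
// which is exactly why the stale target survives eviction.
func apply(known map[string][]string, tg *targetGroup) {
	if tg != nil {
		known[tg.source] = tg.targets
	}
}

func main() {
	known := map[string][]string{}
	apply(known, buildPod(pod{name: "test-app", ip: "10.244.3.5"})) // pod starts
	apply(known, buildPod(pod{name: "test-app", ip: ""}))           // update event after eviction
	fmt.Println(len(known["test-app"])) // still 1: the stale target remains
}
```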

I can't reproduce the issue with the following patch (but it doesn't look good):

diff --git a/discovery/kubernetes/pod.go b/discovery/kubernetes/pod.go
index ef8ce6d..cfe9e33 100644
--- a/discovery/kubernetes/pod.go
+++ b/discovery/kubernetes/pod.go
@@ -98,7 +98,14 @@ func (p *Pod) Run(ctx context.Context, ch chan<- []*config.TargetGroup) {
                                p.logger.With("err", err).Errorln("converting to Pod object failed")
                                return
                        }
-                       send(p.buildPod(pod))
+
+                       // During eviction pod can lose their IP.
+                       // In this case we should remove them from targets.
+                       if len(pod.Status.PodIP) == 0 {
+                               send(&config.TargetGroup{Source: podSource(pod)})
+                       } else {
+                               send(p.buildPod(pod))
+                       }
                },
        })
 

Maybe it will be helpful.

brancz commented Feb 14, 2018

I would think there is some sort of condition field in the Pod's status indicating that it has been evicted; that seems like a more appropriate check than the number of IPs (maybe there are conditions other than eviction that make the IP count 0? I wouldn't want to swallow those without knowing what's happening).

In general I think this should be added.
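A sketch of such a status-based check, assuming the usual Kubernetes API convention that evicted pods report phase "Failed" with reason "Evicted" (an assumption here, not something verified in this thread; `isEvicted` and `podStatus` are hypothetical stand-ins, the real check would use apiv1.Pod from the Kubernetes client libraries):

```go
package main

import "fmt"

// podStatus stands in for the relevant fields of a Kubernetes PodStatus.
type podStatus struct {
	Phase  string
	Reason string
	PodIP  string
}

// isEvicted checks the pod's status rather than its IP count, so other
// IP-less states (e.g. a pod that is still Pending) are not swallowed.
func isEvicted(s podStatus) bool {
	return s.Phase == "Failed" && s.Reason == "Evicted"
}

func main() {
	evicted := podStatus{Phase: "Failed", Reason: "Evicted", PodIP: ""}
	pending := podStatus{Phase: "Pending", PodIP: ""} // no IP yet, but not evicted
	fmt.Println(isEvicted(evicted), isEvicted(pending))
}
```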

simonpasquier commented Feb 14, 2018

FWIW buildPod() already discards pods which have an empty IP address (see https://github.com/prometheus/prometheus/blob/master/discovery/kubernetes/pod.go#L175-L177).

A change like this would make sense IMO:

diff --git a/discovery/kubernetes/pod.go b/discovery/kubernetes/pod.go
index d868b115..68f95f62 100644
--- a/discovery/kubernetes/pod.go
+++ b/discovery/kubernetes/pod.go
@@ -170,16 +170,16 @@ func podLabels(pod *apiv1.Pod) model.LabelSet {
 }
 
 func (p *Pod) buildPod(pod *apiv1.Pod) *targetgroup.Group {
-       // During startup the pod may not have an IP yet. This does not even allow
-       // for an up metric, so we skip the target.
-       if len(pod.Status.PodIP) == 0 {
-               return nil
-       }
        tg := &targetgroup.Group{
                Source: podSource(pod),
        }
        tg.Labels = podLabels(pod)
        tg.Labels[namespaceLabel] = lv(pod.Namespace)
+       // During startup the pod may not have an IP yet. This does not even allow
+       // for an up metric, so we skip the target.
+       if len(pod.Status.PodIP) == 0 {
+               return tg
+       }
 
        for _, c := range pod.Spec.Containers {
                // If no ports are defined for the container, create an anonymous
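The effect of this patch can be sketched with the same simplified stand-in types as before (not the real Prometheus types): for an IP-less pod, buildPod now returns a group that still carries its Source but has no targets, which replaces the stale entry instead of leaving it untouched.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes pod object and targetgroup.Group.
type pod struct {
	name string
	ip   string
}

type targetGroup struct {
	source  string
	targets []string
}

// buildPod mirrors the patched behaviour: a pod without an IP still yields a
// group with its Source (and labels), just with no targets.
func buildPod(p pod) *targetGroup {
	tg := &targetGroup{source: p.name}
	if p.ip == "" {
		return tg // empty group: tells the manager to drop stale targets
	}
	tg.targets = []string{p.ip + ":80"}
	return tg
}

func apply(known map[string][]string, tg *targetGroup) {
	if tg != nil {
		known[tg.source] = tg.targets
	}
}

func main() {
	known := map[string][]string{}
	apply(known, buildPod(pod{name: "test-app", ip: "10.244.3.5"})) // pod starts
	apply(known, buildPod(pod{name: "test-app", ip: ""}))           // update event after eviction
	fmt.Println(len(known["test-app"])) // 0: the stale target is cleared
}
```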
brancz commented Feb 14, 2018

Sounds reasonable @simonpasquier 🙂

seleznev commented Feb 15, 2018

I can't reproduce the issue with @simonpasquier's patch (tested with v2.2.0-rc.0) and it looks great. 🎉

brancz commented Feb 15, 2018

Interesting. Then it seems it was one of the general SD bugs that were present in v2.1.0. Sorry for the inconvenience. @simonpasquier's patch is still applicable though I think, so let's keep this open until we merge that.

@fabxc fabxc closed this in #3842 Feb 19, 2018

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019
