
Prometheus Alerting on Pods with HTTPS Scheme #4741

Closed
spirrello opened this Issue Oct 15, 2018 · 5 comments

spirrello commented Oct 15, 2018

Bug Report

What did you do?

Applied a scrape job to monitor pods over both HTTP and HTTPS.

What did you expect to see?

To be able to monitor pods with HTTP and HTTPS metrics endpoints without alerts firing while the pods are up.

What did you see instead? Under which circumstances?

All pods that serve their metrics endpoint over HTTPS show as Up, yet they also fire alerts as if they were down.

Environment

  • System information:

uname -srm
Linux 4.4.0-116-generic x86_64

  • Prometheus version:

prometheus, version 2.3.2 (branch: HEAD, revision: 71af5e2)
build user: root@5258e0bd9cc1
build date: 20180712-14:02:52
go version: go1.10.3

  • Prometheus configuration file:

    - job_name: 'kubernetes-pods'

      kubernetes_sd_configs:
      - role: pod
      tls_config:
        insecure_skip_verify: true

      relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (.+)
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
  • Logs:

No logs indicating these pods are down.

Screenshots: scrape-down, scrape-up (images omitted)

simonpasquier commented Oct 16, 2018

You probably have a mismatch between your Kubernetes annotations and the relabeling configuration. Can you share the Targets page?
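
For reference, the relabel rules in the config above look for pod annotations roughly like the following (a minimal sketch; the port and path values are placeholders, not taken from this report):

    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/scheme: "https"
        prometheus.io/path: "/metrics"
        prometheus.io/port: "8443"

With that relabel config, a pod missing the prometheus.io/scheme annotation keeps the default __scheme__ of http, so an HTTPS-only endpoint would fail to scrape.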

It looks as if this is actually a question about usage and not development. In the future, I suggest that you use our user mailing list, which you can also search. If you prefer more interactive help, join our IRC channel, #prometheus on irc.freenode.net. Please be aware that our IRC channel has no logs, is not searchable, and that people might not answer quickly if they are busy or asleep. If in doubt, you should choose the mailing list.

spirrello commented Oct 16, 2018

@simonpasquier Thanks for responding. Do you want a private email with a PDF of the Targets page? I originally tried asking the question on the user mailing list but never heard back from anyone.

Here's my post on the user mailing list:

https://mail.google.com/mail/u/0/#search/prometheus/WhctKJTzzZLrqxbGPwkqScFRGcLTMVQwNrDmkXtGsBsFMXXTPPhNtBFBnfPDbKXsZrdtnbv

simonpasquier commented Oct 16, 2018

spirrello commented Oct 24, 2018

Looks like the root cause is how Prometheus reports the up metric for these pods. I tested this by querying for pods with "up == 0". This does seem like a bug; I haven't dug into the code to confirm the theory, but from a user's perspective it isn't working the way one would expect.
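
The check was along these lines (the job label comes from the scrape config above; exact matchers are illustrative):

    # targets from the pod scrape job that Prometheus currently reports as down
    up{job="kubernetes-pods"} == 0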

For the time being, I've refactored my alerting to use this instead and so far so good:

expr: kube_deployment_status_replicas_unavailable > 0
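
For context, a minimal sketch of a full alerting rule around that expression might look like this (the alert name, for duration, and annotations are placeholders, not from this report; the metric itself comes from kube-state-metrics):

    groups:
    - name: deployment-availability
      rules:
      - alert: DeploymentReplicasUnavailable
        expr: kube_deployment_status_replicas_unavailable > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: 'Deployment {{ $labels.deployment }} has unavailable replicas'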

simonpasquier commented Nov 9, 2018

Closing for now as you've found an alternative approach.
