
Alerting time series misses external label #2271

Closed
svenmueller opened this issue Dec 10, 2016 · 9 comments

svenmueller commented Dec 10, 2016

What did you do?
Queried for alert time series data.

What did you expect to see?
A single time series for the alert, with a consistent label set.

What did you see instead? Under which circumstances?

When querying for alerts, two time series are returned instead of a single one; one of them is missing the external label "hostname".

Example:
Alert "InstanceLowMemory" is fired for instance "domain.com" which creates a timeseries with external label "hostname". But for the same alert, there is also a different timeseries without label "hostname".

Query:

ALERTS{alertstate="firing", customer="abc", alertname="InstanceLowMemory"} 

Result:

ALERTS{alertname="InstanceLowMemory",alertstate="firing",customer="abc",environment="stage",hostname="prometheus-yalow",instance="domain.com",job="node",region="eu",severity="page"}
ALERTS{alertname="InstanceLowMemory",alertstate="firing",customer="abc",environment="stage",instance="domain.com",job="node",region="eu",severity="page"}
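
To confirm the duplication, counting the matching series (same selector as above):

count(ALERTS{alertstate="firing", customer="abc", alertname="InstanceLowMemory"})

returns 2 here, even though only a single alert is firing.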

Environment

  • System information:

    Linux 3.13.0-105-generic x86_64

  • Prometheus version:

    prometheus, version 1.4.1 (branch: master, revision: 2a89e87)
    build user: root@e685d23d8809
    build date: 20161128-09:59:22
    go version: go1.7.3

  • Alertmanager version:

    alertmanager, version 0.5.1 (branch: master, revision: 0ea1cac51e6a620ec09d053f0484b97932b5c902)
    build user: root@fb407787b8bf
    build date: 20161125-08:14:40
    go version: go1.7.3

  • Prometheus configuration file:

# my global config
global:
  scrape_interval:     15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s # By default, evaluate rules every 15 seconds.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      hostname: 'prometheus-awesome-yalow'

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
  - "/etc/prometheus/rules/*.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']


  # Scrape the Node Exporter every 5 seconds.
  - job_name: 'node'
    scrape_interval: 5s
    file_sd_configs:
      - files:
        - /etc/prometheus/targets/*.yaml

    relabel_configs:
      - source_labels: [__address__]
        regex: (.*):9100
        replacement: ${1}
        target_label: instance
      - source_labels: [instance]
        regex: .*\.([\w,-]+)\.domain\.com
        replacement: ${1}
        target_label: region
      - source_labels: [instance]
        regex: (\w+).*
        replacement: ${1}
        target_label: customer
      - source_labels: [instance]
        regex: .*-(dev|stage|prod)-.*
        replacement: ${1}
        target_label: environment

alerting:
  alertmanagers:
    - static_configs:
      - targets: ["alertmanager:9093"]
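
For illustration, here is how a target would pass through the relabel rules above (the instance in the query result is redacted, so the hostname below is hypothetical). Given __address__ = "abc-stage-01.eu.domain.com:9100":

  (.*):9100                  -> instance    = "abc-stage-01.eu.domain.com"
  .*\.([\w,-]+)\.domain\.com -> region      = "eu"
  (\w+).*                    -> customer    = "abc"
  .*-(dev|stage|prod)-.*     -> environment = "stage"

Note that none of these rules set the "hostname" label; it comes solely from external_labels.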
brancz (Member) commented Dec 12, 2016

My guess is that this is happening because external labels are only attached at the point when a notification is sent out, whereas we should have already added them at the point where the alert series is added to storage. Does that sound right @brian-brazil @fabxc ?

brian-brazil (Member) commented Dec 12, 2016

They should only be added when they're sent out; otherwise they would not be external labels.
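
For example (a sketch, assuming the node job from the config above), compare a local API query, which carries no external labels, with a federation scrape, which attaches them:

# local query API: result has no external labels
curl -G 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up{job="node"}'

# federation endpoint: external labels (e.g. hostname) are attached
curl -G 'http://localhost:9090/federate' --data-urlencode 'match[]=up{job="node"}'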

brancz (Member) commented Dec 12, 2016

Makes sense. So which of the two time series should exist and which should not, @brian-brazil?

brian-brazil (Member) commented Dec 13, 2016

Only the time series without the external labels should exist.
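
That is, of the two series in the report, only the one without the external "hostname" label should be stored:

ALERTS{alertname="InstanceLowMemory",alertstate="firing",customer="abc",environment="stage",instance="domain.com",job="node",region="eu",severity="page"}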

brancz (Member) commented Dec 13, 2016

OK, that makes sense as well. I'll have a look later to see if I can find what could be causing this.

fabxc (Member) commented Jan 17, 2017

@brancz did you look into that any further?

brancz (Member) commented Jan 17, 2017

I don't think so, sorry. I don't think I found what was causing this, but I also don't remember how much time I spent on it.

fabxc added this to the v1.5 milestone Jan 17, 2017

brian-brazil (Member) commented May 15, 2017

This was fixed.

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 23, 2019
