
Job relabeling stops working after v2.4.3 #5404

Closed
mattalberts opened this Issue Mar 25, 2019 · 13 comments

mattalberts commented Mar 25, 2019

Bug Report

Relabeling a job stopped working after v2.4.3.

What did you do?
Set scrape config to relabel job based on __meta_kubernetes_namespace

      relabel_configs:
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        regex: (.+)
        replacement: $1
        target_label: job

What did you expect to see?
I expected to see the job label replaced by the value of __meta_kubernetes_namespace. The expected behavior occurs on v2.4.3 and prior.

[screenshot: prom-job-namespace]
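
For illustration: with a pod in a namespace named, say, bq-etl (a hypothetical namespace, not part of the config above), the rule should yield scraped series carrying the namespace as the job label, e.g.

      up{job="bq-etl", ...}

rather than the scrape config's job_name.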

What did you see instead? Under which circumstances?
However, relabeling job using the same mechanism in versions v2.5.0, v2.6.1, v2.7.2, and v2.8.0 does not work. All jobs remain defaulted to the value of job_name.

[screenshot: prom-job-defaulted]
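
In other words, the series keep the configured job_name (again an illustrative sample, not actual output):

      up{job="~kubernetes-pods", ...}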

Environment

  • kubernetes v1.10.11
  • docker v17.0.3-1
  • Prometheus version:
  • v2.4.3 - 167a4b4 - job relabels
  • v2.5.0 - 67dc912 - job remains defaulted
  • v2.6.1 - b639fe1 - job remains defaulted
  • v2.7.2 - 82f98c8 - job remains defaulted
  • v2.8.0 - 5936949 - job remains defaulted
  • Prometheus configuration file:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-conf-v0
  namespace: monitoring-system
data:
  prometheus.yaml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 5s
      evaluation_interval: 15s
    rule_files:
    - /etc/prometheus/rules.d/alerting-rules
    - /etc/prometheus/rules.d/recording-rules
    - /etc/prometheus/rules.d/alerting-rules-canary
    - /etc/prometheus/rules.d/recording-rules-canary
    - /etc/cloudflare/rules.d/*.rules
    scrape_configs:
    - job_name: ~prometheus
      static_configs:
      - targets:
        - localhost:9090
    - job_name: ~prometheus-all
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - action: keep
        regex: monitoring-system;prometheus-server;http
        source_labels:
        - __meta_kubernetes_namespace
        - __meta_kubernetes_service_name
        - __meta_kubernetes_endpoint_port_name
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
        replacement: service_$1
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
        replacement: pod_$1
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: service_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: pod_node
    - job_name: ~prometheus-loader
      static_configs:
      - targets: ['localhost:8080']
        labels:
          sidecar: rule-loader
      relabel_configs:
      - replacement: ~prometheus-sidecars
        target_label: job
    - job_name: ~kubernetes-apiservers
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - action: keep
        regex: default;kubernetes;https
        source_labels:
        - __meta_kubernetes_namespace
        - __meta_kubernetes_service_name
        - __meta_kubernetes_endpoint_port_name
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
        replacement: service_$1
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
        replacement: pod_$1
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: service_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: pod_node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
    - job_name: ~kubernetes-nodes
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubernetes.default.svc:443
        target_label: __address__
      - regex: (.+)
        replacement: /api/v1/nodes/${1}/proxy/metrics
        source_labels:
        - __meta_kubernetes_node_name
        target_label: __metrics_path__
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
    - job_name: ~kubernetes-nodes-cadvisor
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - replacement: kubernetes.default.svc:443
        target_label: __address__
      - regex: (.+)
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
        source_labels:
        - __meta_kubernetes_node_name
        target_label: __metrics_path__
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
    # Maintaining prometheus scrapes for backward compatibility
    - job_name: ~kubernetes-service-endpoints
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scrape
      - action: replace
        regex: (https?)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scheme
        target_label: __scheme__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_format
        target_label: __param_format
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_service_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
        replacement: service_$1
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
        replacement: pod_$1
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: service_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: pod_node
      tls_config:
        insecure_skip_verify: true
    - job_name: ~kubernetes-services
      kubernetes_sd_configs:
      - role: service
      metrics_path: /probe
      params:
        module:
        - http_2xx
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_probe
      - source_labels:
        - __address__
        target_label: __param_target
      - replacement: blackbox
        target_label: __address__
      - source_labels:
        - __param_target
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
        replacement: service_$1
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: kubernetes_name
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        - __meta_kubernetes_service_name
        regex: (.+);(.+)
        replacement: $1/$2
        target_label: job
    - job_name: ~kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_scrape
      - action: replace
        regex: (https?)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scheme
        target_label: __scheme__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_format
        target_label: __param_format
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_pod_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
        replacement: pod_$1
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_controller_name
        regex: (.*)(?:-.*)
        replacement: $1
        target_label: kubernetes_name
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: service_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: pod_node
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        regex: (.+)
        replacement: $1
        target_label: job
      tls_config:
        insecure_skip_verify: true
---
  • Logs:
level=info ts=2019-03-25T16:18:54.563858517Z caller=main.go:324 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-03-25T16:18:54.563877897Z caller=main.go:325 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-03-25T16:18:54.564497109Z caller=main.go:640 msg="Starting TSDB ..."
level=info ts=2019-03-25T16:18:54.564552513Z caller=web.go:418 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-03-25T16:18:54.571051524Z caller=main.go:655 msg="TSDB started"
level=info ts=2019-03-25T16:18:54.571167146Z caller=main.go:724 msg="Loading configuration file" filename=/etc/prometheus/conf/prometheus.yaml
level=info ts=2019-03-25T16:18:54.575622448Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:54.5770268Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:54.57823994Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:54.57937525Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:54.696921408Z caller=main.go:751 msg="Completed loading of configuration file" filename=/etc/prometheus/conf/prometheus.yaml
level=info ts=2019-03-25T16:18:54.696969869Z caller=main.go:609 msg="Server is ready to receive web requests."
level=info ts=2019-03-25T16:18:55.255975995Z caller=main.go:724 msg="Loading configuration file" filename=/etc/prometheus/conf/prometheus.yaml
level=info ts=2019-03-25T16:18:55.258432271Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.259244949Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.259993149Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.260784327Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.486117788Z caller=main.go:751 msg="Completed loading of configuration file" filename=/etc/prometheus/conf/prometheus.yaml
level=info ts=2019-03-25T16:18:55.486591971Z caller=main.go:724 msg="Loading configuration file" filename=/etc/prometheus/conf/prometheus.yaml
level=error ts=2019-03-25T16:18:55.488777474Z caller=endpoints.go:130 component="discovery manager scrape" discovery=k8s role=endpoint msg="endpoints informer unable to sync cache"
level=error ts=2019-03-25T16:18:55.488812409Z caller=pod.go:85 component="discovery manager scrape" discovery=k8s role=pod msg="pod informer unable to sync cache"
level=info ts=2019-03-25T16:18:55.488936287Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.489628732Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.490315703Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.490956702Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-25T16:18:55.687249852Z caller=main.go:751 msg="Completed loading of configuration file" filename=/etc/prometheus/conf/prometheus.yaml

Note: nothing relevant in the logs, but included anyway.

brian-brazil (Member) commented Mar 25, 2019

It still works as before; the status page changed, however. The job label was briefly not shown here, but it's there again now.

mattalberts (Author) commented Mar 25, 2019

@brian-brazil Ah! Okay, so it was only the status page that changed. That's good to know. I've just confirmed that job was set on the metric (even though it was not displayed on the status page).

[screenshot: prom-job-in-metrics]
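
(For anyone wanting to reproduce the check, a query along the lines of the following in the expression browser, with bq-etl standing in for whatever namespace the target lives in, returns the scraped series with the relabeled job value even when the status page renders it differently:

      up{job="bq-etl"}

)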

The job label was briefly not shown here, but it's there again now.

When you say briefly, was there a commit that corrected the issue in master and I just need to wait for the blessed release?

brian-brazil (Member) commented Mar 25, 2019

That release was several months ago, either 2.5 or 2.6.

mattalberts (Author) commented Mar 25, 2019

Maybe it regressed in v2.8.0? All my screenshots where job is rendered as job_name are from v2.8.0.

simonpasquier (Member) commented Mar 26, 2019

Right, there's been a regression introduced by the Bootstrap upgrade; see #5406.

codesome (Member) commented Mar 26, 2019

Closed by #5406

codesome closed this Mar 26, 2019

mattalberts (Author) commented Apr 2, 2019

Fantastic! Thank you!

mattalberts (Author) commented Apr 3, 2019

@codesome @simonpasquier @brian-brazil
Does this ticket need to be re-opened? I just installed v2.8.1 and it exhibits the same issue.
[screenshot: prom-relabel-job-failure-version]
[screenshot: prom-relabel-job-failure-target]

    - job_name: ~kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      - action: keep
        regex: true
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_scrape
      - action: replace
        regex: (https?)
        source_labels:
        - __meta_kubernetes_service_annotation_prometheus_io_scheme
        target_label: __scheme__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_path
        target_label: __metrics_path__
      - action: replace
        regex: (.+)
        source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_format
        target_label: __param_format
      - action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        source_labels:
        - __address__
        - __meta_kubernetes_pod_annotation_prometheus_io_port
        target_label: __address__
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
        replacement: pod_$1
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        target_label: kubernetes_namespace
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_controller_name
        regex: (.*)(?:-.*)
        replacement: $1
        target_label: kubernetes_name
      - action: replace
        source_labels:
        - __meta_kubernetes_service_name
        target_label: service_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_name
        target_label: pod_name
      - action: replace
        source_labels:
        - __meta_kubernetes_pod_node_name
        target_label: pod_node
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        regex: (.+)
        replacement: $1
        target_label: job

Note: the job label should be the namespace.

brian-brazil (Member) commented Apr 3, 2019

That looks fine to me.

mattalberts (Author) commented Apr 3, 2019

In the example posted (both config and pictures), the job is replaced by __meta_kubernetes_namespace. However, the targets page still shows the job under its default job_name, ~kubernetes-pods, rather than the value of __meta_kubernetes_namespace. In the example posted, that would be bq-etl.

The same example would have rendered like this in v2.4.3:
[screenshot: prom-relabel-job-v2.4.3-target]

brian-brazil (Member) commented Apr 3, 2019

That's how it works now; everything is grouped by job_name.

mattalberts (Author) commented Apr 3, 2019

Okay, so the new behavior is to group by job_name. To alter the grouping then, in line with previous versions, do I need to add a rule to overwrite job_name with the value of __meta_kubernetes_namespace?

Naively adding something like this (I know job_name doesn't show up in the "Before relabeling" section) does nothing to the grouping.

      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        regex: (.+)
        replacement: $1
        target_label: job_name
      - action: replace
        source_labels:
        - __meta_kubernetes_namespace
        regex: (.+)
        replacement: $1
        target_label: job

Is there an alternate way to change grouping on the targets page? Or was that feature removed completely?

brian-brazil (Member) commented Apr 3, 2019

There's no way to change the grouping; it's now always by job_name.
