Relabeling with multiple target files broken in 1.5.0 #2377

Closed
joe-pll opened this Issue Jan 30, 2017 · 10 comments

joe-pll commented Jan 30, 2017

Hi, first of all, thank you for your work; I really appreciate it.

Config

I have the following Prometheus configuration. For simplicity, the blackbox targets directory contains three files, corresponding to three different jobs.

global:
  scrape_interval: 10s
  evaluation_interval: 10s
 
rule_files: 
 - /etc/prometheus/rules/recording/*.rules
 - /etc/prometheus/rules/alerting/*.rules
 
 
scrape_configs:
 - job_name: 'override-exporters'
   scrape_interval: 5s
 
   file_sd_configs:
     - files:
       - '/etc/prometheus/targets/exporters/*.yml'
 
 - job_name: 'override-blackbox'
   scrape_interval: 10s
   metrics_path: /probe
   file_sd_configs:
     - files:
         - '/etc/prometheus/targets/blackbox/*.yml'
   relabel_configs:
     - source_labels: [module]
       target_label: __param_module
     - source_labels: [module, __address__]
       regex: ^[http|tcp].*;(.*):(.*)$
       target_label: __param_target
       replacement: ${1}:${2}
     - source_labels: [module, __address__]
       regex: ^icmp;(.*):?.*
       target_label: __param_target
       replacement: ${1}
     - source_labels: [__param_target]
       regex: ^(.*):(.*)
       target_label: remote_address
       replacement: ${1}
     - source_labels: [__param_target]
       regex: ^([^:]*)$
       target_label: remote_address
       replacement: ${1}
     - source_labels: [host]
       target_label: instance
     - source_labels: []
       target_label: module
       replacement: ""
     - source_labels: []
       target_label: host
       replacement: ""
     - source_labels: []
       regex: .*
       target_label: __address__
       replacement: '<link_to_the_blackbox_exporter>:9115'

The directory /etc/prometheus/targets/blackbox/ contains the following files. In my configuration each file represents a job, so all the targets for a job are in the same file.
job_http.yml

- labels:
    host: host1
    job: blackbox-http
    module: http_get_5s
  targets:
  - 1.1.1.1:8080

job_ping.yml

- labels:
    host: host1
    job: blackbox-ping
    module: icmp
  targets:
  - 1.1.1.1

job_ssh.yml

- labels:
    host: host1
    job: blackbox-ssh
    module: tcp
  targets:
  - 1.1.1.1:22
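
For reference, the labels this configuration is expected to produce for the three targets above can be traced with a small simulation. This is a minimal sketch, not Prometheus code: the `relabel` helper is hypothetical, it models only the `replace` action, and it assumes the documented defaults (`;` separator, `(.*)` regex, `$1` replacement, fully anchored matching, and removal of a label whose new value is empty).

```python
import re

def relabel(labels, rules):
    """Apply Prometheus-style 'replace' rules to a copy of `labels` (toy model)."""
    out = dict(labels)
    for rule in rules:
        # Source label values are joined with ';' (assumed default separator).
        value = ";".join(out.get(n, "") for n in rule.get("source_labels", []))
        # fullmatch models the implicit anchoring of relabel regexes.
        match = re.fullmatch(rule.get("regex", "(.*)"), value)
        if match is None:
            continue  # a non-matching rule leaves the labels untouched
        # Translate ${1} / $1 references into Python's \g<1> form.
        template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", rule.get("replacement", "$1"))
        result = match.expand(template)
        if result:
            out[rule["target_label"]] = result
        else:
            out.pop(rule["target_label"], None)  # an empty value drops the label
    return out

EXPORTER = "<link_to_the_blackbox_exporter>:9115"

# The relabel_configs of the 'override-blackbox' job, in order.
RULES = [
    {"source_labels": ["module"], "target_label": "__param_module"},
    {"source_labels": ["module", "__address__"], "regex": r"^[http|tcp].*;(.*):(.*)$",
     "target_label": "__param_target", "replacement": "${1}:${2}"},
    {"source_labels": ["module", "__address__"], "regex": r"^icmp;(.*):?.*",
     "target_label": "__param_target", "replacement": "${1}"},
    {"source_labels": ["__param_target"], "regex": r"^(.*):(.*)",
     "target_label": "remote_address", "replacement": "${1}"},
    {"source_labels": ["__param_target"], "regex": r"^([^:]*)$",
     "target_label": "remote_address", "replacement": "${1}"},
    {"source_labels": ["host"], "target_label": "instance"},
    {"source_labels": [], "target_label": "module", "replacement": ""},
    {"source_labels": [], "target_label": "host", "replacement": ""},
    {"source_labels": [], "regex": ".*", "target_label": "__address__",
     "replacement": EXPORTER},
]

# One target per file, with the labels from the three files above.
TARGETS = {
    "job_http.yml": {"host": "host1", "job": "blackbox-http",
                     "module": "http_get_5s", "__address__": "1.1.1.1:8080"},
    "job_ping.yml": {"host": "host1", "job": "blackbox-ping",
                     "module": "icmp", "__address__": "1.1.1.1"},
    "job_ssh.yml": {"host": "host1", "job": "blackbox-ssh",
                    "module": "tcp", "__address__": "1.1.1.1:22"},
}

RESULTS = {name: relabel(labels, RULES) for name, labels in TARGETS.items()}
for name, labels in RESULTS.items():
    print(name, labels["__param_target"], labels["remote_address"])
```

Under these assumptions every target ends up with remote_address set to 1.1.1.1 and __address__ pointing at the exporter, which is the behavior described for v1.4.0 below.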

Problem

With v1.4.0 everything was fine: the files were loaded correctly and relabeling worked as expected.
After upgrading to v1.5.0, relabeling no longer replaces the labels correctly for the targets of the first file (first in alphabetical order), and only for those.
The label remote_address is rewritten to <link_to_the_blackbox_exporter> instead of the address 1.1.1.1. In addition, relabeling works correctly only for the first 2 to 3 seconds after Prometheus starts. Again, this happens only for the targets of the first file; relabeling for the other files keeps working.

When I merge the targets from all the files into a single file, everything works fine.
Furthermore, when Prometheus loads the files, the first one is loaded immediately, while the others appear in the interface only after the first scrape of a random target from the first file.

The debug log doesn't give any further information.

Giuseppe

@juliusv juliusv added the kind/bug label Jan 30, 2017

jeinwag commented Jan 31, 2017

I'm also experiencing some very weird relabeling issues when using Consul service discovery with multiple servers, e.g.:

- job_name: cadvisor
  consul_sd_configs:
    - server: consul.dc1
      services: [ cadvisor ]
    - server: consul.dc2
      services: [ cadvisor ]
  relabel_configs:
    - source_labels: ['__meta_consul_node','__meta_consul_dc']
      regex:         '(.*?);(.*)'
      target_label:  'instance'
      replacement:   '$1.$2'

After a couple of scrapes, some of the instance labels contain only '.dc1' or '.dc2'; the node name is missing. At that point __meta_consul_node is also missing from the "before relabeling" popup.
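
The '.dc1' value is what this rule yields whenever __meta_consul_node is absent at relabel time. A minimal sketch of that one rule, assuming the default `;` separator and fully anchored matching; the helper name `instance_label` and the node name `node1` are made up for illustration:

```python
import re

def instance_label(labels):
    """Model the single relabel rule above: join node and dc, then rewrite."""
    # A missing source label contributes an empty string to the joined value.
    value = ";".join(labels.get(k, "") for k in ("__meta_consul_node", "__meta_consul_dc"))
    match = re.fullmatch(r"(.*?);(.*)", value)
    if match is None:
        return labels.get("instance")
    # Equivalent of the '$1.$2' replacement.
    return f"{match.group(1)}.{match.group(2)}"

with_node = instance_label({"__meta_consul_node": "node1", "__meta_consul_dc": "dc1"})
without_node = instance_label({"__meta_consul_dc": "dc1"})
print(with_node, without_node)
```

So the symptom is consistent with the meta label having been lost before the rule ran, rather than with the regex itself misbehaving.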

It looks like these issues are caused by the changes to the relabeling logic in 6c07453.

Member

beorn7 commented Jan 31, 2017

Assigning to @brian-brazil to verify if it is connected to his relabeling changes.

Member

beorn7 commented Jan 31, 2017

Also, this might be the same as reported by Jeffrey Ollie (possibly @jcollie on GH) via the mailing list:

"""
I just tried this out and there appears to be a regression regarding relabeling. I have a number of scrape configs that look like this:

  - job_name: 'apcups'
    scrape_interval: 180s
    scrape_timeout: 120s
    metrics_path: /snmp
    params:
      module: [apcups]
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 127.0.0.1:19116
    file_sd_configs:
      - files:
          - /srv/prometheus/etc/apcups.yml

Everything works just fine on 1.4.1. When I switch to 1.5.0, everything starts out fine, but after a while the __param_target and the instance labels are set to 127.0.0.1:19116 (or whatever is put into the address label). Switching back to 1.4.1 solves the problem.
"""

Member

brian-brazil commented Jan 31, 2017

The only other potentially related change I see is 767c070, but this will need some debugging.

jeinwag commented Jan 31, 2017

I tested builds at 6c07453 and at the previous commit. The build at the previous commit doesn't cause any problems for me; the build at 6c07453 has the issues described previously.

Member

brian-brazil commented Jan 31, 2017

That narrows it down, thanks.

Contributor

alexsomesan commented Feb 1, 2017

#2375 could also be a manifestation of the issue reported here.

Member

brian-brazil commented Feb 1, 2017

Looking at the code, I suspect the issue is that the relabelling is propagating back to targetsFromGroup and thus the target group.
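
A toy model of that suspected mechanism, using the apcups rules quoted earlier in the thread. This is only a sketch and not the Prometheus code: the function names and the address `ups1.example:161` are hypothetical, and the three relabel rules are hard-coded as plain assignments. It shows how relabeling the group's own label map, instead of a copy, makes a second pass see the already rewritten __address__.

```python
def apply_apcups_relabeling(labels):
    """Hypothetical stand-in for the three apcups relabel rules; mutates `labels`."""
    labels["__param_target"] = labels["__address__"]
    labels["instance"] = labels["__param_target"]
    labels["__address__"] = "127.0.0.1:19116"
    return labels

def sync_sharing(group):
    # Relabels the group's own dicts, so the rewrite leaks back into the group.
    return [dict(apply_apcups_relabeling(target)) for target in group]

def sync_copying(group):
    # Relabels a copy, so the group keeps its original labels.
    return [apply_apcups_relabeling(dict(target)) for target in group]

shared_group = [{"__address__": "ups1.example:161"}]
shared_first = sync_sharing(shared_group)
shared_second = sync_sharing(shared_group)  # sees the rewritten __address__

copied_group = [{"__address__": "ups1.example:161"}]
copied_first = sync_copying(copied_group)
copied_second = sync_copying(copied_group)

print(shared_first[0]["instance"], shared_second[0]["instance"])
print(copied_first[0]["instance"], copied_second[0]["instance"])
```

In the sharing variant the first pass looks correct and the second pass sets instance and __param_target to 127.0.0.1:19116, which matches the "works at first, breaks after a while" reports above. The copying variant gives the same result on every pass.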

chriswiggins commented Feb 3, 2017

Can confirm that the latest PR #2386 fixes this issue for us. Thanks @brian-brazil!

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019
