
Remote writes not sending `job` label after the first few writes #3080

Closed
ghost opened this Issue Aug 15, 2017 · 7 comments

ghost commented Aug 15, 2017

What did you do?
Configured Prometheus to write to a remote endpoint

What did you expect to see?
A constant flow of requests to that endpoint, each including a `job` label

What did you see instead? Under which circumstances?
The requests following the first few did not include the `job` label. The change occurs fairly quickly on a fresh/plain install where the only configuration change is setting a remote write URL.

Environment
macOS Sierra 10.12.6 and CentOS/RHEL 7

  • System information:

    Darwin 16.7.0 x86_64 and Linux 3.10.0-514.26.2.el7.YAHOO.20170707.8.x86_64 x86_64

  • Prometheus version:

prometheus, version 2.0.0-beta.1 (branch: HEAD, revision: 4dcb465)
build user: root@d3f9974fac5a
build date: 20170811-12:16:25
go version: go1.8.3

  • Prometheus configuration file:
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']
remote_write:
  - url: http://localhost:1337/write
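
A throwaway receiver along the following lines can be used to see exactly which labels arrive at the remote write endpoint. This is a rough sketch, not the endpoint actually used for this report; it assumes the github.com/golang/snappy and github.com/prometheus/prometheus/prompb packages (the exact package layout may differ between Prometheus versions). It listens on :1337 and prints each incoming series' label set and sample count, which makes a disappearing `job` label easy to spot.

```go
// main.go - throwaway receiver for inspecting remote-write payloads (sketch).
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"

	"github.com/golang/snappy"
	"github.com/prometheus/prometheus/prompb"
)

func main() {
	http.HandleFunc("/write", func(w http.ResponseWriter, r *http.Request) {
		compressed, err := ioutil.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Remote-write bodies are snappy-compressed protobuf WriteRequests.
		data, err := snappy.Decode(nil, compressed)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		var req prompb.WriteRequest
		if err := req.Unmarshal(data); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		// Print each series' label set so a missing job label is easy to spot.
		for _, ts := range req.Timeseries {
			labels := map[string]string{}
			for _, l := range ts.Labels {
				labels[l.Name] = l.Value
			}
			fmt.Printf("%v (%d samples)\n", labels, len(ts.Samples))
		}
	})
	log.Fatal(http.ListenAndServe(":1337", nil))
}
```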

@ghost ghost changed the title Remote writes not sending `job` dimension after the first write Remote writes not sending `job` dimension after the first few writes Aug 15, 2017

ghost commented Aug 15, 2017

It seems that when a metric is cached, the "rule labels" are not appended, i.e. the behaviour that ruleLabelsAppender provides does not take effect when the metric is cached (by cached, I mean when the conditional statement on line 776 of retrieval/scrape.go is true).

https://github.com/prometheus/prometheus/blob/dev-2.0/retrieval/scrape.go#L776
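
To illustrate the pattern being described, here is a simplified sketch with made-up names (it is not the actual code in retrieval/scrape.go): the label-mutation step that attaches target labels such as `job` only runs on the cache-miss branch, so any sample served from the cache keeps whatever label set was stored when the cache entry was created.

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative sketch only: none of these names come from the Prometheus
// source. The point is that the label-mutation step (which attaches target
// labels such as job) only runs on the cache-miss branch.

type label struct{ Name, Value string }

type cacheEntry struct {
	ref  uint64
	lset []label // labels as they were when the entry was first stored
}

type scrapeCache map[string]cacheEntry

func (c scrapeCache) get(line string) (cacheEntry, bool) {
	e, ok := c[line]
	return e, ok
}

// parseLabels is a toy stand-in for parsing an exposition line.
func parseLabels(line string) []label {
	name := strings.SplitN(line, "{", 2)[0]
	return []label{{Name: "__name__", Value: name}}
}

// mutateLabels stands in for the step that attaches target labels such as
// job and instance (the behaviour attributed to ruleLabelsAppender above).
func mutateLabels(lset []label) []label {
	return append(lset, label{Name: "job", Value: "prometheus"})
}

func appendSample(cache scrapeCache, line string, t int64, v float64) {
	if entry, ok := cache.get(line); ok {
		// Cache hit: the stored label set is reused verbatim. If it was
		// stored without the mutated labels, the job label is lost here.
		fmt.Printf("cached t=%d v=%g labels=%v\n", t, v, entry.lset)
		return
	}
	// Cache miss: labels are parsed and mutated, so job/instance are present.
	lset := mutateLabels(parseLabels(line))
	cache[line] = cacheEntry{ref: uint64(len(cache) + 1), lset: parseLabels(line)} // stores the unmutated set
	fmt.Printf("fresh  t=%d v=%g labels=%v\n", t, v, lset)
}

func main() {
	cache := scrapeCache{}
	appendSample(cache, "up", 1, 1) // first write: job label present
	appendSample(cache, "up", 2, 1) // later writes: served from cache, job label missing
}
```

Running the sketch prints the `job` label for the first append only, mirroring the behaviour reported above.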

@ghost ghost changed the title Remote writes not sending `job` dimension after the first few writes Remote writes not sending `job` label after the first few writes Aug 16, 2017

brian-brazil commented Aug 21, 2017

@tomwilkie

ghost commented Aug 21, 2017

A possibly related issue is that the caching drops certain metrics entirely, not just certain labels. When I disabled caching (by setting ok = false), a metric that had only been showing up in the first few writes started showing up consistently across the writes.

[Screenshot: screen shot 2017-08-21 at 4 20 37 pm]

For reference, the change takes effect at 15:30. Each of the earlier spikes is from the initial writes, after which the metrics just stop being reported.
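
In terms of the simplified sketch earlier in the thread (again, made-up names, not the real scrape code), the `ok = false` workaround amounts to forcing every sample down the cache-miss branch so the label mutation always runs:

```go
// A variant of appendSample from the earlier sketch with the workaround
// applied: overriding the cache-lookup result forces every sample down the
// miss path, so mutateLabels (which attaches the job label) always runs.
func appendSampleNoCache(cache scrapeCache, line string, t int64, v float64) {
	entry, ok := cache.get(line)
	ok = false // the "ok = false" workaround described in the comment above
	if ok {
		fmt.Printf("cached t=%d v=%g labels=%v\n", t, v, entry.lset)
		return
	}
	lset := mutateLabels(parseLabels(line))
	fmt.Printf("fresh  t=%d v=%g labels=%v\n", t, v, lset)
}
```

This is a diagnostic aid rather than a fix, since it defeats the purpose of the cache on every scrape.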

krasi-georgiev commented Sep 6, 2017

DIBS

fabxc commented Sep 14, 2017

This should be fixed via #3151. Unfortunately, it accidentally missed beta.3. We should probably cut beta.4 for that.

gouthamve commented Oct 6, 2017

Can this be closed?

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 23, 2019
