
Stale issue when federate from other Prometheus #4082

Closed
dmitriy-lukyanchikov opened this Issue Apr 12, 2018 · 6 comments

dmitriy-lukyanchikov commented Apr 12, 2018

Proposal

Remove the timestamp from federated metrics.

Bug Report

What did you do?
I have two Prometheus servers, A and B. Server B is configured to federate from server A.
Server A scrapes a metric carrying a couple of labels, exposed from a file on a couple of nodes. It looks like this: server_labels{dc="fr",env="prod"} 1
I use it in a group_left query.

Then I ran a query on server B (the one doing the federation) to graph how the metric's labels change over time.
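
For illustration, a group_left join of this kind could be written as the recording rule below. This is only a sketch: the report does not show the actual query, so node_load1, the instance join label, the group name and the recorded metric name are all assumptions; only server_labels, dc and env come from the report.

```yaml
# Hypothetical rule file; only server_labels, dc and env appear in the report.
groups:
  - name: server_labels_join                      # assumed group name
    rules:
      # Copy dc and env from server_labels onto another metric,
      # matching the two sides on the (assumed) instance label.
      - record: instance:node_load1:with_server_labels   # assumed name
        expr: node_load1 * on(instance) group_left(dc, env) server_labels
```

A join like this expects exactly one server_labels series per instance. If two series with different dc or env values are visible for the same instance at the same evaluation time, the match is no longer unique on the right-hand side and the expression fails.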

What did you expect to see?
I expected to see the different label values on the graph, and to see how the labels change over time.

What did you see instead? Under which circumstances?
A many-to-many error. Investigation confirmed that it is a staleness issue: all federated values carry a timestamp, so Prometheus waits exactly 5 minutes before deciding that the metric no longer exists. As a result, for a period of 5-6 minutes there are two copies of the metric with different labels.
On server A (which actually collects the metrics from the nodes) the query works perfectly.
All federated metrics look like this: server_labels{dc="fr",env="prod"} 1 1523536675185
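
One possible query-side workaround, sketched under assumptions rather than taken from this thread: on server B, keep only the server_labels series with the most recent sample timestamp per instance, using the PromQL timestamp() function. The instance join label, the group name and the recorded metric name are hypothetical, and the expression has not been tested against this setup.

```yaml
# Hypothetical rule file for server B; names other than server_labels,
# dc and env are assumptions.
groups:
  - name: server_labels_dedup                     # assumed group name
    rules:
      # Per instance, keep only the series whose latest sample is the
      # newest one. An older label set that is still visible because of
      # the 5-minute wait described above has an older sample timestamp
      # and is filtered out.
      - record: instance:server_labels:latest     # assumed name
        expr: |
          server_labels
            and on(instance, dc, env)
          (
            timestamp(server_labels)
              == on(instance) group_left()
            max by (instance) (timestamp(server_labels))
          )
```

A group_left query could then join against the recorded series instead of server_labels directly. This does not remove the timestamp from the federated samples; it only hides the older copy at query time.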

Environment

VMs on Digital Ocean

  • System information:

Linux 4.4.0-119-generic x86_64

  • Prometheus version:

Both Prometheus servers are 2.1.0

  • Prometheus configuration file:
# my global config
global:
  scrape_interval:     60s # Scrape targets every 60 seconds.
  evaluation_interval: 60s # Evaluate rules every 60 seconds.
  scrape_timeout: 50s      # Overrides the default scrape timeout of 10s.

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).

# Load and evaluate rules in this file every 'evaluation_interval' seconds.
rule_files:
#  - "lr_recording.rules"

# A scrape configuration containing exactly one job:
# here it federates the server_labels metric from server A.
scrape_configs:

  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  # Auto-discovery via consul tags by Registrator


  - job_name: 'federate_lr_metrics_node'
    scrape_interval: 60s

    honor_labels: true
    metrics_path: '/federate'

    params:
      'match[]':
        - '{__name__="server_labels"}'


    static_configs:
      - targets:
        - 'server_ip:9090'

Member

brian-brazil commented Apr 12, 2018

This behaviour is correct, removing the timestamp would result in the data having the wrong timestamp.

I think you're looking for external_labels.
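
As a rough sketch of that suggestion: external_labels is set in the global section of the server being federated from (server A here) and is attached to the series it exposes to external systems, including federation. The values below are copied from the sample metric in the report and are an assumption; this only fits when dc and env are the same for everything that server scrapes.

```yaml
# prometheus.yml on server A (sketch; values assumed from the sample metric)
global:
  external_labels:
    dc: fr
    env: prod
```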

Author

dmitriy-lukyanchikov commented Apr 12, 2018

But how will external_labels help if the metric is federated with a timestamp, which is what causes this issue?

Member

brian-brazil commented Apr 12, 2018

It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.

Author

dmitriy-lukyanchikov commented Apr 12, 2018

I mean that I don't think it will help at all, because the issue is caused by the timestamp and this setting does not remove it. But OK, I will ask there too.

Author

dmitriy-lukyanchikov commented Apr 12, 2018

@brian-brazil I think you misunderstood me. This metric does not come from the Prometheus server itself; it comes from a huge number of nodes inside the company, so external_labels is not a solution here. I need to collect these metrics and somehow federate them without the timestamp. If you say that is impossible, I don't think that is good, because many federated metrics will end up duplicated.
FYI, the config I provided is just an example.


lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
