Federated prometheus server doesn't update metric drop #3882
Comments
This is the expected staleness behaviour for data with timestamps such as from federation. This is not how federation is meant to be used for this and other reasons, see https://www.robustperception.io/federation-what-is-it-good-for/
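
To illustrate the staleness point, here is a rough Python model of how an instant query might pick a value. It is a sketch, not Prometheus code: the `instant_value` helper, the `None` stand-in for a staleness marker, and the 15-second sample spacing are all assumptions made for illustration. It relies on two documented behaviours: the default lookback window is 5 minutes, and series scraped with explicit timestamps (as federation exposes them) do not get staleness markers when they disappear. It is not meant to account for the exact delay reported in this issue.

```python
from typing import Optional, Sequence, Tuple

# Prometheus's default lookback window is 5 minutes.
LOOKBACK_SECONDS = 300.0

# (timestamp in seconds, value); None is a stand-in for a staleness marker.
Sample = Tuple[float, Optional[float]]


def instant_value(
    samples: Sequence[Sample], at: float, lookback: float = LOOKBACK_SECONDS
) -> Optional[float]:
    """Return the value a graph would show at time `at`, or None for a gap."""
    candidates = [s for s in samples if at - lookback <= s[0] <= at]
    if not candidates:
        return None
    # The newest sample in the window wins; a staleness marker yields a gap.
    _, value = max(candidates, key=lambda s: s[0])
    return value


# Hypothetical series: the agent stops reporting after t=30s.
federated = [(0.0, 40.0), (15.0, 40.0), (30.0, 40.0)]  # timestamped, no marker
direct = federated + [(45.0, None)]                    # marker on first missed scrape

print(instant_value(direct, 60.0))      # gap appears right away
print(instant_value(federated, 200.0))  # still the last value
print(instant_value(federated, 331.0))  # gap only once the window has passed
```

In this model the directly scraped series drops out as soon as the marker is written, while the federated copy keeps showing its last value until the lookback window has elapsed.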
brian-brazil closed this Feb 23, 2018
Thank you @brian-brazil. I see your point. As a general behaviour for a federated Prometheus node, I would prefer that it report no data when nothing can be pulled from the scraped node, instead of keeping the value stuck on 40 for about 8 minutes.
Luca
lock bot commented Mar 22, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lucapalano commented Feb 23, 2018 (edited)
What did you do?
I configured several Prometheus servers and one federated Prometheus which scrapes from the others (host1, host2, host3), using the attached configuration prometheus.txt
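
For readers without the attachment, the following Python sketch shows the general shape of the request a federating server makes against each first-level server's `/federate` endpoint, which takes one or more `match[]` selectors. The port `9090`, the `customer="test"` label selector, and the `federate_url` helper are assumptions for illustration; they are not taken from the attached configuration.

```python
from typing import List
from urllib.parse import urlencode


def federate_url(host: str, selectors: List[str], port: int = 9090) -> str:
    """Build a /federate URL with one match[] parameter per selector."""
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"http://{host}:{port}/federate?{query}"


# Hypothetical selector: every series labelled for customer "test".
for host in ("host1", "host2", "host3"):
    print(federate_url(host, ['{customer="test"}']))
```

Fetching such a URL by hand shows the samples, with their timestamps, exactly as the first-level server exposes them to the federated one.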
The federated server shows the following federated metrics in its graph:

There are 3 metrics/lines. They were scraped from a Prometheus server (the 1st level Prometheus server) which is in charge of collecting all the metrics for customer "test". Each metric is reported by a separate agent and is constantly monitored in order to trigger an alert in case a metric drops (in other words, in case of service failure).
So, I did some tests in order to simulate a service failure by stopping the reporting agent.
What did you expect to see?
I expected to see the metric drop on both Prometheus servers (the federated one and the 1st level Prometheus server) after stopping the agent.
What did you see instead? Under which circumstances?


I was able to see the metric drop only on the "1st level prometheus server" (the one which collects metrics from the agents), as shown by the red line in this graph:
The problem was with the federated server. There, the interruption wasn't immediate:
I waited for about 8 minutes and 4 seconds before seeing a break in the graph line. During those minutes, the graph line was stuck on 40 without fluctuations.
I wasn't able to perform any tuning in order to fix this behaviour. So, this looks like a bug (IMHO).
Environment
System information:
Linux 3.10.0-327.36.3.el7.x86_64 x86_64 (where the agent and the prometheus servers run)
Prometheus version:
prometheus, version 2.0.0 (branch: HEAD, revision: 0a74f98)
build user: root@615b82cb36b6
build date: 20171108-07:11:59
go version: go1.9.2
Prometheus configuration file:
Thanks in advance for your replies and your help! :-)
Luca