Discard timestamps at scrape time #5302
Comments
roidelapluie changed the title from "Improve staleness handling" to "Discard timestamps at scrape time" on Mar 4, 2019
If another monitoring system is exposing timestamps, we typically want them. Do you have a particular use case in mind?
I have a partner exposing time series with timestamps (rate of requests), but they do not publish the 0's. They do it by querying Elasticsearch, with a custom Java agent producing the Prometheus exposition format. That means staleness handling cannot apply. Another use case: mtail 0.24 exposed 0 as the timestamp (it was a bug, fixed in the latest release). I know these are not bugs in Prometheus and I am not suggesting to make this the default, of course, but for those edge cases being able to discard the timestamps would be great.
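For context, the Prometheus text exposition format allows an optional timestamp, in milliseconds since the epoch, after the sample value. A line of the kind being discussed (metric name and values borrowed from the example later in this thread) looks like:

```
# A sample with an explicit timestamp (milliseconds since epoch):
http_call:rate5m{service="a"} 10 1551784037000
```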
roidelapluie added 4 commits to roidelapluie/prometheus that referenced this issue on Mar 4, 2019
For that I imagine you'd be using avg_over_time, which is going to give the same result either way.
Such a sample would be dropped. I'm not opposed to this, but I'd like to see a good use case first.
roidelapluie added a commit to roidelapluie/prometheus that referenced this issue on Mar 4, 2019
Yes, it would be dropped. But the metric was actually relevant, so we should not drop it; we should scrape it at the current time instead.
avg_over_time would not give the same result, and is not really useful here, as my use case involves a metric which is already computed by our partner over 5 minutes.

Rate value: [graph]

Actually exposed by our partner: [graph]

We cannot see the drop to "0". If we ignore the timestamp, we could get: [graph]

It is an even bigger problem with labels, as some label sets are visible for 5 minutes when they should actually only appear once.
roidelapluie added a commit to roidelapluie/prometheus that referenced this issue on Mar 4, 2019
You'd get no data at that point though, and that's pretty hard to work with.
I do a sum() of the rates by some useful labels, so I can work with that.
I currently have an ugly recording rule:

(We have a job config that relabels enduser_inbound:http_requests_duration_seconds:avg5m to raw_enduser_inbound:http_requests_duration_seconds:avg5m at scrape time.) That is causing me way more issues than this proposal to ignore timestamps.
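The rule itself is not quoted above; a minimal sketch of that kind of workaround, assuming the raw_/non-raw metric names mentioned in the comment, might look like this:

```yaml
groups:
  - name: reexpose-partner-metrics   # hypothetical group name
    rules:
      # Re-record the relabelled raw series under its intended name at rule
      # evaluation time, so the sample gets a fresh timestamp and normal
      # staleness handling applies.
      - record: enduser_inbound:http_requests_duration_seconds:avg5m
        expr: raw_enduser_inbound:http_requests_duration_seconds:avg5m
```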
I don't think this solves the fundamental problem. CloudWatch also does this, for example, and we'd end up persisting the same sample twice rather than (correctly) only taking in the value we want at one timestamp. Your example data also doesn't line up with what you're saying.
Use case 1: I get these metrics:

http_call:rate5m{service="a"} 10 1551784037000

and on the next scrape, I get:

http_call:rate5m{service="a"} 11 1551784067000

If I sum up with sum(http_call:rate5m) at 1551784067001, I will get 15. I would like to get 11. I am not the master of those metrics, just a consumer.

Use case 2: broken exporter behaviour: google/mtail#210
I agree that this is not something that will be needed a lot, but I need it now, and I expect that, with the Prometheus format becoming more important in the future, other people will need this too. (PS: I would really like to avoid writing my own proxy for this.)
Use case 3: federating Prometheus with systems that expose timestamps but refresh at a slow interval (e.g. every 10 minutes).
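For illustration, a federation scrape of such a slow upstream is commonly configured along these lines (the job name, match[] selector, and target are placeholders):

```yaml
scrape_configs:
  - job_name: 'federate-slow-upstream'          # placeholder job name
    honor_labels: true
    metrics_path: /federate
    params:
      'match[]':
        - '{job="upstream"}'                    # placeholder selector
    static_configs:
      - targets: ['upstream-prometheus:9090']   # placeholder target
```

The /federate endpoint exposes samples with their original timestamps, which is why the staleness behaviour described in the proposal below applies here.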
roidelapluie commented on Mar 4, 2019
Proposal
Use case. Why is this important?
When Prometheus federates another monitoring system, that system may expose timestamps along with its samples.
In the case of federation with timestamps, Prometheus will not mark a metric as stale when it disappears. Instead, the metric remains present in queries up to the query.lookback-delta value.
Considering that OpenMetrics will not help here (OpenObservability/OpenMetrics#37):
Could we have a job option honor_timestamps to discard timestamps? That way power users can deal with broken third parties.
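A sketch of how the proposed option could look in a scrape configuration (the job name and target are placeholders, and the option itself is only a proposal at this point):

```yaml
scrape_configs:
  - job_name: 'partner-federation'              # placeholder job name
    # Proposed per-job switch: ignore timestamps exposed by the target and
    # record samples at scrape time instead, so staleness handling works.
    honor_timestamps: false
    static_configs:
      - targets: ['partner.example.com:9090']   # placeholder target
```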