Discard timestamps at scrape time #5302

Closed
roidelapluie opened this Issue Mar 4, 2019 · 12 comments

@roidelapluie
Contributor

roidelapluie commented Mar 4, 2019

Proposal

Use case. Why is this important?

When Prometheus is federating another monitoring system, that system may be exposing timestamps.
In the case of federation with timestamps, Prometheus will not mark a metric as stale if it disappears. Instead, the metric remains visible until the query.lookback-delta window has passed.
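The effect described above can be sketched with a small model. This is only an illustration, not Prometheus code: the function name `still_visible` is hypothetical, the 5 minute lookback is assumed to be the default, and the exact treatment of the window boundary is simplified.

```python
# Simplified model: a sample that carries its own timestamp gets no staleness
# marker, so it stays queryable until it falls out of the lookback window.
LOOKBACK_SECONDS = 300  # assumed default lookback of 5 minutes


def still_visible(last_sample_ts, query_ts, lookback=LOOKBACK_SECONDS):
    """Return True if an instant query at query_ts would still return the sample."""
    age = query_ts - last_sample_ts
    return 0 <= age < lookback


# The series was last exposed at t=0 and then disappeared from the target.
print(still_visible(0, 60))   # one minute later
print(still_visible(0, 299))  # just before the window closes
print(still_visible(0, 301))  # after the window has passed
```

In this model the vanished series keeps answering queries for almost five minutes, which is the behaviour the proposal wants to avoid.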

Considering that OpenMetrics will not help (OpenObservability/OpenMetrics#37), could we have a job option honor_timestamps to discard timestamps? That way power users can deal with broken third parties.

@roidelapluie roidelapluie changed the title Improve staleness handling Discard timestamps at scrape time Mar 4, 2019

@brian-brazil

Member

brian-brazil commented Mar 4, 2019

If another monitoring system is exposing timestamps, we typically want them. Do you have a particular use case in mind?

@roidelapluie

Contributor Author

roidelapluie commented Mar 4, 2019

I have a partner exposing time series with timestamps (rate of requests) but they do not publish the 0's. They query Elasticsearch and have a custom Java agent that produces the Prometheus exposition format. This means that staleness cannot apply.

Another use case: mtail 0.24 exposed 0 as the timestamp (it was a bug, fixed in the latest release).

I know these are not bugs in Prometheus, and of course I am not suggesting making this the default, but for those edge cases being able to discard the timestamps would be great.

roidelapluie added a commit to roidelapluie/prometheus that referenced this issue Mar 4, 2019

Add honor_timestamps
Fixes prometheus#5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>

@brian-brazil

Member

brian-brazil commented Mar 4, 2019

I have a partner exposing time series with timestamps (rate of requests) but they do not publish the 0's.

For that I imagine you'd be using avg_over_time, which is going to give the same result either way.

Another use case: mtail 0.24 exposed 0 as the timestamp (it was a bug, fixed in the latest release).

Such a sample would be dropped.

I'm not opposed to this, but I'd like to see a good use case first.

@roidelapluie

Contributor Author

roidelapluie commented Mar 4, 2019

Yes, it would be dropped. But the metrics were actually relevant, so we would not want to drop them, just ingest them with the current time as their timestamp.

@roidelapluie

Contributor Author

roidelapluie commented Mar 4, 2019

avg_over_time would not give the same result, and it is not really useful here, as my use case involves a metric which is already computed by our partner over 5 minutes.

Rate value:

Time Value
0  10
10 5
20 10
30 0
40 10

Actually Exposed by our partner:

Time Value
0  10
10 5
20 10
40 10

We cannot see the drop to 0 at time 30.

If we ignore the timestamp, we could get:

Time Value
1  10
11 5
21 10
31 -stale-
41 10

It is an even bigger problem with labels, as some label sets stay visible for 5 minutes when they should actually appear only once.

@brian-brazil

Member

brian-brazil commented Mar 4, 2019

You'd get no data at that point though, and that's pretty hard to work with.

@roidelapluie

Contributor Author

roidelapluie commented Mar 4, 2019

I do a sum() of the rates by some useful labels so I can work with that.

@roidelapluie

Contributor Author

roidelapluie commented Mar 4, 2019

I currently have an ugly recording rule:

record: enduser_inbound:http_requests_duration_seconds:avg5m
expr: |
    raw_enduser_inbound:http_requests_duration_seconds:avg5m
    and (time() - timestamp(raw_enduser_inbound:http_requests_duration_seconds:avg5m) < 65)

(we have a job config that relabels enduser_inbound:http_requests_duration_seconds:avg5m to raw_enduser_inbound:http_requests_duration_seconds:avg5m at scrape time)

That is causing me way more issues than this proposal to ignore timestamps.
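As a rough illustration of what the recording rule above does, the following Python sketch applies the same freshness test outside of PromQL. The function name `fresh_only` and the sample data are hypothetical; only the 65 second threshold comes from the rule.

```python
def fresh_only(samples, now, max_age=65):
    """Keep only samples whose own timestamp is less than max_age seconds old.

    samples maps a label value to a (timestamp_seconds, value) pair, mirroring
    `x and (time() - timestamp(x) < 65)` from the recording rule.
    """
    return {
        labels: (ts, value)
        for labels, (ts, value) in samples.items()
        if now - ts < max_age
    }


# Hypothetical data: service "a" was refreshed 30s ago, service "b" 130s ago.
raw = {"a": (1000, 1.5), "b": (900, 2.0)}
print(fresh_only(raw, now=1030))
```

Only the entry for "a" passes the filter here, because the entry for "b" is older than 65 seconds even though it would still be inside the lookback window.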

@brian-brazil

Member

brian-brazil commented Mar 5, 2019

I don't think this solves the fundamental problem. CloudWatch also does this, for example, and we'd end up persisting the same sample twice rather than (correctly) only taking in the value we want at one timestamp.

Your example data also doesn't line up with what you're saying.

@roidelapluie

Contributor Author

roidelapluie commented Mar 5, 2019

Use case 1:

I get metrics:

http_call:rate5m{service="a"} 10 1551784037000
http_call:rate5m{service="b"} 4 1551784037000

And on the next scrape, I get

http_call:rate5m{service="a"} 11 1551784067000

If I evaluate sum(http_call:rate5m) at 1551784067001, I get 15. I would like to get 11.

I do not control those metrics; I am just a consumer of them.
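The arithmetic of use case 1 can be reproduced with a toy model of instant queries. This is a sketch under stated assumptions, not how Prometheus is implemented: the helper names are hypothetical, the staleness marker is represented by a plain sentinel object, and scrapes are assumed to happen at exactly the exposed timestamps so that both variants can be compared at the same query time.

```python
LOOKBACK_MS = 5 * 60 * 1000  # assumed default lookback of 5 minutes
STALE = object()  # stand-in for the staleness marker written by Prometheus


def instant_value(samples, t_ms, lookback_ms=LOOKBACK_MS):
    """Return the newest value within the lookback window, or None if absent or stale."""
    in_window = [(ts, v) for ts, v in samples if t_ms - lookback_ms < ts <= t_ms]
    if not in_window:
        return None
    _, value = max(in_window, key=lambda sample: sample[0])
    return None if value is STALE else value


def instant_sum(series, t_ms):
    """Sum the current value of every series that has one at t_ms."""
    values = (instant_value(samples, t_ms) for samples in series.values())
    return sum(v for v in values if v is not None)


# Exposed timestamps are honored: service "b" vanishes but gets no stale marker.
honored = {
    "a": [(1551784037000, 10), (1551784067000, 11)],
    "b": [(1551784037000, 4)],
}
# Timestamps are discarded: "b" is missing from the second scrape, so it is marked stale.
discarded = {
    "a": [(1551784037000, 10), (1551784067000, 11)],
    "b": [(1551784037000, 4), (1551784067000, STALE)],
}

query_time = 1551784067001
print(instant_sum(honored, query_time))    # 15
print(instant_sum(discarded, query_time))  # 11
```

With honored timestamps the old sample for service "b" is still inside the lookback window and is added to the sum; once a stale marker exists, only service "a" contributes.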

Use case 2

Broken exporter behaviour: google/mtail#210

@roidelapluie

Contributor Author

roidelapluie commented Mar 5, 2019

I agree that this is not something that will be needed a lot, but I need it now, and I expect that as the Prometheus format becomes more important, other people will need it too.

(PS: I would really like to avoid writing my own proxy for this.)

@roidelapluie

Contributor Author

roidelapluie commented Mar 5, 2019

Use case 3

Federating Prometheus with systems that expose timestamps but refresh at a slow interval (e.g. 10m).

roidelapluie added a commit to roidelapluie/prometheus that referenced this issue Mar 8, 2019

Add honor_timestamps
Fixes prometheus#5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>


roidelapluie added a commit to roidelapluie/prometheus that referenced this issue Mar 12, 2019

Add honor_timestamps
Fixes prometheus#5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>


roidelapluie added a commit to roidelapluie/prometheus that referenced this issue Mar 14, 2019

Add honor_timestamps
Fixes prometheus#5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>


brian-brazil added a commit that referenced this issue Mar 15, 2019

Add honor_timestamps (#5304)
Fixes #5302

Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>