Does not discard duplicate data #51

Closed
msiebuhr opened this issue Jul 30, 2015 · 2 comments

@msiebuhr

I tried pushing a batch of historical data to the Pushgateway:

# TYPE http_request_duration_microseconds summary
http_request_duration_microseconds{method="GET",quantile="0.5",source="web.1",status="200"} 3000 1438224000000
http_request_duration_microseconds{method="GET",quantile="0.9",source="web.1",status="200"} 3000 1438224000000
http_request_duration_microseconds{method="GET",quantile="0.99",source="web.1",status="200"} 3000 1438224000000
http_request_duration_microseconds_count{method="GET",source="web.1",status="200"} 1 1438224000000
http_request_duration_microseconds_sum{method="GET",source="web.1",status="200"} 3000 1438224000000

http_request_duration_microseconds{method="GET",quantile="0.5",source="web.1",status="200"} 4000 1438224720000
http_request_duration_microseconds{method="GET",quantile="0.9",source="web.1",status="200"} 4000 1438224720000
http_request_duration_microseconds{method="GET",quantile="0.99",source="web.1",status="200"} 4000 1438224720000
http_request_duration_microseconds_count{method="GET",source="web.1",status="200"} 1 1438224720000
http_request_duration_microseconds_sum{method="GET",source="web.1",status="200"} 4000 1438224720000

Fetching http://gw:9091/metrics returns the same number of quantile lines, but every timestamp is set to the latest one known for the series. Only the _count and _sum lines have been deduplicated:

http_request_duration_microseconds{method="GET",source="web.1",status="200",quantile="0.5"} 3000 1438224720000
http_request_duration_microseconds{method="GET",source="web.1",status="200",quantile="0.9"} 3000 1438224720000
http_request_duration_microseconds{method="GET",source="web.1",status="200",quantile="0.99"} 3000 1438224720000

http_request_duration_microseconds{method="GET",source="web.1",status="200",quantile="0.5"} 4000 1438224720000
http_request_duration_microseconds{method="GET",source="web.1",status="200",quantile="0.9"} 4000 1438224720000
http_request_duration_microseconds{method="GET",source="web.1",status="200",quantile="0.99"} 4000 1438224720000

http_request_duration_microseconds_sum{method="GET",source="web.1",status="200"} 4000 1438224720000
http_request_duration_microseconds_count{method="GET",source="web.1",status="200"} 1 1438224720000

(Extra linebreaks are mine.)

@beorn7
Member

beorn7 commented Jul 30, 2015

Yeah, I guess the parser should notice if duplicate quantiles are set and only take the latest. And at least warn... See prometheus/client_golang#150 where this has to be fixed.
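
To illustrate the "take the latest and at least warn" idea, here is a minimal Python sketch. It is not the client_golang parser: the function name `dedupe_latest` and the regular expressions are assumptions made for illustration, and the pattern handles only simple `name{labels} value timestamp` lines like the ones in this issue.

```python
import re
from typing import Dict, List, Tuple

# Simplified pattern (assumption): name{labels} value [timestamp].
# This is not a complete parser for the Prometheus text format.
SAMPLE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)(?:\s+(-?\d+))?$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def dedupe_latest(text: str) -> Tuple[Dict[SeriesKey, Tuple[float, int]], List[str]]:
    """Keep one sample per (name, label set), preferring the newest timestamp.

    Returns the surviving samples and a list of warning messages.
    """
    latest: Dict[SeriesKey, Tuple[float, int]] = {}
    warnings: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and TYPE/HELP comments
        m = SAMPLE_RE.match(line)
        if m is None:
            warnings.append(f"unparsable line: {line}")
            continue
        name, raw_labels, value, ts = m.groups()
        # Sort labels so that label order does not affect series identity.
        key = (name, tuple(sorted(LABEL_RE.findall(raw_labels))))
        sample = (float(value), int(ts) if ts else 0)
        if key in latest:
            warnings.append(f"duplicate sample for {name}{{{raw_labels}}}")
            if sample[1] < latest[key][1]:
                continue  # the stored sample is newer, keep it
        latest[key] = sample
    return latest, warnings
```

Fed the ten sample lines from the report above, this would keep five series (three quantiles, `_count`, `_sum`), each with the 4000/1438224720000 values where applicable, and record five duplicate warnings.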

On a more general note, the problem is that the text format is a very imperfect representation of the "truth". The native metric format of Prometheus is defined in https://github.com/prometheus/client_model/blob/master/metrics.proto .
While a summary looks like many independent time series in the text format as well as in the representation on the server, in the protobuf format all the quantiles and the sum and the count are just one complex metric with one timestamp.
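
As a rough illustration of that difference, the Python sketch below folds the flat text-format samples of a summary into one object with a single timestamp. The field names `sample_sum`, `sample_count` and `timestamp_ms` follow metrics.proto; the class `SummaryMetric`, the function `group_summary`, and the choice to keep the newest timestamp are assumptions made for this example, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple

Labels = Tuple[Tuple[str, str], ...]


@dataclass
class SummaryMetric:
    """Hypothetical stand-in for one summary metric: one label set, one timestamp."""
    labels: Labels
    quantiles: Dict[float, float] = field(default_factory=dict)
    sample_sum: float = 0.0
    sample_count: int = 0
    timestamp_ms: int = 0


def group_summary(
    base_name: str,
    samples: Iterable[Tuple[str, Dict[str, str], float, int]],
) -> Dict[Labels, SummaryMetric]:
    """Group (name, labels, value, timestamp_ms) samples into summary metrics."""
    out: Dict[Labels, SummaryMetric] = {}
    for name, labels, value, ts in samples:
        is_quantile = name == base_name and "quantile" in labels
        is_sum = name == base_name + "_sum"
        is_count = name == base_name + "_count"
        if not (is_quantile or is_sum or is_count):
            continue  # sample does not belong to this summary
        # The quantile label is part of the value, not of the metric identity.
        rest = tuple(sorted((k, v) for k, v in labels.items() if k != "quantile"))
        metric = out.setdefault(rest, SummaryMetric(labels=rest))
        if is_quantile:
            # A later sample for the same quantile overwrites the earlier one.
            metric.quantiles[float(labels["quantile"])] = value
        elif is_sum:
            metric.sample_sum = value
        else:
            metric.sample_count = int(value)
        # Assumption: the whole metric carries the newest timestamp seen.
        metric.timestamp_ms = max(metric.timestamp_ms, ts)
    return out
```

With the two pushes from the report, this yields a single metric for `method="GET",source="web.1",status="200"` holding three quantiles, one sum, one count, and one timestamp, which is why duplicate quantile lines have no sensible representation in the protobuf view.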

@beorn7 beorn7 self-assigned this Jul 30, 2015
@beorn7
Member

beorn7 commented Oct 20, 2015

This will be fixed as part of prometheus/client_golang#150
