New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

estimated histgoram are accumulate twice #3827

Closed
amnonh opened this Issue Oct 8, 2018 · 2 comments

Comments

Projects
None yet
3 participants
@amnonh
Contributor

amnonh commented Oct 8, 2018

Commit fc4416a change the get_histogram method to return an accumulative histogram instead of a regular histogram.

prometheus assumes that the return histogram is regular one and accumulate the result also,
The end result is that the histograms are now broken.

@duarten duarten added the bug label Oct 8, 2018

@glommer

This comment has been minimized.

Contributor

glommer commented Oct 8, 2018

I don't understand how this is the case. Before that patch, the histograms were verified to be bogus. Can you show me an example of the histograms being bad now ?

Also this link : https://prometheus.io/docs/concepts/metric_types/#histogram explicitly says that the buckets are expected to be cumulative: (see first bullet)

A histogram with a base metric name of <basename> exposes multiple time series during a scrape:

* cumulative counters for the observation buckets, exposed as <basename>_bucket{le="<upper inclusive bound>"}
* the total sum of all observed values, exposed as <basename>_sum
* the count of events that have been observed, exposed as <basename>_count (identical to <basename>_bucket{le="+Inf"} above)
@amnonh

This comment has been minimized.

Contributor

amnonh commented Oct 8, 2018

Prometheus expect it to be accumulative, but that is taking care of on the Prometheus level.

Point your browser to:
http://{ip}:9180/metrics

(for example http://localhost:9180/metrics)

and look for scylla_storage_proxy_coordinator_write_latency_bucket for example

see that the total number is less than the value of the last bucket.

The code that does the accumulation is found at:
prometheus.cc:write_text_representation
while I was debugging this issue i found another one with the labels of the sum and count.

The issue was only detected now because Prometheus 2.x uses only the text representation, and there is a difference in the implementation between the protobuf and the text.

avikivity added a commit to scylladb/seastar that referenced this issue Oct 9, 2018

prometheus: Fix histogram text representation
The current implementation of the histogram text representation has two
issues:
* The histograms values, are already accumulated, no need to accumulate them
again
* Need to add labels to the sum and count to associate them to the right
metric.

This patch fixes those two issues.

See scylladb/scylla#3827

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20181009121725.29332-1-amnon@scylladb.com>

avikivity added a commit to scylladb/scylla-seastar that referenced this issue Oct 9, 2018

prometheus: Fix histogram text representation
The current implementation of the histogram text representation has two
issues:
* The histograms values, are already accumulated, no need to accumulate them
again
* Need to add labels to the sum and count to associate them to the right
metric.

This patch fixes those two issues.

See scylladb/scylla#3827

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20181009121725.29332-1-amnon@scylladb.com>
(cherry picked from commit 4669469be5162b6d5138062c6d0446bcadf39a3b)

avikivity added a commit that referenced this issue Oct 9, 2018

Update seastar submodule
* seastar 5712816...39b89de (1):
  > prometheus: Fix histogram text representation

Fixes #3827.

avikivity added a commit to scylladb/scylla-seastar that referenced this issue Oct 9, 2018

prometheus: Fix histogram text representation
The current implementation of the histogram text representation has two
issues:
* The histograms values, are already accumulated, no need to accumulate them
again
* Need to add labels to the sum and count to associate them to the right
metric.

This patch fixes those two issues.

See scylladb/scylla#3827

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Message-Id: <20181009121725.29332-1-amnon@scylladb.com>
(cherry picked from commit 4669469be5162b6d5138062c6d0446bcadf39a3b)

avikivity added a commit that referenced this issue Oct 9, 2018

Update seastar submodule
* seastar ebf4812...b846dfe (1):
  > prometheus: Fix histogram text representation

Fixes #3827.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment