Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch remaining metrics to seconds #1606

Closed
brian-brazil opened this Issue Apr 29, 2016 · 12 comments

Comments

Projects
None yet
4 participants
@brian-brazil
Copy link
Member

brian-brazil commented Apr 29, 2016

There's still a handful of metrics in Prometheus not using seconds, mostly around storage. We should switch them to seconds for consistency.

While it's proposed that it's okay to change this sort of thing post 1.0, it'd be best to get it in before that.

@grobie

This comment has been minimized.

Copy link
Member

grobie commented Apr 29, 2016

Some of the quantile summaries should also be histograms for aggregations.

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Apr 29, 2016

Most of those summaries should be just changed not to have quantiles imho.

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Apr 30, 2016

What's the benefit of removing them? They are not exactly expensive in the
Prometheus case.

On Fri, Apr 29, 2016, 7:50 PM Brian Brazil notifications@github.com wrote:

Most of those summaries should be just changed not to have quantiles imho.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#1606 (comment)

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Apr 30, 2016

I think they're of low value and potentially confusing given the disproportionate importance that is attached to quantiles compared to their actual utility. Histhograms are also favoured, so if we really need this data (I think we can live without) we should use those.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Apr 30, 2016

+1 for changing everything to seconds.

I find some of the quantiles extremely useful. In general, if the latency of a single host matters, no aggregation is needed, and precomputed quantiles are very handy. A Prometheus server is a good example where you are interested in the latency of a single server.

At SC, we use a number of quantiles exposed by the Prometheus server quite regularly. We can go through all the summaries and decide for each one which of the following is the right course of action: (a) remove, (b) turn into histogram, or (c) leave it as is is.

@brian-brazil brian-brazil added this to the v1.0.0 milestone Jun 23, 2016

@fabxc fabxc closed this Jun 23, 2016

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jun 23, 2016

Problem is that some non-second metrics come from the infamous InstrumentHandler in client_golang.

I see the following possibilities:

  1. Wait for the reworked client_golang and port to it before 1.0. (I have committed to a deadline of early August for that. Which is probably too late for the 1.0 plans.)
  2. Keep it as it is, and break the related metrics post 1.0 when we update to the reworked client_golang.
  3. Fix the metrics to seconds in head of client_golang (or a special branch) and vendor it into Prometheus.
  4. Don't use the InstrumentHandler at all and use direct instrumentation where needed.

(3) Has the problem that we probably want to revisit the metrics provided by InstrumentHandler anyway, so it's almost as breaking later as (2). (4) is possibly the best solution anyway as InstrumentHandler is problematic in many ways and provides many metrics we don't even need. It is, however, the most work as we have to really think which of the metrics provided by InstrumentHandler we want to keep, which to drop, and which to modify.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jun 23, 2016

Reopening to not lose track.

@beorn7 beorn7 reopened this Jun 23, 2016

@fabxc

This comment has been minimized.

Copy link
Member

fabxc commented Jun 24, 2016

Metrics are not part of our stability guarantees. Holding of 1.0 for just that doesn't seem appropriate.
Removing from milestone.

@fabxc fabxc removed this from the v1.0.0 milestone Jun 24, 2016

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jun 24, 2016

Fine with me. So consensus is (2) then? @brian-brazil ?

@brian-brazil

This comment has been minimized.

Copy link
Member Author

brian-brazil commented Jun 24, 2016

2 is what I was thinking.

@beorn7

This comment has been minimized.

Copy link
Member

beorn7 commented Jun 24, 2016

Cool. I'll close this again, then, because the metric change will happen implicitly with the move to the new client library.

@beorn7 beorn7 closed this Jun 24, 2016

@lock

This comment has been minimized.

Copy link

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
You can’t perform that action at this time.