Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus enhancement #1120

Merged
merged 5 commits into from
Jun 30, 2022
Merged

Prometheus enhancement #1120

merged 5 commits into from
Jun 30, 2022

Conversation

amnonh
Copy link
Contributor

@amnonh amnonh commented Jun 30, 2022

Motivation:
Histograms are the Prometheus killer. They are big to send over the
network and are big to store in Prometheus.
Histograms' size is always an issue, but with Seastar, it becomes even trickier.
Typically, each shard would collect its histograms, so the overall data
is multiplied by the number of shards.

This series addresses the need to report quantile information like latency
without generating massive metrics reports.

A summary is a Prometheus metric type that holds a quantile summary (i.e.
p95, p99).

The downside of summaries is that they cannot be aggregated, which is
needed for a distributed system (i.e., calculate the p99 latency of a cluster).

The series adds four tools for Prometheus performance:

  1. Add summaries.
  2. Optionally, remove empty metrics. It's common to register metrics for
    optional services. It is now possible to mark those metrics as
    skip_when_empty and they will not be reported if they were never used.
  3. Allow aggregating metrics. The most common case is reporting per-node
    histogram instead of per shard. For example, for multi-nodes quantile calculation,
    we need a per-node histogram. It is now possible to mark a registered
    metric for aggregation. The metrics layer will aggregate this
    metric based on a list of labels. (Typically, this will be by shard,
    but it could be any other combination of labels).
  4. Reuse the stringstream instead of recreating an object on each
    iteration.

V6
changed the name of the skip_when_empty variable to should_skip_when_empty to make the compiler happy.

V5
Main changes:

  • split the aggregate and skip_on_empty into two patches
  • Use bool_class for skip_when_empty this also allows using the parenthesis
    operator to mark that a metric should use skip_when_empty.
  • Fix some typos in the comments
  • Add noexcept when needed
  • Change "\n" to '\n' when possible.

V4
Aggregation and skip_when_empty are now generalized in a sense that it's
a per metric configuration and can be used on any metric, not just
histogram.

The 'aggregate by' now accept labels instead of string which makes the API
cleaner.

The stringstream that is used inside Prometheus text reporting is now
reused instead of recreating on each iteration.

The aggregator object in Prometheus was rename and some comments were
added to make its use clearer.

V3
The main issue with the previous version was that it was too specific.
This version breaks the functionality and pushes some of the logic to
whoever uses the metrics layer.

Summaries are now just another kind of histogram, with no special magic
around them and no need for their own type.

A user can use any combination of labels for aggregation.
It's currently only applied to histograms, but we can do it with other
metrics if it will be helpful.

In the current implementation, to report a summary per shard and a single
histogram per node, the user of the metrics layer would register two metrics,
one for the summary and one for the histogram.

This patch add support for the summary type on the metrics layer.

A summary is a different kind of histogram, it's buckets are percentile
so the reporting layer (i.e. Prometheus for example) would know to
report it correctly.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
This patch adds a missing part to how histograms are being aggregated,
it needs to aggregate the sum and count as well.

Fixes scylladb#1114

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Skip when empty, means that metrics that are not in used, will not be
reported.

A common scenario is that a user register a metric, but that metric is
never used.
The most common case is histogram and summary but it is also valid
for counters.

This patch adds an option to mark a metric with skip_when_empty.
When done so, if a metric was never used (true for histogram, counters
and summaries) it will not be reported.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Aggregate by labels is a mechanism for reporting aggregated results. Most
commonly it is used to report one histogram per node instead of per
shard, but it can be used for counters, and gauges.

This patch adds an option to mark a metric with a vector of labels.
That vector will be part of the metric meta-data so the reporting
layer would be able to aggregate over it.
This patch adds multiple functionality to Prometheus reporting:
1. Add summary reporting. Summaries are used in seastar to report aggregated
   percentile information (for example p95 and p99)
   The main usage is to report per-shard summary of a latency
   histograms.
2. Support aggregated metrics. With an aggregated metrics,
   Prometheus would aggregate multiple metrics based on labels and
   would report the result. Usually this would be for reporting a single
   latency histogram per node instead of per shard. But it could be used
   for counters and gauge as well.
3. Skip empty counters, histograms and summaries. It's a common practice to
   register lots of metrics even if they are not being used.
   Histograms have a huge effect on performance, so not reporting an empty
   histogram is a great performance boost both for the application and
   for the Prometheus server.
   This is true for Summaries and Counters as well, marking a metric
   with skip_when_empty would mean Prometheus will not report the
   metric if it was never used.
4. As an optimization, the stringstream that is used per metric is
   reused and clear insted of recreated.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
@avikivity
Copy link
Member

Please use more descriptive subjects.

@nyh
Copy link
Contributor

nyh commented Jun 30, 2022

Also please don't include the version history as part of the commit message - just describe what you ended up with.
I'll fix both problems and commit.

@nyh nyh merged commit ce78808 into scylladb:master Jun 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants