Metrics Histogram instrument default: any histogram sketch #982

Closed
jmacd opened this issue Sep 22, 2020 · 8 comments
Labels
area:sdk (Related to the SDK)
priority:p1 (Highest priority level)
release:required-for-ga (Must be resolved before GA release, or nice to have before GA)
spec:metrics (Related to the specification/metrics directory)

Comments


jmacd commented Sep 22, 2020

What are you trying to achieve?

Accepting this issue means changing the metric API and SDK specification to state that "any histogram sketch" library may be chosen as the default aggregation for ValueRecorder instruments, subject to the community decision in each OpenTelemetry implementation. Note that the classic fixed-boundary histogram is a viable solution, although there will usually be a better approach that works automatically without fixed boundaries.

As stated in #636, there has been considerable debate over the default for ValueRecorder. This case was restated in #919, where the proposal was for DDSketch. From the discussion and from surveying this space a little bit more, it's clear that there are many existing libraries and algorithms that produce variable-boundary histograms. The (current) v0.5 OTLP protocol has this to say (e.g., from DoubleHistogramDataPoint):

  // explicit_bounds is the only supported bucket option currently.
  // TODO: Add more bucket options.

  // explicit_bounds specifies buckets with explicitly defined bounds for values.
  // The bucket boundaries are described by "bounds" field.
  //
  // This defines size(bounds) + 1 (= N) buckets. The boundaries for bucket
  // at index i are:
  //
  // (-infinity, bounds[i]) for i == 0
  // [bounds[i-1], bounds[i]) for 0 < i < N-1
  // [bounds[i-1], +infinity) for i == N-1
  // The values in bounds array must be strictly increasing.
  //
  // Note: only [a, b) intervals are currently supported for each bucket except the first one.
  // If we decide to also support (a, b] intervals we should add support for these by defining
  // a boolean value which decides what type of intervals to use.
  repeated double explicit_bounds = 7;
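
To make these boundary rules concrete, here is a minimal sketch (in Go, purely illustrative and not part of any SDK) of mapping a recorded value to its bucket index under the explicit_bounds scheme:

  // bucketIndex returns the index of the bucket that value falls into,
  // given strictly increasing explicit bounds. With size(bounds) = N-1,
  // bucket 0 is (-infinity, bounds[0]), bucket i is [bounds[i-1], bounds[i]),
  // and bucket N-1 is [bounds[N-2], +infinity).
  func bucketIndex(bounds []float64, value float64) int {
      for i, b := range bounds {
          if value < b {
              return i
          }
      }
      return len(bounds) // the final, open-ended bucket
  }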

The next logical step is to propose that each OpenTelemetry default SDK choose the best available sketch library that produces a histogram representation. Examples include:

HdrHistogram
DDSketch
CircllHist
UDDSketch
DynaHist

This definition, "best available sketch library", allows an SDK to fall back in two ways: (1) if there is no sketch library available, a fixed-boundary histogram implementation may be a good solution; (2) if the SDK is stateless, it will be better to encode a raw value than to encode a single-value histogram.

The OpenMetrics Histogram BucketOptions message struct, which includes Linear, Exponential, and Explicit options, is a good baseline set of supported options. The discussion in #919 goes into depth about ways that the DDSketch representation can be compressed. This compression is nice to have, especially where we can implement it once, as in the OTel collector. However, as pointed out here, there are a number of pieces of code that have to be written in each language SDK for clients to achieve this compression. This will be a lot less work if any available sketch library can be used.

Note that dealing with both positive and negative values can require special support in many of these representations. Note also that exponential schemes commonly need a special case for zero, the consequence of which is a bucket with zero width. This would require changes to the definition of explicit_bounds, which does not permit zero-width buckets (the bounds must be strictly increasing, as seen above).

Also discussed in #919, there is sometimes a question of whether to represent histograms with floating-point or with integer counts. There are two known reasons for wanting floating-point counts:

  1. StatsD "sample-rate" parameters lead to non-integer count values.
  2. Sketch data structures often produce non-integer counts.

For example, the T-Digest paper (§2.9) details how to compute a cumulative distribution function from a T-Digest, which naturally converts into a histogram representation with N-1 buckets derived from N T-Digest "centroids". When converting T-Digest sketches into histograms, we end up with non-integer counts.
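
To illustrate where the fractional counts come from, here is a rough sketch (assuming a simplified (mean, weight) centroid representation; the names are illustrative, not taken from any T-Digest library):

  // Centroid is a simplified (mean, weight) pair as maintained by a T-Digest.
  type Centroid struct {
      Mean   float64
      Weight float64
  }

  // toHistogram converts N centroids into a histogram whose N boundaries are
  // the centroid means. Under the usual interpolation assumption that half of
  // each centroid's weight lies on either side of its mean, the count between
  // adjacent means is (w[i-1] + w[i]) / 2, which is generally non-integer.
  func toHistogram(centroids []Centroid) (bounds, counts []float64) {
      for i, c := range centroids {
          bounds = append(bounds, c.Mean)
          if i > 0 {
              counts = append(counts, (centroids[i-1].Weight+c.Weight)/2)
          }
      }
      return bounds, counts // len(counts) == len(bounds)-1 buckets
  }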

Proposed OTLP data point

To make this proposal concrete, it calls for a third histogram data point message type. Whereas OTLP v0.5 has DoubleHistogramDataPoint and IntHistogramDataPoint, this proposal would add SketchHistogramDataPoint which has double-valued counts to support both cases mentioned above. Something like this:

message SketchHistogramDataPoint {
  // [labels & two timestamps]

  double count = 4;

  double sum = 5;

  oneof buckets {
    Explicit    explicit = 7;
    Linear      linear = 8;
    Exponential exponential = 9;
    DDSketch    ddsketch = 10;
    CircllHist  circllhist = 11;
    // ...
  }

  // [min, max, exemplars]
}

Additional context.

Although the issue of how to treat raw points is separate, this issue otherwise appears to address many of the concerns raised in #636 as well as allow for DDSketch where available. There are several known situations where this does not necessarily give the user everything they want:

  1. When long collection intervals are used, it is sometimes more desirable for the aggregation to be LastValue instead of a histogram-like sketch. In these conditions, it may be more interesting to know the last value than to know the distribution of values. This case can potentially be remedied by the user configuring a non-default aggregation, or by using an asynchronous instrument (i.e., ValueObserver), because its default aggregation is LastValue (Metrics: specify LastValue as default aggregation for ValueObserver instrument #834).
  2. When a Prometheus backend is used to process this data, one timeseries is recorded per histogram bucket. In this representation, variation over time can lead to a large number of buckets. Histogram-merging logic when Prometheus is the target may want to merge (interpolate) variable-boundary histogram data points into configurable-fixed-boundary histogram data points.
  3. Exact min and max values are still in demand for histogram data points. These should be made available as independent fields.

A minor note about Summary data points (i.e., quantile-value listings). These were removed in OTLP v0.5 while we focused on that release, but this type of data point can be represented in the proposed SketchHistogramDataPoint described above, with one histogram bucket (and floating-point count) per quantile.
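
For instance, a hypothetical conversion (the mapping is assumed for illustration, not specified anywhere) from a quantile listing to explicit-bounds buckets with floating-point counts:

  // QuantileValue is one (quantile, value) entry of a Summary data point.
  type QuantileValue struct {
      Quantile float64 // in [0, 1], ascending
      Value    float64
  }

  // summaryToBuckets places one bucket boundary at each quantile value and
  // assigns each bucket the (generally fractional) share of the population
  // that lies between adjacent quantiles.
  func summaryToBuckets(qs []QuantileValue, totalCount float64) (bounds, counts []float64) {
      prev := 0.0
      for _, q := range qs {
          bounds = append(bounds, q.Value)
          counts = append(counts, (q.Quantile-prev)*totalCount)
          prev = q.Quantile
      }
      counts = append(counts, (1-prev)*totalCount) // final open-ended bucket
      return bounds, counts
  }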

@jmacd jmacd added the spec:metrics Related to the specification/metrics directory label Sep 22, 2020
@andrewhsu andrewhsu added priority:p1 Highest priority level release:required-for-ga Must be resolved before GA release, or nice to have before GA area:sdk Related to the SDK priority:p2 Medium priority level and removed priority:p1 Highest priority level labels Sep 22, 2020
@andrewhsu andrewhsu added this to Required for GA, needs action in GA Spec Burndown Sep 26, 2020

yzhuge commented Oct 1, 2020

I suggest that the exponential message be extended to support the "log linear" format, which basically allows an exponential bucket to be split into equal-width linear sub-buckets. The log-linear format is used by HdrHistogram and Circllhist. The Circllhist paper https://arxiv.org/pdf/2001.06561.pdf has a detailed description of it.

The change on exponential format is very simple. From https://github.com/OpenObservability/OpenMetrics/blob/7dfd2087ed3c62ede886646c222ad15112556704/proto/openmetrics_data_model.proto#L173 , we currently have:

  message Exponential {
      // Required.
      // Must be greater than 0.
      uint32 num_exponential_buckets = 1;

      // Required.
      // Must be greater than 1.
      double growth_factor = 2;

      // Required.
      // Must be greater than 0.
      double scale = 3;
  }

All we need is a new number_of_linear_subbuckets field. It can be optional and default to 1 (standard exponential histogram). Because the change is so simple, I am suggesting extending the current exponential format, rather than adding a new log-linear format. With linear sub-bucket support, HdrHistogram, Circllhist, and the like can compress the bounds field to just a few parameters. The large number of HdrHistogram users will welcome such a feature and may adopt OpenTelemetry more quickly.
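
For illustration, a log-linear bucket boundary under the extended format could be computed as follows (a sketch only; the parameter names follow the proto fields above, and the function is not part of any library):

  import "math"

  // boundary returns the lower boundary of linear sub-bucket j (0-based)
  // within exponential bucket e (0-based): the exponential bucket starting
  // at scale*growth_factor^e is divided into numLinearSubbuckets equal parts.
  func boundary(scale, growthFactor float64, e, j, numLinearSubbuckets int) float64 {
      lower := scale * math.Pow(growthFactor, float64(e))
      width := lower * (growthFactor - 1) / float64(numLinearSubbuckets)
      return lower + float64(j)*width
  }

With scale=1, growth_factor=10, and number_of_linear_subbuckets=18, this yields boundaries 1, 1.5, 2, 2.5, ... within the first decade.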

Related question: "dealing with both positive and negative values can require special support in many of these representations". May I suggest that an optional second histogram be allowed for negative numbers when the format is exponential? This is a common solution for negative numbers when using exponential histograms.


yzhuge commented Oct 2, 2020

Looking at #919 again, it appears that DDSketch has several value-to-bucket mapping methods. The canonical one uses the standard logarithm; the others use approximations, for lower CPU cost. The various mapping methods can be found at https://github.com/DataDog/sketches-java/tree/master/src/main/java/com/datadoghq/sketch/ddsketch/mapping (the link from issue 919 did not work, but I managed to find the classes in the repo). The main sketch class has constructors to choose a method.

The canonical method should be able to produce a standard exponential histogram. The only sticky point may be that DDSketch allows bucket collapsing, where multiple buckets (typically at the lower end) may be merged to save space. However, if the format defines a starting (0, minTrackedValue) bucket and an ending (maxTrackedValue, +infinity) bucket, the consumer of the message can determine whether collapsing happened. In the no-collapsing case, the counters at both ends should be 0. With collapsing, one or both end buckets will have a non-zero count (some data fell into the open-ended catch-all buckets). This format supports collapsing on one or both ends and communicates collapsing info from the histogram producer to the consumer, as the check below illustrates.
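
A consumer-side check for this convention might look like the following (an illustrative sketch; it assumes counts[0] and counts[len(counts)-1] are the two catch-all buckets described above):

  // sawCollapsing reports whether any values fell into the open-ended
  // (0, minTrackedValue) or (maxTrackedValue, +infinity) catch-all buckets,
  // indicating that collapsing (or out-of-range recording) occurred.
  func sawCollapsing(counts []float64) bool {
      return counts[0] != 0 || counts[len(counts)-1] != 0
  }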


jmacd commented Oct 15, 2020

@yzhuge I am following your suggestion above about a number_of_linear_subbuckets field. The Circllhist paper diagrams are great at explaining this 2-parameter histogram with B=logarithm-base and N=number-of-linear-subbuckets, but what's appealing to me about this is the potential for compatibility with Prometheus, which uses

DefBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

This is approximately a B=10/N=4 encoding (matching the Circllhist diagram), except it's really an N=3-out-of-4 encoding scheme in which the 1..10 interval is divided into 4 linear buckets and then the last two are collapsed.

Note that both the B=10/N=4 and B=10/N=90 boundaries can be correctly collapsed into the default Prometheus N=3-out-of-4 scheme. For example, a B=10/N=90 histogram has 15 buckets covering .1 to .25, 25 buckets covering .25 to .50, and 50 buckets covering 0.50 to 1.0. We can convert either of these into Prometheus histograms without loss of accuracy or precision.

This leads to thinking about either the B=10/N=4 or B=10/N=90 boundaries by default, but I'm not interested in support for general parameterized histograms (e.g., a B=3/N=2 encoding has boundaries at 1, 2, 3, 6, 9, 18, 27, 54, 81, ... and does not interest me). Note that we could specify this encoding without specifying how these histograms are computed.


jmacd commented Oct 15, 2020

I made a logic error in the analysis above, but it's minor. The Prometheus scheme can be generated from a B=10/N=18 bucketing scheme, where the buckets are collapsed in groups of 3, 5, and 10 to synthesize the 0.1/0.25/0.5/1 boundaries.
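
Concretely (an illustrative sketch, not a proposal for SDK code), here are the B=10/N=18 sub-bucket boundaries within one decade, from which the Prometheus boundaries are recovered by collapsing groups of 3, 5, and 10:

  // decadeBoundaries returns the 18 linear sub-bucket boundaries of the
  // decade [lower, 10*lower). For lower = 0.1 these are 0.1, 0.15, ..., 0.95;
  // collapsing groups of 3, 5, and 10 leaves the boundaries 0.1, 0.25, 0.5.
  func decadeBoundaries(lower float64) []float64 {
      const n = 18
      width := lower * 9 / n
      out := make([]float64, n)
      for j := range out {
          out[j] = lower + float64(j)*width
      }
      return out
  }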

@githomin

> Looking at #919 again, it appears that DDSketch has several value-to-bucket mapping methods. The canonical one uses the standard logarithm; the others use approximations, for lower CPU cost.

We'd like to note that this proposal would cut off OTel users from the faster and more memory-efficient mappings, such as DDSketch's CubicallyInterpolatedMapping (about 3 times less costly than a basic exponential mapping, with only 1% space overhead) or DynaHist's LogQuadraticLayout.

> The canonical method should be able to produce a standard exponential histogram. The only sticky point may be that DDSketch allows bucket collapsing,

This won't be a problem. No special handling of collapsing is taken into account in our protocol either. It should also be noted that collapsing is purely a safeguard and should almost never happen, as shown in our paper.


yzhuge commented Oct 22, 2020

See histogram update PR OpenObservability/OpenMetrics#152.
The PR went to the wrong repo; reworking it on the correct repo.

@andrewhsu andrewhsu added priority:p1 Highest priority level and removed priority:p2 Medium priority level labels Oct 23, 2020

yzhuge commented Oct 26, 2020

@jmacd jmacd changed the title Metrics ValueRecorder instrument default: any histogram sketch Metrics Histogram instrument default: any histogram sketch Jul 23, 2021

jmacd commented Sep 14, 2021

The metrics SDK specification reads "select the best Histogram Aggregation available":

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#histogram-aggregation

@jmacd jmacd closed this as completed Sep 14, 2021
GA Spec Burndown automation moved this from Required/Allowed for GA, todo. to Required/Allowed for GA, resolved. Sep 14, 2021
Spec - Metrics API/SDK Feature Freeze automation moved this from To do to Done Sep 14, 2021