Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram support multiple boundaries types vs different types of histogram aggregation #259

Closed
bogdandrutu opened this issue Feb 12, 2021 · 3 comments · Fixed by #291
Closed

Comments

@bogdandrutu
Copy link
Member

bogdandrutu commented Feb 12, 2021

During the meeting on Thu, Feb 11, there was a discussion about our data model supporting new aggregation types, and the answer was clearly we do support that, but that comment triggered me a new idea.

Instead of deprecating the current explicit boundaries from our current definition of Histogram, what if we just add a new type ExponentialHistogram, and that combined with #256 and #257 will keep the number of types at a reasonable number.

Advantages of this solution is that #226 can be a simple addition to the current data model, and also can improve the way how counts are stored for large number of buckets (like using int64 instead of fixed64, or even using other encoding that does not require to have a value for 0 buckets, etc.).

Another advantage is that we don't have to deal with situation where points in the same histogram metric have different boundaries types.

I would like to propose this for discussion and make a decision about this. I am not 100% convinced on one or the other, but I try to propose this possibility.

/cc @jmacd @open-telemetry/specs-metrics-approvers

@bogdandrutu bogdandrutu changed the title Histogram support multiple boundaries types vs different types of histogram aggregatio Histogram support multiple boundaries types vs different types of histogram aggregation Feb 12, 2021
@jmacd
Copy link
Contributor

jmacd commented Feb 24, 2021

I've started thinking about one more alternative to your suggestion, also motivated by the comments in the tail of #226, in particular
#226 (comment).

The idea is to combine the Discretization and Compaction fields into a single type, a oneof various Buckets. In the existing histogram we have

message ExplicitCountBuckets {
  repeated double explicit_bounds = 1;
  repeated fixed64 bucket_counts = 2;
}

These two fields determine both buckets (Discretization) and counts (Compaction). I've been thinking about one other equally explicit form (mentioned in today's Data Model meeting):

message ExplicitSampleBuckets {
  repeated double explicit_bounds = 1;
  repeated double bucket_counts = 2;
}

This type is identical to ExplicitCountBuckets, but counts are floating-point values. This has two uses:

  1. Statsd sampled histograms can be conveyed correctly (for arbitrary sampling rates) using this encoding
  2. Prometheus summaries can be expressed in this way (independent of differences in temporality)

In the first example, a statsd h-type metric with @0.4 sample rate (i.e., 40% sampling), a single observation should output a count of 2.5 in the histogram. (Note: I also need what is presently fixed64 count = 4; to become double count = 4.)

In the second example, a Prometheus summary for quantiles [0.25, 0.5, 0.75, 0.9, 0.95, 0.99] with total count C, total sum S.

Let the calculated quantile values equal the calculated values P25, P50, P75, P90, P95, P99. The output histogram is as follows:

count: C
sum: S
explicit_bounds: [P25, P50, P75, P90, P95, P99]
bucket_counts: [
  (0.25-0)*C,
  (0.5-0.25)*C,
  (0.75-0.5)*C,
  (0.9-0.75)*C,
  (0.95-0.9)*C,
  (0.99-0.95)*C,
  (1.0-0.99)*C, 
]

Note that the final count bucket is redundant, since it represents (P99,+Inf), equal to the total count minus the sum of all bucket counts. (By this logic we could shorten bucket_counts by one, if we pleased, in the present-day explicit-boundary histogram.)

The suggestion in @oertl's comment, bullets 1 and 2, could be summarized as follows:
(a) Each bucketing scheme should be able to decide whether it can merge with another of the same type but potentially different parameterization. As @postwait says, this should only be done when the schemes support non-lossy merging.
(b) Fancy schemes support auto-combining buckets with compatible parameters (e.g., UDDSketch) can be added later.

Anyway, please also consider this approach.

I'm looking forward to @yzhuge's continued work on this, #226 (comment), and want to thank @githomin for this reminder:

I think we should keep in mind the goal of providing histograms with relative accuracy guarantees where users don't have to set any parameters

@yzhuge
Copy link
Contributor

yzhuge commented Mar 3, 2021

I like the idea of each histogram types having its own count encoding. I made the bounds share one common count array in #226 only because I wanted to minimize changes. Since we haven't actually released the protocol, minimizing changes or backward compatibility is not a big concern. Thus we should be more forward looking than backward looking. To be clear, future version could look like:

oneof buckets {
ExplicitCountBuckets explicitCountBuckets;
ExplicitSampleBuckets explicitSampleBuckets;
ExponentialBuckets exponentialBuckets;
}

Each "Buckets" message will have its own bounds and counts.

@jmacd
Copy link
Contributor

jmacd commented Mar 4, 2021

@yzhuge I am currently working on a PR that does this basic factoring. I am going to propose that we ignore the difference between ExplicitCountBuckets and ExplicitSampleBuckets at this point and let all counts be doubles, but I will do that in a separate PR. Also I will never again mention this idea about Summary metrics. 😀

bogdandrutu added a commit to bogdandrutu/opentelemetry-proto that referenced this issue Apr 1, 2021
Fixes: open-telemetry#259

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
bogdandrutu added a commit to bogdandrutu/opentelemetry-proto that referenced this issue Apr 2, 2021
Fixes: open-telemetry#259

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
bogdandrutu added a commit that referenced this issue Apr 12, 2021
Fixes: #259

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment