Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Last value to MinMaxSumCount OTLP data point a.k.a. "MMLSC" #117

Closed
wants to merge 2 commits into from

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Jun 17, 2020

This proposal was the outcome of a discussion in open-telemetry/opentelemetry-specification#636.

It combines LastValue aggregation into the existing MinMaxSumCount aggregator, which will support exporting Gauges to Prometheus and Statsd over OTLP.

@jmacd jmacd requested review from a team as code owners July 9, 2020 06:37
@bogdandrutu
Copy link
Member

Please rebase and fix the CI

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the title of this OTEP is misleading, the impact is larger because the OTEP forces this new type of aggregation in the protocol (OTLP), which may be/not be justified.

Changing the default aggregation for the ValueRecorder is independent of how the data are serialized on the wire, and should be stated as that, I personally have no problem on changing that, but I see problems when exposing this new type on the wire:

  1. Arguments can be made to expose any combination of sum/count/min/max/last_value/mean/50%tile/etc. which will cause the protocol to explode in terms of supported aggregations.
  2. This OTEP specify that Prometheus needs the last value (which is true), but it only needs that in some scenarios (most likely for ValueObserver). For example if the ValueRecorder is used to record latencies then no reason to have last value for Prometheus, but when using ValueObserver for CPU temperature (and no other aggregation like grouping by one cpu_core_id) then it needs the last value which happens to be equal with the sum anyway. For Prometheus it is more important to distinguish between these two cases than supporting Last value here.

To summarize the OTEP makes an assumption that this new aggregation will be exported as a gauge with the last value in Prometheus, which is not always the case and more information are necessary in order to support that.

As I commented in the original ValueObserver otep, the default aggregator for that should be "LastValue", because we simply passthrough the value from the observer to the exporter (without doing any aggregation), there were arguments against that such as the fact that this will be different than what the backend will do, but with the new proposal from the proto repository 162 most likely the right default aggregation should be a SCALAR_[INT/DOULE] and GROUPING (which means it is not calculated as a sum of measurements)

@jmacd
Copy link
Contributor Author

jmacd commented Jul 17, 2020

I'd like to very carefully separate my response to this into two parts, one related to ValueRecorder, and one about ValueObserver. The discussion about ValueObserver is more difficult, so let's find agreement about ValueRecorder before moving on.

There has been an issue raised by multiple early adopters of OTel Metrics who noticed that a ValueRecorder would be exposed as a Summary in Prometheus. For example, open-telemetry/opentelemetry-go#875 and open-telemetry/opentelemetry-specification#636.

Prometheus summary is not the desired default, and the alternatives are to ensure a Histogram or a Gauge (scalar) is the default. We are already explicitly choosing the less-expensive option for all instruments, which is why MMSC has been specified as the default for ValueRecorder. And there are good explanations for this, and some vendors really want this. If you're collecting metrics on a short time interval, the MMSC summary is the maximally-useful-fixed-size way to expose a distribution. Any quantile information other than 0 (min) and 1 (max) requires variable-size to encode precisely (in a mergeable way), so we defined MMSC as a fixed-size summarization.

But when we speak of summarizing distributions, we are generally eliminating the time dimension. We know the min, max, sum and count, but we know nothing about the relationship between these values and the time variable. Adding a last-value to the summarization adds one piece of information about the time dimension, with which:

  • Prometheus gets the default its users want (L)
  • Other vendors still get the information they want (MMSC, e.g., New Relic)
  • Knowing the difference between Mean == (S/C) and Last (L) tells us a bound on variance (this is more technical than we really care about, but Chebyshev's inequality gives us some information here IIUC).

Now to your points:

new type of aggregation

This is a change to MMSC, not a new aggregation. Adding one value to MMSC makes it MMLSC. We were already encoding MMSC using a Summary data point, L just adds one value in that encoding and this PR makes Min/Max fields to reduce the encode size.

Arguments can be made to expose any combination of sum/count/min/max/last_value

I don't think that these arguments will succeed. The definition of MMLSC is that all five fields are always included in the Summary. I would state that any use of Summary must fill in all five fields. The use of quantiles / percentiles is optional, orthogonal, and complementary to MMLSC.

This OTEP specify that Prometheus needs the last value (which is true), but it only needs that in some scenarios (most likely for ValueObserver)

I think you are implying that Histogram should be the default for ValueRecorder+Prometheus, but we haven't specified the default aggregation to depend on the exporter. So I assert that Prometheus wants the Last value by default in all scenarios.

If ValueRecorder is used to record latencies, recording the Last value gives meaningful, statistically-viable information about the stream of events, but it does discard the sum, count, min, and maximum values. It's less information, and it's susceptible to systematic errors (e.g., due to a correlation between the collection event and the last value measured) but on a long time scale you can plot the last value and the result is meaningful. Not necessarily as useful as a histogram, but cheaper. Not necessarily as useful as a MMSC summary on short time scales, but more useful than a MMSC summary on long time scales.

So, I view this as a good compromise for ValueRecorder. I expect the additional value will compress reasonably well so that MMLSC encoding will not be much larger than MMSC. Users will have to configure ValueRecorder to use Histogram when they want the more-expensive option, otherwise exporters will use MMSC or L depending on the conventions in that system.

@jmacd
Copy link
Contributor Author

jmacd commented Jul 17, 2020

I will not write such a lengthy explanation for ValueObserver at this juncture, as the points in that debate are somewhat different, and I have one new idea that may help settle this. First we should decide on a term for ASYNCHRONOUS as discussed here: open-telemetry/opentelemetry-proto#168 (comment), after which I will argue that LastValue for an Observer instrument should be uniform-randomly selected from the observed values of a single callback. This preserves the semantics of LastValue used in the argument above ("recording the Last value gives meaningful, statistically-viable information about the stream of events").

@jmacd
Copy link
Contributor Author

jmacd commented Jul 18, 2020

@open-telemetry/specs-metrics-approvers See also open-telemetry/opentelemetry-proto#170

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
metrics Relates to the Metrics API/SDK
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants