Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional min and max fields to histogram data model. #1915

Merged
merged 11 commits into from Oct 1, 2021

Conversation

jack-berg
Copy link
Member

Related to Issue #1902. See this convo as well.

Changes

Expand histogram description in metric datamodel to include optional min and max fields.

The Metric Points section of the data model is marked as stable, but I believe that the optional designation on these fields makes this a non breaking change.

Related issues #

N/A.

@jack-berg
Copy link
Member Author

If folks want to proceed with this change, I'd also like to update the model-data-histogram.png image, but don't have the original file.

I can also update the changelog if necessary.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR needs to clarify in more details the correlation between these new fields and the temporality and how to calculate "delta -> cumulative -> delta"

@jack-berg
Copy link
Member Author

Here's a proposal for how to handle "delta to cumulative" and "cumulative to delta" conversion.

Cumulative to Delta Conversion

Summary: The min / max on cumulative histograms include values recorded more recently than an unspecified max age. Assume the max age is equal to the cumulative collection interval.
Pros: Simple to compute min / max: just take the cumulative min / max as is which is effectively delta already.
Cons: Inaccurate if the cumulative max age is not equal to the delta window size. For example, if the cumulative max age is larger than the delta window, then the reported delta max / min may include values from previous windows. If the cumulative max age is smaller than the delta window, then the reported delta min / max may be missing values included in the count / sum. However, the min / max recorded in cumulative histogram would already be subject to this inaccuracy. In general, the cumulative max age needs to be greater or equal to the cumulative collection interval.

histogram {"temporality":  "cumulative", "start": "00:00:00", "end": "00:05:00", "count":  5, "sum": 50, "min":  5, "max": 20}
histogram {"temporality":  "cumulative", "start": "00:00:00", "end": "00:10:00", "count":  8, "sum": 70, "min":  4, "max": 14}
histogram {"temporality":  "cumulative", "start": "00:00:00", "end": "00:15:00", "count":  13, "sum": 100, "min":  3, "max": 16}

converts to:

histogram {"temporality":  "delta", "start": "00:00:00", "end": "00:05:00", "count":  5, "sum": 50, "min":  5, "max": 20}
histogram {"temporality":  "delta", "start": "00:05:00", "end": "00:10:00", "count":  3, "sum": 20, "min":  4, "max": 14}
histogram {"temporality":  "delta", "start": "00:10:00", "end": "00:15:00", "count":  5, "sum": 30, "min":  3, "max": 16}

Delta to Cumulative Conversion

Summary: The min / max on cumulative histograms include values recorded more recently than an unspecified max age. Use the delta min / max values as is in the cumulative histogram, assuming the max age is equal to the delta window size.
Pros: Simple to compute min / max: just take the delta min / max as is.
Cons: The min / max in the resulting cumulative histograms won't have any standard max age.

histogram {"temporality":  "delta", "start": "00:00:00", "end": "00:05:00", "count":  5, "sum": 50, "min":  5, "max": 20}
histogram {"temporality":  "delta", "start": "00:05:00", "end": "00:10:00", "count":  3, "sum": 20, "min":  4, "max": 14}
histogram {"temporality":  "delta", "start": "00:10:00", "end": "00:15:00", "count":  5, "sum": 30, "min":  3, "max": 16}

converts to:

histogram {"temporality":  "cumulative", "start": "00:00:00", "end": "00:05:00", "count":  5, "sum": 50, "min":  5, "max": 20}
histogram {"temporality":  "cumulative", "start": "00:00:00", "end": "00:10:00", "count":  8, "sum": 70, "min":  4, "max": 14}
histogram {"temporality":  "cumulative", "start": "00:00:00", "end": "00:15:00", "count":  13, "sum": 100, "min":  3, "max": 16}

Thoughts
The delta to cumulative conversion I propose has little downside. However the cumulative to delta conversion definitely has the opportunity to introduce error into the resulting delta histograms. IMO, this error is probably unavoidable, and a consequence of min / max being a poor match for cumulative metrics to begin with. If this error is unacceptable, options include:

  • Ensure delta histograms are generated at point of data generation so that cumulative to delta conversion is not needed.
  • Ensure the max age used when collecting cumulative min / max is equal to the cumulative collection interval.
  • Drop the min and max fields. Although I assume in many scenarios data with some inaccuracy is better than no data at all.

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think min/max need to abide by the aggregation temporality, just like all the other fields in histogram.

@jack-berg
Copy link
Member Author

I think min/max need to abide by the aggregation temporality, just like all the other fields in histogram.

Just want to call out that if min/max abide by aggregation temporality and represent the cumulative min / max since recording began, then it's most reasonable to drop them when converting from cumulative to delta. The conversion from delta to cumulative is straight forward.

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmacd
Copy link
Contributor

jmacd commented Sep 17, 2021

I'm skeptical about the resistance to this proposal.

On the surface, remember that this discussion is meant to enable a MinMaxSumCount style of aggregation (i.e., a zero-bucket histogram). We want to encode Min/Max/Sum/Count without any further resolution, so the buckets of a histogram are not a substitute. We're interested in Min and Max quantiles specifically because they are mergeable. The comments above saying they are not mergeable address temporal aggregation, not spatial aggregation.

Let's say there wasn't a cumulative temporality, and we were only discussing delta-temporality histograms. There's still a problem with temporal alignment, but it's not enough of a problem for me to discard the data. A great deal of metrics data will pass through temporal alignment before it is finally shown to the user. When splitting and merging deltas to perform realignment, the same problem shows up. Why is this such an obstacle?

I think of Min and Max values as like Gauges. When it comes to merging multiple of them, you select the last value of each series. We can write rules for merging histograms of either temporality, always output the last value of the min/max that you have.

@jsuereth
Copy link
Contributor

@jmacd I commented here with my thoughts on the way forward.

I think the major issue for us to figure out is basically: Should we leverage the natural mathematical function "min" as the merge algorithm for the value "min" on histogram (similarly for max)?

I think it would be totally viable to say 'no' to this and make min/max have Gauge semantics within histogram (just like "sum" and "count" have Sum semantics).

I do think the natural min/max functions have meaning, even spatially but perhaps there's a scenario I'm not thinking of. THis is why my bias is towards giving "min" and "max" their own aggregate function. Can you give me an example of spatial min-max merge where mathematical "min" and "max" wouldn't make sense?

@jsuereth jsuereth added area:data-model For issues related to data model spec:metrics Related to the specification/metrics directory labels Sep 18, 2021
@jmacd
Copy link
Contributor

jmacd commented Sep 21, 2021

Can you give me an example of spatial min-max merge where mathematical "min" and "max" wouldn't make sense?

No, I believe these always make sense for spatial merge. I was proposing "Gauge semantics" as a trick to let min and max work for temporal aggregation, it's just careful language to allow delta temporality to do one thing and cumulative temporality to do another without breaking any definitions. 😀

If we defined the min/max to be interpreted over the window (X, Time] where X=Max(StartTime, Time-10m), then delta temporality with intervals <= 10 minutes are able to encode the exact min and max and cumulative exports at least follow the same semantics. You could convert delta to cumulative and correctly meet the specification by keeping up to 10 minutes of state, and when you convert from cumulative back to delta now the min/max fields are no longer exact. This doesn't bother me, but we could add a flag to indicate when min/max are exact if that will help.

@reyang
Copy link
Member

reyang commented Sep 21, 2021

Discussed during the 9/21/21 Metrics SIG Meeting and concluded that @jack-berg will update the PR:

Conclusion:
min/max would be added to Histogram, and they respect the temporality for both Delta and Cumulative
Add a section saying that Delta->Cumulative will work (and is mathematically correct), and Cumulative->Delta won’t work mathematically (due to the information loss), however folks can take the min/max and convert it to something else (e.g. Gauge) or simply drop min/max on the floor
[reyang] +1
[jsuereth] +1
[jack-berg] +1 this would work for us
[bogdan] +1 happy with the proposal, need to follow up regarding how hard it would be for the Collector to implement that (need to double check before making the decision, it might be a huge change)

@jack-berg
Copy link
Member Author

I've updated the description of the min and max fields to reflect the conclusion of the 9/21 SIG meeting.

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now. If you'd like to update the pictures, ping me on slack and I'll get you access to the google drawing to reproduce them.

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks better, would prefer couple of days to think more on this but seems to be the right track

CHANGELOG.md Outdated Show resolved Hide resolved
@bogdandrutu bogdandrutu dismissed their stale review September 23, 2021 18:31

Dismiss the blocking, no longer reason to block, but not yet to approve :)

@jsuereth jsuereth merged commit 4e1f01e into open-telemetry:main Oct 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:data-model For issues related to data model spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants