Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add comments about StartTimeUnixNano #295

Merged
merged 12 commits into from
May 11, 2021
Merged

Conversation

jmacd
Copy link
Contributor

@jmacd jmacd commented Apr 22, 2021

This follows the discussion from 4/20/21, notes here.

CC: @vishiy @jsuereth @open-telemetry/specs-metrics-approvers

Resolves #292
Resolves #289

@jmacd jmacd requested review from a team as code owners April 22, 2021 09:07
Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, but I'm requested two changes:

  1. Make it explicitly clear this algorithm is an outline. Possibly include a link to a specification section where we can go into more details and justification.
  2. We should be explicit with identity of metrics. At a minimum "resource" should be included in your key for the map.

opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
jmacd and others added 2 commits April 23, 2021 12:52
Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
@jmacd
Copy link
Contributor Author

jmacd commented Apr 26, 2021

@jsuereth and @tigrannajaryan please take another look at 03b2d36, I merged another commit clarifying how StartTimeUnixNano MAY be used with Gauges and Summaries for overlap detection.

@bogdandrutu please review.

@jmacd
Copy link
Contributor Author

jmacd commented May 5, 2021

@open-telemetry/specs-metrics-approvers PTAL.

opentelemetry/proto/metrics/v1/metrics.proto Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
// and with it prescribes a solution for filling in StartTimeUnixNanos
// with as much information as possible to assist consumers with
// calculating rates and detecting overlap. This process can be
// applied to Sum and Histogram points with cumulative aggregation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot apply that unless you know:

  1. Sum is monotonic
  2. Histogram Sum is monotonic (which is something that we don't know, and have issue Is knowing that a Histogram/Summary sum is monotonic important? #187), which I think was not resolved, because we need at least to add a comment that the sum is "counter".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EDIT For histograms we should use the "count" as the counter and all other bins values are counters, so we are good on that.

But the question about #187 still stays, it says that is resolved in a manner but I don't see that in the documentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to adding documentation on how to handle things as we discussed in #187. Do you want that in a separate PR (I can do that) or we can update this PR with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate PR is good.

Copy link
Contributor Author

@jmacd jmacd May 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to separate the topic of "reset" from the topic of computing a rate. The notion of a reset I'm trying to develop has to do with eliminating overlapping series that may have been accidentally written as well as calculating correct rates.

The logic here applies a StartTime to all the kinds of points you might find in a Prometheus RemoteWrite output (or WAL), for example, and the outcome is an OTLP stream that is not overlapping with itself as well as encodes information about reset semantics.

This formalizes the Prometheus heuristic for resetting counters in a way that lets OTLP points be written out of order, but it also ensures that two Prometheus servers configured to write the same series, converted into OTLP this way would be detected as overlapping.

opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
opentelemetry/proto/metrics/v1/metrics.proto Outdated Show resolved Hide resolved
// Data points with/ the 0 value for TimeUnixNano SHOULD be rejected
// by consumers.
//
// # StartTimeUnixNano
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of both getting this PR in, and ensuring information is easily found, Would It make sense to move the depiction of unbroken sequences and start time synthesis into the DataModel specificaiton (with a link from here) rather than in the proto file?

a. We may evolve/improve the description.
b. This isn't a great medium for delivering "why" content.

My mantra for this file is "What, not why", and for the DataModel specification is "Why then What" (i.e. Justification, then Specification semantics).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK; will do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 @jsuereth also we make sure we don't duplicate specification and get one of them outdated

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also remove from line 134 to 154 for the moment. We can add it later.

@jmacd jmacd merged commit 78d3024 into open-telemetry:main May 11, 2021
@jmacd jmacd deleted the jmacd/start_time branch May 11, 2021 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants