Initial cut at migrating jmacd's datamodel document into the spec #1512

Merged (19 commits) on Mar 18, 2021

Conversation

@jsuereth jsuereth commented Mar 8, 2021

From Task: https://github.com/open-telemetry/opentelemetry-specification/projects/3#card-56227942

  • Imports an initial version of @jmacd's Metric DataModel document, which elucidates a LOT of important points/goals in the metric data model over the existing specification.
  • Adds TODO sections for discussion beyond the initial "Events" - "Data Model" - "Time Series" fragmentation.
  • Updates verbiage in root specification around the metrics data model.

Note: This PR is 100% focused on framing the problem with the 3 models and separation of concerns, as well as defining some key use cases for evaluating the model.

Replaces #1510 with permission to do so.

@jsuereth jsuereth changed the title [wip] Initial cut at migrating Josh MacD's datamodel document into the spec… Initial cut at migrating jmacd's datamodel document into the spec Mar 8, 2021
@jsuereth jsuereth marked this pull request as ready for review March 8, 2021 18:41
@jsuereth jsuereth requested review from a team as code owners March 8, 2021 18:41
aabmass commented Mar 9, 2021

Co-authored-by: Reiley Yang <reyang@microsoft.com>
@jsuereth jsuereth added this to Written Specification in Spec - Metrics Data Model and Protocol Mar 16, 2021
- With delta temporality: stateless collector
- With cumulative temporality: stateful collector
6. OTel SDK exports directly to 3P backend
7. OTel SDK exports directly to 3P backend
Contributor

(Is "3P" defined somewhere?)

Member

I would replace with the explicit text :)

@bogdandrutu bogdandrutu left a comment

Overall LGTM, I think we can iterate over some details later.

Comment on lines +129 to +135
- Using OTLP as an intermediary format between two non-compatible formats
- Importing [statsd](https://github.com/statsd/statsd) => Prometheus PRW
- Importing [collectd](https://collectd.org/wiki/index.php/Binary_protocol#:~:text=The%20binary%20protocol%20is%20the,some%20documentation%20to%20reimplement%20it)
=> Prometheus PRW
- Importing Prometheus endpoint scrape => [statsd push | collectd | opencensus]
- Importing OpenCensus "oca" => any non OC or OTel format
- TODO: define others.
Member

I don't quite understand this:

It is in scope to have Prometheus -> OTLP -> PRW, and it is in scope to have StatsD -> OTLP, so why is StatsD -> OTLP -> PRW not in scope? Mathematically, if the translation function OTLP -> PRW is defined and StatsD -> OTLP is defined, then I can compose them to get StatsD -> OTLP -> PRW.

Contributor

I think StatsD -> OTLP -> PRW is in-scope. We haven't added support for raw sampled metric events, which is how they appear to the StatsD receiver, so somewhere in that picture is a stateful conversion to OTLP (cumulative). That is not a difficult task; it just requires memory.

I believe there is an additional concept that belongs in the specification, and I'm sorry to introduce it here, that I think of as a counterpart to the Single-Writer definition we have discussed. It can be called a Single-Reader property, and it is required to perform a correct delta-to-cumulative translation.

Single-Reader: Having access to the whole stream of data for a metric. This happens, for example, when an OTel collector is configured as a per-node agent and every process on the node reports through that agent. It also happens when an OTel collector pool runs the Kafka exporter and Kafka receiver with appropriate configuration.

StatsD -> OTLP -> PRW should be in-scope for a single-node deployment, and it ought to behave just as https://github.com/prometheus/statsd_exporter would.
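
As an illustration of the stateful conversion described above, here is a minimal Python sketch. It is not the collector's implementation: the `DeltaToCumulative` class, the `parse_statsd_counter` helper, and the `ingest` function are hypothetical names, and the parser only handles plain counter lines of the form `name:value|c` (no sample rates or tags). The second half shows what goes wrong when the Single-Reader property does not hold and the stream is split across two converters.

```python
from typing import Dict, Tuple

# A series is identified by its metric name plus its sorted label pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class DeltaToCumulative:
    """Hypothetical converter: keeps one running total per series.

    The dictionary of totals is the "memory" the conversion requires.
    """

    def __init__(self) -> None:
        self._totals: Dict[SeriesKey, float] = {}

    def add(self, name: str, labels: Dict[str, str], delta: float) -> float:
        key = (name, tuple(sorted(labels.items())))
        self._totals[key] = self._totals.get(key, 0.0) + delta
        return self._totals[key]


def parse_statsd_counter(line: str) -> Tuple[str, float]:
    """Parse a plain StatsD counter line such as 'requests:3|c'."""
    name, rest = line.split(":", 1)
    value, kind = rest.split("|", 1)
    if kind != "c":
        raise ValueError(f"this sketch only handles plain counters, got {kind!r}")
    return name, float(value)


def ingest(converter: DeltaToCumulative, line: str, labels: Dict[str, str]) -> float:
    """Turn one StatsD delta event into the current cumulative value."""
    name, delta = parse_statsd_counter(line)
    return converter.add(name, labels, delta)


# Single reader: one converter sees every event for the series.
converter = DeltaToCumulative()
lines = ["requests:3|c", "requests:2|c", "requests:5|c"]
totals = [ingest(converter, line, {"host": "a"}) for line in lines]

# No single reader: the same three events are split across two converters,
# so neither running total ever reflects the whole stream.
a, b = DeltaToCumulative(), DeltaToCumulative()
split = [
    a.add("requests", {"host": "a"}, 3.0),
    b.add("requests", {"host": "a"}, 2.0),
    a.add("requests", {"host": "a"}, 5.0),
]
```

With a single reader the running totals end at 10.0, the true sum. In the split case each converter only accumulates the deltas it happened to receive, which is why a per-node agent or a partition-aware Kafka setup is needed for a correct translation.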

Contributor Author

@bogdandrutu There's a lot of nuance to StatsD => PRW. While TECHNICALLY you can shove bits into PRW, when I say "compatibility" I mean the way we defined it for Prometheus: "A user of Prometheus shouldn't be able to tell if OTLP was used to transport Prometheus-style metrics to their backend". Similarly, for users of StatsD today, we should HONOR the StatsD conventions and formats from StatsD => PRW. The reason StatsD => PRW is out of scope is all the non-data-related nuance (e.g. the naming conventions, namespace options, etc. that a StatsD user would expect).

To be frank: I don't think it's OTLP's job to define how StatsD maps into PRW, and we shouldn't sacrifice the design of OTLP for that use case. We can attempt to provide the best translation possible, but it should not "block" or otherwise inhibit progress. There's a difference between "out of scope" and "can't do". What I'm suggesting is that we won't prioritize or make a ton of effort to solve StatsD => PRW bugs that are inherent to the impedance mismatch of those two models.

@jmacd I'm actually tying your "Single-Reader" into the "Single-Writer" rule. Specifically, the way I'm phrasing Single-Writer in the not-yet-public PR is that any given time series should originate from a single writer. This includes aggregated time series, meaning that if the collector is manipulating a metric, it becomes the new single writer of that aggregation. It is also required for delta-to-cumulative manipulation (which, in my opinion, produces a new series of data). To some extent I think this nuance is just terminology, and we need to make sure we COVER when it's OK to do delta-to-cumulative conversion and what that implies for your telemetry pipeline architecture.
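
One possible reading of the Single-Writer rule, sketched in Python under stated assumptions: the `SingleWriterRegistry` class, its `claim` and `owner_of` methods, and the writer identifiers are all hypothetical and do not come from the specification or any OTel library. The sketch treats a time series as a metric name plus a label set and rejects a second writer for a series that already has an owner.

```python
from typing import Dict, Optional, Tuple

# A series is identified by its metric name plus its sorted label pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


class SingleWriterViolation(Exception):
    """Raised when a second writer tries to write an already-owned series."""


class SingleWriterRegistry:
    """Hypothetical bookkeeping for one stream: each series has one writer."""

    def __init__(self) -> None:
        self._owner: Dict[SeriesKey, str] = {}

    @staticmethod
    def _key(name: str, labels: Dict[str, str]) -> SeriesKey:
        return (name, tuple(sorted(labels.items())))

    def claim(self, writer_id: str, name: str, labels: Dict[str, str]) -> None:
        # The first writer to claim a series becomes its owner.
        owner = self._owner.setdefault(self._key(name, labels), writer_id)
        if owner != writer_id:
            raise SingleWriterViolation(
                f"{name} {labels} is already written by {owner!r}, not {writer_id!r}"
            )

    def owner_of(self, name: str, labels: Dict[str, str]) -> Optional[str]:
        return self._owner.get(self._key(name, labels))


registry = SingleWriterRegistry()
registry.claim("process-1", "requests", {"host": "a"})
registry.claim("process-1", "requests", {"host": "a"})  # same writer again: allowed
registry.claim("process-2", "requests", {"host": "b"})  # different series: allowed

try:
    registry.claim("process-2", "requests", {"host": "a"})
    conflict = False
except SingleWriterViolation:
    conflict = True
```

Under this reading, a collector that aggregates or converts deltas to cumulatives would register as the writer of its output series in the downstream stream's registry, rather than sharing ownership with the processes that feed it.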

Contributor

💯

> It is also required for delta-to-cumulative manipulation (which, in my opinion, produces a new series of data).

Yes, that's how I've been thinking about it too. It's permitted to output the same metric name, in this case, because the incoming data is "terminal".

@jmacd jmacd merged commit 05327d4 into open-telemetry:main Mar 18, 2021
@jsuereth jsuereth moved this from Written Specification to Done in Spec - Metrics Data Model and Protocol Mar 19, 2021
ThomsonTan pushed a commit to ThomsonTan/opentelemetry-specification that referenced this pull request Mar 30, 2021
…en-telemetry#1512)

* Initial cut at migrating Josh MacD's datamodel document into the specification.
Co-authored-by: Aaron Abbott <aaronabbott@google.com>
Co-authored-by: Reiley Yang <reyang@microsoft.com>
Co-authored-by: Joshua MacDonald <jmacd@users.noreply.github.com>
@jsuereth jsuereth deleted the metric-data-model-initial branch April 17, 2021 15:48