Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial draft of Prometheus <-> OTLP datamodel specification. #2017

Closed

Conversation

jsuereth
Copy link
Contributor

DO NOT SUBMIT (draft only)

For shared contribution/discussion and improvements.

Updates data model specification to denote prometheus compatibilty.

  • Should align with all the work done in the collector / prometheus-wg
  • Should allow Prometheus-exporter requirements for metrics SDK

Pending items:

  • Full review from prometheus-wg folks, with additional items
    • Prometheus -> OTLP -> Prometheus is fully detailed/working in Collector
    • (non-prometheus OTLP) -> Prometheus needs fleshing out
  • Attach bugs/tickets around missing prometheus data types
  • Exemplar mappings

Related issues:

cc @alolita for help shaping this up.

specification/metrics/datamodel.md Outdated Show resolved Hide resolved
- `le` label is used to identify histogram bucket boundaries and counts.
- `quantile` label is used to identify quantile points in summary metrics.
- `__name__` is used to identify the metric name of the data point.
- `__metrics_path__` is ignored in OTLP.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for __name__ (which is not part of the OpenMetrics protocol, has built-in meaning in the relabeling machinery, and is special in the Prometheus Remote Write protocol), I believe all the other __-prefixed labels are meant for the relabeling machinery to have access to, they're just "not exported".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The collector seems to use __schema__, but not sure if that's a PRW detail or not.

**Status**: [Experimental](../document-status.md)

This section denotes how to convert from prometheus scraped metrics to the
OpenTelemtery metric data model and how to create prometheus metrics from
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OpenTelemtery metric data model and how to create prometheus metrics from
OpenTelemetry metric data model and how to create Prometheus metrics from

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, shouldn't Prometheus be uppercase throughout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was hastily written, will go through and fix.


Prometheus metric labels are split in OpenTelemetry across Resource attributes
and Metric data stream attributes. Some labels are used within metric
families to denote semantics which open-telemetry captures within the structure
Copy link

@hdost hdost Oct 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional open-telemetry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, good catch!


Prometheus Stateset is dropped (TBD).

Prometheus Info is dropped (TBD).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considered Prometheus Info are typically represented with a constant 1 I could see this easily mapping into a OTel Gauge. As if I remember correctly they're already exported as Gauges. (So maybe it doesn't need to be specified)

Copy link

@hdost hdost Oct 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok looking at Java client it used to be a part of Gauge metric family, but now it's been moved to Unknown. So based on our reference either way it should be a gauge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should specify it in some way. I think coming in as a gauge works, we just need a way to do OpenMetrics -> OTLP -> OpenMetrics here.


OpenTelemetry Sum follows this logic:

- If the aggregation temporality is cumulative and the sum is monotonic,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we specify this on the reverse that Prometheus Counter becomes a cumulative monotonic OTLP Sum ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

that is not explicitly called out as being handled specially will be included
in the set of attributes for a metric data stream.

Here is a table of the sett of prometheus labels that are lifted into Resource
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Here is a table of the sett of prometheus labels that are lifted into Resource
Here is a table of the set of prometheus labels that are lifted into Resource

@github-actions
Copy link

github-actions bot commented Nov 6, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 6, 2021
Additionally, in Prometheus metric labels must match the following regex: `[a-zA-Z_:]([a-zA-Z0-9_:])*`. Metrics
from OpenTelemetry with unsupported Attribute names should replace invalid characters with the `_` character. This
may cause ambiguity in scenarios where mulitple similar-named attributes share invalid characters at the same
location. This is considered an unsupported case, and is highly unlikely.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
location. This is considered an unsupported case, and is highly unlikely.
location. This is considered as an unsupported case, and is highly unlikely.

- `__metrics_path__` is ignored in OTLP.

Additionally, in Prometheus metric labels must match the following regex: `[a-zA-Z_:]([a-zA-Z0-9_:])*`. Metrics
from OpenTelemetry with unsupported Attribute names should replace invalid characters with the `_` character. This
Copy link
Member

@reyang reyang Nov 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest that we take a special case here:

if the first character of the label key is a digit ([0-9]), prepend a _ instead of replace it with _

For example:

OpenTelemetry attribute name Prometheus label key
"9123" "_9123"

Rather than:

OpenTelemetry attribute name Prometheus label key
"9123" "_123"

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some suggestions, and I'll write the prototype in OpenTelemetry .NET (e.g. open-telemetry/opentelemetry-dotnet#2578)


OpenTelemetry Histogram becomes a metric family with the following:

- A single `{name}_count` metric denoting the count field of the histogram.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prometheus doesn't have a separate notion of "unit", based on the Prometheus Metric and Label naming convention, it seems the general guidance is to append unit as a suffix. For example:

OpenTelemetry metric name (from instrument/view) OpenTelemetry metric unit Prometheus metric name
http.server.duration milliseconds http_server_duration_milliseconds
http.server.active_requests N/A http_server_active_requests

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that prometheus is moving the direction of OpenMetrics which does support Units as a concept https://github.com/OpenObservability/OpenMetrics we may want to suggest that path, so not changing the metrics name but instead is metadata on the metric.

OpenTelemetry can assume some shape/structure to metric family groups, any
metric belonging to a family that is not treated specially should be exported
as its own metric stream. For example, while histograms are expected to
live in a metric family with metrics `{name}_sum`, `{name}_count` and `{name}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the Prometheus best practices for Histogram and Summaries, we should include unit:

  1. If there is no unit (empty string, null string), histogram should use {name}, {name}_count and {name}_sum.
  2. If there is a valid unit, histogram should use {name}_{unit}, {name}_{unit}_count and {name}_{unit}_sum.

@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Nov 18, 2021
@hdost
Copy link

hdost commented Nov 22, 2021

isn't this still active?

@jsuereth
Copy link
Contributor Author

jsuereth commented Dec 9, 2021

It is active

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 18, 2021
@jsuereth jsuereth removed the Stale label Dec 19, 2021
@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jan 11, 2022
@alolita
Copy link
Member

alolita commented Jan 11, 2022

I discussed this draft PR w @jsuereth @jmacd @Aneurysm9 earlier this week. We're planning to work towards completing this PR in the Prometheus workgroup.

@jsuereth
Copy link
Contributor Author

Closing in favor of #2266

@jsuereth jsuereth closed this Jan 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants