The istio_request_duration_milliseconds metic isn't adhering to the Prometheus standard time unit #44168

modulitos · 2023-03-29T19:52:15Z

Bug Description

The istio_request_duration_milliseconds metric isn't adhering to the Prometheus standard practice for using seconds as the base unit for time: https://prometheus.io/docs/practices/naming/#base-units

Some libraries assume the buckets to be in seconds, which can cause unexpected behavior, like this one: pyrra-dev/pyrra#667

I plan to work around this issue by effectively relabeling the metrics with a Prometheus recording rule, but perhaps this metric needs to be in milliseconds for some reason?

Also, it looks like this metric used to be in seconds, but was changed to milliseconds during the upgrade to in-proxy telemetry (aka v2) from Mixer-based telemetry (aka v1):
https://istio.io/latest/about/faq/metrics-and-logs/
https://istio.io/v1.6/docs/reference/config/policy-and-telemetry/metrics/

Version

❯ istioctl version
client version: 1.17.1
control plane version: 1.17.1
data plane version: 1.17.1

❯ kubectl version --short
Client Version: v1.26.3
Kustomize Version: v4.5.7
Server Version: v1.24.10-gke.2300



### Additional Information

_No response_

The text was updated successfully, but these errors were encountered:

kyessenov · 2023-03-31T03:02:36Z

More recent standard require milliseconds, see https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/semantic_conventions/http-metrics.md#metric-httpclientduration.

Second is too large of a unit for requests IMHO.

modulitos · 2023-04-05T17:39:04Z

@kyessenov Thank you for the info! That makes sense, although there seems to be a discrepancy between Prometheus' base units (in seconds), and the semantic conventions in the http metrics by OpenTelemetry (in milliseconds).

It seems like Istio is adhering to the OpenTelemetry convention, so I'll close this issue for now.

modulitos · 2023-04-10T17:10:44Z

FYI OpenTelemetry has recently decided to align duration metric units with the prometheus convention of seconds: open-telemetry/opentelemetry-specification#2977

but as long as the unit metadata is in there, then I think this is still correct and there's nothing to fix:

The OpenMetrics UNIT metadata, if present, MUST be converted to the unit of the OTLP metric. After trimming type-specific suffixes, such as _total for counters, the unit MUST be trimmed from the suffix as well, if the metric suffix matches the unit. The unit SHOULD be translated from Prometheus conventions to OpenTelemetry conventions

https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/compatibility/prometheus_and_openmetrics.md#metric-metadata

istio-policy-bot added the area/extensions and telemetry label Mar 29, 2023

rvalkenaers mentioned this issue Apr 5, 2023

Support for millisecond granularity on latency buckets? pyrra-dev/pyrra#667

Closed

modulitos closed this as completed Apr 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The istio_request_duration_milliseconds metic isn't adhering to the Prometheus standard time unit #44168

The istio_request_duration_milliseconds metic isn't adhering to the Prometheus standard time unit #44168

modulitos commented Mar 29, 2023 •

edited

kyessenov commented Mar 31, 2023

modulitos commented Apr 5, 2023

modulitos commented Apr 10, 2023

The istio_request_duration_milliseconds metic isn't adhering to the Prometheus standard time unit #44168

The istio_request_duration_milliseconds metic isn't adhering to the Prometheus standard time unit #44168

Comments

modulitos commented Mar 29, 2023 • edited

Bug Description

Version

kyessenov commented Mar 31, 2023

modulitos commented Apr 5, 2023

modulitos commented Apr 10, 2023

modulitos commented Mar 29, 2023 •

edited