
Unclear warning: Misaligned starting timestamps #38394

Open
DasMagischeToastbrot opened this issue Mar 5, 2025 · 2 comments
Labels
exporter/prometheus question Further information is requested

Comments

@DasMagischeToastbrot
DasMagischeToastbrot commented Mar 5, 2025

Component(s)

exporter/prometheus

What happened?

Description

We are seeing an unclear "Misaligned starting timestamps" warning and don't know how to fix it. Can someone please explain what it means and what its purpose is?

Collector version

v0.121.0

Environment information

Environment

OS: Amazon-Linux-2023

OpenTelemetry Collector configuration

receivers:
      otlp:
        protocols:
          grpc:
            endpoint: ${env:MY_POD_IP}:4317
          http:
            endpoint: ${env:MY_POD_IP}:4318
      prometheus:
        config:
          scrape_configs:
          - job_name: otel-collector-metrics
            scrape_interval: 10s
            static_configs:
            - targets: ['localhost:8889']
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 10s
      transform:
        metric_statements:
          - context: datapoint
            statements:
            - set(attributes["container_image_name"], resource.attributes["container.image.name"])
            - set(attributes["container_image_tag"], resource.attributes["container.image.tag"])
            - set(attributes["k8s_deployment_name"], resource.attributes["k8s.deployment.name"])
            - set(attributes["k8s_namespace_name"], resource.attributes["k8s.namespace.name"])
            - set(attributes["k8s_node_name"], resource.attributes["k8s.node.name"])
            - set(attributes["k8s_pod_ip"], resource.attributes["k8s.pod.ip"])
            - set(attributes["k8s_pod_name"], resource.attributes["k8s.pod.name"])
            - set(attributes["k8s_pod_start_time"], resource.attributes["k8s.pod.start.time"])
            - set(attributes["k8s_pod_uid"], resource.attributes["k8s.pod.uid"])
    exporters:
      debug:
        verbosity: basic
      prometheus:
        endpoint: 0.0.0.0:8888
        namespace: otel
        metric_expiration: 1h
    connectors:
      spanmetrics:
        dimensions:
          - name: decrypt.version
          - name: event.name
          - name: http.status_code
        aggregation_temporality: "AGGREGATION_TEMPORALITY_DELTA"
        histogram:
        exemplars:
          enabled: true
        exclude_dimensions: ['status.code']
        metrics_expiration: 30m
        events:
          enabled: true
          dimensions:
            - name: exception.type
            - name: exception.message
        resource_metrics_key_attributes:
          - service.name
          - telemetry.sdk.language
          - telemetry.sdk.name
        namespace: spanmetrics
        metrics_flush_interval: 10s
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [debug, spanmetrics]
        metrics:
          receivers: [otlp, prometheus, spanmetrics]
          processors: [memory_limiter, transform, batch]
          exporters: [debug, prometheus]
        logs:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [debug]
      telemetry:
        metrics:
          address: 0.0.0.0:8889
          level: detailed

Log output

2025-03-05T13:00:06.905Z    warn    prometheusexporter@v0.121.0/accumulator.go:263    Misaligned starting timestamps    {"otelcol.component.id": "prometheus", "otelcol.component.kind": "Exporter", "otelcol.signal": "metrics", "ip_start_time": "2025-03-05 12:59:52.024299095 +0000 UTC", "pp_start_time": "2025-03-05 12:58:32.024488269 +0000 UTC", "pp_timestamp": "2025-03-05 12:59:12.024491565 +0000 UTC", "ip_timestamp": "2025-03-05 13:00:02.02438911 +0000 UTC"}

Additional context

In front of the otel-collector there are multiple otel-agents, and each agent collects metrics from multiple services. Furthermore, there are two otel-collectors running.

@DasMagischeToastbrot DasMagischeToastbrot added bug Something isn't working needs triage New item requiring triage labels Mar 5, 2025
Contributor

github-actions bot commented Mar 5, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@ArthurSens
Member

The Prometheus exposition format does not support Deltas; therefore, when receiving deltas, the Prometheus exporter transforms them into their cumulative representation by aggregating deltas together. (At least that's my understanding 😅)

Transforming deltas to cumulative without losing data or semantics is not that simple. The OTel documentation page shows that a lot goes on with StartTimestamp ordering[1] and the concept of Single-Writer[2][3].

Since you mention using multiple collectors, I assume that histograms are arriving out of order, or that data points from a single application are being split across different collectors. When data points for the same stream land on different collectors, gaps appear between the StartTimestamps, causing the misalignment.

If you need to scale your collectors, I'd advise using the loadbalancing exporter in front of your Prometheus pipeline to ensure metrics from the same stream always go to the same pipeline.
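A rough sketch of that setup on the agent tier (the backend hostnames are placeholders, and `routing_key: service` is one possible choice to keep each service's metrics on a single backend):

```yaml
exporters:
  loadbalancing:
    routing_key: "service"    # route all data for one service to the same backend
    protocol:
      otlp:
        tls:
          insecure: true      # adjust TLS for your environment
    resolver:
      static:
        hostnames:            # placeholder backend collectors
          - otel-collector-1:4317
          - otel-collector-2:4317
```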

Code reference for the Misaligned starting timestamps logline.

switch histogram.AggregationTemporality() {
case pmetric.AggregationTemporalityDelta:
	pp := mv.value.Histogram().DataPoints().At(0) // previous aggregated value for time range
	if ip.StartTimestamp().AsTime() != pp.Timestamp().AsTime() {
		// treat misalignment as restart and reset, or violation of single-writer principle and drop
		a.logger.With(
			zap.String("ip_start_time", ip.StartTimestamp().String()),
			zap.String("pp_start_time", pp.StartTimestamp().String()),
			zap.String("pp_timestamp", pp.Timestamp().String()),
			zap.String("ip_timestamp", ip.Timestamp().String()),
		).Warn("Misaligned starting timestamps")
		if !ip.StartTimestamp().AsTime().After(pp.Timestamp().AsTime()) {
			a.logger.With(
				zap.String("metric_name", metric.Name()),
			).Warn("Dropped misaligned histogram datapoint")
			continue
		}
		a.logger.Debug("treating it like reset")
		ip.CopyTo(m.Histogram().DataPoints().AppendEmpty())
	} else {
		a.logger.Debug("Accumulate another histogram datapoint")
		accumulateHistogramValues(pp, ip, m.Histogram().DataPoints().AppendEmpty())
	}

@ArthurSens ArthurSens added question Further information is requested and removed bug Something isn't working needs triage New item requiring triage labels Mar 10, 2025