OpenTelemetry: WARNING: Instrument has recorded multiple values for the same attributes #15623

Closed
marcosmarxm opened this issue Aug 12, 2022 · 4 comments
Labels: area/metrics (Related to metrics and data gathering), team/compose, team/prod-eng, zendesk

Comments

@marcosmarxm
Member

This GitHub issue is synchronized with Zendesk:

Ticket ID: #1847
Priority: normal
Group: Community Assistance Engineer
Assignee: Sajarin

Original ticket description:

Context

Hello!

We are using Airbyte 0.39.42-alpha with Docker Compose, and are setting it up to send metrics using OpenTelemetry, using information from the following documentation and threads:

According to the documentation, we have updated the Docker Compose stack to:

  • set up the airbyte-metrics-reporter service for OpenTelemetry
  • set up the airbyte-worker service for OpenTelemetry
  • set up the opentelemetry-collector service to handle OTEL gRPC calls, and expose metrics using the Prometheus exporter

Additionally, we have set up:

  • Prometheus to scrape data from opentelemetry-collector
  • Grafana to display Prometheus metrics

Issue

When the airbyte-metrics-reporter service emits metrics using the OpenTelemetry SDK, the following warnings appear in its logs:

airbyte-metrics-reporter  | Aug 08, 2022 3:21:29 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
airbyte-metrics-reporter  | WARNING: Instrument oldest_running_job_age_secs has recorded multiple values for the same attributes.
airbyte-metrics-reporter  | Aug 08, 2022 3:21:29 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
airbyte-metrics-reporter  | WARNING: Instrument num_running_jobs has recorded multiple values for the same attributes.
airbyte-metrics-reporter  | Aug 08, 2022 3:21:29 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
airbyte-metrics-reporter  | WARNING: Instrument oldest_pending_job_age_secs has recorded multiple values for the same attributes.

When sync jobs are running, the gauges corresponding to the number of pending and running jobs do not seem to be updated accordingly, e.g. with two sync jobs running:

$ curl --silent http://localhost:8889/metrics | rg 'num_running'
# HELP airbyte_num_running_jobs number of running jobs
# TYPE airbyte_num_running_jobs gauge
airbyte_num_running_jobs{job="metrics-reporter"} 0
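
The check above can be scripted. Below is a minimal sketch in Python (standard library only) that parses Prometheus text exposition output and looks up a gauge's value. `parse_samples` and `gauge_value` are hypothetical helper names, not part of Airbyte or Prometheus, and the parser is deliberately simplified: it assumes label values contain no `}` character.

```python
import re

# Matches a sample line such as:
#   airbyte_num_running_jobs{job="metrics-reporter"} 0
# An optional trailing integer timestamp is accepted and ignored.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+(?P<ts>-?\d+))?\s*$'
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def gauge_value(text, name, **labels):
    """Return the first sample value matching name and labels, or None."""
    for sample_name, sample_labels, value in parse_samples(text):
        if sample_name == name and all(
            sample_labels.get(key) == expected for key, expected in labels.items()
        ):
            return value
    return None
```

Given the `curl` output shown above as `text`, `gauge_value(text, "airbyte_num_running_jobs", job="metrics-reporter")` returns `0.0`, which makes it easy to assert in a test that the gauge changes once sync jobs are running.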

image

This issue seems to be limited to gauge values, as counters are correctly incremented:

image

Configuration details

Please find the (curated) configuration related to OpenTelemetry that we used for the different services:


.env

VERSION=0.39.42-alpha
PUBLISH_METRICS="true"
METRIC_CLIENT=otel
OTEL_COLLECTOR_ENDPOINT="http://otel-collector:4317"

docker-compose.yml

services:
  worker:
    environment:
      - PUBLISH_METRICS=${PUBLISH_METRICS}
      - METRIC_CLIENT=${METRIC_CLIENT}
      - OTEL_COLLECTOR_ENDPOINT=${OTEL_COLLECTOR_ENDPOINT}

  airbyte-metrics-reporter:
    image: airbyte/metrics-reporter:${VERSION}
    logging: *default-logging
    container_name: airbyte-metrics-reporter
    environment:
      - CONFIG_DATABASE_PASSWORD=${CONFIG_DATABASE_PASSWORD:-}
      - CONFIG_DATABASE_URL=${CONFIG_DATABASE_URL:-}
      - CONFIG_DATABASE_USER=${CONFIG_DATABASE_USER:-}
      - CONFIGS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION=${CONFIGS_DATABASE_MINIMUM_FLYWAY_MIGRATION_VERSION:-}
      - CONFIG_ROOT=${CONFIG_ROOT}
      - DATABASE_PASSWORD=${DATABASE_PASSWORD}
      - DATABASE_URL=jdbc:postgresql://${DATABASE_HOST}:${DATABASE_PORT}/${DATABASE_DB}
      - DATABASE_USER=${DATABASE_USER}
      - PUBLISH_METRICS=${PUBLISH_METRICS}
      - METRIC_CLIENT=${METRIC_CLIENT}
      - OTEL_COLLECTOR_ENDPOINT=${OTEL_COLLECTOR_ENDPOINT}

  otel-collector:
    image: otel/opentelemetry-collector:0.57.2
    command: ["--config=/etc/otel-collector-config.yaml"]
    ports:
      - "8888:8888" # Prometheus metrics exposed by the collector
      - "8889:8889" # Prometheus exporter metrics
    volumes:
      - ./otel-collector/otel-collector-config.yaml:/etc/otel-collector-config.yaml

otel-collector/otel-collector-config.yaml

---
receivers:
  otlp:
    protocols:
      grpc: {}

processors:
  batch: {}

exporters:
  logging: {}
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: airbyte
    send_timestamps: true
    metric_expiration: 60m

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [health_check, pprof, zpages]
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging, prometheus]

Attempts
After seeing the following issue being fixed on the OTEL SDK:

I tried bumping the version of the SDK to 1.16 using Airbyte’s deps.toml and rebuilding the Docker image for airbyte-metrics-reporter:

$ git clone https://github.com/airbytehq/airbyte
$ cd airbyte
$ vim deps.toml    # set OTEL SDK version to 1.16.0
$ cd airbyte-metrics/reporter
$ ../../gradlew build

but observed the same behaviour: the warnings are still logged and the gauges stay stuck at 0.
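
One possible reading of both symptoms, offered as an assumption rather than a confirmed diagnosis: if the reporting code registers a new gauge callback for every reading instead of updating a single one, each collection cycle sees several values for the same (empty) attribute set. The toy model below is plain Python and is not the OpenTelemetry SDK. `ToyAsyncGauge`, `leaky_report` and `LatestValue` are hypothetical names, and the "warn, then keep the first value" behaviour is assumed for illustration.

```python
import logging

logger = logging.getLogger("toy_otel")


class ToyAsyncGauge:
    """Toy stand-in for an asynchronous gauge; NOT the OpenTelemetry SDK."""

    def __init__(self, name):
        self.name = name
        self.callbacks = []

    def register_callback(self, callback):
        self.callbacks.append(callback)

    def collect(self):
        """Run every callback once and keep one point per attribute set."""
        points = {}
        for callback in self.callbacks:
            attributes, value = callback()
            key = frozenset(attributes.items())
            if key in points:
                # Assumed behaviour: warn and drop the later value.
                logger.warning(
                    "Instrument %s has recorded multiple values for the same attributes.",
                    self.name,
                )
                continue
            points[key] = value
        return points


def leaky_report(gauge, value):
    """Hypothetical anti-pattern: register a new callback for every reading."""
    gauge.register_callback(lambda: ({}, value))


class LatestValue:
    """Hypothetical alternative: one callback that reads a mutable holder."""

    def __init__(self, gauge):
        self.value = 0
        gauge.register_callback(lambda: ({}, self.value))
```

In this model, calling `leaky_report(gauge, 0)` and then `leaky_report(gauge, 2)` leaves the exported value at 0 and logs the warning on each collection, while updating `LatestValue.value` exports the most recent reading with no warning. Whether Airbyte's metrics reporter follows the first pattern would need to be checked against its source.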

The following discussion may provide better insights as to why the emission of the latest value fails for Airbyte gauges:

Please let me know if you need more information to reproduce the issue; I’ll also be happy to contribute fixes :slight_smile:

Thanks,

Aurélien

[Discourse post]

@marcosmarxm
Member Author

Comment made from Zendesk by Sajarin on 2022-08-12 at 19:39:

Hey @virtualtam, we really appreciate this post. I opened an issue for your question on GitHub; please add your thoughts and follow the discussion over there!

@marcosmarxm
Member Author

Comment made from Zendesk by Marcos Marx on 2022-08-13 at 08:42:

Hi @sajarin , thanks for following up!

I’ll head over to GitHub to continue the discussion :+1:

For anyone facing similar behaviour with OpenTelemetry metrics collection, the corresponding issue is airbytehq/airbyte#15623 - OpenTelemetry: WARNING: Instrument has recorded multiple values for the same attributes

[Discourse post]

@geneyen-chu

Do we have any update on this? It seems the number of metrics is not correct.

@id13

id13 commented Feb 20, 2023

Any news on this issue? We are experiencing it as well on Helm chart 0.42.2.
