
fix(metrics): bind meter to our provider directly, bypass single-call global #179

Merged: pavanputhra merged 1 commit into `main` from `fix/otel-bypass-global-meter-provider` on May 20, 2026.


Summary

Follow-up to #178. The fork-safe init landed, but observed peak `conserver.vcons.inflight` was still 128 across 4 worker processes — same as before — and `host.name` / `service.instance.id` were missing from `resource_attrs` in the metrics backend.

Root cause

OpenTelemetry Python's `metrics.set_meter_provider()` is single-call. When `opentelemetry-instrumentation` (auto-instrumentation) is active in the process, it registers a default MeterProvider early in startup, before this module's lazy `_init_otel_metrics()` runs. Our subsequent `set_meter_provider(our_provider)` is silently ignored, and `metrics.get_meter(__name__)` returns the global proxy bound to the auto-instrumentation's provider, whose resource we don't control.
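The ordering problem can be illustrated with a toy model of a set-once global. This is a simplified stand-in written for illustration, not OpenTelemetry's actual implementation; `set_provider` and `get_provider` are hypothetical names.

```python
# Toy model of a set-once global; NOT OpenTelemetry's real code.
_global_provider = None


def set_provider(provider):
    """Install the provider only if none is installed yet.

    Returns True if the call took effect, False if it was dropped.
    """
    global _global_provider
    if _global_provider is not None:
        return False  # every call after the first is dropped
    _global_provider = provider
    return True


def get_provider():
    """Return whichever provider won the race to be installed first."""
    return _global_provider


# Auto-instrumentation runs early in startup and wins.
first = set_provider("auto-instrumentation provider")
# The application's lazy init runs later and loses.
second = set_provider("application provider")

print(first, second, get_provider())
```

Under this model, anything that later asks the global for a meter gets one tied to the first provider, regardless of what the application tried to install.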

The dead giveaway in the ClickHouse `resource_attrs` was `telemetry.auto.version: 0.59b0` — that key is only set by the auto-instrumentation distro's MeterProvider, not by any provider we'd create.

Fix

Bind `meter` to OUR provider directly via `provider.get_meter(__name__)`, never touching the global. Auto-instrumentation metrics continue to flow through their own pipeline; our application metrics flow through ours. Both end up at the collector independently with their own resource attributes.

Diff

```diff
- metrics.set_meter_provider(provider)
- meter = metrics.get_meter(__name__)
+ meter = provider.get_meter(__name__)
```

Tests updated to assert `set_meter_provider` is NOT called and `meter` is bound to the provider's own meter.
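The shape of such an assertion might look like the sketch below, which uses only `unittest.mock`. `init_meter` is a hypothetical stand-in for the module's init code, and the test name and meter name are made up; the repository's actual tests may be structured differently.

```python
from unittest import mock


def init_meter(metrics_api, provider, name):
    """Hypothetical stand-in for the init path: use the provider, skip the global."""
    return provider.get_meter(name)


def test_meter_bound_to_own_provider():
    metrics_api = mock.MagicMock()  # stands in for the opentelemetry.metrics module
    provider = mock.MagicMock()     # stands in for our MeterProvider

    meter = init_meter(metrics_api, provider, "conserver.metrics")

    # The global must not be touched at all.
    metrics_api.set_meter_provider.assert_not_called()
    metrics_api.get_meter.assert_not_called()
    # The meter must come from the provider we built.
    provider.get_meter.assert_called_once_with("conserver.metrics")
    assert meter is provider.get_meter.return_value


test_meter_bound_to_own_provider()
```

Asserting on both `set_meter_provider` and the global `get_meter` catches a regression in either half of the old code path.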

Test plan

  • 24 tests pass locally (the fork-safe + inflight + queue_metrics suites)
  • After deploy: distinct `resource_attrs` per worker (`host.name` + `service.instance.id` populated). `conserver.vcons.inflight` peak should reach ~512 (2 pods × 2 workers × 128) when the transcribe queue is deep enough.

🤖 Generated with Claude Code

… global

Follow-up to #178. The fork-safe init landed, but
SigNoz queries still showed only one worker's value (peak inflight 128
across 4 worker processes instead of the expected ~512). Root cause:
the user-provided OTel resource (``host.name``, ``service.instance.id``)
never reached the metrics backend.

OpenTelemetry Python's ``metrics.set_meter_provider()`` is single-call.
When ``opentelemetry-instrumentation`` (auto-instrumentation) is active
in the process, it registers a default MeterProvider early in startup —
before this module's lazy ``_init_otel_metrics()`` runs. Our subsequent
``set_meter_provider(our_provider)`` call is silently ignored, and
``metrics.get_meter(__name__)`` returns the global proxy bound to the
auto-instrumentation's provider — whose resource we don't control.

The visible ClickHouse symptom: ``resource_attrs`` on every ``conserver.*``
metric only contains ``service.name`` + ``telemetry.sdk.*`` +
``telemetry.auto.version``; ``host.name`` and ``service.instance.id``
that this module sets are dropped. All four worker processes collapse
onto a single fingerprint, last-write-wins.
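
The collapse can be reproduced with a simplified model of a backend that keys each series by its resource attributes and keeps only the latest value per key. The `fingerprint` and `store_and_sum` helpers and the attribute values are hypothetical; the real backend's fingerprinting is more involved.

```python
def fingerprint(resource_attrs):
    """Series identity in this model: the sorted resource attributes."""
    return tuple(sorted(resource_attrs.items()))


def store_and_sum(reports):
    """Keep the latest value per fingerprint, then sum across fingerprints."""
    latest = {}
    for attrs, value in reports:
        latest[fingerprint(attrs)] = value  # last write wins
    return sum(latest.values())


base = {"service.name": "conserver"}

# Before the fix: all four workers report identical resource attributes.
shared = {**base, "telemetry.auto.version": "0.59b0"}
before = [(shared, 128) for _ in range(4)]

# After the fix: each worker carries its own host and instance id.
after = [
    ({**base, "host.name": f"pod-{w // 2}", "service.instance.id": f"worker-{w}"}, 128)
    for w in range(4)
]

print(store_and_sum(before))  # 128
print(store_and_sum(after))   # 512
```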

Fix: bind ``meter`` to OUR provider directly via
``provider.get_meter(__name__)``, never touching the global. Our
provider has the correct resource and its own export pipeline to
the OTel collector. Auto-instrumentation metrics continue to flow
through their own pipeline independently — both arrive at the
collector.

Tests updated to assert ``set_meter_provider`` is NOT called and the
``meter`` global is bound to the provider's own meter, not the global
proxy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pavanputhra force-pushed the fix/otel-bypass-global-meter-provider branch from 6797301 to d679a4c on May 20, 2026 19:59
pavanputhra merged commit 580c626 into main on May 20, 2026 (1 check passed)
pavanputhra deleted the fix/otel-bypass-global-meter-provider branch on May 20, 2026 20:01
