fix(metrics): bind meter to our provider directly, bypass single-call global #179
Merged
… global

Follow-up to #178. The fork-safe init landed, but SigNoz queries still showed only one worker's value (peak inflight 128 across 4 worker processes instead of the expected ~512).

Root cause: the user-provided OTel resource (``host.name``, ``service.instance.id``) never reached the metrics backend. OpenTelemetry Python's ``metrics.set_meter_provider()`` takes effect only once per process. When ``opentelemetry-instrumentation`` (auto-instrumentation) is active in the process, it registers a default MeterProvider early in startup, before this module's lazy ``_init_otel_metrics()`` runs. Our subsequent ``set_meter_provider(our_provider)`` call is silently ignored, and ``metrics.get_meter(__name__)`` returns the global proxy bound to the auto-instrumentation's provider, whose resource we don't control.

The visible ClickHouse symptom: ``resource_attrs`` on every conserver.* metric contains only ``service.name`` + ``telemetry.sdk.*`` + ``telemetry.auto.version``; the ``host.name`` and ``service.instance.id`` that this module sets are dropped. All four worker processes collapse onto a single fingerprint, last-write-wins.

Fix: bind ``meter`` to OUR provider directly via ``provider.get_meter(__name__)``, never touching the global. Our provider has the correct resource and its own export pipeline to the OTel collector. Auto-instrumentation metrics continue to flow through their own pipeline independently, and both arrive at the collector.

Tests updated to assert that ``set_meter_provider`` is NOT called and that the ``meter`` global is bound to the provider's own meter, not the global proxy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Follow-up to #178. The fork-safe init landed, but observed peak `conserver.vcons.inflight` was still 128 across 4 worker processes — same as before — and `host.name` / `service.instance.id` were missing from `resource_attrs` in the metrics backend.
Root cause
OpenTelemetry Python's `metrics.set_meter_provider()` takes effect only once per process. When `opentelemetry-instrumentation` (auto-instrumentation) is active in the process, it registers a default MeterProvider early in startup, before this module's lazy `_init_otel_metrics()` runs. Our subsequent `set_meter_provider(our_provider)` is silently ignored, and `metrics.get_meter(__name__)` returns the global proxy bound to the auto-instrumentation's provider, whose resource we don't control.
The dead giveaway in the ClickHouse `resource_attrs` was `telemetry.auto.version: 0.59b0` — that key is only set by the auto-instrumentation distro's MeterProvider, not by any provider we'd create.
Fix
Bind `meter` to OUR provider directly via `provider.get_meter(__name__)`, never touching the global. Auto-instrumentation metrics continue to flow through their own pipeline; our application metrics flow through ours. Both end up at the collector independently with their own resource attributes.
Diff
```diff
```
Tests updated to assert `set_meter_provider` is NOT called and `meter` is bound to the provider's own meter.
Test plan
🤖 Generated with Claude Code