feat(config): honor OTEL_CONFIG_FILE in the SDK configurator #5271

Draft
MikeGoldsmith wants to merge 18 commits into open-telemetry:main from MikeGoldsmith:mike/config-file-env-routing

Conversation

MikeGoldsmith (Member) commented Jun 3, 2026

Description

Wires the SDK entry point to honor the OTEL_CONFIG_FILE environment variable. When set, _OTelSDKConfigurator._configure loads the referenced YAML/JSON file via load_config_file() and applies it via configure_sdk() — bypassing the existing env-var-based _initialize_components() path entirely.

Per spec v1.0.0: when a config file is given, it is the sole source of truth; other OTEL_* variables are ignored except as ${env:VAR} substitutions inside the file.

OTEL_CONFIG_FILE=/path/to/otel-config.yaml python my_app.py
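
The `${env:VAR}` substitution mentioned above can be illustrated with a minimal sketch. This is not the loader's implementation: `substitute_env_sketch` is a hypothetical helper, it handles only the `${env:VAR}` form, and replacing an unset variable with an empty string is an assumption made for the example.

```python
import re
from collections.abc import Mapping

# Matches only the ${env:NAME} form; other substitution forms are out of scope here.
_ENV_REF = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_env_sketch(text: str, environ: Mapping[str, str]) -> str:
    """Replace each ${env:NAME} reference with its value from `environ`.

    Assumption for this sketch: an unset variable becomes an empty string.
    """
    return _ENV_REF.sub(lambda match: environ.get(match.group(1), ""), text)


if __name__ == "__main__":
    raw = "endpoint: ${env:OTLP_ENDPOINT}"
    print(substitute_env_sketch(raw, {"OTLP_ENDPOINT": "http://localhost:4318"}))
```

Passing the environment as a mapping keeps the helper easy to test without touching `os.environ`.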

Behavior

  • Env var unset → existing _initialize_components(**kwargs) path. No change.
  • Env var set → configure_sdk(load_config_file(path)). Other env vars ignored.
  • Env var set + kwargs passed → warning logged listing the ignored kwargs (matters for distros that subclass _OTelSDKConfigurator and inject kwargs via super()._configure(**kwargs)). The file is authoritative.
  • Env var set, file missing/invalid → ConfigurationError propagates. Loud failure per spec.
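
The four cases above can be sketched as a single routing function. This is an illustration, not the code in this PR: `route_configuration` is a hypothetical name, and the three callables are injected stand-ins for `_initialize_components`, `load_config_file`, and `configure_sdk` so the example runs without the SDK installed.

```python
import logging
import os
from typing import Any, Callable

_logger = logging.getLogger(__name__)

OTEL_CONFIG_FILE = "OTEL_CONFIG_FILE"


def route_configuration(
    kwargs: dict[str, Any],
    initialize_components: Callable[..., None],
    load_config_file: Callable[[str], Any],
    configure_sdk: Callable[[Any], None],
) -> str:
    """Return "env" or "file" depending on which path was taken."""
    config_file = os.environ.get(OTEL_CONFIG_FILE)
    if not config_file:
        # Env var unset: existing behavior, kwargs forwarded unchanged.
        initialize_components(**kwargs)
        return "env"
    if kwargs:
        # The file is authoritative; tell distros their kwargs were dropped.
        _logger.warning(
            "%s is set; ignoring kwargs: %s",
            OTEL_CONFIG_FILE,
            ", ".join(sorted(kwargs)),
        )
    # Any error raised by the loader propagates to the caller (loud failure).
    configure_sdk(load_config_file(config_file))
    return "file"
```

Treating an empty value as unset is an assumption of this sketch.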

Implementation notes

  • OTEL_CONFIG_FILE constant added to opentelemetry.sdk.environment_variables.
  • File-loader imports (pyyaml, jsonschema) stay lazy inside the if config_file: branch so the SDK does not require [file-configuration] extras unless a config file is actually used.
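
A rough sketch of the lazy-import property described above: the optional module is imported only inside the branch that needs it, so a missing dependency cannot fail when no config file is set. `maybe_import_loader` is a hypothetical helper, and `importlib.import_module` with a module name parameter stands in for the real deferred `import` statements.

```python
import importlib
from collections.abc import Mapping
from types import ModuleType
from typing import Optional


def maybe_import_loader(
    environ: Mapping[str, str], loader_module: str
) -> Optional[ModuleType]:
    """Import `loader_module` only when OTEL_CONFIG_FILE is set."""
    config_file = environ.get("OTEL_CONFIG_FILE")
    if not config_file:
        # No config file: the optional dependency is never touched.
        return None
    # Deferred import: raises ImportError only on this branch.
    return importlib.import_module(loader_module)


if __name__ == "__main__":
    # A module that is not installed is harmless while the env var is unset.
    print(maybe_import_loader({}, "module_that_is_not_installed"))
```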

Refs #3631

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

5 new tests in tests/_configuration/test_configurator_file_routing.py:

  • Env var unset → _initialize_components called with passed kwargs (existing behavior preserved)
  • Env var set → configure_sdk(load_config_file(path)) called; _initialize_components NOT called
  • Env var set, missing file → ConfigurationError propagates
  • Env var set + kwargs → warning logged, kwargs not forwarded
  • Distro override pattern (subclass mutating kwargs then calling super()._configure) still works when env var is unset

Manually smoke-tested end-to-end: with a minimal YAML file, the SDK TracerProvider is constructed with the configured processor.

Does This PR Require a Contrib Repo Change?

  • Yes.
  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

Note

This PR is stacked on top of #5269 and #5270. The diff includes both upstream PRs' changes until they land; rebase will clean it up. Transient failures expected until then:

  • changelog: validator flags 5269.added and 5270.added as added on this PR (resolves as each lands)
  • public-symbols-check: OTEL_CONFIG_FILE is a new public symbol on opentelemetry.sdk.environment_variables. Intentional (env vars are public API contract). Needs an "Approve Public API check" label from a maintainer.

Adds `_dict_to_dataclass` in `_conversion.py` which walks each field's
type annotation and converts:
- nested dicts → typed dataclass instances
- lists of dicts → lists of typed dataclasses
- string/value → Enum members (e.g. log_level: info)
- unknown keys → routed to the @_additional_properties decorator

The loader's `_dict_to_model` now produces a fully-typed
OpenTelemetryConfiguration tree end-to-end. Factory functions can rely
on typed attribute access (config.tracer_provider.processors[0].batch
.exporter.otlp_http.endpoint) instead of failing on raw dicts.

This closes the gap between load_config_file() and the factory
functions — YAML/JSON config → SDK objects now works end-to-end.

Closes open-telemetry#5127

Assisted-by: Claude Opus 4.6

- Use TypeVar for _dict_to_dataclass return — callers now get the
  correct type instead of Any
- Use collections.abc.Mapping for input (more permissive than dict)
- Add explicit is_dataclass check at entry — raises TypeError with a
  descriptive message instead of failing later in dataclasses.fields

Assisted-by: Claude Opus 4.6
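
The conversion described in the two commit messages above can be sketched roughly as follows. This is not the code in `_conversion.py`: `dict_to_dataclass_sketch` and the `Sketch*` types are hypothetical, the real signature is not shown in this PR description, and unknown keys are simply ignored here instead of being routed to additional properties.

```python
import dataclasses
import enum
import typing
from collections.abc import Mapping
from typing import Any, Optional, TypeVar

T = TypeVar("T")


def _strip_optional(tp: Any) -> Any:
    """Return X for Optional[X]; any other annotation is returned unchanged."""
    if typing.get_origin(tp) is typing.Union:
        non_none = [arg for arg in typing.get_args(tp) if arg is not type(None)]
        if len(non_none) == 1:
            return non_none[0]
    return tp


def _convert(value: Any, tp: Any) -> Any:
    tp = _strip_optional(tp)
    if value is None:
        return None
    origin = typing.get_origin(tp)
    if origin is list and isinstance(value, list):
        # list[X]: convert each element against the item type.
        (item_type,) = typing.get_args(tp)
        return [_convert(item, item_type) for item in value]
    if origin is None and isinstance(tp, type):
        if dataclasses.is_dataclass(tp) and isinstance(value, Mapping):
            return dict_to_dataclass_sketch(value, tp)
        if issubclass(tp, enum.Enum):
            return tp(value)  # e.g. "info" -> SketchLevel.INFO
    return value


def dict_to_dataclass_sketch(data: Mapping[str, Any], cls: type[T]) -> T:
    # Explicit check at entry gives a descriptive error up front.
    if not (isinstance(cls, type) and dataclasses.is_dataclass(cls)):
        raise TypeError(f"{cls!r} is not a dataclass type")
    hints = typing.get_type_hints(cls)
    kwargs = {
        f.name: _convert(data[f.name], hints[f.name])
        for f in dataclasses.fields(cls)
        if f.name in data  # unknown keys are ignored in this sketch
    }
    return cls(**kwargs)


class SketchLevel(enum.Enum):
    INFO = "info"
    DEBUG = "debug"


@dataclasses.dataclass
class SketchExporter:
    endpoint: Optional[str] = None


@dataclasses.dataclass
class SketchProcessor:
    exporter: Optional[SketchExporter] = None


@dataclasses.dataclass
class SketchConfig:
    log_level: Optional[SketchLevel] = None
    processors: list[SketchProcessor] = dataclasses.field(default_factory=list)
```

With these stand-in types, a parsed mapping such as `{"log_level": "info", "processors": [{"exporter": {"endpoint": "..."}}]}` becomes a tree that supports attribute access like `config.processors[0].exporter.endpoint`.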
Astroid 3.x (used by pylint 3.x) follows typing.get_type_hints into
Python 3.14's annotationlib, which contains t-string literals it can't
parse and crashes with AttributeError on 'visit_templatestr'. Wrapping
the call in a helper that returns dict[str, Any] stops the inference at
the declared return type.

Assisted-by: Claude Opus 4.7

Same effect as the prior helper — declaring the local as ``dict[str, Any]``
stops astroid's inference at the annotation rather than tracing into the
typing internals.

Assisted-by: Claude Opus 4.7
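
For illustration, the two workarounds described in the two commit messages above might look like this. Both function names and the `SketchPoint` class are hypothetical; the runtime behavior is identical, and only the place where the `dict[str, Any]` annotation is declared differs.

```python
import dataclasses
import typing
from typing import Any


def hints_via_helper(cls: type) -> dict[str, Any]:
    # First approach: a helper whose declared return type is dict[str, Any].
    return typing.get_type_hints(cls)


def hints_via_annotated_local(cls: type) -> dict[str, Any]:
    # Second approach: annotate the local variable instead of adding a helper.
    hints: dict[str, Any] = typing.get_type_hints(cls)
    return hints


@dataclasses.dataclass
class SketchPoint:
    x: int
    y: float
```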
Single entry point that takes a parsed OpenTelemetryConfiguration,
builds the resource, and applies the tracer/meter/logger providers
and propagator globally. Honors the top-level disabled flag — when
true, no globals are touched.

The orchestrator is a thin composition of the existing per-signal
configure_* factories; the deeper unification with the env-var path
(see open-telemetry#5126) is left for follow-up.

Refs open-telemetry#3631
Refs open-telemetry#5126

Assisted-by: Claude Opus 4.7
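
A minimal sketch of the orchestration described in the commit message above, assuming injected stand-in factories in place of the real per-signal `configure_*` functions. `SketchConfiguration`, `configure_sdk_sketch`, and the factory signatures are all hypothetical; the real `configure_sdk` takes a parsed `OpenTelemetryConfiguration`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional

Section = Optional[dict[str, Any]]
SIGNALS = ("tracer_provider", "meter_provider", "logger_provider")


@dataclass
class SketchConfiguration:
    disabled: bool = False
    resource: dict[str, Any] = field(default_factory=dict)
    tracer_provider: Section = None
    meter_provider: Section = None
    logger_provider: Section = None
    propagator: Section = None


def configure_sdk_sketch(
    config: SketchConfiguration,
    signal_factories: dict[str, Callable[[Section, dict[str, Any]], None]],
    propagator_factory: Callable[[Section], None],
) -> list[str]:
    """Apply each section in order and return the names that were applied."""
    if config.disabled:
        # Top-level disabled flag: nothing global is touched.
        return []
    resource = dict(config.resource)  # built once, shared by all signals
    applied: list[str] = []
    for name in SIGNALS:
        signal_factories[name](getattr(config, name), resource)
        applied.append(name)
    propagator_factory(config.propagator)
    applied.append("propagator")
    return applied
```

Returning the list of applied sections is a choice made for this example so that the disabled case is easy to observe.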
When the environment variable is set, route the SDK through the
declarative config path — load the file via load_config_file() and
apply it via configure_sdk() — in place of the env-var-based
_initialize_components(). Other OTEL_* vars are ignored (per spec
v1.0.0: when a config file is given, it is the sole source of truth).

Kwargs passed to _OTelSDKConfigurator._configure are ignored with a
warning when the file path is set, so distros that inject kwargs via
super() see a clear signal rather than silent drops.

The file-loader imports (pyyaml, jsonschema) stay lazy so installs
without the file-configuration extras are not affected.

Refs open-telemetry#3631

Assisted-by: Claude Opus 4.7

… codespell

Replace the bespoke _Level enum (which violated pylint's invalid-name on
lowercase members) with the real ExemplarFilter enum from models.py — the
generated models use lowercase values verbatim from the JSON schema, so
using one of them avoids fighting the linter and exercises the same code
path with real data shapes.

Add 'astroid' to codespell's ignore-words-list; the prior commit's
explanatory comment mentions the library by name and codespell flagged it
as a misspelling of 'asteroid'.

Assisted-by: Claude Opus 4.7

Move ``SdkTracerProvider`` import to module top (ruff PLC0415 /
pylint C0415) and add explicit ``# pylint: disable=no-self-use``
on the three mock-only tests that intentionally do not touch
``self``.

Assisted-by: Claude Opus 4.7

The configure_sdk / load_config_file imports inside ``_configure``
are deliberately deferred so that the SDK does not pull in the
optional file-configuration extras (pyyaml, jsonschema) unless
``OTEL_CONFIG_FILE`` is actually set. Annotate with the corresponding
pylint and ruff suppressions; the existing comment already explains
why.

Assisted-by: Claude Opus 4.7
MikeGoldsmith added the "Approve Public API check" label Jun 3, 2026
MikeGoldsmith moved this to Ready for review in Python PR digest Jun 3, 2026
Labels

Approve Public API check This label shows that the public symbols added or changed in a PR are strictly necessary

Projects

Status: Ready for review
