Adding core OTEL layer with accompanying sample and tests #342
rodrigobr-msft wants to merge 3 commits into main
Pull request overview
Introduces a first-class OpenTelemetry (OTEL) telemetry subsystem in microsoft-agents-hosting-core, along with a sample app and test suite to validate spans/metrics behavior, and minor refactors to ApplicationError raising patterns for readability.
Changes:
- Added `microsoft_agents.hosting.core.telemetry` (core tracer/meter access, span-wrapper abstractions, resource metadata, and small attribute/utils helpers).
- Added OTEL-focused tests (span wrapper behavior, tracer/meter initialization, metric delta reader) and shared test fixtures/utilities.
- Added an OTEL sample (`test_samples/otel`) and registered OTEL deps in hosting-core packaging.
Reviewed changes
Copilot reviewed 25 out of 28 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/hosting_core/telemetry/test_utils.py | Unit tests for telemetry utility helpers (scopes, conversation id, delivery mode). |
| tests/hosting_core/telemetry/test_simple_span_wrapper.py | Tests for span wrapper lifecycle, attributes, status, and exception behavior. |
| tests/hosting_core/telemetry/test_agents_telemetry.py | Tests for tracer/meter initialization and span callback behavior. |
| tests/hosting_core/telemetry/__init__.py | Telemetry test package marker. |
| tests/_common/telemetry_utils.py | Helper functions to locate/sum metrics in collected output. |
| tests/_common/fixtures/telemetry.py | In-memory OTEL exporter/metric reader fixtures + delta metric reader wrapper. |
| tests/_common/_tests/test_delta_metric_reader.py | Unit tests validating DeltaMetricReader semantics. |
| test_samples/otel/start_dashboard.ps1 | Helper script to launch an OTEL dashboard container. |
| test_samples/otel/src/telemetry.py | Sample OTEL provider configuration and library instrumentation hooks. |
| test_samples/otel/src/start_server.py | Sample aiohttp server startup wiring for an agent endpoint. |
| test_samples/otel/src/main.py | Sample entrypoint wiring telemetry + agent + server. |
| test_samples/otel/src/get_user_info.py | Sample Graph call used by the OTEL demo agent. |
| test_samples/otel/src/card.py | Sample adaptive card rendering helper. |
| test_samples/otel/src/agent.py | Sample agent application demonstrating auth + basic messaging flows. |
| test_samples/otel/src/__init__.py | Sample package marker. |
| test_samples/otel/requirements.txt | Sample runtime dependencies including OTEL instrumentations/exporters. |
| test_samples/otel/env.TEMPLATE | Sample environment configuration template including OTEL env vars. |
| libraries/microsoft-agents-hosting-core/setup.py | Adds opentelemetry-api / opentelemetry-sdk as hosting-core dependencies. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/utils.py | Telemetry utility helpers for extracting/formatting common activity values. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/core/type_defs.py | Shared callback/attribute type aliases for telemetry layer. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/core/simple_span_wrapper.py | Simple span wrapper implementation built atop the base wrapper + agents telemetry. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/core/resource.py | Defines OTEL Resource and service identity constants. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/core/base_span_wrapper.py | Base span wrapper lifecycle abstraction (context manager + manual start/end). |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/core/_agents_telemetry.py | Tracer/meter access and a timed span context manager with callback support. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/core/__init__.py | Core telemetry public surface exports. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/attributes.py | Telemetry attribute key constants + UNKNOWN value. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/telemetry/__init__.py | Telemetry package exports and design note header. |
| libraries/microsoft-agents-hosting-core/microsoft_agents/hosting/core/app/agent_application.py | Refactors ApplicationError raising blocks into a consistent multiline style. |
| libraries/microsoft-agents-activity/microsoft_agents/activity/otel/__init__.py | Adds initial instrumentor scaffold for activity package OTEL integration. |

    from opentelemetry.sdk.resources import Resource

    SERVICE_NAME = "microsoft_agents"
    SERVICE_VERSION = "1.0.0"

SERVICE_VERSION is hard-coded to "1.0.0", which will drift from the actual installed package version and make traces/metrics misleading. Consider deriving it from distribution metadata (e.g., importlib.metadata.version("microsoft-agents-hosting-core") with a safe fallback), similar to connector/get_product_info.py using importlib.metadata.version(...).

Suggested change:

    - from opentelemetry.sdk.resources import Resource
    - SERVICE_NAME = "microsoft_agents"
    - SERVICE_VERSION = "1.0.0"
    + from importlib import metadata as importlib_metadata
    +
    + from opentelemetry.sdk.resources import Resource
    +
    + SERVICE_NAME = "microsoft_agents"
    +
    + def _get_service_version() -> str:
    +     """Return the installed package version, or a safe fallback."""
    +     try:
    +         return importlib_metadata.version("microsoft-agents-hosting-core")
    +     except importlib_metadata.PackageNotFoundError:
    +         # Package metadata not available (e.g., editable install or non-standard env).
    +         return "unknown"
    +     except Exception:
    +         # Any other unexpected error: do not break telemetry initialization.
    +         return "unknown"
    +
    + SERVICE_VERSION = _get_service_version()

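The fallback path of a helper like the one suggested above can be exercised without the real package installed. The following is a minimal sketch using only the standard library; the function name `get_version_or_unknown` and the distribution name passed to it are hypothetical, chosen so that the lookup fails.

```python
from importlib import metadata as importlib_metadata


def get_version_or_unknown(distribution: str, fallback: str = "unknown") -> str:
    """Return the installed version of `distribution`, or `fallback` if it is not installed."""
    try:
        return importlib_metadata.version(distribution)
    except importlib_metadata.PackageNotFoundError:
        # No metadata for this distribution in the current environment.
        return fallback


# Hypothetical name that should not be installed anywhere, so the fallback is used.
missing_version = get_version_or_unknown("definitely-not-an-installed-distribution-xyz")
print(missing_version)
```

Passing the distribution name as a parameter keeps the helper testable, since a test can request a name that is known to be absent.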

    # This design hides the "mess" of telemetry to one location rather than throughout the codebase.
    #
    # NOTE: this module should not be auto-loaded from __init__.py in order to avoid

The module header note is incomplete (...in order to avoid) and doesn’t explain what should be avoided. Please complete or remove the comment so future readers aren’t left with a dangling design note.

Suggested change:

    + # unintended side effects during import and to keep telemetry initialization explicit.


    try:
        run_app(APP, host="localhost", port=environ.get("PORT", 3978))
    except Exception as error:
        raise error

run_app(..., port=environ.get("PORT", 3978)) will pass a string when PORT is set, but aiohttp expects an int port. Cast the env var to int (and handle invalid values) to avoid runtime failures. Also the surrounding try/except that immediately re-raises (raise error) is redundant and can be dropped (or use bare raise if you intend to add logging).

Suggested change:

    - try:
    -     run_app(APP, host="localhost", port=environ.get("PORT", 3978))
    - except Exception as error:
    -     raise error
    + port_env = environ.get("PORT")
    + port = 3978
    + if port_env is not None:
    +     try:
    +         port = int(port_env)
    +     except ValueError:
    +         logger.warning(
    +             "Invalid PORT environment variable %r; falling back to default port %d",
    +             port_env,
    +             port,
    +         )
    + run_app(APP, host="localhost", port=port)

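The same port handling can also be factored into a small helper that is easy to unit test. This is a sketch under stated assumptions, not the sample's actual code: `resolve_port` and `DEFAULT_PORT` are hypothetical names, the range check is an addition beyond the suggestion, and the block defines its own `logger` because the excerpt does not show one.

```python
import logging
from typing import Mapping

logger = logging.getLogger(__name__)

DEFAULT_PORT = 3978  # default taken from the sample; the constant name is hypothetical


def resolve_port(env: Mapping[str, str], default: int = DEFAULT_PORT) -> int:
    """Read PORT from `env`, falling back to `default` when it is unset or invalid."""
    raw = env.get("PORT")
    if raw is None:
        return default
    try:
        port = int(raw)
    except ValueError:
        logger.warning("Invalid PORT environment variable %r; using default %d", raw, default)
        return default
    if not 0 < port < 65536:
        # Extra guard (not in the suggestion): reject values outside the TCP port range.
        logger.warning("PORT %d is out of range; using default %d", port, default)
        return default
    return port
```

With this helper the call site would reduce to `run_app(APP, host="localhost", port=resolve_port(os.environ))`, and the parsing logic can be tested with plain dictionaries instead of the process environment.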

    configure_otel_providers(service_name="quickstart_agent")

    from .agent import AGENT_APP, CONNECTION_MANAGER
    from .start_server import start_server

    start_server(
        agent_application=AGENT_APP,
        auth_configuration=CONNECTION_MANAGER.get_default_connection_configuration(),
    )

configure_otel_providers(...) is executed at import time and the module performs imports after side effects. This makes the sample harder to reuse/import (and violates standard import ordering). Consider moving the setup + server start into a main() function guarded by if __name__ == "__main__": so importing the module doesn’t automatically configure global OTEL providers and start a server.

Suggested change:

    - configure_otel_providers(service_name="quickstart_agent")
    - from .agent import AGENT_APP, CONNECTION_MANAGER
    - from .start_server import start_server
    - start_server(
    -     agent_application=AGENT_APP,
    -     auth_configuration=CONNECTION_MANAGER.get_default_connection_configuration(),
    - )
    + from .agent import AGENT_APP, CONNECTION_MANAGER
    + from .start_server import start_server
    +
    + def main() -> None:
    +     configure_otel_providers(service_name="quickstart_agent")
    +     start_server(
    +         agent_application=AGENT_APP,
    +         auth_configuration=CONNECTION_MANAGER.get_default_connection_configuration(),
    +     )
    +
    + if __name__ == "__main__":
    +     main()


            assert otel_span is not None

        def test_otel_span_raises_when_not_started(self):
            """Accessing otel_span before start raises RuntimeError."""

The docstring says otel_span access "raises RuntimeError" when not started, but the assertion checks that it is None. Update the docstring (or the behavior) so they match.

Suggested change:

    -         """Accessing otel_span before start raises RuntimeError."""
    +         """Accessing otel_span before start returns None."""

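The test name in the excerpt (`test_otel_span_raises_when_not_started`) also says "raises", so renaming it alongside the docstring would keep name, docstring, and assertion aligned. The sketch below illustrates that shape with a hypothetical stand-in class, `FakeSpanWrapper`; it is not the PR's `SimpleSpanWrapper`, whose implementation is not shown here, and the renamed test functions are assumptions.

```python
from typing import Optional


class FakeSpanWrapper:
    """Hypothetical stand-in for a span wrapper; not the PR's implementation."""

    def __init__(self) -> None:
        self._span: Optional[object] = None

    @property
    def otel_span(self) -> Optional[object]:
        # Returns None until start() has been called.
        return self._span

    def start(self) -> None:
        # A real wrapper would start an OpenTelemetry span here.
        self._span = object()


def test_otel_span_is_none_when_not_started() -> None:
    """Accessing otel_span before start returns None."""
    wrapper = FakeSpanWrapper()
    assert wrapper.otel_span is None


def test_otel_span_is_set_after_start() -> None:
    """Accessing otel_span after start returns the underlying span."""
    wrapper = FakeSpanWrapper()
    wrapper.start()
    assert wrapper.otel_span is not None
```

If the intended behavior is instead to raise, the assertion would change to `pytest.raises(RuntimeError)` and the original docstring could stay as written.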
This pull request introduces a new telemetry subsystem for the `microsoft-agents-hosting-core` package, providing a structured way to instrument the codebase with OpenTelemetry spans and metrics. The design centralizes telemetry logic in one place, which makes it easier to manage and extend and avoids scattering telemetry code throughout the application. It also adds the OpenTelemetry dependencies and reformats `ApplicationError` raising for readability.

Key changes include:

Telemetry Subsystem Implementation
- Added a new `telemetry` module, including core components like `agents_telemetry`, span wrappers (`BaseSpanWrapper`, `SimpleSpanWrapper`), attribute definitions, resource configuration, and utility functions for extracting telemetry-relevant data. This subsystem enables structured, consistent telemetry instrumentation throughout the codebase.
- Added a design note in `telemetry/__init__.py` explaining the rationale for centralizing telemetry logic and not auto-loading the module to avoid unnecessary overhead.

Dependency Management
- Updated `setup.py` to add `opentelemetry-api` and `opentelemetry-sdk` as required dependencies for telemetry support.

Error Handling Improvements
- Updated `agent_application.py` to use consistent multi-line string formatting with `ApplicationError`, improving readability and maintainability.

Sample/Test Configuration
- Added a `.env` template in `test_samples/otel/env.TEMPLATE` for configuring OpenTelemetry exporters and related environment variables, facilitating local testing and deployment of telemetry features.
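To illustrate the idea of a timed span context manager with callback support, here is a minimal standard-library sketch. It is not the code in `_agents_telemetry.py`, which is not shown on this page and builds on the OpenTelemetry tracer; `timed_block`, `SpanCallback`, and the callback signature are all hypothetical.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical callback signature: (block name, duration in ms, error or None).
SpanCallback = Callable[[str, float, Optional[Exception]], None]


@contextmanager
def timed_block(name: str, callback: Optional[SpanCallback] = None) -> Iterator[None]:
    """Time the wrapped block and report the outcome to `callback` on exit."""
    start = time.perf_counter()
    error: Optional[Exception] = None
    try:
        yield
    except Exception as exc:
        # Remember the failure for the callback, then let it propagate.
        error = exc
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000.0
        if callback is not None:
            callback(name, duration_ms, error)


# Usage: collect one record per timed block.
records: List[Tuple[str, float, Optional[Exception]]] = []
with timed_block("demo", lambda n, ms, err: records.append((n, ms, err))):
    sum(range(1000))
```

Keeping the timing and callback logic in one context manager is one way to keep instrumentation out of business code, which matches the centralization goal stated in the description.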