Summary
Add end-to-end OpenTelemetry tracing so every agent run emits a single correlated trace that spans the agent graph, each LLM turn, and each MCP tool call — with the LiteLLM proxy contributing its own child span per call.
Motivation
Today, per-turn token usage is captured only in AgentResult.trajectory (see DeepAgentRunner._build_trajectory, runner.py:111-161), and LLM-call detail lives only in the proxy logs. Debugging questions like "which turn produced the 14k-token prompt?" require manually cross-referencing timestamps. OTEL gives us one trace per run with flamegraph-style drill-down, works across all four runners uniformly, and is vendor-neutral (any OTLP backend: Jaeger, Tempo, Honeycomb, Grafana Cloud).
Scope
- New package src/observability/ with a tracing.py module exposing:
  - init_tracing(service_name: str) -> None: configures the TracerProvider, OTLP/HTTP exporter, and HTTPXClientInstrumentor (auto-injects traceparent into LiteLLM proxy requests).
  - get_tracer() -> Tracer: convenience accessor used by runners.
  - No-op when OTEL_SDK_DISABLED=true or required env not set, so existing users are unaffected.
- Runner instrumentation using GenAI semantic conventions:
  - Root span agent.run with attrs agent.runner, gen_ai.request.model, gen_ai.system.
  - Child agent.turn spans with turn.index, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens.
  - Child tool.call spans with tool.name, tool.server.
  - All four runners: plan_execute, claude_agent, openai_agent, deep_agent.
- CLI wiring: each entry point (plan-execute, claude-agent, openai-agent, deep-agent) calls init_tracing() once at startup.
- LiteLLM proxy config docs: document callbacks: ["otel"] + OTEL_EXPORTER_OTLP_ENDPOINT so the proxy's spans nest under the agent trace via the propagated traceparent header.
- Tests: unit tests using InMemorySpanExporter to assert the expected span tree per runner.
Dependencies
Added as a new optional group [dependency-groups.otel] so OTEL is opt-in:
- opentelemetry-api
- opentelemetry-sdk
- opentelemetry-exporter-otlp-proto-http
- opentelemetry-instrumentation-httpx
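Because these packages are opt-in, init_tracing needs some way to decide whether to do anything at all. One possible shape for that check is sketched below, using only the standard library. The helper names otel_installed and tracing_enabled are hypothetical. Treating OTEL_EXPORTER_OTLP_ENDPOINT as the "required env" is also an assumption, since the proposal does not say which variables are required.

```python
import importlib.util
import os


def otel_installed(find_spec=importlib.util.find_spec):
    """Return True if the optional opentelemetry packages appear importable."""
    try:
        return find_spec("opentelemetry") is not None
    except (ImportError, ValueError):
        return False


def tracing_enabled(env=None, installed=None):
    """Decide whether init_tracing should configure anything (hypothetical helper).

    Tracing is treated as disabled when the optional packages are missing,
    when OTEL_SDK_DISABLED is "true", or when no OTLP endpoint is configured.
    """
    env = os.environ if env is None else env
    if installed is None:
        installed = otel_installed()
    if not installed:
        return False
    if env.get("OTEL_SDK_DISABLED", "").strip().lower() == "true":
        return False
    # Assumption: the endpoint is the only required setting.
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip())


enabled_example = tracing_enabled(
    {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318"}, installed=True
)
disabled_example = tracing_enabled(
    {"OTEL_SDK_DISABLED": "true", "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4318"},
    installed=True,
)
```

With a guard like this, init_tracing could return early when tracing_enabled() is false, so a user who never installs the otel group would see no change in behaviour.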
Out of scope
- Metrics (counters/histograms) — tracing first, metrics in a follow-up.
- Replacing the existing Trajectory serialization: trajectories stay as the in-process result object.
- LangSmith or Langfuse integration — OTEL-only for now; these can be added as additional exporters later if desired.