Python: Streaming path: Chat span is sibling of invoke_agent span instead of child (broken trace context propagation) #5528

@singankit

Bug Description

When running a hosted agent sample with tracing enabled, the Chat span created by ChatTelemetryLayer appears as a sibling of the inner invoke_agent span instead of being a child. This breaks the expected parent-child hierarchy in distributed traces.

Expected trace hierarchy:

invoke_agent (outer - hosting)
  └── invoke_agent (inner - AgentTelemetryLayer)
        └── chat gpt-4.1 (ChatTelemetryLayer)

Actual trace hierarchy:

invoke_agent (outer - hosting)
  ├── invoke_agent (inner - AgentTelemetryLayer)
  └── chat gpt-4.1 (ChatTelemetryLayer)   ← sibling, not child

Root Cause

In agent_framework/observability.py, the streaming path of AgentTelemetryLayer (around lines 1545-1561) calls execute() before the agent span is created:

if stream:
    run_result = execute()                    # line 1547 — triggers ChatTelemetryLayer, which creates the Chat span
    # ...
    span = get_tracer().start_span(...)       # line 1561 — agent span created AFTER execute()

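Since the inner agent span doesn't exist yet when execute() runs, the Chat span attaches to whatever span is current at that moment (the outer invoke_agent span) and ends up as a sibling of the inner invoke_agent span rather than its child.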
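The ordering problem can be illustrated with a toy model built only on the standard library's contextvars. This is a sketch under assumptions, not the framework's code or OpenTelemetry's API: ToySpan, start_span, execute, and streaming_order are hypothetical names, and the "current span" slot is a plain ContextVar.

```python
import contextvars
from dataclasses import dataclass
from typing import Optional

# Toy stand-in for the tracer's "current span" slot (not OpenTelemetry's API).
_current = contextvars.ContextVar("current_span", default=None)


@dataclass
class ToySpan:
    name: str
    parent: Optional[str]


def start_span(name: str) -> ToySpan:
    """Create a span whose parent is whatever span is current right now."""
    current = _current.get()
    return ToySpan(name, current.name if current else None)


def execute() -> ToySpan:
    """Hypothetical stand-in for the wrapped call that creates the chat span."""
    return start_span("chat")


def streaming_order():
    """Mirror the reported order: execute() runs before the agent span exists."""
    chat = execute()
    agent = start_span("invoke_agent (inner)")
    return agent, chat


outer = start_span("invoke_agent (outer)")
token = _current.set(outer)  # the hosting layer makes its span current
try:
    agent, chat = streaming_order()
finally:
    _current.reset(token)

# Both spans report the outer span as parent, so they are siblings.
print(agent.parent, "|", chat.parent)
```

In this model the chat span can only pick up the inner agent span as its parent if that span is created and made current before execute() is called.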

Compare with the non-streaming path (lines 1629-1668), which correctly wraps execute() inside _get_span():

with _get_span(...) as span:            # agent span created FIRST with trace.use_span()
    response = await execute()            # Chat span correctly inherits parent context

Additionally, ChatTelemetryLayer's streaming path (line 1299) creates its span using get_tracer().start_span() without trace.use_span(), so the Chat span never becomes the current span. Even if the agent span existed, context would not propagate to any deeper children.

The code comment at lines 1293-1296 explains that this is intentional, to avoid "Failed to detach context" errors from OpenTelemetry when streaming spans are closed asynchronously. However, a fix could:

  1. Create the agent span before calling execute()
  2. Temporarily attach context during execute() using trace.use_span(end_on_exit=False) or manual context token attach/detach

Steps to Reproduce

  1. Clone the foundry-samples repo and navigate to the basic responses sample:
    samples/python/hosted-agents/deploy-test-agents/agent-framework/responses/01-basic
    (https://github.com/microsoft/foundry-samples/tree/main/samples/python/hosted-agents/deploy-test-agents/agent-framework/responses/01-basic)
  2. Deploy and run the sample (which calls enable_instrumentation(enable_sensitive_data=True))
  3. Send a request to the agent
  4. Inspect the generated trace — the Chat span will be a sibling of the inner invoke_agent span

Environment

  • agent-framework version: 1.2.0
  • agent-framework-core version: 1.2.0
  • Python 3.10+
