Python: Streaming path: Chat span is sibling of invoke_agent span instead of child (broken trace context propagation) #5528

## Bug Description

When running a hosted agent sample with tracing enabled, the **Chat** span created by `ChatTelemetryLayer` appears as a **sibling** of the inner `invoke_agent` span instead of being a **child**. This breaks the expected parent-child hierarchy in distributed traces.

**Expected trace hierarchy:**

```
invoke_agent (outer - hosting)
  └── invoke_agent (inner - AgentTelemetryLayer)
        └── chat gpt-4.1 (ChatTelemetryLayer)
```

**Actual trace hierarchy:**

```
invoke_agent (outer - hosting)
  ├── invoke_agent (inner - AgentTelemetryLayer)
  └── chat gpt-4.1 (ChatTelemetryLayer)   ← sibling, not child
```

## Root Cause

In `agent_framework/observability.py`, the **streaming path** of `AgentTelemetryLayer` (around lines 1545-1561) calls `execute()` **before** the agent span is created:

```python
if stream:
    run_result = execute()                    # line 1547 — triggers ChatTelemetryLayer, which creates the Chat span
    # ...
    span = get_tracer().start_span(...)       # line 1561 — agent span created AFTER execute()
```

Because the agent span doesn't exist yet when `execute()` runs, the Chat span attaches to whatever span is current at that moment (the outer hosting `invoke_agent` span) and ends up as a sibling of the inner agent span.
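The ordering problem can be reproduced without the framework. The following is a minimal stdlib-only sketch, not the framework's code and not the OpenTelemetry API: `Span`, `start_span`, and `streaming_run_as_reported` are hypothetical stand-ins, and a `contextvars.ContextVar` plays the role of the "current span" slot.

```python
import contextvars
from dataclasses import dataclass
from typing import Optional

# Toy stand-in for the "current span" slot; NOT the OpenTelemetry API.
_current = contextvars.ContextVar("current_span", default=None)


@dataclass
class Span:
    name: str
    parent: Optional["Span"] = None


def start_span(name: str) -> Span:
    """Parent is whichever span is current now; the new span is not made current."""
    return Span(name, parent=_current.get())


def streaming_run_as_reported(outer: Span):
    """Mimics the reported ordering: chat span first, agent span second."""
    token = _current.set(outer)                      # hosting layer's span is current
    try:
        chat = start_span("chat")                    # execute() runs first
        inner = start_span("invoke_agent (inner)")   # agent span created afterwards
    finally:
        _current.reset(token)
    return inner, chat


outer = Span("invoke_agent (outer)")
inner, chat = streaming_run_as_reported(outer)
print(chat.parent.name)   # invoke_agent (outer)
print(inner.parent.name)  # invoke_agent (outer)
```

Both spans end up with the outer span as parent, which matches the "actual" hierarchy shown above.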

Compare with the **non-streaming path** (lines 1629-1668), which correctly wraps `execute()` inside `_get_span()`:

```python
with _get_span(...) as span:            # agent span created FIRST with trace.use_span()
    response = await execute()            # Chat span correctly inherits parent context
```

Additionally, `ChatTelemetryLayer`'s streaming path (line 1299) creates its span using `get_tracer().start_span()` without `trace.use_span()`, so even if the agent span existed, context would not propagate to any deeper children.

The code comment at lines 1293-1296 explains this is intentional to avoid "Failed to detach context" errors from OpenTelemetry when streaming spans are closed asynchronously. However, a fix could:
1. Create the agent span **before** calling `execute()`
2. Temporarily attach context during `execute()` using `trace.use_span(end_on_exit=False)` or manual context token attach/detach

## Steps to Reproduce

1. Clone the foundry-samples repo and navigate to the basic responses sample:
   `samples/python/hosted-agents/deploy-test-agents/agent-framework/responses/01-basic`
   (https://github.com/microsoft/foundry-samples/tree/main/samples/python/hosted-agents/deploy-test-agents/agent-framework/responses/01-basic)
2. Deploy and run the sample (which calls `enable_instrumentation(enable_sensitive_data=True)`)
3. Send a request to the agent
4. Inspect the generated trace — the Chat span will be a sibling of the inner invoke_agent span
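For step 4, if the trace is available as a flat list of span records, a small helper can print the hierarchy for comparison against the expected tree. This is a sketch only: `render_tree` and the record keys `name`, `span_id`, and `parent_id` are hypothetical and would need to be mapped to whatever the trace backend actually exports.

```python
from collections import defaultdict


def render_tree(spans):
    """Return indented lines for span records with hypothetical keys
    'name', 'span_id', and 'parent_id' (None for a root span)."""
    known_ids = {s["span_id"] for s in spans}
    children = defaultdict(list)
    for s in spans:
        # Treat spans whose parent is missing from the export as roots.
        parent = s["parent_id"] if s["parent_id"] in known_ids else None
        children[parent].append(s)

    lines = []

    def walk(parent_id, depth):
        for s in children[parent_id]:
            lines.append("  " * depth + s["name"])
            walk(s["span_id"], depth + 1)

    walk(None, 0)
    return lines


# Records shaped like the "actual" hierarchy reported in this issue.
observed = [
    {"name": "invoke_agent (outer)", "span_id": "a", "parent_id": None},
    {"name": "invoke_agent (inner)", "span_id": "b", "parent_id": "a"},
    {"name": "chat gpt-4.1", "span_id": "c", "parent_id": "a"},
]
print("\n".join(render_tree(observed)))
```

With the bug present, the chat span is printed at the same indentation as the inner `invoke_agent` span rather than one level deeper.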

## Environment

- `agent-framework` version: 1.2.0
- `agent-framework-core` version: 1.2.0
- Python 3.10+

