Python: read response headers defensively to support stream wrappers without `.headers` (#6028) by EmilienMottet · Pull Request #6029 · microsoft/agent-framework

EmilienMottet · 2026-05-22T11:09:11Z

Summary

Since #5910, OpenAIChatClient._inner_get_response() reads .headers on the raw response returned by client.responses.with_raw_response.create/retrieve(...) to surface the x-ms-served-model Azure header. When azure-ai-projects experimental GenAI tracing is enabled (AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true), the instrumentor wraps the raw streaming response in an AsyncStreamWrapper (defined inline at azure/ai/projects/telemetry/_responses_instrumentor.py:2929) that exposes .response / .stream_async_iter but not .headers. Reading raw_create_response.headers then raises:

AttributeError: 'AsyncStreamWrapper' object has no attribute 'headers'

FoundryChatClient rethrows this as a ChatClientException and every streaming call breaks (workflows + free chat) as soon as you upgrade to agent-framework-openai>=1.6.0 with experimental Azure tracing on.

Fix

Read headers defensively at all four call sites in agent_framework_openai/_chat_client.py:

served_model = self._extract_served_model(getattr(raw_response, "headers", None))

_extract_served_model() (line 739) already short-circuits on None, so the served-model surfacing degrades gracefully — update.model falls back to the deployment alias instead of crashing the call. No behavior change when .headers is present.

The four call sites covered:

_chat_client.py:639 — raw_stream_response (retrieve-streaming, continuation token)
_chat_client.py:680 — raw_create_response (streaming create)
_chat_client.py:709 — raw_response (non-streaming retrieve)
_chat_client.py:731 — raw_response (non-streaming create)

Test plan

New regression test test_streaming_response_without_headers_attribute_does_not_crash simulates a stream wrapper that raises AttributeError on .headers, mirroring the azure-ai-projects instrumentor behavior. Asserts the stream completes and update.model falls back to the deployment alias.
Manual repro in M365 Agents Playground against gpt-5.1-dzs and gpt-5.5-dzs Azure deployments confirms the crash disappears with this patch (validated independently of any local workarounds).
Existing tests test_served_model_header_* already cover the happy path (headers present) and are not touched by this change.

Notes for reviewers

The defensive getattr is preferred over a try/except AttributeError block to avoid swallowing unrelated attribute errors that might surface from response model changes.
An alternative would be to make the instrumentor wrapper expose .headers, but that lives in azure-ai-projects (separate repo) and only this side can land quickly to unblock users on the 1.6.0 train.
Documented as experimental by Microsoft: "GenAI tracing instrumentation is an experimental preview feature" — so silently dropping the served-model surfacing in this niche is consistent with the broader contract.

EmilienMottet · 2026-05-22T11:17:09Z

@microsoft-github-policy-service agree company=Michelin

….headers` (microsoft#6028) `OpenAIChatClient._inner_get_response()` reads `.headers` on the raw streaming response returned by `client.responses.with_raw_response.create(stream=True)` (and its three sibling call sites - retrieve-streaming, non-streaming create and background retrieve) to surface the `x-ms-served-model` Azure header, introduced in microsoft#5910. When `azure-ai-projects>=2.1.0` experimental GenAI tracing is enabled (`AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true`), the instrumentor wraps the raw streaming response in an inline `AsyncStreamWrapper` that exposes `.response` but not `.headers`. Reading `raw_create_response.headers` then raises `AttributeError: 'AsyncStreamWrapper' object has no attribute 'headers'`, which `FoundryChatClient` rethrows as a `ChatClientException` and breaks every streaming call (workflows and free chat). Fix: read the header dict via `getattr(raw_response, "headers", None)` at all four call sites. `_extract_served_model()` already short-circuits on `None`, so the served-model surfacing degrades gracefully (model stays the deployment alias) instead of crashing when the response is wrapped by an instrumentor that does not proxy `.headers`. Regression test added: `test_streaming_response_without_headers_attribute_does_not_crash` simulates a stream wrapper that raises `AttributeError` on `.headers` and asserts the stream still completes with the deployment alias as `update.model`. Fixes microsoft#6028

jluocsa

Nice catch. The getattr(raw_response, "headers", None) choice is correct, and your description-rationale for preferring it over try/except AttributeError (so unrelated attribute errors still surface) is the right call. The _StreamWrapperWithoutHeaders mock is faithful to the real azure.ai.projects.telemetry._responses_instrumentor.AsyncStreamWrapper — the explicit assert not hasattr(headerless_stream, "headers") sanity check is a nice belt-and-suspenders touch that will catch regressions if anyone later switches the production code to a try/except Exception: and accidentally swallows unrelated errors.

Two non-blocking thoughts you might consider for a small follow-up — happy if neither lands:

1. Hoist the defensive read so the rationale lives in one place

Right now four call sites repeat the same pattern, with three of them carrying a # See note above on raw_stream_response.headers. reference back to the first site. If a later refactor moves or splits the first comment, the other three orphan. A tiny helper would DRY this up:

def _safe_extract_served_model(self, raw_response: object) -> str | None:
    """Read ``x-ms-served-model`` defensively. Telemetry instrumentors
    (e.g. ``azure-ai-projects`` experimental tracing) wrap streaming
    responses in objects that don't proxy ``.headers``; degrade gracefully.
    """
    return self._extract_served_model(getattr(raw_response, "headers", None))

Each call site then collapses back to one line with no inline comment needed:

served_model = self._safe_extract_served_model(raw_stream_response)

Trade-off: a small extra method on the class vs. single-source-of-truth for the defensive read. If _extract_served_model(headers) is intentionally kept narrow ("headers mapping → str | None", no knowledge of response objects), the current inline approach is also defensible — purely your call.

2. Make the test assertion's intent explicit

assert update.model == "test-model" near the end of test_streaming_response_without_headers_attribute_does_not_crash proves the graceful-degradation path (deployment alias preserved when the served-model header is missing), not just that the stream doesn't crash. Worth one line of comment so a future reader doesn't mistake it for a plain "model name round-trip" assertion:

for update in updates:
    # No header → no override → model stays the deployment alias (graceful degradation, not crash).
    assert update.model == "test-model"

LGTM either way — this unblocks 1.6.0 users on experimental Azure tracing today, the test is a real regression test (not a smoke test), and the design rationale for getattr vs try/except in the PR description is exactly right.

EmilienMottet force-pushed the fix/openai-asyncstreamwrapper-headers-attr branch from b194ac4 to 758e1a2 Compare May 22, 2026 11:16

moonbox3 added the python label May 22, 2026

EmilienMottet force-pushed the fix/openai-asyncstreamwrapper-headers-attr branch from 758e1a2 to ad7b9ff Compare May 22, 2026 12:25

jluocsa reviewed May 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: read response headers defensively to support stream wrappers without `.headers` (#6028)#6029

Python: read response headers defensively to support stream wrappers without `.headers` (#6028)#6029
EmilienMottet wants to merge 1 commit into
microsoft:mainfrom
EmilienMottet:fix/openai-asyncstreamwrapper-headers-attr

EmilienMottet commented May 22, 2026

Uh oh!

EmilienMottet commented May 22, 2026 •

edited

Loading

Uh oh!

jluocsa left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

EmilienMottet commented May 22, 2026

Summary

Fix

Test plan

Notes for reviewers

Uh oh!

EmilienMottet commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jluocsa left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

EmilienMottet commented May 22, 2026 •

edited

Loading