Python: read response headers defensively to support stream wrappers without .headers (#6028)#6029
Conversation
b194ac4 to
758e1a2
Compare
|
@microsoft-github-policy-service agree company=Michelin |
….headers` (microsoft#6028) `OpenAIChatClient._inner_get_response()` reads `.headers` on the raw streaming response returned by `client.responses.with_raw_response.create(stream=True)` (and its three sibling call sites - retrieve-streaming, non-streaming create and background retrieve) to surface the `x-ms-served-model` Azure header, introduced in microsoft#5910. When `azure-ai-projects>=2.1.0` experimental GenAI tracing is enabled (`AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true`), the instrumentor wraps the raw streaming response in an inline `AsyncStreamWrapper` that exposes `.response` but not `.headers`. Reading `raw_create_response.headers` then raises `AttributeError: 'AsyncStreamWrapper' object has no attribute 'headers'`, which `FoundryChatClient` rethrows as a `ChatClientException` and breaks every streaming call (workflows and free chat). Fix: read the header dict via `getattr(raw_response, "headers", None)` at all four call sites. `_extract_served_model()` already short-circuits on `None`, so the served-model surfacing degrades gracefully (model stays the deployment alias) instead of crashing when the response is wrapped by an instrumentor that does not proxy `.headers`. Regression test added: `test_streaming_response_without_headers_attribute_does_not_crash` simulates a stream wrapper that raises `AttributeError` on `.headers` and asserts the stream still completes with the deployment alias as `update.model`. Fixes microsoft#6028
758e1a2 to
ad7b9ff
Compare
jluocsa
left a comment
There was a problem hiding this comment.
Nice catch. The getattr(raw_response, "headers", None) choice is correct, and your description-rationale for preferring it over try/except AttributeError (so unrelated attribute errors still surface) is the right call. The _StreamWrapperWithoutHeaders mock is faithful to the real azure.ai.projects.telemetry._responses_instrumentor.AsyncStreamWrapper — the explicit assert not hasattr(headerless_stream, "headers") sanity check is a nice belt-and-suspenders touch that will catch regressions if anyone later switches the production code to a try/except Exception: and accidentally swallows unrelated errors.
Two non-blocking thoughts you might consider for a small follow-up — happy if neither lands:
1. Hoist the defensive read so the rationale lives in one place
Right now four call sites repeat the same pattern, with three of them carrying a # See note above on raw_stream_response.headers. reference back to the first site. If a later refactor moves or splits the first comment, the other three orphan. A tiny helper would DRY this up:
def _safe_extract_served_model(self, raw_response: object) -> str | None:
"""Read ``x-ms-served-model`` defensively. Telemetry instrumentors
(e.g. ``azure-ai-projects`` experimental tracing) wrap streaming
responses in objects that don't proxy ``.headers``; degrade gracefully.
"""
return self._extract_served_model(getattr(raw_response, "headers", None))Each call site then collapses back to one line with no inline comment needed:
served_model = self._safe_extract_served_model(raw_stream_response)Trade-off: a small extra method on the class vs. single-source-of-truth for the defensive read. If _extract_served_model(headers) is intentionally kept narrow ("headers mapping → str | None", no knowledge of response objects), the current inline approach is also defensible — purely your call.
2. Make the test assertion's intent explicit
assert update.model == "test-model" near the end of test_streaming_response_without_headers_attribute_does_not_crash proves the graceful-degradation path (deployment alias preserved when the served-model header is missing), not just that the stream doesn't crash. Worth one line of comment so a future reader doesn't mistake it for a plain "model name round-trip" assertion:
for update in updates:
# No header → no override → model stays the deployment alias (graceful degradation, not crash).
assert update.model == "test-model"LGTM either way — this unblocks 1.6.0 users on experimental Azure tracing today, the test is a real regression test (not a smoke test), and the design rationale for getattr vs try/except in the PR description is exactly right.
Summary
Fixes #6028.
Since #5910,
OpenAIChatClient._inner_get_response()reads.headerson the raw response returned byclient.responses.with_raw_response.create/retrieve(...)to surface thex-ms-served-modelAzure header. Whenazure-ai-projectsexperimental GenAI tracing is enabled (AZURE_EXPERIMENTAL_ENABLE_GENAI_TRACING=true), the instrumentor wraps the raw streaming response in anAsyncStreamWrapper(defined inline atazure/ai/projects/telemetry/_responses_instrumentor.py:2929) that exposes.response/.stream_async_iterbut not.headers. Readingraw_create_response.headersthen raises:FoundryChatClientrethrows this as aChatClientExceptionand every streaming call breaks (workflows + free chat) as soon as you upgrade toagent-framework-openai>=1.6.0with experimental Azure tracing on.Fix
Read headers defensively at all four call sites in
agent_framework_openai/_chat_client.py:_extract_served_model()(line 739) already short-circuits onNone, so the served-model surfacing degrades gracefully —update.modelfalls back to the deployment alias instead of crashing the call. No behavior change when.headersis present.The four call sites covered:
_chat_client.py:639—raw_stream_response(retrieve-streaming, continuation token)_chat_client.py:680—raw_create_response(streaming create)_chat_client.py:709—raw_response(non-streaming retrieve)_chat_client.py:731—raw_response(non-streaming create)Test plan
test_streaming_response_without_headers_attribute_does_not_crashsimulates a stream wrapper that raisesAttributeErroron.headers, mirroring theazure-ai-projectsinstrumentor behavior. Asserts the stream completes andupdate.modelfalls back to the deployment alias.gpt-5.1-dzsandgpt-5.5-dzsAzure deployments confirms the crash disappears with this patch (validated independently of any local workarounds).test_served_model_header_*already cover the happy path (headers present) and are not touched by this change.Notes for reviewers
getattris preferred over atry/except AttributeErrorblock to avoid swallowing unrelated attribute errors that might surface from response model changes..headers, but that lives inazure-ai-projects(separate repo) and only this side can land quickly to unblock users on the 1.6.0 train.