Skip to content

fix(azure): promote x-ms-served-model header into Response.model for Responses API#3293

Open
xodn348 wants to merge 1 commit into
openai:mainfrom
xodn348:fix/azure-responses-serve-model
Open

fix(azure): promote x-ms-served-model header into Response.model for Responses API#3293
xodn348 wants to merge 1 commit into
openai:mainfrom
xodn348:fix/azure-responses-serve-model

Conversation

@xodn348
Copy link
Copy Markdown

@xodn348 xodn348 commented May 21, 2026

Summary

  • Adds _process_response overrides to both AzureOpenAI and AsyncAzureOpenAI that promote the x-ms-served-model response header into Response.model for Responses API calls.
  • Mirrors the existing behaviour for Chat Completions (where Azure already rewrites model from the header) so callers can see the actually-deployed model name rather than the logical alias.
  • Zero impact on non-/responses paths and on responses that lack the header.

Issue

Fixes #3271 — when using AzureOpenAI.responses.create(...) the returned Response.model reflects the requested model alias (e.g. "gpt-4o") rather than the precise deployment actually served (e.g. "gpt-4o-2024-11-20"). Azure surfaces the real model name in the x-ms-served-model HTTP response header; this PR reads that header and writes it back into Response.model.

Changes

src/openai/lib/azure.py

  • Added Type to the typing imports.
  • Imported ResponseT from .._types.
  • Added AzureOpenAI._process_response (sync): calls super(), then if the URL starts with /responses and x-ms-served-model is non-empty, patches result.model (non-streaming) or wraps stream._iterator to patch event.response.model for every event that carries a response: Response field (streaming).
  • Added AsyncAzureOpenAI._process_response (async): identical logic using await super() and an async generator.

tests/lib/test_azure.py

  • Added import for openai.types.responses.response.Response.
  • Added five focused unit tests covering: header promoted (sync), header absent (sync), non-/responses endpoint not affected, header promoted (async), header absent (async).

Local verification

=== LOCAL_TEST_PASSED ===
$ python -m pytest tests/lib/test_azure.py -v
...
collected 64 items

tests/lib/test_azure.py::test_implicit_deployment_path[client0] PASSED
tests/lib/test_azure.py::test_implicit_deployment_path[client1] PASSED
tests/lib/test_azure.py::test_client_copying[sync_client-copy] PASSED
tests/lib/test_azure.py::test_client_copying[sync_client-with_options] PASSED
tests/lib/test_azure.py::test_client_copying[async_client-copy] PASSED
tests/lib/test_azure.py::test_client_copying[async_client-with_options] PASSED
...
tests/lib/test_azure.py::test_azure_responses_x_ms_served_model_promoted PASSED
tests/lib/test_azure.py::test_azure_responses_x_ms_served_model_absent PASSED
tests/lib/test_azure.py::test_azure_non_responses_endpoint_not_affected PASSED
tests/lib/test_azure.py::test_async_azure_responses_x_ms_served_model_promoted PASSED
tests/lib/test_azure.py::test_async_azure_responses_x_ms_served_model_absent PASSED
...
64 passed in 3.44s
=== LOCAL_TEST_PASSED ===

Risk

Low. The override only activates when:

  1. The URL path starts with /responses (exact SDK-generated path for the Responses resource), AND
  2. The x-ms-served-model header is present and non-empty.

All other paths and clients are untouched. Response uses a mutable Pydantic BaseModel (extra="allow"), so field assignment is safe. The streaming wrapper is a thin generator replacement on _iterator; it does not touch transport or parsing.

@xodn348 xodn348 requested a review from a team as a code owner May 21, 2026 11:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AzureOpenAI/Foundry: promote x-ms-served-model header into Response.model for the Responses API (parity with OpenAI/Azure Chat Completions)

1 participant