AzureOpenAI/Foundry: promote x-ms-served-model header into Response.model for the Responses API (parity with OpenAI/Azure Chat Completions)

## Confirm this is a feature request for the Python library and not the underlying OpenAI API

- This is a feature request for the Python library

## Describe the feature or improvement you're looking for

For Azure OpenAI clients (`AzureOpenAI` / `AsyncAzureOpenAI`), please promote the value of the `x-ms-served-model` response header into `Response.model` (and into `event.response.model` for streamed `response.*` events) when calling the Responses API.

This would bring `AzureOpenAI` in line with OpenAI's own behavior on every other endpoint.

### Background

OpenAI's own (non-Azure) endpoints already return the **dated snapshot** of the model that served the request, regardless of the alias sent in the request. Empirically (today, via `openai` 1.x):

| Sent `model` (alias) | `Response.model` from OpenAI |
| --- | --- |
| `gpt-4o-mini` | `gpt-4o-mini-2024-07-18` |
| `gpt-4o` | `gpt-4o-2024-08-06` |
| `gpt-5-nano` | `gpt-5-nano-2025-08-07` |

The same is true on **Azure Chat Completions** — the body already returns the snapshot — and (as a courtesy) Azure surfaces the deployment alias in `x-ms-deployment-name`:

```text
[Chat Completions sent=gpt-5-nano]
  body.model            = 'gpt-5-nano-2025-08-07'   ✅ snapshot
  x-ms-served-model     = None
  x-ms-deployment-name  = 'gpt-5-nano'
```

But on the **Azure Responses API** specifically, the body carries the deployment alias and the snapshot is moved into the `x-ms-served-model` response header (not exposed by the SDK today):

```text
[Responses sent=gpt-5-nano, non-streaming]
  body.model            = 'gpt-5-nano'              ❌ alias, not snapshot
  x-ms-served-model     = 'gpt-5-nano-2025-08-07'
  x-ms-deployment-name  = (not sent on Responses)

[Responses sent=gpt-5-nano, streaming]
  body.model (per event)= 'gpt-5-nano'              ❌
  x-ms-served-model     = 'gpt-5-nano-2025-08-07'
```

So Azure Responses is the outlier among all four code paths.

## Additional context

### Why this matters

Downstream consumers (telemetry, eval, RAG/agent frameworks, etc.) read `Response.model` to record _which model actually served the request_ — to attribute traces, costs, regressions, and behavior changes to specific dated snapshots. With the current SDK behavior, Azure Responses users see the deployment alias only, which can mask:

- Auto-rollouts when an alias deployment is bumped from one snapshot to a newer one.
- Spillover routing (`x-ms-is-spilled-over: true`) where a different snapshot serves the request.
- Regional differences in which snapshot backs the same alias.

### Proposed change (behavioral)

In `openai/lib/azure.py`, in `BaseAzureClient` (and async equivalent), add response post-processing such that:

- For `/responses` endpoint responses, when the `x-ms-served-model` response header is present and non-empty, replace `Response.model` with that value before the SDK returns the parsed object.
- For streaming `responses.create(stream=True, ...)` and `responses.retrieve(..., stream=True)`, do the same on every event whose `event.response.model` is currently set (e.g. `response.created`, `response.in_progress`, `response.completed`, ...) so that the value is consistent across the lifetime of the stream.
- A non-empty / stripped guard on the header keeps the SDK robust to empty values.

We deliberately suggest the **behavioral** fix (overwrite `Response.model`) rather than a backward-compatible additive `Response.served_model` field, because:

1. It restores parity with what `Response.model` already means on OpenAI's own endpoints and on Azure Chat Completions: _the snapshot of the model that actually served the request_.
2. It avoids forcing every framework / telemetry library to learn an Azure-specific second field; today they read `response.model` and that should "just work".
3. The Azure-side mapping from alias to snapshot is **only ever known on the response**, so promoting the header at the SDK boundary is the natural place to do it (callers can still get the alias they sent from their own request options).

### Impact / breakage analysis

The only consumers that would observe a behavior change are those who:

- Use `AzureOpenAI`, AND
- Call the Responses API specifically, AND
- Read `Response.model` expecting the deployment alias.

For those callers, the deployment alias they sent is already known on their side (it's the `model` they passed in). Azure also still surfaces it in `x-ms-deployment-name` for Chat Completions — and could trivially start sending the same header on Responses if needed for parity.

### Reference implementation in a third-party framework

We've shipped a local fix in [microsoft/agent-framework#5910](https://github.com/microsoft/agent-framework/pull/5910) that does exactly this at the framework level (because we cannot wait on an SDK release). Doing this in `openai-python` would let us drop that workaround and benefit every Azure customer of the Responses API uniformly.


Sent `model` (alias)	`Response.model` from OpenAI
`gpt-4o-mini`	`gpt-4o-mini-2024-07-18`
`gpt-4o`	`gpt-4o-2024-08-06`
`gpt-5-nano`	`gpt-5-nano-2025-08-07`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AzureOpenAI/Foundry: promote x-ms-served-model header into Response.model for the Responses API (parity with OpenAI/Azure Chat Completions) #3271

Confirm this is a feature request for the Python library and not the underlying OpenAI API

Describe the feature or improvement you're looking for

Background

Additional context

Why this matters

Proposed change (behavioral)

Impact / breakage analysis

Reference implementation in a third-party framework

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

AzureOpenAI/Foundry: promote x-ms-served-model header into Response.model for the Responses API (parity with OpenAI/Azure Chat Completions) #3271

Description

Confirm this is a feature request for the Python library and not the underlying OpenAI API

Describe the feature or improvement you're looking for

Background

Additional context

Why this matters

Proposed change (behavioral)

Impact / breakage analysis

Reference implementation in a third-party framework

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions