Chat Completions model should reject unsupported server-managed state #3275

@Aphroq

Description

Please read this first

  • Have you read the docs? Yes. The OpenAI migration docs state that Chat Completions callers must store and manage conversation context themselves, while previous_response_id is a Responses API state strategy.
  • Have you searched for related issues? Yes. I searched upstream issues and PRs for ChatCompletions previous_response_id conversation_id and did not find a direct duplicate.

Describe the bug

OpenAIChatCompletionsModel.get_response() and stream_response() accept previous_response_id and conversation_id, but the Chat Completions request does not send those parameters. The returned ModelResponse.response_id is also None.

If a higher-level server-managed conversation configuration disables local session persistence, callers may believe server-side context is active when in fact Chat Completions receives only the current input.
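For context, Chat Completions has no server-side conversation state at all: the caller must resend the full message history on every turn. A minimal sketch of that contract (`fake_chat_completions_call` is a hypothetical stand-in for the real API call):

```python
# Sketch: with Chat Completions the caller owns the history and must resend
# all of it each turn. A real call would be
# client.chat.completions.create(model=..., messages=messages).
def fake_chat_completions_call(messages):
    # Stand-in for the API: the model only ever sees what is in `messages`.
    return {"role": "assistant", "content": f"reply to {len(messages)} messages"}

history = [{"role": "user", "content": "first turn"}]
reply = fake_chat_completions_call(history)
history.append(reply)

# The second turn must include everything so far, not just the new message.
history.append({"role": "user", "content": "second turn"})
reply = fake_chat_completions_call(history)
history.append(reply)
```

Passing only `"second turn only"` as in the repro below therefore cannot be augmented server-side, no matter what `previous_response_id` is supplied.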

Debug information

  • Agents SDK version: main@683b6e7
  • Python version: Python 3.12.1

Repro steps

Run this from the repository root:

import asyncio

import httpx
from openai.types.chat.chat_completion import ChatCompletion, Choice
from openai.types.chat.chat_completion_message import ChatCompletionMessage

from agents import ModelSettings, ModelTracing, OpenAIChatCompletionsModel


class DummyCompletions:
    """Records the kwargs actually sent to chat.completions.create."""

    def __init__(self):
        self.kwargs = None

    async def create(self, **kwargs):
        self.kwargs = kwargs
        return ChatCompletion(
            id="chatcmpl_123",
            created=0,
            model="fake",
            object="chat.completion",
            choices=[
                Choice(
                    index=0,
                    finish_reason="stop",
                    message=ChatCompletionMessage(role="assistant", content="ok"),
                )
            ],
        )


class DummyClient:
    def __init__(self, completions):
        self.chat = type("Chat", (), {"completions": completions})()
        self.base_url = httpx.URL("https://api.openai.com/v1/")


async def main():
    completions = DummyCompletions()
    model = OpenAIChatCompletionsModel(
        model="gpt-4",
        openai_client=DummyClient(completions),
    )
    await model.get_response(
        system_instructions=None,
        input="second turn only",
        model_settings=ModelSettings(),
        tools=[],
        output_schema=None,
        handoffs=[],
        tracing=ModelTracing.DISABLED,
        previous_response_id="resp_123",
        conversation_id=None,
        prompt=None,
    )
    print("previous_response_id sent:", "previous_response_id" in completions.kwargs)


asyncio.run(main())

Actual result on current main:

previous_response_id sent: False

No error is raised.

Expected behavior

The Chat Completions backend should fail fast when callers pass previous_response_id or conversation_id, since it cannot honor server-managed conversation state. Callers should pass full conversation history to Chat Completions or use a Responses/conversation-capable backend.
