[FEAT] Add stateful responses layer with history rehydration and DB persistence #21
maralbahari wants to merge 16 commits into
Conversation
Signed-off-by: maral <maralbahari.98@gmail.com>
Co-authored-by: Tan Jia Huei <tanjiahuei@gmail.com>
Co-authored-by: noobHappylife <aratar1991@hotmail.com>
Co-authored-by: Claude
noobHappylife
left a comment
For the test cassettes, should we also use a model with reasoning?
```python
    default="sqlite+aiosqlite:///./agentic_api.db",
    description="SQLAlchemy async database URL.",
)
db_dialect: str = Field(
```
Do we need this? We should be able to tell whether it's SQLite or Postgres from the db_url already.
This covers the case where the user has created a hosted Postgres database and passes its URL.
Yeah, so I meant the db_dialect can be derived from db_url directly so we don't need to set it manually?
Yeah, but we need to make sure they pass the postgres scheme along with it too, since a URL might only contain a host and port. So I guess the url field needs a validation check that the scheme is postgresql://.
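A scheme check along those lines could be sketched with the stdlib only (`derive_dialect` and the allow-list are hypothetical names for illustration, not code from this PR):

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of dialects the service supports.
ALLOWED_DIALECTS = {"sqlite", "postgresql"}


def derive_dialect(db_url: str) -> str:
    """Extract the dialect from a SQLAlchemy-style URL such as
    'postgresql+asyncpg://host:5432/db' or 'sqlite+aiosqlite:///./app.db'."""
    scheme = urlsplit(db_url).scheme       # e.g. 'postgresql+asyncpg'
    dialect = scheme.split("+", 1)[0]      # strip the '+driver' suffix
    if dialect not in ALLOWED_DIALECTS:
        raise ValueError(f"Unsupported database URL scheme: {scheme!r}")
    return dialect
```

With a check like this, db_dialect no longer needs to be a separate config field.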
```python
elif isinstance(event, MessageDone):
    yield from self._message_done(event)
elif isinstance(event, ReasoningStarted):
    pass  # Reasoning items are not emitted as output events in this implementation
```
What does "not emitted as output events" mean?
Yes, for now. This PR doesn't focus on that; reasoning events will be added in another PR.
```python
    effective_tool_choice=hydrated_body.tool_choice,
    effective_instructions=hydrated_body.instructions,
)
await self._conversation_store.put_turn(  # type: ignore[union-attr]
```
If a response fails halfway, do the emitted events still get written to the DB? And will the full history list be left "hanging" with a failed event?
It is all handled by the ConversationStore CRUD layer. If something goes wrong, an error is surfaced and nothing is stored: these are async functions, and the CRUD transaction session rolls back, so nothing is persisted.
So, say the upstream got an error. We should treat it as an error event, right (https://www.openresponses.org/specification#errors)? From the agentic-api PoV there shouldn't be an "error"; the user just sees an error event and response.failed.
In this case, is the response stored?
If something goes wrong upstream, that surfaces as an error event. The _persist function is only called on successful terminal events, so data is stored only for successful responses.
```python
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
```
Should we define the fields that exist in all tables here? Also, should we consider using sqlmodel?
This is the base table with no fields other than the built-in defaults; the Response, Item, and Conversation tables each define their own fields/columns. SQLModel is built on top of declarative SQLAlchemy and is actually slower to query: it uses Pydantic, which adds overhead we don't want in CRUD paths. Since we use dataclasses and query the tables directly, it is much faster than SQLModel with Pydantic.
```python
proxy_client_manager: ProxyClientManager = (
    request.app.state.proxy_client_manager
)
return await proxy_responses(
```
Should we also use the engine even when response_store is disabled, so the behavior is consistent (since it goes through the same compose/normalize steps)?
yes we can use the engine too.
@noobHappylife added a TODO to handle this properly in another PR.
franciscojavierarceo
left a comment
Thanks for the thorough work here — the architecture is clean and the test coverage is great. Left inline comments on the items worth addressing, grouped roughly by severity.
```python
    conversation_id=row.id,
    history_item_ids=[],
    created_at=row.created_at,
)
```
Critical: TOCTOU race in get_or_create
get and create_conversation run in separate sessions. Under concurrent requests with the same conversation_id, two coroutines can both observe None and both attempt to create — causing an unhandled IntegrityError (500).
A proven pattern for this is to use an atomic upsert (INSERT ... ON CONFLICT DO NOTHING + RETURNING, or catch IntegrityError and retry with a get). Other implementations of the Responses API use dialect-specific insert().on_conflict_do_nothing() for exactly this reason — it makes the create-if-not-exists operation atomic at the DB level without needing application-level locking.
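The atomic pattern can be sketched in raw SQL against SQLite via the stdlib sqlite3 module (table and column names here are hypothetical; with SQLAlchemy, the dialect-specific insert().on_conflict_do_nothing() expresses the same clause):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversation (id TEXT PRIMARY KEY, created_at TEXT NOT NULL)"
)


def get_or_create(conn: sqlite3.Connection, conversation_id: str):
    # Atomic at the DB level: of two concurrent callers, one inserts and
    # the other no-ops on the conflict instead of raising IntegrityError.
    conn.execute(
        "INSERT INTO conversation (id, created_at) "
        "VALUES (?, datetime('now')) ON CONFLICT (id) DO NOTHING",
        (conversation_id,),
    )
    return conn.execute(
        "SELECT id, created_at FROM conversation WHERE id = ?",
        (conversation_id,),
    ).fetchone()


first = get_or_create(conn, "conv_123")
second = get_or_create(conn, "conv_123")  # repeat call: no error, same row
assert first == second
```

The check-then-insert race disappears because the uniqueness decision is made inside a single statement at the database level.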
@franciscojavierarceo implemented dialect-specific insert().on_conflict_do_nothing().
```python
    metadata=metadata_,
)
except IntegrityError as e:
    raise BadInputError(f"Response id already exists: {response_id}") from e
```
Critical: Lost-update race in put_turn
put_turn reads history_item_ids via self.get() (session 1), appends new IDs in Python, then writes the full list via _persist_conversation_turn (session 2). Two concurrent turns for the same conversation will both read the same history_item_ids and each will append their own items — the second write silently overwrites the first's items.
Two approaches that work well here:
- Single session with SELECT ... FOR UPDATE — serialize concurrent writes to the same conversation at the row level.
- Append-only item table — instead of maintaining a mutable history_item_ids list on the Conversation row, store each item with a conversation_id FK and an ordering column (e.g., created_at + sequence). Rehydration becomes a simple SELECT ... WHERE conversation_id = ? ORDER BY seq. This eliminates the read-modify-write cycle entirely and is the pattern other Responses API implementations use successfully.
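A minimal sketch of the append-only variant, using the stdlib sqlite3 module with hypothetical table/column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item ("
    "  id TEXT PRIMARY KEY,"
    "  conversation_id TEXT NOT NULL,"
    "  seq INTEGER NOT NULL,"   # monotonic per-conversation ordering
    "  data TEXT NOT NULL)"
)

# Each turn only INSERTs new rows; there is no shared history_item_ids
# list to read-modify-write, so concurrent turns cannot clobber each other.
rows = [
    ("item_1", "conv_1", 1, '{"role": "user"}'),
    ("item_2", "conv_1", 2, '{"role": "assistant"}'),
]
conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?)", rows)


def rehydrate(conn, conversation_id):
    """Rebuild ordered history straight from the item table."""
    return [
        r[0]
        for r in conn.execute(
            "SELECT id FROM item WHERE conversation_id = ? ORDER BY seq",
            (conversation_id,),
        )
    ]


assert rehydrate(conn, "conv_1") == ["item_1", "item_2"]
```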
@franciscojavierarceo I have applied your second suggestion, where we retrieve history from the Item table for the conversation API.
```python
    ItemPayload.model_validate(items_by_id[item_id].data).item
    for item_id in stored.history_item_ids
    if item_id in items_by_id
]
```
High: Silent data loss during rehydration
`if item_id in items_by_id`

Missing Item rows (due to data corruption, partial cleanup, or eventual consistency) are silently skipped. The conversation history will be truncated without any error or log warning, which could lead to incorrect LLM context and confusing downstream behavior.
At minimum, log a warning for each missing item. Ideally, raise if `len(items_by_id) != len(stored.history_item_ids)` — a count mismatch indicates data integrity issues that shouldn't be silently swallowed.
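The warning path could be as small as this (`select_history` is a hypothetical helper, not code from the PR):

```python
import logging

logger = logging.getLogger(__name__)


def select_history(history_item_ids, items_by_id):
    """Rehydrate items in order, warning loudly on any missing row
    instead of silently truncating the conversation history."""
    missing = [i for i in history_item_ids if i not in items_by_id]
    if missing:
        logger.warning(
            "Rehydration missing %d item(s): %s", len(missing), missing
        )
    return [items_by_id[i] for i in history_item_ids if i in items_by_id]
```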
@franciscojavierarceo this isn't necessary here anymore after switching to your second suggestion above.
| """ | ||
| global _session_factory | ||
| _session_factory = async_sessionmaker( | ||
| engine, class_=AsyncSession, expire_on_commit=False |
Critical: configure_session_factory is not idempotent
This unconditionally replaces the module-level _session_factory global every time it's called. Both ResponseStore.__init__ and ConversationStore.__init__ call it. The comment on the ConversationStore side claims idempotency, but the implementation doesn't enforce it.
Suggestion — add a guard:
```python
if _session_factory is not None:
    return
```

Or do an identity check on the engine to catch misuse early.
```python
DONE_MARKER = "data: [DONE]\n\n"
TERMINAL_EVENT_TYPES = {"response.completed", "response.failed"}
```
High: response.incomplete missing from TERMINAL_EVENT_TYPES
The composer can emit "response.incomplete" (see composer.py around line 277), and the engine treats it as terminal. But this set only contains completed and failed, so the data: [DONE] marker won't be emitted immediately after an incomplete event.
```python
TERMINAL_EVENT_TYPES = {"response.completed", "response.failed", "response.incomplete"}
```

```python
tool_choice: ToolChoice = Field(default_factory=AutoToolChoice)
stream: bool = False
response_store_enabled: bool = True
conversation_store_enabled: bool = False
```
Medium: conversation_store_enabled not gated server-side
This is a per-request field — any client can enable conversation tracking by setting it in their request body. There's no server-level config to disallow it.
Consider adding a RuntimeConfig.conversation_store_enabled flag that gates whether the per-request field is honored. When the server flag is off, ignore the client field (or return 400). This gives operators control over whether the feature is available.
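A sketch of the gating logic (RuntimeConfig here is a stand-in dataclass, not the project's actual config class):

```python
from dataclasses import dataclass


@dataclass
class RuntimeConfig:
    # Hypothetical server-level switch; when False, the per-request
    # flag is ignored no matter what the client sends.
    conversation_store_enabled: bool = False


def effective_conversation_store(config: RuntimeConfig, requested: bool) -> bool:
    # Honor the client's flag only when the operator enabled the feature.
    return config.conversation_store_enabled and requested


assert effective_conversation_store(RuntimeConfig(False), True) is False
assert effective_conversation_store(RuntimeConfig(True), True) is True
```

Returning 400 instead of silently ignoring the flag is the stricter alternative mentioned above.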
@franciscojavierarceo we're planning to split conversation handling into its own dedicated router in another PR. Once that separation is in place, the per-request flag will be removed and server-side control will live naturally in the conversation router's config. Added a TODO for this.
```python
    _cached_text_clause("SELECT pg_advisory_unlock(:k)"), {"k": key}
)
except Exception:
    return
```
Medium: Advisory lock unlock failure silently swallowed
```python
except Exception:
    return
```

If the unlock fails (e.g., connection dropped), the PostgreSQL advisory lock leaks with no log output. At minimum, log a warning here so operators can detect lock leaks.
```python
return ModelRequest(
    parts=[
        ToolReturnPart(
            tool_name="", content=item.output, tool_call_id=item.call_id
```
Medium: Empty tool_name passed to ToolReturnPart
```python
ToolReturnPart(tool_name="", content=item.output, tool_call_id=item.call_id)
```

This is an inherent limitation of the OpenAI Responses API contract (function_call_output doesn't include the tool name). Worth adding a short comment explaining why, so future readers don't try to "fix" it. If the history contains a prior FunctionToolCall item with the matching call_id, the tool name could be looked up from there.
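The lookup could be sketched like this (plain-dict history items for illustration; the PR's actual item types differ):

```python
def lookup_tool_name(history, call_id):
    """Scan prior history for a function_call item with the matching
    call_id; fall back to "" when none is found, since the
    function_call_output item itself carries no tool name."""
    for item in history:
        if item.get("type") == "function_call" and item.get("call_id") == call_id:
            return item.get("name", "")
    return ""


history = [
    {"type": "function_call", "call_id": "call_1", "name": "get_weather"},
    {"type": "function_call_output", "call_id": "call_1", "output": "sunny"},
]
assert lookup_tool_name(history, "call_1") == "get_weather"
assert lookup_tool_name(history, "call_2") == ""
```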
```python
    ItemPayload.model_validate(items_by_id[item_id].data).item
    for item_id in stored.history_item_ids
    if item_id in items_by_id
]
```
High: Silent data loss during rehydration (same issue as conversation store)
`if item_id in items_by_id`

Same concern as ConversationStore.rehydrate — missing items are silently skipped. Should log a warning or raise on count mismatch.
@franciscojavierarceo Fixed; added a warning log.
```python
    effective_tool_choice=hydrated_body.tool_choice,
    effective_instructions=hydrated_body.instructions,
)
await self._conversation_store.put_turn(  # type: ignore[union-attr]
```
Medium: Consider upsert-based streaming persistence
Currently, the response is persisted only on completion via _persist. If the process crashes mid-stream, the response is lost entirely.
A pattern that works well for streaming is checkpoint-based upsert persistence: INSERT the response with in_progress status when streaming begins, then UPDATE (upsert) at key events (output_item.done, response.completed). This lets clients poll GET /v1/responses/{id} to see partial progress, and ensures incomplete responses are at least partially recoverable after crashes.
Not necessarily required for this PR, but worth considering for a follow-up — especially if background/async response execution is planned.
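A sketch of the checkpoint-based upsert against SQLite (stdlib sqlite3; the table shape is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE response (id TEXT PRIMARY KEY, status TEXT NOT NULL, data TEXT)"
)


def checkpoint(conn, response_id, status, data=None):
    # Upsert: INSERT the row on the first checkpoint, UPDATE it on later
    # ones, so GET /v1/responses/{id} can observe partial progress.
    conn.execute(
        "INSERT INTO response (id, status, data) VALUES (?, ?, ?) "
        "ON CONFLICT (id) DO UPDATE SET status = excluded.status, "
        "data = coalesce(excluded.data, response.data)",
        (response_id, status, data),
    )


checkpoint(conn, "resp_1", "in_progress")             # stream begins
checkpoint(conn, "resp_1", "in_progress", '{"n":1}')  # output_item.done
checkpoint(conn, "resp_1", "completed")               # response.completed
row = conn.execute(
    "SELECT status, data FROM response WHERE id = 'resp_1'"
).fetchone()
assert row == ("completed", '{"n":1}')
```

A crash between checkpoints leaves an in_progress row behind rather than losing the response entirely.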
@franciscojavierarceo added a #TODO for this.
franciscojavierarceo
left a comment
Two additional follow-up comments on streaming resilience and future-proofing.
```python
async for event in self._iter_events(run_settings, pipeline, stream=True):
    if event.type in {"response.completed", "response.incomplete"}:
        await self._persist(
            hydrated_body=hydrated_body,
```
Medium: Persistence failure kills the SSE stream
await self._persist(...) is in the hot path of the streaming response. If the DB write throws (connection drop, constraint violation, etc.), the exception propagates and kills the SSE stream — the client gets an abrupt disconnect instead of the response they were already receiving.
A more resilient pattern is best-effort persistence: wrap _persist in a try/except, log the failure as a warning, and let the stream complete. The client still gets their full response even if the DB hiccups. The response just won't be rehydratable for future turns.
```python
try:
    await self._persist(...)
except Exception:
    logger.warning(
        "Failed to persist response %s",
        pipeline.composer.response.id,
        exc_info=True,
    )
```

```python
    request_tool_choice=self._body.tool_choice,
    stored_tool_choice=stored.metadata.effective_tool_choice,
    tool_choice_explicitly_set="tool_choice" in fields_set,
),
```
Low (future-proofing): No status validation on previous_response_id
_rehydrate fetches the stored response via get_or_raise but doesn't check whether the referenced response has a terminal status (completed, incomplete, failed). Today this is fine because responses are only persisted on completion. But once streaming persistence lands (where responses are INSERT'd as in_progress), a client could reference an in-progress response and get partial/inconsistent history.
Worth adding a status check here when that work arrives:
```python
if stored.status not in {"completed", "incomplete", "failed"}:
    raise BadInputError(
        f"Cannot chain from response with status '{stored.status}'"
    )
```
@franciscojavierarceo added a #TODO for this.
leseb
left a comment
Given our recent direction to move to Rust, I believe this should be closed.
@leseb Sure, will open a PR converting this PR from Python to Rust.
Summary
Implements the stateful responses layer for agentic-api: a full request orchestration pipeline that adds conversation history, protocol translation, and a three-table persistence store on top of the existing vLLM proxy gateway.

What's in this PR
- POST /v1/responses — previous_response_id rehydration chains turns from prior responses; conversation_id rehydration loads full multi-turn sessions from the DB
- Item, Response, and Conversation tables (SQLite default, PostgreSQL for multi-worker); SchemaManager handles DDL lifecycle
- Engine drives the full request lifecycle; Pipeline runs the pydantic_ai agent against vLLM's Responses Model per request
- PydanticAINormalizer converts pydantic_ai events → internal NormalizedEvents; ResponseComposer converts them → OpenAI Responses API SSE events
- RequestInputTranslator converts InputItem/OutputItem → pydantic_ai ModelMessages; StoreInputTranslator normalizes history items before persistence
- --response-store-enabled (default on) — ResponseStore saves/loads response checkpoints; ConversationStore manages multi-turn session state
- response_store_enabled=false falls back to raw HTTP passthrough (the original proxy); response_store_enabled=true routes through the full managed pipeline
- --conversation-store-enabled flag — opt-in multi-turn conversation tracking (default off)
- test_responses_api.py and test_conversation_api.py replay responses recorded against the OpenAI API; no GPU or real vLLM needed

Implementation Detail
Layer 3 — Core Orchestration (ADR-01 §4)
- core/engine.py: Engine orchestrates the full request: rehydration → translation → agent run → normalization → composition → persistence
- core/pipeline.py: Pipeline wraps the pydantic_ai agent run for a single turn, yielding a NormalizedEvent stream
- core/normalizer.py: PydanticAINormalizer maps pydantic_ai StreamEvent subtypes → MessageStarted, MessageDelta, ReasoningDelta, FunctionCallStarted, FunctionCallDelta, FunctionCallDone, MessageDone
- core/composer.py: ResponseComposer maps NormalizedEvent → OpenAI Responses API SSE frames (response.created, response.output_item.added, response.output_text.delta, response.output_item.done, response.completed, etc.)
- core/translator.py: RequestInputTranslator converts InputMessage/OutputMessage (including tool calls and results) → pydantic_ai ModelMessages
- core/normalized_events.py: internal dataclass hierarchy for the normalizer↔composer contract
- core/sse.py: SSE frame encoder

Layer 4 — Persistence / Store (ADR-02)
- store/response.py — ResponseStore: saves completed responses, rehydrates history from previous_response_id chains
- store/conversation.py — ConversationStore: saves/loads conversation state, appends new items per turn
- store/translator.py — StoreInputTranslator: normalises raw input items from the DB before passing to the translator

Layer 5 — Database (ADR-02)
- database/db_engine.py — async SQLAlchemy engine; PostgreSQL advisory lock helpers for future multi-worker support
- database/schema.py — SchemaManager: creates/drops all tables; called during FastAPI lifespan
- database/session.py — async session factory; @session_transaction and @run_in_session decorators
- database/item.py / database/response.py / database/conversation.py — ORM models for the three-table schema

Test Plan
Tests cover stateful multi-turn rehydration (previous_response_id and conversation_id), streaming vs non-streaming, protocol normalizer/composer unit tests, store persistence, and the full pipeline — all in-process via VCR cassettes recorded against the OpenAI API; no GPU or real vLLM needed.

Test Results
100 passed, 1 warning in 1.43s