AIChatAgent orphaned stream recovery can merge a new assistant response into the previous assistant message #1691

@cgrdavies

Summary

When an @cloudflare/ai-chat stream is interrupted before final assistant-message persistence, orphan recovery reconstructs the assistant message from stored stream chunks. If those stored chunks do not include a provider start.messageId, _persistOrphanedStream() can fall back to the last assistant message already present in chat history.

That fallback is correct for some continuation cases, but it is wrong for a normal new assistant turn that follows a later user message. In that case, the recovered chunks of the response to user-two can be persisted under assistant-one's id, corrupting both the persisted chat history and future model context.

Observed with:

  • @cloudflare/ai-chat@0.8.1
  • agents@0.14.3
  • AIChatAgent native chat recovery across Durable Object deploy/code-update reset

Minimal fix here: f6a8bc4...cgrdavies:codex/persist-chat-assistant-message-id

Why this matters

This is a durability/recovery correctness issue. It can happen across Durable Object hibernation, deploy churn, isolate restart, or reconnect recovery, exactly where resumable streams are meant to preserve user-visible work.

The resulting transcript can become semantically wrong:

user-one
assistant-one
user-two
// interrupted assistant-two chunks are recovered into assistant-one

That corrupts UI state and subsequent model input. In a real chat UI this can show up as message ordering problems, a second user request apparently being answered inside the previous assistant turn, or later recovery/model context being based on a malformed transcript.

Suspected cause

AIChatAgent._reply() allocates a new assistant message id in memory before processing the response stream, but for streams without a provider start.messageId, that allocated id is not durably associated with the resumable stream metadata.

During orphan reconstruction, _persistOrphanedStream() only has stored stream chunks. If no start.messageId is present, it falls back to the last assistant message in chat history.

Relevant areas:

  • packages/ai-chat/src/index.ts
    • _reply()
    • _createStreamingAssistantMessage()
    • _persistOrphanedStream()
  • packages/agents/src/chat/resumable-stream.ts
    • ResumableStream.start()
    • cf_ai_chat_stream_metadata

The fallback to the last assistant message is appropriate for some continuation recovery paths, but not for a normal new assistant response after a later user message.
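
To make the suspected behaviour concrete, here is a minimal sketch of the fallback as described above. It is a simplified model, not the actual _persistOrphanedStream() code: the StoredChunk and HistoryMessage types and the pickRecoveryTargetId() helper are hypothetical names introduced for illustration.

```typescript
// Simplified model of the fallback described in this issue.
// All types and names are hypothetical, not @cloudflare/ai-chat internals.
type StoredChunk =
  | { type: "start"; messageId?: string }
  | { type: "text-start"; id: string }
  | { type: "text-delta"; id: string; delta: string }
  | { type: "text-end"; id: string };

interface HistoryMessage {
  id: string;
  role: "user" | "assistant";
}

// Pick the message id that recovered chunks would be persisted under.
function pickRecoveryTargetId(
  chunks: StoredChunk[],
  history: HistoryMessage[],
): string | undefined {
  // 1. Prefer a provider-supplied id from the "start" chunk, if present.
  for (const chunk of chunks) {
    if (chunk.type === "start" && chunk.messageId) return chunk.messageId;
  }
  // 2. Otherwise fall back to the last assistant message in history,
  //    without checking whether a user message came after it.
  for (const message of [...history].reverse()) {
    if (message.role === "assistant") return message.id;
  }
  return undefined;
}

const reproHistory: HistoryMessage[] = [
  { id: "user-one", role: "user" },
  { id: "assistant-one", role: "assistant" },
  { id: "user-two", role: "user" },
];

const reproChunks: StoredChunk[] = [
  { type: "start" }, // no provider messageId
  { type: "text-start", id: "t" },
  { type: "text-delta", id: "t", delta: "second response" },
  { type: "text-end", id: "t" },
];

const fallbackTarget = pickRecoveryTargetId(reproChunks, reproHistory);
console.log(fallbackTarget); // "assistant-one"
```

In this model nothing distinguishes "continue the previous assistant message" from "start a new one", which is why the response to user-two ends up targeting assistant-one.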

Minimal repro shape

Persist messages:

[
  { id: "user-one", role: "user", parts: [...] },
  { id: "assistant-one", role: "assistant", parts: [...] },
  { id: "user-two", role: "user", parts: [...] }
]

Start a normal response stream for user-two.

Store chunks that do not include a provider message id:

{"type":"start"}
{"type":"text-start","id":"t"}
{"type":"text-delta","id":"t","delta":"second response"}
{"type":"text-end","id":"t"}

Then:

  1. Simulate Durable Object hibernation/restart before final persistMessages().
  2. Reconnect and ACK stream resume.
  3. Let orphan reconstruction run.

Expected:

user-one
assistant-one
user-two
assistant-two // recovered chunks for user-two

Actual:

user-one
assistant-one // now contains its original response plus recovered chunks for user-two
user-two
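
The difference between the expected and actual transcripts can be sketched as an upsert keyed by the chosen target id. This is an illustrative model only: persistRecovered() and the message shapes are hypothetical, and "assistant-two" stands in for whatever id _reply() allocated for the interrupted turn.

```typescript
// Hypothetical model of persisting recovered chunks under a target id.
interface TextPart {
  type: "text";
  text: string;
}

interface TranscriptMessage {
  id: string;
  role: "user" | "assistant";
  parts: TextPart[];
}

type RecoveredChunk =
  | { type: "start" }
  | { type: "text-start"; id: string }
  | { type: "text-delta"; id: string; delta: string }
  | { type: "text-end"; id: string };

// Concatenate all text deltas into the recovered response text.
function recoveredText(chunks: RecoveredChunk[]): string {
  let text = "";
  for (const chunk of chunks) {
    if (chunk.type === "text-delta") text += chunk.delta;
  }
  return text;
}

// Append to an existing message with this id, or add a new assistant message.
function persistRecovered(
  messages: TranscriptMessage[],
  targetId: string,
  chunks: RecoveredChunk[],
): TranscriptMessage[] {
  const part: TextPart = { type: "text", text: recoveredText(chunks) };
  if (messages.some((m) => m.id === targetId)) {
    return messages.map((m) =>
      m.id === targetId ? { ...m, parts: [...m.parts, part] } : m,
    );
  }
  const recovered: TranscriptMessage = {
    id: targetId,
    role: "assistant",
    parts: [part],
  };
  return [...messages, recovered];
}

const transcript: TranscriptMessage[] = [
  { id: "user-one", role: "user", parts: [{ type: "text", text: "first question" }] },
  { id: "assistant-one", role: "assistant", parts: [{ type: "text", text: "first response" }] },
  { id: "user-two", role: "user", parts: [{ type: "text", text: "second question" }] },
];

const orphanChunks: RecoveredChunk[] = [
  { type: "start" },
  { type: "text-start", id: "t" },
  { type: "text-delta", id: "t", delta: "second response" },
  { type: "text-end", id: "t" },
];

// Actual: the fallback id merges the new response into assistant-one.
const actual = persistRecovered(transcript, "assistant-one", orphanChunks);
// Expected: a distinct id appends a new assistant message after user-two.
const expected = persistRecovered(transcript, "assistant-two", orphanChunks);

console.log(actual.map((m) => m.id)); // ["user-one", "assistant-one", "user-two"]
console.log(expected.map((m) => m.id)); // ["user-one", "assistant-one", "user-two", "assistant-two"]
```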

Proposed fix direction

Persist the allocated assistant message id in stream metadata when the stream starts, before chunks are produced. Orphan recovery should use that stored id when reconstructing a stream that does not have a provider start.messageId.
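
A rough sketch of that direction, under stated assumptions: StreamMetadataStore is an in-memory stand-in for the cf_ai_chat_stream_metadata table, and the assistantMessageId field and resolveOrphanMessageId() helper are hypothetical names, not existing APIs.

```typescript
// All names below are hypothetical; this only models the proposed direction.
interface StreamMetadataRow {
  streamId: string;
  // Id allocated for the assistant message before streaming begins.
  // Absent on rows written before this change.
  assistantMessageId?: string;
}

// In-memory stand-in for the cf_ai_chat_stream_metadata table.
class StreamMetadataStore {
  private rows = new Map<string, StreamMetadataRow>();

  // Record the allocated id when the stream starts, before any chunk exists.
  start(streamId: string, assistantMessageId: string): void {
    this.rows.set(streamId, { streamId, assistantMessageId });
  }

  get(streamId: string): StreamMetadataRow | undefined {
    return this.rows.get(streamId);
  }
}

// Resolution order during orphan recovery:
// provider start.messageId, then the stored allocated id,
// then the legacy "last assistant message" fallback for old rows.
function resolveOrphanMessageId(input: {
  providerMessageId?: string | undefined;
  row?: StreamMetadataRow | undefined;
  lastAssistantId?: string | undefined;
}): string | undefined {
  return (
    input.providerMessageId ??
    input.row?.assistantMessageId ??
    input.lastAssistantId
  );
}

const store = new StreamMetadataStore();
store.start("stream-1", "assistant-two");

const resolvedId = resolveOrphanMessageId({
  row: store.get("stream-1"),
  lastAssistantId: "assistant-one",
});
console.log(resolvedId); // "assistant-two"
```

Because the id is written when the stream starts, it survives a restart even if no chunk ever carries a message id.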

Important details:

  • A provider-supplied start.messageId should still take precedence when it is present on a non-continuation stream.
  • Continuation streams should still merge into the intended existing assistant message.
  • Existing stream rows without the new metadata need a backward-compatible fallback.
  • Any stream metadata schema migration should only swallow the expected duplicate-column case, not arbitrary SQL errors.
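
For the last point, one possible shape for the migration guard is sketched below. It assumes the migration adds a column with ALTER TABLE and that the SQL layer surfaces SQLite's "duplicate column name" message on the thrown error; the exec callback, the addColumnIfMissing() helper, and the assistant_message_id column name are hypothetical.

```typescript
// Hypothetical migration helper: tolerate only the "column already exists" case.
type ExecSql = (sql: string) => unknown;

function addColumnIfMissing(
  exec: ExecSql,
  table: string,
  column: string,
  columnType: string,
): void {
  try {
    exec(`ALTER TABLE ${table} ADD COLUMN ${column} ${columnType}`);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // SQLite reports an existing column as "duplicate column name: <name>".
    // Anything else (syntax error, missing table, I/O failure) is rethrown.
    if (!/duplicate column name/i.test(message)) throw err;
  }
}

// Usage with stand-in exec functions that simulate the two error cases.
const alreadyMigrated: ExecSql = () => {
  throw new Error("duplicate column name: assistant_message_id");
};
const brokenDatabase: ExecSql = () => {
  throw new Error("no such table: cf_ai_chat_stream_metadata");
};

// Swallowed: the column is already there, so the migration is a no-op.
addColumnIfMissing(
  alreadyMigrated,
  "cf_ai_chat_stream_metadata",
  "assistant_message_id",
  "TEXT",
);

// Rethrown: an unrelated SQL error must not be hidden.
let rethrown = false;
try {
  addColumnIfMissing(
    brokenDatabase,
    "cf_ai_chat_stream_metadata",
    "assistant_message_id",
    "TEXT",
  );
} catch {
  rethrown = true;
}
console.log(rethrown); // true
```

Matching on the error message is brittle, so checking the table's existing columns before altering it may be a cleaner alternative where the storage API allows it.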
