ai-chat: validate persisted messages, fix metadata on broadcast/resume, document advanced patterns #910

Merged
threepointone merged 2 commits into main from validate-messages-and-metadata
Feb 14, 2026
Conversation

@threepointone
Contributor

Summary

  • Structural message validation: Messages loaded from SQLite are now validated for required structure (id, role, parts). Malformed rows from corruption, tampering, or schema drift are logged with a warning and skipped instead of crashing the agent.
  • Message metadata on broadcast/resume path: The client-side broadcast (multi-tab sync) and stream resume (reconnection) paths now propagate messageMetadata from start/finish/message-metadata stream chunks. Previously, cross-tab clients and reconnecting clients only received metadata after the final CF_AGENT_CHAT_MESSAGES broadcast — now they see it during streaming. Also fixes metadata loss during tool continuations.
  • Advanced Patterns docs: New section in docs/chat-agents.md documenting AI SDK features that work out of the box in onChatMessage: prepareStep (dynamic model/tool control), wrapLanguageModel (middleware for guardrails, RAG, caching), generateObject (structured output in tools), ToolLoopAgent (reusable subagents), and async function* (preliminary tool results for streaming progress).

Changes

| File | What changed |
| --- | --- |
| packages/ai-chat/src/index.ts | Added isValidMessageStructure() validation in _loadMessagesFromDb() |
| packages/ai-chat/src/react.tsx | activeStreamRef now tracks metadata; chunk handler captures metadata from stream events; flushActiveStreamToMessages includes metadata; continuation path carries over existing metadata |
| packages/ai-chat/src/tests/validate-messages.test.ts | 9 tests covering all validation branches (missing id, empty id, bad role, non-array parts, non-object values, empty parts accepted, metadata preserved, all valid roles) |
| packages/ai-chat/src/tests/message-builder.test.ts | 4 tests verifying applyChunkToParts returns false for metadata-level chunks (start, finish, message-metadata, finish-step) |
| packages/ai-chat/src/tests/worker.ts | Added insertRawMessage() helper for injecting malformed rows in tests |
| docs/chat-agents.md | New "Advanced Patterns" section (~215 lines) with prepareStep, middleware, generateObject, ToolLoopAgent, and preliminary results |
| .changeset/validate-messages-and-metadata.md | Patch changeset for @cloudflare/ai-chat |

Test plan

  • All 149 ai-chat workers tests pass (21 test files)
  • All 572 agents package tests pass (37 test files)
  • Full npm run check passes (typecheck, oxlint, oxfmt, sherif, export checks)
  • Manually test multi-tab metadata propagation with the resumable-stream-chat example

…st/resume path.

**Structural message validation:**

Messages loaded from SQLite are now validated for required structure (non-empty `id` string, valid `role`, `parts` is an array). Malformed rows (from corruption, manual tampering, or schema drift) are logged with a warning and skipped instead of crashing the agent. This is intentionally lenient: empty `parts` arrays are allowed (for example, streams that errored mid-flight), and no tool/data schema validation is performed at load time (that remains a userland concern via `safeValidateUIMessages` from the AI SDK).
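
The structural check described above can be sketched roughly as follows. This is an illustration, not the code in `packages/ai-chat/src/index.ts`: the role allowlist (`system`, `user`, `assistant`) is an assumption, and the real function may differ in detail.

```typescript
// Hypothetical sketch of a structural check; the role allowlist is assumed.
const VALID_ROLES = new Set(["system", "user", "assistant"]);

function isValidMessageStructure(value: unknown): boolean {
  // Reject null and non-objects first so property access below is safe.
  if (typeof value !== "object" || value === null) return false;
  const msg = value as Record<string, unknown>;

  // id must be a non-empty string.
  if (typeof msg.id !== "string" || msg.id.length === 0) return false;

  // role must be one of the allowed roles.
  if (typeof msg.role !== "string" || !VALID_ROLES.has(msg.role)) return false;

  // parts must be an array; an empty array is accepted on purpose.
  return Array.isArray(msg.parts);
}
```

Note that nothing inside `parts` is inspected, which matches the stated intent of leaving tool and data schema validation to userland.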

**Message metadata on broadcast/resume path:**

The server already captures `messageMetadata` from `start`, `finish`, and `message-metadata` stream chunks and persists it on `message.metadata`. However, the client-side broadcast path (multi-tab sync) and stream resume path (reconnection) did not propagate metadata — the `activeStreamRef` only tracked `parts`. Now it also tracks `metadata`, and `flushActiveStreamToMessages` includes it in the partial message flushed to React state. This means cross-tab clients and reconnecting clients see metadata (model info, token usage, timestamps) during streaming, not just after the final `CF_AGENT_CHAT_MESSAGES` broadcast.
@changeset-bot

changeset-bot bot commented Feb 14, 2026

🦋 Changeset detected

Latest commit: 58924ab

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| @cloudflare/ai-chat | Patch |
| agents | Patch |


@pkg-pr-new

pkg-pr-new bot commented Feb 14, 2026

npm i https://pkg.pr.new/cloudflare/agents@910

commit: 58924ab

@threepointone
Contributor Author

/bonk review this pr

@ask-bonk
Contributor

ask-bonk bot commented Feb 14, 2026

Now I have a thorough understanding of all the changes. Let me compile the review.


PR Review: ai-chat: validate persisted messages, fix metadata on broadcast/resume, document advanced patterns

Overall Assessment

Clean, well-structured PR. The three concerns (validation, metadata propagation, docs) are logically distinct but small enough to ship together. The code is defensive, well-commented, and thoroughly tested.


1. Message Validation (packages/ai-chat/src/index.ts:46-58)

Looks good. isValidMessageStructure is minimal and correct:

  • Null/non-object check
  • Non-empty string id
  • Allowlist of three valid roles
  • parts must be an array (empty is allowed — the leniency is documented and intentional)

The placement inside _loadMessagesFromDb() at line 722 is right — it validates after JSON parse but before the message enters the cache or is returned to callers. The console.warn + skip approach is the correct call for data-plane code; throwing would crash the agent on corrupt rows, which is worse.

One question: isValidMessageStructure is a module-level function, not a method or an export. That's fine for now, but if other consumers ever need validation (e.g., persistMessages validating inbound messages before writing), it would need to be shared. Not a blocker — just noting the design boundary.

Minor observation: The as Record<string, unknown> cast on line 48 is fine given the preceding typeof + null check. Some teams prefer a runtime helper like assertRecord, but this is idiomatic TypeScript narrowing and consistent with the rest of the codebase.


2. Metadata on broadcast/resume (packages/ai-chat/src/react.tsx)

This is the most intricate part of the PR.

activeStreamRef shape change (line 880-885): Adding metadata?: Record<string, unknown> to the ref is clean. The optional field means existing code paths that don't set it are unaffected.

flushActiveStreamToMessages (line 892-920): The conditional spread ...(activeMsg.metadata != null && { metadata: activeMsg.metadata }) at line 908 is correct — it avoids adding a metadata: undefined key to the message object.
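
For readers unfamiliar with the conditional-spread idiom, here is a minimal sketch of the flush step. The `ActiveStream` shape follows the fields named in this review; `toPartialMessage` is a hypothetical helper name, and the actual `flushActiveStreamToMessages` also updates React state, which is omitted here.

```typescript
// Assumed shape of the ref contents, based on the fields named in the review.
type ActiveStream = {
  id: string;
  messageId: string;
  parts: unknown[];
  metadata?: Record<string, unknown>;
};

// Hypothetical helper: builds the partial assistant message to flush.
function toPartialMessage(active: ActiveStream) {
  return {
    id: active.messageId,
    role: "assistant" as const,
    parts: [...active.parts],
    // Spreading `false` adds no keys, so `metadata` is omitted entirely
    // when nothing has been captured yet (no `metadata: undefined` key).
    ...(active.metadata != null && { metadata: active.metadata })
  };
}
```

The difference matters for consumers that check `"metadata" in message` or serialize the object, since an explicit `undefined` key behaves differently from an absent one.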

Metadata capture in chunk handler (lines 1079-1096): The logic is:

  1. Only run when applyChunkToParts returns false (meaning it's a metadata-level chunk)
  2. Only for start, finish, message-metadata types
  3. Merge with spread, so later chunks override earlier keys

This is correct. The start chunk also updates activeMsg.messageId (line 1088-1089), which is important for server-assigned IDs.
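
The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: `applyChunkToPartsStub` is a stand-in for the real `applyChunkToParts` and only recognizes `text-delta`, and the `Chunk` type lists only the fields this sketch needs.

```typescript
type Chunk = {
  type: string;
  messageId?: string;
  messageMetadata?: Record<string, unknown>;
};

type ActiveStream = {
  id: string;
  messageId: string;
  parts: unknown[];
  metadata?: Record<string, unknown>;
};

const METADATA_CHUNK_TYPES = new Set(["start", "finish", "message-metadata"]);

// Stand-in for applyChunkToParts: returns true when the chunk changed parts.
function applyChunkToPartsStub(parts: unknown[], chunk: Chunk): boolean {
  if (chunk.type === "text-delta") {
    parts.push(chunk);
    return true;
  }
  return false;
}

function handleChunk(active: ActiveStream, chunk: Chunk): void {
  // Step 1: only continue when the chunk did not touch parts.
  const handled = applyChunkToPartsStub(active.parts, chunk);
  // Step 2: only start, finish, and message-metadata carry metadata here.
  if (handled || !METADATA_CHUNK_TYPES.has(chunk.type)) return;

  // The start chunk may carry a server-assigned message id.
  if (chunk.type === "start" && chunk.messageId != null) {
    active.messageId = chunk.messageId;
  }
  // Step 3: merge with spread so later chunks override earlier keys.
  if (chunk.messageMetadata != null) {
    active.metadata = { ...active.metadata, ...chunk.messageMetadata };
  }
}
```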

Continuation metadata carry-over (lines 1032-1058): When a continuation starts, existing metadata from the last assistant message is shallow-copied into the new activeStreamRef. This prevents metadata loss across tool continuations. The shallow copy ({ ...metadata }) is fine here — metadata values are primitive or simple objects (token counts, model names, timestamps).
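
A rough sketch of the carry-over, assuming the continuation reuses the last assistant message's id and parts (that reuse, the `startContinuation` name, and the `fallbackId` parameter are assumptions for illustration, not taken from the diff):

```typescript
type Message = {
  id: string;
  role: "system" | "user" | "assistant";
  parts: unknown[];
  metadata?: Record<string, unknown>;
};

type ActiveStream = {
  id: string;
  messageId: string;
  parts: unknown[];
  metadata?: Record<string, unknown>;
};

// Hypothetical helper: seeds the active stream for a tool continuation.
function startContinuation(
  streamId: string,
  messages: Message[],
  fallbackId: string
): ActiveStream {
  const last: Message | undefined = messages[messages.length - 1];
  if (last !== undefined && last.role === "assistant") {
    return {
      id: streamId,
      messageId: last.id,
      parts: [...last.parts],
      // Shallow copy: top-level keys are detached from the stored message,
      // nested objects are still shared.
      ...(last.metadata != null && { metadata: { ...last.metadata } })
    };
  }
  // No assistant message to continue: start fresh without metadata.
  return { id: streamId, messageId: fallbackId, parts: [] };
}
```

Because the copy is shallow, merging new top-level keys into the active stream's metadata does not mutate the message already held in state. If metadata ever held deeply nested mutable values, a deep copy would be the safer choice.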

One thing to verify: On the CF_AGENT_STREAM_RESUMING path (line 998-1007), the activeStreamRef is initialized without metadata:

activeStreamRef.current = {
  id: data.id,
  messageId: nanoid(),
  parts: []
};

This is correct because a resumed stream starts fresh (the server replays all chunks including start), so metadata will be populated as chunks arrive. No issue here.


3. Tests

validate-messages.test.ts: 9 tests covering all validation branches. Good coverage:

  • Valid messages pass through
  • Missing id, empty id, invalid role, non-array parts, non-object values — all filtered
  • Empty parts array — preserved (leniency)
  • Metadata preserved through round-trip
  • All three roles accepted

The insertRawMessage helper in worker.ts:291-296 is a clean way to inject malformed data without going through persistMessages (which would serialize correctly).

message-builder.test.ts additions (lines 595-636): 4 tests verifying that applyChunkToParts returns false for metadata-level chunk types. This documents the contract that the caller (react.tsx) relies on.

Test observation: The validation tests don't test what happens with unparseable JSON (e.g., insertRawMessage("bad", "not json at all")). Looking at _loadMessagesFromDb, the catch block at line 733 handles this case, so it's covered by existing behavior. But it might be worth adding a single test for that path to make the coverage explicit.
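
To make the two skip paths concrete, here is a minimal sketch of a load loop that handles both unparseable JSON and structurally invalid rows. The `Row` shape and the `loadMessages` name are assumptions; the real `_loadMessagesFromDb` reads from SQLite and populates a cache.

```typescript
// Assumed row shape: an id column and a JSON-serialized message column.
type Row = { id: string; message: string };

function loadMessages(
  rows: Row[],
  isValid: (value: unknown) => boolean
): unknown[] {
  const loaded: unknown[] = [];
  for (const row of rows) {
    try {
      const parsed: unknown = JSON.parse(row.message);
      if (!isValid(parsed)) {
        // Parsed fine but has the wrong shape: warn and skip.
        console.warn(`Skipping malformed message ${row.id}`);
        continue;
      }
      loaded.push(parsed);
    } catch (error) {
      // JSON.parse threw: the row is not JSON at all. Warn and skip.
      console.warn(`Skipping unparseable message ${row.id}`, error);
    }
  }
  return loaded;
}
```

A test for the parse-failure path would insert a row such as `{ id: "bad", message: "not json at all" }` alongside a valid one and assert that only the valid message comes back.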


4. Docs (docs/chat-agents.md lines 617-829)

The new "Advanced Patterns" section is well-written and covers useful AI SDK patterns:

  • prepareStep for dynamic model/tool control
  • wrapLanguageModel for middleware
  • generateObject for structured output in tools
  • ToolLoopAgent for subagent delegation
  • async function* for preliminary tool results

The prose after each code block explains when to use the pattern and what the tradeoffs are. The callout about ToolLoopAgent not being suited as a replacement for streamText in onChatMessage (line 796) is a good guardrail.
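
As a language-level illustration of the `async function*` pattern listed above, the following sketch shows a generator that yields a preliminary value before its final one. It uses no AI SDK imports: `searchDocs`, `Progress`, and `collect` are hypothetical names, and the assumption that the SDK treats the last yielded value as the final tool result should be checked against the AI SDK version in use.

```typescript
type Progress = { status: "searching" | "done"; results: string[] };

// Hypothetical tool body: yields progress first, then the final result.
async function* searchDocs(query: string): AsyncGenerator<Progress> {
  // Preliminary value a UI could render while work is in flight.
  yield { status: "searching", results: [] };

  // Placeholder for real work such as a fetch or database query.
  const results = [`result for ${query}`];

  // Last yielded value, assumed to be what the caller treats as final.
  yield { status: "done", results };
}

// Drains a generator so each intermediate value can be inspected.
async function collect<T>(gen: AsyncGenerator<T>): Promise<T[]> {
  const seen: T[] = [];
  for await (const value of gen) seen.push(value);
  return seen;
}
```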

One API reference question: The LanguageModelV3Middleware type on line 678 — is this the current type name in the AI SDK, or could it be LanguageModelV2Middleware or just LanguageModelMiddleware depending on version? If the AI SDK docs still reference V2, readers might be confused. Worth double-checking.


5. Changeset

The changeset is thorough and well-written. It explains both changes clearly and notes the design rationale for lenient validation. No issues.


Summary

| Area | Verdict |
| --- | --- |
| Validation logic | Clean, correct, well-placed |
| Metadata propagation | Correct across all paths (broadcast, resume, continuation) |
| Tests | Thorough: 13 new tests covering all branches |
| Docs | Clear, practical, well-scoped |
| Changeset | Good |

Suggestions (non-blocking):

  1. Consider adding a test for the JSON parse failure path in _loadMessagesFromDb (completely unparseable data) to make that coverage explicit.
  2. Verify that LanguageModelV3Middleware is the current type name in the AI SDK version the project depends on.

Looks good to ship.

Add a unit test that ensures completely unparseable JSON messages are discarded by message validation. The test sets up an agent, persists a valid message, inserts a broken JSON raw message via agentStub.insertRawMessage, and asserts that getValidatedMessages returns only the valid message.
@threepointone threepointone merged commit a668155 into main Feb 14, 2026
4 checks passed
@threepointone threepointone deleted the validate-messages-and-metadata branch February 14, 2026 16:14
@github-actions github-actions bot mentioned this pull request Feb 14, 2026