Merge upstream cloudflare/agents PR #1559: pre-stream chat recovery #2
Closed
rwdaigle wants to merge 7 commits into
- Add early stashing of chat fiber snapshots before model inference begins, allowing interrupted pre-stream turns to be reconciled by chat recovery.
- Introduce `retry: true` as a ChatRecoveryOptions field for retrying an interrupted turn against the existing unanswered user message when no partial assistant message exists.
…are#1563)

* Add Chat SDK messenger example

  Demonstrates Chat SDK ingress on Agents with subagent-backed state and Think-owned conversation replies.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Stream Chat SDK messenger replies

  Adds Think chat streaming with RPC-safe cancellation so messenger delivery failures can stop the corresponding sub-agent turn.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Add managed fiber jobs

  Introduce managed fiber jobs on top of runFiber so agents can durably accept idempotent background work, inspect retained status, cancel running jobs, explicitly resolve interrupted jobs, and record recovery policy decisions. This adds the cf_agents_fibers ledger, schema v8 migration, status/list/delete/resolve APIs, cooperative cancellation signals, and waitForCompletion support that waits on terminal ledger state instead of only the callback promise.

  Tighten crash recovery semantics for managed work by reconciling stale run rows, recovering ledger-only pending/running rows, skipping recovery for already-terminal fibers, settling setup failures, and letting onFiberRecovered return a FiberRecoveryResult to move interrupted fibers to completed, error, aborted, or intentionally interrupted. The implementation also tracks active managed executions and terminal waiters so duplicate requests can join in-memory work when possible while post-restart retries drive the same recovery path.

  Use the new managed fiber API in the Chat SDK messenger example for AI replies. Telegram messages now get a stable per-message idempotency boundary, completion waiting preserves Chat SDK per-thread visible reply serialization, and recovery policy is explicit: accepted replies are replayed while mid-stream interruptions post a concise apology and settle the retained job.

  Expand coverage across unit, sub-agent, schema, and real eviction tests. The E2E harness now starts wrangler dev with persisted SQLite state, kills it mid-managed-fiber, restarts it, and verifies interrupted retention, recovery-result settlement, duplicate waitForCompletion retries after restart, and sub-agent managed fiber recovery through the parent alarm.

  Document the new durable job surface in the Agent and durable execution docs, including waitForCompletion, cancellation behavior, retained terminal records, explicit recovery outcomes, and how this differs from Think message admission.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Polish managed fiber cleanup API

  Rename the public managed-fiber terminal timestamp from completedAt to settledAt, and rename the cleanup filter from completedBefore to settledBefore. These names better describe terminal rows across completed, error, aborted, and interrupted states while keeping the existing SQLite completed_at column internal.

  Make default deleteFibers() cleanup preserve interrupted rows. Interrupted managed fibers often need inspection or explicit application-level resolution, so callers must now opt in to deleting them by passing status: "interrupted".

  Clarify FiberContext.snapshot documentation so it does not imply callbacks are automatically re-entered with recovered snapshots; recovery snapshots are delivered through onFiberRecovered().

  Add a regression test that default cleanup deletes completed rows while preserving interrupted rows, then verifies explicit interrupted cleanup still works.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Document managed fiber adoption patterns

  Add practical guidance for using managed fibers around webhook-style application jobs, including retained cleanup with settledBefore, interrupted recovery, resolveFiber, and waitForCompletion behavior.

  Clarify the boundary between Think submissions and managed fibers across the Think docs, package README, server-driven messaging docs, webhook docs, and examples so users can distinguish durable Think turn admission from app-owned side-effect jobs.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix PR install after main package bumps

  Use the workspace dependency for the Chat SDK messenger example's Think package so npm ci can resolve the merged branch after main's version-package release.

  Always run npm ci in the shared GitHub install action while relying on setup-node's npm package cache, avoiding stale node_modules cache hits that can mask lockfile drift.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* Fix managed fiber review issues

  Correct the malformed Think changeset frontmatter so Changesets can parse the release metadata.

  Ensure waitForCompletion waits for a terminal managed fiber status even when duplicate calls race with an already-running recovery pass, and cover the race with a regression test.

  Also document and test the Chat SDK state adapter's list-level TTL behavior.

  Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…ata for recovery before inference starts

Merges the changes from cloudflare#1559 (feat/pre-stream-recovery) which adds early chat-turn recovery snapshots so interrupted turns can be recovered before any stream metadata or chunks exist. Introduces ChatRecoveryOptions.retry to retry the latest unanswered user message instead of continuing a partial assistant response.

PR commits included:
- b2c347a feat: Stash chat metadata for recovery before inference starts
- bcdee9c CI merge fixes
- b464554 Update packages/think/src/think.ts

Also includes upstream main commits merged into the PR branch (cloudflare#1561, cloudflare#1563).
Summary
Brings in the changes from upstream cloudflare/agents#1559 (`feat/pre-stream-recovery`).

The upstream PR adds early chat-turn recovery snapshots so interrupted turns can be recovered before any stream metadata or chunks exist. It introduces a `ChatRecoveryOptions.retry` field that enables recovery to retry the latest unanswered user message rather than continuing a partial assistant response. The behavior is applied consistently across `@cloudflare/think` and `@cloudflare/ai-chat`.

PR commits included
- `b2c347a` feat: Stash chat metadata for recovery before inference starts
- `bcdee9c` CI merge fixes
- `b464554` Update packages/think/src/think.ts

The merged upstream branch also carries the following commits that landed on `cloudflare/agents:main` after our fork point and were pulled into the PR branch via its merge-from-main:
- `831ba1d` think: expose additive stop conditions (think: expose additive TurnConfig stopWhen conditions, cloudflare/agents#1561)
- `32cde40` Add Chat SDK messenger example with managed fiber durability (cloudflare/agents#1563)

Diff scope
64 files changed, ~6190 insertions / ~189 deletions. The merge applied cleanly with no conflicts.
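
For reviewers skimming the diff, here is a minimal sketch of the recovery decision that the `retry` option implies. It is not the upstream implementation: `ChatMessageSketch`, `RecoveryOptionsSketch`, `RecoveryAction`, and `decideRecovery` are hypothetical names for illustration, and the real logic lives in the merged packages.

```typescript
// Hypothetical shapes for illustration only; these are NOT the real
// @cloudflare/think or @cloudflare/ai-chat types.
interface ChatMessageSketch {
  id: string;
  role: "user" | "assistant";
  // Assumed flag: true when an assistant message was cut off mid-stream.
  partial?: boolean;
}

interface RecoveryOptionsSketch {
  // Mirrors the `retry` field named in this PR; the semantics here are assumed.
  retry?: boolean;
}

type RecoveryAction =
  | { kind: "continue"; messageId: string } // resume a partial assistant response
  | { kind: "retry"; messageId: string } // re-run the unanswered user message
  | { kind: "none" };

function decideRecovery(
  messages: ChatMessageSketch[],
  options: RecoveryOptionsSketch
): RecoveryAction {
  const last = messages[messages.length - 1];
  if (!last) return { kind: "none" };

  if (last.role === "assistant") {
    // A partial assistant message exists, so recovery would continue it.
    return last.partial
      ? { kind: "continue", messageId: last.id }
      : { kind: "none" };
  }

  // The turn was interrupted before any assistant output existed (pre-stream).
  // Only retry against the unanswered user message when the caller opted in.
  return options.retry
    ? { kind: "retry", messageId: last.id }
    : { kind: "none" };
}
```

The sketch keeps retry opt-in, so a caller that never sets `retry` sees no change for pre-stream interruptions, which matches the description of `retry` as a new option rather than a new default.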
Test plan
- … (`@cloudflare/think`, `@cloudflare/ai-chat`, `agents`)
- `packages/agents/src/chat/recovery.ts` and recovery flow integration
- `examples/chat-sdk-messenger` example (came in via the merged main commits)

Generated by Claude Code