@cloudflare/think@0.9.0
Minor Changes
-
#1656
4c2d1a7 Thanks @cjol! - Rebuild the Think execute tool on the codemode connector runtime, with built-in human-in-the-loop approvals.

Unified execute tool. `createExecuteTool` now builds on `createCodemodeRuntime` with connectors instead of a bare executor: `state.*` (the agent's workspace filesystem via `@cloudflare/shell`'s `StateConnector`), `cdp.*` (browser automation via `agents/browser`'s `BrowserConnector`, included automatically when `env.BROWSER` is bound), and `tools.*` (any AI SDK `ToolSet` adapted via `@cloudflare/codemode`'s `ToolSetConnector`). Executions are durable: they are recorded on a `CodemodeRuntime` facet with abort-and-replay, and completed results are truncated for the model while the full value stays on the execution record.

- Agent one-liner: `createExecuteTool(this)` infers `ctx`, `env.LOADER`, `env.BROWSER`, and the workspace-backed state backend from the Think agent, and accepts an overrides object for custom tools and options. `createExecuteRuntime(this)` returns the underlying `{ runtime, connectors, tool }` for host-side wiring. The runtime handle is exposed on the agent as `this.codemode`.
- Human-in-the-loop. Tools with `needsApproval: true` pause the execution durably. The paused tool output (with bounded pending-call args) flows to the model, which reports and waits. Think gains built-in callables (`pendingExecutions()`, `approveExecution(executionId)`, `rejectExecution(executionId, reason?)`) that resolve the pause on the codemode runtime, replace the paused output in the transcript via `pausedExecutionUpdate`, and auto-continue the conversation so the model sees the outcome. Approval UIs must render args from `pendingExecutions()` (authoritative, full) rather than the transcript's `pending` (a truncated preview bounded for model context). Approvals survive Durable Object restarts and are safe against double-approval, expiry (`expirePaused`), and stale UIs. If the paused tool part is no longer in the transcript when the approval lands (e.g. compacted away), the outcome is appended as a system note instead of being dropped.
- The Think framework's generated worker entry exports the `CodemodeRuntime` facet class automatically (also re-exported from `@cloudflare/think/server-entry`).
- Think's `createBrowserTools` follows the rebuilt `agents/browser` connector model (single durable `browser_execute` tool, session modes, stable attach handles); see the `agents` changeset.
- Model-facing guidance. `createExecuteTool` now renders per-namespace usage hints in the execute tool description (`state.*` object-argument filesystem calls, the actual `tools.*` method names, `cdp.*`), so models stop inventing a `host.*` / `fs.*` API. The `load_extension` description clarifies that its `host` bridge exists only inside extension source. The workspace `bash` tool description now states the workspace is mounted at `/` (no `/workspace`), and the bash sandbox no longer persists its synthetic `/bin`, `/usr`, `/dev`, `/proc` paths into the workspace (previously the first bash call wrote ~160 shell-builtin stubs into the user's workspace and flooded `changedFiles`).
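The human-in-the-loop flow described in this entry could be driven from a host-side approval UI roughly as follows. This is a minimal sketch, not the package's published types: only the three method names (`pendingExecutions`, `approveExecution`, `rejectExecution`) come from the notes, while the `ApprovalApi` interface, the `PendingExecution` record shape, and the `planActions` / `applyActions` helpers are hypothetical names introduced for illustration.

```typescript
// Hypothetical record shape; the real pending-execution record may differ.
interface PendingExecution {
  executionId: string;
  args: unknown; // full args from pendingExecutions(), not the transcript preview
}

// Assumed surface mirroring the callables named in the release notes.
interface ApprovalApi {
  pendingExecutions(): Promise<PendingExecution[]>;
  approveExecution(executionId: string): Promise<void>;
  rejectExecution(executionId: string, reason?: string): Promise<void>;
}

type Action =
  | { kind: "approve"; executionId: string }
  | { kind: "reject"; executionId: string; reason?: string };

// Pure step: decide each pending execution from its authoritative args.
function planActions(
  pending: PendingExecution[],
  isSafe: (args: unknown) => boolean
): Action[] {
  return pending.map((p): Action =>
    isSafe(p.args)
      ? { kind: "approve", executionId: p.executionId }
      : { kind: "reject", executionId: p.executionId, reason: "rejected by policy" }
  );
}

// Async step: apply the plan through the agent's callables, one at a time.
async function applyActions(api: ApprovalApi, actions: Action[]): Promise<void> {
  for (const action of actions) {
    if (action.kind === "approve") {
      await api.approveExecution(action.executionId);
    } else {
      await api.rejectExecution(action.executionId, action.reason);
    }
  }
}
```

Splitting the decision from the RPC calls keeps the policy testable without a running agent, and it makes explicit that the decision reads `args` from `pendingExecutions()` rather than from the truncated transcript preview.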
Patch Changes
-
#1740
6c9de59 Thanks @threepointone! - Defer one-shot scheduled callbacks (and chat-recovery give-ups) on platform transients instead of consuming them mid-deploy (#1730).

A mid-execution Durable Object code-update reset surfaces storage failures in two shapes: the verbatim reset/supersede messages (already deferred) and `SqlError: SQL query failed: Network connection lost.`, a wrapper that drops the CF `retryable` flag and dodges the reset matcher. The second shape burned the in-process retry budget inside the same few-seconds reset window (which outlasts the retry schedule by design) and then consumed the one-shot row on exhaustion, freezing the turn for minutes until incident re-detection. In the reported production capture, storage was healthy again 15 ms after the final attempt.

`agents`: adds a cause-aware `isPlatformTransientError` classifier (exported alongside `isDurableObjectCodeUpdateReset`) that matches reset/supersede messages, `retryable`-flagged platform errors (excluding overloaded), and "Network connection lost.", looked up through wrapper `cause` chains. `_executeScheduleCallback` keeps in-process retries for connection-lost transients (a genuine blip heals fast), but on exhaustion of a one-shot row it now re-throws instead of swallowing, so the row survives and the alarm re-runs it in the healthy window that follows. Genuine application errors are still abandoned after `maxAttempts` exactly as before.

`@cloudflare/think`: `_handleRecoveryCallbackError` now defers (re-throws) on any platform transient instead of terminalizing through a give-up whose own seal needs the storage that is down; the bookkeeping write on the defer path is best-effort. The defer path no longer marks the recovered submission `error` (which made the deferred re-run skip with `submission_not_running`, a self-defeating defer); it stays `running` for the re-run to pick up. The give-up now seals the incident `exhausted` only after the terminal writes succeed, so a transient mid-seal defers the whole give-up for an idempotent re-run instead of half-sealing.

`@cloudflare/ai-chat`: uses the same give-up seal ordering. The incident is sealed only after `_exhaustChatRecovery` (incl. the durable terminal record) succeeds, so a transient mid-seal preserves the one-shot row and the give-up re-runs in full on a healthy isolate.
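To illustrate what "cause-aware" means here, the sketch below walks an error's `cause` chain looking for the signals this entry lists. It is an illustration only and not the `isPlatformTransientError` exported by `agents`: the function name `looksPlatformTransient`, the depth limit, and the use of an `overloaded` boolean to model "excluding overloaded" are assumptions, and the verbatim reset/supersede messages are left out because the notes do not quote them.

```typescript
// Illustrative only: NOT the classifier shipped in `agents`.
const CONNECTION_LOST = "Network connection lost.";

interface ErrorLike {
  message?: unknown;
  retryable?: unknown;  // platform flag mentioned in the notes
  overloaded?: unknown; // assumed flag used to model "excluding overloaded"
  cause?: unknown;
}

// Walk the wrapper chain so a transient hidden behind a wrapper such as
// "SqlError: SQL query failed: ..." is still recognised.
function looksPlatformTransient(err: unknown, maxDepth = 8): boolean {
  let current: unknown = err;
  for (let depth = 0; depth < maxDepth && current != null; depth++) {
    if (typeof current !== "object") return false;
    const e = current as ErrorLike;
    if (e.retryable === true && e.overloaded !== true) return true;
    if (typeof e.message === "string" && e.message.includes(CONNECTION_LOST)) {
      return true;
    }
    current = e.cause; // descend into the wrapped error, if any
  }
  return false;
}
```

A caller following the deferral policy described above would re-throw when this returns true, leaving the one-shot row in place, and keep the existing abandon-after-`maxAttempts` behavior otherwise.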
-
#1737
bc43133 Thanks @cjol! - Fix the two remaining #1575 gaps in how in-band stream errors (`{type: "error", errorText}` chunks inside an otherwise-healthy provider stream) are observed after the fact.

Errored-stream replay (partial content was lost on reconnect). A client reconnecting after an in-band error received the terminal error frame (#1645) but not the content the model streamed before the error: the replay path only served `status = 'completed'` streams, so an errored stream's buffered chunks were unreachable, and the server pushes no messages on connect. `ResumableStream` gains `replayErroredChunksByRequestId`, and the resume-ACK terminal replay (`_replayTerminalOnAck` in both AIChatAgent and Think) now replays the errored stream's stored chunks before the `done: true, error: true` frame, so a reconnecting client observes the same sequence a live client did. No wire-format or schema changes: replayed chunks reuse the existing `replay: true` frame shape and the error text still comes from the durable terminal record.

Agent-tool error attribution (cross-run contamination). When an in-band error frame was broadcast on a child agent and the active run was unknown, the error was stamped onto every tailed run, so an unrelated turn's failure (or one of several overlapping runs) could mark healthy runs as `error`, and capture depended on a tailer being attached at the right moment. Frames are now attributed by the request id they carry: each agent-tool run is bound to its turn's request id when the turn starts (persisted on the run row at start rather than at terminal, so attribution survives a DO restart mid-run), and only the owning run's error/progress state is updated. Frame inspection also no longer requires an attached tailer, so error capture is independent of tailer timing.
-
#1712
835e7b0 Thanks @threepointone! - Reclaim resumable-stream buffers from an alarm so idle chats don't leak storage (#1706)

Resumable-stream chunk buffers (`cf_ai_chat_stream_*`) were only swept lazily when a subsequent stream completed. A chat that received a single turn and then went idle never triggered that sweep, so its buffers lingered in the Durable Object's SQLite for the lifetime of the DO.

`AIChatAgent` and `Think` now arm a scheduled cleanup alarm whenever a stream starts and whenever it finishes (completes or errors). Arming on start guarantees that a stream whose DO is evicted mid-flight and never reaches a finish still gets a future sweep instead of leaking. This is the safety net for the non-durable path (e.g. `chatRecovery: false`, the `AIChatAgent` default): those turns don't run inside `runFiber`, so there's no leftover `keepAlive` alarm and no fiber-recovery scan, and if the client never reconnects nothing else wakes the DO. (Durable `runFiber` turns already self-heal: the `keepAlive` alarm survives eviction, wakes the DO, and recovery finalizes the stream, which arms cleanup. Arming on start is therefore belt-and-suspenders there.) The alarm sweeps aged buffers via the retention windows below and re-arms only while reclaimable rows remain, so a fully-swept DO stops waking itself. Arming is idempotent so high-turn-count chats never accumulate cleanup schedules; the in-callback re-arm uses a fresh (non-idempotent) row so it survives the one-shot deletion of the firing schedule. No per-turn Durable Object and no change to the session DO lifecycle are required.

Retention is now split into two short, purpose-specific windows instead of a single 24h threshold: completed/errored buffers are kept for a brief 10-minute reconnect-and-replay grace (the assistant message is persisted separately, so the buffer is only needed to replay a just-finished stream or deliver a terminal error frame to a reconnecting client), while abandoned in-flight (`streaming`) rows are kept for 1 hour so an interrupted turn has ample time to be resumed or recovered before its buffer is presumed dead. The abandoned-row sweep keys off last chunk activity rather than stream start time, so a long-running stream that is still emitting chunks is never reclaimed mid-flight. `ResumableStream` gains `cleanup(now?)` (force a sweep, bypassing the lazy interval gate) and `hasReclaimableStreams()` to support alarm-driven cleanup.
-
#1741
1d8641d Thanks @threepointone! - Prevent cancelled durable submissions from appending their messages when they were already claimed but still waiting behind an active turn.
-
#1713
18c438b Thanks @threepointone! - Support client tools on the Think sub-agent `chat()` RPC path (#1709)

`ChatOptions` now accepts `clientTools` (the same `ClientToolSchema[]` carried over the WebSocket chat protocol) and an `onClientToolCall` executor. This lets a parent agent that drives a Think sub-agent over `chat()` expose client-defined tools to the sub-agent and complete the tool round trip within the same turn:

    await child.chat(message, callback, {
      signal,
      clientTools: [
        { name: "get_user_timezone", parameters: { type: "object" } }
      ],
      onClientToolCall: async ({ toolName, input }) =>
        runClientTool(toolName, input)
    });

Without `onClientToolCall`, the schemas are still registered and the model's call is surfaced through the stream callback (execute-less), matching the WebSocket behavior. With it, the call is resolved inline so the turn can continue to completion, since the RPC stream callback has no inbound result channel of its own.

Unlike the WebSocket path, the schemas and executor are kept per-turn and are NOT persisted: the executor is a live RPC reference that cannot survive an eviction, and there is no SPA to replay a `tool-result`. This keeps chat recovery correct: an eviction-interrupted client-tool call is repaired like a server tool (the model proceeds) rather than being mistaken for a pending human interaction and parking forever.

`agents/chat`'s `createToolsFromClientSchemas` gains an optional `{ execute }` delegate (and exports a new `ClientToolExecutor` type) to build the executable variant. Both additions are backward-compatible.
-
#1724
c18a446 Thanks @whoiskatrin! - Stop oversized sessions from permanently bricking the Durable Object with `SQLITE_NOMEM` on wake (#1710).

A throw out of `onStart` is terminal: partyserver resets its init state and rethrows, so every wake (including platform alarm retries) re-runs the failing `onStart` forever, and the failure survives redeploys because it is driven by stored data. Long-lived media-heavy sessions hit exactly this once eager full-transcript hydration approached the isolate's memory budget. Four changes:

- `onStart` degrades instead of throwing. Transcript hydration, declared scheduled-task reconciliation, and durable submission/workflow recovery are now best-effort: failures are recorded (readable via the new public `getOnStartDegradations()`), logged with remediation hints, and emitted as `chat:onstart:degraded` observability events, and the agent comes up reachable. The user-defined `onStart()` is intentionally NOT guarded.
- `hydrationByteBudget` (default 24MB). Cache refreshes hydrate at most this many stored bytes; an oversized transcript boots as a bounded window of the most recent messages (never fewer than the read-time truncation span the model sees at full fidelity, 4 messages, so windowing cannot starve the model's context) and emits `chat:hydration:windowed` (on change, not on every sync). Durable storage is never truncated by this; `session.getHistory()` still reads the full path. Set to `Infinity` to restore unbounded hydration.
- `mediaEviction` (default on). Background passes rewrite oversized inline media (large `data:` URL file parts and large strings nested in tool outputs) in messages that have aged out of the recent window, replacing them with size/path markers. By default the original bytes are preserved as workspace files under `/attachments/evicted/` (written BEFORE the row is rewritten, so no pass can lose data); set `externalizeToWorkspace: false` to drop them or `false` to disable. Passes are memory-bounded: row sizes come from `getHistoryRowStats()`, only rows large enough to contain an evictable value are parsed, one at a time, and rewrites use the session's silent maintenance path so no per-row full-history token estimate runs. When a pass stops at `maxRowsPerPass` with a backlog, the next pass is scheduled automatically. Providers without row-stats support log a one-time warning instead of silently no-opping.
- Plain `text` parts are never evicted, and `keepRecentMessages` is clamped to at least the read-time truncation window (4) so eviction can never rewrite content the model still sees at full fidelity.
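The byte-budget windowing described for `hydrationByteBudget` can be pictured with a small selection routine. This is a sketch under stated assumptions rather than Think's actual hydration code: the `StoredRow` shape and the `selectHydrationWindow` name are hypothetical, and the only facts taken from the notes are that the window covers the most recent messages, is bounded by a byte budget, and never drops below the 4-message read-time truncation span.

```typescript
// Hypothetical row summary: an id plus its stored size in bytes.
interface StoredRow {
  id: string;
  bytes: number;
}

const MIN_WINDOW = 4; // read-time truncation span quoted in the notes

// `rows` is ordered oldest to newest. Returns the most recent rows that fit
// in `byteBudget`, but always at least `minMessages` rows when available.
function selectHydrationWindow(
  rows: StoredRow[],
  byteBudget: number,
  minMessages: number = MIN_WINDOW
): StoredRow[] {
  let used = 0;
  let start = rows.length;
  while (start > 0) {
    const next = rows[start - 1];
    const kept = rows.length - start;
    // Stop once the floor is satisfied and the next row would exceed the budget.
    if (kept >= minMessages && used + next.bytes > byteBudget) break;
    used += next.bytes;
    start--;
  }
  return rows.slice(start);
}
```

Passing `Infinity` as the budget returns every row, which mirrors the documented way to restore unbounded hydration. The floor can push the window over budget when the most recent messages are themselves large, matching the stated guarantee that windowing never starves the model's context.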
-
#1715
5f6003f Thanks @threepointone! - Support `experimental_transform` on `TurnConfig`. The transform(s) returned from `beforeTurn` are now forwarded to `streamText` in the inference loop, so callers can inspect or rewrite the stream, for example by detecting tool results that carry `{ content, sources }` and enqueuing additional `source` parts via the transform's controller. Accepts a single transform or an array applied in order. Closes #1714.
-
Updated dependencies [b2b6762, 4c2d1a7, 4c2d1a7, 7bcd1b1, 4c2d1a7]:
- @cloudflare/codemode@0.4.0
- create-think@0.0.4
- @cloudflare/shell@0.4.0