feat(experimental): Postgres session providers with Hyperdrive support #1297
mattzcarey wants to merge 31 commits into main
- New experimental/session-planetscale example showing how to connect and wire up PlanetScaleSessionProvider + PlanetScaleContextProvider
- Fix type inconsistency: PlanetScaleSessionProvider now uses SessionMessage (matching the SessionProvider interface) instead of importing UIMessage directly from ai

- Rename PlanetScale → Postgres providers (PostgresSessionProvider, PostgresContextProvider, PostgresSearchProvider)
- Add PostgresSearchProvider with tsvector FTS for searchable knowledge blocks
- Harden context tools: remove enum constraints, validate in execute, return error strings instead of throwing
- Simplify set_context API: remove key param, auto-generate from title/content slug
- Search blocks show entry count only (no key listing)
- Append newline separator in appendToBlock
- Add ignoreIncompleteToolCalls to prevent orphaned tool call errors
- Add Hyperdrive example with Vite + React client
- Add 33 tests covering providers, round-trip, and convertToModelMessages compat
- Update docs with Postgres setup, migration SQL, and search docs
…prompt rendering
- Remove key param from set_context; auto-generate from title/content slug
- Add title param for stable keyed entries (skills/search)
- Harden all tools: remove enum constraints, validate in execute, return errors
- Fix prompt lifecycle: freezeSystemPrompt returns cached, refreshSystemPrompt reloads from providers
- Remove clearCachedPrompt; refreshSystemPrompt covers invalidation
- clearMessages calls refreshSystemPrompt to invalidate cached prompt
- Fix captureSnapshot: render empty writable blocks so LLM knows they exist
- Clean system prompt rendering: [readonly], [searchable], [loadable], [not searchable] tags
- Search provider get() returns count only, no key listing
- appendToBlock adds newline separator
- Simplify soul prompt in example
- Separate getSystemPrompt (cached) and refreshSystemPrompt (force reload) callables
- Add prompt lifecycle tests (freeze, refresh, invalidation, concurrent)
- Update docs: search provider, prompt lifecycle, generic Postgres setup

- Await getMessage/updateMessage in SessionManager.upsert (manager.ts)
- Add session_id filter and depth guard to recursive CTEs in PostgresSessionProvider (postgres.ts)
- Use 'stored !== null' instead of 'stored' in freezeSystemPrompt to handle empty strings (context.ts)
- Guard against undefined _agent in addContext when using SessionProvider (session.ts)
…and tests
Session methods became async to support PostgresSessionProvider, but callers in think.ts, experimental examples, and test files were not updated. This adds proper await/async handling throughout:
- think.ts: cache messages for sync getter, await all session calls
- manager.ts: make getHistory/clearMessages/deleteMessages/etc async
- multi-session.ts: await all session and manager calls in tests
- session.test.ts: fix UIMessage→SessionMessage, appendMessage return type
- session-search/server.ts: await getHistory/clearMessages
- session-multichat/server.ts: await getHistory
- client-tools.ts: await getBranches
- postgres.ts: fix parent type cast
… updates
- Add _syncMessages() after _applyToolUpdateToMessages in Think
- Add .catch() to fire-and-forget _reclaimLoadedSkill callback
- Smart newline separator in appendToBlock (skip if content starts with \n)
- Fix [searchable] tag: show for all searchable blocks regardless of writability
- Update search/skills tests for removed key param and new tag format
- Fix oxfmt formatting in context.ts and skills.test.ts
- Add @types/pg dev dependency for typecheck
- Rename getConnection → getPgConnection to avoid Agent base class collision
- Fix Text component className prop in client.tsx
- Await clearMessages() in SessionManager.delete()

- Await manager.delete() in multichat example and multi-session test
- Extract text parts from JSON in PostgresSessionProvider.searchMessages instead of returning raw JSON content
…return types
- Add Session.create(SessionProvider) tests to session.test.ts (runs under workers pool)
- Fix appendMessage mock return type (block syntax to return void, not number)
- Add await to getHistory in minimal create test
- Remove unused imports from postgres-providers.test.ts
return `Error: key is required for searchable block "${label}"`;
await this.setSearchEntry(label, key, content);
if (block.isSkill || block.isSearchable) {
  const key = slugify(title ?? content);
🟡 slugify truncation causes silent key collisions when title is omitted for keyed blocks
When the LLM calls set_context for a skill or searchable block without providing a title, the key is generated via slugify(content) which truncates at 60 characters. Two different contents that share the same first 60 characters (after lowercasing and stripping non-alphanumeric chars) will produce identical keys, causing the second entry to silently overwrite the first.
For example, "The deployment process for production requires approval from security team" and "The deployment process for production requires approval from management" would both slugify to the same key. The old code required an explicit key parameter, avoiding this collision risk entirely.
Suggested change:
- const key = slugify(title ?? content);
+ const key = title ? slugify(title) : `${slugify(content)}-${Array.from(new TextEncoder().encode(content)).reduce((h, b) => (((h << 5) - h) + b) | 0, 0).toString(36).replace('-', 'n')}`;
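The truncation collision described in this comment can be reproduced in a few lines. The sketch below copies the `slugify` body shown in this PR and feeds it the two example strings from the comment; it only illustrates the failure mode and is not the repository's test code.

```typescript
// Copy of the slugify helper under review: 60-character truncation, ASCII-only slug.
function slugify(text: string): string {
  return (
    text
      .slice(0, 60)
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-|-$/g, "") || "entry"
  );
}

const a = "The deployment process for production requires approval from security team";
const b = "The deployment process for production requires approval from management";

// Both inputs share their first 60 characters, so the generated keys are
// identical and the second set_context call would overwrite the first.
const keyA = slugify(a);
const keyB = slugify(b);
console.log(keyA === keyB); // true
```

Both strings diverge only after the word "from", which is exactly where the 60-character slice ends.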
- Add text_content column to assistant_messages migration
- Generate tsvector from text_content instead of raw JSON content
- Populate text_content with extracted text parts on append/update
- Search results return text_content directly
- Fix mock handleUpdate for new column layout
- Align ai dependency version with main (^6.0.158)

- Fix oxfmt formatting in react.tsx
- Fix implicit any in resumable-stream-chat onData callback
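The text_content commit above populates a plain-text column from a message's text parts so that full-text search indexes prose rather than raw JSON. A minimal sketch of that extraction step follows. The `MessageLike` shape and the `extractTextContent` name are assumptions made for illustration; the real SessionMessage type and helper in the PR may differ.

```typescript
// Hypothetical, simplified message shape; the real SessionMessage type may differ.
type MessagePart = { type: string; text?: string };
type MessageLike = { id: string; role: string; parts: MessagePart[] };

// Join only the text parts so the search column never contains JSON keys
// or tool-call payloads.
function extractTextContent(message: MessageLike): string {
  return message.parts
    .filter((part) => part.type === "text" && typeof part.text === "string")
    .map((part) => part.text as string)
    .join("\n");
}

const message: MessageLike = {
  id: "m1",
  role: "assistant",
  parts: [
    { type: "text", text: "Deploys need approval." },
    { type: "tool-call" },
    { type: "text", text: "Ask the security team." }
  ]
};

console.log(extractTextContent(message));
```

Storing the extracted string alongside the JSON keeps search results readable without re-parsing the message on every query.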
"$schema": "../../node_modules/wrangler/config-schema.json",
"account_id": "543fbdef1eeaed8a02c251c8c4d9510b",
"name": "agents-session-planetscale-example",
"main": "src/server.ts",
"compatibility_date": "2026-01-28",
"compatibility_flags": ["nodejs_compat"],
"ai": {
  "binding": "AI"
},
"assets": {
  "directory": "./public",
  "not_found_handling": "single-page-application",
  "run_worker_first": ["/agents/*"]
},
"hyperdrive": [
  {
    "binding": "HYPERDRIVE",
    "id": "e9c4a010628841f2a23f30d7fdceb63d"
🟡 Hardcoded account_id and Hyperdrive ID in example wrangler.jsonc
The experimental/session-planetscale/wrangler.jsonc hardcodes account_id and a Hyperdrive id. The repository's AGENTS.md mandates "Never hardcode secrets or API keys." No other example in the repo includes account_id in its wrangler config. The Hyperdrive ID (e9c4a010628841f2a23f30d7fdceb63d) identifies a specific deployed resource tied to an individual account, and will fail for any other contributor or deployment.
Suggested change:
  "$schema": "../../node_modules/wrangler/config-schema.json",
- "account_id": "543fbdef1eeaed8a02c251c8c4d9510b",
  "name": "agents-session-planetscale-example",
  "main": "src/server.ts",
  "compatibility_date": "2026-01-28",
  "compatibility_flags": ["nodejs_compat"],
  "ai": {
    "binding": "AI"
  },
  "assets": {
    "directory": "./public",
    "not_found_handling": "single-page-application",
    "run_worker_first": ["/agents/*"]
  },
  "hyperdrive": [
    {
      "binding": "HYPERDRIVE",
-     "id": "e9c4a010628841f2a23f30d7fdceb63d"
+     "id": "<your-hyperdrive-id>"
+   }
+ ],
…stead of full sync
…ages are source of truth during turn
function slugify(text: string): string {
  return (
    text
      .slice(0, 60)
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-|-$/g, "") || "entry"
  );
}
🔴 slugify fallback key "entry" causes silent data overwrites for non-Latin content
The new slugify function strips all non [a-z0-9] characters, then falls back to "entry" if the result is empty. When the LLM doesn't provide a title (it's optional), slugify(content) is used as the key. For non-Latin content (Chinese, Japanese, Arabic, emoji-only, etc.), slugify produces "entry" for every input, causing all entries to silently overwrite each other.
Example of the collision
For a knowledge base with entries:
set_context({ label: "knowledge", content: "用户喜欢咖啡" }) → key = "entry"
set_context({ label: "knowledge", content: "用户的名字是小明" }) → key = "entry" (overwrites first!)
The second write silently replaces the first because both map to key "entry".
Prompt for agents
The slugify function in context.ts:24-32 strips all non-ASCII-alphanumeric characters and falls back to "entry" when nothing remains. This causes silent data loss for non-Latin content (Chinese, Japanese, Arabic, emoji, etc.) since all such content maps to the same key "entry", overwriting each other.
The function is used in the set_context tool at context.ts:673 where `slugify(title ?? content)` generates the storage key for skill/search blocks.
Possible approaches:
1. Use a hash (e.g., first 8 chars of a SHA-256 hex digest) of the full text as a fallback instead of the static "entry" string.
2. Allow Unicode letters in the slug (e.g., use a Unicode-aware regex like /[^\p{L}\p{N}]+/gu).
3. Generate a random UUID as the fallback key when the slug is empty.
Any approach must ensure that the same input consistently produces the same key (for upsert semantics), so option 1 (hash-based) is likely the best fit.
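A minimal sketch of the hash-based option, under stated assumptions: it substitutes a synchronous 32-bit FNV-1a hash for the SHA-256 digest mentioned above (SHA-256 through Web Crypto is async), and the names `asciiSlug`, `fnv1a`, and `entryKey` are hypothetical, not part of the PR.

```typescript
// ASCII slug as in the PR, but returning "" instead of a shared "entry" fallback.
function asciiSlug(text: string): string {
  return text
    .slice(0, 60)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-|-$/g, "");
}

// 32-bit FNV-1a over the UTF-8 bytes of the full text. Assumption: a small
// non-cryptographic hash is acceptable for key disambiguation.
function fnv1a(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let hash = 0x811c9dc5;
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes[i];
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(36);
}

// Hypothetical key builder. It is deterministic, so writing the same input
// twice still upserts the same entry.
function entryKey(content: string, title?: string): string {
  if (title) return asciiSlug(title) || fnv1a(title);
  const slug = asciiSlug(content);
  const digest = fnv1a(content);
  return slug ? `${slug}-${digest}` : digest;
}

console.log(entryKey("用户喜欢咖啡") === entryKey("用户的名字是小明")); // false, barring a 32-bit hash collision
```

Because the hash covers the full content rather than the first 60 characters, this shape would also separate long entries that share a prefix. A 32-bit hash can still collide; a longer digest lowers that risk at the cost of longer keys.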
const parent =
  parentId ?? ((await this.latestLeafRow())?.id as string) ?? null;
🔴 appendMessage with explicit parentId: null incorrectly auto-detects parent instead of creating a root message
In PostgresSessionProvider.appendMessage, parentId uses nullish coalescing (??) which treats both null and undefined the same way. When called with explicit parentId: null (meaning "create a root message with no parent"), the code falls through to latestLeafRow() and attaches the message as a child of the latest leaf instead.
Code path
At postgres.ts:125-126:
const parent =
  parentId ?? ((await this.latestLeafRow())?.id as string) ?? null;
If parentId is null, null ?? latestLeaf evaluates to latestLeaf, not null. The SessionProvider interface at provider.ts:57-60 declares parentId?: string | null, where null should mean "no parent" and undefined/omitted should mean "auto-detect".
This breaks the branching contract — callers who explicitly pass null to create a root message get an unexpected parent chain instead. The AgentSessionProvider at providers/agent.ts likely has the same distinction via its SQL logic.
Suggested change:
- const parent =
-   parentId ?? ((await this.latestLeafRow())?.id as string) ?? null;
+ const parent =
+   parentId !== undefined
+     ? parentId
+     : (((await this.latestLeafRow())?.id as string) ?? null);
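The null-versus-undefined contract is easier to see in isolation. The sketch below pulls the rule into a hypothetical `resolveParent` helper with a synchronous `latestLeafId` callback; the real method awaits `latestLeafRow()`, so treat this as an illustration of the branching logic only.

```typescript
// Hypothetical helper isolating the parent-resolution rule from appendMessage.
// undefined -> auto-detect the latest leaf
// null      -> explicit root message (no parent)
// string    -> use the given parent
function resolveParent(
  parentId: string | null | undefined,
  latestLeafId: () => string | null
): string | null {
  if (parentId !== undefined) return parentId;
  return latestLeafId();
}

const latestLeaf = () => "leaf-42";

console.log(resolveParent(undefined, latestLeaf)); // leaf-42
console.log(resolveParent(null, latestLeaf)); // null
console.log(resolveParent("msg-7", latestLeaf)); // msg-7
```

With `??` in place of the `!== undefined` check, the second call would return "leaf-42", which is the bug described above.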
// If the provider is async, history is a Promise — skip restore for async providers
if (history instanceof Promise) return;
🟡 Skill restoration silently skipped for async providers, losing loaded-skill tracking after hibernation
_restoreLoadedSkills() at packages/agents/src/experimental/memory/session/session.ts:214-216 checks if (history instanceof Promise) return and silently skips skill restoration for all async SessionProvider implementations (including the new PostgresSessionProvider). After DO hibernation/eviction, skills that were loaded via load_context are forgotten — the _loadedSkills set is empty. This means unload_context reports "not currently loaded" for skills that are actually loaded in the conversation, and the unload_context tool description shows "No skills currently loaded" even when skills are present in history.
Prompt for agents
In session.ts _restoreLoadedSkills(), the method skips entirely when the provider is async (returns a Promise from getHistory). This causes loaded skills to be silently lost after hibernation for Postgres-backed sessions.
Consider making _restoreLoadedSkills async and calling it with await in _ensureReady. Since _ensureReady is called at the start of every Session method (which are all now async), making it async should be safe. Alternatively, defer the restore to the first async method call (e.g. inside getHistory or tools) so it runs before the data is needed.
The key issue is in _ensureReady (line 157) which is synchronous. Either make _ensureReady async and await skill restoration, or lazily restore skills on first async use.
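One possible shape for the async variant is sketched below. Everything here is a reduced stand-in: `SessionSketch`, `HistoryProvider`, and the `loadedSkills` field on history messages are assumptions for illustration, since the way the real Session derives loaded skills from history is not shown in this thread.

```typescript
// Hypothetical, reduced stand-ins for the real Session and provider types.
type HistoryMessage = { loadedSkills?: string[] };
interface HistoryProvider {
  getHistory(): HistoryMessage[] | Promise<HistoryMessage[]>;
}

class SessionSketch {
  private ready: Promise<void> | null = null;
  private readonly provider: HistoryProvider;
  readonly loadedSkills = new Set<string>();

  constructor(provider: HistoryProvider) {
    this.provider = provider;
  }

  // Memoize the restore so concurrent callers share a single history read.
  private ensureReady(): Promise<void> {
    if (!this.ready) {
      this.ready = this.restoreLoadedSkills().catch((err) => {
        this.ready = null; // allow a retry on the next call
        throw err;
      });
    }
    return this.ready;
  }

  // Awaiting works for both sync (DO SQLite) and async (Postgres) providers,
  // so no instanceof Promise check is needed.
  private async restoreLoadedSkills(): Promise<void> {
    const history = await this.provider.getHistory();
    for (const message of history) {
      for (const skill of message.loadedSkills ?? []) {
        this.loadedSkills.add(skill);
      }
    }
  }

  async isSkillLoaded(name: string): Promise<boolean> {
    await this.ensureReady();
    return this.loadedSkills.has(name);
  }
}
```

Since the Session methods are already async in this PR, awaiting the readiness promise at the top of each one should not change their signatures.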
| wrangler deploy | ||
| ``` | ||
|
|
||
| Tables (`assistant_messages`, `assistant_compactions`, `cf_agents_context_blocks`) are auto-created on first request. |
🔴 session-planetscale README falsely claims tables are auto-created
The README states "Tables (assistant_messages, assistant_compactions, cf_agents_context_blocks) are auto-created on first request" but PostgresSessionProvider (packages/agents/src/experimental/memory/session/providers/postgres.ts) has no table creation logic whatsoever. The main docs at docs/sessions.md:688 correctly say "Run this once in your database console" with manual SQL. Users following the example's README will hit runtime errors on first request because the tables don't exist.
Suggested change:
- Tables (`assistant_messages`, `assistant_compactions`, `cf_agents_context_blocks`) are auto-created on first request.
+ Tables must be created before first use. Run the migration SQL from [the docs](../../docs/sessions.md#3-create-the-tables) in your database console.
```bash
wrangler secret put PLANETSCALE_HOST
# paste: your-db-xxxxxxx.us-east-2.psdb.cloud

wrangler secret put PLANETSCALE_USERNAME
# paste: your username

wrangler secret put PLANETSCALE_PASSWORD
# paste: your password
```
🔴 session-planetscale README describes PlanetScale/MySQL setup but code uses Postgres/Hyperdrive
The README describes setting up PlanetScale secrets (PLANETSCALE_HOST, PLANETSCALE_USERNAME, PLANETSCALE_PASSWORD) and references @planetscale/database (a MySQL driver), but the actual server code imports Client from pg (PostgreSQL), uses PostgresSessionProvider/PostgresContextProvider/PostgresSearchProvider, and connects via this.env.HYPERDRIVE.connectionString. The env.d.ts declares HYPERDRIVE: Hyperdrive, not PlanetScale secrets. Users following the README's setup instructions (lines 26–35) would configure credentials the code never reads.
Prompt for agents
The session-planetscale README describes a PlanetScale/MySQL setup (individual secrets for host, username, password, and @planetscale/database driver) but the actual implementation in src/server.ts uses pg (PostgreSQL) with Cloudflare Hyperdrive. The env.d.ts declares HYPERDRIVE: Hyperdrive, and wrangler.jsonc configures a hyperdrive binding. The entire README sections 1-3 (Create a PlanetScale database, Get connection credentials, Set Worker secrets) need to be rewritten to describe the actual Postgres + Hyperdrive setup: create a Postgres database, create a Hyperdrive config with wrangler, and reference the binding. Alternatively, rename the example from session-planetscale to session-postgres to match the implementation.
if (!this._pgClient) {
  this._pgClient = new Client({
    connectionString: this.env.HYPERDRIVE.connectionString
  });
  await this._pgClient.connect();
}
return wrapPgClient(this._pgClient);
🔴 getPgConnection leaves unconnected client on connect() failure, breaking all subsequent calls
In getPgConnection, this._pgClient is assigned the new Client instance before await this._pgClient.connect(). If connect() throws (e.g., network error, wrong credentials), this._pgClient is set but not connected. On all subsequent calls, !this._pgClient is false, so the method skips the connection block and returns a wrapper around the unconnected client. Every database query after the first failed connection will fail silently or throw confusing errors — there is no retry path.
The same bug exists in the docs example at docs/sessions.md:779-784.
Suggested change:
- if (!this._pgClient) {
-   this._pgClient = new Client({
-     connectionString: this.env.HYPERDRIVE.connectionString
-   });
-   await this._pgClient.connect();
- }
- return wrapPgClient(this._pgClient);
+ private async getPgConnection(): Promise<PostgresConnection> {
+   if (!this._pgClient) {
+     const client = new Client({
+       connectionString: this.env.HYPERDRIVE.connectionString
+     });
+     await client.connect();
+     this._pgClient = client;
+   }
+   return wrapPgClient(this._pgClient);
+ }
Summary
Adds Postgres-backed session providers for storing conversation history, context blocks, and searchable knowledge in an external database via Hyperdrive. This enables cross-DO queries, analytics, and shared state without relying on DO SQLite.
Supersedes #1196.
What's new
Providers (packages/agents/src/experimental/memory/session/providers/)
- PostgresSessionProvider: tree-structured messages, compaction overlays, message FTS via tsvector
- PostgresContextProvider: writable context block storage (memory, cached prompt)
- PostgresSearchProvider: searchable knowledge base with tsvector + GIN index

Framework improvements (packages/agents/src/experimental/memory/session/)
- Session.create() accepts SessionProvider for external storage (in addition to SqlProvider for DO SQLite)
- set_context API: removed key param, auto-generates keys from title or content slug
- freezeSystemPrompt() returns cached, refreshSystemPrompt() force-reloads from providers
- clearMessages() calls refreshSystemPrompt() to invalidate the cached prompt
- appendToBlock() adds newline separator between entries
- System prompt tags: [readonly], [searchable], [loadable], [not searchable]
- Search provider get() returns entry count only (no key listing)

Example (experimental/session-planetscale/)
- Hyperdrive example using the pg driver
- wrapPgClient helper converts ? placeholders to $1, $2, ... for pg compatibility

Tests (packages/agents/src/tests/experimental/memory/session/postgres-providers.test.ts)
- Providers, round-trip, convertToModelMessages compatibility, prompt lifecycle (freeze/refresh/invalidation/concurrent)

Docs (docs/sessions.md)

Migration SQL
Customers run this once — providers never create tables:
Test plan