fix(chrome-ai): three HIGH-priority bugs in PR #514 — chat cache, prototype pollution, snapshot reset by sroussey · Pull Request #528 · workglow-dev/libs

sroussey · 2026-05-22T08:32:37Z

Fixes 3 HIGH-priority findings in #514.

Summary

HIGH-1: The WebBrowser_Chat session cache was never reused. The stored historyFingerprint was computed over the prior-prior state on turn 1, so turn 2's lookup always missed and rebuilt the session from scratch. Switched to a messageCount-based watermark: the cache hits when cached.messageCount === lastUserIdx. After each turn, messageCount = messages.length + 1.
HIGH-2: Tool-call arguments let __proto__ keys slip past validation. Many tool input schemas don't set additionalProperties: false, so {__proto__: {polluted: true}, ok: true} passes validation. Added a sanitizeToolArgs helper that recursively drops __proto__ / constructor / prototype keys before validation.
HIGH-3: snapshotStreamToTextDeltas corrupts text on non-prefix snapshots. The non-prefix branch was concatenating instead of resetting. Reset-and-emit now correctly handles self-correction snapshots.

Test plan

WebBrowserProvider.test.ts — new tests for cache reuse, sanitization (incl. recursive), and snapshot reset
All existing tests in the chrome-ai suite still pass

The previous fingerprint-based cache key recomputed the fingerprint from the *prior* history on every turn, so turn 2's cache lookup always missed and rebuilt the session from scratch. Switch to a messageCount high-water mark: cache hits when cached.messageCount === lastUserIdx (i.e., the session has already heard everything before the trailing user message). After a successful turn the session has heard messages.length + 1 messages (history + new assistant reply), which we record for the next call.
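A minimal sketch of the watermark check described above, to make the arithmetic concrete. The types and the names `lastUserIndex`, `canReuseSession`, and `messageCountAfterTurn` are hypothetical and chosen for illustration; they are not taken from WebBrowser_Chat.ts.

```typescript
// Hypothetical shapes for illustration only; not the provider's real types.
type Role = "system" | "user" | "assistant";
interface ChatMessage {
  role: Role;
  content: string;
}
interface CachedSession {
  // How many messages the underlying session has already heard.
  messageCount: number;
}

// Index of the trailing user message, or -1 when there is none.
function lastUserIndex(messages: ChatMessage[]): number {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") return i;
  }
  return -1;
}

// Cache hit: the session has heard exactly the messages that precede
// the trailing user message, so only that message needs to be sent.
function canReuseSession(cached: CachedSession | undefined, messages: ChatMessage[]): boolean {
  if (!cached) return false;
  return cached.messageCount === lastUserIndex(messages);
}

// After a successful turn the session has heard the whole history plus
// the new assistant reply.
function messageCountAfterTurn(messages: ChatMessage[]): number {
  return messages.length + 1;
}
```

Under these assumptions, turn 1 (`[user]`) records a count of 2; turn 2 (`[user, assistant, user]`) has its trailing user message at index 2, so the lookup hits. A history that shrinks back to a single user message yields index 0 against a count of 2, which forces a rebuild.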

…lution (HIGH) Many tool input schemas don't set `additionalProperties: false`, so a hallucinated `{__proto__: {polluted: true}, ok: true}` payload would pass validation and propagate through to consumers. Add a `sanitizeToolArgs` helper that recursively rebuilds the value with a plain Object.prototype, dropping `__proto__`, `constructor`, and `prototype` keys at every depth. Sanitize BEFORE validation so the validator sees the cleaned object.
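The recursive scrub described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in WebBrowser_ToolCalling.ts: the name `sanitizeToolArgsSketch` is hypothetical, and the sketch only handles JSON-like values (objects, arrays, primitives).

```typescript
// Keys dropped at every depth, as listed in the commit message.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Hypothetical stand-in for sanitizeToolArgs: rebuilds the value from
// fresh plain objects and arrays, skipping forbidden keys.
function sanitizeToolArgsSketch(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(sanitizeToolArgsSketch);
  }
  if (value !== null && typeof value === "object") {
    // A fresh literal always has the ordinary Object.prototype.
    const clean: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      if (FORBIDDEN_KEYS.has(key)) continue;
      clean[key] = sanitizeToolArgsSketch((value as Record<string, unknown>)[key]);
    }
    return clean;
  }
  return value; // primitives pass through unchanged
}

// JSON.parse creates "__proto__" as an own property, which is how a
// model-supplied payload would carry it.
const raw: unknown = JSON.parse('{"__proto__": {"polluted": true}, "ok": true}');
const cleaned = sanitizeToolArgsSketch(raw);
console.log(JSON.stringify(cleaned)); // {"ok":true}
```

Running the scrub before validation means a schema without `additionalProperties: false` never sees the dangerous keys at all, which is the ordering the commit message calls for.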

`snapshotStreamToTextDeltas` was concatenating instead of resetting when a snapshot was not a prefix-extension of the previously accumulated text. For self-correction snapshots (Chrome replacing, not extending, prior text) this corrupted consumer state with duplicated content like `"hello worldhello sailor"`. Reset the accumulator to the new snapshot and emit it as the delta so consumers treat the non-prefix boundary as a replace, matching the documented streaming-convention exception.

Also add `snapshotStreamToTextDeltas` to `_testOnly` so the helper is testable from the test package, and add coverage for:

- HIGH-1: chat cache reuse and rebuild-on-divergence
- HIGH-2: __proto__/constructor/prototype scrubbing (top-level + recursive)
- HIGH-3: prefix-extend, non-prefix-reset, identical-snapshot semantics

Also fix a stale comment in the existing tool-calling lifecycle test that claimed cache reuse; tool-calling intentionally rebuilds per turn.
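To illustrate the snapshot-to-delta fix, here is a simplified synchronous sketch. The real helper consumes a stream and emits `text-delta` events; `snapshotsToDeltasSketch` is a hypothetical name, it works on a plain array, and treating an identical snapshot as "emit nothing" is an assumption, since the commit message does not spell out that case.

```typescript
// Converts cumulative snapshots into deltas, resetting on non-prefix input.
function snapshotsToDeltasSketch(snapshots: string[]): string[] {
  const deltas: string[] = [];
  let accumulated = "";
  for (const snapshot of snapshots) {
    if (snapshot === accumulated) {
      continue; // assumed behavior: an identical snapshot adds nothing
    }
    if (snapshot.startsWith(accumulated)) {
      // Prefix-extension: emit only the new suffix.
      deltas.push(snapshot.slice(accumulated.length));
    } else {
      // Non-prefix snapshot: emit the whole snapshot as a replace boundary.
      deltas.push(snapshot);
    }
    // In both branches the accumulator becomes the latest snapshot;
    // the bug was appending to it in the non-prefix branch.
    accumulated = snapshot;
  }
  return deltas;
}

console.log(snapshotsToDeltasSketch(["hello", "hello world", "hello sailor"]));
// [ 'hello', ' world', 'hello sailor' ]
```

A consumer that blindly concatenates these deltas would still see `"hello worldhello sailor"`, which is why the third delta has to be treated as a reset boundary rather than an append.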

pkg-pr-new · 2026-05-22T08:34:25Z

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@528

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@528

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@528

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@528

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@528

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@528

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@528

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@528

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@528

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@528

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@528

@workglow/util

npm i https://pkg.pr.new/@workglow/util@528

workglow

npm i https://pkg.pr.new/workglow@528

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@528

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@528

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@528

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@528

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@528

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@528

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@528

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@528

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@528

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@528

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@528

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@528

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@528

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@528

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@528

commit: fb1bfd0

Copilot

Pull request overview

Fixes three high-priority issues in the Chrome AI provider: (1) chat session cache reuse, (2) prototype-pollution hardening for tool-call arguments, and (3) corrected handling of non-prefix streaming snapshots.

Changes:

Reworked WebBrowser_Chat cache reuse to use a messageCount watermark instead of a history fingerprint.
Added recursive sanitization of tool-call args to drop __proto__ / constructor / prototype keys before JSON-schema validation.
Adjusted snapshotStreamToTextDeltas to reset internal state on non-prefix snapshots and added targeted tests.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
providers/chrome-ai/src/ai/index.ts	Exposes `snapshotStreamToTextDeltas` under `_testOnly` for unit tests.
providers/chrome-ai/src/ai/common/WebBrowser_ToolCalling.ts	Adds `sanitizeToolArgs` and runs it before schema validation/capture.
providers/chrome-ai/src/ai/common/WebBrowser_Sessions.ts	Documents/adjusts cached session state to emphasize `messageCount` reuse semantics.
providers/chrome-ai/src/ai/common/WebBrowser_ChromeHelpers.ts	Updates snapshot→delta conversion logic and documentation for non-prefix snapshots.
providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts	Fixes chat cache reuse check and updates cache writes to store `messageCount`.
packages/test/src/test/ai-provider/WebBrowserProvider.test.ts	Adds tests for chat cache reuse, tool-arg sanitization, and snapshot reset behavior.

+        // delta. Consumers reconstructing full text by concatenation should
+        // treat any subsequent non-prefix delta as a reset boundary; use
+        // `snapshotStreamToSnapshots` if you need replace-mode semantics.
+        accumulatedText = value;
        yield { type: "text-delta", port, textDelta: value };


+      // Now simulate a retroactive history mutation: a NEW user message at
+      // index 0 (so lastUserIdx becomes 2, not 1). Cache's messageCount=2
+      // != expectedPriorCount=2? Actually equal here, so try shrinking.
+      // Shrink: a single user message again. lastUserIdx=0, but cache
+      // says messageCount=2 → mismatch → rebuild.


github-actions · 2026-05-22T15:32:30Z

Coverage Report

Status	Category	Percentage	Covered / Total
🔵	Lines	62.25%	22893 / 36773
🔵	Statements	62.13%	23683 / 38117
🔵	Functions	63.19%	4309 / 6819
🔵	Branches	50.77%	11078 / 21817

File Coverage

No changed files found.

Generated in workflow #2382 for commit fb1bfd0 by the Vitest Coverage Report Action


* feat(chrome-ai): chat history, tool calling, structured generation, capability probe

Integrates the chrome-ai branch (7 commits, PR #514/#520/#528) with main's parallel chrome-ai work (model.download, model.dispose, ApiBinding):

- Chat-session cache keyed by AiChatTask sessionId, with messageCount high-water mark for reuse (replaces fingerprint-based invalidation)
- StructuredGeneration + ToolCalling run-fns gated by an async capability probe; pre-probe state advertises a conservative subset (no json-mode, no tool-use) so the provider never claims a capability it can't fulfil
- ChatHistory helpers + WebBrowser_TextGeneration_Unified dispatcher (text.generation shared by AiChatTask + TextGenerationTask)
- ChromeHelpers ships both assertAvailability and ensureAvailable; both session APIs (chrome-chat cache + idle-evict store) coexist
- Drops main's WebBrowser_Chat.test.ts (chrome-ai's WebBrowserProvider.test already covers chat behavior under the new cache semantics)

* refactor(ai,chrome-ai,openai,hfi): shared tool sanitation; emit-pattern streams

Tool calling utilities (packages/ai/src/task/ToolCallingUtils.ts):

- sanitizeToolArgs: recursive __proto__/constructor/prototype scrubbing for model-supplied tool args (prototype-pollution defence)
- compileToolValidators + validateToolCallArgs: per-tool inputSchema validation with graceful fallback for tools whose schema fails to compile

Stream helpers converted from generators to emit-callback so run-fns no longer need a for-await/yield pump:

- snapshotStreamToTextDeltas / snapshotStreamToSnapshots (chrome-ai)
- accumulateOpenAIStream (@workglow/ai provider-utils, used by OpenAI + HFI)

Run-fns updated to call helpers with emit directly and emit their own final 'finish' event. chrome-ai's WebBrowser_ToolCalling drops its private sanitization + validation copy and reuses the shared utils.

* fix(chrome-ai): wire model.dispose; apply sanitizeToolArgs across providers

Addresses review of #514/#520/#528 rebase:

CRITICAL fix: `model.dispose` now reaches chat-cached sessions. The post-rebase chrome-ai branch had two parallel session maps (`chromeSessions` for chat reuse, `sessions` for idle-evict + ModelDispose lookup) but only the chat map was populated by runtime code, making `model.dispose` a functional no-op in production. Unified into a single Map<sessionId, WebBrowserSessionEntry> with both chat-cache fields (messageCount, fingerprints) and lifecycle fields (modelKey, lastUsedAt, idleTimer). `ChromeChatSessionState` now requires `modelKey`. `disposeWebBrowserSessionsForModel(modelKey)` iterates the unified store, so model.dispose destroys chat-cached sessions. Chat sessions become subject to idle eviction (free bonus).

IMPORTANT: sanitizeToolArgs applied across the codebase per intent of the prior refactor:

- OpenAIShapedChat (parseOpenAIToolCallMessage + accumulateOpenAIStream) → covers OpenAI + HFI
- ToolCallParsers (adaptParserResult + parseToolCallsFromText) → covers llama.cpp Hermes/Liquid/Qwen35/Llama paths + HFT
- Anthropic_ToolCalling (input_json_delta + content_block_stop)
- Gemini_ToolCalling (functionCall.args)
- Ollama_ToolCalling (parsed function.arguments)
- LlamaCpp_ToolCalling (extractNativeFunctionCalls)
- Cactus_ToolCalling[.browser] (JSON-parse parseToolCalls paths)

Every model-supplied tool-arg payload now passes through sanitizeToolArgs before reaching downstream consumers, closing the prototype-pollution vector across the provider matrix.

Also:

- Added packages/test/src/test/ai/ToolCallingUtils.test.ts (14 unit tests for sanitizeToolArgs, compileToolValidators, validateToolCallArgs, plus a sanitize→validate→name-check integration test).
- Added WebBrowser_Sessions.test regression for the unified-store behavior (disposeWebBrowserSessionsForModel sees chat-cached entries).
- Documented WebBrowser_Chat's rebuild-on-next-turn recovery model (vs the in-fn retry that main's now-deleted test exercised).

* feat(chrome-ai): retry once on InvalidStateError when a cached session is destroyed

Chrome can destroy a `LanguageModel` session out from under us (tab backgrounding, GPU process restart, memory pressure). When a cached session's `promptStreaming` throws DOMException("...destroyed...", "InvalidStateError") we now rebuild the session from full history via `initialPrompts` and retry the prompt once. Retry is gated on three conditions, all required:

- We were using a CACHED session (a fresh-session failure means the model is broken; retrying won't help).
- No text-delta has reached the consumer yet (we can't unsend deltas).
- The error name is `InvalidStateError` (matches Chrome's InvalidStateError DOMException; tolerant of message-text changes).

Tests:

- "retries once with a fresh session when a cached session is destroyed" seeds the cache on turn 1, has the cached session's promptStreaming throw on turn 2's reuse, asserts rebuild + retry + cache replacement.
- "does not retry when a fresh (non-cached) session fails" guards the first gate.
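The three retry gates listed in the final commit message can be expressed as a single predicate. This is a sketch under assumptions: `RetryContext`, its fields, and `shouldRetryDestroyedSession` are hypothetical names for illustration, not the provider's actual code.

```typescript
// Hypothetical bookkeeping a run-fn might keep for one prompt attempt.
interface RetryContext {
  usedCachedSession: boolean; // false when the session was freshly built
  deltasEmitted: number; // text-deltas already sent to the consumer
}

// True only when all three gates from the commit message hold.
function shouldRetryDestroyedSession(err: unknown, ctx: RetryContext): boolean {
  // Match on the error name rather than the message text, so wording
  // changes in the message do not break detection.
  const name =
    typeof err === "object" && err !== null && "name" in err
      ? (err as { name?: unknown }).name
      : undefined;
  return ctx.usedCachedSession && ctx.deltasEmitted === 0 && name === "InvalidStateError";
}

// Usage: an Error with a matching name stands in for Chrome's DOMException.
const destroyed = new Error("The session has been destroyed.");
destroyed.name = "InvalidStateError";
console.log(shouldRetryDestroyedSession(destroyed, { usedCachedSession: true, deltasEmitted: 0 })); // true
console.log(shouldRetryDestroyedSession(destroyed, { usedCachedSession: false, deltasEmitted: 0 })); // false
```

Keeping the gates in one pure function makes each condition easy to test in isolation; the caller would still be responsible for retrying only once.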

sroussey added 3 commits May 22, 2026 01:27

sroussey requested a review from Copilot May 22, 2026 14:32

Copilot started reviewing on behalf of sroussey May 22, 2026 14:33

Copilot AI reviewed May 22, 2026

docs(chrome-ai): align test comment with actual shrink-rebuild behavior

fb1bfd0

sroussey merged commit 31d1abc into chrome-ai May 22, 2026
12 checks passed

sroussey deleted the claude/sweet-edison-NSph2-chrome-ai branch May 22, 2026 15:44

sroussey merged 4 commits into chrome-ai from claude/sweet-edison-NSph2-chrome-ai

2 participants
