fix(cactus,chrome-ai): security and correctness fixes from review#531
Merged
Conversation
Previously the outer try/catch around the cache-read path swallowed every
filesystem error (EACCES, EIO, EISDIR, EMFILE, ...) and silently fell through
to the network refetch. That masked real misconfigurations as innocuous
cache misses, hiding the underlying cause from operators and forcing
unnecessary HuggingFace fetches.
The catch block now distinguishes:
- CactusIntegrityError re-throws (unreachable; handled inside),
- ENOENT falls through (the legitimate cache-miss case),
- any other code wraps the original error with a descriptive message
and `cause` so callers can see what actually went wrong.
Regression test in Cactus_Runtime.node.test.ts covers both branches by
driving the production code through real fs state — ENOENT via an empty
temp dir, and a non-ENOENT (EISDIR) by making the asset path a directory.
`fetchAssetBytes` (the public entry) already called `assertSafeModelId` once before delegating, but the helper `fetchAssetBytesBrowser` did not — so a future refactor that hits the helper from a different code path could slip a hostile model_id past the allowlist. The Node variant `fetchAssetBytesNode` already re-validates at the call site; this brings the browser helper to parity. The helper now takes `model_id` as an explicit parameter and asserts it as the first statement, matching the Node helper's pattern. The single call site in `fetchAssetBytes` threads `model_id` through.
The unified session store evicts idle entries after 30 minutes (WEB_BROWSER_SESSION_IDLE_MS). Before this fix, a single long-running chat turn whose stream exceeded the idle window would have its cached session destroyed mid-stream by the idle timer, even though the model was actively producing output. The next turn would then look up a now-missing session. The fix wires `touchWebBrowserSession(sessionId)` into the chat run-fn's `trackingEmit` helper alongside the existing `deltaEmitted` bookkeeping, so every `text-delta` event resets the idle clock. Idle eviction still fires after 30 minutes of true silence (no further deltas), so the mechanism is preserved — only active streams defer it. Regression test in WebBrowser_Chat.idleTouch.test.ts: pre-populates the cache, drives a controllable ReadableStream through two chunks separated by 25 simulated minutes (50 min total elapsed), and asserts the session is still cached. A second test confirms post-stream silence still evicts at the 30-min mark.
Three changes that close the loop on the SHA-256 integrity machinery that landed in #530: 1. New `providers/cactus/scripts/hash-catalog.ts` (Bun) fetches every asset URL referenced by `CACTUS_CATALOG`, computes `sha256Hex` and byte length, prints a JSON report, and rewrites the catalog file in place when `--write` is passed. Wired as `bun run hash-catalog` in the cactus package. Re-running after a `revision` bump regenerates everything; the matcher is conservative and only replaces placeholder blocks so partially-populated catalogs are safe. 2. `Cactus_ModelCatalog.ts` gains a production guard: when `NODE_ENV === "production"` or `CACTUS_REQUIRE_REAL_HASHES === "1"`, module load throws if any asset still has the placeholder sha256 or a non-positive size. Dev/test stays permissive (no env var) so contributors can iterate against `TODO_FILL_AT_RELEASE`. Exposes `CATALOG_HAS_PLACEHOLDERS` so release tooling can opt-in to the same check. 3. Catalog values populated with real hashes + sizes obtained by running the script against the live HuggingFace asset URLs: - needle.safetensors (22,259,039 bytes) - vocab.txt (122,132 bytes) - config.json (320 bytes) With real sizes now in place, the network-fallthrough assertion in `Cactus_Runtime.node.test.ts` is tightened to assert via the downstream integrity rejection (which can only fire if fetch ran), keeping the ENOENT-branch coverage intact. Tests in `Cactus_ModelCatalog.test.ts` verify both env-var-gated states by spawning a child Bun process with and without CACTUS_REQUIRE_REAL_HASHES set.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR applies a set of targeted security/correctness fixes across the Cactus and Chrome-AI providers, plus adds release/tooling safeguards around Cactus’ asset hash catalog to ensure integrity checks can’t be silently bypassed in production.
Changes:
- Cactus: tighten filesystem error handling to only fall back to network fetch on
ENOENT, and add defense-in-depthmodel_idvalidation in the browser cache helper. - Chrome-AI: prevent active streaming sessions from being evicted mid-stream by touching the unified session store on each
text-delta. - Cactus: add a hash-catalog generation script, populate real hashes/sizes in the catalog, and add a production/env-gated placeholder/size guard with tests.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts | Touches session on each streamed delta to defer idle eviction during long generations. |
| providers/cactus/src/ai/common/Cactus_Runtime.ts | Restricts cache-read fallthrough to ENOENT; rethrows other fs errors with cause. |
| providers/cactus/src/ai/common/Cactus_Runtime.browser.ts | Re-validates model_id inside the browser cache helper; threads model_id through. |
| providers/cactus/src/ai/common/Cactus_ModelCatalog.ts | Adds production/env guard against placeholder hashes or non-positive sizes; exports CATALOG_HAS_PLACEHOLDERS; updates catalog to real hashes/sizes. |
| providers/cactus/scripts/hash-catalog.ts | New Bun script to fetch assets, compute sha256/size, and optionally rewrite the catalog. |
| providers/cactus/package.json | Adds hash-catalog script entry. |
| packages/test/src/test/helpers/cactus-placeholder-guard-runner.ts | Child-process helper to validate placeholder guard behavior under env toggles. |
| packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts | Regression tests ensuring streaming defers idle eviction while post-stream eviction still occurs. |
| packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts | Regression tests for fetchAssetBytesNode cache-read error handling. |
| packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts | Tests for catalog placeholder detection/export and env-gated placeholder rejection behavior (via runner). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+73
to
+92
| function rewriteCatalogFile(catalogPath: string, reports: readonly AssetReport[]): number { | ||
| const source = readFileSync(catalogPath, "utf8"); | ||
| let next = source; | ||
| let replaced = 0; | ||
| for (const r of reports) { | ||
| // Match block: | ||
| // filename: "<r.filename>", | ||
| // sha256: CACTUS_HASH_PLACEHOLDER, | ||
| // size: 0, | ||
| // (whitespace-tolerant; size may be any number, sha256 may be the constant or quoted hex placeholder) | ||
| const blockRe = new RegExp( | ||
| `(filename:\\s*"${escapeRegex(r.filename)}",\\s*\\n\\s*)` + | ||
| `sha256:\\s*(?:CACTUS_HASH_PLACEHOLDER|"${escapeRegex(CACTUS_HASH_PLACEHOLDER)}"),` + | ||
| `(\\s*\\n\\s*)size:\\s*\\d+,`, | ||
| "m" | ||
| ); | ||
| const updated = next.replace( | ||
| blockRe, | ||
| `$1sha256: "${r.sha256}",$2size: ${r.size},` | ||
| ); |
Comment on lines
+62
to
+73
| it("CACTUS_REQUIRE_REAL_HASHES=1 throws when a placeholder is present", () => { | ||
| // Spawn a child process that: | ||
| // - injects a stubbed Cactus_Integrity that exposes the placeholder | ||
| // under the constant the catalog imports, then | ||
| // - injects a temporary catalog source whose first asset's sha256 is | ||
| // the placeholder and size=0, then | ||
| // - imports the module under CACTUS_REQUIRE_REAL_HASHES=1. | ||
| // We assert the child exits non-zero. To avoid file mutation in the | ||
| // working tree, we use a TS evaluator script that constructs the | ||
| // placeholder situation inline by re-importing the constants and | ||
| // running the guard predicate directly. | ||
| const here = dirname(fileURLToPath(import.meta.url)); |
Comment on lines
+18
to
+23
| * The test temporarily mutates the catalog source through Bun's `import()` | ||
| * resolver. We avoid `vi.mock` because mocking the module from outside while | ||
| * also exercising its module-load side effects is brittle across runners; | ||
| * instead we drive the assertions by directly invoking the same validation | ||
| * predicate (`CATALOG_HAS_PLACEHOLDERS`) and by spawning a child process | ||
| * with the env var set so the import-time throw is observable. |
…ouch test The run-fn only reads input.messages (and optionally input.temperature) at runtime, but AiChatProviderInput requires model+prompt at the schema layer, which the production dispatcher provides but is irrelevant for this unit test. Match the dispatcher convention (any-typed input) via an explicit cast so the test compiles under bun run build:types. Unblocks CI build/CodeQL/publish-preview/vitest jobs that were skipping after the build job failed on this TS2345.
@workglow/cli
@workglow/ai
@workglow/browser-control
@workglow/indexeddb
@workglow/javascript
@workglow/job-queue
@workglow/knowledge-base
@workglow/mcp
@workglow/storage
@workglow/task-graph
@workglow/tasks
@workglow/util
workglow
@workglow/anthropic
@workglow/bun-webview
@workglow/cactus
@workglow/chrome-ai
@workglow/electron
@workglow/google-gemini
@workglow/huggingface-inference
@workglow/huggingface-transformers
@workglow/node-llama-cpp
@workglow/ollama
@workglow/openai
@workglow/playwright
@workglow/postgres
@workglow/sqlite
@workglow/supabase
@workglow/tf-mediapipe
commit: |
Coverage Report
File CoverageNo changed files found. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Four small fixes surfaced by a security / DX / correctness review of recent commits to the Cactus and Chrome-AI providers (in particular
#530SHA-256 integrity +#514Chrome AI). One commit per fix, no squashing.fix(cactus): catch only ENOENT in fetchAssetBytesNode— the outertry/catcharound the cache-read path used to swallow every filesystem error and silently refetch over the network, masking real problems likeEACCES/EIO/EISDIR. The catch now letsENOENTfall through (legitimate cache miss) and re-wraps every other code with acause-carrying error so operators see what actually went wrong.fix(cactus): validate model_id inside fetchAssetBytesBrowser— defense-in-depth parity withfetchAssetBytesNode. The browser helper now re-assertsassertSafeModelIdat the top, matching the Node variant. A future refactor that hits the helper from a new entry point can no longer slip a hostilemodel_idpast the allowlist.fix(chrome-ai): touch chat session on each delta to defer idle eviction— long-running streams (>30 min) used to have their cachedLanguageModelsession destroyed mid-flight by the unified store's idle timer. The chat run-fn now callstouchWebBrowserSession(sessionId)on everytext-delta, so active output defers eviction; post-stream silence still evicts at the normal 30-min mark.fix(cactus): add hash-catalog script + production placeholder guard— a Bun-runnableproviders/cactus/scripts/hash-catalog.tsfetches every catalog asset URL, computes sha256 + byte length, prints JSON, and (with--write) rewrites the catalog in place. Module-load throws in production /CACTUS_REQUIRE_REAL_HASHES=1when any asset still uses the placeholder or has non-positive size; exportsCATALOG_HAS_PLACEHOLDERSso tooling can opt-in. The script was run against the live HuggingFace URLs and the catalog now ships real hashes + sizes for the threeneedle-26massets.Findings source
These four items came from a focused review of the most-recent merged commits in
providers/cactusandproviders/chrome-ai. They are all small, locally-scoped, and have regression-test coverage (except commit 2 which mirrors an already-tested pattern from the Node variant).Test plan
npx vitest run packages/test/src/test/ai-provider-cactus/— 11 files, 31 tests (2 skipped integration)npx vitest run packages/test/src/test/ai-provider/— 14 files, 163 testsbun test packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts— passes under Bun runnerbun run build:js— all 61 turbo tasks greenbun providers/cactus/scripts/hash-catalog.ts— verified against live HuggingFace asset URLs; real hashes/sizes committedNew / modified files at a glance:
providers/cactus/src/ai/common/Cactus_Runtime.ts— ENOENT-only catchproviders/cactus/src/ai/common/Cactus_Runtime.browser.ts— model_id re-validation in helperproviders/cactus/src/ai/common/Cactus_ModelCatalog.ts— production guard,CATALOG_HAS_PLACEHOLDERS, real hashesproviders/cactus/scripts/hash-catalog.ts— new scriptproviders/cactus/package.json—hash-catalognpm scriptproviders/chrome-ai/src/ai/common/WebBrowser_Chat.ts— touch on text-deltapackages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts— newpackages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts— newpackages/test/src/test/helpers/cactus-placeholder-guard-runner.ts— new (child-process driver for the guard test)packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts— newOpen follow-up
None blocking — the hash-catalog script was successfully run against HuggingFace and the catalog now contains real hashes for
needle-26m. Re-runningbun run --filter @workglow/cactus hash-catalog -- --writeis the documented path after any futurerevisionbump.https://claude.ai/code/session_014HFcmuoPdbo9YWs29okjyF
Generated by Claude Code