fix: prevent repeated failed rebundle attempts and TOCTOU on cache fallback#878
fix: prevent repeated failed rebundle attempts and TOCTOU on cache fallback#878
Conversation
…llback When a rebundle fails (e.g. pulled extensions with bare specifiers and no deno.json import map), the fallback to the cached bundle left the cache mtime unchanged. This meant every subsequent cold start would re-detect the bundle as stale, invoke the Deno bundler, wait for it to fail, then fall back again — adding seconds of latency per extension on every load. Touch the cache file's mtime after a successful fallback so the next mtime check sees it as fresh and short-circuits immediately. Also fix a pre-existing TOCTOU race: bundleExists was set via Deno.stat() early in the method, but the fallback Deno.readTextFile() runs later in a catch block. If the bundle is deleted between the stat and the read (e.g. concurrent GC), readTextFile would throw unhandled. Wrap the fallback read in a try/catch so deletion between stat and read falls through to the original bundleError instead of crashing. Applied consistently to all 4 extension loaders: models, vaults, drivers, and datastores. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Adversarial Review
Critical / High
None.
Medium
None.
Low
-
Code duplication across 4 loaders — The identical 11-line catch block is now copy-pasted in
user_model_loader.ts:474-492,user_vault_loader.ts:257-270,user_driver_loader.ts:250-270, anduser_datastore_loader.ts:252-270. This is pre-existing duplication (the method was already copy-pasted), so it's not introduced by this PR, but worth noting that a future refactor to extractbundleWithCacheinto a shared utility would reduce the surface area for divergence bugs. -
No new test coverage for the TOCTOU path or the utime touch — The existing tests don't exercise the scenario where
readTextFilefails afterstatsucceeded (TOCTOU), or verify thatutimeis called after a fallback. These are defensive paths that are hard to unit-test without mocking Deno APIs, so this is informational rather than blocking.
Verdict
PASS — The changes are correct and well-reasoned. The read-before-utime ordering ensures the cached content is captured even if the file disappears before the mtime touch. The inner try/catch around Deno.utime correctly treats touch failure as non-fatal. The outer try/catch around readTextFile correctly falls through to throw bundleError when the cache is gone (TOCTOU). No security, data integrity, or logic concerns.
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
Duplicated logic across 4 loaders: The cache fallback + mtime touch + TOCTOU guard is now a ~15-line block duplicated identically in all 4 files. Consider extracting a shared helper (e.g.
fallbackToCachedBundle(bundlePath, relativePath, logger)) to reduce duplication. Not blocking since the current approach is consistent and correct. -
Missing
user_driver_loader_test.ts: The driver loader is the only one of the 4 without a test file. Pre-existing, but worth noting as a gap. -
No unit tests for the new utime/TOCTOU paths: The mtime touch and TOCTOU catch are defensive and hard to test without filesystem stubs, so this is understandable — but if the project ever adds filesystem abstraction, these would be good candidates.
Overall: well-scoped fix with clear comments, correct error handling, and no security concerns. The approach of touching mtime to short-circuit subsequent loads is sound, and the TOCTOU fix properly handles the race condition by falling through to throw the original error.
Summary
Deno.stat()and the fallbackDeno.readTextFile()Problem
PR #876 changed extension loaders to attempt rebundling before falling back to cached bundles, fixing a bug where stale caches were served even when source files had changed. However, CI review identified a performance regression:
For pulled extensions with bare specifiers (e.g.
from "zod"resolved via a project'sdeno.json), repos without the import map can't rebundle. The old code detected bare specifiers upfront and returned the cache immediately. The new code attemptsbundleExtension(), which invokes the Deno bundler, waits for it to fail, then falls back — adding seconds of latency per extension on every load.This is because the cache file's mtime was never updated after a fallback, so the next mtime check would see the cache as stale again and trigger the same failed rebundle cycle.
User Impact
Users with many pulled extensions using bare specifiers (e.g.
@swamp/hetzner-cloudwith 11 models) would see significantly slower cold starts — the Deno bundler would be invoked and fail 11 times on everyswampcommand, adding potentially 10+ seconds of startup latency.Fix
Mtime touch: After falling back to the cached bundle on rebundle failure,
Deno.utime()touches the cache file's mtime. The next load sees the cache as "fresh" and returns it immediately at the mtime check — no bundler invocation.TOCTOU fix: The
bundleExistsflag was set viaDeno.stat()early in the method, but the fallbackDeno.readTextFile()runs in a later catch block. If the bundle is deleted between the stat and the read (e.g. concurrent garbage collection),readTextFilewould throw unhandled. The fallback read is now wrapped in a try/catch — if the file is gone, it falls through to throw the originalbundleError.Test plan
deno checkpasses on all 4 loadersdeno lintpasses on all 4 loadersdeno fmtpasses@swamp/hetzner-cloud— 11 models in a repo withoutdeno.json)deno.jsonis present (local dev scenario)🤖 Generated with Claude Code