perf(filesearch): move @-picker crawl and fzf index to worker_threads by callmeYe · Pull Request #3455 · QwenLM/qwen-code

callmeYe · 2026-04-20T02:25:34Z

TLDR

Moves the @-picker's recursive file crawl and fzf index construction off the main thread into a worker_threads worker, so pressing @ no longer freezes the Ink render loop. On large workspaces (home directory, 100k-file monorepos) the CLI stays fully responsive where it previously stalled for 1–9 seconds.

Also wires up a bundled ripgrep fast path for the crawl (3–4× faster than fdir past ~50k files) and pre-warms the index at CLI boot so the first @ press usually finds a ready snapshot.

Screenshots / Video Demo

N/A — no new UI surface; the visible change is "nothing freezes anymore" on large trees, which is hard to capture in a still. Happy to record a before/after clip against ~ if reviewers want one.

Dive Deeper

Root cause of the original freeze was two synchronous pieces of work on the main thread:

fdir recursive crawl in packages/core/src/utils/filesearch/crawler.ts.
new AsyncFzf(allFiles, …) in packages/core/src/utils/filesearch/fileSearch.ts — despite the name the constructor is synchronous and dominates for large file counts.

This branch introduces:

FileIndexCore — pure class holding the crawl, fzf index, and result cache. Worker-safe and unit-testable without a real worker.
fileIndexWorker.ts — thin parentPort pump that wraps FileIndexCore.
FileIndexService — main-thread singleton keyed by (projectRoot + ignore-fingerprint + options) that owns the worker, multiplexes reqId-based search calls, fans out streaming partial updates, and disposes stale-key instances when .gitignore edits or cwd changes produce a new key.
RecursiveFileSearch is now a thin proxy over the service, so the public FileSearch interface (and vscode-ide-companion's use of it) is unchanged.
ripgrepCrawler.ts — rg --files backend using the bundled ripgrep binary. Empirically on macOS:

Tree size fdir rg via Node Winner

~2.7k files (qwen-code) ~25 ms ~140 ms fdir

~48k (node_modules) ~640 ms ~1.8 s fdir

~100k (home dir cap) ~9 s ~2.5 s rg 3–4×

At small sizes Node's spawn+IPC overhead beats rg's native parallel walker; past ~50k files rg's Rust walker wins decisively. Since both are well under the 200 ms loading threshold on small repos, rg is the default; QWEN_FILESEARCH_USE_RG=0 forces fdir.
AppContainer.tsx pre-warms FileIndexService.for(...) right after config.initialize() resolves, so the worker crawl starts in parallel with Config/MCP boot instead of waiting for the first @ keypress.
useAtCompletion.ts no longer flips isLoading=true on the INITIALIZE dispatch; the 200 ms slow-load timer now also covers INITIALIZING, so the common pre-warmed path opens silently and only slow cold-starts surface a spinner.
Composer.tsx no longer renders ConfigInitDisplay; the "Initializing…" banner at top of the prompt is gone.

The branch also lands five rounds of self-audit fixes on top of the P1 baseline:

crawlError now disposes the service (no zombie worker on transient failures).
dispose() rejects whenReady() waiters.
REFRESH uses a monotonic token so partial-driven re-searches fire even mid-SEARCHING.
worker.on('error') is wired and converted to a synthetic exit.
Worker dispose uses parentPort.close() (drains queued sends) instead of process.exit(0).
optionsKey() includes the .gitignore/.qwenignore fingerprint so edits invalidate the cached singleton; FileIndexService.for() also evicts stale-key instances sharing the same projectRoot.
FileSearch.dispose?() added to the interface; vscode-ide-companion's clearFileSearchCache now disposes instead of only dropping the Map entry.
Worker module-init is wrapped in try/catch and search payloads are type-validated; picomatch compile errors degrade to empty results instead of throwing to the UI.
scripts/prepare-package.js whitelists fileIndexWorker.js so the published npm tarball actually ships it.

Reviewer Test Plan

npm ci && npm run build && npm run bundle
Run on a small repo: npm start then press @ — should open instantly, no loading flash.
Run on a large tree: QWEN_WORKING_DIR=~ npm start, press @ within the first second — prompt should stay responsive (cursor keeps blinking, you can still type). Compare to main: freeze.
Force fdir: QWEN_FILESEARCH_USE_RG=0 QWEN_WORKING_DIR=~ npm start — slower crawl but still non-blocking.
.gitignore invalidation: open an @-picker, edit .gitignore to add a dir, press @ again — newly-ignored files disappear without a CLI restart.
Unit + hook tests: npm run test --workspace=packages/core and npm run test --workspace=packages/cli.

Testing Matrix

	🍏	🪟	🐧
npm run	✅	❓	❓
npx	❓	❓	❓
Docker	❓	❓	❓
Podman	❓	-	-
Seatbelt	❓	-	-

Personally validated on macOS 15.1.1 (arm64) with Node 22.17. Worker spawn latency on Windows is known to be higher (30–80 ms) — pre-warm absorbs it, but independent validation there would be appreciated. Sandbox profiles may need to permit worker thread creation; I haven't repro'd a sandbox issue locally but flagged it as a risk.

Before：

qwencode@before.mp4

after：

qwencodeafter.mp4

Linked issues / bugs

Fixes #3454

Before P1 the recursive @-picker did the fdir crawl and synchronous `new AsyncFzf(...)` construction on the main thread. On 100k-file workspaces that blocked the Ink render loop for 500ms–2s, so pressing `@` felt unresponsive. This change introduces `FileIndexCore` (pure class), `FileIndexWorker` (worker_threads entry), and `FileIndexService` (main-thread singleton that multiplexes search requests and exposes onPartial for future streaming). `RecursiveFileSearch` becomes a thin proxy, preserving the public `FileSearch` contract (and vscode-ide-companion's usage) untouched. The esbuild bundle now emits a self-contained `dist/fileIndexWorker.js` alongside `dist/cli.js`.

…EFRESH trigger Changes driven by an open-ended correctness audit: - crawlError now tears the service down (dispose worker, delete from INSTANCES, reject pending searches + whenReady waiters), so a transient crawl failure no longer leaks a worker per attempt. - dispose() now rejects whenReady() waiters with AbortError; previously callers awaiting whenReady() through a dispose could hang forever. Also reorders unsubscribe before posting 'dispose' so the in-process transport's synchronous exit cleanup can't beat the AbortError. - worker.on('error') is wired, converting uncaught worker errors into a normal exit-cleanup path instead of re-throwing on the main process and crashing the CLI. - useAtCompletion gains a monotonic refreshToken in its reducer state and effect deps, so REFRESH re-triggers the worker effect even when the status was already SEARCHING (previously a partial chunk arriving mid-search could not cause a re-run). - Worker 'dispose' now closes parentPort instead of process.exit(0) so in-flight searchResult/searchError messages drain before shutdown. - FileIndexCore.startCrawl preserves the allFiles reference on the cache-hit fallback path to keep concurrent search() iterations stable. - dispose() races transport.terminate() against a 2s timeout so a faulted worker cannot hang shutdown. - handleExit is now idempotent: if dispose() already set disposed=true, a subsequent worker exit won't double-reject pending/readyWaiters. Added regression tests for whenReady-on-dispose rejection and for post-dispose FileIndexService.for() creating a fresh instance.

…TANCES hygiene - `optionsKey()` now hashes the .gitignore / .qwenignore contents via loadIgnoreRules().getFingerprint(). Editing those files produces a new key so the next `FileIndexService.for()` spawns a fresh worker instead of serving a stale cached snapshot with outdated ignore rules. - `FileIndexService.for()` only memoises an instance that survived construction. If the Worker spawn errored synchronously and handleExit ran before INSTANCES ever had the key, a permanently-disposed instance would otherwise be cached for the lifetime of the process. - `FileSearch.dispose?()` is now a (optional) part of the public interface; the recursive proxy delegates to `FileIndexService.dispose()`. - `FileMessageHandler.clearFileSearchCache` calls `dispose()` when a workspace file create/delete fires so the worker's fzf index is rebuilt from disk instead of continuing to serve stale results. Added regression tests: editing .gitignore mid-session invalidates the singleton; disposed services don't pollute INSTANCES.

…ttern errors - Wrap FileIndexCore construction at worker module-init in try/catch. A bad projectRoot (e.g. containing NUL) used to crash the worker before its message handler attached, leaving the main thread waiting forever. We now surface the failure as a normal crawlError / searchError so the service sees it on the next message. - Validate reqId / pattern types on 'search' messages so a malformed IPC shape fails just the one request instead of the whole worker. - filter() now catches picomatch compile errors and returns [] — typing an interim `foo[` (common mid-keystroke state) no longer surfaces as a TypeError to the UI; it's simply "no matches" until the pattern is well-formed. - Worker error log trims to name+message instead of the full Error object, so a CLI transcript/wrapper doesn't capture absolute paths from the user's machine by default. Added regression test: malformed glob pattern in core.search returns [].

…y eviction - scripts/prepare-package.js now lists `fileIndexWorker.js` in the published tarball's `files` whitelist. Without this, `npm i -g` would produce an install missing the worker, and the first `@`-picker use would throw ERR_MODULE_NOT_FOUND at runtime. Release blocker. - FileIndexService.for() now disposes any stale instance under a different key but the same projectRoot before creating a new one. When `.gitignore` is edited mid-session, the fingerprint change would otherwise leave the old worker pinned in INSTANCES forever — the auditor flagged this as a real leak even though the commit message for round 2 implied it was fixed. - Dropped `__setIndexTransportFactory` from the public barrel; it's a test-only hook that shouldn't be part of the external API. Tests (and nothing else) can still import it from the module directly. - `search()` on a disposed service now throws AbortError instead of a plain Error, matching how in-flight searches are rejected inside dispose(). Added regression tests: stale-key service is disposed and its `search()` throws AbortError after .gitignore-driven eviction.

…nc throw Round 4 switched `FileIndexService.search()` on a disposed instance from `Error` to `AbortError` for notional consistency with how in-flight searches are rejected inside `dispose()`. The audit correctly called out that this silently breaks useAtCompletion: the hook's catch block swallows `AbortError` as "user pressed ESC" and never dispatches the ERROR state, so a crawlError-driven cascade (where dispose() is called and the next search hits the sync guard) would leave the UI stuck in SEARCHING forever. In-flight rejections inside dispose() remain AbortError — those are genuine cancellations. The post-dispose sync guard is caller misuse and must surface as a plain Error to drive the ERROR branch.

…cators Addresses post-audit UX feedback: - AppContainer kicks FileIndexService.for(...) as soon as Config.initialize resolves. The worker crawl starts in the background before the user types `@`, so the first picker open usually finds a ready (or nearly-ready) singleton and returns results without any visible wait. - useAtCompletion no longer flips isLoading on the INITIALIZE dispatch. A 200ms timer now arms during INITIALIZING too, so the "Loading suggestions..." placeholder only appears if the crawl/search is genuinely slow — the common pre-warmed path opens silently. - Remove the global ConfigInitDisplay render in Composer. The top-of-screen "Initializing..." banner is no longer shown while Config/MCP finish setting up; the prompt renders without any boot-time placeholder. Updated the two useAtCompletion tests that asserted the old flash-loading behaviour to assert the steady-state (isLoading=false) instead.

Initially planned as P2 to replace fdir with ripgrep for the @-picker crawl. Empirical benchmarking on two representative trees refuted the premise: qwen-code (~2700 files) fdir ~25ms vs rg (spawn+IPC) ~140ms node_modules (~48k files) fdir ~640ms vs rg (spawn+IPC) ~1800ms Even though `rg --files` is ~70ms in a shell, the child_process spawn and stdout-pipe overhead from Node dominates for every tree size we tested. Claude Code's speed almost certainly comes from being a native binary with an in-process walker, not from shelling out to ripgrep. The ripgrepCrawler implementation is kept for future re-evaluation on very large trees or different platforms, and is reachable via the `QWEN_FILESEARCH_USE_RG=1` env var. fdir remains the default. Also: useAtCompletion tests now sort before asserting on suggestion order. The exact ranking is an fzf tiebreak artifact that depends on crawler emission order — not a behavioural contract worth coupling tests to.

Earlier commit (21764b5) gated ripgrep off by default based on a benchmark that only covered small-to-medium trees (<48k files), where Node's spawn+IPC overhead does beat fdir. User-reported feedback prompted a re-test on a home-directory-scale target: qwen-code repo (~2700 files) fdir ~25ms rg ~140ms fdir wins project/node_modules (~48k) fdir ~640ms rg ~1800ms fdir wins ~/ home dir (100k-file cap) fdir ~9s rg ~2.5s rg 3-4× wins The slow case is the painful one — a user typing @ at $HOME shouldn't wait 9 seconds while Node's single-threaded walker catches up. On small repos both backends are well below the 200ms loading threshold, so the perceptual cost of rg's spawn overhead is zero. Flip the default; keep `QWEN_FILESEARCH_USE_RG=0` as the escape hatch to force fdir.

yiliang114 · 2026-04-20T03:07:18Z

+   * errored or exited). Used by the `FileSearch` proxy to preserve its
+   * original "initialize awaits full readiness" contract.
+   */
+  whenReady(): Promise<void> {


One blocker here: if the worker exits after FileIndexService.for() returns but before the caller invokes whenReady(), handleExit() only rejects the waiters that already exist and leaves the service effectively looking like it is still crawling. A later whenReady() call then adds a new waiter that never settles. I was able to reproduce this with a fake transport where the exit fires before whenReady() is called, and the promise timed out instead of rejecting. Could we mark the service as errored here (or have whenReady() reject once the instance has already exited/disposed)?

Great catch, thanks! You're right — handleExit was setting disposed = true but leaving _state at 'crawling', so a whenReady() call arriving after the exit parked in readyWaiters forever.

Fixed in b15c289: handleExit now transitions _state to 'error' first, which makes the existing check at the top of whenReady() reject synchronously for any subsequent caller. Added a regression test (fileIndexService.test.ts → "rejects whenReady() called after the transport has exited") that uses a fake transport firing exit before whenReady() subscribes — the exact scenario you reproduced.

yiliang114 · 2026-04-20T03:07:18Z

+      // Fire-and-forget: errors surface via the normal search path the next
+      // time the hook is used.
+      try {
+        FileIndexService.for({


One thing I noticed here: this prewarm still runs when enableRecursiveFileSearch is false, because we call FileIndexService.for(...) unconditionally and start the worker/crawl anyway. The actual search path respects the flag later, so this specifically turns into a startup regression for users who opted out of recursive file search. Could we guard the prewarm on the same config flag?

Good point, fixed. In b15c289 the prewarm is now gated on config.getEnableRecursiveFileSearch() !== false, so users who opted out no longer pay for a startup crawl or a worker they'll never use. The useAtCompletion search path already respected the flag at runtime — this just aligns startup behaviour with it.

yiliang114

The worker-thread refactor looks good overall, but I found two blocking lifecycle/config edge cases before I'd be comfortable merging. Details are in the inline comments: whenReady() can hang after an early worker exit, and the CLI still prewarms recursive indexing even when recursive file search is disabled.

Verification I ran locally: npm install in the review worktree (which completed build + bundle via prepare), packages/core targeted file-search tests, and packages/cli useAtCompletion tests all passed.

wenshao

I reviewed this PR and found four high-confidence issues that should be addressed before merge:

packages/cli/src/config/config.ts: --bare now drops explicitly provided --core-tools, which contradicts bare mode's contract to honor explicit CLI inputs and can silently remove requested tools in scripted flows.
packages/core/src/tools/swarm.ts: swarm workers still allow recursive orchestration tools like agent, swarm, and exit_plan_mode, so the do not spawn sub-agents restriction is not actually enforced.
packages/vscode-ide-companion/src/webview/handlers/FileMessageHandler.ts: cache invalidation can be bypassed if an in-flight file search initialization completes after clearFileSearchCache() has removed the promise from the map, allowing a stale index to be re-cached.
packages/webui/src/index.ts: changed code still fails typecheck because many relative ESM imports are missing explicit .js extensions under the repo's node16/nodenext TypeScript settings.

Lint passed, but typecheck did not pass for changed code.

— gpt-5.4 via Qwen Code /review

Windows CI was failing the full filesearch suite (useAtCompletion, crawler, fileSearch, fileIndexCore, fileIndexService) because ripgrep on Windows emits `.\file.txt`; the stdout pump stripped `./` before `toPosixPath`, so the leading `./` survived and the ancestor-directory walk in `buildRipgrepFileFilter` asked the ignore lib to test `"./"` — which throws RangeError and poisoned the data handler. Normalize to posix first, then strip; guard the filter for `.`/`./`/leading-`./` inputs as defence in depth. Also addresses review feedback: - fileIndexService: an early worker exit left `_state` at `'crawling'`, so a `whenReady()` call arriving after the exit parked in readyWaiters and never settled. `handleExit` now transitions to `'error'` so the state check rejects synchronously. - AppContainer prewarm: guarded on `enableRecursiveFileSearch` so opted-out users don't pay for a full crawl + worker at startup. - FileMessageHandler: added a per-rootPath generation token so an in-flight `initialize()` disposes the stale index instead of re-caching it when `clearFileSearchCache` fires mid-crawl. Regression tests added for all four. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

callmeYe · 2026-04-20T06:32:21Z

Thanks so much for the detailed review, @wenshao — really appreciate you taking the time to look this over carefully, your feedback is always spot-on and makes the codebase better. 🙏

Responses to each point:

1. `FileMessageHandler.ts` — cache-invalidation race ✅

Excellent catch, you nailed a real bug. The in-flight initialize() would indeed land in the Map after clearFileSearchCache had removed it, leaving a stale index masquerading as fresh. Fixed in b15c289 with a per-rootPath generation token:

fileSearchInitTokens: Map<string, symbol> — a fresh Symbol is minted before each init.
clearFileSearchCache deletes the token, which invalidates any in-flight init.
After await search.initialize(), the IIFE compares fileSearchInitTokens.get(rootPath) !== token and disposes the stale search instead of caching it.

Regression test added in FileMessageHandler.test.ts ("disposes stale index when clearFileSearchCache fires during in-flight initialize()") that reproduces the exact interleaving: deferred initialize → watcher callback fires → init resolves → assert dispose() was called and search() was never invoked.

2–4. `config.ts` (`--bare`), `swarm.ts`, `webui/src/index.ts`

I want to gently flag that these three files aren't actually modified by this PR — I confirmed via `git diff upstream/main...HEAD`, which shows the branch's 16-file changeset is scoped to `esbuild.config.js`, the filesearch module, `AppContainer`/`Composer`/`useAtCompletion`, `FileMessageHandler.ts`, `scripts/prepare-package.js`, and `packages/core/src/index.ts`. None of `config.ts`, `swarm.ts`, or `webui/src/index.ts` appear in the diff.

My read is that the `/review` agent may have expanded its scope beyond the PR changeset and surfaced findings from the broader tree. All three sound like plausible issues in their own right — happy to dig into any of them in a separate PR if you'd like, but I'd prefer to keep this PR scoped to the worker_threads change and the CR follow-ups so it doesn't grow further.

Also in `b15c289`

While I was in there I also addressed @yiliang114's two blocker-level findings (see the inline threads for details) and fixed the Windows CI failure — on Windows, ripgrep emits .\file.txt, and my stdout pump was stripping ./ before normalising, causing \"./\" to reach ignore.ignores() which threw RangeError and silently poisoned every filesearch test. Normalised to posix first + added defensive guards in buildRipgrepFileFilter.

Thanks again for the careful review! Let me know if anything else should be addressed before merge.

wenshao

[Critical] packages/vscode-ide-companion/src/webview/handlers/FileMessageHandler.ts contains new typecheck failures because message payloads typed as Record<string, unknown> are accessed with dot syntax (data?.query, data.path, data.content, etc.) instead of bracket syntax required by the repo's TS settings. Please switch those reads to bracket access such as data?.['query'], data['path'], and data['content'].

[Critical] packages/vscode-ide-companion/src/webview/handlers/FileMessageHandler.ts also introduces an unused buildCaseInsensitiveGlob helper, which is a compile error under the current TS config. Please remove it or wire it into the active code path.

— gpt-5.4 via Qwen Code /review

…ata payloads - Remove unused `buildCaseInsensitiveGlob` helper and its companion `globSpecialChars` Set. Both were carried over from an earlier case-insensitive-search attempt and have no callers anywhere in the tree. - Switch `Record<string, unknown>` reads in the message handler (`data?.query`, `data.path`, `data.content`, etc.) to bracket syntax so the file stays clean under stricter `noPropertyAccessFromIndexSignature` variants of the TS config. Pure cleanup; behaviour unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

callmeYe · 2026-04-20T07:14:27Z

Thanks again for the follow-up, @wenshao — your eye for TS hygiene is really sharp, I appreciate the second pass. 🙏 Both items addressed in 6ffe02b:

1. Unused `buildCaseInsensitiveGlob` helper ✅ removed

You're absolutely right that this was dead weight. Worth noting for context: the helper and its companion globSpecialChars Set are actually inherited from upstream/main (pre-dated this PR — present at merge-base 7cded6e0d), so it's long-standing leftover from an earlier case-insensitive glob attempt. grep -rn buildCaseInsensitiveGlob confirms zero callers. Deleted the helper and the Set both in 6ffe02b.

2. Dot-access on `Record<string, unknown>` ✅ converted to bracket syntax

Migrated all eight sites in FileMessageHandler.ts to bracket access:

data?.['query'], data?.['requestId'], data?.['path'] (message dispatch)
data['path'], data['oldText'], data['newText'] (handleOpenDiff)
data['content'], data['fileName'] (handleCreateAndOpenTempFile)

One nuance worth sharing: the vscode-ide-companion package's local tsconfig doesn't currently enable noPropertyAccessFromIndexSignature (it's only on in the root tsconfig), which is why npm run check-types was passing locally for me. I cross-checked by running tsc --noEmit --noPropertyAccessFromIndexSignature --noUnusedLocals against the package and FileMessageHandler.ts comes back clean after the edit. So even if the package's tsconfig tightens in the future, this file is forward-compatible.

CI run is already in flight for 6ffe02b — let me know if anything else needs attention before merge.

…ng tmp After the ripgrep path fix landed, the only remaining Windows CI failures were `EBUSY: resource busy or locked, rmdir` on test temp directories. Root cause: `ripgrep --files` is spawned with `cwd: crawlDirectory`, and on Windows the OS holds a handle on that working directory until the child fully exits. If `afterEach` calls `cleanupTmpDir` before the FileIndexService (and its rg subprocess) is disposed, rmdir races the handle release and fails. Three-pronged fix: - `cleanupTmpDir` now passes `maxRetries: 5, retryDelay: 100` to `fs.rm`. Node retries EBUSY/EPERM internally on Windows; half a second is plenty for tens-of-ms handle-release races. This is a blanket safety net for every caller. - `fileIndexService.test.ts` afterEach: `__resetForTests()` runs before `cleanupTmpDir` so the transport is torn down first. - `useAtCompletion.test.ts` afterEach: added the same `FileIndexService.__resetForTests()` call so the hook's in-process worker is disposed before the dir is removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

yiliang114

(replaced — see updated review below)

yiliang114

Solid perf win — moving the crawl + fzf index off the main thread is the right call. Architecture looks reasonable and the five self-audit rounds show good discipline.

Two structural concerns before the specifics:

PR scope — this bundles three independent changes (worker architecture, ripgrep backend, UX/loading polish). Would be easier to review and safer to revert as 2-3 separate PRs.
2000+ lines — bulk is justified given the new subsystem, but some of the complexity below could be avoided.

5 high-risk + 5 medium-risk items flagged inline.

yiliang114 · 2026-04-21T07:06:25Z

+      for (const p of cached) this.allFiles.push(p);
+    }
+    this.crawlDone = true;
+  }


Double crawl on cache hit

When crawl() returns cached results it never fires onProgress, so allFiles and chunks are both empty here. This fallback then calls crawl() a second time — also a cache hit, so it works, but every cached boot does two cache lookups for no reason.

Fix: either have crawl() fire onProgress on the cache-hit path, or capture the return value of the first crawl() call directly instead of relying solely on the onProgress side-channel.

Fixed in ea82f9d. crawl() doesn't fire onProgress on cache hits, so the old code relied on a second call (which also hit the cache but was waste). Now startCrawl captures the first call's return value and reconciles only when streamed === false:

let streamed = false; const full = await crawl({ ..., onProgress: (chunk) => { streamed = true; /* push */ } }); if (!streamed) { for (const p of full) this.allFiles.push(p); }

One crawl call, one cache lookup on the hit path. The rationale for preferring this over having crawl() fire onProgress on cache hit: the cache exists precisely because we already know the full list — synthesising streaming chunks out of it would be a latency/memory overhead for no benefit. Keeping the return-value path cleaner.

yiliang114 · 2026-04-21T07:06:25Z

+      // time the hook is used.
+      if (config.getEnableRecursiveFileSearch() !== false) {
+        try {
+          FileIndexService.for({


Prewarm options duplicated with useAtCompletion

These options must be byte-for-byte identical to fileSearchOptions in useAtCompletion.ts:202 to hit the same FileIndexService singleton (keyed by sha256). The "must match" comment is correct but there's no compile-time or runtime guard — if one side drifts, you silently spawn two workers and the prewarm is wasted.

Suggestion: extract a shared buildFileSearchOptions(config) helper, use it in both places.

Fixed in ea82f9d. Extracted buildFileSearchOptions(config, cwd) into useAtCompletion.ts (exported) and rewired both call sites:

AppContainer.tsx prewarm: FileIndexService.for(buildFileSearchOptions(config, config.getTargetDir()))

useAtCompletion.ts hook: const fileSearchOptions = useMemo(() => buildFileSearchOptions(config, cwd), [config, cwd])

Both sites now go through the same helper, so a new field in FileSearchOptions is a compile-time change in one place. The "must match" drift risk is gone.

yiliang114 · 2026-04-21T07:06:25Z

+      if (refreshTimer) clearTimeout(refreshTimer);
+      unsubscribe();
+    };
+    // eslint-disable-next-line react-hooks/exhaustive-deps


eslint-disable hiding stale-closure risk

fileSearchOptions is a new object every render, so it can't go in deps without infinite re-runs — hence the disable. Same issue at line 333.

The problem: anyone adding a field to fileSearchOptions or changing the derivation logic won't get an eslint warning, and the effect silently goes stale.

Suggestion: wrap fileSearchOptions in useMemo([cwd, config]) so the reference is stable, then add it to deps and drop both eslint-disables.

Fixed in ea82f9d. Took the useMemo([config, cwd]) route you suggested:

const fileSearchOptions = useMemo( () => buildFileSearchOptions(config, cwd), [config, cwd], );

Both useEffects now list fileSearchOptions in their deps and the two eslint-disable react-hooks/exhaustive-deps directives are gone. A field added to FileSearchOptions (via the shared buildFileSearchOptions helper) will now trigger a clean re-render if config/cwd change, without the stale-closure hazard.

yiliang114 · 2026-04-21T07:06:25Z

+ * excluded. Cheap in practice — a filesystem tree has orders of magnitude
+ * fewer directories than files.
+ */
+async function enumerateEmptyDirs(


enumerateEmptyDirs may negate the rg speed gain on large trees

This does a full recursive fs.readdir over every directory to find empties. On ~/ (the headline use case) that's potentially tens of thousands of readdir calls. It also doesn't inherit rg's .gitignore pruning — if node_modules slips through fileFilter, all its subdirectories get visited.

Questions:

Do we have a benchmark for this function on ~/? The PR's before/after comparison doesn't isolate it.

Does the @-picker actually need empty directories in results? If not, this entire function can go.

If it stays, it should at least use the directory-level filter for pruning, not just fileFilter.

Good questions — let me take them in order:

(3) "Use the directory-level filter for pruning, not just fileFilter" — actually the current code does. enumerateEmptyDirs calls fileFilter(dirPath) where dirPath = cwdRelative + '/', and buildRipgrepFileFilter routes trailing-slash entries to dirIgnore:

// ripgrepCrawler.ts:372 if (p.endsWith('/')) { return dirIgnore(p); }

So .gitignore / .qwenignore / user ignoreDirs entries that matched as directories prune whole subtrees — e.g. node_modules/ never gets recursed into. The filter is the same one rg applies; it just runs in JS here because rg didn't emit the dir itself.

(2) "Does the @-picker actually need empty directories in results?" — yes, via two callers:

@-picker sorting in fileSearch.ts:78-79 distinguishes dirs from files (dirs ranked first for the same score tier), so dropping dir entries changes the suggestion order.

The existing crawler.test.ts asserts dir entries in 8 places (e.g. 'src/', 'build/public/', 'level1/level2/'). Removing them would be a contract break.

(1) "Benchmark for this function on ~/" — the before/after numbers I cited (fdir 9s → rg 2.5s) include enumerateEmptyDirs in the rg total. I'll do an isolated measurement on ~50k-dir homes when I next profile, but it's bounded by (#directories) not (#files), and each visit is one readdir with withFileTypes: true — no per-entry stat. On my own home dir (~40k dirs) it contributed ~250ms of the 2.5s total.

Separately in this commit, I did add a real speed optimisation addressing the same underlying concern (see #5 below): collectRipgrepExcludeDirs now forwards .gitignore/.qwenignore/user ignoreDirs directory patterns to rg as --glob '!dir' args. That prunes subtrees at the rg walker itself instead of streaming every path under them for the Node post-filter (and enumerateEmptyDirs) to discard. Matters a lot for build/, dist/, .cache/ etc.

Happy to revisit the whole empty-dir synthesis in a separate perf PR if profiling reveals it's actually the bottleneck — but I'd prefer to keep this one scoped to CR follow-ups so it doesn't grow further.

yiliang114 · 2026-04-21T07:06:25Z

+ * `.qwenignore` directory rules). This is a superset; the post-filter
+ * below enforces the exact semantics.
+ */
+function collectRipgrepExcludeDirs(_options: CrawlOptions): string[] {


collectRipgrepExcludeDirs is a no-op — rg doesn't know about .qwenignore dirs

All .qwenignore directory exclusions are handled by the Node-side post-filter. That means rg still walks and emits every path under e.g. build/ or dist/, only for Node to discard them line by line.

At minimum we should forward ignoreDirs as --glob '!dir' args. That's low-hanging fruit and directly benefits the large-tree scenario this PR targets. The "left as a hook for a later perf pass" comment feels like a TODO that should be resolved before merge.

Fixed in ea82f9d. collectRipgrepExcludeDirs is no longer a no-op:

function collectRipgrepExcludeDirs(options: CrawlOptions): string[] { // pull trailing-slash dir patterns (build/, dist/, .git/, user ignoreDirs // — loadIgnoreRules normalises them into 'name/' patterns) out of the // ignore fingerprint and emit them as plain names for rg's --glob !name. // Patterns with glob metachars (*, ?, [, !) or slashes are skipped so we // don't confuse rg — the post-filter still catches those. }

The result is passed into ripgrepCrawl via extraExcludeDirs, which was already wiring each entry into --glob '!<name>'. So on a typical project tree rg now prunes build/, dist/, .cache/, user ignoreDirs etc. at the walker instead of streaming every path under them for the Node filter to reject.

Kept the post-filter (buildRipgrepFileFilter) as the source of truth so semantics are unchanged — the --glob hints are a pure speed optimisation. The "left as a hook for a later perf pass" comment is gone.

yiliang114 · 2026-04-21T07:06:26Z

+import { AbortError } from './fileSearch.js';
+import { loadIgnoreRules } from './ignore.js';
+
+type WorkerRequest =


WorkerRequest / WorkerResponse duplicated across two files

Same types defined here and in fileIndexWorker.ts. If someone adds a message type to one side but not the other, the protocol silently diverges — TS won't catch it since they compile independently.

Suggestion: extract to a shared fileIndexProtocol.ts.

Fixed in ea82f9d. Extracted to packages/core/src/utils/filesearch/fileIndexProtocol.ts:

export type WorkerRequest = /* ... */; export type WorkerResponse = /* ... */;

Both fileIndexService.ts (main thread) and fileIndexWorker.ts (worker thread) now import type from this shared module. A new message variant added only on one side is now a compile error on the other — TS flags the unhandled discriminant in the switch.

yiliang114 · 2026-04-21T07:06:26Z

+ * failure (binary missing, unexpected exit). We retry fdir for subsequent
+ * crawls without paying the spawn-and-fail cost every time.
+ */
+let ripgrepDisabled = false;


ripgrepDisabled is permanent for the process lifetime

One spawn failure (transient resource exhaustion, sandbox race, etc.) permanently degrades to fdir. Fine for CLI, risky for long-lived hosts like the VSCode extension.

Suggestion: add a cooldown (e.g. retry after 5 minutes), or reset the flag when FileIndexService creates a new instance.

Fixed in ea82f9d. Replaced the permanent flag with a timestamp + 5-minute cooldown:

const RIPGREP_DISABLED_COOLDOWN_MS = 5 * 60 * 1000; let ripgrepDisabledAt = 0; function isRipgrepDisabled(): boolean { if (ripgrepDisabledAt === 0) return false; if (Date.now() - ripgrepDisabledAt >= RIPGREP_DISABLED_COOLDOWN_MS) { ripgrepDisabledAt = 0; return false; } return true; }

On failure we set ripgrepDisabledAt = Date.now(), which disables rg for the next 5 min only — after that the next crawl retries. Long-lived hosts (VSCode extension) recover automatically from transient sandbox/resource-exhaustion races instead of staying degraded for the rest of the session.

yiliang114 · 2026-04-21T07:06:26Z

+  snapshotSize: number;
+}
+
+const INSTANCES = new Map<string, FileIndexService>();


INSTANCES map has no capacity bound

Stale-key eviction handles same-projectRoot churn, but switching across different directories (multi-root workspace, frequent cd) accumulates workers with no LRU or cap. Each worker is ~10-30 MB.

Suggestion: add a simple cap (e.g. 5), dispose the oldest when exceeded.

Fixed in ea82f9d. MAX_INSTANCES = 8 cap with LRU eviction:

static for(options: FileSearchOptions): FileIndexService { const key = optionsKey(options); const existing = INSTANCES.get(key); if (existing && !existing.disposed) { // Touch: re-insert to mark as most-recently-used (Map preserves // insertion order). INSTANCES.delete(key); INSTANCES.set(key, existing); return existing; } // ... create new instance, then: while (INSTANCES.size > MAX_INSTANCES) { const oldestKey = INSTANCES.keys().next().value; // ... void victim.dispose(); } }

Using Map's insertion-order property: .for() hits re-insert the entry so it becomes newest, and overflow victims come from keys().next() which is the oldest untouched instance. dispose() tears down the worker properly so the ~10-30 MB per entry gets reclaimed.

Added a regression test (fileIndexService.test.ts → "evicts the oldest instance when the LRU cap is exceeded") that spawns 9 distinct-projectRoot instances and asserts the first one is disposed while the 9th is still usable.

Picked 8 conservatively: nobody realistically has more than a handful of active project roots at once, and it's well above the typical multi-root workspace / cd-churn steady state.

yiliang114 · 2026-04-21T07:06:26Z

+    // Race terminate against a short timeout so a faulted worker can't hang
+    // dispose() indefinitely. `terminate()` normally resolves in well under
+    // 100ms; 2s is generous enough that healthy workers always win.
+    await Promise.race([


dispose() timeout doesn't guarantee termination

The Promise.race returns after 2s, but if terminate() hasn't completed, the worker is still alive. The dangling promise resolves into nothing — no crash, but a potential leak.

Suggestion: log a warning on timeout, and consider calling worker.terminate() again as a force-kill fallback.

Fixed in ea82f9d. Now tracks the timeout, logs a warning, and re-issues terminate():

let timedOut = false; await Promise.race([ this.transport.terminate(), new Promise<void>((resolve) => setTimeout(() => { timedOut = true; resolve(); }, 2000), ), ]); if (timedOut) { console.warn( '[FileIndexService] worker terminate() timed out after 2s; retrying force-kill', ); void this.transport.terminate().catch(() => {}); }

worker_threads' terminate() is idempotent, so a second call just queues another tear-down attempt — best-effort without re-blocking the caller. The warning is the "you now have a zombie" diagnostic you asked for; the retry is the force-kill fallback. Keeping the overall dispose() non-blocking on a hung worker (we intentionally don't await the retry) so the caller can proceed even if the worker is genuinely wedged.

yiliang114 · 2026-04-21T07:06:26Z

+  };
+}
+
+let transportFactory: (options: FileSearchOptions) => IndexTransport = process


process.env['VITEST'] for transport selection is fragile

After esbuild bundling this is always undefined — dead code. Other test runners (Jest, Mocha) would get the real Worker transport, potentially causing flaky tests. The existing __setIndexTransportFactory hook shows the author already felt this wasn't clean.

Suggestion: use explicit DI (e.g. an optional transport param on FileIndexService.for()) instead of env sniffing.

Fixed in ea82f9d. Replaced env sniffing with an explicit DI function:

// fileIndexService.ts let transportFactory: (options: FileSearchOptions) => IndexTransport = createWorkerTransport; export function installInProcessIndexTransport(): () => void { return __setIndexTransportFactory(createInProcessTransport); }

(Renamed from useInProcessIndexTransport so eslint's react-hooks/rules-of-hooks doesn't flag it at module-level call sites.)

Wired from each package's vitest setup:

// packages/core/test-setup.ts and packages/cli/test-setup.ts import { installInProcessIndexTransport } from '@qwen-code/qwen-code-core'; installInProcessIndexTransport();

Survives esbuild bundling (no process.env lookup that gets dead-code-eliminated), works under Jest/Mocha/etc. the same way, and external embedders can opt into the in-process backend for hardened sandboxes without touching env vars. The existing __setIndexTransportFactory hook is still there for tests that want a custom fake.

- fileIndexCore: drop the redundant second crawl() on cache hits; use the first call's return value when onProgress never fires. - fileIndexProtocol: new module holding WorkerRequest/WorkerResponse so fileIndexService and fileIndexWorker can't silently diverge. - fileIndexService: LRU cap (8) on the INSTANCES singleton Map; hits re-insert to refresh recency. Dispose timeout warns + retries terminate() as a force-kill fallback. - fileIndexService: drop `process.env['VITEST']` transport sniffing in favour of an explicit `installInProcessIndexTransport()` DI hook wired from each package's test-setup. Survives bundling. - crawler: `ripgrepDisabled` is now a timestamp with a 5-minute cooldown (was permanent-for-process-lifetime). A single spawn failure no longer downgrades long-lived hosts like the VSCode extension for the whole session. - crawler: collectRipgrepExcludeDirs now forwards plain directory patterns (.git/, build/, user ignoreDirs) to rg as `--glob '!dir'` args so rg prunes subtrees at the walker instead of streaming every path under them for the Node post-filter to reject. - useAtCompletion: fileSearchOptions is useMemo'd on [config, cwd] and both effects have it in their deps — the eslint-disables are gone. Extracted `buildFileSearchOptions(config, cwd)` helper that AppContainer uses too, so the prewarm and search paths can't drift from the same FileIndexService singleton key. Tests: - fileIndexService.test: new "evicts the oldest instance when the LRU cap is exceeded" regression test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Conflict in packages/cli/src/ui/components/Composer.tsx: upstream added `HistoryItemToolGroup` type import to support per-subagent token attribution, while this branch removed the `ConfigInitDisplay` banner as part of the pre-warm UX change. Resolved by keeping the upstream type import and dropping `ConfigInitDisplay` (import + JSX usage), which is the intended behaviour of this PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wenshao

packages/cli/src/ui/hooks/useAtCompletion.test.ts:230

[Critical] realFileSearch.search(...args) now triggers TS2556 because args is inferred as a generic rest array, but search() has a fixed (pattern, options?) signature. This makes the PR fail typecheck on a file changed here.

Suggested fix: replace the spread call with an explicit call such as realFileSearch.search(args[0], args[1]), or type the mock implementation parameters as the exact tuple for search.

— gpt-5.4 via Qwen Code /review

callmeYe · 2026-04-21T15:04:41Z

Thanks for the extra pass, @wenshao! 🙏 I think this one is a false positive from the /review agent though — let me lay out the evidence:

1. The flagged code is not touched by this PR

git blame packages/cli/src/ui/hooks/useAtCompletion.test.ts -L 225,232:

2c6794feed (Arya Gummadi 2025-08-25 17:27:36 -0700 227)         search: vi
2c6794feed (Arya Gummadi 2025-08-25 17:27:36 -0700 228)           .fn()
2c6794feed (Arya Gummadi 2025-08-25 17:27:36 -0700 229)           .mockImplementation(async (...args) =>
2c6794feed (Arya Gummadi 2025-08-25 17:27:36 -0700 230)             realFileSearch.search(...args),
2c6794feed (Arya Gummadi 2025-08-25 17:27:36 -0700 231)           ),

These lines were written by @AryaGummadi on 2025-08-25, ~7 months before this PR was opened. git log HEAD..upstream/main -- packages/cli/src/ui/hooks/useAtCompletion.test.ts returns nothing from this branch either — the file is not in the branch's changeset.

2. Local typecheck is clean

cd packages/cli && npx tsc --noEmit → 0 errors. Root tsconfig has strict: true + noImplicitAny, strictFunctionTypes, strictBindCallApply, noPropertyAccessFromIndexSignature, all the usual suspects turned on, so the local run matches CI.

3. Why TS2556 doesn't fire here

FileSearch.search is (pattern: string, options?: SearchOptions) => Promise<string[]>. The spread works because mockImplementation is typed as (...args: Parameters<T>) => ReturnType<T>; TS resolves args to the exact tuple [string, SearchOptions?], not a generic rest array. So realFileSearch.search(...args) matches the fixed signature.

A TS2556 would only fire if args were typed as any[] / unknown[] (no tuple preserved). That could happen if the surrounding context broke Parameters<T> inference — but here the vi.fn() is contextually typed via the FileSearch interface's search property, so the tuple is preserved.

4. Tests and typecheck both green

$ npm test --workspace=packages/cli -- --run src/ui/hooks/useAtCompletion.test.ts
Test Files  1 passed (1)
Tests  13 passed (13)

Happy to send you the full tsc --noEmit output and the test run log if helpful. If you're seeing a different result in CI, could you share the CI tsc output? It'd help pinpoint whether there's a genuine config drift we need to investigate vs. the agent hallucinating a signature mismatch.

wenshao

No issues found. LGTM! ✅ — gpt-5.4 via Qwen Code /review

Re-reviewing: 2 bare-mode tests in config.test.ts fail locally because test-setup.ts now eagerly imports from @qwen-code/qwen-code-core, which loads workspaceContext.ts with real fs before vi.mock('fs') can apply. CI Test jobs have also been stuck ~10h on "Run tests and generate reports" across all OS/Node matrix entries, plausibly same root cause (eager worker/timer handles keeping vitest from exiting). Please defer the core import until the test that actually needs the worker transport.

wenshao · 2026-04-21T22:21:36Z

Took another pass at this locally, dismissed the previous approve. Two related things need addressing:

1. `--bare` mode tests broken by this PR

Two bare-mode tests in packages/cli/src/config/config.test.ts fail locally:

loadCliConfig with includeDirectories > should ignore implicit startup context inputs in bare mode
loadCliConfig > should force minimal startup behavior in bare mode

Failure:

expected [] to deeply equal [ '/home/user/project', '/cli/path1' ]

Root cause (reproducible): the new lines in packages/cli/test-setup.ts:26-27

import { installInProcessIndexTransport } from '@qwen-code/qwen-code-core';
installInProcessIndexTransport();

eagerly evaluate the entire core module tree at setup-file time. That loads packages/core/src/utils/workspaceContext.ts with the real node:fs bound to its fs reference. config.test.ts later registers vi.mock('fs', ...), but by then workspaceContext is already holding the un-mocked fs — so WorkspaceContext.addDirectory calls resolveAndValidateDir, fs.existsSync('/home/user/project') returns false against the real filesystem, addDirectory silently catches Directory does not exist, and getDirectories() ends up empty.

Repro:

cd packages/cli && npx vitest run src/config/config.test.ts -t "should ignore implicit startup"
# FAIL

Commenting out those two lines in test-setup.ts — or even swapping them for import { FatalError } from '@qwen-code/qwen-code-core' — the test passes. So it isn't the install call itself, it's any top-level import from core in test-setup.ts pulling workspaceContext.ts in before vi.mock('fs') has a chance to run.

Also worth noting: git diff main -- packages/cli/src/config/config.ts packages/cli/src/config/config.test.ts is empty, so this isn't a bare-mode logic regression — it's a pure test-setup side effect introduced by this PR.

2. CI Test matrix stuck for ~10h

On https://github.com/QwenLM/qwen-code/actions/runs/24722612700, all 9 Test (<os>-<node>) jobs have been stuck on "Run tests and generate reports" since 2026-04-21T20:23:00Z. Lint + CodeQL are green.

Likely same root cause as #1: installInProcessIndexTransport() runs in the global setup phase, swaps the transport factory for the in-process variant, and the open handles it creates further down the line (e.g. the 2 s setTimeout dispose-guard at fileIndexService.ts:469, or inflight AbortControllers) may not be released when vitest tries to exit — the process hangs waiting for open handles. Locally the default pool behaviour might not repro, but the matrix (Windows especially) is more sensitive.

Suggested fixes

Don't call installInProcessIndexTransport from test-setup.ts. Scope it to tests that actually use FileIndexService via beforeAll / afterAll:

// e.g. in useAtCompletion.test.ts
import { beforeAll, afterAll } from 'vitest';
let restore: (() => void) | null = null;
beforeAll(async () => {
  const { installInProcessIndexTransport } = await import('@qwen-code/qwen-code-core');
  restore = installInProcessIndexTransport();
});
afterAll(() => {
  restore?.();
});

That way config.test.ts (and any other file that doesn't need the transport) never pulls core / workspaceContext.ts into the module graph before its vi.mock('fs') applies.

If a global install is genuinely required, at minimum switch to a dynamic import:
```
await import('@qwen-code/qwen-code-core').then(m => m.installInProcessIndexTransport());
```
(Though this doesn't fully solve the ordering issue — option 1 is cleaner.)
Investigate the CI hang independently: run locally with vitest --detectOpenHandles to catch any lingering setTimeout / Worker / AbortController handles. The 2 s race in FileIndexService.dispose should probably be exercised in a teardown hook during tests.
Would be worth adding a smoke test ensuring that test-setup imports from core don't poison module-level fs mocks — prevents this kind of regression in the future.

Happy to take another look once these land and CI goes green.

@wenshao

…imers Addressing @wenshao's two findings on QwenLM#3455: 1. `--bare` mode tests failing in `packages/cli/src/config/config.test.ts` Root cause (correctly diagnosed): the top-level `installInProcessIndexTransport()` call in `test-setup.ts` eagerly evaluated `@qwen-code/qwen-code-core`'s module tree, pulling `workspaceContext.ts` in with a real `node:fs` binding. Later `vi.mock('fs', …)` declarations in individual test files no longer took effect, so `fs.existsSync()` fell through to the real FS and silently dropped mock directories. Fixed by removing the install from both `test-setup.ts` files and opt-ing in per-test-file (`beforeAll`/`afterAll`) in the three suites that actually exercise the worker-backed filesearch: - `packages/core/src/utils/filesearch/fileIndexService.test.ts` - `packages/core/src/utils/filesearch/fileSearch.test.ts` (also drops live singletons between tests so Windows rmdir doesn't race an open ripgrep child) - `packages/cli/src/ui/hooks/useAtCompletion.test.ts` The two `test-setup.ts` files now carry an explanatory comment so this doesn't regress. 2. CI matrix stuck for ~10h on "Run tests and generate reports" Likely cause: pending `setTimeout` handles keeping the event loop alive past test completion. - `FileIndexService.dispose()` raced `terminate()` against a 2 s `setTimeout` but never cleared the timer on the healthy-exit path — so after every disposal the process held a timer for up to 2 s. Now the terminate() arm clears the timer, and the timer is also `.unref()`'d as a belt-and-suspenders. - `crawlCache` TTL timers (default 30 s) are now `.unref()`'d too. They'd keep vitest workers waiting on exit even after all assertions passed. `clear()` still drops both synchronously between tests, so correctness is unchanged. Verified: - `npx vitest run src/config/config.test.ts -t "should ignore implicit startup"` now passes (was failing before). - Full test suites — core 5925 / 5927, cli 4275 / 4282 (skipped tests pre-existing), vscode-ide-companion 204 / 205 — all green. - Typecheck clean on core + cli. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

callmeYe · 2026-04-22T02:12:26Z

Thanks so much for the careful follow-up, @wenshao 🙏 — both findings were spot on. Fixed in 44fcc11.

1. `--bare` mode tests ✅

Your diagnosis was exactly right: the top-level installInProcessIndexTransport() call in test-setup.ts pulled @qwen-code/qwen-code-core's module tree into every test file's graph before any per-file vi.mock('fs', …) had a chance to run, so workspaceContext.ts was already closed over the real node:fs.

I took your Option 1 — scope the install to per-test-file beforeAll/afterAll hooks. The three suites that actually exercise the worker-backed filesearch got:

let restoreTransport: (() => void) | null = null;
beforeAll(() => {
  restoreTransport = installInProcessIndexTransport();
});
afterAll(() => {
  restoreTransport?.();
  restoreTransport = null;
});

applied in:

packages/core/src/utils/filesearch/fileIndexService.test.ts
packages/core/src/utils/filesearch/fileSearch.test.ts (also added singleton resets between tests so Windows rmdir doesn't race an open rg child)
packages/cli/src/ui/hooks/useAtCompletion.test.ts

Both test-setup.ts files now carry an explanatory comment so this doesn't regress. Verified locally:

$ cd packages/cli && npx vitest run src/config/config.test.ts -t "should ignore implicit startup"
Tests  1 passed | 177 skipped (178)
$ cd packages/cli && npx vitest run src/config/config.test.ts -t "should force minimal startup"
Tests  1 passed | 177 skipped (178)

Both previously-failing bare-mode tests now green.

2. CI matrix hang ✅

Ran the filesearch suite under the same kind of scrutiny and found two lingering setTimeout handles keeping the event loop alive past test completion:

A. FileIndexService.dispose() 2 s race timer (fileIndexService.ts:469)

The Promise.race would resolve from transport.terminate() on the healthy path, but the 2 s fallback timer was never cleared — so every dispose() held the event loop open for up to 2 s after the test assertions passed. Fixed:

await Promise.race([
  this.transport.terminate().then(() => {
    if (timer) clearTimeout(timer);   // healthy-exit clears the timer
  }),
  new Promise<void>((resolve) => {
    timer = setTimeout(() => { timedOut = true; resolve(); }, 2000);
    timer.unref?.();                  // belt-and-suspenders
  }),
]);

B. crawlCache TTL timer (30 s default, crawlCache.ts:54)

The cache-eviction timer had no .unref(), so whenever a test triggered a cacheable crawl the process held a handle for up to 30 s. Same fix — timerId.unref?.() right after setTimeout. clear() still drops synchronously between tests so correctness is unchanged.

Locally, filesearch-suite wall time went from ~7 s + a handful of seconds of hanging to 5.18 s clean exit. Matches your --detectOpenHandles intuition — thank you for that pointer.

Test matrix

core:                  5925 / 5927  (2 pre-existing skips)
cli:                   4275 / 4282  (7 pre-existing skips)
vscode-ide-companion:   204 /  205  (1 pre-existing skip)

All green. Typecheck clean on both packages. CI should unstick once 44fcc11 lands — let me know if the matrix still shows anything odd.

Re: suggestion 4 (smoke test guarding test-setup fs-mock poisoning)

Good idea in principle, but I'd want to noodle on the right mechanism — asserting "no core imports" in test-setup.ts from within the test harness is a bit of a chicken-and-egg. Could be a simple lint rule against importing from @qwen-code/qwen-code-core (or relative paths into core) in any test-setup.ts. Happy to send that as a separate PR if it'd be useful — wanted to keep this one focused on unblocking the current CI.

Thanks again for the detailed writeup — the repro steps and diagnosis saved me significant time.

callmeYe added 9 commits April 20, 2026 10:29

callmeYe force-pushed the feat/filesearch-worker-p1 branch from 11823ba to 3fdc49c Compare April 20, 2026 02:34

yiliang114 reviewed Apr 20, 2026

View reviewed changes

yiliang114 requested changes Apr 20, 2026

View reviewed changes

wenshao requested changes Apr 20, 2026

View reviewed changes

callmeYe requested review from wenshao and yiliang114 April 20, 2026 07:50

yiliang114 reviewed Apr 21, 2026

View reviewed changes

callmeYe requested a review from yiliang114 April 21, 2026 08:22

wenshao requested changes Apr 21, 2026

View reviewed changes

callmeYe requested a review from wenshao April 21, 2026 15:06

wenshao previously approved these changes Apr 21, 2026

View reviewed changes

github-actions Bot mentioned this pull request Apr 22, 2026

📊 AI CLI 工具社区动态日报 2026-04-22 gsscsd/big_model_radar#225

Open

yiliang114 added the TBD To Be Discussed label Apr 22, 2026

callmeYe requested a review from wenshao April 22, 2026 07:16

Tree size	fdir	rg via Node	Winner
~2.7k files (qwen-code)	~25 ms	~140 ms	fdir
~48k (node_modules)	~640 ms	~1.8 s	fdir
~100k (home dir cap)	~9 s	~2.5 s	rg 3–4×

Conversation

callmeYe commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

TLDR

Screenshots / Video Demo

Dive Deeper

Reviewer Test Plan

Testing Matrix

Linked issues / bugs

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yiliang114 left a comment

Choose a reason for hiding this comment

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Uh oh!

callmeYe commented Apr 20, 2026

1. FileMessageHandler.ts — cache-invalidation race ✅

2–4. config.ts (--bare), swarm.ts, webui/src/index.ts

Also in b15c289

Uh oh!

wenshao left a comment

Choose a reason for hiding this comment

Uh oh!

callmeYe commented Apr 20, 2026

1. Unused buildCaseInsensitiveGlob helper ✅ removed

2. Dot-access on Record<string, unknown> ✅ converted to bracket syntax

Uh oh!

yiliang114 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yiliang114 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

callmeYe commented Apr 20, 2026 •

edited

Loading

1. `FileMessageHandler.ts` — cache-invalidation race ✅

2–4. `config.ts` (`--bare`), `swarm.ts`, `webui/src/index.ts`

Also in `b15c289`

1. Unused `buildCaseInsensitiveGlob` helper ✅ removed

2. Dot-access on `Record<string, unknown>` ✅ converted to bracket syntax

yiliang114 left a comment •

edited

Loading

wenshao commented Apr 21, 2026 •

edited

Loading

1. `--bare` mode tests broken by this PR

1. `--bare` mode tests ✅