feat(webui): add daemon web renderers by chiga0 · Pull Request #1 · chiga0/qwen-code

chiga0 · 2026-05-20T03:11:09Z

Summary

What changed: removed the standalone packages/daemon-web app and moved the reusable daemon web renderers into @qwen-code/webui.
Why it changed: keep the daemon web work as a clean reusable client/rendering layer instead of introducing another host application in this repository.
Reviewer focus: package boundary, whether host-app concerns stay out of the shared layer, daemon-ui-core consumption, xterm ownership, and isolation from native TUI/channel/IDE flows.

Current Split

The PR now keeps three layers separate:

@qwen-code/sdk / daemon UI core: daemon HTTP/SSE communication, typed daemon events, UI-event normalization, transcript state, selectors, permission/action contracts.
@qwen-code/webui: reusable React renderers for daemon transcript state, currently DaemonWebChat and DaemonWebTerminal, plus fixture helpers for host apps and tests.
Host app: owns session list/sidebar, page layout, base URL/token/workspace inputs, model switcher UI, routing, deployment, and product styling.

A third-party host can embed the shared pieces like this:

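import {
  DaemonSessionProvider,
  DaemonWebChat,
  DaemonWebTerminal,
} from '@qwen-code/webui';

// `DaemonPanel` is an illustrative host component; `workspace` comes from the
// host app (assumed here to be a string path).
export function DaemonPanel({ workspace }: { workspace: string }) {
  return (
    <DaemonSessionProvider baseUrl="/daemon" workspaceCwd={workspace}>
      <DaemonWebChat />
      <DaemonWebTerminal />
    </DaemonSessionProvider>
  );
}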

Validation

Commands run:

npm run typecheck --workspace=@qwen-code/webui
npm run lint --workspace=@qwen-code/webui
cd packages/webui && npx vitest run src/daemon/transcriptAdapter.test.ts src/daemon/daemonWebRenderers.test.ts
npm run build --workspace=@qwen-code/webui
cd packages/cli && npx vitest run src/serve/httpAcpBridge.test.ts

Expected result:
- Webui daemon renderers compile and build as reusable package exports.
- Chat renderer maps transcript blocks into the existing shared chat viewer.
- Terminal renderer consumes the same transcript blocks and provides a semantic xterm surface.
- Fixture helpers cover user, thought, assistant, AskUserQuestion, permission, shell, and status events without requiring a daemon session.
- npm run dev -- serve can spawn the dev ACP child through tsx instead of failing on TypeScript-source imports.
Observed result:
- Webui typecheck, lint, focused tests, and build passed.
- The focused CLI bridge test run (src/serve/httpAcpBridge.test.ts) passed with 174 tests.

Scope / Risk

This does not mount a production /web route and does not add a separate daemon web app package.
This does not change native qwen TUI, --acp, channel, or IDE default behavior.
@xterm/xterm is intentionally owned by the React terminal renderer layer, not by sdk or daemon UI core. A future packaging pass can add a dedicated @qwen-code/webui/daemon subpath or separate renderer package if bundle/install-size pressure requires it.
The bridge runtime change only affects dev mode when a TypeScript CLI entry is spawned from npm run dev -- serve; built deployments continue using raw Node.
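
A minimal sketch of the dev-versus-built runtime choice described above. The PR text does not show the actual bridge code, so the helper name `resolveChildCommand`, the `ChildCommand` shape, and the extension check are all assumptions. Launching through `node --import tsx` is one documented way to run TypeScript sources with tsx, and may not be the exact mechanism used here.

```typescript
import { extname } from 'node:path';

/** Command line used to launch the ACP child (hypothetical shape). */
interface ChildCommand {
  command: string;
  args: string[];
}

/**
 * Hypothetical helper: decide how to launch the CLI entry.
 * - TypeScript source entry (dev mode): load it through tsx.
 * - Built JavaScript entry: run it with plain Node.
 */
export function resolveChildCommand(
  cliEntry: string,
  nodePath: string = process.execPath,
): ChildCommand {
  const ext = extname(cliEntry).toLowerCase();
  const isTypeScriptSource = ['.ts', '.tsx', '.mts', '.cts'].includes(ext);
  return isTypeScriptSource
    ? { command: nodePath, args: ['--import', 'tsx', cliEntry] }
    : { command: nodePath, args: [cliEntry] };
}
```

Keeping the decision in one pure function means the built-deployment path stays a plain `node <entry>` invocation and the dev-only branch can be unit-tested without spawning a process.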

Testing Matrix

	🍏	🪟	🐧
npm run	✅	⚠️	⚠️
npx	N/A	N/A	N/A
Docker	N/A	N/A	N/A
Podman	N/A	N/A	N/A
Seatbelt	N/A	N/A	N/A

Testing matrix notes:

Verified on macOS with package-level build/type/lint/unit checks.
Windows/Linux were not available in this local environment.

Linked Issues / Bugs

chiga0 · 2026-05-20T03:11:37Z

Generated by GPT-5.5 model

E2E validation report for the daemon web client POC:

Package checks passed:
- npm run typecheck --workspace=@qwen-code/daemon-web
- npm run lint --workspace=@qwen-code/daemon-web
- npm run test --workspace=@qwen-code/daemon-web — 4 tests passed
- npm run build --workspace=@qwen-code/daemon-web
- cd packages/cli && npx vitest run src/serve/httpAcpBridge.test.ts — 174 tests passed
Browser fixture validation passed with Chrome-channel Playwright:
- http://127.0.0.1:5180/?fixture=1 rendered user, thought, assistant, AskUserQuestion, permission, shell, and status blocks.
- Permission tray rendered and page-level scroll remained zero.
- No browser console errors or warnings.
Browser live daemon validation passed with Chrome-channel Playwright:
- npm run dev -- serve --port 4172 --hostname 127.0.0.1
- QWEN_DAEMON_PROXY_TARGET=http://127.0.0.1:4172 npm run dev --workspace=@qwen-code/daemon-web -- --port 5180
- Page connected through /daemon, composer became enabled, and page-level scroll remained zero.
- No browser console errors or warnings.

Notes:

This PR is stacked on the feat/daemon-ui-core branch, i.e. "feat(daemon): add shared UI transcript layer" (QwenLM/qwen-code#4328).
It intentionally does not change native TUI, channel, or IDE default behavior.

chiga0 · 2026-05-20T03:55:30Z

Generated by GPT-5.5 model

Updated the PR to match the cleaner package split we discussed:

Removed the standalone packages/daemon-web workspace package.
Moved reusable daemon browser rendering into @qwen-code/webui under packages/webui/src/daemon.
Kept host-app responsibilities out of the shared layer: session/sidebar routing, base URL/token/workspace controls, model switcher UI, page shell, and deployment are still expected to live in the embedding app.
Exported reusable pieces from webui: DaemonWebChat, DaemonWebTerminal, transcript-to-chat mapping helpers, and deterministic daemon web fixtures.
Preserved the dev-only ACP bridge fix so npm run dev -- serve can spawn TypeScript-source CLI children through tsx.

Validation after the split:

npm run typecheck --workspace=@qwen-code/webui
npm run lint --workspace=@qwen-code/webui
cd packages/webui && npx vitest run src/daemon/transcriptAdapter.test.ts src/daemon/daemonWebRenderers.test.ts
npm run build --workspace=@qwen-code/webui
cd packages/cli && npx vitest run src/serve/httpAcpBridge.test.ts

All passed locally. The PR is now renderer/package-boundary focused rather than a standalone web app POC.

… mechanical lift + BridgeFileSystem seam) (QwenLM#4319) * refactor(acp-bridge): lift defaultSpawnChannelFactory to acp-bridge/spawnChannel (QwenLM#4175 F1 step 1) First mechanical lift of QwenLM#4175 F1 (acp-bridge package self-sufficiency). Moves the production spawn factory + its `killChild` helper + `SCRUBBED_CHILD_ENV_KEYS` denylist + `KILL_HARD_DEADLINE_MS` constant from `cli/src/serve/httpAcpBridge.ts` (~283 lines) to `@qwen-code/acp-bridge/spawnChannel`. This unblocks `channels/base/AcpBridge.ts` and `vscode-ide-companion`'s acpConnection from each reimplementing the child lifecycle — they can now consume the same primitive. Backward compatible: `cli/src/serve/httpAcpBridge.ts` imports the lifted factory and re-exports it, so existing references in `cli/src/serve/index.ts:90` and the factory's own internal usage (`opts.channelFactory ?? defaultSpawnChannelFactory`) keep resolving. Bridge tests that mock `defaultSpawnChannelFactory` via `BridgeOptions.channelFactory` are unaffected. Side cleanups: drops `spawn` / `ChildProcess` / `Readable` / `Writable` / `ndJsonStream` / `MissingCliEntryError` imports from httpAcpBridge.ts (all only used by the lifted spawn factory). - 44/44 acp-bridge tests pass - 174/174 cli httpAcpBridge tests pass - typecheck clean across acp-bridge + cli 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * refactor(acp-bridge): lift BridgeClient + permission types to acp-bridge/bridgeClient (QwenLM#4175 F1 step 2) Second mechanical lift of QwenLM#4175 F1 (acp-bridge package self-sufficiency). Moves `BridgeClient` class (~700 LOC) + `PendingPermission` interface + `PermissionResolutionRecord` interface + `MAX_RESOLVED_PERMISSION_RECORDS` constant + early-event capacity constants + `describeStatKind` and `sliceLineRange` helpers from `cli/src/serve/httpAcpBridge.ts` to `@qwen-code/acp-bridge/bridgeClient`. 
Design choice for SessionEntry boundary: introduce a minimal `BridgeClientSessionEntry` interface in bridgeClient.ts with only the four fields BridgeClient actually reads from the factory's richer `SessionEntry` (`sessionId`, `events`, `pendingPermissionIds`, `activePromptOriginatorClientId`). The factory's `SessionEntry` structurally satisfies it — TypeScript's structural typing enforces the match at the `resolveEntry` callback signature, so no explicit conversion is required and the bridge package stays free of daemon-host session-bookkeeping types. Cross-package writeStderrLine handling: inline the 3-line helper in bridgeClient.ts (mirrors the spawnChannel.ts pattern from F1 step 1) so acp-bridge has no reverse dependency on `cli/src/utils/stdioHelpers`. httpAcpBridge.ts shrinks from 4406 LOC to 3647 LOC (-759 lines). Removed ACP SDK imports that only BridgeClient consumed: `Client`, `RequestPermissionRequest`, `WriteTextFileRequest`, `WriteTextFileResponse`, `ReadTextFileRequest`, `ReadTextFileResponse`, `SessionNotification`. Kept the ones the factory still uses (`CancelNotification`, `PromptRequest`, `RequestPermissionResponse`, `SetSessionModelRequest`, `SetSessionModelResponse`). Backward compatible: httpAcpBridge.ts re-exports `BridgeClient`, `BridgeClientSessionEntry`, `PendingPermission`, `PermissionResolutionRecord`, and `MAX_RESOLVED_PERMISSION_RECORDS` so the `ChannelInfo.client: BridgeClient` field declaration below + any embedder reaching into these types keep resolving. - 44/44 acp-bridge tests pass - 174/174 cli httpAcpBridge tests pass - 229/229 cli server tests pass - typecheck clean across acp-bridge + cli 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * refactor(acp-bridge): lift createHttpAcpBridge factory to acp-bridge/bridge (QwenLM#4175 F1 step 3) Third + final mechanical lift of QwenLM#4175 F1 (acp-bridge package self-sufficiency). 
Moves the `createHttpAcpBridge` factory closure (~3000 LOC) + `ChannelInfo` + `SessionEntry` interfaces + factory-only helpers (`canonicalizeExistingAncestor`, `verifyParentWithinWorkspace`, `withTimeout`, `isServeDebugLoggingEnabled`, `writeServeDebugLine`, `hasControlCharacter`) + factory constants (`DEFAULT_INIT_TIMEOUT_MS`, `MCP_RESTART_TIMEOUT_MS`, `DEFAULT_MAX_SESSIONS`, `MAX_EVENT_RING_SIZE`, `DEFAULT_PERMISSION_TIMEOUT_MS`, `DEFAULT_MAX_PENDING_PER_SESSION`, `MAX_DISPLAY_NAME_LENGTH`) from `cli/src/serve/httpAcpBridge.ts` to `@qwen-code/acp-bridge/bridge`. `cli/src/serve/httpAcpBridge.ts` shrinks from 3647 LOC to 97 LOC — a pure re-export shim that preserves every existing relative import path (`./httpAcpBridge.js`) so `server.ts`, `runQwenServe.ts`, `workspaceAgents.ts`, `workspaceMemory.ts`, `index.ts`, plus the bridge test suite, keep resolving without any call-site changes. The new `bridge.ts` reuses what was already in acp-bridge (errors, types, options, status helpers, channel types, event bus, workspace paths) via local relative imports — no reverse dependency on `cli`. `writeStderrLine` is inlined at the top of `bridge.ts` (same pattern as `spawnChannel.ts` + `bridgeClient.ts` from F1 steps 1-2) so the package self-contained promise holds. Cumulative F1 impact across the 3 mechanical lift steps: - httpAcpBridge.ts: 4682 LOC → 97 LOC (-4585 lines; the original file was 98% bridge core, 2% backward-compat re-exports) - 3 new files in acp-bridge: spawnChannel.ts (~270 LOC), bridgeClient.ts (~745 LOC), bridge.ts (~3515 LOC) - All daemon-host concerns (env snapshot, daemon preflight cells) remain in `cli/src/serve/daemonStatusProvider.ts` and reach the bridge through the `BridgeOptions.statusProvider` seam frozen by PR 22b/2. 
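
The `BridgeClientSessionEntry` boundary from step 2 relies on TypeScript's structural typing. The sketch below illustrates that idea only: the four field names come from the commit message, while their types, the extra `SessionEntry` fields, and the `ResolveEntry` alias are assumptions.

```typescript
/** Narrow view with the four fields the commit says BridgeClient reads. */
interface BridgeClientSessionEntry {
  sessionId: string;
  events: unknown[]; // assumed type
  pendingPermissionIds: Set<string>; // assumed type
  activePromptOriginatorClientId: string | undefined; // assumed type
}

/** Richer host-side entry; the two extra fields are hypothetical. */
interface SessionEntry {
  sessionId: string;
  events: unknown[];
  pendingPermissionIds: Set<string>;
  activePromptOriginatorClientId: string | undefined;
  workspaceCwd: string;
  createdAt: number;
}

/** Callback signature the narrow side depends on (hypothetical alias). */
type ResolveEntry = (sessionId: string) => BridgeClientSessionEntry | undefined;

const sessions = new Map<string, SessionEntry>();
sessions.set('s1', {
  sessionId: 's1',
  events: [],
  pendingPermissionIds: new Set(),
  activePromptOriginatorClientId: undefined,
  workspaceCwd: '/workspace',
  createdAt: 0,
});

// No `extends` and no conversion: SessionEntry is accepted because it has
// every member the narrow interface requires.
const resolveEntry: ResolveEntry = (id) => sessions.get(id);
```

If a field were renamed on only one side, the `resolveEntry` assignment would stop compiling, which is the enforcement point the commit message refers to.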
- 735/735 cli serve tests pass across 17 files - 174/174 cli httpAcpBridge tests pass - 44/44 acp-bridge tests pass - typecheck clean across acp-bridge + cli `packages/cli/src/serve/httpAcpBridge.test.ts` (~6600 LOC) is intentionally NOT moved in this commit — it currently imports `createHttpAcpBridge` / `defaultSpawnChannelFactory` / `BridgeClient` via the cli shim and keeps passing without changes. Moving it to `acp-bridge/src/bridge.test.ts` is a follow-up worth tracking separately so the production-code lift can land + be reviewed cleanly. The `BridgeFileSystem` injection seam (originally bundled into F1 as the 22b' scope) is also deferred to a follow-up so the mechanical lift stays mechanical — design + implementation of the fs injection is its own discussion. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * feat(acp-bridge): add BridgeFileSystem injection seam (QwenLM#4175 F1 step 5, 22b' scope) Adds the `BridgeFileSystem` injection seam originally scoped as QwenLM#4175 22b'. When a `BridgeFileSystem` is wired through `BridgeOptions.fileSystem`, `BridgeClient.readTextFile` and `BridgeClient.writeTextFile` delegate to it instead of running their inline `fs.realpath` / `fs.writeFile` / `fs.readFile` proxy. This unblocks production `qwen serve` plumbing PR 18's `WorkspaceFileSystem` (TOCTOU guards, symlink-substitution checks, trust gate, `.gitignore`, audit hooks) into the ACP fs methods — closing the `ws.ts:613` follow-up thread that has been tracked since PR 18 landed. The serve-side adapter that wraps `WorkspaceFileSystem` + the `runQwenServe` wiring are intentionally split into the immediate-follow-up so this PR stays focused on the seam design. Backward compatible: `fileSystem` is optional on `BridgeOptions`. 
Tests, Mode A in-process consumers, channels (`packages/channels/base/ AcpBridge.ts`), and the VSCode IDE companion all keep working unchanged — they omit the field and `BridgeClient` falls through to the inline proxy that has been the Stage 1 default since QwenLM#3889. API: - `BridgeFileSystem.readText(params: ReadTextFileRequest): Promise<ReadTextFileResponse>` - `BridgeFileSystem.writeText(params: WriteTextFileRequest): Promise<WriteTextFileResponse>` The interface mirrors ACP SDK request/response types directly so the adapter does the minimum amount of translation (`{ path, content }` ↔ `WorkspaceFileSystem`'s `ResolvedPath` brand types + options bag). - 735/735 cli serve tests pass (inline fallback path preserved) - 44/44 acp-bridge tests pass - typecheck + eslint clean 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(acp-bridge): catch README + stale source comments up to F1 lift Self-review fold-in: post-F1 the package README still said "PR 22a" and listed `BridgeClient` / `createHttpAcpBridge` / `defaultSpawnChannelFactory` under "What's not here yet" — both contradicted by this PR. Updated: - README lift-history table now shows PR 22a / 22b/1 / 22b/2 as merged and F1 (this PR) as the slice that closes the bridge core + adds `BridgeFileSystem`. F3 PR 24 row aligned to the feature-cohesive plan. - "What's here today" now documents `spawnChannel`, `bridgeClient`, `bridge`, `bridgeFileSystem` modules. - "What's not here yet" section removed (its 2 bullets are both resolved by F1). - Subpath import list updated to enumerate all 14 subpaths. - Backward-compat section updated to call out the 97-line shim and the 6 consuming files that still import via `./httpAcpBridge.js`. Source-comment line-number drift: - `channel.ts:12` no longer claims `defaultSpawnChannelFactory` is "still in cli/src/serve/httpAcpBridge.ts" — points to the lifted location. 
- `permission.ts:33` + `permission.ts:45` no longer reference `httpAcpBridge.ts:1096-1106` / `httpAcpBridge.ts:1003` (file is now 97 lines after F1). Updated to point at the structurally- equivalent locations inside the lifted `bridgeClient.ts`. - `permission.ts:7` no longer says first-responder still lives in `cli/src/serve/httpAcpBridge.ts` — points at the bridgeClient.ts location. 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(acp-bridge): adopt 3 Copilot review comments on F1 doc accuracy Folds in 3 of 4 Copilot inline comments from QwenLM#4319 review: 1. `bridgeClient.ts` writeTextFile preserveMode comment said "fall through to umask defaults" for new files, but the code passes `mode: preserveMode?.mode ?? 0o600` to `fs.writeFile`. Updated the "BkwQW" comment + the inner catch-block comment to clarify that new files actually get the `0o600` default applied at writeFile time (NOT umask defaults — the explicit `mode` arg bypasses umask for atomicity per the `Blehd` comment block). 2. `bridgeFileSystem.ts` JSDoc referenced `cli/src/serve/bridgeFileSystemAdapter.ts` as if the file exists, but it's deferred to the immediate F1 follow-up PR. Reworded as "the immediate follow-up PR will land a serve-side adapter" so reviewers don't grep for a non-existent file. 3. `bridgeOptions.ts` `fileSystem` field JSDoc had the same wording issue ("Production `qwen serve` wires this to..."). Same fix — now says "The immediate F1 follow-up will land a serve-side adapter" so the deferred state is obvious. Declined from this review round: - Copilot inline #1 (`spawnChannel.ts:155` stderr forwarder drops empty lines): pre-existing behavior since QwenLM#3889. F1 lifted verbatim — not a regression introduced here. Out of scope for a lift PR. - github-actions bot summary: most items are pre-existing notes (TOCTOU residual race, SCRUBBED_CHILD_ENV_KEYS allowlist concern, sliceLineRange benchmark threshold) on code the F1 lift moved verbatim. 
One ("httpAcpBridge.ts still has ~3700 LOC") is a false positive — the file is 97 LOC after F1. Others are cosmetic refactors (extract FIXME to tracking issue, ARCHITECTURE_DECISIONS doc system, deprecation timeline) that aren't worth churning the lift PR over. - 44/44 acp-bridge tests pass - typecheck clean 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(acp-bridge): tighten BridgeFileSystem contract + re-export type from shim Self-review + code-reviewer agent fold-in, two changes: 1. `cli/src/serve/httpAcpBridge.ts` shim now re-exports `BridgeFileSystem` from `@qwen-code/acp-bridge/bridgeFileSystem` so the immediate F1 follow-up adapter (in `cli/src/serve/`) can import it via the established `./httpAcpBridge.js` path like every other daemon-side bridge import does. Without this the adapter would need to deep-import from acp-bridge while every other serve file goes through the shim — inconsistent. 2. `BridgeFileSystem.readText` + `writeText` JSDoc now spells out the two defensive gates the inline proxy carried (non-regular- file rejection + 100 MiB buffered-size cap for reads; write-then-rename atomicity + dangling-symlink walk-through + mode preservation + `0o600` new-file default for writes). When a `BridgeFileSystem` is injected, the inline path is FULLY bypassed — without the contract spelled out, a future adapter author could silently drop the `/dev/zero` / 500 MB log RSS defenses the inline path established. Note on F1 CI: this PR targets `daemon_mode_b_main` but the `.github/workflows/ci.yml` `pull_request` trigger is scoped to `branches: main / release/**`, so the main CI workflow (Lint / Test on Linux/macOS/Windows / CodeQL) does NOT run on this PR. This is a by-design side effect of the new feature-cohesive branching strategy — `daemon_mode_b_main → main` periodic merges will trigger the full CI matrix, providing safety net coverage before any F-series work lands on `main`. 
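
The `BridgeFileSystem` seam discussed in the commits above can be sketched as a delegate-or-fall-back pattern. This is an illustration under stated assumptions, not the package's code: the request/response types are simplified stand-ins for the ACP SDK types, `SketchBridgeClient` is a hypothetical class, and the inline fallback omits the defenses the commit describes (non-regular-file rejection, size cap, write-then-rename, mode preservation).

```typescript
import { promises as fs } from 'node:fs';

// Simplified stand-ins for the ACP SDK request/response types (assumed shapes).
interface ReadTextFileRequest { path: string }
interface ReadTextFileResponse { content: string }
interface WriteTextFileRequest { path: string; content: string }
type WriteTextFileResponse = Record<string, never>;

/** Method names and parameter roles follow the API listed in the commit. */
interface BridgeFileSystem {
  readText(params: ReadTextFileRequest): Promise<ReadTextFileResponse>;
  writeText(params: WriteTextFileRequest): Promise<WriteTextFileResponse>;
}

/** Hypothetical client showing only the delegation branch. */
class SketchBridgeClient {
  private readonly fileSystem?: BridgeFileSystem;

  constructor(fileSystem?: BridgeFileSystem) {
    this.fileSystem = fileSystem;
  }

  async readTextFile(params: ReadTextFileRequest): Promise<ReadTextFileResponse> {
    // Injected seam wins; the inline path is bypassed entirely.
    if (this.fileSystem) return this.fileSystem.readText(params);
    return { content: await fs.readFile(params.path, 'utf8') };
  }

  async writeTextFile(params: WriteTextFileRequest): Promise<WriteTextFileResponse> {
    if (this.fileSystem) return this.fileSystem.writeText(params);
    // Simplified fallback: explicit 0o600 mode for newly created files.
    await fs.writeFile(params.path, params.content, { mode: 0o600 });
    return {};
  }
}

// In-memory fake, in the spirit of the delegation tests described later.
const store = new Map<string, string>();
const fakeFs: BridgeFileSystem = {
  async readText({ path }) {
    const content = store.get(path);
    if (content === undefined) throw new Error(`not found: ${path}`);
    return { content };
  },
  async writeText({ path, content }) {
    store.set(path, content);
    return {};
  },
};
```

Because the injected implementation fully replaces the inline path, any adapter has to carry its own equivalents of the inline defenses, which is why the commit spells the contract out in JSDoc.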
Locally verified: - 174/174 cli httpAcpBridge tests pass - 44/44 acp-bridge tests pass - 735/735 cli serve tests pass - typecheck clean across acp-bridge + cli 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * test(acp-bridge): cover BridgeFileSystem injection seam + extract shared writeStderrLine (QwenLM#4319 wenshao review) Folds in wenshao review on QwenLM#4319: 1. **[Critical]** zero test coverage for the F1 step 5 `BridgeFileSystem` delegation branches in `BridgeClient.writeTextFile` / `BridgeClient.readTextFile` and the factory's `opts.fileSystem` → constructor positional-arg forwarding. New `packages/acp-bridge/src/bridgeClient.test.ts` adds 6 tests covering: - writeTextFile delegates to injected fileSystem.writeText (inline proxy fully bypassed; `fakeFs.writeText` called with the original params; `readText` mock not invoked) - writeTextFile invalid-path call succeeds purely via the mock when fileSystem is injected (proof that the inline `fs.realpath` path doesn't run) - readTextFile delegates to injected fileSystem.readText - readTextFile propagates injection errors to the caller - inline-fallback regression guard: write actually hits disk via the inline proxy when fileSystem is omitted (real tmp file round-trip) - same for read Why these matter: the 7-arg `BridgeClient` constructor places `fileSystem` at the tail as optional. A reordering — or dropping the arg from `bridge.ts` factory's `new BridgeClient(..., opts.fileSystem)` call — would silently bypass the adapter in production and the inline `fs.writeFile` raw-path would run with no audit / trust / TOCTOU coverage. The delegation tests would catch that because the mock fileSystem would never be invoked. 2. **[Suggestion]** `writeStderrLine` was defined identically in `bridge.ts:117` and `bridgeClient.ts:30` (22 call sites across the two files). 
Both consumers live in the SAME `@qwen-code/acp-bridge` package, so the original "no reverse-dep on cli" justification doesn't apply within the package. Extracted to `packages/acp-bridge/src/internal/stderrLine.ts` — a single source of truth that future behavior changes (timestamp prefix, log level, structured field) can edit once. `internal/` subpath is intentionally not in `package.json`'s `exports`, keeping the helper package-private. `spawnChannel.ts` deliberately does NOT consume it (its stderr writes use `process.stderr.write(prefix + line + '\n')` directly because each line carries its own `[serve pid=… cwd=…]` line prefix). - 6/6 new BridgeFileSystem-seam tests pass - 50/50 acp-bridge total (44 existing + 6 new) - 174/174 cli httpAcpBridge tests pass (no regression from refactor) - typecheck + eslint clean 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * test(acp-bridge): cover defaultSpawnChannelFactory env scrubbing + fix bridge.ts comment refs (QwenLM#4319 wenshao round 2) Folds in wenshao review on QwenLM#4319 round 2 — 1 Critical + 2 Suggestions: 1. **[Critical] spawnChannel.ts has 0 unit tests, security-critical paths untested.** Now that `defaultSpawnChannelFactory` is a public export of `@qwen-code/acp-bridge`, channels + IDE consumers can't rely on cli-package integration tests for env-scrubbing guarantees. Refactored the inline env-scrubbing logic into a pure exported helper `scrubChildEnv(source, scrubbed, overrides)`. 
Behavior is byte-identical to the pre-extraction inline implementation; the factory body now reads: const childEnv = scrubChildEnv( process.env, SCRUBBED_CHILD_ENV_KEYS, childEnvOverrides); Added `packages/acp-bridge/src/spawnChannel.test.ts` with 12 tests covering: - shallow-clone (no aliasing into live process.env) - QWEN_SERVER_TOKEN stripping - non-scrubbed vars pass through - override-add a new key - override-replace an existing key - override with undefined deletes the key (PR 14 fix QwenLM#4247 wenshao R5) - override CANNOT re-introduce a scrubbed key (defense in depth) - override CANNOT undo the scrub by setting undefined for a scrubbed key - override-apply-after-scrub ordering invariant - empty overrides equals no overrides - multi-key scrub for forward-compat (the WARNING comment on SCRUBBED_CHILD_ENV_KEYS anticipates a future sandboxed-agent mode expanding the denylist; this verifies the loop already handles that) The killChild SIGTERM→SIGKILL escalation + STDERR_LINE_CAP_CHARS truncation are NOT covered yet — they require either real child processes or extensive node:child_process mocking; both are orthogonal to the env-scrubbing security guarantees wenshao explicitly called out, and can land as a follow-up if anyone wants the full surface tested. 2. **[Suggestion] bridge.ts comments referenced a "consolidated re- export block earlier in this file" that doesn't exist in acp-bridge (only in the cli shim).** Fixed both occurrences (~line 292, ~line 310) to point at the actual local import + the package barrel re-export. 3. **[Suggestion] bridge.ts canonicalizeWorkspace re-export comment referenced `./fs/paths.ts`.** Updated to mention the full lift chain: extracted to `cli/src/serve/fs/paths.ts` in PR 18, then lifted here to `./workspacePaths.ts` in PR 22b/1. 
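
The env-scrubbing rules listed in item 1 can be sketched as a small pure function. This is a reconstruction from the test descriptions only, so the body, the parameter types, and the name `scrubChildEnvSketch` are assumptions; the production `scrubChildEnv` may differ in detail.

```typescript
type Env = Record<string, string | undefined>;

/**
 * Sketch of the described behavior:
 * 1. shallow-clone the source so the caller's object is never mutated,
 * 2. remove every scrubbed key,
 * 3. apply overrides afterwards, where `undefined` deletes a key and
 *    scrubbed keys are skipped so they cannot be re-introduced.
 */
export function scrubChildEnvSketch(
  source: Env,
  scrubbed: readonly string[],
  overrides: Env = {},
): Env {
  const env: Env = { ...source };
  for (const key of scrubbed) {
    delete env[key];
  }
  for (const [key, value] of Object.entries(overrides)) {
    if (scrubbed.includes(key)) continue; // defense in depth
    if (value === undefined) {
      delete env[key];
    } else {
      env[key] = value;
    }
  }
  return env;
}

const parentEnv: Env = { PATH: '/usr/bin', QWEN_SERVER_TOKEN: 'secret', DEBUG: '1' };
const childEnv = scrubChildEnvSketch(
  parentEnv,
  ['QWEN_SERVER_TOKEN'],
  { DEBUG: undefined, QWEN_SERVER_TOKEN: 'leak', EXTRA: 'x' },
);
// childEnv keeps PATH, gains EXTRA, and contains neither DEBUG nor QWEN_SERVER_TOKEN.
```

Passing the denylist as a parameter instead of importing a constant keeps the function testable with hand-rolled key sets, matching the parameterized-test approach the commit describes.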
- 12/12 new spawn env-scrub tests pass - 62/62 acp-bridge total (50 existing + 12 new spawn) - 174/174 cli httpAcpBridge tests still pass (the factory's inline env-scrubbing refactor preserves byte-identical behavior) - typecheck + eslint clean 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(acp-bridge): fix 14-arg→7-arg typo in test docstring + simplify canonicalizeWorkspace re-export doc (QwenLM#4319 wenshao round 3) Folds in 2 of 3 wenshao Suggestions from QwenLM#4319 round 3: 1. `bridgeClient.test.ts:20` JSDoc said "the 14-arg constructor's positional slot" — typo I introduced when writing the test in `fbc92bccf`. The same docstring correctly says "the constructor takes 7 positional args" at line 25. Updated to "7-arg". 2. `bridge.ts:3461` `canonicalizeWorkspace` re-export JSDoc no longer references the historical `cli/src/serve/fs/paths.ts` location. Reads cleaner as a present-tense pointer to `./workspacePaths.ts` (where the implementation actually lives now post-PR 22b/1). Git history covers the lift chain; the docstring should describe current state. DECLINED + tracked separately: - **[Critical]** `closeSession` + `killSession` use module-scoped `channelInfo` instead of `channelInfoForEntry(entry)` — channel- overlap edge case can kill the wrong channel. Wenshao explicitly notes "pre-existing bug preserved by the lift" — F1's mechanical- lift scope shouldn't carry behavior fixes, and the fix needs a channel-overlap regression test to land safely. Tracked as QwenLM#4325. - 62/62 acp-bridge tests pass (no regression from doc tweaks) - typecheck + eslint clean 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * docs(acp-bridge): polish from second-pass self-review (cross-platform test + package metadata + dead tombstones) Five small adoptions from a second-pass code-reviewer agent review on F1 (no new external comments — pre-emptive cleanup before reviewer returns): 1. 
**`bridge.ts:290-313`** — deleted two standalone "InvalidPermission OptionError / WorkspaceInit* / McpServer* lifted to bridgeErrors" tombstone comments. Pre-22b they were load-bearing (explained why the class wasn't `class`-defined inline at that file location). Post-F1 the symbols are imported at the top of the file and the comments sit between unrelated code (`writeServeDebugLine` / `MAX_DISPLAY_NAME_LENGTH` / `DEFAULT_INIT_TIMEOUT_MS`) with no anchor. Dead doc — removed. 2. **`README.md`** — `spawnChannel` entry now lists `scrubChildEnv` alongside `defaultSpawnChannelFactory` + `killChild` + `SCRUBBED_CHILD_ENV_KEYS`. Channels / VSCode IDE consume the package barrel so the helper should be visible in the inventory. 3. **`package.json:description`** — refreshed from the PR 22a wording ("EventBus, AcpChannel, in-memory channel, PermissionMediator interface") to include F1 additions (`createHttpAcpBridge` / `BridgeClient` / `defaultSpawnChannelFactory` / `BridgeFileSystem`). Visible on `npm view`-style tooling + IDE hover so worth keeping current. 4. **`bridgeClient.test.ts:92-115`** — swapped `/proc/no-such-file` for `/this/dir/never/exists/file.txt` and reworded the comment. `/proc/` is Linux-only; on macOS / Windows the inline proxy's dangling-symlink fallback would write through to a path under root rather than failing. Test passed regardless (mock assertion, not real disk) but the comment overstated portability. 5. **`spawnChannel.test.ts:36`** — added a comment block explaining why the test deliberately hand-rolls the SCRUBBED set instead of importing the production `SCRUBBED_CHILD_ENV_KEYS`. The decoupling is intentional (pure-function parameterized test + forward-guard for future denylist expansion) but a naive reader would think it's an oversight. 
- 62/62 acp-bridge tests pass - 174/174 cli httpAcpBridge.test.ts pass - typecheck + eslint + pre-commit hooks clean 🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code) * fix(acp-bridge): bridge.ts security fold-in from QwenLM#4297 review (3 issues) Folds 3 unresolved review comments from the post-merge thread on QwenLM#4297 (wenshao via qwen-latest agent) into F1 (QwenLM#4319). All 3 touch `acp-bridge/src/bridge.ts` — the same file F1 already moves the lifted factory into — so consolidating here saves opening a separate follow-up PR and keeps the security narrative in one reviewable commit. The 2 cross-package fixes (`core/src/memory/const.ts` test gap + `cli/src/serve/runQwenServe.ts` malformed-context fallback) will land as their own small PRs after F1 merges. #### Fix 1 (wenshao Critical, QwenLM#4297 thread): `fs.unlink(target)` arbitrary-file-deletion primitive in `verifyParentWithinWorkspace` 'create'-cleanup After `fs.open(target, 'wx')` creates the empty file at the real parent, an attacker with local workspace write access can swap the parent directory for a symlink (`docs/` → `/etc`). The cleanup's `fs.unlink(target)` re-resolves the TEXTUAL path through the attacker's freshly-planted parent symlink, deleting whatever file exists at the external location. Fix: drop the `fs.unlink(target)` line. The 0-byte file at the pre-race location is harmless (0 bytes, inside the workspace we'd already verified) — leaving it over deleting an arbitrary external file is the right safety trade. Comment block explains the reasoning so future maintainers don't re-introduce the unlink. #### Fix 2 (wenshao Critical): `O_TRUNC` arbitrary-file-truncation primitive in workspace-init 'overwrite' branch `O_TRUNC` causes the kernel to truncate the file to zero bytes AT `open(2)` SYSCALL TIME — strictly before `verifyParentWithinWorkspace` runs. 
A parent-symlink TOCTOU race between `canonicalizeExistingAncestor` and this `open()` zeros the file at the attacker-redirected location (arbitrary-file-truncation primitive against any file the daemon UID can open). The pre-fix code's own comment on `verifyParentWithinWorkspace` acknowledged this as "Acceptable residual posture for the Stage-1 trust model"; wenshao pushed back that arbitrary-file-zeroing exceeds the Stage-1 trust budget.

Fix: drop `O_TRUNC` from the open flags. Truncation moves to AFTER `verifyParentWithinWorkspace` succeeds, via `fh.truncate(0)` on the fd we already hold. fd-based truncate does NOT re-resolve the path, so an attacker swapping the parent symlink after we open can't redirect the truncation.

#### Fix 3 (wenshao Suggestion): `canonicalizeExistingAncestor` missing `ELOOP` catch

Circular symlinks in the parent path (`a -> b`, `b -> a`) cause `fs.realpath` to fail with `ELOOP`. Without catching it, the error propagates as an unstructured HTTP 500 instead of the typed `WorkspaceInitSymlinkError` (HTTP 400) the route handler expects from the workspace-init race-detection family.

Fix: add `'ELOOP'` to the caught error codes alongside `'ENOENT'` and `'ENOTDIR'`. Walking up the parent chain when ELOOP hits at a sub-component preserves the existing "walk to the deepest extant ancestor" contract: the deepest realpath-able ancestor still dictates the canonical prefix.

#### Why no new tests in this commit

- Fix 1 is a single-line removal: any regression that re-adds the unlink would be caught by reviewing the diff; existing 174-test `httpAcpBridge.test.ts` integration suite confirms the create-path still works (file is created + closed correctly; only the attacker-cleanup branch changes).
- Fix 2 is a structural move (truncate from open-time to post-verify); the existing overwrite-init integration tests confirm the end-to-end behavior is unchanged (file ends up empty after init). Adding a TOCTOU race regression test requires controlled filesystem-race simulation that exceeds reasonable test infra scope for this PR.
- Fix 3 is a one-word addition to an error code list; the `canonicalizeExistingAncestor` helper is module-private and the integration test for circular-symlink → typed 400 would require exporting it OR setting up a real circular-symlink workspace. Both routes widen scope beyond the security fix itself; the high-level behavior is verifiable by the existing route-error-mapping test pattern + diff review.

A follow-up PR can add the integration tests once the security fix itself has shipped; the immediate priority is closing the arbitrary-file-deletion + arbitrary-file-truncation primitives.

- 62/62 acp-bridge tests pass
- 174/174 cli httpAcpBridge.test.ts pass
- typecheck + eslint clean

#### Refs

- Original review on QwenLM#4297 (wenshao via qwen-latest agent), post-merge, currently unresolvable on QwenLM#4297 itself because that PR is already MERGED.
- Other 2 QwenLM#4297 review threads (`const.ts` test coverage, `runQwenServe.ts` malformed-context observability) target files outside F1's scope and will land as separate follow-up PRs.

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix: post-merge Codex P2 fold-in - MCP restart disabled-tools normalization + SDK timeout headroom (QwenLM#4319)

Folds in 2 P2 findings from a Codex review run on `git diff main...HEAD` of F1 PR QwenLM#4319. Both are pre-existing in code merged into `daemon_mode_b_main` before F1 was created (QwenLM#4282 PR 17), but they're tiny tactical fixes (~25 LOC + 1 LOC) on the same integration branch the same reviewer (wenshao) already engages with, so folding into F1 saves an extra follow-up PR cycle.

#### Fix 1: normalize disabled tool names during MCP restart refresh

`packages/cli/src/acp-integration/acpAgent.ts:1563-1566`

The bootstrap path in `cli/src/config/config.ts:1426-1434` applies a 4-step normalization to `tools.disabled`:

1. typeof string filter
2. .trim()
3. drop empty after trim
4. dedupe via Set

The MCP-restart refresh path only did step 1, then stored the raw strings. `ToolRegistry` checks disabled tools with EXACT `Set.has(tool.name)`, so a tool disabled at boot as `' Foo '` (or `'Foo\n'`) is no longer matched after `restartMcpServer` and gets silently re-registered. This contradicts the documented "toggle + restart" workflow that QwenLM#4282 PR 17 advertised.

Fix: mirror the bootstrap normalization verbatim before `setDisabledTools`. Adds 6 lines + a 7-line comment pointing at the bootstrap reference for future maintainers.

#### Fix 2: add headroom to MCP restart SDK timeout

`packages/sdk-typescript/src/daemon/DaemonClient.ts:102`

The SDK's `MCP_RESTART_DEFAULT_TIMEOUT_MS` was EXACTLY 300_000ms, the same ceiling the daemon's own `MCP_RESTART_TIMEOUT_MS` uses for the upper bound on a single MCP rediscovery. For restarts that finish (or fail with a typed `McpServerRestartFailedError` JSON envelope) near 300s, the client `AbortSignal` could fire BEFORE the daemon had finished serializing + transmitting the response, yielding a client `TimeoutError` even though the daemon was still within its own budget.

Fix: bump to 330_000ms (10% / 30s headroom over the daemon ceiling). Comment updated to call out the race + the rationale for the specific headroom value. Callers needing tighter caps still pass their own `timeoutMs` to `restartMcpServer`.

#### Why folded into F1 vs separate follow-up PRs

These are post-merge findings on `QwenLM#4282 PR 17` code, not F1-introduced regressions. Normally we'd track as separate follow-up issues (mirror of the QwenLM#4325 / `channelInfo` decline). But:

- Both fixes are TINY (~25 LOC + ~2 LOC including comment); the bridge security fold-in commit `7bd66c6e8` set the precedent of folding in small same-branch issues when the cost-benefit favors closing them immediately.
- Same reviewer (wenshao via qwen-latest agent): won't be confused by the scope expansion; in fact the original PR 17 commenter is also the one who'd review the follow-up issue's fix.
- Both fixes target `daemon_mode_b_main`-only paths (MCP restart route added by PR 17 lives on the integration branch).
- Saves opening 2 trivial follow-up issues that would just sit until someone picks them up.

#### Verification

- sdk-typescript: 424/424 tests pass (no test hardcoded the old 300_000 default; only the constant declaration itself referenced it)
- cli acp-integration: 282/282 tests pass (no test exercised the exact whitespace-bearing disabled-tools scenario, so no test changes were strictly required; a regression test would belong in a separate test-coverage PR alongside the const.ts test gap from the QwenLM#4297 unresolved-comment thread)
- typecheck clean across cli + sdk-typescript

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* docs(acp-bridge): wenshao review round 4 - 3 Suggestion fold-ins (QwenLM#4319)

1. **bridge.ts:2270 stale line refs in `publishWorkspaceEvent` JSDoc**: comment said `permission_resolved at line 1717` (actual: line 682) and `broadcastWorkspaceEvent closure at ~line 2127` (actual: line 1281). Line numbers drifted across the lift commits. Replaced both with function-name refs (`in resolvePending`, `declared above in this factory body`) that survive future edits.
2. **`ws.ts:613` opaque references in bridgeFileSystem.ts:20 + bridgeOptions.ts:267**: no `ws.ts` file exists in the repo; the ref came from an internal review thread on PR 18 that future readers can't locate. Replaced with a self-contained description ("post-PR-18 follow-up thread about BridgeClient's inline fs proxy bypassing WorkspaceFileSystem (originally raised in QwenLM#4250 review)") plus a cross-reference to the FIXME(stage-1.5, chiga0 finding 4) already lifted into this package.
3. **bridge.ts:3503 duplicate `canonicalizeWorkspace` re-export**: `index.ts:11` already does `export * from './workspacePaths.js'` which exposes `canonicalizeWorkspace` through the package barrel. The bridge.ts re-export was a leftover from the lift that just duplicated the symbol at the barrel level (`bridge.ts` then re-exports it again via `index.ts`'s `export * from './bridge.js'`). Removed; `canonicalizeWorkspace` stays available via the package barrel + the `@qwen-code/acp-bridge/workspacePaths` subpath, which is what the cli shim already imports from.

- 62/62 acp-bridge tests pass
- 174/174 cli httpAcpBridge tests pass
- typecheck + eslint clean

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* fix(acp-bridge): wenshao round 5 - killChild deadline log + stale line-ref cleanup (QwenLM#4319)

Folds in 1 of 3 wenshao Suggestions on F1 PR QwenLM#4319 round 5; 2 declined with tracking issues opened (QwenLM#4329, QwenLM#4330).

**Adopted:** `spawnChannel.ts:323`: `killChild` hard deadline now emits a stderr warning before abandoning a stuck child. Pre-fix the `setTimeout(KILL_HARD_DEADLINE_MS)` silently resolved the promise, letting `bridge.shutdown()` claim graceful shutdown while a `qwen --acp` zombie still held FDs / memory / locks. Under systemd/k8s supervision this lets the daemon respawn race the orphan for the same workspace. New warning is a single line on the daemon's stderr (`qwen serve: killChild hard deadline (10000ms) reached; child pid=... still alive (uninterruptible sleep?) — abandoning. Operator should check for zombie qwen --acp processes...`) so monitoring/log aggregators catch the zombie signal.

**Partial adopt:** `acpAgent.ts:1564`: replaced the hard-coded `cli/src/config/config.ts:1426-1434` line-number cross-reference (will drift when config.ts is edited) with a content-anchor pointer ("search for `disabledTools` array population around the `tools.disabled` settings read"). Same class of stale-line-ref cleanup F1 already did across `bridge.ts` / `permission.ts` / `bridgeClient.test.ts`.

**Declined** for F1 scope, both with tracking issues:

- `acpAgent.ts:1564`: extract a shared `normalizeDisabledToolList()` helper for the boot path + restart path so future enhancements (case-folding, Unicode normalization, plugin-name aliasing) only edit one site. Tracked as QwenLM#4329.
- `DaemonClient.ts:112`: enforce SDK/server MCP-restart timeout coupling so a future bump on either side doesn't silently re-introduce the race that `b78de2719` fixed. Tracked as QwenLM#4330 (shared constant vs cross-package integration test vs startup assertion; three options enumerated).

Both extractions have real merit but are structural refactors that sit outside F1's "mechanical lift + targeted security/doc fixes" scope. Folding either would add new shared-utility / shared-package plumbing the lift PR explicitly avoids.

- 62/62 acp-bridge tests pass
- 174/174 cli httpAcpBridge tests pass
- typecheck + eslint clean

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)

* refactor(cli): extract normalizeDisabledToolList helper - fold-in for wenshao QwenLM#4319 round 5 (closes QwenLM#4329)

Folds in wenshao Suggestion from QwenLM#4319 round 5 (originally declined as out-of-scope, opened as QwenLM#4329 for follow-up tracking). User pushed back that the helper is small enough + same package as the duplicate sites, so doing it inline rather than as a separate follow-up PR closes the review thread completely.

## Change

New file `packages/cli/src/config/normalizeDisabledTools.ts`:

```typescript
export function normalizeDisabledToolList(raw: unknown): string[]
```

4-step normalization (`typeof string` filter + `.trim()` + drop empty + dedupe preserving first-occurrence order). Non-array `raw` short-circuits to `[]` so callers can pass arbitrary settings-shaped input without `Array.isArray` boilerplate.

Replaces two byte-identical inline implementations:

- `packages/cli/src/config/config.ts:1426-1434` (bootstrap path): was 9 lines of inline trim+dedupe loop.
- `packages/cli/src/acp-integration/acpAgent.ts:1571-1591` (MCP restart refresh path): was 10 lines + an `Array.isArray` gate + 20 lines of explanatory comment about why it had to mirror the bootstrap path.

Both call sites now just call `normalizeDisabledToolList(raw)`.

## Why it matters

`ToolRegistry.has(tool.name)` is an exact-string match. A hand-edited `tools.disabled: [' Foo ', '', 'Foo']` settings entry must produce `Set(['Foo'])` at boot AND after every `restartMcpServer`; otherwise the boot-disabled tool gets silently re-registered after the next MCP restart (the bug Codex P2 originally caught in `b78de2719`). Sharing the helper makes future enhancements (Unicode normalization, plugin-name aliasing, case-folding decisions) edit exactly one site.

## Tests

New `packages/cli/src/config/normalizeDisabledTools.test.ts` (16 tests) covering:

- non-array short-circuit (undefined, null, object, number, string, bool)
- typeof-string filter (drops mid-array non-strings without aborting)
- trim + empty-skip (whitespace-only entries dropped)
- dedupe (exact match, whitespace variants collapse to first occurrence, case NOT folded)
- boot/restart parity scenarios (the BkwQW class the helper was written to prevent)
- order preservation across trim + dedupe

## Refs

- Closes QwenLM#4329
- F1 PR QwenLM#4319 originally tracked the helper extraction as deferred (commit `5f6b55e80` round 5 reply); now folded in here.
- Original duplicate introduction was `b78de2719` (Codex P2 fold-in for MCP restart normalization).

🤖 Generated with [Qwen Code](https://github.com/QwenLM/qwen-code)
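
The four-step normalization this commit describes could look roughly like the sketch below. Only the function name and signature come from the commit message; the body is an assumption written for illustration and may differ from the committed helper.

```typescript
// Illustrative sketch: the signature is quoted from the commit message,
// the body is an assumed implementation of the four documented steps.
function normalizeDisabledToolList(raw: unknown): string[] {
  // Non-array input short-circuits to an empty list.
  if (!Array.isArray(raw)) return [];

  const seen = new Set<string>();
  const result: string[] = [];
  for (const entry of raw) {
    if (typeof entry !== 'string') continue; // step 1: typeof string filter
    const name = entry.trim(); // step 2: trim
    if (name.length === 0) continue; // step 3: drop empty after trim
    if (seen.has(name)) continue; // step 4: dedupe, first occurrence wins
    seen.add(name);
    result.push(name);
  }
  return result;
}
```

Because both the bootstrap path and the restart path would call the same function, a hand-edited entry such as `' Foo '` resolves to the same exact-match key in both places.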

## Critical #1: 401/403 reconnect storm + transcript wipe

`DaemonSessionProvider`'s reconnect loop kept retrying `createOrAttach` on 401/403 whenever `autoReconnect: true` was set. Each cycle:

- hit the daemon with the same bad token → 401 again
- cleared the session handle
- the next successful attempt (if the token magically recovered) would receive a different sessionId, triggering the `store.reset()` branch at line 143 and wiping the user's transcript
- no terminal "auth failed" state surfaced to the user

Fix: split `TERMINAL_SESSION_HTTP_STATUSES` into `AUTH_FAILURE_HTTP_STATUSES` (401, 403) and the rest (404, 410). On auth failure, return from the reconnect loop unconditionally, regardless of the `autoReconnect` flag: these are credential failures, not transient ones. The user must update credentials; daemon spam must stop. The `extractHttpStatus` helper was factored out of `isTerminalSessionHttpError` so the two predicates can share it.

## Critical #2: rawInput / rawOutput leaking secrets to UI

`normalizer.normalizeToolUpdate` forwarded `rawInput` / `rawOutput` verbatim onto `DaemonUiToolUpdateEvent` → `DaemonToolTranscriptBlock`. The `details` projection was redacted via `stringifyRedactedJson` / `redactSensitiveFields`, but the underlying `rawInput` / `rawOutput` fields were unredacted. Any UI component that read those fields directly (ShellToolCall, WriteToolCall, JSON debug panels) leaked the raw values to the DOM. Example: `{ command: 'curl', apiKey: 'sk-prod-...' }` had `apiKey` redacted in `details` but exposed verbatim on `rawInput`.

Fix: apply `redactSensitiveFields` to both `rawInput` and `rawOutput` ONCE at the normalizer boundary, then reuse the redacted shape for the `details` projection. Downstream is uniformly safe; no double traversal.
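
A minimal sketch of redacting once at the normalizer boundary and reusing the redacted shape, under stated assumptions: `redactSketch`, `normalizeToolUpdateSketch`, the `ToolUpdateSketch` shape, and the sensitive-key pattern are all hypothetical stand-ins, not the SDK's actual `redactSensitiveFields` or event types.

```typescript
// Hypothetical key pattern; the real redactor's matching rules are not shown in this PR text.
const SENSITIVE_KEY = /(api[-_]?key|token|secret|password|authorization)/i;

// Recursively replaces values under sensitive-looking keys, preserving structure.
function redactSketch(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redactSketch);
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEY.test(key) ? '[redacted]' : redactSketch(inner);
    }
    return out;
  }
  return value;
}

// Hypothetical event shape, reduced to the three fields discussed above.
interface ToolUpdateSketch {
  rawInput: unknown;
  rawOutput: unknown;
  details: string;
}

// Redact each payload once, then build every projection from the redacted copies.
function normalizeToolUpdateSketch(rawInput: unknown, rawOutput: unknown): ToolUpdateSketch {
  const safeInput = redactSketch(rawInput);
  const safeOutput = redactSketch(rawOutput);
  return {
    rawInput: safeInput,
    rawOutput: safeOutput,
    details: JSON.stringify({ input: safeInput, output: safeOutput }),
  };
}
```

Since `details` is derived from the already-redacted values, a component that reads `rawInput` directly sees the same safe shape as one that reads `details`.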
## Tests (49/49 pass)

- SDK `daemonUi.test.ts` (36 tests, +1): new test `redacts sensitive fields in tool.update rawInput and rawOutput at normalizer boundary` verifies that a full-event string scan finds zero secret values and that structural keys are preserved with values `'[redacted]'`.
- WebUI `DaemonSessionProvider.test.tsx` (13 tests, +2): new tests `breaks out of the reconnect loop on 401 / 403 auth failures even when autoReconnect is true` and `still reconnects on 404 / 410 session-not-found errors when autoReconnect is true` lock in the asymmetry: auth failure → 1 attempt only; session-not-found → retries until success.

## Out of scope (declined / deferred; see PR review reply)

- CRIT #3 `withActionTimeout` test coverage gap → behavior correct, test-only follow-up (avoids PR bloat)
- Suggestions #4-7 → 4 nice-to-haves, deferred to keep PR focused on production-correctness fixes

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…#4340)

* fix(review): harden SKILL.md against weak-model rule skipping

Weak models often skip parts of the long /review prompt and fall back to familiar defaults: `gh pr checkout` instead of the worktree flow, or running the autofix prompt even when the user passed `--comment` (which means "only post inline comments, don't mutate code").

Three reinforcements, all in SKILL.md (no CLI changes):

- Promote the two most commonly violated rules to the top of the "Critical rules" list: worktree is mandatory for PR reviews, and `--comment` skips Step 8 entirely.
- Add an inline blockquote at the top of the Step 1 PR branch that names the specific forbidden commands (`gh pr checkout`, `git checkout`, `git switch`, `git pull`, `git reset --hard`).
- Add an explicit skip block at the top of Step 8 listing the three conditions that bypass autofix (`--comment`, cross-repo lightweight mode, or no fixable findings) so a weak model doesn't have to infer them from scattered earlier text.

* fix(review): address /review comments on rule scope + Step 8 dedup

Follow-up to the initial harden pass, addressing the inline review comments on PR QwenLM#4340.

Rule #1 (worktree mandatory):

- Scope it to **same-repo PR reviews** so cross-repo PRs running in lightweight mode (no matching local remote, no worktree) don't read as a contradiction.
- Replace "Your very first action" with "After argument parsing and remote detection, the first command that touches code state". The literal "very first" was wrong since `--comment` parsing and URL/remote disambiguation legitimately run before `fetch-pr`.
- Align the forbidden-command list with the Step 1 blockquote (add `git pull` and `git reset --hard`) so a weak model that only reads the Critical rules section sees the same five commands as a model that reaches the blockquote at the point of use.
- Add an explicit "cross-repo PRs use lightweight mode" parenthetical so the same model knows where to look for the alternative path.

Step 8 skip block:

- Drop the redundant third bullet ("no Critical or Suggestion findings with concrete, applicable fixes"). It was both logically equivalent to the "Otherwise" clause below and used a different qualifier ("concrete, applicable" vs "clear, unambiguous"), risking a weak model treating them as two distinct thresholds.
- "ANY of the following" → "EITHER" since only two bullets remain.
- Fold the no-findings case into the Otherwise clause as a no-op note.

* docs(serve): F2 MCP transport pool design (v2.1)

Design document for F2 shared MCP transport pool: a workspace-scoped pool that replaces today's per-session McpClient spawning so N sessions in one workspace share one process per unique server config.

v2.1 folds in 12 review corrections on top of v2:

- single-PR delivery per #4175 branching strategy (commit-by-commit review)
- sessionToEntries reverse index for O(refs) releaseSession
- ?entryIndex= selective restart route
- spawn-failure slot leak fix
- in-flight tool call during reconnect semantics (MCPCallInterruptedError)
- /mcp disable triggers SessionMcpView re-apply
- entryIndex exposure instead of raw fingerprint (avoid token-rotation side-channel)
- reconnect backoff spec (stdio 5s x3, HTTP exponential 1/2/4/8/16s x5)
- canonicalOAuth normalization
- legacyInProcessAcquire renamed to createUnpooledConnection
- drainAll(opts?) signature with timeoutMs
- locked SDK reducer field names (no public API rename)
- extension uninstall orphan entries deferred to MAX_IDLE_MS natural reap

Refs: #3803, #4175 F2

Generated with Qwen Code

* docs(serve): fix V21-10 changelog row wording

Replace-all regression from prior commit: both sides of the rename arrow ended up as createUnpooledConnection. Restore the meaning (old name was descriptive, not a literal symbol).

Generated with Qwen Code

* refactor(core): split McpClient.discover into pure tool/prompt list (#4175 F2 commit 1)

Foundation for the F2 shared MCP transport pool. Splits the existing side-effecting discovery API into a pure version that returns a {tools, prompts} snapshot, so the upcoming pool (#4175 F2 commit 2) can let a single shared McpClient produce one snapshot and have N per-session SessionMcpView instances each register a filtered copy into their own ToolRegistry / PromptRegistry.
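
The split between a pure snapshot step and a per-session registration step can be sketched as below. Every name here (`ListingClientSketch`, `NameRegistrySketch`, `discoverAndReturnSketch`, `registerFilteredCopy`) is a hypothetical stand-in chosen for illustration; tools and prompts are reduced to plain names, which the real types are not.

```typescript
// Hypothetical shapes: the real snapshot carries tool and prompt objects, not names.
interface DiscoverySnapshotSketch {
  tools: string[];
  prompts: string[];
}

interface ListingClientSketch {
  listTools(): Promise<string[]>;
  listPrompts(): Promise<string[]>;
}

class NameRegistrySketch {
  readonly names = new Set<string>();
  register(name: string): void {
    this.names.add(name);
  }
}

// Pure step: returns a snapshot and touches no registry.
async function discoverAndReturnSketch(
  client: ListingClientSketch,
): Promise<DiscoverySnapshotSketch> {
  const [tools, prompts] = await Promise.all([
    client.listTools(),
    client.listPrompts(),
  ]);
  return { tools, prompts };
}

// Per-session step: registers a filtered copy of the shared snapshot.
function registerFilteredCopy(
  snapshot: DiscoverySnapshotSketch,
  toolRegistry: NameRegistrySketch,
  promptRegistry: NameRegistrySketch,
  allowTool: (name: string) => boolean,
): void {
  for (const tool of snapshot.tools) {
    if (allowTool(tool)) toolRegistry.register(tool);
  }
  for (const prompt of snapshot.prompts) promptRegistry.register(prompt);
}
```

One call to the pure step can then feed any number of sessions, each applying its own filter, without the shared snapshot being mutated.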
Changes:

- Extract listMcpPrompts(serverName, mcpClient): pure version of discoverPrompts that returns DiscoveredMCPPrompt[] (with serverName and bound invoke) WITHOUT touching any PromptRegistry.
- Refactor discoverPrompts(name, client, registry) to wrap listMcpPrompts + register; preserves historical Promise<Prompt[]> return type (strips serverName / invoke from returned plain Prompt objects so existing callsites are unaffected).
- Add McpClient.discoverAndReturn(cliConfig): pure method returning {tools, prompts}. Same error semantics as discover(): flips status to DISCONNECTED on any failure and re-throws; "No prompts or tools found on the server." sentinel preserved so wrapping managers / pools can distinguish "server up but empty" from "server down".
- Refactor McpClient.discover(cliConfig) to delegate: calls discoverAndReturn then explicitly registers BOTH tools and prompts into the per-instance registries. Pre-F2 prompts were registered as a side effect inside discoverPrompts; post-F2-1 registration happens in discover() after the pure call returns. Observable side effects identical (both registries populated by end of call); the order flip (tools first, then prompts vs. prompts first as side effect, then tools) has no observable race because discover() is awaited as a unit by connectAndDiscover and the two registries are independent maps.
- Remove dead private methods McpClient.discoverTools and McpClient.discoverPrompts that delegated to the exported functions.

Tests:

- 7 new tests covering discoverAndReturn (snapshot purity, no registration, no-prompts-or-tools rejection with DISCONNECTED status flip, unconnected-state guard) and listMcpPrompts (enriched return type with invoke, no-prompts-capability fallback, protocol error swallow).
- 1 new backward-compat test asserting discoverPrompts wrapper still registers prompts AND strips enrichment fields from return value.
- 1 forward-defense assertion: the no-prompts-or-tools throw path verifies registries were strictly untouched, catching future regressions in commits 2-6 that might register a partial batch before the guard fires.

Backward compatibility:

- McpClient.discover() signature and side-effect contract unchanged for all standalone qwen callers + existing tests (44/44 pass).
- discoverPrompts() exported signature unchanged.
- No new public exports from packages other than listMcpPrompts + McpClient.discoverAndReturn (additive).
- All 36 pre-existing tests in mcp-client.test.ts pass; all 71 tests in mcp-client-manager.test.ts pass.
- packages/core typecheck clean; lint clean on touched files.

Refs: #3803, #4175 F2; design doc docs/design/f2-mcp-transport-pool.md §7

Generated with Qwen Code

* feat(core): McpTransportPool + SessionMcpView (#4175 F2 commit 2)

Core implementation of the F2 shared MCP transport pool. Workspace-scoped pool that lets N ACP sessions share one MCP client per unique (serverName, fingerprint) tuple instead of each session spawning its own MCP child process.

New files:

- mcp-pool-events.ts: PoolEvent discriminated union, PoolEntryState enum, MCPCallInterruptedError class (§13.4), type guards.
- mcp-pool-key.ts: fingerprint() with sorted canonical form for stable hashing across env-key permutations; canonicalOAuth() collapses {enabled:false}/undefined/null/{} to null (V21-9); mcpTransportOf() classification; isPoolable() opt-in gate; POOLED_TRANSPORTS_DEFAULT = {stdio, websocket} (V21 C8); connectionIdOf / parseConnectionId.
- session-mcp-view.ts: per-session, per-server projection of the pool's snapshot into a session's own ToolRegistry + PromptRegistry. passesSessionFilter() preserves pre-F2 include/exclude semantics. applyTools clones each tool via withTrust() so per-session trust never cross-contaminates the shared snapshot (V21 C7). teardown() drops all this view's registrations.
- mcp-pool-entry.ts: PoolEntry class with refcount, drain state machine (spawning -> active <-> draining -> closed | failed), generation counter for stale-handler guard (§7.3), snapshot replay on attach (§7.2 / V21 C4), restart() with in-flight coalescing (§13.2), forceShutdown() with idempotency, MAX_IDLE_MS hard cap that survives drain/attach flap. defaultPoolEntryOptions() returns transport-keyed defaults (stdio: 5s fixed x3, http: 1/2/4/8/16s exponential x5 per §6.6).
- mcp-transport-pool.ts: top-level McpTransportPool class.
  - acquire(name, cfg, sid, toolReg, promptReg): pool lookup, spawnInFlight dedup for concurrent acquires, slot reservation released on spawn failure (V21-4), sessionToEntries reverse index for O(refs) releaseSession (V21-2).
  - release(id, sid) / releaseSession(sid).
  - restartByName(name, {entryIndex?}): V21-3 selective restart via opaque entryIndex; returns RestartResult[].
  - getSnapshot(): includes entryCount + entrySummary (with opaque entryIndex, NOT raw fingerprint per V21-7) for the pool-aware status route in commit 5.
  - aggregateStatusByName(): "any-CONNECTED wins" across multi-entry name collisions (§8.1).
  - drainAll({force?, timeoutMs?}): wall-clock bounded graceful shutdown for QwenAgent.close (§17 + V21-11).
  - createUnpooledConnection(): SDK MCP + HTTP-no-opt-in path constructs a per-session McpClient and uses the legacy discover() (which writes to session registries directly).
  - poisonedToolRegistry/PromptRegistry: stub passed to pool's own McpClient instances; throws on any registration to catch regressions where a pool path accidentally fell back to side-effecting discover() instead of discoverAndReturn().

Changes:

- mcp-tool.ts: added DiscoveredMCPTool.withTrust(trust) clone method (analogue of asFullyQualifiedTool but only updates trust; returns this when trust unchanged to skip allocation in the common case).
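
To illustrate the "sorted canonical form" idea behind the pool key, here is a rough sketch. The config shape, the field names, and the choice to emit a canonical JSON string rather than a hash are all assumptions for illustration; the actual `fingerprint()` and `canonicalOAuth()` in `mcp-pool-key.ts` may be structured differently.

```typescript
// Hypothetical, heavily reduced server config shape.
interface ServerConfigSketch {
  command?: string;
  args?: string[];
  env?: Record<string, string>;
  oauth?: { enabled?: boolean } | null;
  includeTools?: string[]; // per-session filter, deliberately left out of the key
}

// Collapses "no OAuth" variants to null. Treating any object with
// enabled === false as null is an assumption of this sketch.
function canonicalOAuthSketch(
  oauth: ServerConfigSketch['oauth'],
): { enabled?: boolean } | null {
  if (oauth === undefined || oauth === null) return null;
  if (Object.keys(oauth).length === 0) return null;
  if (oauth.enabled === false) return null;
  return oauth;
}

// Sorts env entries by key so insertion order cannot change the key.
function sortedEnvEntries(env: Record<string, string> | undefined): [string, string][] {
  return Object.entries(env ?? {}).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

// Returns a canonical string; a real implementation would likely hash it.
function fingerprintSketch(cfg: ServerConfigSketch): string {
  return JSON.stringify({
    command: cfg.command ?? null,
    args: cfg.args ?? [], // argument order is significant, so it is not sorted
    env: sortedEnvEntries(cfg.env),
    oauth: canonicalOAuthSketch(cfg.oauth),
  });
}
```

Two configs that differ only in env-key order or in per-session filters map to the same key, while a changed env value yields a different one, which is the property the pool relies on for sharing and credential isolation.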

Tests (40 new):

- mcp-pool-key.test.ts (18 tests): fingerprint stability across env permutations, divergence on auth byte changes, exclusion of per-session filters from key, canonicalOAuth collapse, transport classification, isPoolable gate, connectionId round-trip with :: in server names.
- session-mcp-view.test.ts (11 tests): filter semantics, trust copy invariant (snapshot tool NOT mutated), allocation pin when trust unchanged, include/exclude precedence, prompt fan-out, updateConfig + re-apply, idempotent teardown.
- mcp-transport-pool.test.ts (11 tests): 3-session sharing with 1 spawn, credential isolation via env divergence, drain timer cancellation by re-attach, drain timer expiry, spawnInFlight dedup of 5 concurrent acquires, reverse-index releaseSession, restartByName + entryIndex selectivity, subprocessCount in snapshot, drainAll teardown.

No integration with daemon yet (acpAgent / Config / ToolRegistry wiring lands in commit 4). Pool currently constructible in isolation; existing standalone qwen + per-session McpClient path untouched and all 71 mcp-client-manager + 44 mcp-client tests pass unchanged.

Refs: #3803, #4175 F2; design doc docs/design/f2-mcp-transport-pool.md §4 architecture, §5 fingerprint, §6 lifecycle, §7 SessionMcpView

Generated with Qwen Code

* feat(core): cross-platform pid sweep + commit-2 review fixes (#4175 F2 commit 3)

Two adjacent concerns in one commit:

1. Cross-platform descendant pid sweep (new file pid-descendants.ts)
2. Two P1 bug fixes folded back from commit-2 self-review

== Pid descendant enumeration ==

`listDescendantPids(rootPid)` walks the process tree below the MCP child's root pid and returns all descendant pids in BFS order. `sigtermPids(pids)` sends SIGTERM tolerantly (ESRCH swallowed). Both are platform-aware:

- Linux/macOS: `pgrep -P <pid>` recursion (pgrep exit code 1 means no children, NOT an error; special-cased)
- Windows: PowerShell `Get-CimInstance Win32_Process` filtered by `ParentProcessId` (CIM replaces deprecated wmic on Win10 21H1+)

Bounded by `QUERY_TIMEOUT_MS=2000`, `MAX_DESCENDANTS=256`, `MAX_DEPTH=8` so a runaway process tree can't stall daemon shutdown. Graceful degradation: tool missing or timeout returns `[]` and logs warn; OS will eventually reap the orphans (Linux init / Windows job objects).

`PoolEntry.forceShutdown` now calls `getTransportPid()` → `listDescendantPids` → `sigtermPids` BEFORE `client.disconnect()`. Closes the leaked-wrapper-process gap that pre-F2 per-session McpClient teardown also had: wrappers like `npx`, `uvx`, `pnpm dlx` spawn the actual server as a grandchild; killing only the wrapper leaves the real server hanging.

New `McpClient.getTransportPid()` public getter that introspects `StdioClientTransport.pid` (returns undefined for non-stdio transports + already-exited children). Optional-chained call site in PoolEntry tolerates older mock McpClient stubs in tests.

== P1 fixes folded back from commit-2 review ==

P1 #1: PooledConnection.release() was a documented no-op that leaked refs until releaseSession bulk-cleanup. Wired `PooledConnectionImpl.releaseCallback` to the pool-supplied `pool.release(id, sessionId)`. Pool's `acquire` (both fast-path existing-entry and post-spawn paths) passes the callback through `PoolEntry.attach`'s new `opts.release` parameter.

P1 #2: createUnpooledConnection double-teardown. Path: client.discover() registers tools/prompts into session registries → entry.markActive([], []) → entry.attach(sid, view) which synchronously called view.applyTools([]) → removeMcpToolsByServer(serverName) wiping the registrations discover() just made. Fix: PoolEntry.attach now accepts `opts.skipReplay?: boolean`. createUnpooledConnection passes `skipReplay: true` AND a release callback that calls forceShutdown directly (per-session lifetime, no pool refcount). Existing pool paths pass `release` but NOT `skipReplay`, preserving snapshot replay for the late-attach race.

Tests (6 new on pid-descendants.test.ts):

- input validation (non-positive, NaN, no-children)
- sigtermPids empty input + ESRCH tolerance
- integration: spawn shell that spawns node grandchild, verify listDescendantPids finds at least one descendant (POSIX-only, CI-skip gated)

Verification:

- 161/161 MCP-related tests pass (44 mcp-client + 71 mcp-client-manager + 18 mcp-pool-key + 11 session-mcp-view + 11 mcp-transport-pool + 6 pid-descendants)
- packages/core typecheck clean
- lint clean on touched files

Not included (deferred to later commits):

- Health monitor / auto-reconnect inside PoolEntry. Existing per-server reconnect logic lives in McpClientManager (consecutiveFailures + isReconnecting + reconnectDelayMs); pool doesn't yet have its own monitor. PoolEntry.restart() works for manual restart; future commit will plumb `client.onerror` → pool's reconnect path with §6.6 backoff strategy.

Refs: #3803, #4175 F2; design doc §6.4 pid sweep, §6.5/§6.6 spawn failure + reconnect backoff, §7.2 snapshot replay

Generated with Qwen Code

* feat(serve): wire McpTransportPool into QwenAgent daemon mode (#4175 F2 commit 4)

Daemon-mode integration of the F2 shared MCP transport pool. Sessions running in the same workspace now share one MCP transport per unique server config, instead of each session spawning its own child process.

Touches:

- packages/core/src/config/config.ts: setMcpTransportPool / getMcpTransportPool. Pool reference stored on Config so ToolRegistry's nested McpClientManager construction can pick it up at config.initialize() time. Forward-declared via inline `import('...').McpTransportPool` to avoid a circular import between config.ts and tools/.
- packages/core/src/tools/tool-registry.ts: forwards config.getMcpTransportPool() into the McpClientManager ctor. When undefined, manager keeps its pre-F2 behavior (71/71 existing manager tests pass unchanged).
- packages/core/src/tools/mcp-client-manager.ts: new optional `pool?` ctor param + new `discoverAllMcpToolsViaPool` branch in discoverAllMcpTools. Gated on pool presence so standalone qwen is unaffected. Pool path:
  * Iterates servers with disable check
  * Calls pool.acquire(name, cfg, sessionId, toolReg, promptReg)
  * Tracks returned PooledConnection in `pooledConnections` map
  * On disconnectServer: pooled.release() + map delete
  * On stop(): releaseAllPooledConnections + existing flow

  SDK MCP servers stay on the legacy path inside the pool itself (createUnpooledConnection); manager doesn't need a parallel SDK code path.
- packages/cli/src/acp-integration/acpAgent.ts: QwenAgent.mcpPool field, eager construction in ctor (V21-13 Q6 resolved). Reads options from env vars set by runQwenServe:
  * QWEN_SERVE_NO_MCP_POOL=1 → kill switch (mcpPool stays undefined; sessions fall back to per-session spawn)
  * QWEN_SERVE_MCP_POOL_TRANSPORTS=stdio,websocket,http,sse → operator opt-in for HTTP/SSE pooling (V21 C8); default keeps stdio + websocket only
  * QWEN_SERVE_MCP_POOL_DRAIN_MS=N → drain grace override (default 30s; bounded [1s, 10min])

  newSessionConfig calls config.setMcpTransportPool(this.mcpPool) BEFORE config.initialize() so the ToolRegistry that initialize constructs picks up the pool reference. New `shutdownMcpPool(timeoutMs)` method called from the SIGTERM/SIGINT handler in runAcpAgent before runExitCleanup so the pool's descendant pid sweep (commit 3) catches npx/uvx wrapper grandchildren.
- packages/core/src/index.ts: barrel exports for the pool primitives (McpTransportPool, POOLED_TRANSPORTS_DEFAULT, types, helpers).
- packages/core/src/tools/mcp-pool-key.ts: dedupe: removed local McpTransportKind / mcpTransportOf definitions and re-export from mcp-client-manager.ts (avoids name collision in the index.ts barrel).

Tests:

- mcp-client-manager.test.ts: 2 new tests
  * "routes discovery through the pool when one is injected": asserts pool.acquire called with (name, cfg, sessionId, toolReg, promptReg); inverse invariant that McpClient is NOT constructed by the manager when pool present (catches a regression where the pool branch silently bypasses).
  * "falls back to per-session McpClient spawn when no pool injected": explicit backward-compat assertion.
- All 73/73 mcp-client-manager tests pass (71 existing + 2 new)
- All 163/163 MCP-related tests pass (44 + 73 + 18 + 11 + 11 + 6; incremented manager count)
- packages/core typecheck clean
- packages/cli typecheck: pool-related imports resolve; pre-existing serve/status.ts + @google/genai issues unrelated to F2 unchanged

Backward compatibility:

- Standalone qwen (non-daemon): QwenAgent not constructed; pool not constructed; behavior identical to pre-F2
- QWEN_SERVE_NO_MCP_POOL=1: kill switch falls back to per-session spawn even in daemon mode
- ACP child invoked with no pool env vars: defaults activate (pool on, stdio+websocket transports, 30s drain)
- Existing McpClientManager construction sites (ToolRegistry, test fixtures with the older 1-6 arg signatures) unchanged because new pool param is optional and trailing
- McpTransportKind / mcpTransportOf still exported from the same module path consumers used pre-F2

Not included (deferred to commits 5-6):

- Pool-aware GET /workspace/mcp snapshot (commit 5): buildWorkspaceMcpStatus still reads from bootstrap session's manager; pool snapshot integration via QwenAgent extMethod is next commit
- Pool-aware POST /workspace/mcp/:server/restart route with ?entryIndex= (commit 5)
- Budget guardrails graduation to workspace scope (commit 6): pool currently has no `--mcp-client-budget` integration, so per-session budget enforcement still applies in pool mode (each session's manager state machine is independent). PR 14b push events still fire per session.

Refs: #3803, #4175 F2; design doc §2 current state, §10 per-session injection, §17 shutdown ordering

Generated with Qwen Code

* fix(serve): repair acpAgent imports clobbered by pre-commit auto-format (#4175 F2 commit 4 follow-up)

The pre-commit eslint --fix in the previous commit (3dcdddf19) merged the value imports into the type-only import block, which yielded `import type { ... type McpTransportKind, ... }`. TypeScript rejects a nested `type` modifier inside `import type`.

Restore the original two-block layout: value imports for runtime symbols (McpTransportPool, POOLED_TRANSPORTS_DEFAULT, etc.) and a separate `import type { ... }` for types only (McpTransportKind, ApprovalMode, Config, ConversationRecord, DeviceAuthorizationData).

Pre-existing unrelated issues (ServeMcpTransport / @google/genai in cli/) are not addressed here.

Generated with Qwen Code

* fix(core): SDK MCP servers must stay on legacy path in pool mode (#4175 F2 commit 4 follow-up 2)

Self-review found a regression: pool mode would route SDK MCP servers through pool.acquire which delegates to createUnpooledConnection. createUnpooledConnection constructs an McpClient with the pool's `sendSdkMcpMessage` callback, but the pool was constructed in QwenAgent ctor with no callback, so SDK MCP server tool calls would fail in daemon mode.

Fix: discoverAllMcpToolsViaPool checks isSdkMcpServerConfig per server and routes SDK servers to the legacy discoverMcpToolsForServer path which preserves the per-session sendSdkMcpMessage wiring from McpClientManager's ctor. Non-SDK servers continue through pool.acquire.

Bypass is per-server, not per-manager, so a workspace mixing SDK and non-SDK servers gets both pool-shared transports for the non-SDK ones AND working SDK MCP for the rest.
Generated with Qwen Code

* fix(core): wenshao review fold-ins — 7 critical races + lifecycle gaps + 4 suggestions (#4175 F2 PR #4336)

Folds in @wenshao's first review pass on PR #4336. 7 critical bugs in pool lifecycle / race handling, 4 smaller suggestion fixes. Each issue keyed by its label in the PR comment thread for back-reference.

== Critical fixes ==

C1 (acpAgent.ts:269) — Normal IDE close path missing pool drain. `await connection.closed` returned without calling `shutdownMcpPool`, leaking shared MCP entries (subprocess + wrappers) until OS reaped them — a real regression vs pre-F2 where each session's manager tore down its own clients on disconnect. Mirror SIGTERM handler's pool drain on the normal-close branch too.

C2 (mcp-pool-entry.ts:291 area) — `attach()` ref ordering broke max-idle hard cap. Pre-fix, `attach` added the ref before calling `cancelDrainTimer`, so the `refs.size > 0` check inside cancelDrainTimer was always true and the maxIdle timer + firstIdleAt got reset on every attach — completely defeating its purpose (per design §6.3: "started at first idle and NEVER reset"). Fix: cancelDrainTimer now only cancels the drain grace timer; maxIdle survives the entire entry lifetime, cleared only by forceShutdown.

C3 (mcp-pool-entry.ts:401) — `doRestart()` zombie state on reconnect failure. Pre-fix, a thrown `client.connect()` / `client.discoverAndReturn()` propagated up but left the entry with `localStatus = CONNECTED`, `state = 'active'`, stale snapshot — pool snapshot lies, subsequent acquires reuse the broken entry. Fix: try/catch wraps connect + discover; on failure transitions to terminal `'failed'` state, sets DISCONNECTED status, emits `failed` event, detaches subscribers via SessionMcpView.teardown, calls onClosed so pool drops the entry from its map.

C4 (mcp-pool-entry.ts:361) — `forceShutdown`/`attach` race creates zombie connections.
Pre-fix, `state = 'closed'` was assigned AFTER two async yields (`await listDescendantPids`, `await client.disconnect()`). During those yields, a concurrent `acquire` calling `attach` only rejected `'closed'`/`'failed'` states — got a handle to an entry mid-teardown. Fix: flip state to `'closed'` synchronously at the top of forceShutdown, before any await. Concurrent attach now sees 'closed' immediately and rejects.

C5 (mcp-transport-pool.ts:399) — `drainAll` race with in-flight spawns. Pre-fix, after Promise.race resolved, `entries.clear()` + `spawnInFlight.clear()` ran synchronously. But in-flight spawn promises continued executing and called `entries.set(id, entry)` AFTER the clear — orphan entries leaking subprocesses past pool shutdown. Fix: introduce `draining` mutex flag (acquire rejects when set), and `await Promise.allSettled` on in-flight spawns BEFORE taking the entry snapshot. Spawn completion before clear is now ordered correctly.

C6 (mcp-pool-entry.ts:155) — PoolEntry ignored transport-level errors. Pre-fix, McpClient.onerror writes DISCONNECTED to the global `serverStatuses` map on transport drop, but PoolEntry's `localStatus` stayed CONNECTED — pool's `aggregateStatusByName` then read the stale localStatus and "any-CONNECTED-wins" overwrote the correct DISCONNECTED back into the global map. Fix: PoolEntry registers a module-level status change listener filtered by serverName, mirrors the GLOBAL value into localStatus on every change. `suppressNextStatusEcho` flag guards against listener loops when the entry's own updateGlobalStatus writes to the global map. Listener detached on forceShutdown / failed-state transition.

Sub-fix in spawnEntry: order is now `entries.set(id, entry)` BEFORE `entry.markActive(...)`.
Pre-fix, markActive ran updateGlobalStatus before entries.set, so aggregateStatusByName couldn't find the just-spawned entry, returned DISCONNECTED, wrote that to the global map, the new status listener echoed it back as `localStatus = DISCONNECTED` — defeating the CONNECTED state markActive had just set. Reorder + idempotent `entries.delete(id)` in catch covers the race.

C7 (mcp-client-manager.ts:966) — `discoverAllMcpToolsIncremental` bypassed pool. The pool gate in `discoverAllMcpTools` correctly routed the bulk path through `discoverAllMcpToolsViaPool`, but `discoverAllMcpToolsIncremental` (called from `Config.startMcpDiscoveryInBackground` during boot's default progressive mode) had no such guard — silently reverting to per-session McpClient spawning during the exact path most daemon sessions take. Fix: same `if (this.pool) return discoverAllMcpToolsViaPool(cliConfig)` gate at the top of discoverAllMcpToolsIncremental.

== Suggestions ==

S1 (session-mcp-view.ts:38) — Docstring claimed both includeTools and excludeTools support `<name>(<args>)` parens form, but only includeTools strips parens. excludeTools uses direct equality (matches pre-F2 `mcp-client.ts:isEnabled` history). Doc fixed to reflect actual behavior.

S2 (pid-descendants.ts:166) — `sigtermPids` docstring claimed it used `taskkill /F` on Windows, but the implementation always calls `process.kill(pid, 'SIGTERM')` regardless of platform. On Windows, Node polyfills SIGTERM to TerminateProcess (similar effect, no shell-out needed). Doc fixed; implementation unchanged.

S3 (session-mcp-view.ts:110) — Debug log contained literal "N" instead of `${count}` interpolation. Operators enabling debug logging saw a meaningless placeholder. Track actual `registered` count and interpolate.

S4 (mcp-transport-pool.ts:545) — `createUnpooledConnection` passed `() => MCPServerStatus.CONNECTED` as the status aggregator callback.
After forceShutdown, this would write CONNECTED to the global serverStatuses map even though the transport was dead. Fix: aggregator now delegates to `client.getStatus()` so the global map reflects the actual McpClient state.

== Verification ==

- 163/163 MCP-related tests pass (44 + 71 + 18 + 11 + 11 + 6 + 2)
- packages/core typecheck clean
- All fixes folded into the commit-where-the-bug-lived (commit 2 / commit 3 / commit 4) via fix-up commit on top — preserves bisectability of the buggy state for future forensics

Refs: PR #4336 review by @wenshao (commit 4 round 1)

Generated with Qwen Code

* feat(serve): pool-aware status + restart routes (#4175 F2 commit 5)

Wire the F2 transport pool into the daemon's `GET /workspace/mcp` and `POST /workspace/mcp/:server/restart` surfaces, plus advertise two new conditional capability tags.

Status route enrichment (`buildWorkspaceMcpStatus`):
- pool snapshot taken once outside the per-server loop (avoids N walks)
- per-server cells gain `entryCount` + `entrySummary` (V21-7 opaque `entryIndex`, NOT raw fingerprint) when the pool holds at least one matching entry
- pool snapshot failure is a stderr-loud non-fatal — the legacy budget-accounting cells still render

Restart route routing (`workspaceMcpRestart` ext method):
- new `?entryIndex=N` query param (or `*` / omitted) on `/workspace/mcp/:server/restart` — bounded non-negative integer or the literal `*`; bad inputs return `400 invalid_entry_index`
- ACP child routes through `pool.restartByName(name, {entryIndex})` when the pool holds entries; falls back to the legacy `discoverToolsForServer` path otherwise (`--no-mcp-pool` daemons, unpooled HTTP/SSE/SDK transports, or names that drained out)
- legacy single-entry response shape `{restarted, durationMs}` preserved; multi-entry responses use the new `{entries: RestartResult[]}` shape — clients gated on the `mcp_pool_restart` capability tag are the only senders of `entryIndex`
- pool-mode hard restart failure fans out one
`mcp_server_restart_refused` event per failed entry with `reason: 'restart_failed'` (additive enum value) plus `details` carrying the underlying error text; soft-skip pre-flight checks (`disabled` / `in_flight` / `budget_would_exceed`) still run BEFORE the pool branch

Capability advertisement:
- `mcp_workspace_pool` + `mcp_pool_restart` both gated on a new `mcpPoolActive` toggle in `AdvertiseFeatureToggles`
- conditional predicate is default-OFF (matches `require_auth` pattern); server.ts call site flips to default-ON via `opts.mcpPoolActive !== false`, so a daemon booted without the kill switch advertises both tags by default
- `runQwenServe.ts` infers `mcpPoolActive: false` when the parent process has `QWEN_SERVE_NO_MCP_POOL=1` so the envelope tracks the ACP child's actual feature set

SDK type extensions (additive only):
- `ServeWorkspaceMcpServerStatus.entryCount` + `entrySummary`
- `DaemonMcpServerRestartedData.entryIndex?`
- `DaemonMcpServerRestartRefusedData.{reason: 'restart_failed', entryIndex?, details?}`
- `MCP_RESTART_REFUSED_REASONS` widened to include `restart_failed`

Tests:
- `EXPECTED_REGISTERED_FEATURES` gains the two pool tags; conditional-features drift test asserts `mcpPoolActive` predicate behavior
- `daemonEvents.test.ts` exercises the new `restart_failed` reason through the reducer

163 F2 tests + 62 acp-bridge tests + 46 daemon events tests pass.

* fix(serve): self-review fold-ins for F2 commit 5 — capability test + SDK doc

Two findings from the code-reviewer pass on `edeb0a5cf`:

R1 (critical): the `/capabilities` v1-envelope test was asserting `features` against `getAdvertisedServeFeatures()` (no toggles → both new pool tags filtered out by the default-OFF predicate), but the actual response uses `mcpPoolActive: opts.mcpPoolActive !== false` (default-ON at the call site).
Anchored the assertion against the same toggle the route uses, plus added a separate test that explicitly boots with `mcpPoolActive: false` and verifies both pool tags drop out (mirrors the `QWEN_SERVE_NO_MCP_POOL=1` kill-switch path).

R3 (doc clarity): the `restart_failed` reason's jsdoc claimed old SDK reducers "see the new value as `unknown` (TS structural widening) and surface it generically rather than crashing." That described the type system but mis-stated the runtime: `isMcpServerRestartRefusedData` calls `MCP_RESTART_REFUSED_REASONS.has(...)` and returns false for unknown reasons, so `parseDaemonEvent` silently DROPS the event. New text explains the closed-set predicate + how the additive-protocol contract still holds (pre-PR SDKs gate on `mcp_pool_restart` before sending `entryIndex`, so they shouldn't be observing pool-mode multi-entry restarts).

* fix(core): wenshao R1-R8 review fold-ins for F2 commit 5

Eight findings from wenshao's review of commit 5; six adopted as real bug fixes / encapsulation wins, two with partial / declined replies.

R1 (critical): `maxIdleTimer` force-closed actively-used pool entries. The C2 fix intentionally let the timer survive attach/detach flap, but the fire-action didn't re-check `refs.size`. A session that re-attached inside the 30s drain grace and stayed busy for 4+ minutes would lose the entry permanently when `maxIdleTimer` (started at the earlier detach) fired. Now: if active refs exist at fire time, log + reset `firstIdleAt` so the next idle window gets a fresh hard cap.

R2 (critical): incremental discovery released ALL pooled connections then re-acquired everything. Pre-fix every progressive-mode boot pass or `/mcp refresh` produced a brief window with zero MCP tools registered AND bounced every entry's drain timer. Now: diff `pooledConnections` against the desired (name, fingerprint) set and release only stale entries; survivors stay attached, no tool registry churn.
SDK MCP servers still re-run via the legacy path (idempotent re-call).

R3 (correctness): `doRestart` updated `toolsSnapshot`/`promptsSnapshot` and emitted typed events but no `SessionMcpView` instance subscribed to that event stream — so session ToolRegistry instances kept stale pre-restart registrations. Latent until commit 5 landed the restart HTTP route; now a real correctness bug. Iterate `subscribers` directly after snapshot update so views actually pick up the new tools/prompts.

R4 (cosmetic→correctness): `getSnapshot()` counted websocket toward `subprocessCount`, but websocket transports dial a (potentially remote) server and don't spawn a local OS child — inflated the operator-facing capacity-planning metric. Restricted to `stdio` only.

R5 (defense-in-depth): the Windows `Get-CimInstance` PowerShell script interpolated `${pid}` directly into the `-Filter` string. The entry-point integer guard makes injection impossible today, but binding the pid to a `$p` variable up front makes the integer-only contract robust against future relaxations of the guard.

R6 (encapsulation): `PoolEntry.cfg` was readonly-public, exposing secrets (env API keys, header auth tokens, OAuth fields) to anyone holding an entry reference. Made private; added `transportKind` getter for the only external reader (subprocessCount classification in `getSnapshot`).

R7 (partial): removed five PoolEvent type guards, the `Prompt` re-export, and `PoolEntryConnectionStatus` — all premature public API with zero callers in source or tests. Kept `MCPCallInterruptedError` because design §13.4 declares it as the user-facing contract for the V21-5 in-flight call interruption follow-up; removing it would lose the invariant carrier.

R8 (cleanup): SIGTERM handler and IDE-initiated close path had identical `if (agentInstance) { try { await shutdownMcpPool(8_000) } catch ... }` blocks.
Extracted into `drainPoolBeforeExit(label)` so both paths share the timeout + log labels and future drain-semantic changes happen in one place.

R9 / R10 deferred: the McpClientManager 7th-arg sentinel pattern (R9) and per-PID-per-level pgrep cost (R10) work correctly today; both are refactoring/perf optimizations for a later cleanup PR rather than F2 correctness blockers.

Tests:
- All 163 F2 tests pass; all 73 mcp-client-manager tests pass
- No new tests added; the existing R3 fix was caught only because commit 5's restart route activated the latent path. Adding a unit test for the snapshot fan-out would require wiring a mock SessionMcpView; deferred to commit 6's test harness expansion.

* feat(serve): graduate MCP budget guardrails to workspace scope (#4175 F2 commit 6)

Move slot reservation + 75% hysteresis + refused-batch coalescing from per-session McpClientManager copies onto a single workspace-scoped controller owned by the pool. 4 sessions × budget=2 now caps the workspace at 2, not 8.

Core class (`packages/core/src/tools/mcp-workspace-budget.ts`):
- New `WorkspaceMcpBudget` mirrors the manager's state machine (`tryReserve` / `release` / `recordRefusal` / hysteresis at `MCP_BUDGET_WARN_FRACTION`/`MCP_BUDGET_REARM_FRACTION` / bulk-pass coalescing) but is constructed once per workspace.
- Reservation key is server NAME (matches PR 14 v1 contract; two pool entries with same name but divergent fingerprints share one slot).
- `recordRefusal` flushes inline as a length-1 batch when called out-of-bulk-pass; bulk passes accumulate and `endBulkPass` does the coalesced emit (mirrors `McpClientManager.refuseAndLog → emitRefusedBatchIfAny`).

Pool integration (`mcp-transport-pool.ts`):
- New optional `budget?: WorkspaceMcpBudget` ctor option + `getBudget()` accessor for snapshot builders.
- `acquire()` calls `tryReserve` pre-spawn; `'refused'` returns `BudgetExhaustedError` after `recordRefusal`.
Spawn-failure path rolls back the slot (V21-4) when no sibling entry holds the name.
- Entry close callback releases the slot if no other entry shares the same `serverName` (multi-fingerprint preservation).

Manager integration (`mcp-client-manager.ts`):
- `discoverAllMcpToolsViaPool` brackets the pass with `beginBulkPass`/`endBulkPass` so per-server BudgetExhaustedError refusals coalesce into ONE `refused_batch` event at end of pass.
- `BudgetExhaustedError` from pool is logged at debug (deliberate refusal, not a failure); other errors stay at `error`.

Daemon wiring (`acpAgent.ts`):
- `QwenAgent` ctor reads `QWEN_SERVE_MCP_CLIENT_BUDGET` / `QWEN_SERVE_MCP_BUDGET_MODE` env vars (same path as per-session manager) and constructs `WorkspaceMcpBudget` when budget > 0, passes it to the pool.
- `broadcastBudgetEvent(event)` fans workspace-scoped events to every attached session via per-sid `extNotification`s on the shared connection — replaces N per-session callbacks with one pool callback fanning out N times.
- `newSessionConfig` skips the per-session `setMcpBudgetEventCallback` wiring when the workspace budget is active (prevents double-firing).
- `buildWorkspaceMcpStatus` reads pool budget when active, marks the cell `scope: 'workspace'`. Per-session fallback unchanged.
- `buildBudgetCells` accepts optional `scope` parameter; pre-F2 daemons / `--no-mcp-pool` keep `'session'` for back-compat.

SDK additive surface (`sdk-typescript/src/daemon/events.ts`):
- `DaemonMcpBudgetWarningData.scope?: 'workspace' | 'session'`
- `DaemonMcpChildRefusedBatchData.scope?: 'workspace' | 'session'`
- New helper `isWorkspaceScopedBudgetEvent(data)` for SDK consumers branching on scope. Type predicates unchanged (scope is optional).
- Reducer counters (`mcpBudgetWarningCount` / `mcpChildRefusedBatchCount`) increment regardless of scope per V21-12 — workspace events fan to all sessions so counters move in lockstep.
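
The name-keyed slot reservation this commit describes can be sketched roughly as follows. This is an illustration under assumptions, not the real `WorkspaceMcpBudget`: the class name `NameKeyedBudget`, the `used` getter, and the two-value result type are hypothetical, and hysteresis and refused-batch coalescing are left out.

```typescript
// Hypothetical sketch; not the real WorkspaceMcpBudget API.
type ReserveResult = 'reserved' | 'refused';

class NameKeyedBudget {
  private readonly reserved = new Set<string>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // One slot per server NAME; re-reserving a name that is already held is free,
  // which is how two fingerprints of the same server share a slot.
  tryReserve(serverName: string): ReserveResult {
    if (this.reserved.has(serverName)) return 'reserved';
    if (this.reserved.size >= this.limit) return 'refused';
    this.reserved.add(serverName);
    return 'reserved';
  }

  release(serverName: string): void {
    this.reserved.delete(serverName);
  }

  get used(): number {
    return this.reserved.size;
  }
}

// Every session shares ONE budget instance, so the cap applies workspace-wide.
const budget = new NameKeyedBudget(2);
const outcomes = ['srvA', 'srvB', 'srvA', 'srvC'].map((name) =>
  budget.tryReserve(name),
);
```

With a limit of 2, the repeated `srvA` reuses its slot and `srvC` is refused until one of the held names is released.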
Tests:
- 17 new `WorkspaceMcpBudget` tests covering tryReserve, release, hysteresis state machine, refused-batch coalescing, getters
- 3 new pool integration tests covering acquire-refused-on-cap, slot release on entry close, slot rollback on spawn failure
- All 163 pre-existing F2 tests pass; 229 total core+SDK tests

Total: 1 new core class, ~600 LOC production + ~270 LOC tests.

* fix(core): self-review fold-ins for F2 commit 6 — slot release race + iter safety

Three findings from the code-reviewer pass on `ef2974b85`; one real race fix + two clarity/defensive improvements.

R1 (race, important — 86): close-callback released the budget slot prematurely when a same-name in-flight spawn was still running. The sibling check inspected only `this.entries`, missing entries that hadn't yet completed `markActive`. Sequence: entry A for 'srvA' finishes spawn → registers in `entries`. Entry B (different fingerprint, same name) starts spawning. Entry A drains; close-callback finds no siblings in `entries` (B not yet registered) → releases the slot. B finishes; slot is unreserved while B occupies capacity. A subsequent acquire for a third name slips past the cap. Fix: new `hasNameSibling(name)` helper checks BOTH `this.entries` and `this.spawnInFlight.keys` (form `${name}::${fingerprint}`, so a `startsWith(`${name}::`)` test isolates same-name in-flight spawns). Used by the close-callback AND the spawn-failure rollback. Order of catch/finally chained on the spawn promise is also fixed: `finally` removes from `spawnInFlight` BEFORE the `catch` runs the rollback, so `hasNameSibling` sees the post-cleanup state. Pre-fix the catch ran first while the in-flight entry was still in the Map — masked the rollback's release decision. New test: `preserves slot when entry closes during a same-name in-flight spawn (R1 race fix)` exercises exactly this sequence.

R2 (docs): SDK reducer counter docstrings updated to call out the N× workspace fan-out multiplier explicitly.
A workspace-scoped `mcp_budget_warning` event fires once at the budget but produces N reducer increments across N attached sessions on the daemon's connection. Pre-fix the docstring didn't mention this and consumers aggregating `mcpBudgetWarningCount` across sessions would double-count silently. Now both `mcpBudgetWarningCount` and `mcpChildRefusedBatchCount` docstrings have a "workspace-scope multiplier" paragraph pointing consumers at `isWorkspaceScopedBudgetEvent` for branching.

R3 (defense): `broadcastBudgetEvent` snapshots `this.sessions.keys` into `Array.from(...)` BEFORE the per-id async fan-out so a concurrent `killSession` (which mutates `this.sessions` synchronously inside its handler) can't corrupt the iterator. No known reproducer in the current code paths but cheap defensive hardening — matches the same pattern used by the bridge's `broadcastWorkspaceEvent`.

R2 of the original review (V21-12 reducer scope-blindness) is by-design per design §11.4: SDK consumers wanting a deduplicated "workspace events fired" tally use `lastMcpBudgetWarning?.scope` to gate. The docstring fix (above) closes the documentation gap that made this contract invisible.

Tests: 151 pool + workspace-budget + manager + SDK events tests pass (3 new pool integration tests including the R1 regression). Lint clean.

* fix(core): wenshao W1-W15 review fold-ins for F2 commits 5+6

Twelve real fixes (7 critical + 5 minor) + 3 declined-with-reply.

W1 (critical): pool spawn-failure leaked `statusChangeListener` — catch only ran `entries.delete` + `client.disconnect`, never `forceShutdown` (the sole removal path). Each failure leaked one listener permanently. Fix: call `entry.forceShutdown('manual')` before disconnect; wrap in try/catch since the entry never reached `active`.

W2 (critical): `statusChangeListener` corrupted sibling entries' `localStatus` for multi-fingerprint name collisions.
Module-level `serverStatuses` is shared across all entries with the same `serverName`; entry A's transport error wrote DISCONNECTED, B's listener fired with that status, and the `if (status !== this.localStatus)` guard didn't catch it because B was CONNECTED. Fix: cross-check `this.client.getStatus() !== status` (per-entry truth) before mirroring — sibling writes are now ignored.

W3 (critical): `doRestart()` skipped the `listDescendantPids` + `sigtermPids` sweep that `forceShutdown` performs. For stdio MCP servers wrapped by `npx`/`uvx`/`pnpm dlx`, every restart-via-HTTP left the actual server grandchild as an orphan. Fix: mirror the sweep BEFORE `client.disconnect`; per-pid failures tolerated.

W4 (critical): `doRestart()` didn't `cancelDrainTimer` or transition `'draining' → 'active'`. An entry in drain grace whose restart arrived would yield to the drain timer mid-disconnect, get force-closed, then `client.connect` would spawn an orphan that the pool no longer tracks. Fix: cancel drain + transition state at the top of `doRestart`.

W5 (critical): `McpClientManager.pooledConnections` held dead handles after a pool entry transitioned to `'failed'` (entry removed from `pool.entries`, manager never learned). Subsequent discovery passes saw `pooledConnections.has(name)` and skipped re-acquiring → server's tools permanently lost for the session until full `stop` + rediscovery. Fix: subscribe to entry events on `acquire`; evict on `'failed'` (idempotent via `get(name) === conn` guard).

W6 (critical): `discoverAllMcpToolsViaPool` was not re-entrant. Two concurrent passes (full + incremental, or two incrementals) could both see `pooledConnections.has(name) === false` before either called `.set()` → second `.set` overwrote first → conn1 leaked forever. Fix: per-manager `discoveryInFlight` mutex; second caller awaits the same promise.
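
The W6 fix (a second caller awaits the pass already in flight) follows a common promise-coalescing pattern. A minimal sketch, assuming a standalone helper: `DiscoveryGate` and `run` are hypothetical names, and the real manager keeps this state in its `discoveryInFlight` field rather than a separate class.

```typescript
// Hypothetical helper illustrating the "join the in-flight pass" pattern.
class DiscoveryGate {
  private inFlight: Promise<void> | undefined;

  run(pass: () => Promise<void>): Promise<void> {
    // A second caller joins the pass that is already running.
    if (this.inFlight) return this.inFlight;
    const current = pass().finally(() => {
      // Allow a fresh pass once this one settles (success or failure).
      this.inFlight = undefined;
    });
    this.inFlight = current;
    return current;
  }
}

let passesStarted = 0;
const gate = new DiscoveryGate();
const discover = async (): Promise<void> => {
  passesStarted += 1;
};

// Two back-to-back calls share one pass and one promise.
const first = gate.run(discover);
const second = gate.run(discover);
```

Because both callers hold the same promise, only one pass ever writes to the connection map at a time, which is what prevents the overwritten-handle leak described above.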
W14 (critical): `createUnpooledConnection`'s catch path had the same `statusChangeListener` leak as W1 (different code path, same root cause — only `forceShutdown` removes the listener). Fix: same mirror in the unpooled catch.

W9 (minor): `parsePoolDrainMs` accepted `'30000ms'` / `'30000abc'` silently via `Number.parseInt` truncation. Fix: strict `^\d+$` regex; reject with stderr warning + default fallback.

W10 (minor): pool's `acquire` called `indexAttach(sessionId, id)` BEFORE `entry.attach()`. If `attach` threw (e.g., entry transitioned to `closed`/`failed` between the existence check and the call), the reverse index retained a stale mapping. Fix: index AFTER `attach` succeeds (both fast path + in-flight path).

W13 (doc): `subprocessCount` JSDoc still claimed `stdio + websocket` after R4 restricted it to stdio in commit 5. Fix: doc updated.

W15 (defensive): bridge's pool-mode response handler cast `response as PoolEntries` and iterated `response.entries` without runtime shape validation. A buggy/out-of-sync ACP child returning a malformed shape would crash the route with TypeError. Fix: `Array.isArray` check + per-entry shape guard; malformed entries skipped with stderr warning.

W7 (test gaps, partial): added regression test `serializes concurrent discovery passes via mutex` for W6. Other coverage gaps (drain mutex, spawnEntry failure, restart failure, createUnpooledConnection) are deferred — better addressed via a focused test-coverage commit after F2 series merges.

Declined (with reply on PR):
- W8 (`maxReconnectAttempts`/`reconnectStrategy` unused) — health monitor reconnect is a deferred F2 follow-up per design §6.6; the fields stay as forward-compat placeholders.
- W11 (duplicate fast-path/in-flight-path attach blocks) — accepted refactor opportunity; not blocking F2 series merge.
- W12 (passesSessionFilter O(M×N)) — micro-perf optimization; measurable only with hundreds of tools / large filter lists.
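
The W9 truncation issue is easy to reproduce in isolation. The sketch below assumes a 30s default (taken from the "30s drain" default mentioned earlier in this log); `parseDrainMs` and `DEFAULT_DRAIN_MS` are hypothetical names and this is not the real `parsePoolDrainMs`.

```typescript
// Assumed default, based on the "30s drain" default mentioned in this log.
const DEFAULT_DRAIN_MS = 30_000;

// Hypothetical strict parser: digits only, otherwise warn and fall back.
function parseDrainMs(raw: string | undefined): number {
  if (raw === undefined) return DEFAULT_DRAIN_MS;
  if (!/^\d+$/.test(raw)) {
    // console.warn writes to stderr in Node.
    console.warn(`ignoring invalid drain value "${raw}"; using ${DEFAULT_DRAIN_MS}ms`);
    return DEFAULT_DRAIN_MS;
  }
  return Number(raw);
}

// Number.parseInt stops at the first non-digit, so the suffix is silently dropped.
const lenient = Number.parseInt('5000ms', 10);
const strictBad = parseDrainMs('5000ms');
const strictGood = parseDrainMs('5000');
```

`lenient` shows the pre-fix behavior (the `ms` suffix is ignored), while the strict parser rejects the same input and returns the default.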
Tests: 231 F2/SDK tests pass (1 new mutex regression test); 62 acp-bridge tests pass. Lint clean.

* docs(serve): F2 design v2.2 — record PR #4336 32-fold-in review history

The PR cycle on #4336 surfaced 32 review fold-ins across 3 wenshao review batches plus 2 self-review batches. Each fold-in is recorded in v2.2 changelog with site / what was wrong / fold-in commit ref so a future contributor reading the design doc + git log can trace every behavior nudge back to its review trigger.

Highlight critical fixes that landed mid-PR:
- C1 (IDE-close path missed pool drain — leaked entries until OS reaped)
- C3 (doRestart reconnect failure left zombie state)
- C5 (drainAll mid-spawn race)
- C6 (statusChangeListener missing serverName filter)
- WR1 (maxIdleTimer fire-action ignored active refs)
- WR2 (release-all-then-acquire-all left zero-tools window)
- WR3 (doRestart skipped subscriber fan-out)
- 6R1 (slot-release race during same-name in-flight spawn)
- W2 (sibling-fingerprint statusChangeListener corruption)
- W3 (doRestart skipped descendant pid sweep — orphan grandchildren)
- W4 (doRestart drain-timer race orphaned new subprocess)
- W5 (manager held dead handles after entry 'failed')
- W6 (discoverAllMcpToolsViaPool not re-entrant — leaked conn1)

Plus 5 declined-with-reply items (W7/W8/W11/W12/R9/R10) filed as F2 follow-ups for a future cleanup PR.

* fix(core): wenshao W21-W25 review fold-ins for F2 commit 6 — critical bugs round 4

Three critical bugs + one parsing divergence + one test gap, four adopted as fixes. Round 4 of cumulative wenshao review on F2 PR #4336; all earlier rounds (C1-C7+S1-S4, R1-R10, W1-W15) already shipped in `ae0b296c4` / `72399f109` / `4a3c5cd90`.

W21 (critical): `hasNameSibling` used `id.startsWith(\`${name}::\`)` on `spawnInFlight` keys, which produces false positives when a sibling name BEGINS with `${name}::` — server names CAN contain `::` per `mcp-pool-key.test.ts:258`, and `connectionIdOf` is just string concatenation with zero sanitization. Sequence: configure servers `"ext"` and `"ext::github"`, spawn for `"ext"` fails → rollback finds `"ext::github::<fp>"` in spawnInFlight, returns `true` (false positive) → slot for `"ext"` never released → permanent leak until daemon restart. Fix: use `parseConnectionId` (which uses `lastIndexOf('::')`) to extract the exact serverName and compare via equality. Malformed ids skip via try/catch so a stray bad key doesn't crash the rollback path.

W24 (parsing divergence): `createWorkspaceMcpBudget` used `Number.parseInt(rawBudget, 10)` while `McpClientManager.readBudgetFromEnv` uses `Number(rawBudget)` + `Number.isInteger`. Same env var produced 100× enforcement difference for `"1e2"` (pool: 1, manager: 100) and divergent acceptance for `"2.5"` / `"0x10"`. Fix: switch to `Number(...)` + explicit `Number.isInteger` guard so pool and manager honor identical env values.

W25 (critical, gpt-5.5): pool-mode `spawnEntry` awaited `client.connect()` + `client.discoverAndReturn()` directly with no timeout. A hung stdio/websocket server's connect/discover left `spawnInFlight` unresolved forever — every same-id acquirer waited indefinitely AND the budget slot was never rolled back because the catch never ran. Fix: new `runWithTimeout` wrapper + new `discoveryTimeoutFor(cfg)` helper mirroring `McpClientManager.discoveryTimeoutFor` (stdio 30s, remote 5s, per-server `discoveryTimeoutMs` override clamped to [100ms, 300s]). On timeout the existing W1 catch runs `entry.forceShutdown('manual')` + `client.disconnect()` (which races to close the transport ahead of any silent tool registration) AND the W6 budget rollback releases the slot.
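
The W21 prefix collision can be shown with plain string operations. A minimal sketch, assuming connection ids have the form `${serverName}::${fingerprint}` and that the fingerprint itself never contains `::`; `serverNameOf` and `hasInFlightSibling` are hypothetical stand-ins for `parseConnectionId` and `hasNameSibling`, and the fingerprint value is made up.

```typescript
// Assumed id layout, per the commit text: `${serverName}::${fingerprint}`.
function serverNameOf(connectionId: string): string {
  const cut = connectionId.lastIndexOf('::');
  if (cut <= 0) throw new Error(`malformed connection id: ${connectionId}`);
  return connectionId.slice(0, cut);
}

// Exact-name comparison instead of a prefix test.
function hasInFlightSibling(name: string, inFlightIds: Iterable<string>): boolean {
  for (const id of inFlightIds) {
    try {
      if (serverNameOf(id) === name) return true;
    } catch {
      // Malformed key: skip it rather than crash the rollback path.
    }
  }
  return false;
}

// Only "ext::github" is spawning; "ext" itself has no in-flight sibling.
const inFlight = ['ext::github::abc123'];
const prefixMatch = inFlight.some((id) => id.startsWith('ext::')); // pre-fix check
const exactMatch = hasInFlightSibling('ext', inFlight); // post-fix check
```

The prefix test reports a sibling for `ext` even though only `ext::github` is in flight; splitting at the last `::` and comparing names exactly avoids that false positive.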

W23 (test gap): added `swallows BudgetExhaustedError from pool.acquire and logs at debug` to mcp-client-manager.test.ts. Wires a fake pool whose `acquire` throws `BudgetExhaustedError` for one server, asserts the discovery completes (Promise.all resolves), only the non-refused server lands in `pooledConnections`, and `beginBulkPass`/`endBulkPass` fire exactly once each.

W22 (test gap, deferred): five integration paths in acpAgent.ts remain untested (`createWorkspaceMcpBudget`, `broadcastBudgetEvent`, snapshot builder workspace branch, `skipPerSessionBudgetCallback` guard, `buildBudgetCells` scope param). The cli package's vitest config requires a workspace setup not available in this branch; adding tests for these paths produces files that pass locally but might break in CI. Filed as F2 follow-up rather than blocking merge — same pattern as W7 commit-6 partial-adopt.

Tests: 186 F2 + workspace-budget + manager tests pass (1 new W23 regression). Lint clean.

* fix(core): wenshao W31-W40 review fold-ins for F2 commits 5+6 — round 5

Two more critical doRestart races + DRY refactor + 3 test gaps. W33 duplicate of already-fixed W21 (no action).

W31 (critical): `doRestart` cancelled `drainTimer` (W4 fix) but NOT `maxIdleTimer`. Same orphan-process race as W4, different timer: when the entry was draining (refs=0, both timers running), the maxIdleTimer's fire-action checked `refs.size > 0` and force-shut down the entry mid-restart → `doRestart` resumed and spawned an orphan that the pool no longer tracked. Fix: cancel BOTH timers + reset `firstIdleAt` at top of `doRestart` so a future detach starts a fresh idle window.

W32 (critical): `doRestart` failure catch skipped descendant pid sweep. When `client.connect()` partially spawned a stdio wrapper before `discoverAndReturn()` failed, the wrapper's grandchildren (npx / uvx workers, real MCP server) survived as orphans. Every failed restart leaked one+ orphan process.
Fix: call `sweepAndDisconnect('restart_failed')` in the failure catch so the NEW transport's grandchildren are SIGTERM'd before the entry transitions to `'failed'`.

W34 (improvement): generation guard alone didn't catch concurrent `forceShutdown`. If `forceShutdown` ran during any of `doRestart`'s awaits (e.g., `drainAll` mid-restart on shutdown), the entry was in `'closed'` state but `doRestart` resumed and wrote CONNECTED + emitted `reconnected` on a pool-evicted zombie entry. Fix: state guard `if (this.state === 'closed' || this.state === 'failed')` after the generation guard; drop the snapshot silently.

W35 (observability): `doRestart` logged pid-sweep + disconnect failures at `debug` level while `forceShutdown`'s identical operations used `warn` and `error`. In production (debug off) a restart that failed to sweep grandchildren was completely invisible — operators debugging memory climb saw "successful restarts" with no error trail. Fix: unified into the new `sweepAndDisconnect` helper with `warn` for sweep failures, `error` for disconnect failures.

W36 (doc): `restartByName` JSDoc said `Promise.allSettled` but the implementation uses `Promise.all` with per-entry try/catch (rejections never escape). Doc updated to match.

W37 (DRY): pid sweep + disconnect was duplicated nearly verbatim across three sites — `forceShutdown`, `doRestart` pre-call, and (after W32) the failure catch. Extracted shared `sweepAndDisconnect(reason)` private helper. Future changes to either step now happen in one place.

W38 (coverage): no test exercised `discoverAllMcpToolsIncremental` with a pool — the C7 commit 5 fix added the gate but only `discoverAllMcpTools` had pool-routing coverage. Added regression test mirroring the existing pool test but calling `discoverAllMcpToolsIncremental`.

W39 (coverage): no test exercised `disconnectServer`'s pool-mode branch (release pooled connection + delete from `pooledConnections`).
Added test wiring fake pool, populating via discovery, asserting `release()` called on disconnect. W40 (coverage): existing `restartByName` test only asserted `results[0].restarted === true` — never verified that the R3 fix's post-restart subscriber fan-out actually delivered the new snapshot to attached views. Added assertion: post-restart `removeMcpToolsByServer` call count > pre-restart count (one extra call from the fan-out's `view.applyTools` invocation). W33 was reviewer noticing the same `hasNameSibling` startsWith prefix collision already fixed by W21 in `3fb453220` — replied with the commit reference, no action needed. Tests: 189 F2 + workspace-budget + manager tests pass (3 new W38 / W39 / W40 regressions). Lint clean. * fix(core): wenshao W41-W46 review fold-ins for F2 commits 5+6 — round 6 Six review findings — 4 real critical bugs, 1 false positive (already correct), 1 coverage gap deferred. The bugs are tightly clustered around the doRestart + spawnEntry timeout / state-guard surface. W41 (false positive): reviewer claimed `entryCount` / `entrySummary` not on `ServeWorkspaceMcpServerStatus`. Verified — they ARE declared in `packages/acp-bridge/src/status.ts` (added in commit 5). Both core and cli typecheck pass cleanly. No change. W42 (critical, build break): TS2367 at `mcp-pool-entry.ts:639`. The `if (this.state === 'closed' || this.state === 'failed')` state guard added in W34 fold-in passes runtime correctness but TS's control-flow analysis narrows `this.state` along the non-throwing path of the prior `try { connect; discover } catch` (catch sets state='failed' then throws), eliminating `'closed'`/`'failed'` from the reachable union. Build hard-failed. Fix: read `this.state` into a `currentState` local with explicit `as PoolEntryState` cast to re-widen the type. The runtime guard is required (concurrent forceShutdown CAN mutate state across awaits). 
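
A minimal sketch of the W34 state guard combined with the W42 re-widening cast follows. It is a simplified, synchronous stand-in rather than the real `doRestart`: the class `SketchEntry`, its `restart` method, and the `duringRestart` callback are hypothetical and only model a concurrent `forceShutdown` landing while a restart is in progress.

```typescript
// Hypothetical, simplified model of a pool entry; not the real PoolEntry.
type PoolEntryState = 'active' | 'restarting' | 'closed' | 'failed';

class SketchEntry {
  state: PoolEntryState = 'active';

  // Models forceShutdown() running concurrently with a restart.
  forceShutdown(): void {
    this.state = 'closed';
  }

  // `duringRestart` stands in for the awaits inside doRestart, during which
  // other code may mutate `this.state`.
  restart(duringRestart: () => void): boolean {
    this.state = 'restarting';
    duringRestart();

    // After the assignment above, control-flow analysis still treats
    // `this.state` as 'restarting', so a direct comparison against 'closed'
    // is rejected as a no-overlap comparison. Reading the field through a
    // cast re-widens the type to the full union and keeps the runtime check.
    const currentState = this.state as PoolEntryState;
    if (currentState === 'closed' || currentState === 'failed') {
      return false; // superseded: do not resurrect a torn-down entry
    }

    this.state = 'active';
    return true;
  }
}
```

The cast changes nothing at runtime; it only tells the compiler that the field may have been mutated by code it cannot see.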
W43 (critical, race): `runWithTimeout` in `spawnEntry` had `entries.set(id, entry)` + `entry.markActive(...)` INSIDE the timeout-wrapped IIFE. When timeout fired, the catch block deleted the entry and forceShutdown'd it, but the IIFE kept running. If connect/discover settled later, the IIFE's late `entries.set` re-inserted the deleted entry and `markActive` set `state='active'` + `localStatus=CONNECTED` on a transport already disconnected by forceShutdown → zombie entry. Fix: move `entries.set` + `markActive` OUT of the IIFE into the post-await success path. Mirrors `McpClientManager.runWithDiscoveryTimeout`'s `timedOut` flag pattern. W44 (critical, hang): `doRestart` had no wall-clock timeout matching W25's `spawnEntry` fix. A hung MCP server during a restart blocked `restartInFlight` indefinitely; because `restart()` coalesces concurrent callers onto the same promise, every subsequent restart attempt also hung forever and the HTTP route handler never returned. Fix: wrap connect+discover in `runWithTimeout` using the same `discoveryTimeoutFor` resolution. W45 (critical, leak): generation guard + state guard in `doRestart` returned silently without sweeping the new transport spawn. `client.connect()` had already spawned npx/uvx wrapper + MCP grandchild; the OLD transport was disconnected pre-attempt via `sweepAndDisconnect('restart')`, so the new spawn would leak as net-new orphans on both supersede paths. Fix: both guards now call `await this.sweepAndDisconnect('restart_superseded')` before returning. W46 (coverage, deferred): 5 untested new paths flagged. The existing W38/W39/W40 tests (commit `ee3e60af3`) cover incremental discovery + disconnectServer + restart fan-out. 
The remaining gaps (maxIdleTimer cancellation in doRestart, state guard, sweepAndDisconnect('restart_failed'), runWithTimeout in spawnEntry, hasNameSibling parseConnectionId) need integration tests with fake timers + hung-mock connect — substantially more test infrastructure than the partial-adopt budget for this round. Filing as F2 follow-up. Refactor: `runWithTimeout` + `discoveryTimeoutFor` extracted from mcp-transport-pool.ts into new `mcp-discovery-timeout.ts` so `PoolEntry.doRestart` (W44) can share the primitives without a cross-module value import (which would create a runtime cycle between mcp-pool-entry → mcp-transport-pool). Tests: 189 F2 tests pass; typecheck clean (`npx tsc --noEmit` returns 0 errors). Lint clean. * fix(core): wenshao W51 + W52 review fold-ins for F2 commit 6 — round 7 Two suggestions, both adopted. W52 (semantic): doRestart's generation guard + state guard returned void with debug-level logging. `restart()` resolved successfully → `restartByName` reported `{restarted: true}` to the HTTP API caller even when the restart was effectively aborted. Operators saw "restart succeeded" while sessions silently lost the server. Fix: both guards now `throw new Error(...)` AFTER calling `sweepAndDisconnect('restart_superseded')` (W45 cleanup still happens). `restartByName`'s try/catch translates the throw into `{restarted: false, reason: <message>}` on the HTTP response — the caller now sees an accurate per-entry result. W51 (coverage): added `mcp-discovery-timeout.test.ts` with 14 tests covering both shared primitives. Pre-fix the new `mcp-discovery-timeout.ts` module had ZERO unit tests despite both `spawnEntry` (W25) AND `doRestart` (W44) depending on it for correctness (timeout bounds, clamping, timer cleanup). 
Tests pin: `discoveryTimeoutFor` stdio default (30s) / remote defaults (httpUrl / url / tcp → 5s) / per-server override clamping to [100ms, 300s] / NaN+Infinity fall through; `runWithTimeout` task resolve-before-timer / timer-before-task / task rejection / clearTimeout on both settlement paths. Tests: 203 F2 tests pass (14 new in mcp-discovery-timeout.test.ts). Typecheck clean. Lint clean. * fix(core): wenshao W61-W76 review fold-ins for F2 commits 5+6 — round 8 Sixteen review findings — 11 adopted as fixes (6 critical bugs + 5 suggestions/improvements), 5 declined-with-reply. W62 (critical, hang): `createUnpooledConnection` had no timeout matching W25/W44. SDK MCP / non-pooled HTTP servers could block `acquire` indefinitely. Fix: wrap connect+discover in `runWithTimeout` using `discoveryTimeoutFor(cfg)`. W63 (critical, race + leak): `drainAll` had three bugs in one block: (1) returned a live `errors` array reference that background `shutdownPromises` could keep mutating; (2) never cleared the timeout timer when `Promise.all` won the race; (3) `forced` count went retroactively negative when late settles pushed into `drained` after the snapshot. Fix: capture lengths synchronously after the race, return `[...errors]` copy, and explicitly `clearTimeout` on both race outcomes. Clamp `forced` to non-negative. W65 (critical, bypass): workspace budget enforcement was bypassed for unpooled HTTP/SSE/SDK-MCP connections — `--mcp-client-budget=2` let 3 HTTP MCP servers connect without refusal. Fix: move the `tryReserve` check BEFORE the `isPoolable` early-return so it applies to both pooled-spawn and unpooled paths. Unpooled entries' close-callback now releases the slot via the same `hasNameSibling`-guarded pattern pooled entries use. W66 (correctness): `applyPrompts` registered ALL prompts unconditionally, ignoring the per-session `excludeTools` / `includeTools` filter that `applyTools` honored. 
A session restricting tools still received every prompt + the prompt's bound `invoke` closure reaching the same shared `Client` state/credentials as more-trusted siblings. Fix: new `passesSessionPromptFilter` helper applied to each prompt by name. Reuses `excludeTools`/`includeTools` config keys. W68 (defense-in-depth): `restartByName` lacked the `draining` mutex check `acquire()` has. A concurrent restart during `drainAll()` could spawn a fresh subprocess via `client.connect()` that wasn't in drainAll's entry snapshot. Fix: `if (this.draining) return [];` early-out. W69 (correctness): `forceShutdown` set `localStatus = DISCONNECTED` AFTER `await this.sweepAndDisconnect`. During the async yield, `getSnapshot()` still saw `localStatus === CONNECTED` for an entry mid-teardown. Fix: set `localStatus` synchronously alongside `state` at the top of the method (sibling of the C4 fix). W70 (defensive): `emit()` delegated to `EventEmitter.emit` directly, so a synchronous throw from one session's listener would crash the emit call and skip remaining listeners — in `forceShutdown` this meant one buggy listener prevented subprocess cleanup, budget slot release, and entry eviction for ALL sessions sharing the entry. Fix: iterate listeners with per-listener try/catch + debug log on failure. W67 (premature API): `MCPCallInterruptedError` + `onEntryEvent` were exported with zero callers. Removed `onEntryEvent` (was public, no F4 consumer shipping in this PR); `MCPCallInterruptedError` stays per design §13.4 contract for the V21-5 in-flight call interruption follow-up. Re-introduce `onEntryEvent` alongside its first F4 consumer. W72 (correctness, gpt-5.5): pool-mode discovery only updated `McpClientManager.discoveryState` (manager-local), leaving the module-global `mcpDiscoveryState` at `NOT_STARTED`. `GET /workspace/mcp` + MCP preflight cell read the global → reported `not_started` while pool discovery was running or already complete. 
Fix: new exported `setMCPDiscoveryState(...)` from mcp-client.ts; pool path writes the global at IN_PROGRESS / COMPLETED transitions. W73 (critical, gpt-5.5): `drainAll`'s `Promise.allSettled([...spawnInFlight])` wait was unbounded — a spawn with a large `discoveryTimeoutMs` override could block daemon shutdown for the full discovery timeout BEFORE the 8-10s drain budget began. Fix: race the in-flight wait against the same `timeoutMs` deadline; if it doesn't settle, proceed with whatever entries are visible. W75 (memory leak, gpt-5.5): the `'failed'` event listener wired in `discoverAllMcpToolsViaPool` was anonymous arrow → only removed on `conn.release()`. The `'failed'` branch deleted from `pooledConnections` but never released/unsubscribed; listener stayed attached, pinning manager/connection refs in its closure. Fix: named listener that calls `conn.off('event', ...)` on 'failed' before deleting from the map. Declined with reply (filed as F4 / scope follow-ups): - W61 / W71 (releaseSession wiring on per-session close): the ACP channel has no per-session close notification, so sessions are append-only in `acpAgent.this.sessions` for the daemon's lifetime. Adding session-end hooks needs F4-level lifecycle work; pool entries currently drain en-masse via `drainAll` on daemon shutdown. Filing as F4 follow-up. - W64 (cross-session DoS via restart): per-session ownership checks would change the workspace permission model — currently all authenticated workspace clients are equal (PR 17 contract); adding ownership for restart specifically would be inconsistent with the rest of the workspace mutation surface. Defer to a workspace-policy PR. - W74 (`discoveryTimeoutFor` duplication with manager): refactor to share single source-of-truth touches `McpClientManager` internals; risk of regression in legacy mode. The duplication is acknowledged in the file's own header comment ("Mirrors `McpClientManager.discoveryTimeoutFor` exactly"). Defer. 
- W76 (entryIndex route tests): cli package's vitest setup requires workspace-linked deps not available locally; same partial-adopt pattern as W22. Tests: 203 F2/SDK tests pass (no new tests this round — fixes only). Typecheck clean. Lint clean. * fix(core): address MCP pool review feedback Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(core): gpt-5.5 W77 — cancel in-flight unpooled acquire on session release W77 (gpt-5.5 via Qwen Code /review): `createUnpooledConnection` stored the `unpooled-*` entry in `this.entries` before awaiting `client.connect()` / `client.discover()`, but only called `indexAttach(sessionId, id)` after `entry.attach()` succeeded. If `closeStoredSession()` invoked `releaseSession(sessionId)` during the connect/discover window, `sessionToEntries[sessionId]` was empty — so the in-flight unpooled transport kept spawning and `attach()` later registered tools/prompts into a session that had already been closed. The race is latent today (per-session releaseSession wiring is W61/W71, deferred to F4) but would become live the moment that hook lands. Fix: - `mcp-pool-entry.ts`: add public `isTerminated()` probe and guard `markActive()` against terminal state. Pre-fix, a concurrent `forceShutdown` flipping state→'closed' would be undone by markActive's unconditional `state='active'` assignment, resurrecting a torn-down entry. - `mcp-transport-pool.ts` `createUnpooledConnection`: * call `indexAttach(sessionId, id)` synchronously right after `entries.set(id, entry)`, BEFORE the connect/discover await. * post-await: extend the discard guard with `entry.isTerminated()` to detect a concurrent `releaseSession`→`forceShutdown` that landed during the await, and call `view.teardown()` to roll back the side-effects of the legacy u…
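
The shared timeout primitives that W44, W51 and W62 rely on can be sketched as below. This is an illustration under stated assumptions, not the contents of `mcp-discovery-timeout.ts`: the `SketchServerConfig` shape and the constant names are hypothetical, while the defaults (30s stdio, 5s remote), the [100ms, 300s] clamp, and the NaN / Infinity fall-through follow the W51 test description.

```typescript
// Hypothetical config shape; only the fields the sketch needs.
interface SketchServerConfig {
  command?: string; // stdio server
  httpUrl?: string; // remote transports
  url?: string;
  tcp?: string;
  discoveryTimeoutMs?: number; // per-server override
}

const STDIO_DEFAULT_MS = 30000;
const REMOTE_DEFAULT_MS = 5000;
const MIN_MS = 100;
const MAX_MS = 300000;

function discoveryTimeoutFor(cfg: SketchServerConfig): number {
  const override = cfg.discoveryTimeoutMs;
  // NaN and Infinity are not finite, so they fall through to the defaults.
  if (typeof override === 'number' && Number.isFinite(override)) {
    return Math.min(MAX_MS, Math.max(MIN_MS, override));
  }
  const isRemote = Boolean(cfg.httpUrl || cfg.url || cfg.tcp);
  return isRemote ? REMOTE_DEFAULT_MS : STDIO_DEFAULT_MS;
}

async function runWithTimeout<T>(
  task: () => Promise<T>,
  timeoutMs: number,
  label: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    // Whichever settles first wins; the loser is ignored, not cancelled.
    return await Promise.race([task(), timeout]);
  } finally {
    // Clear the timer on both settlement paths.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Because the losing task keeps running after a timeout, side effects such as `entries.set` belong after the awaited call rather than inside `task`, which is the ordering the W43 fix describes.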

* feat(daemon): add shared UI transcript layer
* fix(daemon): address ui review feedback
* test(daemon): cover raw event diagnostics option
* fix(daemon): address latest ui review
* fix(daemon): cover reconnect and status edge cases
* fix(daemon): guard prompt busy cleanup
* fix(daemon): handle trimmed tool updates
* fix(daemon): cap transcript text blocks
* fix(daemon): dedupe trimmed tool diagnostics
* fix(daemon): harden webui transcript edge cases
* fix(daemon): preserve webui daemon events
* fix(daemon): address latest ui review comments
* fix(daemon): close latest ui review nits
* fix(daemon): harden ui review edges
* fix(daemon-ui): address wenshao 2 Critical findings (QwenLM#4328 review)

## Critical #1: 401/403 reconnect storm + transcript wipe

`DaemonSessionProvider`'s reconnect loop kept retrying `createOrAttach` on 401/403 even with `autoReconnect: true`. Each cycle:

- hit the daemon with the same bad token → 401 again
- cleared the session handle
- the next successful attempt (if the token magically recovered) would receive a different sessionId, triggering the `store.reset()` branch at line 143 and wiping the user's transcript
- no terminal "auth failed" state surfaced to the user

Fix: split `TERMINAL_SESSION_HTTP_STATUSES` into `AUTH_FAILURE_HTTP_STATUSES` (401, 403) and the rest (404, 410). On auth failure, return from the reconnect loop unconditionally regardless of the `autoReconnect` flag. These are credential failures, not transient ones: the user must update credentials, and the daemon spam must stop. `extractHttpStatus` helper factored out of `isTerminalSessionHttpError` to share between the two predicates.

## Critical #2: rawInput / rawOutput leaking secrets to UI

`normalizer.normalizeToolUpdate` forwarded `rawInput` / `rawOutput` verbatim onto `DaemonUiToolUpdateEvent` → `DaemonToolTranscriptBlock`. The `details` projection was redacted via `stringifyRedactedJson` / `redactSensitiveFields`, but the underlying `rawInput` / `rawOutput` fields were unredacted. Any UI component that read those fields directly (ShellToolCall, WriteToolCall, JSON debug panels) leaked the raw values to the DOM. Example: `{ command: 'curl', apiKey: 'sk-prod-...' }` had `apiKey` redacted in `details` but exposed verbatim on `rawInput`.

Fix: apply `redactSensitiveFields` to both `rawInput` and `rawOutput` ONCE at the normalizer boundary, then reuse the redacted shape for the `details` projection. Downstream is uniformly safe; no double traversal.

## Tests (49/49 pass)

- SDK `daemonUi.test.ts` (36 tests, +1): new test `redacts sensitive fields in tool.update rawInput and rawOutput at normalizer boundary` verifies that a full-event string scan finds zero secret values and that structural keys are preserved with values `'[redacted]'`.
- WebUI `DaemonSessionProvider.test.tsx` (13 tests, +2): new tests `breaks out of the reconnect loop on 401 / 403 auth failures even when autoReconnect is true` and `still reconnects on 404 / 410 session-not-found errors when autoReconnect is true` lock in the asymmetry: auth failure → 1 attempt only; session-not-found → retries until success.

## Out of scope (declined / deferred, see PR review reply)

- CRIT #3 `withActionTimeout` test coverage gap → behavior correct, test-only follow-up (avoids PR bloat)
- Suggestions #4-7 → 4 nice-to-haves, deferred to keep PR focused on production-correctness fixes

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): redact tool details in web transcript
* fix(daemon-ui): close review gaps in transcript safety

---------

Co-authored-by: 秦奇 <gary.gq@alibaba-inc.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
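
The `rawInput` / `rawOutput` redaction fix described above can be sketched as follows. The key-matching rule here is an assumption made for illustration (the real `isSensitiveKey` rules are not shown in this log), and `SketchToolUpdate` is a hypothetical stand-in for the real event type; only the `'[redacted]'` placeholder and the redact-once-at-the-boundary shape come from the commit message.

```typescript
// Hypothetical suffix rules; the real isSensitiveKey may differ.
const SENSITIVE_SUFFIXES = ['password', 'token', 'secret', 'apikey'];

function isSensitiveKey(key: string): boolean {
  const normalized = key.toLowerCase().replace(/[^a-z0-9]/g, '');
  return SENSITIVE_SUFFIXES.some((suffix) => normalized.endsWith(suffix));
}

// Recursively replaces values under sensitive keys, preserving structure.
function redactSensitiveFields(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => redactSensitiveFields(item));
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = isSensitiveKey(key)
        ? '[redacted]'
        : redactSensitiveFields(source[key]);
    }
    return out;
  }
  return value;
}

// Hypothetical, trimmed-down event shape.
interface SketchToolUpdate {
  rawInput: unknown;
  rawOutput: unknown;
  details: string;
}

function normalizeToolUpdate(raw: {
  rawInput?: unknown;
  rawOutput?: unknown;
}): SketchToolUpdate {
  // Redact once at the boundary, then reuse the redacted shape for `details`.
  const rawInput = redactSensitiveFields(raw.rawInput);
  const rawOutput = redactSensitiveFields(raw.rawOutput);
  return { rawInput, rawOutput, details: JSON.stringify({ rawInput, rawOutput }) };
}
```

With this shape, a component that reads `rawInput` directly sees the same redacted values as one that reads `details`.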

…wenLM#4411) * refactor(core): F2 PR A R9 — McpClientManager options-object ctor R9 (filed as F2 follow-up from QwenLM#4336 review): 7 positional ctor args collapse to (config, toolRegistry, options?: McpClientManagerOptions). The trailing 5 (eventEmitter, sendSdkMcpMessage, healthConfig, budgetConfig, pool) become named fields on `McpClientManagerOptions`. Test factory `mkManager(overrides?)` introduced at the top of `mcp-client-manager.test.ts` so each of the prior 80 inline constructions becomes a single line naming only the field(s) the test overrides; the 4 `undefined` sentinels each test threaded through to reach the trailing `pool` arg are gone. Net: 113 LOC removed (test) + 35 LOC added (src exposes interface + mkManager factory + tool-registry call site update). Behavior unchanged — same field assignments, same downgrade-enforce-without- budget breadcrumb, same budget event wiring. Filed bucket: F2 perf / cleanup PR A (R9 + W11 + W12 + R10/R23 T7), see issue QwenLM#4175 item 7 "F2 post-merge cleanup PRs". This is the first of the 4 fixes in PR A; W11/W12/R10 follow as separate commits. Test sweep: 84/84 mcp-client-manager.test.ts pass; typecheck clean. * refactor(core): F2 PR A W11 — extract attachPooledSession + rollbackReservationOnSpawnFailure W11 (filed as F2 follow-up from QwenLM#4336 review): two private helpers on `McpTransportPool` to eliminate inline duplication in `acquire()`: - `attachPooledSession(entry, id, serverName, cfg, sessionId, toolReg, promptReg)`: builds `SessionMcpView` + `entry.attach` with the standard pool release callback. Used by both the fast-path attach (existing entry) and the post-spawn attach (after `await inFlight`). NOT used by `createUnpooledConnection` — its release callback runs `entry.forceShutdown('manual')` + `indexDetach` directly (no pool refcount accounting since unpooled entries are per-session). 
- `rollbackReservationOnSpawnFailure(reservationResult, serverName)`: R24 T17 contract — only release the budget slot if THIS acquire actually reserved a new slot (`'reserved'`); `'already_held'` skips because the sibling owns it. Used by both the unpooled catch and the pooled spawn-in-flight catch. Race-window invariants (W10 / W77 / W90 / W111 / W125 / R24 T17) stay at the call sites because they describe the SURROUNDING ordering, not the helpers themselves. Helpers are documented to defer those decisions back to callers. Behavior unchanged. Filed bucket: F2 perf cleanup PR A (R9 done / W11 this commit / W12 + R10 to follow). Test sweep: 28/28 mcp-transport-pool.test.ts pass; typecheck clean. * refactor(core): F2 PR A W12 — SessionMcpView precompute filter Sets W12 (filed as F2 follow-up from QwenLM#4336 review): `applyTools` / `applyPrompts` precompute `excludeSet` + `includeSet` once per pass instead of scanning `cfg.includeTools` / `cfg.excludeTools` arrays inside every per-tool iteration. Pre-fix the per-tool predicate (`passesSessionFilter`) walked both arrays for every snapshot entry → O(M × N) per `applyTools` call. With M tools × N filter entries, typical M=5-20 / N=2-5 case finishes in microseconds either way; the win is data-structure correctness and code clarity, not perceived perf. `passesSessionFilter` / `passesSessionPromptFilter` (the array- based predicates) stay exported and unchanged for unit tests + any caller wanting to test a single name without paying Set construction. The bulk path uses two new private helpers `compileNameFilter` + `compiledFilterAccepts` whose Sets live on the `applyTools` / `applyPrompts` stack frame. Same semantics: `excludeTools` is direct-equality match (no parens strip — pre-F2 behavior preserved); `includeTools` strips the first `(...)` suffix so `toolName(args)` matches `toolName`. Filed bucket: F2 perf cleanup PR A (R9 + W11 done / W12 this commit / R10 to follow). 
Test sweep: 13/13 session-mcp-view.test.ts pass; typecheck clean. * perf(core): F2 PR A R10 / R23 T7 — pid-descendants ps snapshot + pgrep fallback R10 / R23 T7 (filed as F2 follow-up from QwenLM#4336 review): the Linux / macOS pid-descendant enumeration moves from per-pid `pgrep -P <pid>` BFS (one subprocess fork per node visited) to a single `ps -A -o pid=,ppid=` snapshot followed by an in-memory tree walk over `Map<ppid, pid[]>`. Windows analog: single `Get-CimInstance Win32_Process | ConvertTo-Csv` snapshot of all `(ProcessId, ParentProcessId)` rows replaces per-pid `Get-CimInstance -Filter "ParentProcessId=$p"` BFS. Two motivations: 1. **Fork count**: typical `npx → tool` / `uvx → tool` wrapper trees are 2-3 levels deep with B=1-3 children per node → pre-fix BFS forked ~5-10 subprocesses per pool-shutdown call. Post-fix: exactly 1 fork regardless of tree depth. 2. **Snapshot consistency**: pre-fix BFS walked the table level by level; a child that forked between two adjacent BFS levels could be missed (we'd see the child but query its descendants AFTER the new fork). The snapshot path captures the table at one instant; new descendants forked after the snapshot are tolerated by the existing ESRCH-tolerant SIGTERM loop. Caveats: - `ps -A -o pid=,ppid=` is POSIX standard (macOS / Linux / *BSD), but BusyBox `ps` <v1.28 (2018) doesn't support `-o`. Distroless containers may not have `ps` at all. To preserve behavior on those edge platforms, the legacy per-pid `pgrep` BFS is retained as a fallback (`listDescendantPidsUnixPgrepFallback`). Same retention on Windows for the per-pid filter path. - Snapshot path uses `maxBuffer: 8MB` to cover ~250k-process pathological hosts. Default 1MB would clip at ~30k processes. - `MAX_DESCENDANTS = 256` / `MAX_DEPTH = 8` caps preserved on both snapshot + fallback paths. - Snapshot scans the entire host process table (not just the target subtree). 
On the typical 200-500 process developer machine this parses in <10ms; the win over BFS is real but not order-of-magnitude — ~2x improvement, not 100x. PR A's motivation framing is "fork hygiene + consistency", not raw perf. Empty-result detection: snapshot path tracks `parsedRows`. If the ps/CIM tool runs successfully but produces 0 parseable rows (BusyBox without `-o` echoing usage, AppLocker truncating CIM output, etc.), we throw — the outer catch falls back to the per-pid path. A genuine "root has no children" case parses many rows and just returns empty from the walk. So the "no-children-found" semantics are preserved across both paths. Test gate update: pre-fix `integration: spawn-and-enumerate` test skipped on `CI === '1'` because pgrep wasn't available on minimal CI runners. Post-fix `ps -A` is universally available on non-distroless Linux/macOS — only the Windows skip remains. 6/6 pid-descendants tests pass including the now-active integration spawn test. Design doc (`docs/design/f2-mcp-transport-pool.md` §6.4 + the F2 follow-up table at lines 82-85) updated to reflect the snapshot + fallback shape, and to mark W11 / W12 / R9 / R10 as ✅ Done in PR A with the per-fix commit refs. This commit completes F2 cleanup PR A. Filed bucket order: R9 (commit 0cb1eaa) → W11 (commit 2d546ef) → W12 (commit a4a855a) → R10 (this commit). Issue QwenLM#4175 item 7 "F2 post- merge cleanup PRs": PR A done; PR B (W93 + W133-a + W134) and PR C (W133-c SDK breaking) to follow as separate clusters. Test sweep: 287/287 F2 + cli pass; ESLint clean; typecheck clean (core + cli). Integration test on macOS local runs the new snapshot path successfully. * refactor(core): F2 PR A R2 — wenshao followup (visited set + dedup predicate) Two Suggestions from wenshao's first PR QwenLM#4411 review pass (07:15Z), both small and worth folding before merge: PR-A-R2 #1 (pid-descendants.ts:309 — walkDescendants visited set): `walkDescendants`'s BFS lacked a `visited` set. 
If the snapshot captures a PID-reuse cycle — rare but possible on busy hosts with rapid pid churn between `ps -A`'s start and parse, where Linux wraparound can show a freed pid in a different parent's children list creating an A→B / B→A cycle — pre-fix BFS would revisit nodes and fill the MAX_DESCENDANTS=256 quota with duplicate entries, starving legitimate descendants. Pre-PR-A the per-pid `pgrep` BFS had the same theoretical issue but was less exposed (each `pgrep -P pid` call returns only DIRECT children; snapshot captures the whole tree at once, making cycles instantly visible). Fix: 3-LOC `Set<number>` add. `root` seeded into `visited` so a malformed snapshot listing root as a descendant of its own child doesn't re-enqueue root either. PR-A-R2 QwenLM#2 (session-mcp-view.ts:117 — predicate dedup): After W12, the exported `passesSessionFilter` / `passesSessionPromptFilter` still called `passesNameFilter` (the pre-W12 array-based implementation), while `applyTools` / `applyPrompts` used `compiledFilterAccepts(compileNameFilter(...))`. Two parallel implementations of the same predicate — future change to one without the other would silently diverge: - the exported function's tests (passesSessionFilter unit tests) would still pass - the production filter path in applyTools/applyPrompts would behave differently Reviewer also noted `passesSessionPromptFilter` had zero callers in production code or tests after W12 — `applyPrompts` no longer references it. Kept the export rather than deleting it (matches the `passesSessionFilter` shape for symmetry + the F3 audit-path comment block earmarks both as the replay predicates), but routed both through `compiledFilterAccepts(compileNameFilter(...))` so there is a single source of truth. Set construction is per-call for these exports (negligible for unit-test / one-off probes); the bulk paths in `applyTools` / `applyPrompts` still construct ONE filter per pass via the original W12 code path. 
`passesNameFilter` (the standalone array-based helper) deleted — its only callers were the two exports, which now use the compiled path. Public-API surface unchanged: the two exported functions keep their signatures and semantics. Test sweep: 19/19 pid-descendants + session-mcp-view tests pass; typecheck + ESLint clean. Continues commit chain: f059170 (R9) → 20d2f1b (W11) → 6cf18f6 (W12) → 2a41c6f (R10) → this (R2 followups). * fix(core): F2 PR A R3 T3 — Windows CSV delimiter locale fix `ConvertTo-Csv -NoTypeInformation` honors the system locale's list separator on PowerShell 5.1. On German / French / Dutch / Italian / ... locales the separator is `;` not `,`, so the regex `^"(\d+)","(\d+)"$` in `snapshotProcessTreeWin` never matched → `parsedRows === 0` → snapshot threw → fell back to the per-pid CIM filter path with ~0.5-1s extra PowerShell startup latency per descendant on every pool shutdown. Fix: 1-LOC `-Delimiter ","` on `ConvertTo-Csv`. Forces comma regardless of locale or PowerShell version. PowerShell 7+ defaults to comma already; 5.1 (the Windows-bundled version most users have without explicit upgrade) honored locale. The explicit delimiter makes both consistent. Skipped wenshao's companion Suggestion T4 (test coverage for walkDescendants MAX_DESCENDANTS / MAX_DEPTH caps) as F2 hardening follow-up — the caps are simple 2-line guards exercisable by inspection; ~50 LOC of mock infrastructure isn't commensurate with the regression risk on currently-stable defensive code, and (per the issue QwenLM#4175 follow-up bucket) we keep dedicated test-coverage work out of perf-cleanup PRs. Continues commit chain: f059170 (R9) → 20d2f1b (W11) → 6cf18f6 (W12) → 2a41c6f (R10) → ced5d62 (R2) → this (R3 T3). Test sweep: 6/6 pid-descendants tests pass; typecheck + ESLint clean.
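
The snapshot-then-walk approach from R10, with the `visited` set from the R2 follow-up, can be sketched as below. This is a hedged illustration, not the code in `pid-descendants.ts`: `parsePsSnapshot` is a hypothetical name, the input is assumed to be the text already produced by `ps -A -o pid=,ppid=`, and the pgrep / Windows paths are left out. The `MAX_DESCENDANTS` and `MAX_DEPTH` values are the ones quoted in the commit message.

```typescript
const MAX_DESCENDANTS = 256;
const MAX_DEPTH = 8;

// Parses "pid ppid" rows into a parent -> children index.
function parsePsSnapshot(output: string): Map<number, number[]> {
  const childrenByParent = new Map<number, number[]>();
  let parsedRows = 0;
  for (const line of output.split('\n')) {
    const match = /^\s*(\d+)\s+(\d+)\s*$/.exec(line);
    if (!match) continue;
    parsedRows++;
    const pid = Number(match[1]);
    const ppid = Number(match[2]);
    const siblings = childrenByParent.get(ppid);
    if (siblings) siblings.push(pid);
    else childrenByParent.set(ppid, [pid]);
  }
  // Zero parseable rows means the tool misbehaved; let the caller fall back.
  if (parsedRows === 0) throw new Error('snapshot produced no parseable rows');
  return childrenByParent;
}

// Breadth-first walk bounded by depth and count, skipping revisited pids.
function walkDescendants(
  root: number,
  childrenByParent: Map<number, number[]>,
): number[] {
  const visited = new Set<number>([root]); // root seeded so cycles cannot re-enqueue it
  const result: number[] = [];
  let frontier = [root];
  for (let depth = 0; depth < MAX_DEPTH && frontier.length > 0; depth++) {
    const next: number[] = [];
    for (const parent of frontier) {
      for (const child of childrenByParent.get(parent) ?? []) {
        if (visited.has(child)) continue;
        if (result.length >= MAX_DESCENDANTS) return result;
        visited.add(child);
        result.push(child);
        next.push(child);
      }
    }
    frontier = next;
  }
  return result;
}
```

A row that lists the root as a child of its own descendant is simply skipped, so a PID-reuse cycle cannot fill the quota with duplicates.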

…ication

Addresses the 6 inline comments from wenshao's 2026-05-23 13:03 CHANGES_REQUESTED review.

## Real fix: WeakMap memoization actually works now (Suggestion #2)

The earlier `sortedBlocksCache` / `childrenIndexCache` WeakMaps keyed on the `state.blocks` reference, but `cloneTranscriptState` did `blocks: [...state.blocks]` eagerly. Every dispatch produced a fresh array, so the caches never hit. The JSDoc claim "memoize across renders that don't touch blocks" was misleading.

Fix: lazy copy-on-write.

- `cloneTranscriptState` now shares `blocks` + `blockIndexById` by reference (no eager copy).
- New `takeBlocksOwnership(state)` performs the array copy at the first mutation; subsequent mutations in the same dispatch are no-ops (tracked via module-level `ownedBlocks: WeakMap<State, blocks>`).
- `appendBlock`, `getWritableBlockById`, and `trimTranscriptState` all take ownership before mutating.

Result: sidechannel events (approval mode change, session metadata, workspace events, auth device-flow, etc.) preserve `state.blocks` identity across dispatches. The WeakMap caches actually hit now, verified by new test `selectTranscriptBlocksOrderedByEventId returns the same array reference for sidechannel-only events`.

## Lint Criticals (3): readonly array syntax

`ReadonlyArray<T>` → `readonly T[]` per `@typescript-eslint/array-type`:

- `KNOWN_DEVICE_FLOW_ERROR_KINDS` satisfies clause
- `EMPTY_CHILD_LIST`
- `selectSubagentChildBlocks` return type

## Suggestion #1: shallow copy from selectSubagentChildBlocks

Return `[...cached]` so accidental in-place mutation (e.g., caller calling `.sort()` on the result) cannot corrupt the WeakMap-cached children index for other consumers sharing the same `state.blocks` snapshot.

## Suggestion #6: KNOWN_DEVICE_FLOW_ERROR_KINDS sync test

Added test `only contains canonical device-flow error kinds`, a runtime assertion that guards against the array being silently emptied. The `as const satisfies readonly DaemonAuthDeviceFlowSdkErrorKind[]` at the declaration site already enforces type-level membership; this test adds a stable count check.

## Test coverage (+4 new tests, 152/152 pass)

- `selectTranscriptBlocksOrderedByEventId` preserves array identity across sidechannel-only events (memo hit verification)
- `selectSubagentChildBlocks` preserves WeakMap entry across sidechannel dispatches
- `selectSubagentChildBlocks` returns shallow copy (caller mutation doesn't corrupt cache)
- `KNOWN_DEVICE_FLOW_ERROR_KINDS` membership + count assertions

## Side effects

- Block property mutations still leak across snapshots (pre-existing: the original eager copy was also a shallow array copy with shared block refs). Not introduced by this change; documented in `getWritableBlockById` comments.
- All existing block-mutating tests pass. `takeBlocksOwnership` produces the same observable result as eager copy, just deferred to first mutation.

Validation:

- SDK tests: 152/152 pass
- SDK typecheck: clean
- WebUI typecheck: clean

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
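
The lazy copy-on-write scheme can be sketched as follows. The state and block shapes, the `reduce` function, and `selectSortedBlocks` are hypothetical simplifications; only the names `cloneTranscriptState`, `takeBlocksOwnership`, `appendBlock`, `ownedBlocks`, and `sortedBlocksCache` are taken from the commit message, and their real signatures may differ.

```typescript
// Hypothetical, trimmed-down shapes for illustration.
interface SketchBlock { id: string; text: string }
interface SketchState { blocks: SketchBlock[]; approvalMode: string }
type SketchEvent =
  | { type: 'mode'; mode: string }
  | { type: 'block'; block: SketchBlock };

// Records which state objects already hold a private copy of `blocks`.
const ownedBlocks = new WeakMap<SketchState, SketchBlock[]>();

function cloneTranscriptState(state: SketchState): SketchState {
  return { ...state }; // shares `blocks` by reference, no eager copy
}

function takeBlocksOwnership(state: SketchState): SketchBlock[] {
  if (ownedBlocks.get(state) !== state.blocks) {
    state.blocks = [...state.blocks]; // copy happens at the first mutation only
    ownedBlocks.set(state, state.blocks);
  }
  return state.blocks;
}

function appendBlock(state: SketchState, block: SketchBlock): void {
  takeBlocksOwnership(state).push(block);
}

function reduce(prev: SketchState, event: SketchEvent): SketchState {
  const next = cloneTranscriptState(prev);
  if (event.type === 'mode') next.approvalMode = event.mode; // sidechannel: blocks untouched
  else appendBlock(next, event.block);
  return next;
}

// Memoized selector keyed on the identity of the blocks array.
const sortedBlocksCache = new WeakMap<SketchBlock[], SketchBlock[]>();

function selectSortedBlocks(state: SketchState): SketchBlock[] {
  let sorted = sortedBlocksCache.get(state.blocks);
  if (!sorted) {
    sorted = [...state.blocks].sort((a, b) => a.id.localeCompare(b.id));
    sortedBlocksCache.set(state.blocks, sorted);
  }
  return sorted;
}
```

As in the commit's own caveat, the block objects themselves stay shared between snapshots; only the array is copied.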

…ggestion + 1 false-positive)

Walks all 22 inline comments from wenshao's 13:00-14:56 burst plus doudouOUC's APPROVED-with-suggestion. 11 real fixes applied; 1 reverted after gate-check; remaining items either already addressed in prior commits (stale) or are test-only coverage gaps now filled.

## Security / Correctness Criticals (real)

### sanitizeUrl strips Basic Auth (R2 #1)

`https://user:pw@host/...` previously passed through with userinfo intact, leaking secrets into rendered markdown / HTML / plaintext. Fix: set `u.username = ''; u.password = '';` before serializing.

### thumbnailUrl protocol validation always-on (R2 #2)

`javascript:alert(1)` in `![image](url)` survived when sanitizeUrls was false (the default). Added `ensureSafeImageUrl(url)`, a protocol whitelist (http/https/data only) that runs unconditionally for image URL renderings. `sanitizeUrls: true` still wins for query-param + Basic Auth stripping.

### permission.resolved orphan after sentinel pruned (R1 #2)

The prior trim-contract fix guarded `existingId === TRIMMED_*`. After `pruneTrimmedPermissionIndexes` deleted a sentinel (long sessions), `existingId` became `undefined`, bypassed the guard, and created an orphan. Fix: reject `undefined || TRIMMED_*` together.

## Behavior Suggestions (real)

### Selective cancellation propagation (R2 #6)

`assistant.done.reason` values of `stream_ended` / `reconnected` are transport-layer signals: the daemon-side tool is still running and SSE replay will deliver the real terminal status. Marking in-flight tools cancelled caused a visible spinner-to-red flash on reconnect. Scoped propagation to `cancelled` || `error` only.

### awaitingResync diagnostics (R2 #3)

The state-resync latch silently dropped events with no signal. Added a `console.warn` describing the dropped event type + last resync trigger so a stuck UI is debuggable. Latch behavior intentionally preserved; recovery is `store.reset()` on session reconnect.
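
The two URL fixes can be sketched with the standard `URL` class as below. This is a minimal illustration under assumptions: returning `undefined` for a rejected image URL and rejecting non-absolute URLs are choices made for the sketch, and the query-param stripping that `sanitizeUrls: true` also performs is not modeled.

```typescript
// Protocols allowed for image URLs, per the whitelist named in the commit.
const SAFE_IMAGE_PROTOCOLS = new Set(['http:', 'https:', 'data:']);

// Strips the userinfo (Basic Auth) part before the URL is rendered.
function sanitizeUrl(raw: string): string {
  try {
    const u = new URL(raw);
    u.username = '';
    u.password = '';
    return u.toString();
  } catch {
    return raw; // not an absolute URL; left untouched in this sketch
  }
}

// Runs unconditionally for image renderings, independent of sanitizeUrls.
// Returning undefined for rejected input is an assumption of this sketch.
function ensureSafeImageUrl(raw: string): string | undefined {
  try {
    const u = new URL(raw);
    return SAFE_IMAGE_PROTOCOLS.has(u.protocol) ? raw : undefined;
  } catch {
    return undefined; // unparseable input is treated as unsafe here
  }
}
```

Keeping the protocol check separate from the opt-in sanitizer is what lets it stay always-on while the stricter stripping remains configurable.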
### selectSubagentChildBlocks: freeze instead of copy (R1 QwenLM#8) `[...cached]` per-call defeated React.memo / useMemo identity stability (every call produced a fresh array reference). Now freeze the cached arrays at build time in `getOrBuildChildrenIndex` and return the frozen reference directly — referential stability + mutation defense (strict-mode throws on `.length = 0` etc.). ### detectSubagentDelegation regex too broad (R3 QwenLM#2) `(?:^|_)task$` falsely matched `edit_task` / `list_task` / `create_task` etc. — common tool names unrelated to delegation. Anthropic's Task tool is literally named `Task` (no prefix), so restricted bare-`task` to whole-name only: `^task$`. `delegate` / `subagent` / `spawn_task` keep the `^|_` prefix. ### memoryChanged bytesWritten finite check (R3 QwenLM#3) `typeof === 'number'` accepted NaN / Infinity. Use the existing `numberField` helper which calls `Number.isFinite(v)`. ### Multi-line blockquote prefix (R3 #1) `> *thought:* ${text}` only prefixed the first line; subsequent lines escaped the blockquote. Added `blockquote(raw)` helper that prefixes every line; applied to thought / debug / error renderings. ## Quality (real) ### plainText / HTML maxFieldLength parity (R1 QwenLM#5/6/7, doudouOUC approve note) The tool block in markdown caps via `text()`; plaintext + HTML caps were missing on header fields, preview content, and permission block labels. Threaded `cap()` consistently across all three projections. ### isSensitiveKey dedup (R1 QwenLM#10) Seven exact-match entries (`password` / `apikey` / `idtoken` / `sessiontoken` / `clientsecret` / `xapikey` / `xauthtoken`) were already subsumed by existing `endsWith` rules. Removed. ### Re-export DaemonUiStateResyncRequiredEvent (R2 QwenLM#7) Other session-meta event types are exported from the daemon barrel; this one was missed. Added to both `daemon/ui/index.ts` and `daemon/index.ts`. 
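Two of the behavior fixes above, the narrowed delegation regex and the multi-line blockquote helper, are small enough to sketch. The patterns below are an illustration under assumptions: the case-insensitive flag is assumed (so that a tool literally named `Task` matches), and `looksLikeDelegation` is a hypothetical name; only `blockquote(raw)` is named in the commit, and its real implementation may differ.

```typescript
// Hypothetical sketch; the SDK's real patterns and helper signatures may differ.
// Bare `task` must be the whole tool name (case-insensitivity is assumed here
// so that a tool literally named `Task` still matches).
const BARE_TASK = /^task$/i;
// Other delegation names may still carry a `prefix_` in front.
const PREFIXED_DELEGATION = /(?:^|_)(?:delegate|subagent|spawn_task)$/i;

function looksLikeDelegation(toolName: string): boolean {
  return BARE_TASK.test(toolName) || PREFIXED_DELEGATION.test(toolName);
}

/** Prefix every line, so multi-line text stays inside the blockquote. */
function blockquote(raw: string): string {
  return raw
    .split('\n')
    .map((line) => `> ${line}`)
    .join('\n');
}
```

With these patterns, `edit_task` no longer counts as delegation while `Task` and `spawn_task` still do.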
## Reverted after gate-check (false-positive) ### classifySelectedPermissionOption CANCELLED branch (R2 QwenLM#4) Reviewer suggested adding `CANCELLED_PERMISSION_TERMS` check before the `completed` default, so `selected:cancel` would map to cancelled. This CONFLICTS WITH: - the design comment at the caller: "A selected option resolves the prompt even when the option id is a domain value like a city name or an option id containing deny/cancel" - the existing test `'cancelled-substring-permission'` with payload `'selected:abort'` expecting status `'completed'` The daemon expresses "user cancelled the prompt" via `cancelled` as the PRIMARY token (handled at the caller layer), not `selected:cancel` — the latter means "user picked an option labeled cancel", which is a successful selection. Reverted; added explanatory comment so the next review round doesn't re-flag it. ## Stale (already fixed) ### R1 #1 (daemonBlockToPlainText opts forwarding) Already fixed in d35cbb7 (2026-05-23 monitor pass for review 4350741340). No further action. ## Test coverage added - HTML web_fetch URL sanitization (sanitizeUrls + Basic Auth) - Image URL protocol validation when sanitizeUrls:false - HTML shell / permission / thought / debug / status block kinds - Trimmed-tool cancellation propagation (no throw + transport-layer no-cancel) - Late permission.resolved after sentinel prune (no orphan) - Frozen children-index identity stability + mutation guard - previewMarkdown preserves rawOutput as object (in webui adapter test file) ## Validation | | | |---|---| | SDK tests | **161/161** (was 153 → +8 new) | | WebUI tests | **9/9** (was 8 → +1 new) | | SDK typecheck | clean | | WebUI typecheck | clean | Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

…wenLM#4353) * feat(sdk/daemon-ui): expand event coverage to 28+ daemon event types (PR-A) Closes the "12+ daemon events fall through to debug" gap surfaced in the PR the daemon currently emits (Stage 1 + Wave 3-4), so renderers stop having to peek at `rawEvent.data` for known event categories. Session-meta: - session.metadata.changed (from session_metadata_updated) - session.approval_mode.changed (from approval_mode_changed) - session.available_commands (from available_commands_update; upgraded from a status-text fallback to a typed event carrying the command list) Workspace state (Wave 3-4): - workspace.memory.changed - workspace.agent.changed - workspace.tool.toggled - workspace.initialized - workspace.mcp.budget_warning - workspace.mcp.child_refused - workspace.mcp.server_restarted - workspace.mcp.server_restart_refused Auth device-flow (Wave 4 OAuth, RFC 8628): - auth.device_flow.started - auth.device_flow.throttled - auth.device_flow.authorized - auth.device_flow.failed (carries DaemonAuthDeviceFlowSdkErrorKind) - auth.device_flow.cancelled - `DaemonUiErrorEvent.errorKind?: DaemonErrorKind` — closed-enum error category propagated from daemon's typed-error taxonomy. Renderers can branch on errorKind for "retry auth" vs "check file path" affordances instead of regex-matching `text`. - `DaemonUiToolUpdateEvent.provenance?: DaemonUiToolProvenance` + `.serverId?` — closed enum ('builtin' | 'mcp' | 'subagent' | 'unknown'). Falls back to the `mcp__<server>__<tool>` naming heuristic when the daemon doesn't stamp provenance explicitly. Unblocks UI namespace dispatch without string-matching toolName. Session-meta / workspace / auth events do NOT push transcript blocks. They are intentional sidechannel observations: `lastEventId` advances (monotonic invariant preserved), but the chat-stream transcript stays focused on user/assistant/tool/shell/permission content. Renderers consume them via selectors (introduced in follow-up PRs). 
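The `mcp__<server>__<tool>` fallback mentioned above can be pictured with a short sketch. `guessProvenance` is a hypothetical helper, not the SDK function, and returning `'unknown'` for every non-MCP name is an assumption made for brevity; the four provenance values are the ones listed in the commit.

```typescript
// Hypothetical sketch of the naming heuristic; not the SDK's implementation.
type ToolProvenance = 'builtin' | 'mcp' | 'subagent' | 'unknown';

interface ProvenanceGuess {
  provenance: ToolProvenance;
  serverId?: string;
}

function guessProvenance(toolName: string): ProvenanceGuess {
  // `mcp__<server>__<tool>`: the server is the segment between the first two `__`.
  const match = /^mcp__(.+?)__(.+)$/.exec(toolName);
  if (match) {
    return { provenance: 'mcp', serverId: match[1] };
  }
  // Assumption: anything else is left as 'unknown' rather than guessed.
  return { provenance: 'unknown' };
}
```

A renderer could then branch on `provenance` instead of string-matching `toolName` at every call site.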
All new event types produce short structured lines in `daemonUiEventToTerminalText` for tail-style debug consumers. Web/IDE renderers should consume the typed events directly via subscription.

40/40 tests pass. New tests verify:
- All 16 new event types normalize correctly
- Malformed payloads fall back to debug without leaking raw data (`secret` field never appears in fallback text)
- MCP tool provenance heuristic (`mcp__github__create_issue` → provenance='mcp', serverId='github')
- errorKind propagation on session_died / stream_error
- Reducer is a no-op on new event types; lastEventId still advances

This is PR-A of the unified-renderer-layer follow-up series:
- PR-A (this commit): event coverage + closed-enum schema
- PR-B: server-side timestamps + ordering refactor
- PR-C: multimodal content + tool preview taxonomy
- PR-D: render contract (toMarkdown / toHtml / toPlainText) + adapter conformance test framework
- PR-E: reducer state machine (subagent / progress / current tool / cancellation propagation)

See https://github.com/QwenLM/qwen-code/pull/4328#issuecomment-4494179724 for the full proposal.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): server timestamps + event-id-based ordering (PR-B)

Closes the "non-standard time definition" gap surfaced in the PR #4328 review:
- Client-side `Date.now()` drifts across clients
- No daemon-authoritative timestamp propagated to UI
- Out-of-order replay events get fresher `state.now` than originals, breaking `createdAt` ordering

- `DaemonUiEventBase.serverTimestamp?: number` - daemon-authoritative wall-clock timestamp extracted from the envelope.
- `DaemonTranscriptBlockBase.serverTimestamp?: number` + `clientReceivedAt: number`.
- `createdAt` preserved as `@deprecated` alias for `clientReceivedAt` (backward compat for code written before this PR).

`extractServerTimestamp` looks at three candidate envelope locations:
1. `event.serverTimestamp` (preferred when daemon adds it)
2. `event._meta.serverTimestamp` (Anthropic-style metadata convention)
3. `event.data._meta.serverTimestamp` (sessionUpdate nested location)

The SDK is ready to consume serverTimestamp WHEN the daemon emits it, without requiring a coordinated SDK release. The field is undefined while the daemon doesn't emit it (current state), in which case ordering degrades gracefully to the client clock.

`selectTranscriptBlocksOrderedByEventId(state)` returns blocks sorted by:
1. `eventId` (daemon-monotonic SSE cursor) - primary key
2. `serverTimestamp` (daemon wall clock) - fallback for synthetic frames
3. `clientReceivedAt` (local clock) - last resort

Use this when displaying long sessions where event id 5 may arrive AFTER event id 7 (typical in SSE replay-after-reconnect).

`formatBlockTimestamp(block, opts)` formats the most authoritative timestamp on a block using `Intl.DateTimeFormat`. Prefers `serverTimestamp` over `clientReceivedAt` for cross-client consistency. Accepts locale / timeZone / dateStyle / timeStyle.

Daemon needs to stamp `_meta.serverTimestamp` on every SSE envelope. This SDK PR is ready to consume it the moment the daemon ships the field; no coordination needed.

- serverTimestamp extraction from all three envelope locations
- Defaults undefined when envelope has none
- `selectTranscriptBlocksOrderedByEventId` sorts mixed-arrival events by eventId (replay scenario)
- `formatBlockTimestamp` prefers serverTimestamp; returns localized string

PR-B of the unified follow-up to PR #4328 (PR-A + PR-B + PR-C + PR-D + PR-E in one branch).
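The three-key ordering described for `selectTranscriptBlocksOrderedByEventId` can be sketched as a lexicographic comparator. This is an illustration under assumptions, not the SDK selector: `OrderableBlock` and `orderBlocks` are hypothetical names, `eventId` is assumed to be numeric, and treating a missing key as "sorts last" is a choice made here to keep the comparator transitive.

```typescript
// Hypothetical sketch; the SDK selector may treat missing keys differently.
interface OrderableBlock {
  id: string;
  eventId?: number; // assumed numeric SSE cursor
  serverTimestamp?: number;
  clientReceivedAt: number;
}

const compareNumbers = (a: number, b: number): number => (a < b ? -1 : a > b ? 1 : 0);

function orderBlocks<T extends OrderableBlock>(blocks: readonly T[]): T[] {
  // Assumption: a missing key sorts after any present key at the same level.
  const missing = Number.POSITIVE_INFINITY;
  // Copy first so the caller's array (and any memoized identity) is untouched.
  return [...blocks].sort(
    (a, b) =>
      compareNumbers(a.eventId ?? missing, b.eventId ?? missing) ||
      compareNumbers(a.serverTimestamp ?? missing, b.serverTimestamp ?? missing) ||
      compareNumbers(a.clientReceivedAt, b.clientReceivedAt),
  );
}
```

In a replay where event 7 was received before event 5, the result still lists event 5 first because `eventId` outranks the local receive time.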
Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): reducer state machine - currentTool / approvalMode / cancellation propagation (PR-E)

Closes the "reducer state machine design omissions" gap surfaced in the PR #4328 review:
- No `currentTool`: UI scans `blocks[]` to find the running tool
- No mirrored approval mode: UI walks events to badge "plan"/"yolo"
- Cancellation does not propagate: in-flight tool blocks stay stuck at 'in_progress' forever when the parent prompt is cancelled

## State additions (sidechannel, no transcript blocks)

`DaemonTranscriptSidechannelState`:
- `currentToolCallId?: string` - toolCallId of the in-flight tool
- `approvalMode?: string` - mirrored from session.approval_mode.changed
- `toolProgress: Record<string, { ratio?, step? }>` - per-tool progress shape (daemon-side emission of `tool.progress` events pending)

## Reducer behavior

### `tool.update` events

`IN_FLIGHT_TOOL_STATUSES` = { pending, confirming, running, in_progress }
`TERMINAL_TOOL_STATUSES` = { completed, success, failed, error, canceled, cancelled }

- Tool enters in-flight: set `currentToolCallId = event.toolCallId`
- Tool enters terminal: clear `currentToolCallId` if it matches
- Unknown status (forward-compat): leave pointer untouched

This avoids the failure mode where a future daemon-emitted status like `'paused'` would be silently misclassified as either in-flight or terminal.

### `session.approval_mode.changed`

Mirror `event.next` onto `state.approvalMode`. Renderers can render a mode badge ("plan" / "default" / "auto-edit" / "yolo") with a single selector call, no event-stream walking.

### `assistant.done` with `reason === 'cancelled'`

`propagateCancellationToInFlightTools` walks every tool block whose status is still in-flight and force-sets it to 'cancelled'.
The daemon does not guarantee terminal `tool_call_update` for every in-flight tool when the parent prompt is cancelled, so this propagation prevents UI spinners from spinning forever. `currentToolCallId` is also cleared in the same call. Non-cancellation `assistant.done` (e.g., `reason: 'end_turn'`) does NOT propagate — in-flight tools remain in-flight until the daemon emits their terminal update naturally. ## Selectors - `selectCurrentTool(state)` — returns the running tool block, or undefined - `selectApprovalMode(state)` — returns the mirrored approval mode - `selectToolProgress(state, toolCallId)` — per-tool progress query All exported from `@qwen-code/sdk/daemon`. ## Scope deliberately deferred Subagent nesting (`parentBlockId` / `delegationId` / `DaemonSubagentTranscriptBlock`) is NOT in this PR. The shape needs design discussion (how to project nested events; whether to bake delegation tracking into transcript or sidechannel). PR-D / PR-F follow-up. ## Test coverage (51/51 pass) - currentToolCallId set on enter, cleared on terminal - approvalMode mirrors changes - Cancellation marks in-flight tools 'cancelled', leaves completed alone - Unknown status does NOT clear currentToolCallId (forward-compat) - Non-cancellation `assistant.done` does NOT propagate ## Roadmap PR-E of the unified follow-up to PR #4328 (PR-A + PR-B + PR-E in this branch; PR-C / PR-D pending). Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * feat(sdk/daemon-ui): tool preview taxonomy + multimodal content extraction (PR-C) Closes two related gaps surfaced in the PR #4328 review: - `DaemonToolPreview` had only 4 kinds — UI fell back to `key_value` / `generic` for tools that deserved structured display - `getTextContent` silently dropped non-text content (image / audio / resource), so multimodal conversations vanished from the UI `DaemonToolPreview` extends from 4 to 8 variants: - `file_diff` — `{ path, oldText?, newText?, patch? 
}` — file edit tools (Anthropic-style `oldText/newText`, aider-style `patch`, write-style `newText` alone) - `file_read` — `{ path, range?: [start, end] }` — file read tools, with range extracted from `lineRange` tuple OR `offset/limit` pair - `web_fetch` — `{ url, method? }` — HTTP fetch tools (requires URL with scheme to avoid false positives on relative paths) - `mcp_invocation` — `{ serverId, toolName, argsSummary? }` — MCP server tool calls, identified via `mcp__<server>__<tool>` naming convention (same heuristic as PR-A `DaemonUiToolUpdateEvent.provenance`) Detector order matters — MCP wins first (most specific), then file_diff, file_read, web_fetch, then the existing command / key_value fallbacks. New helper `extractContentPart(value): DaemonUiContentPart | undefined` returns a discriminated union: ```ts type DaemonUiContentPart = | { kind: 'text'; text: string } | { kind: 'image'; mediaType: string; source: { url?, data? } } | { kind: 'audio'; mediaType: string; source: { url?, data? } } | { kind: 'resource'; uri: string; mediaType?, description? }; ``` The existing `getTextContent` is preserved for backward compat. Renderers that need to surface non-text content (web UI thumbnails, IDE attachment chips) now have a typed shape to consume. - Wiring `extractContentPart` into the normalizer / reducer so text blocks accumulate `parts: DaemonUiContentPart[]` alongside `text` (additive shape change requires render contract coordination — PR-D). - 5 additional tool preview kinds (image_generation / code_block / tabular / subagent_delegation / search) — useful but not urgent; current 8 kinds cover the typical agent flows. 
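To show how a renderer might consume the `DaemonUiContentPart` union above, here is a small sketch with an exhaustive `switch`. The union is restated with explicit field types, which are assumptions (the commit writes `{ url?, data? }` without types), and `describePart` is a hypothetical consumer, not an SDK export.

```typescript
// The union mirrors the commit message; field types marked "assumed" are guesses.
type MediaSource = { url?: string; data?: string }; // assumed: both strings
type DaemonUiContentPart =
  | { kind: 'text'; text: string }
  | { kind: 'image'; mediaType: string; source: MediaSource }
  | { kind: 'audio'; mediaType: string; source: MediaSource }
  | { kind: 'resource'; uri: string; mediaType?: string; description?: string };

/** Hypothetical consumer: a one-line label for an attachment chip. */
function describePart(part: DaemonUiContentPart): string {
  switch (part.kind) {
    case 'text':
      return part.text;
    case 'image':
    case 'audio':
      return `[${part.kind}: ${part.mediaType}]`;
    case 'resource':
      return `[resource: ${part.description ?? part.uri}]`;
    default: {
      // Exhaustiveness check: adding a new kind makes this line a type error.
      const unreachable: never = part;
      return unreachable;
    }
  }
}
```

The `never` assignment is what makes a future fifth variant fail at compile time instead of silently rendering nothing.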
- file_diff detection from Anthropic / aider / write shapes
- file_read with lineRange tuple AND offset+limit pair
- web_fetch with method, REJECTS relative paths (no scheme)
- mcp_invocation with serverId + toolName extraction
- Detector priority: MCP wins over file_diff on conflicting shapes
- extractContentPart for text / image (url) / audio (data) / resource
- Unknown content type returns undefined (skip rather than synthesize)
- Image without source returns undefined (defensive)

PR-C of the unified follow-up to PR #4328 (PR-A + PR-B + PR-E + PR-C in this branch; PR-D render contract pending).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): render contract - markdown / HTML / plain text helpers (PR-D)

Closes the "render contract only covers terminal" gap surfaced in the PR #4328 review:

> PR ships `daemonUiEventToTerminalText` for terminal. Web/IDE/channel
> adapters each roll their own projection. No shared contract → adapter
> divergence is inevitable.

## New helpers

```ts
daemonBlockToMarkdown(block, opts?): string   // GFM-compatible
daemonBlockToHtml(block, opts?): string       // conservatively escaped HTML
daemonBlockToPlainText(block, opts?): string  // for copy-paste / logs
daemonToolPreviewToMarkdown(preview, opts?): string
```

The three block helpers respect the same `kind` discrimination so adapters can switch between them without touching call sites.
## Per-kind projection For each `DaemonTranscriptBlock['kind']`: - `user` / `assistant` / `thought` — plain text with role labels - `tool` — header with toolName + structured preview + status badge - `shell` — fenced code block, stream-discriminated (stdout vs stderr) - `permission` — title + options list + resolved/pending indicator - `status` / `debug` / `error` — semantic class / role (error → role=alert) For each `DaemonToolPreview['kind']`: - `ask_user_question` — question + options as bullet list - `command` — fenced bash with optional cwd comment - `file_diff` — unified diff in fenced code block (oldText/newText OR patch) - `file_read` — `path (lines N-M)` line - `web_fetch` — `METHOD url` line - `mcp_invocation` — `serverId::toolName` with args summary - `key_value` — bullet list - `generic` — emphasized summary ## Security - Default HTML sanitizer escapes `<`, `>`, `&`, `"`, `'` and FIRST strips ANSI/control sequences via `sanitizeTerminalText` (defense against agent-emitted escape codes in HTML output). - Custom sanitizer hook for consumers wanting markdown→HTML pipelines (markdown-it + DOMPurify, etc.). - `sanitizeUrls` option strips token-like query params (`token=`, `key=`, `x-amz-`, etc.) from URLs in `web_fetch` previews. - `maxFieldLength` truncation defaults 8192, prevents pathological rendering on huge content. ## Adapter conformance (out of scope for this commit) The conformance test framework (fixture corpus + `runAdapterConformanceSuite`) mentioned in PR-D scope is deferred to a follow-up. The render helpers here are the precondition — once stable, the conformance framework can use them as the reference projection. 
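The ordering in the Security notes above (strip terminal escapes first, cap the length, then escape) can be sketched as a single function. This is an illustration only: `escapeForHtml` is a hypothetical name, the regexes are a simplified stand-in for the SDK's `sanitizeTerminalText`, and the `…` truncation marker is an assumption.

```typescript
// Hypothetical sketch; the SDK's `sanitizeTerminalText` is not reproduced here.
// Matches CSI sequences such as "\x1b[31m" (colors, cursor movement).
const CSI_SEQUENCE = /\x1b\[[0-?]*[ -/]*[@-~]/g;
// Remaining C0 control characters, keeping tab (\x09) and newline (\x0a).
const CONTROL_CHARS = /[\x00-\x08\x0b-\x1f\x7f]/g;

const HTML_ENTITIES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeForHtml(raw: string, maxFieldLength = 8192): string {
  // 1. Strip terminal escapes first, so no control bytes survive into markup.
  const stripped = raw.replace(CSI_SEQUENCE, '').replace(CONTROL_CHARS, '');
  // 2. Cap the length before escaping (the truncation marker is an assumption).
  const capped =
    stripped.length > maxFieldLength ? `${stripped.slice(0, maxFieldLength)}…` : stripped;
  // 3. Escape the five HTML-significant characters.
  return capped.replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch] ?? ch);
}
```

Stripping before escaping matters because an escape sequence left in place would otherwise be entity-encoded piecemeal and show up as visible junk in the page.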
## Test coverage (77/77 pass)

- All 9 block kinds render in markdown (verified for user/assistant/tool/shell/permission/error specifically)
- file_diff renders as unified diff with old/new lines
- mcp_invocation renders as `server::tool` format
- HTML escapes XSS (`<script>` → `&lt;script&gt;`)
- HTML strips terminal escape sequences before escaping
- Error blocks emit `role="alert"` for screen readers
- plain text drops markdown delimiters
- maxFieldLength truncates with ellipsis
- sanitizeUrls strips token query params
- Custom sanitizer hook works

## Roadmap

PR-D of the unified follow-up to PR #4328. It completes the 5-PR series (A: event coverage, B: time schema, E: state machine, C: tool preview + content extraction, D: render contract).

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* feat(sdk/daemon-ui): 5 additional tool preview kinds - taxonomy complete (PR-F)

Closes the "5 additional preview kinds" item in PR #4353's TODO §A (SDK-only work).

## New preview kinds (8 → 13)

- `code_block` - `{ language?, code, origin? }` - REPL / formatter / generator output, fenced as `\`\`\`<language>` in markdown
- `search` - `{ query, resultCount?, top? }` - grep / ripgrep / find / glob results with up to 5 top hits
- `tabular` - `{ columns, rows, totalRows? }` - structured table output (50-row cap with `totalRows` truncation indicator); supports both the explicit `columns: string[] + rows: unknown[][]` shape and the legacy `data: Array<Record<>>` shape (auto-infers columns from first row)
- `image_generation` - `{ prompt, thumbnailUrl?, model? }` - dall-e / diffusion / imagen / flux / sora style tools
- `subagent_delegation` - `{ agentName, task, parentDelegationId? }` - Anthropic-style Task tool and similar sub-agent dispatchers

## Detector priority

Order matters: most specific wins.
New detectors slot in between `mcp_invocation` and `file_diff`: ``` mcp_invocation > subagent_delegation > search > image_generation > file_diff > file_read > web_fetch > code_block > tabular > command > key_value > generic ``` Rationale: subagent / search / image generation are most discriminable (distinct toolName patterns); file ops next; code_block / tabular last because their shapes (`code:`, `columns:`) can appear in other tools. ## Render projections Both `daemonToolPreviewToMarkdown` and the plain-text rendering paths extended with cases for all 5 new kinds: - code_block: fenced markdown code block with language tag - search: bold header + GFM bullet list of top results - tabular: GFM pipe table with header / separator / body / truncation hint - image_generation: bold header + blockquoted prompt + embedded markdown image (URL sanitization respected via `sanitizeUrls` opt) - subagent_delegation: bold delegate-arrow header + blockquoted task + optional parent delegation reference ## Test coverage (91/91 pass, +14 new) - Each detector with positive case - Detector priority verified: subagent_delegation wins over file_diff when toolName='Task' has both subagent + file-edit fields - Tabular row cap (50) + totalRows stamping for truncated data - Legacy data: Array<Record<>> auto-column inference - Each render projection with structural assertions (markdown table format, image embed, bullet lists) ## Roadmap PR-F of the unified follow-up to PR #4328. Brings the preview taxonomy to 13 kinds covering: file ops (3), web (1), code/data (2), media (1), agent control (2 — ask_user_question + subagent_delegation), MCP (1), search (1), generic fallbacks (2). Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * feat(sdk/daemon-ui): adapter conformance framework + fixture corpus (PR-G) Closes the "Adapter conformance test framework" item in PR #4353's TODO §A. 
Lets any daemon-ui adapter (TUI / web / IDE / channel / mobile) validate that it projects a fixed corpus of daemon SSE event streams to the same semantic shape — catches projection drift before it reaches users. ## API surface ```ts interface DaemonUiAdapterUnderTest { reduce(events: readonly DaemonUiEvent[]): unknown; renderToText(state: unknown): string; } interface DaemonUiConformanceFixture { name: string; description: string; envelopes: DaemonEvent[]; // raw daemon envelopes expectedContains: string[]; // phrases the rendered text MUST contain expectedAbsent?: string[]; // phrases that MUST NOT appear normalizeOptions?: { ... }; // forward-compat normalize opts } runAdapterConformanceSuite(adapter, opts?): ConformanceSuiteResult DAEMON_UI_CONFORMANCE_FIXTURES: ReadonlyArray<DaemonUiConformanceFixture> ``` ## Design **Format-agnostic assertion**: adapters can render to ANSI / HTML / markdown / JSX — the framework only inspects plain text via `renderToText`. Catches semantic divergence (missing user message, wrong tool status, leaked secret) without forcing identical formatting. **Embedded fixture corpus** (no fs reads — works in browser bundle): - `simple-chat` — user/assistant streaming flow - `tool-call-lifecycle` — running → completed transition - `file-edit-diff` — file_diff preview surfacing - `mcp-invocation` — MCP serverId/toolName extraction via heuristic - `permission-lifecycle` — request + resolved with outcome - `mcp-budget-warning` — Wave 3 event (adapter must observe but rendering is its choice) - `cancellation-propagates` — tool block status flows - `malformed-payload-redaction` — uses `includeRawEvent: true` to verify even a debug-mode adapter doesn't leak `token: secret-do-not-leak` - `auth-device-flow-success` — Wave 4 OAuth events - `available-commands-typed-event` — PR-A upgrade from status text Per-fixture `expectedContains` and `expectedAbsent` describe the content contract independently of format. 
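The format-agnostic contract above boils down to substring checks on the rendered text. A minimal sketch of one such check follows; `checkRenderedText`, `FixtureExpectations`, and `FixtureVerdict` are hypothetical names chosen for illustration, and the real suite's failure shape may carry more detail (the commit mentions an excerpt).

```typescript
// Hypothetical sketch of one fixture check; field names follow the commit
// message, the helper itself is illustrative.
interface FixtureExpectations {
  name: string;
  expectedContains: string[];
  expectedAbsent?: string[];
}

interface FixtureVerdict {
  name: string;
  missing: string[]; // required phrases that were not rendered
  leaked: string[]; // forbidden phrases that were rendered
  ok: boolean;
}

function checkRenderedText(rendered: string, fixture: FixtureExpectations): FixtureVerdict {
  const missing = fixture.expectedContains.filter((p) => !rendered.includes(p));
  const leaked = (fixture.expectedAbsent ?? []).filter((p) => rendered.includes(p));
  return {
    name: fixture.name,
    missing,
    leaked,
    ok: missing.length === 0 && leaked.length === 0,
  };
}
```

Because only plain text is inspected, the same check applies whether the adapter renders ANSI, HTML, or markdown.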
## Suite result ```ts { passed: number, failed: ConformanceFailure[], // each carries missing + leaked + excerpt total: number, } ``` **Does not throw** — caller asserts on `result.failed` so adapter test suites can produce per-fixture diagnostics rather than a single opaque exception. ## Filter options `only` / `skip` allow targeted runs during adapter development: ```ts runAdapterConformanceSuite(myAdapter, { only: ['simple-chat'] }); runAdapterConformanceSuite(myAdapter, { skip: ['cancellation-propagates'] }); ``` ## Test coverage (97/97 pass, +6 new) - SDK reference adapter (reducer + markdown render) passes all fixtures - SDK reference adapter (reducer + plainText render) also passes - Buggy adapter (empty string output) fails every fixture with non-empty `expectedContains` - Buggy adapter (raw event dump via JSON.stringify) caught by redaction fixture's `expectedAbsent` - `only` filter narrows to a single fixture - `skip` filter excludes named fixtures from the corpus ## Usage from adapter authors ```ts // In your adapter's test file import { runAdapterConformanceSuite } from '@qwen-code/sdk/daemon'; import { reduceForTui, renderTuiState } from './my-tui-adapter'; it('TUI adapter conforms to daemon UI corpus', () => { const result = runAdapterConformanceSuite({ reduce: reduceForTui, renderToText: renderTuiState, }); expect(result.failed).toEqual([]); }); ``` ## Roadmap PR-G of the unified follow-up to PR #4328. The corpus is intentionally small (10 fixtures) but extensible — adapter authors can submit new fixtures via additions to `DAEMON_UI_CONFORMANCE_FIXTURES` to lock in regression coverage for edge cases their adapter encountered. Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * feat(webui+sdk/daemon-ui): wire transcriptAdapter to SDK render contract (PR-H) Closes the "WebUI transcriptAdapter migration" item in PR #4353's TODO §A. Validates the PR-D render contract end-to-end on the real WebUI consumer. 
`daemonTranscriptToUnifiedMessages(blocks, options?)` gains a new options parameter: ```ts interface DaemonTranscriptAdapterOptions { useMarkdown?: boolean; // default: false enrichToolDetailsWithPreview?: boolean; // default: false } ``` Defaults preserve legacy behavior — existing callers see no change. For `user` / `assistant` / `thought` blocks, content is projected via SDK's `daemonBlockToMarkdown` instead of raw sanitized text. The WebUI's markdown renderer (markdown-it) then gets: - `**You**\n\n<content>` for user blocks (bold "You" label) - Raw text for assistant blocks (markdown formatting in agent output passes through cleanly) - `> *thought:* <text>` blockquote for thought blocks For `tool` blocks, `rawOutput` is replaced with `daemonToolPreviewToMarkdown(block.preview)`. This lets WebUI surfaces without per-preview-kind React components still display: - `file_diff` as a fenced unified diff - `mcp_invocation` as `server::tool` with args summary - `tabular` as GFM pipe table - `search` as bullet list with match count - `image_generation` as embedded markdown image - `subagent_delegation` as delegate arrow + task quote Renderers with per-kind components should leave this opt-out. `packages/sdk-typescript/src/daemon/index.ts` was missing exports for PR-D / PR-F / PR-G / PR-B / PR-E surface — WebUI's `@qwen-code/sdk/daemon` import path uses the daemon root, not the ui/ sub-index. Added 15+ re-exports so consumers don't need to use the longer `@qwen-code/sdk/daemon/ui/index.js` path. 
Now exported from `@qwen-code/sdk/daemon` root: - `daemonBlockToMarkdown` / `daemonBlockToHtml` / `daemonBlockToPlainText` - `daemonToolPreviewToMarkdown` - `extractContentPart` + `DaemonUiContentPart` type - `formatBlockTimestamp` + `selectTranscriptBlocksOrderedByEventId` - `selectCurrentTool` / `selectApprovalMode` / `selectToolProgress` - `runAdapterConformanceSuite` + `DAEMON_UI_CONFORMANCE_FIXTURES` - All associated types `webui/src/daemon/transcriptAdapter.test.ts` mock blocks updated to include `clientReceivedAt` (required field added in PR-B). Mechanical change — every `createdAt: N` test fixture gets a matching `clientReceivedAt: N`. - WebUI `npm run typecheck` — clean - SDK `npm run typecheck` — clean - SDK `vitest run test/unit/daemonUi.test.ts` — 97/97 pass - WebUI transcriptAdapter test fixtures typecheck against updated DaemonTranscriptBlockBase schema PR-H of the unified follow-up to PR #4328. Closes the WebUI migration gap in TODO §A. Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * docs(daemon-ui): add developer guide + migration cookbook (PR-I) Closes the final "Documentation" item in PR #4353's TODO §A. Brings the unified daemon UI surface to ~95% SDK-side completion. 
## Files added - `docs/developers/daemon-ui/README.md` — full API reference - Three-layer model (normalizer → reducer → render helpers) - Quick start with idiomatic event-loop pattern - Event taxonomy (28+ types categorized: chat-stream / session-meta / workspace / auth device-flow) - Render contract cookbook (markdown / HTML / plainText) - Tool preview taxonomy (13 kinds with use cases) - State selectors (currentTool / approvalMode / toolProgress / ordering) - Cancellation propagation explanation - Time semantics (eventId > serverTimestamp > clientReceivedAt precedence) - Adapter conformance usage - ErrorKind dispatch pattern - Tool provenance dispatch pattern - Forward-compat principles - `docs/developers/daemon-ui/MIGRATION.md` — adapter author migration cookbook - Step-by-step recommended adoption order (9 steps, value-ranked) - Before/after code examples for each step - Backward-compat checklist (everything is additive — no breaking changes) - Cross-references to PR-A through PR-H commits ## Roadmap PR-I of the unified follow-up to PR #4328. Documentation-only — no code changes; no tests affected. Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * fix(daemon-ui): address review feedback * fix(daemon-ui): address review hardening feedback * fix(daemon-ui): handle resync-required events * feat(sdk/daemon-ui): consume daemon-side subagent nesting context (PR-K) Closes the SDK-side gap for §B1 in PR #4353's TODO list. PR-E originally deferred subagent nesting because daemon-side parent-context wasn't yet stamped on tool_call events. After the rebase onto current daemon_mode_b_main, source verification confirms the daemon now emits `tool_call._meta.parentToolCallId` + `tool_call._meta.subagentType` via `SubAgentTracker.getSubagentMeta()` (core), so the SDK side is unblocked. 
## Schema additions (additive, forward-compat-safe) `DaemonUiToolUpdateEvent`: - parentToolCallId?: string — toolCallId of the parent Task / delegation - subagentType?: string — sub-agent type label (e.g. 'code-reviewer') `DaemonToolTranscriptBlock`: - parentToolCallId?: string — mirror of event field - subagentType?: string — mirror of event field - parentBlockId?: string — pre-resolved by reducer when parent already in state, so renderers don't re-correlate ## Normalizer wiring `normalizeToolUpdate` checks both top-level and `_meta` for parentToolCallId + subagentType (fallback chain mirrors how provenance/serverId are read). Top-level tool calls without sub-agent context omit the fields cleanly. ## Reducer behavior - New tool block: resolves `parentBlockId` from `toolBlockByCallId` at create time. Out-of-order arrival (child before parent) leaves `parentBlockId` undefined — selectors fall back to `parentToolCallId` lookup. - Existing tool block update: adopts parent context if not yet correlated, never overwrites established correlation (handles the flow where SubAgentTracker activates after the initial tool_call). ## New public selectors - selectSubagentChildBlocks(state, parentToolCallId): returns the array of tool blocks invoked inside a given parent delegation - isSubagentChildBlock(block): type guard for "this tool block came from a sub-agent" Both exported from @qwen-code/sdk/daemon root + ui/index. 
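A rough picture of what the two selectors do, using simplified block shapes: the types and function names below (`SketchToolBlock`, `isSubagentChild`, `directChildren`) are hypothetical stand-ins, and the real `selectSubagentChildBlocks` takes the transcript state rather than a plain array.

```typescript
// Hypothetical, simplified shapes; the SDK's block types carry more fields.
interface SketchToolBlock {
  kind: 'tool';
  id: string;
  toolCallId: string;
  parentToolCallId?: string;
  subagentType?: string;
}
type SketchBlock = SketchToolBlock | { kind: 'user' | 'assistant'; id: string };

/** Type guard: a tool block that was invoked inside a sub-agent. */
function isSubagentChild(block: SketchBlock): block is SketchToolBlock {
  return block.kind === 'tool' && block.parentToolCallId !== undefined;
}

/** Direct children only: tool blocks stamped with the given parent call id. */
function directChildren(
  blocks: readonly SketchBlock[],
  parentToolCallId: string,
): SketchToolBlock[] {
  return blocks.filter(
    (b): b is SketchToolBlock => isSubagentChild(b) && b.parentToolCallId === parentToolCallId,
  );
}
```

Keying the lookup on `parentToolCallId` rather than `parentBlockId` is what lets it keep working when the child arrived before its parent.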
## Forward-compat properties - Top-level tool calls (no sub-agent) work identically as before - Trimmed parent blocks: child fallback to undefined parentBlockId - Daemon emits both fields together; SDK reads independently to tolerate partial future stamping ## Test coverage (129/129 pass, +5 new tests) - Extract parentToolCallId + subagentType from `_meta` - Top-level tool calls have undefined parent fields (forward-compat) - Reducer correlates parentBlockId at create time - Reducer adopts parent context on later update (out-of-order arrival) - isSubagentChildBlock discriminator ## Roadmap PR-K of the unified follow-up to PR #4353. Closes §B1 (subagent nesting) in the TODO declaration; daemon-side already shipped on `daemon_mode_b_main` via SubAgentTracker (core). Remaining TODO §B / §D items still depend on further daemon/Core work: - §B2 `tool.progress` event type (daemon emit pending) - §D MessageEmitter multimodal echo + HistoryReplayer inlineData/fileData (core change pending) Generated with AI Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * fix(daemon-ui): PR-K self-review hardening — back-fill / trim / self-ref / docs Multi-round self-review of PR-K (d8375fe46) surfaced two real bugs, a few defensive gaps, and missing docs/fixture coverage. All addressed in one commit. ## Bugs fixed ### Bug 1 — `parentBlockId` never back-filled for out-of-order arrival Original PR-K resolved `parentBlockId` only at child create time, which broke this flow: 1. Child arrives WITH parent stamp → block created with `parentToolCallId` set, `parentBlockId` undefined (parent not in state yet) 2. Parent arrives later → block created, `toolBlockByCallId` indexed 3. Subsequent child updates: existing-block branch only ran the back-fill inside `!existing.parentToolCallId`, which is false (we already adopted the stamp in step 1). `parentBlockId` stayed undefined forever. Fix: separate the two correlations. 
- existing-block update: independently back-fill `parentBlockId` whenever `parentToolCallId` is set and `parentBlockId` is missing
- new-block create: scan existing children whose `parentToolCallId` matches the new block's `toolCallId` and back-fill their `parentBlockId`. Cheap O(n) over current blocks.

### Bug 2: dangling `parentBlockId` after trim

`trimTranscriptState` reset `toolBlockByCallId[id]` to the trimmed sentinel for evicted blocks but did NOT walk surviving children to null their `parentBlockId` references. Renderers walking `blockIndexById.get(parentBlockId)` would get undefined, with no "why" signal.

Fix: post-trim, walk remaining tool blocks; if `parentBlockId` references an id not in `keptIds`, null it. `parentToolCallId` stays (survives trimming so selector-keyed queries still work).

## Defensive hardening

- **Self-reference guard** (normalizer): drop `parentToolCallId === toolCallId` before it reaches the reducer. Daemon should never emit this, but defending costs nothing.
- **Selector docstring**: clarify `selectSubagentChildBlocks` returns **direct** children only; document cycle / depth-cap responsibility for renderers walking up the chain.
- **Cosmetic**: remove redundant `as DaemonToolTranscriptBlock` cast in `isSubagentChildBlock` (TypeScript already narrows after `block.kind === 'tool'` on the discriminated union).
- **Alphabetical**: move `isSubagentChildBlock` re-export to correct position in both `daemon/index.ts` and `daemon/ui/index.ts`.

## Docs + conformance gaps closed

- `README.md`: new "Sub-agent nesting (PR-K)" section with full reducer behavior, out-of-order handling note, recursive walk example, cycle-defense note.
- `MIGRATION.md`: new step 8a with before/after for nested rendering.
- `conformance.ts`: new `subagent-nesting` fixture covering parent + nested child via `tool_call._meta`. Markdown-safe phrases chosen (markdown escapes `-` so titles cannot be substring-matched as-is).
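The two separated correlations from Bug 1 can be sketched as below. This is an illustration under stated assumptions, not the reducer's actual code: the `Sketch*` types, the `block-N` id scheme and the in-place mutation are simplifications (the real reducer works on cloned state).

```typescript
interface SketchToolEvent {
  toolCallId: string;
  parentToolCallId?: string;
}
interface SketchToolBlock {
  id: string;
  toolCallId: string;
  parentToolCallId?: string;
  parentBlockId?: string;
}
interface SketchState {
  blocks: SketchToolBlock[];
  toolBlockByCallId: Map<string, string>; // toolCallId -> block id
}

function createSketchState(): SketchState {
  return { blocks: [], toolBlockByCallId: new Map<string, string>() };
}

function upsertToolBlock(state: SketchState, event: SketchToolEvent): SketchToolBlock {
  const existingId = state.toolBlockByCallId.get(event.toolCallId);
  const existing = state.blocks.find((b) => b.id === existingId);
  if (existing !== undefined) {
    // Correlation A: adopt the parent stamp once, never overwrite it.
    if (existing.parentToolCallId === undefined && event.parentToolCallId !== undefined) {
      existing.parentToolCallId = event.parentToolCallId;
    }
    // Correlation B: back-fill parentBlockId independently of A.
    if (existing.parentToolCallId !== undefined && existing.parentBlockId === undefined) {
      const parentId = state.toolBlockByCallId.get(existing.parentToolCallId);
      if (parentId !== undefined) existing.parentBlockId = parentId;
    }
    return existing;
  }

  const block: SketchToolBlock = {
    id: `block-${state.blocks.length + 1}`, // hypothetical id scheme
    toolCallId: event.toolCallId,
  };
  if (event.parentToolCallId !== undefined) {
    block.parentToolCallId = event.parentToolCallId;
    const parentId = state.toolBlockByCallId.get(event.parentToolCallId);
    if (parentId !== undefined) block.parentBlockId = parentId;
  }
  // New block may be a late-arriving parent: back-fill waiting children, O(n).
  for (const other of state.blocks) {
    if (other.parentToolCallId === block.toolCallId && other.parentBlockId === undefined) {
      other.parentBlockId = block.id;
    }
  }
  state.blocks.push(block);
  state.toolBlockByCallId.set(block.toolCallId, block.id);
  return block;
}
```

Keeping the back-fill outside the "adopt the stamp" branch is the point of the fix: a child that already knows its `parentToolCallId` still gets its `parentBlockId` resolved once the parent shows up.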
## Test coverage (+5 tests, 134/134 pass)

- Self-reference dropped in normalizer
- Back-fill on out-of-order parent arrival (child first, parent after)
- Back-fill on later child update when parent now exists
- Dangling `parentBlockId` nulled after parent trimmed
- New `subagent-nesting` conformance fixture passes SDK reference adapter

## Side-effect verification

Verified no regressions:
- Cancellation propagation still cancels parent + children together (iterates `toolBlockByCallId`, which includes both)
- Render contract unchanged (`daemonBlockToMarkdown` etc. project per block, no nested awareness required)
- No serializer to update
- `selectTranscriptBlocksOrderedByEventId` unaffected (parent-agnostic)

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): permission block trim contract - wenshao review

Addresses both items from wenshao's review on PR #4353:

## Critical: resolvePermissionBlock missing TRIMMED guard

The sibling `upsertPermissionBlock` (transcript.ts:544) correctly returns early when `existingId === TRIMMED_PERMISSION_BLOCK_ID`, but `resolvePermissionBlock` (transcript.ts:581) had no such guard. When `maxBlocks` trimming evicted a pending permission request, a subsequent `permission.resolved` event would:

1. Fail the `getWritableBlockById` lookup (sentinel is not a real block id)
2. Fall through and create a brand-new orphan resolution block

This wasted a block slot, accelerated further trimming, and silently broke the trimmed-block contract that the request-side guard establishes.

Fix: mirror the request-side guard. Read the index entry up front, return early on the sentinel.

## Suggestion: permissionBlockByRequestId grows unboundedly

`trimTranscriptState` writes `TRIMMED_PERMISSION_BLOCK_ID` for evicted permission requests but never deletes those entries. Unlike the tool side (which calls `pruneTrimmedToolIndexes` post-trim), the permission index grew without bound in long sessions.
Fix: add `pruneTrimmedPermissionIndexes` analogous to the tool-side helper. Caps the sentinel set at `maxBlocks` entries; older entries are deleted (any later resolution event still drops cleanly via the new Critical guard).

## Tests

- Updated existing `keeps orphan permission resolutions visible after request trimming` test to encode the corrected contract (drops silently instead of creating an orphan). Test rename: "drops resolution for trimmed permission requests (wenshao Critical)".
- New `Suggestion: pruneTrimmedPermissionIndexes caps the trimmed sentinel set` test verifies the cap.

Total: 136/136 tests pass, SDK + WebUI typecheck green.

## Side-effect verification

- `upsertPermissionBlock` already had the equivalent guard, so no asymmetry remains.
- `pruneTrimmedPermissionIndexes` only touches entries holding the sentinel; live permission blocks are unaffected.
- Selectors over `state.blocks` (e.g. `selectPendingPermissionBlocks`) iterate the block array, not the index, so they are unaffected by the cap.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): address wenshao + doudouOUC inline reviews (2026-05-23)

Addresses the 13 inline review comments from wenshao (6) and doudouOUC (7, one overlap) on the 2026-05-23 review round.

## Critical / Important

### sanitizeUrls not threaded through HTML preview path (doudouOUC)

`daemonBlockToHtml` for tool blocks called `daemonToolPreviewToPlainText` which didn't accept `opts`: when callers set `sanitizeUrls: true`, the markdown path stripped auth tokens but the HTML path leaked them into the DOM. Now: helper accepts opts, threads through `web_fetch.url` and `image_generation.thumbnailUrl`.

### enrichToolDetailsWithPreview overwrote rawOutput (doudouOUC)

The webui adapter replaced structured `rawOutput` with a markdown summary string when `enrichDetails: true`. Downstream `ToolCallData` consumers may branch on the shape (object vs string) and break.
Plus the actual tool output was silently dropped. Fix: keep `rawOutput` verbatim, surface markdown via a new optional `previewMarkdown` field added to `ToolCallData`.

### transcriptBlockToTerminalText zero test coverage (wenshao)

Added 12 tests covering each `switch` branch (user / assistant / thought / tool / shell stdout+stderr / permission unresolved+resolved / status / debug / error) plus the unknown-kind degradation path. Verified `assertNever` returns a graceful error line (does NOT throw); wenshao's reviewer was slightly wrong on the throw claim but coverage gap was real.

### selectTranscriptBlocksOrderedByEventId no memoization (wenshao)

Selector was called from React `useSyncExternalStore` and re-sorted on every dispatch, including sidechannel-only events that don't touch blocks. Added WeakMap cache keyed on `state.blocks` reference; the reducer preserves the same array reference for non-block-mutating events, so the cache hits across renders.

### selectSubagentChildBlocks O(n) per call (wenshao)

Naive `state.blocks.filter()` was O(n) per call; rendering a tree with m parents made it O(n*m). Built a memoized reverse index keyed on `state.blocks` reference (WeakMap of parentToolCallId → DaemonToolTranscriptBlock[]). Each lookup now O(1) after first call.
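A minimal sketch of the memoized reverse index described above, assuming simplified `Sketch*` block and state shapes. `indexBuilds` exists only to make cache hits observable here, and the cache only helps when the reducer really does reuse the `blocks` array reference between snapshots.

```typescript
interface SketchToolBlock {
  id: string;
  parentToolCallId?: string;
}
interface SketchState {
  blocks: readonly SketchToolBlock[];
  approvalMode: string; // stand-in for any sidechannel field
}

const EMPTY_CHILD_LIST: readonly SketchToolBlock[] = Object.freeze([]);

// Keyed on the blocks array identity, so entries vanish when the array is collected.
const childrenIndexCache = new WeakMap<
  readonly SketchToolBlock[],
  Map<string, SketchToolBlock[]>
>();
let indexBuilds = 0; // demo-only counter

function getOrBuildChildrenIndex(
  blocks: readonly SketchToolBlock[],
): Map<string, SketchToolBlock[]> {
  const cached = childrenIndexCache.get(blocks);
  if (cached !== undefined) return cached;
  indexBuilds += 1;
  const index = new Map<string, SketchToolBlock[]>();
  for (const block of blocks) {
    if (block.parentToolCallId === undefined) continue;
    const list = index.get(block.parentToolCallId);
    if (list === undefined) index.set(block.parentToolCallId, [block]);
    else list.push(block);
  }
  childrenIndexCache.set(blocks, index);
  return index;
}

// One O(n) build per blocks reference, then O(1) per lookup.
function selectChildren(
  state: SketchState,
  parentToolCallId: string,
): readonly SketchToolBlock[] {
  return getOrBuildChildrenIndex(state.blocks).get(parentToolCallId) ?? EMPTY_CHILD_LIST;
}
```

The trade-off is that correctness depends on `blocks` never being mutated in place: a stale index would otherwise be served for the same array reference.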
### Test file TS errors at root tsc (wenshao)

Fixed multiple TS errors in `daemonUi.test.ts` flagged by root `tsc --noEmit`:
- Added `DaemonTranscriptState` + `DaemonUiEvent` imports
- `block.content` access via `as Array<Record<string, unknown>>` cast
- `delete` on globalThis property via narrower interface cast
- `debug?.text` via `DaemonUiEvent & { text: string }` narrowing (Extract on union with `'status' | 'debug'` literal would resolve to never)
- 6 occurrences of index-signature access via bracket notation
- `raw: null` added to 3 `DaemonUiPermissionOption` literals (required field)
- Explicit type annotations on conformance-suite `renderToText` params

Note: `webui/src/daemon/transcriptAdapter.test.ts` shows residual "clientReceivedAt does not exist" errors at root tsc, but this is environmental: the resolution trace shows `@qwen-code/sdk/daemon` crossing into a sibling worktree's stale dist via shared workspace node_modules. In a single-worktree CI checkout this resolves cleanly.

## Suggestions (cleanups)

### Hoist asDaemonErrorKind double-eval (doudouOUC)

`session_died` + `stream_error` cases each computed `asDaemonErrorKind` twice in the conditional spread (predicate + value). Hoisted to const, no functional change.

### renderToolHeader bypassed opts (doudouOUC)

Forwarded `opts` so `maxFieldLength` is honored for tool title / toolName / toolKind.

### isSensitiveKey duplicates (doudouOUC)

Removed duplicate `endsWith('accesskey')` / `endsWith('secretkey')` checks and the redundant exact-match `privatekey` (already covered by `endsWith`).

### propagateCancellationToInFlightTools iterated trimmed (wenshao)

Filter `TRIMMED_TOOL_BLOCK_ID` sentinels up front. Avoids redundant index dereferences in long sessions with many historical tools.

### toolProgress shallow clone (doudouOUC + wenshao)

`cloneTranscriptState` outer `...state` spread shared inner `{ ratio?, step? }` references between snapshots.
Once `tool.progress` event handlers start mutating in place, the prior snapshot would leak. Deep-clone the inner records now (cost bounded by in-flight tools, small).

### isDeviceFlowErrorKind closed set (wenshao + doudouOUC)

Both reviewers suggested strict validation. We INTENTIONALLY kept lenient pass-through: the public type `DaemonAuthDeviceFlowSdkErrorKind` explicitly includes `(string & {})` as a forward-compat escape hatch (existing test `keeps future auth_device_flow_failed errorKind values observable` enforces this). Now expose `KNOWN_DEVICE_FLOW_ERROR_KINDS` as documentation and explain the design in the JSDoc.

## Validation

| | |
|---|---|
| SDK tests | 148/148 pass (+12 terminal coverage + assorted hardening) |
| SDK typecheck | clean |
| WebUI typecheck | clean |

## Side-effect verification

- WeakMap memos invalidate correctly: reducer creates a fresh `state.blocks` reference only on block-mutating events. Sidechannel events reuse the same reference.
- `previewMarkdown` is optional and additive on `ToolCallData`; consumers ignoring it are unaffected.
- `sanitizeUrl` is called only when `opts.sanitizeUrls === true` in HTML path; default behavior unchanged.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): wenshao glm-5.1 review - lazy COW + lint + memo verification

Addresses the 6 inline comments from wenshao's 2026-05-23 13:03 CHANGES_REQUESTED review.

## Real fix: WeakMap memoization actually works now (Suggestion #2)

The earlier `sortedBlocksCache` / `childrenIndexCache` WeakMaps keyed on `state.blocks` reference, but `cloneTranscriptState` did `blocks: [...state.blocks]` eagerly: every dispatch produced a fresh array, so the caches never hit. The JSDoc claim "memoize across renders that don't touch blocks" was misleading.

Fix: lazy copy-on-write.
- `cloneTranscriptState` now shares `blocks` + `blockIndexById` by reference (no eager copy).
- New `takeBlocksOwnership(state)` performs the array copy at the first mutation; subsequent mutations in the same dispatch are no-ops (tracked via module-level `ownedBlocks: WeakMap<State, blocks>`).
- `appendBlock`, `getWritableBlockById`, and `trimTranscriptState` all take ownership before mutating.

Result: sidechannel events (approval mode change, session metadata, workspace events, auth device-flow, etc.) preserve `state.blocks` identity across dispatches. The WeakMap caches actually hit now, verified by new test `selectTranscriptBlocksOrderedByEventId returns the same array reference for sidechannel-only events`.

## Lint Criticals (3): readonly array syntax

`ReadonlyArray<T>` → `readonly T[]` per `@typescript-eslint/array-type`:
- `KNOWN_DEVICE_FLOW_ERROR_KINDS` satisfies clause
- `EMPTY_CHILD_LIST`
- `selectSubagentChildBlocks` return type

## Suggestion #1: shallow copy from selectSubagentChildBlocks

Return `[...cached]` so accidental in-place mutation (e.g., caller calling `.sort()` on the result) cannot corrupt the WeakMap-cached children index for other consumers sharing the same `state.blocks` snapshot.

## Suggestion #6: KNOWN_DEVICE_FLOW_ERROR_KINDS sync test

Added test `only contains canonical device-flow error kinds`, a runtime assertion that guards against the array being silently emptied. The `as const satisfies readonly DaemonAuthDeviceFlowSdkErrorKind[]` at the declaration site already enforces type-level membership; this test adds a stable count check.
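The lazy copy-on-write scheme from the "Real fix" section above can be sketched roughly as follows. The state shape, the two event types and `reduceSketch` are hypothetical; only `cloneTranscriptState`, `takeBlocksOwnership`, `appendBlock` and `ownedBlocks` echo names from the commit message, and their real signatures may differ.

```typescript
interface SketchBlock {
  id: string;
}
interface SketchState {
  blocks: SketchBlock[];
  approvalMode: string;
}
// Hypothetical events: one block-mutating, one sidechannel-only.
type SketchEvent =
  | { type: 'block.append'; id: string }
  | { type: 'approval_mode.changed'; mode: string };

// Remembers which array each in-progress state already copied for itself.
const ownedBlocks = new WeakMap<SketchState, SketchBlock[]>();

// Shares `blocks` by reference; no eager copy.
function cloneTranscriptState(state: SketchState): SketchState {
  return { ...state };
}

// Copies on first mutation; later calls in the same dispatch are no-ops.
function takeBlocksOwnership(state: SketchState): SketchBlock[] {
  if (ownedBlocks.get(state) === state.blocks) return state.blocks;
  const copy = [...state.blocks];
  state.blocks = copy;
  ownedBlocks.set(state, copy);
  return copy;
}

function appendBlock(state: SketchState, block: SketchBlock): void {
  takeBlocksOwnership(state).push(block);
}

function reduceSketch(prev: SketchState, event: SketchEvent): SketchState {
  const next = cloneTranscriptState(prev);
  if (event.type === 'block.append') {
    appendBlock(next, { id: event.id });
  } else {
    next.approvalMode = event.mode; // blocks array identity is untouched
  }
  return next;
}
```

Because sidechannel dispatches never call `takeBlocksOwnership`, caches keyed on the `blocks` reference stay valid across them, which is what makes the WeakMap memoization pay off.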
## Test coverage (+4 new tests, 152/152 pass)

- `selectTranscriptBlocksOrderedByEventId` preserves array identity across sidechannel-only events (memo hit verification)
- `selectSubagentChildBlocks` preserves WeakMap entry across sidechannel dispatches
- `selectSubagentChildBlocks` returns shallow copy (caller mutation doesn't corrupt cache)
- `KNOWN_DEVICE_FLOW_ERROR_KINDS` membership + count assertions

## Side effects

- Block property mutations still leak across snapshots (pre-existing: the original eager copy was also a shallow array copy with shared block refs). Not introduced by this change; documented in `getWritableBlockById` comments.
- All existing block-mutating tests pass: `takeBlocksOwnership` produces the same observable result as eager copy, just deferred to first mutation.

Validation:
- SDK tests: 152/152 pass
- SDK typecheck: clean
- WebUI typecheck: clean

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): forward opts in daemonBlockToPlainText tool case

wenshao review 4350741340 (2026-05-23 13:00): the prior doudouOUC review fixed only the HTML path; the plainText tool case still called `daemonToolPreviewToPlainText(block.preview)` without `opts`, so `sanitizeUrls` + `maxFieldLength` were silently ignored when consumers used the plain-text projection (logs, clipboard, terminal mirroring).

Symmetric fix to the HTML path (line 509). Added test verifying token stripping reaches `web_fetch.url` via plainText path.

Validation: 153/153 SDK tests, SDK + WebUI typecheck clean.

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): address wenshao 2026-05-23 reviews (3 Critical + 8 Suggestion + 1 false-positive)

Walks all 22 inline comments from wenshao's 13:00-14:56 burst plus doudouOUC's APPROVED-with-suggestion. 11 real fixes applied; 1 reverted after gate-check; remaining items either already addressed in prior commits (stale) or are test-only coverage gaps now filled.
## Security / Correctness Criticals (real)

### sanitizeUrl strips Basic Auth (R2 #1)

`https://user:pw@host/...` previously passed through with userinfo intact, leaking secrets into rendered markdown / HTML / plaintext. `u.username = ''; u.password = '';` before serializing.

### thumbnailUrl protocol validation always-on (R2 #2)

`javascript:alert(1)` in `![image](url)` survived when sanitizeUrls was false (the default). Added `ensureSafeImageUrl(url)`, a protocol whitelist (http/https/data only) that runs unconditionally for image URL renderings. `sanitizeUrls: true` still wins for query-param + Basic Auth stripping.

### permission.resolved orphan after sentinel pruned (R1 #2)

The prior trim-contract fix guarded `existingId === TRIMMED_*`. After `pruneTrimmedPermissionIndexes` deleted a sentinel (long sessions), `existingId` became `undefined`, bypassed the guard, and created an orphan. Reject `undefined || TRIMMED_*` together.

## Behavior Suggestions (real)

### Selective cancellation propagation (R2 #6)

`assistant.done.reason` of `stream_ended` / `reconnected` are transport-layer signals: the daemon-side tool is still running and SSE replay will deliver the real terminal status. Marking in-flight tools cancelled caused a visible spinner-to-red flash on reconnect. Scoped propagation to `cancelled` || `error` only.

### awaitingResync diagnostics (R2 #3)

State-resync latch silently dropped events with no signal. Added `console.warn` describing the dropped event type + last resync trigger so a stuck UI is debuggable. Latch behavior intentionally preserved: recovery is `store.reset()` on session reconnect.

### selectSubagentChildBlocks: freeze instead of copy (R1 #8)

`[...cached]` per-call defeated React.memo / useMemo identity stability (every call produced a fresh array reference).
Now freeze the cached arrays at build time in `getOrBuildChildrenIndex` and return the frozen reference directly: referential stability + mutation defense (strict-mode throws on `.length = 0` etc.).

### detectSubagentDelegation regex too broad (R3 #2)

`(?:^|_)task$` falsely matched `edit_task` / `list_task` / `create_task` etc., common tool names unrelated to delegation. Anthropic's Task tool is literally named `Task` (no prefix), so restricted bare-`task` to whole-name only: `^task$`. `delegate` / `subagent` / `spawn_task` keep the `^|_` prefix.

### memoryChanged bytesWritten finite check (R3 #3)

`typeof === 'number'` accepted NaN / Infinity. Use the existing `numberField` helper which calls `Number.isFinite(v)`.

### Multi-line blockquote prefix (R3 #1)

`> *thought:* ${text}` only prefixed the first line; subsequent lines escaped the blockquote. Added `blockquote(raw)` helper that prefixes every line; applied to thought / debug / error renderings.

## Quality (real)

### plainText / HTML maxFieldLength parity (R1 #5/6/7, doudouOUC approve note)

The tool block in markdown caps via `text()`; plaintext + HTML caps were missing on header fields, preview content, and permission block labels. Threaded `cap()` consistently across all three projections.

### isSensitiveKey dedup (R1 #10)

Seven exact-match entries (`password` / `apikey` / `idtoken` / `sessiontoken` / `clientsecret` / `xapikey` / `xauthtoken`) were already subsumed by existing `endsWith` rules. Removed.

### Re-export DaemonUiStateResyncRequiredEvent (R2 #7)

Other session-meta event types are exported from the daemon barrel; this one was missed. Added to both `daemon/ui/index.ts` and `daemon/index.ts`.

## Reverted after gate-check (false-positive)

### classifySelectedPermissionOption CANCELLED branch (R2 #4)

Reviewer suggested adding `CANCELLED_PERMISSION_TERMS` check before the `completed` default, so `selected:cancel` would map to cancelled.
This CONFLICTS WITH:
- the design comment at the caller: "A selected option resolves the prompt even when the option id is a domain value like a city name or an option id containing deny/cancel"
- the existing test `'cancelled-substring-permission'` with payload `'selected:abort'` expecting status `'completed'`

The daemon expresses "user cancelled the prompt" via `cancelled` as the PRIMARY token (handled at the caller layer), not `selected:cancel`; the latter means "user picked an option labeled cancel", which is a successful selection. Reverted; added explanatory comment so the next review round doesn't re-flag it.

## Stale (already fixed)

### R1 #1 (daemonBlockToPlainText opts forwarding)

Already fixed in d35cbb75a (2026-05-23 monitor pass for review 4350741340). No further action.

## Test coverage added

- HTML web_fetch URL sanitization (sanitizeUrls + Basic Auth)
- Image URL protocol validation when sanitizeUrls:false
- HTML shell / permission / thought / debug / status block kinds
- Trimmed-tool cancellation propagation (no throw + transport-layer no-cancel)
- Late permission.resolved after sentinel prune (no orphan)
- Frozen children-index identity stability + mutation guard
- previewMarkdown preserves rawOutput as object (in webui adapter test file)

## Validation

| | |
|---|---|
| SDK tests | **161/161** (was 153 → +8 new) |
| WebUI tests | **9/9** (was 8 → +1 new) |
| SDK typecheck | clean |
| WebUI typecheck | clean |

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): tighten ensureSafeImageUrl to data:image/* only

Audit follow-up (post-f5c54680f review pass): the previous `ensureSafeImageUrl` whitelist accepted any `data:` URI, which let `data:text/html,<script>alert(1)</script>` pass the protocol check. Modern browsers don't execute `<img src="data:text/html,...">`, but the comment claimed "never legitimate in `<img src>`" which slightly over-claimed the protection.
Tighten the data: branch to require an `image/<subtype>` MIME prefix. Verified by a new test that covers: https (allow), data:image/png (allow), data:text/html (reject → '#'), javascript: (reject → '#').

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): wenshao + doudouOUC R4 review batch

Walks 6 wenshao items (delivered as 8 review submissions, 2 CHANGES_REQUESTED + 6 individual COMMENTED, but 6 distinct concerns) and 3 doudouOUC R4 nits. All 9 real issues addressed; no false-positives this round.

## Real Criticals

### awaitingResync recovery API (wenshao R4)

`store.reset()` requires session-id change semantics, the wrong shape for "same-session reconnect with SSE replay" recovery. Added explicit `store.clearAwaitingResync()` API. Latch is still set on receipt of `session.state_resync_required` (intentional one-way during replay window); consumers now have a clean path to clear after the replay stream drains.

### normalizeAuthDeviceFlowCancelled test coverage (wenshao R4)

Coverage gap surfaced: happy path (valid deviceFlowId) and malformed fallback to debug both untested. Added 2 tests.

## Real Suggestions

### sanitizeUrl: AWS / Azure / GCP credential patterns

The previous regex caught `x-amz-` and `x-goog-` headers + generic `signature` / `sig`, but missed:
- `AWSAccessKeyId` (S3 presigned)
- Azure SAS short codes (`sv` / `se` / `sr` / `sp` / `st` / `spr` / `sip` / `ss` / `srt` / `sig` / `skoid` / etc.)
- GCP signed-URL `GoogleAccessId` + `Expires` (paired with credentials in signed URL contexts)

Widened regex to include `aws|google|expires` prefixes + added explicit Azure-SAS Set check.

### detectFileDiff: `content` alias disambiguated

`{ path, content }` was being classified as `file_diff` regardless of tool semantics, but the same shape is common for file_read assertions or search queries. Since detectFileDiff runs BEFORE detectFileRead in the detector chain, this caused mis-classification.
Fix: restrict bare `content` to require either (a) write-intent tool name (write/create/edit/replace/save/update) OR (b) co-occurrence with `oldText`. Explicit `newText` / `new_text` / etc. still pass through unconditionally. Required adding `opts` to the `detectFileDiff` signature (callers already pass opts to siblings).

### detectFileRead: 0-based offset → 1-based range

Type doc says `range: [startLine, endLine]` is 1-based inclusive. The offset+limit conversion produced 0-based output ([0, 9] for offset=0/limit=10), which displayed as "lines 0-9"; line 0 doesn't exist in 1-based. Convert at the detector: `[offset+1, offset+limit]`. Updated the matching test (which had encoded the 0-based bug as expected behavior).

### formatMissedRange: guard inverted / single-event ranges

The naive `lastDeliveredId+1 .. earliestAvailableId-1` formula produced:
- `gap === 0`: "missed 6-5" (inverted)
- `gap === 1`: "missed 6-6" (single event shown as range)

Added `formatMissedRange()` helper with explicit branches:
- `last < first` → "no events lost (resync requested without gap)"
- `last === first` → "missed 1 daemon event (id N)"
- `last > first` → "missed daemon events X-Y"

Applied in both `transcript.ts` (status block message) and `terminal.ts` (ANSI projection); the same formula was duplicated.

## doudouOUC R4 nits

### README errorKind list outdated

Replaced `expired / transport / server / internal` with pointer to `KNOWN_DEVICE_FLOW_ERROR_KINDS` exported constant; canonical list auto-stays-in-sync.

### README "10 scenarios" stale

Was 10, became 11 with subagent-nesting. Removed the count and let the corpus be derived at runtime via `DAEMON_UI_CONFORMANCE_FIXTURES.length`.

### selectTranscriptBlocks danger post lazy-COW

With state.blocks now shared across sidechannel snapshots, a misbehaving consumer doing `(state.blocks as DaemonTranscriptBlock[]).sort()` would poison every snapshot sharing the reference.
Freeze the blocks array at the dispatch boundary in `reduceDaemonTranscriptEvents`. Internal reducer mutation goes through `takeBlocksOwnership` which copies before mutating, so the frozen reference is never modified in place.

## Validation

| | |
|---|---|
| SDK tests | **162/162** |
| WebUI tests | **9/9** |
| SDK typecheck | clean |
| WebUI typecheck | clean |

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): wenshao R5 review batch - Critical OAuth fragment leak + 10 more

Walks 13 inline items from wenshao's 16:46-17:28 reviews. 11 fixed, 1 deduped (lint-no-console flagged in both reviews), 1 reverted/push-back (multi-part deny re-flags the same design-intent territory as R2 #4).

## Critical fixes

### sanitizeUrl: OAuth #fragment leak

`sanitizeUrl` cleared query params and Basic Auth userinfo, but `u.toString()` preserved `u.hash`. OAuth 2.0 implicit grant puts `access_token=...` directly in the fragment (e.g., `https://app/#access_token=gho_xxx&token_type=bearer`); some Azure SAS variants similarly. Now `u.hash = ''` before serialize. For rendered output (markdown / HTML / plaintext), the fragment is client-state-only and dropping it removes the entire fragment-side leak surface.

### ESLint no-console on awaitingResync diagnostic

Project lint forbids bare `console.*`. Added `eslint-disable-next-line no-console -- intentional diagnostic` per wenshao's suggestion. Behavior unchanged.

### normalizeAuthDeviceFlowCancelled test coverage (still missing post-R4)

R4 added tests for one of the five device-flow normalizers; the `cancelled` variant was still uncovered. Added happy + malformed-payload tests.

## Behavior fixes

### Plaintext sanitizeTerminalText parity

`daemonBlockToPlainText` + `daemonToolPreviewToPlainText` previously returned ANSI/bidi-control text verbatim, while markdown and HTML paths sanitized via `sanitizeTerminalText`.
A daemon emitting bidi overrides survived clean to plaintext output, contradicting the "copy-paste / logs" JSDoc intent. Now routes every text field through `clean()` = `cap(sanitizeTerminalText(raw))`.

### blockquote helper applied to image_generation + subagent_delegation

R3 added the helper for thought/debug/error but missed two preview markdown sites (`> ${text(preview.prompt)}` for image_generation, `> ${text(preview.task)}` for subagent_delegation). Multi-line prompts / tasks now stay inside the blockquote.

### Default unrecognized-event branch: single debug block

Was emitting `status + debug` (2 blocks) per unknown event type. In long sessions where the daemon adds new types an older SDK doesn't recognize, this doubled block-consumption rate and accelerated `maxBlocks` trimming of real content. Now emit a single `debug` block that prefixes the event-type for adapters that want to pattern-match.

### writeIntent regex underscore-boundary aware

R4's `content` alias gate-check used `\b` word boundaries, but `\b` doesn't match between `write` and `_` in `write_file` (both `\w`). Fixed to `(?:^|[_-])verb(?:$|[_-])` which catches the canonical `write_file` naming AND still rejects `prewrite_check`. Verb list extended per wenshao's suggestion (`overwrite`/`modify`/`patch`/`generate`).

### useDaemonPendingPermissions over-subscription

Hook used `useDaemonTranscriptState()` which fires on every daemon event (text deltas, tool updates, sidechannel). Switched to `useDaemonTranscriptBlocks()` which only invalidates when the blocks array reference changes (block-mutating dispatches only, thanks to lazy COW). Same selector semantics, ~10x fewer renders in chat-heavy sessions.

### Conformance suite: try/catch adapter

JSDoc promised "does not throw" but the loop wrapped adapter calls without try/catch. Buggy adapters aborted the whole suite instead of producing a structured `ConformanceFailure`.
Now wrap; on throw, capture the error message in `renderedExcerpt: "[adapter threw: ...]"` and continue.

## Type / Quality fixes

### DaemonTranscriptState.blocks typed readonly

Runtime contract is frozen (lazy-COW poison defense), but the type was mutable, so consumers got runtime `TypeError` for in-place mutation instead of compile errors. Now `readonly DaemonTranscriptBlock[]` so mutation is caught at the type level.

### formatMissedRange exported / deduplicated

Helper was duplicated inline between transcript.ts (full phrasing) and terminal.ts (terser phrasing). Exported from transcript.ts and reused in terminal.ts to prevent future drift.

## Push-back (false-positive, see reply)

### classifySelectedPermissionOption multi-part deny (`selected:deny:access_violation`)

Re-flags the same `selected:X` design intent rejected in R2 #4. The caller comment explicitly states a selected option resolves the prompt even when the option id contains `deny`/`cancel`. The existing test `cancelled-substring-permission` (payload `selected:abort`, expected `completed`) codifies this. Daemon expresses true user-cancellation via the `cancelled` PRIMARY token, not `selected:cancel`. Not changing; reply directs to the same R2 #4 reasoning.
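For reference, the three-branch `formatMissedRange` logic that R4 introduced and R5 deduplicated could look roughly like this. The parameter names and the two-number signature are assumptions; only the three output phrasings come from the commit message.

```typescript
// Sketch only: the real helper lives in transcript.ts and may take different arguments.
function formatMissedRange(lastDeliveredId: number, earliestAvailableId: number): string {
  const first = lastDeliveredId + 1; // first event id the client never saw
  const last = earliestAvailableId - 1; // last event id no longer replayable
  if (last < first) {
    return 'no events lost (resync requested without gap)';
  }
  if (last === first) {
    return `missed 1 daemon event (id ${first})`;
  }
  return `missed daemon events ${first}-${last}`;
}

// The naive formula would have rendered the first case as "missed 6-5".
const noGap = formatMissedRange(5, 6);
const singleEvent = formatMissedRange(5, 7);
const multiEvent = formatMissedRange(5, 10);
```

Sharing one exported helper between the status-block message and the terminal projection keeps the two phrasings from drifting apart again.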
## Tests added (+10)

- normalizeAuthDeviceFlowCancelled happy + malformed
- sanitizeUrl OAuth fragment access_token rejected
- sanitizeUrl AWS/GCP/Azure SAS credential params stripped
- formatMissedRange no-gap / single-event / multi-event
- detectFileDiff content alias rejected for read-like tools
- detectFileDiff content alias accepted for write-like tools
- writeIntent word boundaries (prewrite_check NOT matched)
- conformance captures adapter throw
- unrecognized event → single debug block
- store.clearAwaitingResync clears latch

## Validation

| | |
|---|---|
| SDK tests | **172/172** (was 162, +10) |
| WebUI tests | **9/9** |
| SDK typecheck | clean |
| WebUI typecheck | clean |

Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): wenshao R6 - recovery flow chicken-and-egg + pending pointer

Three Criticals from R6 review (4351217188) all pointing at real bugs introduced by R4/R5 work, not false positives. Fixes plus regression tests.

## Critical 1: same-session reconnect never clears the latch

When the daemon emitted `state_resync_required`, the reducer set `awaitingResync = true`. The webui provider dispatched `assistant.done { reason: 'reconnected' }` after re-attaching SSE but never called `store.clearAwaitingResync()`. Result: events flowed in on the fresh stream but every one got dropped by the `applyDaemonTranscriptEvent` passthrough guard. Transcript appeared permanently frozen with no diagnostic clue (the `console.warn` fired on each drop, but the user wouldn't necessarily check DevTools).

Fix: in `DaemonSessionProvider.tsx`, after dispatching the synthetic `reconnected` `assistant.done`, check `awaitingResync` and clear it BEFORE the new SSE event loop starts.

## Critical 2: updateCurrentToolPointer breaks on undefined status

In `upsertToolBlock`, a new tool block is created with `status: event.status ?? 'pending'`.
But `updateCurrentToolPointer` was called with raw `event.status`; when undefined, the function's own `if (status === undefined) return;` guard short-circuited without ever pointing at the new (visually-pending) block. Result: `selectCurrentTool` returned `undefined` for daemon events that omitted the explicit `status` field, while the block sat at "pending" in the UI, invisible to the current-tool selector.

Fix: pass the EFFECTIVE status (`event.status ?? 'pending'`) so the pointer logic mirrors the actual stored status.

## Critical 3: clearAwaitingResync flow chicken-and-egg

The earlier (R4) JSDoc documented the recovery flow as: "re-subscribe with `Last-Event-ID: 0`, then call clearAwaitingResync after replay drains." But while the latch is true, EVERY non-passthrough event is dropped at `applyDaemonTranscriptEvent`. So during the replay drain, zero events made it into state, and clearing the latch afterward did nothing: transcript permanently empty.

Correct flow: clear FIRST, then stream events. Updated JSDoc on both `types.ts` interface and `store.ts` impl to document this clearly. Added a regression test (`clearAwaitingResync AFTER dispatching events: events ARE dropped`) that pins the correct flow in code.

## Regression tests (+3)

- `undefined status` creates pending block AND sets currentToolCallId
- clear-then-dispatch ✓ events flow
- dispatch-then-clear ✗ events dropped (correct flow documentation)

## Validation

| | |
|---|---|
| SDK tests | **175/175** (was 172, +3) |
| WebUI tests | **9/9** |
| SDK typecheck | clean |
| WebUI typecheck | clean |

## Note on doudouOUC heads-up

#4469 (main → daemon_mode_b_main sync, 45 commits since 2026-05-19) will land soon. doudouOUC's note says rebase should be smooth (no daemon-ui surface conflicts). Will rebase on the cron's next pass after #4469 merges.
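The ordering problem in Critical 3 is easy to reproduce with a toy latch. `SketchStore` below is a hypothetical stand-in for the real store, the `replayed.*` event names are made up, and the real reducer additionally lets certain passthrough events through while latched.

```typescript
interface SketchEvent {
  type: string;
}

class SketchStore {
  private awaitingResync = false;
  readonly applied: string[] = [];

  dispatch(event: SketchEvent): void {
    if (event.type === 'session.state_resync_required') {
      this.awaitingResync = true; // one-way latch until explicitly cleared
      return;
    }
    if (this.awaitingResync) return; // dropped while latched
    this.applied.push(event.type);
  }

  clearAwaitingResync(): void {
    this.awaitingResync = false;
  }
}

const replay: SketchEvent[] = [{ type: 'replayed.1' }, { type: 'replayed.2' }];

// Wrong order (the R4 JSDoc flow): drain the replay, then clear.
function dispatchThenClear(): string[] {
  const store = new SketchStore();
  store.dispatch({ type: 'session.state_resync_required' });
  for (const event of replay) store.dispatch(event);
  store.clearAwaitingResync();
  return store.applied;
}

// Correct order: clear first, then stream the replay.
function clearThenDispatch(): string[] {
  const store = new SketchStore();
  store.dispatch({ type: 'session.state_resync_required' });
  store.clearAwaitingResync();
  for (const event of replay) store.dispatch(event);
  return store.applied;
}
```

In the wrong order every replayed event is discarded before the latch opens, which matches the "transcript permanently empty" symptom described above.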
Generated with AI

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

* fix(daemon-ui): wenshao R7 - escapeMarkdownText covers `<` + details URL sanitization

Two items from wenshao R7 (one inline Suggestion + one Verification-PASS finding). Both gate-checked as real; fixed.

## escapeMarkdownText: add `<` to escape set

Markdown rendered through markdown-it with `html: true` would previously pass through raw `<img onerror>` / `<script>` from reviewer-untrusted metadata fields (tool title / toolKind / status / permission label / preview labels). The HTML render path already escapes via `defaultEscapeHtml`; this brings markdown to the same safety baseline.

Note: `escapeMarkdownText` is only applied to metadata fields, NOT to assistant/user/thought body text (those are intentionally markdown content; escaping `<` there would mangle legitimate markdown).

## markdown tool details: sanitize URL credentials when sanitizeUrls:true

`daemonBlockToMarkdown`'s `case 'tool':` branch appended `block.details` (serialized `rawInput` JSON) through `text()` which only handled ANSI/bidi. When `rawInput.url` contained credentials (Basic Auth in userinfo / OAuth in `#fragment` / signed-URL query params), the preview path correctly sanitized via `sanitizeUrl`, but the details dump leaked the raw URL.

HTML + plaintext branches exclude details entirely, so they didn't leak. The asymmetry meant a consumer rendering markdown + relying on the R5 fragment-leak protection would still leak via details.

Fix: added `sanitizeUrlsInText(text)` helper that regex-replaces every `https?://` URL in a string with its `sanitizeUrl(url)` form. Applied to `block.details` i…
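A helper of the kind described in the fix might look like the sketch below. The PR's actual `sanitizeUrl` and `sanitizeUrlsInText` are not shown in this thread, so both functions here are assumptions that share only their names with the commit message. The `SENSITIVE_PARAMS` list and the URL-matching regex are illustrative guesses, and the sketch relies on the standard WHATWG `URL` class.

```typescript
// Hypothetical list of credential-bearing query parameters; the PR's real
// list is not shown in this thread.
const SENSITIVE_PARAMS = new Set([
  'access_token',
  'token',
  'sig',
  'signature',
  'x-amz-signature',
  'x-goog-signature',
]);

// Stand-in for the PR's sanitizeUrl: strips userinfo, the fragment, and any
// query parameter named in SENSITIVE_PARAMS.
function sanitizeUrl(raw: string): string {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return raw; // unparseable input is left unchanged in this sketch
  }
  url.username = '';
  url.password = '';
  url.hash = '';
  // Copy the keys first so deleting does not disturb the iteration.
  for (const key of Array.from(url.searchParams.keys())) {
    if (SENSITIVE_PARAMS.has(key.toLowerCase())) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}

// Replace every http(s) URL found in free text with its sanitized form.
// The character class stops at whitespace, quotes, angle brackets, and
// backslashes so that URLs embedded in serialized JSON end at the closing
// quote. Trailing punctuation such as "." is treated as part of the URL.
function sanitizeUrlsInText(text: string): string {
  return text.replace(/https?:\/\/[^\s"'<>\\]+/g, (match) => sanitizeUrl(match));
}

const details =
  '{"url":"https://user:pw@example.com/a?sig=abc&x=1#access_token=zzz"}';
console.log(sanitizeUrlsInText(details));
// {"url":"https://example.com/a?x=1"}
```

Because `URL.toString()` normalizes its input, a bare origin such as `https://example.com` comes back with a trailing slash; a real implementation may want to preserve the original spelling when nothing was stripped.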

feat(daemon): add shared UI transcript layer

5532d6c

chiga0 force-pushed the feat/daemon-ui-core branch from 5532d6c to ba44e08 on May 20, 2026 03:34

feat(web): add daemon web client poc

232cc1e

chiga0 force-pushed the feat/daemon-web-client branch from a055109 to 232cc1e on May 20, 2026 03:54

chiga0 force-pushed the feat/daemon-ui-core branch from ba44e08 to f338454 on May 20, 2026 03:55

chiga0 changed the title ~~feat(web): add daemon web client poc~~ feat(webui): add daemon web renderers May 20, 2026

chiga0 closed this in 9621992 May 20, 2026

chiga0 wants to merge 2 commits into feat/daemon-ui-core from feat/daemon-web-client
