
Dev #14

Merged

im4codes merged 131 commits into master from dev on May 1, 2026
Conversation

im4codes (Owner) commented May 1, 2026

No description provided.

IM.codes and others added 30 commits April 23, 2026 22:39
…tom cleanup

User-visible bug: the chat UI shows user messages and tool activity but no
assistant text bubble for Claude Code CLI sub-sessions (and the same latent
failure mode for Codex CLI). The terminal replay shows Claude's answer fine;
only the web timeline is empty between turns.

Reproduced: open a sub-session that was restored from the persisted session
store, let 2+ minutes pass without sending anything, then chat. Assistant
replies never land in the timeline.

Root cause — two watchers give up waiting for the per-session rollout file
before the user's first turn triggers its creation:

1. `jsonl-watcher.startWatchingFile`
   Polled the specific `<ccSessionId>.jsonl` path once per second for 120s,
   then ran a "phantom cleanup" branch that set stopped=true, removed the
   state, and released file claims. But CC only writes that JSONL on the
   first user turn — which for a restored sub-session that sat untouched
   overnight is nowhere near the 120s window. Result:
   `jsonl-watcher: file never appeared, cleaning up phantom watcher`
   logged, file eventually gets created by CC, nothing tails it, no
   assistant.text events emit. Screenshot showed exactly this trace
   against deck_sub_241y2i1h (ccSessionId fdf5286d-...).

2. `codex-watcher.startWatchingById`
   Same structural issue: 60 iterations × 500ms searching recent session
   dirs for a filename matching the target UUID, then a silent `return
   control;` that left the `WatcherState` orphaned in the `watchers` map
   with stopped=false, no activeFile, no pollTimer. Codex writes its
   rollout on first turn → orphaned entry → no drain → no assistant text.

Fix — convert both "give up" paths into "switch to slow probe and keep
the watcher alive until the session is explicitly stopped":

- jsonl-watcher:
  * After 120s fast-poll fails, install a `setInterval(stat(filePath), 10s)`
    instead of cleaning up. When the file appears, call `activateFile`,
    start the standard 2s drain poll, wire the fs.watch, and fall through
    to the normal hot path.
  * Extracted the drain+rotation poll startup into a reusable
    `startDrainPoll()` so the slow-probe success path uses the identical
    sequence as the fast path — no duplicated 2s-poll body.
  * Added `pendingFilePath`, `pendingProbeTimer` to `WatcherState`; the
    timer is cleared in `stopWatching()` so explicit teardown short-
    circuits the probe.

- codex-watcher:
  * Hoisted the per-iteration scan into a local `tryActivate()` helper.
  * Fast loop (60 × 500ms = 30s) calls `tryActivate()` and returns on hit.
  * Fall-through now installs a `setInterval(tryActivate, 10s)` that
    survives indefinitely, with identical teardown semantics.
  * Added `pendingProbeTimer` to `WatcherState`; `stopWatching()` clears it.

Gemini watcher already has the right shape — its `pollTimer` starts
unconditionally and handles the "file not yet there" case inside
`pollTick`, so it never orphans or phantom-cleans. No change needed there.
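The "never give up, just slow down" shape shared by both fixes can be sketched as below. This is a minimal illustration, not the real jsonl-watcher/codex-watcher code: the `SlowProbe` name, the manual `tick()` driver (standing in for the fast-poll loop plus the 10s `setInterval`), and the injected `tryActivate` predicate are all assumptions made for clarity.

```typescript
// Hypothetical sketch: a probe that burns a fast-poll budget, then keeps
// probing slowly forever instead of running a phantom-cleanup branch.
class SlowProbe {
  private stopped = false;

  constructor(
    // stat()/scan step; returns true once the rollout file exists
    private readonly tryActivate: () => boolean,
    private fastBudget: number, // e.g. 120 x 1s or 60 x 500ms
  ) {}

  /** One poll tick. In the real code 'fast' and 'slow' differ only in cadence. */
  tick(): 'activated' | 'fast' | 'slow' | 'stopped' {
    if (this.stopped) return 'stopped';
    if (this.tryActivate()) return 'activated'; // file appeared: start drain poll
    if (this.fastBudget > 0) {
      this.fastBudget -= 1;
      return 'fast'; // still inside the bounded fast window
    }
    // Old behavior: clean up state here and orphan the session.
    // New behavior: stay alive on a slow cadence until explicit stop.
    return 'slow';
  }

  stop(): void {
    this.stopped = true; // explicit teardown short-circuits the probe
  }
}
```

The key invariant is that only `stop()` (the real `stopWatching()`) ever ends the probe; exhausting the fast budget merely changes the cadence.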

Tests: all 59 existing watcher tests (jsonl-watcher, jsonl-watcher-refresh,
codex-watcher-retrack) green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drives the Gemini CLI via the Agent Client Protocol (JSON-RPC 2.0 over
stdio) through the canonical `@agentclientprotocol/sdk` ClientSideConnection
— the same library Gemini CLI implements on the agent side. One long-lived
child process multiplexes all sessions; cross-process resume works via
ACP `loadSession` against `~/.gemini/tmp/<project>/chats/<id>.json`.

Verified end-to-end (scripts/smoke-gemini-sdk.mjs) that a session persists
across provider disconnect/reconnect including tool calls and memory recall.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… path

The previous commit registered the provider and added the agent type, but
left it off the user-facing surfaces: SESSION_AGENT_CHOICES, NewSessionDialog
union, subsession icon map, agent badge, all 7 i18n locales, and the
daemon's fresh-session launch branch.

Modelled the launch path on codex-sdk (no preset, no resume id, optional
requested model) — ACP mints its own sessionId on first turn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…opped response

The in-flight dedup lock in git-status-store didn't distinguish between
"a request is actively in flight" and "a previous request's response never
arrived". Once `inFlightRequestId` got set (e.g. WS reconnect, daemon
restart mid-request, serverLink.send throw), every subsequent call — even
`force=true` from the refresh button — took the "queued and return" branch
and never fired a new WS request. Users saw stale pre-commit file lists
with the refresh button appearing to do nothing.

Fix: `force=true` now always abandons any stuck in-flight request and fires
fresh. Non-force paths (the 30s poll, re-subscribe on mount) also self-heal
once a 15s timeout elapses, so dropped responses don't poison the cache
forever. `settleSharedChangesRequest` also now only clears tracking when
the settled requestId matches the current in-flight one, preventing a
late stale response from nulling out a newer refresh's tracking.

Regression tests cover all four corner cases in web/test/git-status-store.test.ts.
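The dedup-lock semantics above can be sketched as a small tracker. This is an illustrative reduction, not the real git-status-store: the class name, the injected `now` timestamp, and the id scheme are assumptions; only the three behaviors (force always refires, non-force self-heals after the 15s stale window, late settles for old ids are ignored) come from the commit.

```typescript
// Hypothetical sketch of the in-flight dedup lock with self-healing.
const STALE_MS = 15_000;

class InFlightTracker {
  private inFlightRequestId: string | null = null;
  private startedAt = 0;
  private nextId = 0;

  /** Returns a fresh request id if a new WS request should fire, else null. */
  begin(force: boolean, now: number): string | null {
    if (this.inFlightRequestId !== null) {
      const stale = now - this.startedAt >= STALE_MS;
      // Genuine in-flight request and no force: queue and return (old behavior).
      if (!force && !stale) return null;
      // force=true, or a dropped response: abandon the stuck request and refire.
    }
    this.inFlightRequestId = `req_${this.nextId++}`;
    this.startedAt = now;
    return this.inFlightRequestId;
  }

  /** Clears tracking only when the settled id matches the current in-flight one. */
  settle(requestId: string): void {
    if (this.inFlightRequestId === requestId) this.inFlightRequestId = null;
  }
}
```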

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…abels

Codex SDK 0.124.0 exposes ModelReasoningEffort = "xhigh" (Extra High),
matching the "Extra high" option shown in its interactive TUI. Add it to
TRANSPORT_EFFORT_LEVELS and CODEX_SDK_EFFORT_LEVELS so the reasoning
level selector shows all five Codex options (minimal / low / medium / high
/ extra high).

Also add formatEffortLevel() to shared/effort-levels.ts and wire it into
all three effort-selector components so "xhigh" renders as "Extra High"
instead of the raw identifier — and all existing levels get proper title-
case labels too (Low, Medium, High, Max, etc.).
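A minimal sketch of what such a `formatEffortLevel()` helper might look like; the special-case map and the title-casing fallback are assumptions reconstructed from the description, not the actual `shared/effort-levels.ts` source.

```typescript
// Levels whose display label isn't just a title-cased identifier.
const SPECIAL_LABELS: Record<string, string> = {
  xhigh: 'Extra High', // Codex SDK's ModelReasoningEffort = "xhigh"
};

function formatEffortLevel(level: string): string {
  const special = SPECIAL_LABELS[level];
  if (special) return special;
  // Default: title-case the raw identifier (low -> Low, medium -> Medium, ...)
  return level.charAt(0).toUpperCase() + level.slice(1);
}
```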

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The xhigh (Extra High) effort level added in 10da900 caused the
codex-sdk thinking test to match both "High" and "Extra High" buttons
with /high/i. Use /^high$/i (same pattern as the qwen test and nearby
assertion at line 2974).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When the AI compression backend (claude-code-sdk / codex-sdk / qwen) is
unavailable, the compressor falls back to a local summarizer whose
output embeds a user-visible "⚠️ Structured summary unavailable" banner
plus raw event transcripts. Previously, once the retry budget was
exhausted the coordinator committed that fallback text verbatim to the
recent_summary projection "to avoid unbounded growth". Because
buildLocalFallbackSummary prepends the prior summary + "--- Updated ---"
on every call, repeated failures produced nested fallback banners that
compounded in memory and were promoted to durable storage — the Russian-
doll of warning banners users have been observing.

New behavior for fromSdk === false:
- retry budget remaining → keep staged events, mark job
  materialization_failed, write NO projection. Prior real summary
  (if any) is untouched.
- retry budget exhausted → abandon the batch: delete staged events
  (prevents unbounded growth), clear dirty target, mark job completed
  with an "abandoned" error note. Still NO projection written.

Backend downtime now leaves a gap in memory rather than a scar.
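The branch structure described above can be sketched as a pure decision function. Names and the result shape are illustrative, not the coordinator's real API; only the three outcomes and their triggers come from the commit text.

```typescript
// Hypothetical sketch: what happens to a summary batch depending on whether
// the summary came from the SDK backend and how much retry budget remains.
type Outcome =
  | { action: 'write_projection' }                                  // real SDK summary
  | { action: 'retry_later'; jobStatus: 'materialization_failed' }  // staged events kept
  | { action: 'abandon_batch'; jobStatus: 'completed' };            // staged events deleted

function decideSummaryOutcome(fromSdk: boolean, retriesRemaining: number): Outcome {
  if (fromSdk) return { action: 'write_projection' };
  if (retriesRemaining > 0) {
    // Keep staged events for the next attempt; prior real summary untouched.
    return { action: 'retry_later', jobStatus: 'materialization_failed' };
  }
  // Budget exhausted: drop the batch so fallback banners can't nest, and
  // never write the fallback text to the recent_summary projection.
  return { action: 'abandon_batch', jobStatus: 'completed' };
}
```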

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Commit 262b0b3 switched the codex-sdk effort-menu matcher from
/high/i → /^high$/i to avoid matching "Extra High" after xhigh was
added. But the menu items render as "○ High" / "○ Extra High" (decoration
prefix + formatted label), so /^high$/i matches nothing and the test
fails in CI with "Unable to find an accessible element with the role
button and name /^high$/i".

Use an exact string match on "○ High" — unambiguous, matches the actual
accessible name, and won't collide with "○ Extra High".

The qwen /high/i matcher elsewhere in the file is intentionally left
alone: qwen's level list has no "Extra High", so substring matching is
safe there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the CPU-bound parse pipeline out of the daemon's main event loop.
Every JSONL line that Claude Code writes previously triggered a synchronous
JSON.parse + regex + block-interpretation + event emission on main, which
competed with WebSocket inbound routing under high agent throughput.

Architecture
- `jsonl-parse-core.ts` — pure parser: no timelineEmitter, no module state.
  Returns an ordered list of EmitInstruction. Shared between main-thread
  fallback and worker.
- `jsonl-parse-worker.ts` + `jsonl-parse-worker-bootstrap.mjs` — Worker
  entry. Bootstrap is plain ESM JS so `new Worker()` can load it without a
  TS loader; it best-effort registers `tsx/esm/api` in dev, then imports
  the real worker module. Production loads the compiled .js directly.
- `jsonl-parse-pool.ts` — main-side client: lazy worker spawn, id-
  correlated requests, timeout + error handling. On worker crash the pool
  marks itself permanently disabled and all future parseLines calls fall
  back to the main-thread context transparently.
- `jsonl-watcher.ts` — drainNewLines batches complete lines into a single
  parseLines request. `emitRecentHistory` uses a scratch ParseContext
  (pure path) so history replay doesn't leak into live state. stopWatching
  forgets session state in both contexts.
- `scripts/copy-worker-bootstraps.mjs` — postbuild step that copies
  `src/**/*.mjs` into `dist/src/` (tsc ignores .mjs, so we ship them
  manually).

Flag semantics
`IM4CODES_JSONL_WORKER` defaults to **on**; it's an operational kill
switch. Set to 0/false/no/off/"" to force main-thread parsing (e.g. for
diagnostics without a redeploy).
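The flag semantics above suggest a parser like the following. This is a sketch of the stated behavior (default on; 0/false/no/off/empty disable), not the real `isJsonlWorkerEnabled()`; note a later commit in this PR flips the default to opt-in.

```typescript
// Values that force main-thread parsing when assigned to IM4CODES_JSONL_WORKER.
const OFF_VALUES = new Set(['0', 'false', 'no', 'off', '']);

function isJsonlWorkerEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env['IM4CODES_JSONL_WORKER'];
  if (raw === undefined) return true; // unset: worker path on by default
  return !OFF_VALUES.has(raw.trim().toLowerCase());
}
```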

Safety guarantees
1. Parity tests in `test/daemon/jsonl-watcher.worker.test.ts` run the SAME
   JSONL fixture through both code paths (real fs.watch + real Worker vs
   main-thread fallback) and assert byte-identical timeline event
   sequences. Any future parse change that diverges the paths fails CI.
2. Pool integration tests in `test/daemon/jsonl-parse-pool.test.ts` spin
   up the REAL Worker (no mocks) and verify: concurrent id correlation,
   cross-request tool_use/tool_result state, large batches, timeout
   behavior, Edit-tool deferred emit, forgetSession cleanup, shutdown-
   reuse, bad-payload resilience.
3. `jsonl-parse-core.test.ts` covers parse correctness in isolation.
4. Automatic fallback on worker crash/timeout means users see no
   functional regression even if the worker path breaks.

Test counts: 12 core + 12 pool + 4 parity = 28 new tests. Full daemon
suite: 2357 pass / 52 skip. Server: 381 pass. Web: 901 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hatch

Rationale: the project is moving hot-path work to SDK-based transport
providers, and Claude JSONL parsing is no longer a priority to offload.
Rather than revert the worker entirely (and throw away the parity tests),
make it opt-in so the code keeps paying its own way:
  - If a specific deployment ever sees main-loop pressure from heavy
    Claude streams, flip `IM4CODES_JSONL_WORKER=1` and restart — no deploy
    needed.
  - The parity tests keep guaranteeing worker/main produce byte-identical
    timeline events, so turning it on later is safe.

Changes:
  - `isJsonlWorkerEnabled()` default → false; opt in with 1/true/yes/on.
  - Remove unused `dispatchEmits` helper that lint flagged.
  - Update tests to match new default (also fixes a timing-sensitive
    failure in `claude-no-text-refresh.test.ts` that hit when the worker
    path was on by default on slower CI runners).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
useTransportModels now tracks ws.connected and re-fetches on reconnect
via DAEMON_MSG.RECONNECTED. Prevents the model picker from staying empty
after a daemon restart or WS reconnect while a session is open.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
GeminiSdkProvider.readModelList() probes ACP session/new to get the
available Gemini models (auto-gemini-3, gemini-2.5-pro, etc.) and caches
them for the lifetime of the connection. Subsequent sessions reuse the
cached list without extra RPC calls.

Wire the discovery through gemini-runtime-config.ts → command-handler
transport.list_models → useTransportModels hook (adds gemini-sdk to
supportsDynamicTransportModels) → NewSessionDialog model picker.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Every provider that exposes a model picker now implements a single
standard method instead of ad-hoc per-type branching in command-handler.

  TransportProvider.listModels?(force?) → ProviderModelList

Each provider delegates to its existing runtime-config:
  codex-sdk       → getCodexRuntimeConfig()
  copilot-sdk     → getCopilotRuntimeConfig()
  cursor-headless → getCursorRuntimeConfig()
  gemini-sdk      → ACP session/new probe (cached)

handleTransportListModels collapses from 60 lines of if-chains to a
single provider.listModels() dispatch. New providers get model-picker
support for free just by implementing the interface.
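The contract's shape might look like the sketch below. The types here are reconstructions from the description (the real `TransportProvider` interface is presumably async and richer); the placeholder model names are invented for the test only.

```typescript
// Illustrative shape of the unified listModels contract.
interface ProviderModelList {
  models: string[];
  defaultModel?: string;
}

interface TransportProvider {
  listModels?(force?: boolean): ProviderModelList;
}

// command-handler side: one dispatch replaces the 60 lines of if-chains.
function handleTransportListModels(provider: TransportProvider): ProviderModelList {
  if (!provider.listModels) return { models: [] }; // provider exposes no picker
  return provider.listModels();
}
```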

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tract tests

ClaudeCodeSdkProvider now implements the standard listModels() interface
returning the static Claude roster (opus[1M] / sonnet / haiku). Exposes
the model picker in NewSessionDialog so users can switch models at session
creation, matching codex-sdk / copilot-sdk behaviour.

Also adds test/agent/providers/list-models.test.ts — the first dedicated
test suite for the TransportProvider.listModels() contract, covering all
five providers and error/empty edge cases. Ensures every future provider
in supportsDynamicTransportModels implements the interface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lts model picker

When a qwen preset is selected in the supervision settings (both Global
defaults and per-session), the Model dropdown was ignoring the preset and
always showing the full QWEN_MODEL_IDS list — mixing OAuth-only models
with Coding Plan models regardless of which tier the preset uses.

Root cause: the ccPresets local state was stripped to { name, env? },
dropping the availableModels and defaultModel fields that cc.presets.list_response
and cc.presets.discover_models_response carry. getSupervisionModelOptions()
was called with only the backend, never the selected preset.

Fix:
- Preserve availableModels and defaultModel when decoding cc.presets.list_response
- When a preset is selected, use preset.availableModels (if non-empty) as the
  model options; fall back to getSupervisionModelOptions(backend) otherwise
- getPresetPinnedModel() now prefers preset.defaultModel over env.ANTHROPIC_MODEL
  so the discovered model (set by cc.presets.discover_models) wins
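The selection logic in the fix can be sketched as below. The `CcPreset` shape and the injected fallback are assumptions standing in for the real `cc.presets` payload and `getSupervisionModelOptions(backend)`.

```typescript
// Hypothetical preset shape preserving the fields the bug dropped.
interface CcPreset {
  name: string;
  env?: Record<string, string>;
  availableModels?: string[];
  defaultModel?: string;
}

function getModelOptions(
  preset: CcPreset | undefined,
  fallbackForBackend: () => string[], // stands in for getSupervisionModelOptions(backend)
): string[] {
  if (preset?.availableModels && preset.availableModels.length > 0) {
    return preset.availableModels; // the preset's tier wins over the full list
  }
  return fallbackForBackend();
}

function getPresetPinnedModel(preset: CcPreset): string | undefined {
  // The discovered defaultModel wins over env.ANTHROPIC_MODEL.
  return preset.defaultModel ?? preset.env?.['ANTHROPIC_MODEL'];
}
```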

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
IM.codes and others added 29 commits April 28, 2026 19:32
A finished assistant.thinking event with no body text and ~0s duration
rendered as a meaningless "~ Thought for 0s" pill. Suppress it so the
timeline isn't cluttered by no-op thoughts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-lands the unified desktop window stack after the original attempt
(commit 31f2a56, reverted in 7c4e43b) caused a render/fetch storm.
The class of bug it produced — every pointer-down inside any managed
window cloned the stack object, invalidated `useMemo([..., stack])`
deps in App, remounted ChatView, and refired the timeline.history
fetch effect 30+ rps per open session — is now structurally
prevented by the new state shape.

What's different this time:

  - `web/src/window-stack.ts` — `MutableDesktopWindowStack` class
    that mutates entries in place. `ensureWindow` /
    `bringToFront` / `removeWindow` return `boolean` and are no-ops
    when the request is already in effect (`bringToFront` on the
    frontmost window returns `false` — the load-bearing render-
    stability guarantee).
  - `web/src/app.tsx` — stack instance held in a stable `useRef`;
    re-renders driven by a `useState` version counter that bumps
    only when a stack mutation reports a real change. Memo dep
    lists are primitives only (`stackVersion`, `openSubIdsKey`).
    `useEffect` blocks sync each show-boolean (showRepoPage,
    showCronManager, showDiscussionsPage, showDesktopFileBrowser,
    showDesktopLocalWebPreview, showSharedContextManagement,
    showSharedContextDiagnostics, previewFileRequest) into the
    stack so every open/close path is covered uniformly.
  - `web/src/components/SubSessionWindow.tsx` — accepts new optional
    `desktopFileBrowserZIndex` / `onDesktopFileBrowserOpen` /
    `onDesktopFileBrowserFocus` / `onDesktopFileBrowserClose` props
    and uses the shared parent-child banded ordering for the
    delegated file-browser child window. Mobile branch is unaffected.
  - `web/src/components/FloatingPanel.tsx` — no code change. The
    existing `zIndex` + `onFocus` props are already the right shape;
    the previous attempt's churn there was unnecessary.
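The load-bearing stability guarantee above can be reduced to a sketch. This is a deliberately tiny stand-in for `MutableDesktopWindowStack` (no child banding, no singletons): the point is the in-place mutation plus boolean change reporting that lets App bump its version counter only on real changes.

```typescript
// Minimal illustration of a mutate-in-place window stack.
class MiniWindowStack {
  private order: string[] = []; // back -> front

  ensureWindow(id: string): boolean {
    if (this.order.includes(id)) return false; // idempotent register
    this.order.push(id);
    return true;
  }

  bringToFront(id: string): boolean {
    const idx = this.order.indexOf(id);
    if (idx === -1) return false;
    if (idx === this.order.length - 1) return false; // already frontmost: no-op
    this.order.splice(idx, 1);
    this.order.push(id);
    return true;
  }

  removeWindow(id: string): boolean {
    const idx = this.order.indexOf(id);
    if (idx === -1) return false;
    this.order.splice(idx, 1);
    return true;
  }

  zIndexOf(id: string): number {
    return this.order.indexOf(id); // stable while the order is unchanged
  }
}
```

In App terms: the instance lives in a `useRef`, and a `useState` version counter is bumped only when a call returns `true`, so 100 redundant `bringToFront` calls on the frontmost window produce 0 re-renders.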

Coverage:

  - 32 new unit tests (web/test/window-stack.test.ts) — covers
    idempotent register, child-above-owner banded ordering,
    singleton reopen, removal, getFrontmostMatching, and the
    "100 redundant calls produce 0 changes" invariant at the
    stack level.
  - 5 new render-stability integration tests
    (web/test/window-stack-render-stability.test.tsx) — exercises
    the React integration shape used by app.tsx and asserts:
      * 100x bringToFront on frontmost → 0 ChildHarness remounts
      * 100x bringToFront on frontmost → 0 fetch effect refires
      * memoized z-index for unchanged window stays referentially
        stable under unrelated peer interactions
      * lint-style grep against app.tsx forbids `desktopWindowStack`
        or `stackRef.current` in any hook dep array
  - FloatingPanel and SubSessionWindow tests extended.

Full web suite: 89 files / 1103 passing. Daemon + web typecheck clean.

Identity note: `cronmanager` in the spec was aligned to the deployed
FloatingPanel id `cron` to preserve `rcc_float_cron` saved geometry; the
openspec artifacts under `openspec/changes/unify-floating-window-stack`
(gitignored, local-only) were updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Section 7.4 dep-array hygiene test resolved app.tsx via
`path.join(process.cwd(), 'web', 'src', 'app.tsx')`, which works locally
(cwd = repo root) but fails in CI where the Web Tests (Unit) job runs
`cd web && vitest` so cwd = web/ and the path becomes web/web/src/app.tsx.

A previous attempt used `import.meta.url` + `fileURLToPath` but the
web-unit vitest config gives `import.meta.url` a non-`file://` scheme
(ERR_INVALID_URL_SCHEME), so that path is unusable.

Switch to a small candidate list probed by `fs.access` — works under
either cwd and survives whatever transform pipeline vitest applies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The HTTP backfill path (`fireHttpBackfill`) was already gated by
`isActiveSessionRef.current`, but the parallel WS-side path
(`ws.sendTimelineHistoryRequest`) had no such gate — every mounted
`useTimeline` instance fired its own daemon history request on mount,
on `DAEMON_MSG.RECONNECTED`, on browser-WS `session.event=connected`,
and on the empty-response retry. With N SubSessionCards mounted in the
SubSessionBar, that meant N concurrent `timeline.history_request` calls
to the daemon every reconnect for sessions the user had never opened.

Real-world hit: an OpenCode-typed sub-session (`deck_sub_2f3t5g5g`) the
user never opened was polled 128 times in ~3 minutes, with the daemon
spending 1.5–2.5 s in `recoverOpenCodeSessionRecord` (tmux fork +
OpenCode history scan) and another 1–1.7 s in `exportOpenCodeSession +
buildTimelineEventsFromOpenCodeExport` per request — all on the hot
path, no caching. p90 totalMs blew past the web client's 2.5 s
`fetchTimelineHistoryHttp` AbortController and surfaced as 200/0 B
"timeouts" in DevTools.

Add the same `isActiveSessionRef.current` gate to all four WS-side
call sites: the inner `requestDaemonHistory` helper used by every
mount-path branch, the `DAEMON_MSG.RECONNECTED` block, the browser-WS
reconnect block, and the 1-s empty-response retry. The mount-effect's
dep array already includes `isActiveSession`, so when the user later
activates an inactive card its hook re-runs and the bootstrap path
fetches as normal. Inactive cards still render previews from
memory/IDB cache and continue to receive live WS event pushes.

Expected impact: ~108 timeline.history_request/min → roughly 1/min
(only the active window's hook gates open). The OpenCode synthesize
hot-path remains; that's a separate caching change to follow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an opt-in "remember password" checkbox on the password login form,
defaulting to checked so the credentials persist by default for users on
trusted personal devices. Hydrates username + password from localStorage
on mount; saves on successful login (and on the post-change password
flow); wipes immediately when the user unticks the box.

Storage:
  - rcc_login_remember  — '1' / '0' (absent = default '1')
  - rcc_login_username  — saved username
  - rcc_login_password  — saved password (cleartext; opt-in via checkbox)

UI: small checkbox row between the password input and the Sign In button.

i18n: new key `login.remember_password` added to all seven locale files
(en/es/ja/ko/ru/zh-CN/zh-TW) per the project's mandatory i18n rules.

Tests:
  - 7 new tests in web/test/components/LoginPage.test.tsx covering
    default-checked, prior-off honored, hydration, no hydration when
    off, persist on success, immediate wipe on uncheck, and no-save
    when off.
  - jsdom's `window.location.reload` is stubbed via `vi.stubGlobal`
    rather than `vi.spyOn` (the latter trips Cannot redefine property).

Full web suite: 89 files / 1110 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… %APPDATA%\npm

Symptom (reported in the wild): Windows daemon upgrade reports "install
succeeded" but the daemon comes back at the OLD version (e.g. stuck on
1917 across multiple upgrade cycles). The upgrade script verifies the
new shim's --version and aborts on mismatch — so npm install really did
install the new version — yet on restart the watchdog launches old code.

Root cause: `writeWatchdogCmd` hardcoded the launch line as
  `call "%APPDATA%\npm\imcodes.cmd" start --foreground`
which only matches when the user's npm global prefix is the default
`%APPDATA%\npm`. For users on nvm-windows, fnm, volta, system-wide
nodejs, or anyone who ran `npm config set prefix <other>`, npm writes
the new shim to a DIFFERENT location while the watchdog keeps launching
a stale `%APPDATA%\npm\imcodes.cmd` left over from a prior default-
prefix install. Result: the upgrade runs perfectly, the watchdog runs
the wrong shim, and the user sees the same old version forever.

Fix:
  - Resolve the real npm prefix from this module's own install path
    (`paths.imcodesScript`), which is exactly where npm just put us.
  - Emit the env-var form (`%APPDATA%\npm\imcodes.cmd`) ONLY when the
    resolved prefix actually matches `%APPDATA%\npm` — preserves the
    non-ASCII-username-safe behavior for default-prefix users.
  - For every other prefix, emit the absolute resolved shim path so
    the watchdog launches the same binary npm just installed.
  - Defensive fallback: if `APPDATA` env var is unset (or empty),
    always use the absolute path.

Also: switch the npm-prefix detection regex/join from the platform-
default `path.dirname` / `join` to `path.win32` explicitly. The watchdog
.cmd is a Windows-only artifact and the input paths use backslashes;
on POSIX dev/CI machines the default `path.dirname` doesn't recognise
backslashes and silently collapses the whole path to ".", which would
have prevented unit tests from exercising the prefix-detection logic.
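The prefix resolution can be sketched as a pure function using `path.win32` explicitly, as the commit describes. `buildLaunchTarget` is an illustrative name, not the real `writeWatchdogCmd` code; only the decision rule (env-var form for the default prefix, absolute path otherwise, absolute fallback when APPDATA is unset) comes from the commit.

```typescript
import { win32 } from 'node:path';

// Decide what the watchdog .cmd should launch, given the shim npm just
// installed and the APPDATA env var (possibly unset).
function buildLaunchTarget(shimPath: string, appData: string | undefined): string {
  // The real npm prefix is the directory containing our own shim.
  const prefix = win32.dirname(shimPath);
  const defaultPrefix = appData ? win32.join(appData, 'npm') : null;
  if (defaultPrefix && prefix.toLowerCase() === defaultPrefix.toLowerCase()) {
    // Default-prefix users keep the env-var form (non-ASCII-username safe).
    return '%APPDATA%\\npm\\imcodes.cmd';
  }
  // nvm-windows / fnm / volta / custom prefix, or APPDATA unset:
  // launch the exact binary npm just installed.
  return shimPath;
}
```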

Tests:
  - Stub APPDATA in beforeEach so the existing default-prefix
    assertions exercise the real isDefaultPrefix branch.
  - New test: custom prefix (`C:\nvm-versions\v20.11.0\…`) emits the
    absolute path and does NOT contain `%APPDATA%\npm\imcodes.cmd`.
  - New test: APPDATA unset falls back to absolute path.

Daemon test suite: 103 passing in test/util/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-up, resume near bottom

Per the user complaint "chat window scrolling vs auto-update: auto-scroll
should pause while the user is scrolling, and auto-update should resume when
they scroll to the bottom or very close to it; right now the scrolling
experience has to be sacrificed for auto-update"
(canonical sticky-bottom: when reading older history the viewport must NOT yank
to bottom on every new event; when at/near bottom new events MUST follow). Three
distinct paths in `ChatView.tsx` were calling `scrollToBottom()` unconditionally
on every viewItems / lastVisibleTs / initial-mount change, ignoring whether the
user had scrolled away. Plus a 60-s idle-resume `setInterval` overrode user
intent after a minute of inactivity.

Changes (per the round-3 plan in c35f9eeb-dbd discussion):

  1. Split `scrollToBottom()` into mechanism + policy:
     `scrollToBottom(engageFollow = true)` does the motion;
     callers that should NOT engage follow pass `false`.
     The default is `true` so the public callback exposed via
     `onScrollBottomFn` (consumed by SessionPane and SubSessionWindow's
     send paths) keeps its existing "force jump + re-engage" semantics.

  2. Gate the three event-driven scroll paths on
     `(preview || autoScrollRef.current)`:
       * `useLayoutEffect` on [preview, viewItems, loading, loadingOlder]
         — was the load-bearing bug; also bumps a `newSinceUnfollow` count
         when paused so the floating "↓" button can surface unread arrivals.
       * `useEffect` on [lastVisibleTs, preview] (the timestamp fallback) —
         re-checks at rAF fire time so a state flip during the frame window
         is honoured.
       * `useLayoutEffect` on [lastVisibleTs] (the initial-mount synchronous
         scroll) — used to call `scrollToBottom()` with the engage-follow
         default, which the session-change effect's reset of
         `hasInitialScrolledRef` could re-trigger on later lastVisibleTs
         changes, forcibly re-engaging follow against the user. Now uses
         `scrollToBottom(false)` and additionally gates on
         `autoScrollRef.current` so paused state survives.

  3. `handleScroll` programmatic-scroll guard: a recent `scrollToBottom()`
     marks an upcoming synthetic event for swallow with a bounded
     one-shot count and a 200 ms watchdog. Position-aware so iOS layout
     shifts that reset scrollTop to 0 still reach the existing
     transient-top-jump recovery path.

  4. `handleScroll` adaptive + hysteresis thresholds: disengage at
     `max(180, 0.25 * clientHeight)` from the bottom, re-engage at
     `max(60, 0.10 * clientHeight)`. Fixes mobile over-engagement
     (a flat 150 px swallowed ~42 % of a 360 px landscape viewport) and
     boundary flicker during streaming layout.

  5. Remove the 60-s `SCROLL_IDLE_RESUME_MS` `setInterval` — the exact
     "auto-update fights scroll experience" trade-off the user objected to.

  6. Floating "↓" button: append a count of new messages arrived while
     paused, with descriptive aria-label. Click resets the count and
     re-engages follow.

  7. End-key keyboard parity for the "↓" button.

  8. `overflow-anchor: none` on `.chat-view` so the browser's default
     `overflow-anchor: auto` doesn't fight our explicit `scrollTop` writes
     on streaming append.

  9. `prefers-reduced-motion: reduce` is now honoured at the one
     `behavior: 'smooth'` site (`scrollIntoView` for the pinned-last-sent
     jump).

Test surgery: rewrote the broken `forces the main chat view to follow streamed
updates with the same timestamp` test (which encoded the wrong contract) into
`does not move the main chat viewport on same-timestamp streamed updates after
the user scrolls away from bottom`. Added four new tests covering: near-bottom
still follows (no over-correction), newer-timestamp arrivals stay put while
paused, no auto-resume after idle, and the new-message count badge increments.
All 30 ChatView tests pass; full web suite 1114/1122 (8 pre-existing skips).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop both the live "thinking…" indicator and the finalized "Thought for Xs"
summary block — the agent's running state and memory-context card already
give enough signal. Removes the now-unused ThinkingEvent/ActiveThinkingLabel
components, ChatEvent's nextTs prop, and the useNowTicker import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous tests asserted preference-gated visibility of `.chat-thinking`,
but ChatView now always returns null for assistant.thinking. Collapse
the two cases into a single test that loops through all preference
values and verifies the element is never rendered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Selecting 3 members in the P2P panel could still launch every active
sub-session because expandAllTargets defaulted missing entries to
"include" — any sub-session created after the last save sneaked in.
Switch to strict allowlist semantics and add daemon-side gates so
empty/missing configs surface a clear error instead of silently
expanding.

- shared: add NO_SAVED_CONFIG, NO_ENABLED_PARTICIPANTS,
  TOO_MANY_PARTICIPANTS error codes and MAX_P2P_PARTICIPANTS=5
- daemon: gate every structured P2P start on saved config + ≥1
  enabled member + cap 5; flip expandAllTargets to inclusion-only
- web: cap participants at toggle and save time in P2pConfigPanel;
  surface the new error codes as toast notifications in app.tsx

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fix(p2p) commit added Gate 1 / Gate 2 in handleSend so structured
P2P starts via the dropdown / __all__ / p2pMode paths require a saved
config with ≥1 enabled member. Three pre-existing tests asserted the
old "missing config = expand to every active session" behavior.

- "...config filtering removes all otherwise-valid targets" now expects
  NO_ENABLED_PARTICIPANTS (Gate 2 fires before the post-expansion
  filter would have produced NO_CONFIGURED_TARGETS) — semantically
  more accurate: the config exists but selects nobody.
- "p2pAtTargets with __all__" and "p2pMode field expands to all" now
  seed a saved config that allowlists every target, so the expansion
  has members to expand to under the new strict allowlist semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The published tarball already neutralizes its own lifecycle scripts so
sharp's `node install/check.js || npm run build` fallback can't hijack
imcodes's tsc. But that doesn't help the second failure mode: npm 11
global install half-extracts sharp's transitive deps (detect-libc,
semver, @img/colour) as empty placeholder dirs, the daemon crashes on
first @huggingface/transformers import, and semantic search permanently
sticky-disables. The daemon-side auto-upgrade dodges this with an
inline bash repair, but human users running `npm install -g imcodes@dev`
have no equivalent protection.

- Add src/util/postinstall-sharp-repair.ts: a published-tarball-only
  postinstall that checks each SHARP_REQUIRED_DEP, wipes empty
  placeholders, and runs `npm install --no-save --ignore-scripts
  sharp@0.34.5` from the imcodes package root. Re-uses the existing
  SHARP_REQUIRED_DEPS so the allowlist stays in lockstep with the bash
  repair. Skips on dev checkouts (.git walk-up) and unexpected
  contexts (no dist/), and is wrapped to always exit 0 — a failed
  postinstall must never break npm install.
- Update scripts/strip-onnxruntime-gpu.mjs prepack: keep neutralizing
  every script EXCEPT postinstall, then force-write
  `postinstall: node dist/src/util/postinstall-sharp-repair.js`. Add a
  fail-fast check that the bundled script actually exists before
  letting `npm pack` complete.
- Daemon auto-upgrade path is unchanged — it still passes
  `--ignore-scripts` and runs the bash repair, which short-circuits
  before the new postinstall fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on Windows

npm normally exposes npm_execpath as the JS entrypoint (npm-cli.js), but
some Windows configurations set it to the .cmd/.bat shim instead. Node's
spawn refuses to launch those without the shell, so the repair would
fail with ENOENT before the nested install ran. Route .cmd/.bat shims
through shell:true to match the unset-execpath fallback's behavior.
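The routing decision reduces to a small predicate on the executable path; `needsShell` and `spawnNpm` here are illustrative names for a sketch of that behavior, not the shipped helper.

```typescript
import { spawnSync } from 'node:child_process';

// Sketch of the .cmd/.bat routing described above. Node's spawn cannot
// launch Windows batch shims directly; they must go through the shell,
// matching the unset-execpath fallback's behavior.
function needsShell(execPath: string): boolean {
  return /\.(cmd|bat)$/i.test(execPath);
}

function spawnNpm(execPath: string, args: string[]) {
  return spawnSync(execPath, args, {
    shell: needsShell(execPath),
    stdio: 'inherit',
  });
}
```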

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Windows unit-tests job runs an explicit cherry-picked file list, so
the new test/util/postinstall-sharp-repair.test.ts (added with the
npm-publish self-heal change) was only running on macOS + Linux. Add it to
the Windows job so the Windows-specific code paths (npm_execpath as
.cmd/.bat shim, shell:true PATH resolution of npm.cmd) actually get
exercised on a real win32 runner before we ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ures

The Windows CI run failed with "npm stub log missing" for two tests but
no actionable detail in the failure output (the script's stderr is
captured by spawnSync, not vitest's). Add an opt-in
IMCODES_POSTINSTALL_DEBUG=1 trace that prints pkgRoot, every git-walk
step, the resolved
spawn argv, and the child's status/stdout/stderr. Tests set the env var
and dump everything (plus a recursive workdir listing) into the
assertion message on failure so the next CI run shows which guard fired
or where spawn errored.

Production npm installs see no extra output — DEBUG is gated behind the
env var.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n Windows

The Windows CI run revealed the test was actually invoking the real
toolchain npm (~30s of real `npm install sharp@0.34.5` from the
registry) instead of the local fake-npm.mjs stub. Root cause: Windows
env vars are case-insensitive, but Node's process.env preserves the
casing the underlying API hands back. The outer `npm test` lifecycle
sets npm_execpath; when the test does `{ ...process.env,
npm_execpath: stubNpm }`, both keys can end up in the spawn's env
block under different casings. Windows' merge then picks the
inherited (real npm) one and the stub log is never written.

Fix: explicitly strip every case-variant of npm_execpath from the
inherited env before injecting the stub so the child sees exactly
one key. Production behavior is unchanged — real npm only sets one
copy when invoking postinstall, so the script always sees the right
value.
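The strip-then-inject step can be sketched as a small helper; `withSingleEnvKey` is an illustrative name, not the test's actual code.

```typescript
// Sketch of the fix described above: remove every case-variant of a key
// from an inherited env before injecting exactly one canonical copy, so
// Windows' case-insensitive merge cannot pick the inherited value.
function withSingleEnvKey(
  env: Record<string, string | undefined>,
  key: string,
  value: string,
): Record<string, string | undefined> {
  const out: Record<string, string | undefined> = {};
  const lower = key.toLowerCase();
  for (const [k, v] of Object.entries(env)) {
    // Drop npm_execpath, NPM_EXECPATH, Npm_ExecPath, ... all variants.
    if (k.toLowerCase() !== lower) out[k] = v;
  }
  out[key] = value;
  return out;
}
```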

Note: the Linux Node 22 failure in the same run is an unrelated
"database is locked" SQLite flake in test/daemon/timeline-projection
.test.ts that doesn't reproduce on Node 24.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ardening

Implements the daemon memory pipeline overhaul per
`memory-system-1.1-foundations`:

- Removes legacy `src/memory/{claude-mem,mem0,detector,injector,
  context-builder,extractor,interface}.ts` dead code.
- Adds durable `context_event_archive` + `context_projection_sources`
  reverse index, sticky-FIFO `mergeSourceIds(cap=200, sticky=10)`,
  `summary_fingerprint` with partial UNIQUE INDEX, atomic
  `INSERT…ON CONFLICT DO UPDATE` dedup with bounded SQLITE_BUSY retry,
  retention sweeper with NOT-IN guard, resumable archive/fingerprint
  backfill, and `context_meta` migration sentinel.
- Replaces char-heuristic budgets with vendored `@anthropic-ai/tokenizer`;
  composite token-density trigger (auto/idle/schedule/force-fire ceiling)
  with event-count floor.
- Adds tool-specific cheap pre-pass compressors (Bash/Read/Edit/Grep/
  Glob/Write) with recoverable `[event:… retrievable via chat_get_event]`
  placeholders.
- Lazy regex secret redaction at compressor input, persisted summaries,
  and embedding source; pinned-region byte-identical preservation;
  extended pattern set; ≥30 positive + ≥15 false-positive corpus.
- Anti-instruction summarizer preamble + 5 existing + 4 added headings
  preserving `extractDurableSignalsFromSummary`.
- Per-project `.imc/memory.yaml` loader with safe fallback parsing.
- `context_pinned_notes` side table keyed on namespace_key only.
- Tier-3 `materializeMasterSummary` for main-brain sessions only.
- Daemon-side owner-only read tools (`chatGetEvent`,
  `memoryGetSources`, `chatSearchFts`) + server-side `vector_search`
  wrapping `buildScopedWhereClause`; FTS5 with runtime-chosen
  trigram/unicode61 tokenizer recorded in context_meta.
- `compaction.result` timeline event with Phase-1.5 forward-compat
  reserved fields; web UI inline rendering across all 7 locales.
- Structured silent-fail-open diagnostics (`warnOncePerHour` +
  per-process counters).
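The sticky-FIFO merge named above (cap=200, sticky=10) might look roughly like this. This is an assumed interpretation — pin the first `sticky` ids forever, evict the oldest non-sticky ids FIFO once the cap is hit — and the real `mergeSourceIds` may differ.

```typescript
// Hedged sketch of a sticky-FIFO merge with the cap/sticky parameters
// named above; the real mergeSourceIds semantics may differ.
function mergeSourceIds(
  existing: string[],
  incoming: string[],
  cap = 200,
  sticky = 10,
): string[] {
  // Dedup while preserving order: existing ids first, then new ones.
  const merged: string[] = [];
  const seen = new Set<string>();
  for (const id of [...existing, ...incoming]) {
    if (!seen.has(id)) {
      seen.add(id);
      merged.push(id);
    }
  }
  if (merged.length <= cap) return merged;
  // The first `sticky` ids are pinned; evict the oldest non-sticky
  // entries (FIFO) until the list fits the cap again.
  const pinned = merged.slice(0, sticky);
  const rest = merged.slice(sticky);
  return [...pinned, ...rest.slice(rest.length - (cap - sticky))];
}
```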

Compliance hardening (P0–P8 per discussion 38109db9-43e):

- P0: Move `redact-secrets.ts` to `shared/`; daemon path is a
  re-export shim so server/web import without duplicating regexes.
- P1: Domain-validating `loadMemoryConfig` (`clampPositive`,
  `clampNonNegative`, `clampPositiveOrSentinel`); `archiveRetentionDays`
  only accepts `-1` or integer ≥ 1; out-of-domain values warn-once +
  counter and fall back to default. `pruneArchive` adds
  defense-in-depth so direct callers cannot wipe uncited rows by
  passing 0/-2.
- P2: Proportional summary budget sentinel — defaults
  `autoMaterializationTargetTokens` and `manualCompactTargetTokens`
  to 0; coordinator and `/compact` branch
  `value > 0 ? value : computeTargetTokens(input, mode)`. Default
  installs now exercise the spec's proportional clamps instead of the
  legacy fixed 500/800.
- P3: `server/src/util/embedding.ts:storeProjectionEmbedding`
  re-redacts via `shared/redact-secrets.ts` before
  `generateEmbedding`; query-text embedding remains intentionally
  unredacted.
- P4: `MemoryToolCaller` type split — public surface requires
  `namespace`; owner-wide debug search lives in non-public
  `_internalChatSearchFtsGlobal` accepting `InternalMemoryToolCaller`.
  Runtime guard rejects any caller that smuggles
  `allowGlobalOwnerSearch` through the public path.
- P5: `compileExtraRedactPatterns` gains an `onError` callback wired
  to warn-once + counter; `redactSensitiveText` truncates input to
  1 MiB before applying user patterns (best-effort ReDoS cap).
- P6: `archiveEventsForMaterialization` upserts `token_count` on
  conflict while keeping content/metadata byte-identical to the first
  archive write.
- P7: deferred to follow-up `memory-system-1.2-async-write-path`
  with benchmark gate.
- P8: `scripts/check-scope-filter.sh` regex now catches aliased
  `WHERE p.scope = …` predicates with explicit allowlist for the four
  pre-existing recall-route lines; new
  `scripts/check-no-internal-caller-leak.sh` enforces the P4
  internal/public type boundary; both wired into
  `scripts/run-acceptance-suite.sh`.

Verification:
- `npx tsc --noEmit` (daemon, server, web): clean
- daemon impacted suites 40/40, server 415/415, web 1113/1113,
  e2e memory-pipeline 1/1: all pass
- `bench:memory` 1MB redaction: 24.54ms (gate 100ms)
- `scripts/{check-scope-filter,check-replication-length-assumptions,check-no-internal-caller-leak}.sh`:
  all pass
- `openspec validate memory-system-1.1-foundations`: valid

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ansport SDK

The /compact interception added in 96218b5 assumed transport SDKs do not
accept a manual compact trigger. They do — the literal /compact text is
accepted and the SDK runs its own native compaction (verified by
src/agent/providers/claude-code-sdk.ts:401 which already observes the
SDK's compact_boundary event).

The synthesized replacement (replay history -> compressWithSdk ->
relaunchFreshTransportConversation -> emit compaction.result) was a
regression: relaunch discards SDK tool config, system prompts, resume
identity, and provider prompt-cache state. SDK-native path preserves all.

Memory provenance is unaffected: Workstream A's automatic materialization
archives every memory-eligible event into context_event_archive
independently of the SDK's conversation buffer.

Removed: /compact interception block + supportsTransportCompact helper
in src/daemon/command-handler.ts; shared/compaction-events.ts;
src/context/compression-feedback.ts; COMPACTION_RESULT_EVENT entry in
src/shared/timeline/types.ts; compaction.result case in
web/src/components/ChatView.tsx; chat.compaction_result_title i18n keys
across 7 locales; test/context/compression-feedback.test.ts;
test/daemon/command-handler-compact.test.ts.

Retained: computeTargetTokens('manual') and manualCompactTargetTokens
proportional-sentinel semantics for master-summary writer and any future
programmatic caller (not on user-facing command path in Phase 1).

OpenSpec artifacts under memory-system-1.1-foundations updated to record
the rollback rationale and the authoritative requirement that
user-facing /compact forwards to the transport SDK without daemon-side
interception.

Verification: tsc clean across daemon/server/web; daemon 2509, server
415, web 1113 tests pass; openspec validate passes; bench:memory 21.79ms
(gate 100ms); all three CI guard scripts pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Production regression: globally installed imcodes called setupArchiveFts
unconditionally; on Node 23.11.0 (whose bundled SQLite ships without FTS5)
the CREATE VIRTUAL TABLE … USING fts5(…) call threw inside ensureDb(),
killing daemon startup with "degraded state, no server connection".

User patched the installed dist with a try-catch — but the source-side
setupArchiveFts is still in src/store/context-store.ts:411 and gets
called unconditionally at line 225, so the next upgrade would regress.

Fix:
1. setupArchiveFts probes FTS5 availability up front via a throwaway
   CREATE VIRTUAL TABLE __imc_fts_probe USING fts5(content). On probe
   failure: set context_meta.fts_tokenizer='unavailable', increment
   mem.archive_fts.unavailable, warn-once, and skip BOTH the virtual
   table AND the three INSERT/UPDATE/DELETE triggers. Installing the
   triggers without the virtual table would throw "no such table:
   context_event_archive_fts" on every archiveEventsForMaterialization
   call (trigger bodies resolve lazily at fire time), breaking the
   entire memory pipeline.
2. ensureDb wraps setupArchiveFts in an outer try-catch as
   defense-in-depth so any unforeseen exception (driver mismatch, perm
   issue, disk pressure during virtual-table creation) leaves the
   daemon in a working state rather than crashed.
3. searchArchiveFts fast-paths to the bounded LIKE scan when the
   sentinel is 'unavailable', avoiding mem.archive_fts.match_failure
   counter spam on every query.
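The probe pattern can be sketched driver-agnostically; `exec` stands in for the SQLite driver's exec call, and the `temp.` qualifier on the probe table is an assumption of this sketch, not necessarily what setupArchiveFts does.

```typescript
// Sketch of the FTS5 availability probe described in step 1. Returns
// false when SQLite was built without the FTS5 extension, in which case
// the caller must skip BOTH the virtual table and its triggers —
// trigger bodies resolve at fire time and would otherwise throw
// "no such table: ..._fts" on every archive write.
function probeFts5(exec: (sql: string) => void): boolean {
  try {
    // Throwaway virtual table: succeeds only when FTS5 is compiled in.
    exec('CREATE VIRTUAL TABLE temp.__imc_fts_probe USING fts5(content)');
    exec('DROP TABLE temp.__imc_fts_probe');
    return true;
  } catch {
    return false;
  }
}
```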

Test: test/store/fts-unavailable.test.ts simulates the unavailable
state by writing the sentinel, then asserts (a) archive writes succeed
(no trigger crash), (b) search returns LIKE-fallback hits without
failure counter increment, (c) blank query still returns [].

OpenSpec spec.md and tasks.md updated to record the unavailability
fallback as a normative requirement (P9).

Verification:
- npx tsc --noEmit (daemon, server, web): clean
- 38/38 impacted store/FTS suites pass; new regression suite 3/3 pass
- openspec validate memory-system-1.1-foundations: valid
- bench:memory: 24.26ms (gate 100ms)
- all three CI guard scripts pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pre-fix gate filtered to runtimeType === 'transport', so process
agents (claude-code / codex / opencode / gemini in tmux) were never
checked. A daemon.upgrade landing while a CLI was mid-generation
killed the child process and discarded the in-flight work — exactly
what the user reported.

Fix: replace the transport-only check with
getActiveSessionsBlockingDaemonUpgrade, which covers BOTH runtimes.
Process agents block when
session.state === 'running'; transport agents keep their existing
{thinking, streaming, sending, pendingCount > 0} contract.
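The combined gate reduces to a per-session predicate along these lines; the field names follow the commit text, but the real session shapes are assumptions of this sketch.

```typescript
// Sketch of the combined upgrade gate described above.
type Session =
  | { runtimeType: 'process'; state: 'running' | 'idle' | 'error' | 'stopped' }
  | {
      runtimeType: 'transport';
      thinking: boolean;
      streaming: boolean;
      sending: boolean;
      pendingCount: number;
    };

function blocksDaemonUpgrade(s: Session): boolean {
  if (s.runtimeType === 'process') {
    // Process agents (claude-code/codex/opencode/gemini in tmux) block
    // whenever a turn is in flight.
    return s.state === 'running';
  }
  // Transport agents keep their existing activity contract.
  return s.thinking || s.streaming || s.sending || s.pendingCount > 0;
}
```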

Backward-compat: the legacy
getActiveTransportSessionsBlockingDaemonUpgrade is retained as a thin
wrapper for any external script still
importing it. New SessionUpgradeBlockReason payload carries name,
runtimeType, sessionState, and (for transport) the
TransportUpgradeBlockReason snapshot for operator debugging.

P2P run gate is also locked tighter in tests: the four non-terminal
P2pRunStatus values (running, dispatched, awaiting_next_hop,
cancelling) all block daemon upgrades.

Tests:
- test/daemon/daemon-upgrade-guard.test.ts now covers
  - process agent state='running' blocks (was: silently allowed)
  - process agent idle/error/stopped does NOT block
  - mixed transport + process gets full reason payload
  - all four non-terminal P2P statuses block
  - legacy wrapper retains transport-only scope
- 11/11 passing; tsc clean across daemon/server/web

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "restores cursor-headless sessions with persisted provider resume
ids" case spawns a fake child process and walks resume-id continuity
end-to-end. Under default vitest `testTimeout: 5000` plus a busy daemon
project run it routinely takes 4–5s and flakes right at the 5s limit.

CLAUDE.md / openspec/changes/memory-system-1.1-foundations/tasks.md:80
already documents this as a pre-existing timeout that "stabilizes with
a 10s timeout" — but the foundations work never wrote the value into
the test. Pinning `{ timeout: 10_000 }` on the describe block removes
the foot-gun so a default-config full-suite run no longer misleads
people into thinking unrelated commits broke this case.

Verified: 3/3 pass under default vitest config; tsc clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…shape

The 63d2b5f upgrade-guard fix changed the daemon.upgrade_blocked WS
payload from the legacy `runtime: TransportUpgradeBlockReason` shape
to `{ runtimeType: 'transport' | 'process', sessionState, transport:
TransportUpgradeBlockReason | null }` so process-agent (tmux CLI)
turns can also be reported.

The matching assertion in test/daemon/command-handler-stop.test.ts
("blocks daemon.upgrade when a transport session still has an active
turn") was overlooked in that commit — it still asserted the old
`runtime` field name, so CI Unit Tests (Node 24) and Unit Tests
(macOS) failed on dev with:

  AssertionError: expected "spy" to be called with arguments: [{ ... }]
  Received:

(received empty because the precise toHaveBeenCalledWith match never
fired).

Local repro: `npx vitest run --project daemon
test/daemon/command-handler-stop.test.ts` — 7/7 pass after this 3-line
update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…son shape

Same root cause as c3e5f09: the 63d2b5f upgrade-guard fix changed the
WS payload from `runtime: TransportUpgradeBlockReason` to
`{ runtimeType, sessionState, transport: TransportUpgradeBlockReason | null }`,
but the e2e file at test/e2e/daemon-upgrade-gate.test.ts:366 was still
asserting the old `runtime` field name. The case
"blocks daemon.upgrade when a transport session is actually 'thinking'"
broke under both Coverage Report and E2E Tests jobs in CI; daemon Unit
Tests are unaffected because e2e is excluded from the daemon project.

Local: full daemon-upgrade-gate.test.ts now 15/15 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@im4codes im4codes merged commit c17a404 into master May 1, 2026
38 checks passed