[pull] main from manaflow-ai:main by pull[bot] · Pull Request #4 · ericmjl/cmux

pull · 2026-03-23T13:10:32Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

…yout

* feat: make cmux.json the settings file
* fix: address cmux.json review feedback
* fix: preserve legacy settings schema URL
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

Tune Bonsplit tab bar action lane

Fix custom commands docs formatting

…osc-notifications

Add CircleCI CI pipeline

…ssions Route Codex permission approvals through Feed

* Add cmux Help menu resources
* Stabilize Help menu UI test launch
* Force regular activation during UI tests
* Allow Help menu UI test background launch
* Align Help menu with docs and add skills docs
* Localize skills docs
* Disambiguate Japanese configuration help label
* Stabilize Help menu UI test launch state
* Clean up Help menu UI test windows
* Scope Help menu UI test to main menu
* Harden piped skills installer source detection
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

* Add agent-readable page variants
* Forward auth for page variant rendering
* Improve agent page text variants
* Decode agent page fallback titles
* Harden agent page variant rendering
* Fix agent page fetch and link handling
* Scope agent page titles to readable content
* Tighten agent page text variant edges
* Cover skills docs page variants
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

* Share workspace pin action path
* Keep sidebar pin snapshot current
* Split workspace pin helpers
* Address workspace pin review feedback
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

Moves open-tab suggestion matching onto a cached per-manager index and removes sleep-based omnibar focus retries.

…3419) * Add failing regression test for red-X session persistence (#3416) Asserts that closing the last window via the red X must not remove the on-disk session snapshot. Currently fails because the unregister policy returns true when not terminating; the next commit fixes the policy. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Persist session snapshot when closing the last window via the red X (#3416) Cmd+Q hits applicationShouldTerminate / applicationWillTerminate, which write a full session snapshot synchronously. The red traffic-light close button instead reaches AppDelegate.unregisterMainWindow without setting isTerminatingApp, and that path used to (a) save only an in-flight snapshot without scrollback and (b) delete the snapshot once the last window had been removed. Net effect: clicking the red X wiped the saved session, so the next launch came up empty. Capture a full snapshot with scrollback before the closing window's context is removed, and stop deleting the snapshot when no windows remain on disk-unregister. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Trim red-X persist comments to fit Swift file length budget Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Route last-window red-X close through the Cmd+Q termination path When the user closes the last cmux main window via the red traffic-light X, intercept windowShouldClose and route through handleQuitShortcutWarning so the snapshot save, timer/controller teardown, and warn-before-quit dialog stay shared with Cmd+Q. Other windows close normally. Replaces the duplicated save-before-context-removal block; the canonical termination path now captures state while contexts are still alive. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Inline last-window route-to-quit decision to fit Swift file length budget Drops the static shouldRouteMainWindowCloseToQuit helper and its test in favor of an inline guard. Logic is unchanged. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Skip last-window quit routing in XCTest to avoid hanging on warn dialog Tests call window.performClose(nil) on cmux windows during cleanup. With the new windowShouldClose interception, closing the last test window routes through handleQuitShortcutWarning, which calls alert.runModal() — blocking the test process indefinitely under XCTest where there is no UI to dismiss the dialog. Bypass the routing under XCTest so closes proceed normally and tests can tear down windows without hanging. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
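
The routing decision described in these commits can be sketched as a small pure function. This is a language-neutral illustration in Python, not the Swift code from the PR; the function and parameter names are hypothetical.

```python
def should_route_close_to_quit(main_window_count: int, running_under_tests: bool) -> bool:
    """Decide whether a window close should go through the shared quit path.

    main_window_count counts the main windows still open, including the one
    being closed. Both parameter names are hypothetical stand-ins.
    """
    if running_under_tests:
        # The quit path can show a modal warning that nothing dismisses in a
        # test process, so closes proceed normally there.
        return False
    # Only the last main window is routed through the quit path.
    return main_window_count == 1
```

With two windows open the close proceeds normally; with one window left it takes the same path as Cmd+Q, so the full snapshot is saved while window contexts are still alive.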

* Prove recorder must capture Cmd-D despite menu bindings The shortcut settings recorder needs to receive Cmd+D even when the same keystroke is currently installed as a menu key equivalent for split actions. This adds the AppKit event-boundary regression before changing routing code so CI can show the failure without the fix. Constraint: Local tests are intentionally not run in this task per operator instruction Confidence: high Scope-risk: narrow Tested: Not run locally Not-tested: CI execution of the new regression * Let shortcut recording preempt menu equivalents Shortcut recording must own the next key event even when that keystroke is currently registered as a SwiftUI or AppKit menu equivalent. The recorder now keeps a weak active-capture target, App-level and Window-level event routing dispatch capture candidates to it before menu fallback, and menu shortcut builders expose no key equivalents while recording is active. Constraint: Cmd+D and Cmd+Shift+D are split menu defaults and may also exist as stale menu equivalents after user remapping Rejected: Only suppress stale default shortcuts | current menu bindings also block recording and conflict feedback Confidence: high Scope-risk: moderate Tested: git diff --check Not-tested: Local XCTest/build run per instruction; CI should run the regression * Refresh Swift file length budget for shortcut routing fix Absorb growth from AppDelegate event routing changes and the AppDelegateShortcutRoutingTests regression covering Cmd+D recording while a matching menu equivalent is installed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
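
A rough sketch of the preemption rule, assuming a simplified model where chords are plain strings and menu bindings are a dictionary. The class and method names are hypothetical, and the sketch is Python rather than the AppKit code the commits change.

```python
import weakref


class Recorder:
    """Stand-in for the shortcut settings recorder."""

    def __init__(self):
        self.captured = []

    def capture(self, chord):
        self.captured.append(chord)


class ShortcutRouter:
    """Dispatches a chord to the active recorder before any menu fallback."""

    def __init__(self, menu_bindings):
        self._menu = dict(menu_bindings)
        self._active = None  # weak reference to the recorder, if any

    def _recorder(self):
        return self._active() if self._active is not None else None

    def begin_recording(self, recorder):
        self._active = weakref.ref(recorder)

    def menu_key_equivalents(self):
        # While recording is active, menus expose no key equivalents.
        return {} if self._recorder() is not None else dict(self._menu)

    def dispatch(self, chord):
        recorder = self._recorder()
        if recorder is not None:
            recorder.capture(chord)
            self._active = None  # recording owns only the next key event
            return "recorded"
        action = self._menu.get(chord)
        return f"menu:{action}" if action else "unhandled"
```

Holding the recorder weakly means a recorder that goes away without ending its session cannot keep swallowing keystrokes.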

* feat: add command palette cmux.json opener
* test: force command palette UI test window mode
* test: stabilize command palette UI launch
* test: compare command palette config path to user home
* test: harden command palette config opener coverage
* chore: refresh Swift file length budget
* test: resolve login home for config opener assertion
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

* test: cover launch appearance color scheme
* fix: resolve launch theme before app appearance exists
* fix: address launch theme review feedback
* fix: keep appearance tests within file budget
* fix: move appearance settings under file budget
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

* Prove remapped Close Tab can still fall through to Ghostty Cmd+W is supposed to stop closing terminal surfaces once Close Tab is remapped, but the responder fallback still lets Ghostty's built-in close_surface binding see the stale default chord. Constraint: Direct xcodebuild is forbidden for this branch; red evidence must come from CI/PR checks Confidence: high Scope-risk: narrow Tested: Manual tagged repro with ./scripts/reload.sh --tag repro-3726-cmdw-main --launch Not-tested: Local unit test execution blocked by no-direct-xcodebuild instruction * Make cmux own W close shortcuts end to end Ghostty's Darwin defaults still bound Cmd+W, Cmd+Option+W, and Cmd+Shift+W after cmux remapped the matching actions. When stale menu shortcut suppression forwarded the old chord into the focused terminal, Ghostty could close the surface outside KeyboardShortcutSettings. Constraint: Cmd+W close actions must be controlled by KeyboardShortcutSettings, not parallel Ghostty defaults Rejected: Consume stale Cmd+W in AppDelegate | would stop remapped-away command chords from reaching the focused terminal Confidence: high Scope-risk: narrow Directive: Keep cmux-owned Ghostty default keybinds unbound whenever the same chord family is user-remappable in KeyboardShortcutSettings Tested: CI red run 25619089637 showed AppDelegateShortcutRoutingTests/testRemappedCloseTabDoesNotLetCmdWReachGhosttyCloseSurfaceFallback failing on Ghostty super+w fallback before this fix Not-tested: Local xcodebuild is forbidden by task instructions * Keep browser popups aligned with remapped close shortcuts Browser popup panels still had a hard-coded default Close Tab key path, so remapping Close Tab could leave stale Cmd-W behavior around popup windows. Route popup close handling through KeyboardShortcutSettings and consume the old default only when it would otherwise leak to stale menu handling. 
Constraint: Browser popup windows must close themselves, not the opener browser tab, when the active Close Tab shortcut is pressed. Rejected: Keep hard-coded Cmd-W popup handling | it breaks user-remapped Close Tab shortcuts and reintroduces parent tab closure from popup focus. Confidence: high Scope-risk: narrow Directive: Keep popup key handling tied to KeyboardShortcutSettings.closeTab instead of assuming Cmd-W. Tested: git diff --check; ./scripts/reload.sh --tag issue-3726-cmd-w-remappable queued behind existing xcodebuild lock Not-tested: Local unit/UI tests per repository policy; tagged reload result still pending at commit time * Preserve chorded Close Tab remaps in browser popups Browser popup panels handled close key equivalents outside AppDelegate's pending chord state, so chorded Close Tab remaps could only see one key event at a time. Route that popup-specific close path through AppDelegate's chord-aware matcher and keep stale default suppression as the fallback for remapped or unbound defaults. Constraint: Browser popups must close themselves instead of leaking stale Cmd-W menu equivalents to the opener tab. Rejected: Disallow chorded Close Tab shortcuts | the rest of the remappable shortcut system supports chords and the popup path should share that behavior. Confidence: high Scope-risk: narrow Directive: Keep browser popup Close Tab handling on AppDelegate's chord-aware shortcut state; do not reintroduce StoredShortcut.matches(event:) for this path. Tested: git diff --check Not-tested: Local unit/UI tests per repository policy; tagged reload blocked by shared Xcode build lock before this fix
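
To show why a per-event matcher cannot handle chorded remaps, here is a minimal chord-aware matcher that keeps pending state between key events. It is a Python sketch under assumed names; the actual matcher lives in AppDelegate and is not reproduced here.

```python
class ChordMatcher:
    """Matches a binding made of one or more strokes, e.g. ("cmd+k", "w")."""

    def __init__(self, binding):
        self.binding = tuple(binding)
        self._pending = ()

    def feed(self, stroke):
        """Return "match", "pending", or "none" for the next key stroke."""
        # Try to extend the pending prefix first, then treat the stroke as a
        # fresh start in case the previous prefix was abandoned.
        for candidate in (self._pending + (stroke,), (stroke,)):
            if candidate == self.binding:
                self._pending = ()
                return "match"
            if self.binding[: len(candidate)] == candidate:
                self._pending = candidate
                return "pending"
        self._pending = ()
        return "none"
```

A matcher that only sees one event at a time has no `_pending` state, so it can never report a match for a two-stroke binding.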

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

This reverts commit d8a6f2a.

…3846) This reverts commit fe7cf04.

…sition (#3694)" (#3849) This reverts commit 00dd7c3.

* Prove managed defaults stop overriding explicit user choices The file-store replay path needs coverage for non-sidebar managed defaults before the generic fix lands. These tests drive the same UserDefaults.didChangeNotification path that makes Settings selections snap back when cmux.json has imported app defaults. Constraint: Issue #3844 requires the regression test to fail before the fix and verification must happen in CI, not local XCTest. Confidence: high Scope-risk: narrow Directive: Keep this coverage behavior-level through KeyboardShortcutSettingsFileStore notification replay, not source-shape assertions. Tested: git diff --check Not-tested: Local XCTest execution per repository policy and task instruction. * Let managed defaults yield after user divergence Managed UserDefaults values imported from cmux.json now behave as defaults, not permanent overrides. The active managed-default snapshot records the last imported file values, and replay only writes a value while UserDefaults still matches that imported marker. If cmux.json changes a key, reload forces the new file value once and updates the marker. The scoped sidebar match-terminal-background marker is folded into the generic path, so strings, bools, nullable strings, arrays, and dictionaries share the same precedence rule. Template generation moved into a small extension to keep the file-length budget intact while managed-default replay remains beside its private state. 
Constraint: Settings-file changes must apply, but explicit in-app user choices must not be overwritten by UserDefaults.didChangeNotification replay Constraint: Repository policy forbids direct xcodebuild and local test runs for this task Rejected: Per-key marker logic | repeats the sidebar-only patch and leaves the class of bugs open for other managed defaults Rejected: Separate importedManagedDefaults cache | review showed it can desynchronize from activeManagedUserDefaults during reload windows Confidence: high Scope-risk: narrow Directive: Treat activeManagedUserDefaults as the imported-default marker for managed UserDefaults replay; do not add a second marker cache without proving it cannot desynchronize Tested: git diff --check Tested: python3 scripts/swift_file_length_budget.py --budget .github/swift-file-length-budget.tsv (fails only on pre-existing over-budget files; KeyboardShortcutSettingsFileStore.swift is not listed) Not-tested: local Swift tests/builds per repository and user instructions
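
The precedence rule can be sketched with a dictionary standing in for UserDefaults and a second dictionary as the imported-default marker. This is an illustrative Python model with hypothetical names, not the KeyboardShortcutSettingsFileStore code.

```python
_MISSING = object()


class ManagedDefaults:
    """Treats file values as defaults that yield once the user diverges."""

    def __init__(self, store):
        self.store = store  # stand-in for UserDefaults
        self.marker = {}    # last values imported from the settings file

    def reload(self, file_values):
        """Import the file; force a key only when the file changed it."""
        for key, new_value in file_values.items():
            if self.marker.get(key, _MISSING) != new_value:
                self.store[key] = new_value
        self.marker = dict(file_values)

    def replay(self):
        """Re-assert managed values; return the keys skipped as user-owned."""
        skipped = []
        for key, imported in self.marker.items():
            if self.store.get(key, _MISSING) == imported:
                self.store[key] = imported  # still matches the marker
            else:
                skipped.append(key)  # user diverged: leave the choice alone
        return skipped
```

In this model the replay write is idempotent when it happens at all; the guard is what stops a change notification from snapping a user's selection back to the file value.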

* Prove app-level config reloads must update live surfaces The appearance-change path currently reaches each terminal surface after the app Ghostty config reload, but the per-surface sequence still only refreshes the host background and redraws. This regression pins the required ordering before the production fix: soft surface config update, host background refresh, then visual refresh. Constraint: Issue 3851 requires a two-commit regression-test-then-fix structure Confidence: high Scope-risk: narrow Directive: Keep app-level Ghostty config reload and per-surface redraw as one ordered sequence Tested: git diff --check Not-tested: Local XCTest/build per user instruction; test is intentionally red until the follow-up fix commit * Keep terminal palettes fresh after appearance reloads The app-level Ghostty config reload now pushes the already-loaded app config into each live terminal surface with a soft surface reload before repainting. That matches the direct GHOSTTY_ACTION_RELOAD_CONFIG ordering and prevents dark-theme foreground palettes from being reused after a dark-to-light appearance switch. Constraint: Use soft reload so surfaces reuse the app config already loaded by the appearance path Rejected: Force redraw only | redraw uses stale per-surface palette state Confidence: high Scope-risk: narrow Directive: Do not separate per-surface config update from the subsequent redraw on app-level config reloads Tested: git diff --check Not-tested: Local XCTest/build per user instruction; CI will run the regression

* Add multi-image drop regressions
* Fix multi-image terminal drops
* Add CLI stderr crash regression
* Fix CLI stderr crash on closed pipes
* Add Claude image drop paste regression
* Use one paste transaction for local image drops
* Remove unused local image type helpers
* Copy pasteboard item before fallback rendering
* Add Claude multi-image paste regression
* Fix Claude multi-image drops
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

* Fix Cloud VM SSH and baked tooling
* Record Cloud VM tooling images
* Address Cloud VM review feedback
* Record Cloud VM tooling images
* Fix Cloud VM SSH attach workflow
* Address Cloud VM SSH review feedback
* Address remaining Cloud VM review feedback
* Verify Freestyle recovery auth transport
* fix: harden vm ssh reconnect lifecycle
* fix: keep vm ssh failure banner visible
* fix: clean up ssh signal-derived exits
* fix: allow flag-like remote cli values
* fix: stop ssh reconnect after pending signal
* fix: clean up interrupted ssh reconnect delay
---------
Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

…r the sidebar resize handle (#3879) * Expose missing redraw invalidation on window focus regain The sidebar/main resize handle repro depends on a main window key-regain path that currently restores focus state without requiring a content redraw. This regression test builds a registered cmux main window with a split divider under the pointer and asserts that the root content view is invalidated after resign/become-key notifications. Constraint: Do not run local tests or bare xcodebuild for this branch Confidence: high Scope-risk: narrow Directive: Keep this as a behavior-level AppKit lifecycle test; do not replace it with a source-shape assertion Tested: Not run locally by request Not-tested: Local XCTest execution; CI is expected to show this test failing before the fix * Guarantee redraw invalidation when cmux windows regain focus A cursor hover over the sidebar/main divider should not decide whether a key-regain redraw occurs. The main window lifecycle handler now treats content invalidation as an app-level invariant for registered cmux windows before restoring subview focus state. Constraint: Fix must avoid resize-handle-specific timing glue and must not use local bare xcodebuild Rejected: Patch WindowTerminalHostView mouse tracking | redraw on focus regain belongs to the window lifecycle owner, not the current cursor owner Confidence: medium Scope-risk: narrow Directive: Keep key-regain redraw ownership in the main-window lifecycle path; do not move it into divider tracking handlers Tested: git diff --check Not-tested: Local XCTest/build by request; CI will verify * Clarify key-regain refresh ownership Review feedback correctly pointed out that the redraw invariant should be named as one post-focus contract instead of appearing as an orphaned pre-walk next to focus restoration. This keeps the app-window lifecycle as the single owner while calling the keyboard focus coordinator only after redraw invalidation is established. 
Constraint: Continue avoiding local tests and bare xcodebuild on this branch Rejected: Move redraw ownership into divider tracking | the bug must be impossible regardless of current cursor owner Confidence: medium Scope-risk: narrow Tested: git diff --check Not-tested: Local XCTest/build by request; CI will verify

* Prove update pill must follow each appcast emission The update pill can be backed by an interactive update state and a background-detected update cache at the same time. This regression test models repeated appcast emissions so CI proves that the visible pill version is not allowed to remain pinned to the first update. Constraint: Local tests are intentionally not run for this repo; verification is CI-only. Rejected: XCUITest repro | the model-level ownership bug is lower-risk and faster to validate in the unit target. Confidence: high Scope-risk: narrow Directive: Keep this test focused on visible model behavior rather than source-shape assertions. Tested: git diff --check Not-tested: Swift unit suite, per repo instruction to avoid local tests * Keep update pill tied to the latest appcast emission The visible update state and the background-detected update cache could diverge after the first available update. Route update-available transitions through the view model and refresh any active updateAvailable state whenever Sparkle reports a valid update item, so later polls replace the item the pill renders instead of being hidden behind the first state. Constraint: The existing Sparkle probe cadence stays in UpdateController; the fix only changes update item ownership in the view model. Rejected: Add a second polling loop | would duplicate Sparkle probing and leave the stale state owner split intact. Rejected: Only remove an early return | no polling early return existed in cmux; the stale item lived in duplicated model state. Confidence: high Scope-risk: narrow Directive: Future update UI should route valid appcast items through UpdateViewModel rather than assigning updateAvailable state directly. 
Tested: git diff --check Not-tested: Swift unit suite, per repo instruction to avoid local tests * Keep update actions bound to the matching Sparkle reply Passive appcast detection now records metadata only, while installable update transitions carry the full SUAppcastItem and reply closure through recordAvailableUpdate. This keeps visible update actions tied to the Sparkle callback that produced the installable state, including when an override pill is already visible. Constraint: Do not run local tests; CI owns test execution for this repo. Rejected: Replace only the appcast item inside existing updateAvailable state | it preserves stale reply closures and can route Install/Dismiss to the wrong Sparkle callback. Confidence: high Scope-risk: narrow Directive: Do not refresh UpdateAvailable display data without refreshing the matching reply closure from the same Sparkle emission. Tested: git diff --check Not-tested: Local XCTest execution per repository policy * Prevent duplicate update dismiss replies during rechecks A user-triggered update check can cancel the current Sparkle update reply while a queued no-update callback is still waiting on the main queue. The view model now provides one cancellation path that replies once, clears the interactive state, and removes any visible override before the next check starts. Constraint: Sparkle reply callbacks must be answered exactly once per update interaction Rejected: Only set state to idle inline in UpdateController | would keep the cancellation semantics split across controller and view model paths Confidence: high Scope-risk: narrow Tested: ./scripts/reload.sh --tag issue-3829-update-pill-keep-polling Not-tested: Local test execution per repository policy; CI will run the regression test
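
The "answered exactly once" constraint can be illustrated with a reply wrapper and a single cancellation path. The Python below is a sketch with assumed names; it borrows the `recordAvailableUpdate` idea from the commit text but does not reproduce the real `UpdateViewModel`.

```python
from typing import Callable, Optional, Tuple


class OnceReply:
    """Wraps a reply callback so it can be answered at most once."""

    def __init__(self, reply: Callable[[str], None]):
        self._reply = reply
        self._answered = False

    def answer(self, choice: str) -> bool:
        if self._answered:
            return False  # a late or duplicate answer is dropped
        self._answered = True
        self._reply(choice)
        return True


class UpdateViewModel:
    """Keeps the displayed version and its reply together as one pair."""

    def __init__(self):
        self.available: Optional[Tuple[str, OnceReply]] = None

    def record_available_update(self, version: str, reply: Callable[[str], None]):
        # Display data and reply closure come from the same emission.
        self.available = (version, OnceReply(reply))

    def cancel(self) -> bool:
        """Reply once with a dismissal and clear the interactive state."""
        if self.available is None:
            return False
        _, reply = self.available
        self.available = None
        return reply.answer("dismiss")
```

The sketch leaves open what should happen to a reply that is superseded by a newer emission; it only shows that one cancellation path makes a second dismissal a no-op.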

…keeps updating (#3874) * Prove sidebar details unfreeze after color menu action A workspace color assignment happens through the sidebar context menu, and AppKit can end menu tracking before SwiftUI delivers the menu disappearance callback. The regression test captures that lifecycle and expects later git/cwd detail snapshots to update live instead of remaining deferred behind the stale menu flag. Constraint: Local tests are intentionally not run; this branch relies on CI per repository task instructions. Confidence: high Scope-risk: narrow Tested: Not run locally by instruction Not-tested: CI execution pending * End sidebar detail freeze with menu tracking The sidebar row used SwiftUI context-menu disappearance as the only lifetime boundary for deferred workspace detail snapshots. Assigning a workspace color can close AppKit menu tracking without delivering that SwiftUI callback, leaving pwd/git/details behind a stale freeze. The row interaction state now owns an explicit sidebar-detail freeze phase with AppKit tracking as the authoritative pointer-menu lifetime and SwiftUI appearance as a fallback only. AppKit tracking end releases the phase and reports the release through the hover tracker, while cleanup flushes pending snapshots only after the phase is live. Color/reorder actions can keep immediate affordances stable during the menu without making later telemetry updates depend on a fragile SwiftUI callback. Constraint: Do not use timers, DispatchQueue repair, or bare xcodebuild for this task Rejected: Clear the pending snapshot only from each color action | misses reorder and other context-menu actions Rejected: Add an async delayed unfreeze | timing repair would hide the lifecycle bug Rejected: Let SwiftUI disappearance end an active AppKit-tracking freeze | reintroduces ordering-dependent first-end-wins semantics Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local tests/build skipped by instruction; CI rerun pending
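
One way to picture the explicit freeze phase is a small state machine in which menu tracking is authoritative and menu appearance is only a fallback. The Python below is a sketch; the phase names, method names, and string snapshots are all hypothetical.

```python
LIVE = "live"
TRACKING = "frozen_by_tracking"
APPEARANCE = "frozen_by_appearance"


class SidebarDetailFreeze:
    """Defers detail snapshots while a context menu is up."""

    def __init__(self):
        self.phase = LIVE
        self.shown = None
        self._pending = None

    def menu_tracking_began(self):
        self.phase = TRACKING  # authoritative menu lifetime

    def menu_appeared(self):
        if self.phase == LIVE:
            self.phase = APPEARANCE  # fallback when tracking was not seen

    def menu_tracking_ended(self):
        self._release()

    def menu_disappeared(self):
        # The fallback signal must not end a freeze owned by tracking.
        if self.phase == APPEARANCE:
            self._release()

    def update(self, snapshot):
        if self.phase == LIVE:
            self.shown = snapshot
        else:
            self._pending = snapshot

    def _release(self):
        self.phase = LIVE
        # Flush only once the phase is live again.
        if self._pending is not None:
            self.shown, self._pending = self._pending, None
```

Because tracking end releases the phase by itself, a missing disappearance callback can no longer leave the row frozen.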

* Expose broad IME suppression before narrowing The no-marked-text IME suppression path still depends on the generic inputmethod predicate from #3694, so handled arrow/navigation keys can be swallowed for non-Zhuyin IMEs. These tests lock the intended boundary before the production predicate is narrowed. Constraint: Current main has reverted #3767 and #3836, so this contract targets the remaining #3694 latent broad predicate. Rejected: Restore #3767/#3836 regression tests here | updated scope is only the post-revert no-marked-text predicate risk. Confidence: high Scope-risk: narrow Directive: Keep this test commit behavior-only; the fix belongs in Sources/GhosttyNSView+IMEComposition.swift. Tested: git diff --check Not-tested: Local unit tests not run because this task forbids direct xcodebuild; red/green evidence will come from GitHub CI. * Clarify IME navigation regression test names The probe set covers arrows, paging, home/end, and space, so the test names now describe the full behavior contract instead of only arrow keys. Constraint: Review feedback flagged misleading names while CI is the verification gate. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local tests not run per repository and user instructions. * Tighten IME suppression cleanup The narrow IME route no longer needs a numeric-key fallback guard because the no-marked branch now only retains explicit Bopomofo candidate commands inside AppKit. Keeping that helper made the regression test read as if it exercised a path that the key allowlist cannot reach. 
Clearing suppressed key-up bookkeeping on focus regain also closes the stale-state edge noted in review without widening the routing surface. Constraint: Do not run local tests or xcodebuild for this branch Rejected: Keep the deferred numpad fallback helper | it is redundant after the command allowlist and leaves misleading coverage Confidence: high Scope-risk: narrow Tested: git diff --check; git diff --cached --check Not-tested: Local XCTest or app build per instruction * Import IME virtual key constants in tests CircleCI unit compilation failed because CJKIMEMarkedSelectionTests references Carbon virtual-key constants without importing HIToolbox in the test target. Import the exact framework that defines those constants instead of replacing them with magic numbers. Constraint: Do not run local tests or xcodebuild for this branch Rejected: Replace kVK constants with numeric literals | less readable and easier to drift from AppKit key semantics Confidence: high Scope-risk: narrow Tested: git diff --check; git diff --cached --check; CircleCI failure log showed missing kVK symbols Not-tested: Local XCTest or app build per instruction * Isolate IME test input handlers The IME regression tests share process-global debug hooks, so every test that installs the text-input handler now restores the previous handler and hook state in defer. The hook storage is main-actor scoped to match how AppKit input tests execute. Constraint: Local tests and direct xcodebuild are disabled for this task; CI owns executable verification. Rejected: Leave the debug handler installed after tests | review feedback identified it as cross-test state leakage. Confidence: high Scope-risk: narrow Tested: git diff --check; git diff --cached --check Not-tested: Local XCTest execution per task instruction * Prevent stale IME keyUp suppression A suppressed IME keyDown can be followed by a forwarded keyDown for the same physical key during repeat handling. 
The forwarded path now clears any stale suppressed keyUp entry before sending the key to Ghostty, so the eventual keyUp is not swallowed. Constraint: IME key forwarding and keyUp pairing must remain owned by GhosttyNSView without timers or cross-layer state. Rejected: Count suppressed keyDown events | repeated keyDowns do not imply repeated keyUp events and can leave stale counts. Confidence: high Scope-risk: narrow Directive: Any forwarded keyDown must own its matching keyUp; do not leave IME keyUp suppression entries active after forwarding the same keyCode. Tested: git diff --check; git diff --cached --check Not-tested: Local XCTest execution per task instruction * Constrain IME key-equivalent rerouting The key-equivalent bypass should be owned by the same input-source boundary as the IME suppression decision. Non-IME dead-key marked text now stays on normal AppKit key-equivalent dispatch, while input-method composition command keys still re-enter keyDown so NSTextInputContext can consume them. Constraint: Do not run local tests or xcodebuild for this PR; verification is CI-gated. Rejected: Special-case US International only | keeps the broader marked-text routing bug for other non-IME layouts Confidence: high Scope-risk: narrow Directive: Keep key-equivalent rerouting and IME suppression predicates aligned on input-source ownership. Tested: git diff --check; git diff --cached --check Not-tested: Local XCTest or xcodebuild per user instruction * Preserve IME keyUp ownership across layout flips A layout-change IME path consumed keyDown before Ghostty saw it, but skipped the shared transient keyUp suppression bookkeeping. This keeps the ownership invariant consistent across all IME-consumed keyDown exits and adds a regression that simulates the input source flip from the text-input handler. Constraint: Do not run local tests or xcodebuild for this branch; CI owns validation. 
Rejected: Let the unmatched keyUp pass through | it creates release-without-press state in Ghostty. Confidence: high Scope-risk: narrow Directive: Any future keyDown early return before Ghostty forwarding must either forward the key or suppress the paired keyUp. Tested: git diff --check Not-tested: Local Swift tests and local xcodebuild per user instruction
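
The keyDown/keyUp ownership invariant from these commits can be modeled with a set of suppressed key codes. This is a Python sketch with hypothetical names, not the GhosttyNSView implementation; it uses a set rather than counts, in line with the rejected counting approach.

```python
class KeyUpSuppression:
    """Tracks which keyUp events must be swallowed after an IME keyDown."""

    def __init__(self):
        self._suppressed = set()

    def ime_consumed_key_down(self, key_code: int) -> None:
        # The terminal never saw the press, so it must not see the release.
        self._suppressed.add(key_code)

    def forwarded_key_down(self, key_code: int) -> None:
        # A forwarded press owns its release: drop any stale entry.
        self._suppressed.discard(key_code)

    def should_forward_key_up(self, key_code: int) -> bool:
        if key_code in self._suppressed:
            self._suppressed.remove(key_code)
            return False
        return True

    def focus_regained(self) -> None:
        # Clear bookkeeping that may have gone stale while unfocused.
        self._suppressed.clear()
```

A repeat sequence of suppressed keyDown followed by forwarded keyDown for the same key ends with the keyUp forwarded, which avoids a press without a release on the terminal side.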

* Update Codex hooks feature flag Use [features].hooks instead of the deprecated codex_hooks alias when installing Codex hooks. Migrate existing codex_hooks entries while preserving other feature settings. * Address Codex hooks review feedback Move Codex config TOML helpers into a dedicated CLI extension, preserve pre-existing hooks flags on uninstall, remove cmux-owned hooks blocks cleanly, and wire the migration regression into the test runner. * Handle dotted Codex hooks feature keys * Restore prior Codex hooks feature setting * Surface Codex config read failures * Harden Codex config TOML markers * Recover orphaned Codex hooks marker * Clarify Codex hooks migration tests --------- Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>
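
The flag migration can be sketched on an already-parsed `[features]` table. The Python below works on plain dictionaries instead of TOML text, and how the prior value is remembered between install and uninstall is an assumption of the sketch, not something the commits specify.

```python
_ABSENT = object()  # sentinel: the user had no hooks setting before install


def install_hooks_flag(features: dict):
    """Return (updated features, prior hooks setting) for a later uninstall."""
    updated = dict(features)
    # Migrate the deprecated alias, keeping its value as the prior setting.
    prior = updated.pop("codex_hooks", _ABSENT)
    # An explicit hooks key takes precedence over the alias (assumption).
    prior = updated.get("hooks", prior)
    updated["hooks"] = True
    return updated, prior


def uninstall_hooks_flag(features: dict, prior) -> dict:
    """Restore the setting that existed before install."""
    updated = dict(features)
    if prior is _ABSENT:
        updated.pop("hooks", None)
    else:
        updated["hooks"] = prior
    return updated
```

Unrelated keys in the table pass through both functions untouched, which is the "preserving other feature settings" part of the change.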

…tab (#3243) (#3245) * Lock bare window.open() routing before changing popup classification Issue #3243 is a regression in popup routing, so the first commit flips the existing popup-decision expectation for a scripted `.other` navigation without explicit window features. This makes the spec-correct behavior visible in CI before the runtime classifier is adjusted. Constraint: Local test execution is disabled for this repo by policy Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep this test aligned with the no-feature `window.open(..., "_blank")` path, not feature-bearing OAuth popups Not-tested: Unit tests not run locally * Keep bare window.open() on the tab path unless popup features are explicit Bare `window.open(url, "_blank")` requests should reuse the existing browser-tab routing instead of creating a popup shell with reduced chrome. The popup classifier now requires explicit `WKWindowFeatures` values before returning a live popup web view, while feature-bearing requests still take the popup path needed for OAuth and similar opener-dependent flows. Constraint: WKWindowFeatures only exposes a subset of the HTML window feature tokens Rejected: Treat every `.other` navigation as a popup | regresses spec-correct bare `_blank` behavior Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep popup creation gated by explicit window features unless WebKit exposes a better raw feature signal Not-tested: Unit tests not run locally Not-tested: Debug app build not run locally at commit time
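
A minimal sketch of the classification rule, assuming a Python stand-in for `WKWindowFeatures` in which every field is optional and `None` means the page did not specify it. Which fields the real classifier inspects is not stated in the commits, so the field list here is illustrative.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class WindowFeatures:
    """Hypothetical stand-in for the optional WKWindowFeatures values."""

    x: Optional[float] = None
    y: Optional[float] = None
    width: Optional[float] = None
    height: Optional[float] = None
    menu_bar_visible: Optional[bool] = None
    status_bar_visible: Optional[bool] = None
    toolbars_visible: Optional[bool] = None
    allows_resizing: Optional[bool] = None


def route_window_open(features: WindowFeatures) -> str:
    """Return "popup" only when some window feature was given explicitly."""
    explicit = any(getattr(features, f.name) is not None for f in fields(features))
    return "popup" if explicit else "tab"
```

A bare `window.open(url, "_blank")` arrives with no explicit values and stays on the tab path, while an OAuth-style call with a size takes the popup path.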

* Prove Claude agents must bypass cmux hook injection The Claude wrapper test harness now exercises the live-socket path for the agents subcommand and asserts that the real Claude binary receives the raw argv without cmux session or settings flags. Constraint: Regression-test commit must land before the wrapper fix so CI can show the bug. Confidence: high Scope-risk: narrow Tested: git diff --check -- tests/test_claude_wrapper_hooks.py Not-tested: Local wrapper test execution skipped per repository testing policy; CI will run tests/test_claude_wrapper_hooks.py. * Stop injecting Claude hooks into command subcommands The wrapper now treats cmux hook injection as an explicit session-entry policy instead of a growing pass-through allowlist. Default launches and known session forms keep --session-id/--settings; command-like invocations, including agents and future subcommands, reach the real Claude binary without cmux flags. Constraint: Do not change hook payloads or Swift socket routing. Rejected: Extend the hardcoded subcommand allowlist | it would fix agents now but recur when Claude adds or renames another flag-incompatible command. Confidence: high Scope-risk: narrow Directive: Keep the wrapper policy inverted: add new injection entrypoints only when they are known Claude sessions, not new pass-through subcommands. Tested: bash -n Resources/bin/claude; python3 -m py_compile tests/test_claude_wrapper_hooks.py; git diff --check -- Resources/bin/claude tests/test_claude_wrapper_hooks.py Not-tested: Local behavior test execution skipped per repository testing policy; CI runs tests/test_claude_wrapper_hooks.py. * Preserve Claude worktree short-flag hook injection The inverted wrapper policy already treated --worktree as a session entrypoint. Cursor Bugbot correctly pointed out that Claude also exposes the -w alias, so the classifier now treats that alias the same way and the wrapper harness covers it. 
Constraint: Address PR review feedback without widening the wrapper change beyond session-entry classification. Confidence: high Scope-risk: narrow Tested: bash -n Resources/bin/claude; python3 -m py_compile tests/test_claude_wrapper_hooks.py; git diff --check -- Resources/bin/claude tests/test_claude_wrapper_hooks.py Not-tested: Local behavior test execution skipped per repository testing policy; CI runs tests/test_claude_wrapper_hooks.py. * Cover Claude value flags before print entrypoints Greptile found that value-taking options before --print could hide the later interactive entrypoint if the option alias was not recognized. The wrapper now treats -m like --model, and the harness covers both -m <model> --print and --agents <json> --print so these values are skipped before detecting the session entry flag. Constraint: Keep subcommand passthrough as the default while preserving known session-entry forms. Rejected: Remove --agents from value-consuming options | current claude --help reports --agents <json>, so it must consume the following JSON value. Confidence: high Scope-risk: narrow Tested: claude --help confirms --agents <json>; bash -n Resources/bin/claude; python3 -m py_compile tests/test_claude_wrapper_hooks.py; git diff --check -- Resources/bin/claude tests/test_claude_wrapper_hooks.py Not-tested: Local behavior test execution skipped per repository testing policy; CI runs tests/test_claude_wrapper_hooks.py.
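A minimal sketch of the inverted session-entry policy, written in Python rather than the wrapper's bash. The function name `should_inject_hooks` is hypothetical, the flag sets contain only the flags named in these commits, and two simplifications are assumptions of this sketch: any bare positional word is treated as a command-like invocation, and an argv with no positional word is treated as a default launch.

```python
# Options that consume the following argument (named in the commits above).
VALUE_OPTIONS = {"--model", "-m", "--agents"}
# Flags the commits describe as known session entrypoints.
SESSION_ENTRY_FLAGS = {"--print", "--worktree", "-w"}


def should_inject_hooks(argv):
    """Return True when cmux flags should be added to the Claude argv."""
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in SESSION_ENTRY_FLAGS:
            return True
        if arg in VALUE_OPTIONS:
            # Skip the option and its value so the value is not
            # mistaken for a subcommand.
            i += 2
            continue
        if arg.startswith("-"):
            i += 1
            continue
        # Assumption: a bare word is a subcommand such as "agents",
        # so the real binary gets the raw argv.
        return False
    # Assumption: nothing command-like was seen, so this is a default launch.
    return True


default_launch = should_inject_hooks([])
agents_call = should_inject_hooks(["agents"])
model_then_print = should_inject_hooks(["-m", "some-model", "--print"])
```

The `-m` case shows why value skipping matters: without it, `some-model` would be read as a subcommand and the later `--print` entrypoint would be missed.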

* Expose renderer-owned terminal background invariant Claude Code paints explicit ANSI background cells for its input box and statusline. The regression captures the expected backdrop ownership boundary for solid opaque terminals before changing the production path. Constraint: Repository policy forbids local test execution; this commit is intentionally expected to fail before the fix. Confidence: high Scope-risk: narrow Directive: Keep opaque terminal backgrounds renderer-owned unless blur or translucency requires the host compositor. Tested: Not run locally per repository policy; regression is expected to be red before the fix. Not-tested: Local XCTest execution. * Keep opaque terminal backgrounds in the renderer cmux was forcing Ghostty to leave every default terminal background transparent and then filling that area from the host window. That split default cells and explicit ANSI background cells across different compositor paths, which made Claude Code chrome show a visible seam. The policy now keeps solid opaque, unblurred terminal backgrounds inside Ghostty's renderer and reserves host-layer ownership for translucent or blurred terminal backgrounds where the macOS compositor is required. Constraint: background-opacity and background-blur still need host-layer ownership for compositor effects. Rejected: Tune the fallback theme palette | this would not remove the split renderer/host compositing path. Confidence: high Scope-risk: moderate Directive: Do not force macos-background-from-layer for opaque unblurred terminals; explicit ANSI cell backgrounds must share the renderer path with default cells. Tested: git diff --check Not-tested: Local XCTest/build execution per repository policy; CI pending. * Document terminal background ownership threshold Greptile correctly flagged that the opacity threshold needed intent and that the host-layer inverse paths needed coverage. 
Naming the threshold and adding translucent plus blurred assertions keeps the renderer-owned path narrow and understandable. Constraint: Review feedback was low-priority but directly improved the regression boundary. Confidence: high Scope-risk: narrow Directive: Keep host-layer ownership covered for translucent and blurred backgrounds when adjusting terminal rendering policy. Tested: git diff --check Not-tested: Local XCTest execution per repository policy; CI pending.
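The ownership boundary can be illustrated with a short Python sketch. The names `OPAQUE_BACKGROUND_THRESHOLD` and `background_owner` are hypothetical, and the threshold value of 1.0 is an assumption; the commits only say that a named opacity threshold exists.

```python
# Assumed value for illustration; the actual threshold is not stated above.
OPAQUE_BACKGROUND_THRESHOLD = 1.0


def background_owner(background_opacity: float, background_blur: bool) -> str:
    """Decide which layer paints the default terminal background."""
    if background_blur or background_opacity < OPAQUE_BACKGROUND_THRESHOLD:
        # Translucency and blur need the macOS compositor, so the host owns it.
        return "host-layer"
    # Opaque and unblurred: default cells and explicit ANSI background cells
    # share the renderer path, which avoids the visible seam.
    return "renderer"


opaque_owner = background_owner(1.0, False)
translucent_owner = background_owner(0.8, False)
blurred_owner = background_owner(1.0, True)
```

The three calls correspond to the renderer-owned case and the two host-layer inverse paths that the review asked to cover.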

…onitors (#3882) * Require AppKit ownership of main window frames The sleep/wake regression needs a behavior-level guard that real cmux main windows enroll in AppKit frame autosave and do not share a live autosave name. Current main-window construction leaves frameAutosaveName empty, so this test is expected to fail before the implementation commit. Constraint: Regression-test commit must contain no production fix so CI can prove the issue first goes red. Confidence: high Scope-risk: narrow Tested: Not run locally per repository testing policy. Not-tested: Local xcodebuild and app launch intentionally skipped for the failing-test-only commit. * Let AppKit own main window frame restoration Main windows now register AppKit frame autosave names as soon as they are created. The first standalone main window uses a stable primary name when available, while additional live windows use UUID-scoped names so AppKit accepts every autosave registration. The old cmux lastWindowGeometry fallback was a competing frame owner: it could remap or persist frames independently of AppKit's screen topology logic. Session snapshots still store window layout with workspace content, but the standalone fallback no longer reads or writes frame data for new-window placement. Constraint: Do not add sleep/wake timers or screen-change repair hooks; use the platform frame autosave mechanism Apple exposes for this window state. Rejected: Add didWake/screensDidChange repositioning | would introduce another lifecycle repair path and preserve split ownership. Confidence: medium Scope-risk: moderate Tested: git diff --check Not-tested: Local tests and local build intentionally skipped per repo policy and user instruction; CI will run after PR push. * Clean up ephemeral main window autosave names Secondary main windows need AppKit frame autosave during their lifetime, but their UUID-scoped autosave keys are not stable restore points. 
Closing those windows now removes the matching AppKit frame key while preserving the primary slot. This also makes autosave registration explicit instead of hiding the side effect inside a boolean chain, and adds a regression that saves a secondary frame then verifies close removes it. Constraint: Primary frame autosave must survive close because it is the stable main-window restore slot. Rejected: Keep UUID autosave keys indefinitely | leaks per-window keys that are never read again. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local tests/build intentionally skipped per repo policy and user instruction; CI will verify. * Keep autosave registration test hermetic The production close path now removes UUID-scoped frame autosave names, and this test also removes the secondary name in teardown so the registration contract cannot dirty UserDefaults even if a future close path changes. Constraint: Preserve the primary autosave slot across tests while cleaning per-window UUID slots. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local tests/build intentionally skipped per repo policy and user instruction; CI will verify. * Fully disarm retired window frame persistence Cursor Bugbot flagged two cleanup gaps after the AppKit autosave migration: ephemeral windows could keep their autosave registration active during close, and the retired v2 cmux geometry key was not part of legacy cleanup. The fix keeps AppKit as the single frame owner by clearing UUID-scoped autosave names before removing their saved frame and deleting the retired v2 key alongside the v1 key. Constraint: Local test execution is intentionally skipped per task instructions; CI owns verification for this PR. Rejected: Keep only NSWindow.removeFrame for UUID windows | AppKit can still write through an active frameAutosaveName before teardown completes. 
Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local XCTest or app build, per explicit instruction to use CI and defer reload until CI is green. * Keep primary window frame autosave live Review feedback exposed a remaining lifecycle gap in the AppKit autosave handoff: the stable primary autosave name could outlive its window while another cmux window stayed open. The close path now promotes a surviving main window into the primary slot, saves its frame immediately, and retires its old UUID-scoped frame. The retired cmux geometry write plumbing is also removed so the legacy key is only ever cleaned up, not rewritten. Constraint: AppKit frame autosave is the single source of truth for window frame persistence after this migration. Rejected: Preserve the old persistedGeometryData parameter | it made a removed key look writable and kept a path back to competing frame persistence. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local XCTest or app build, per explicit instruction to use CI and defer reload until CI is green. * Release primary autosave before promotion Greptile found that primary-slot promotion could run while the closing primary NSWindow still held the autosave name during windowWillClose. The close path now clears that registration before promotion, and promotion only removes the survivor's old UUID frame after AppKit accepts the primary name. Constraint: The primary autosave UserDefaults frame must remain preserved while live ownership moves to a surviving cmux window. Rejected: Keep ignoring setFrameAutosaveName's return value | it could remove the survivor's UUID frame after a failed promotion. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local XCTest or app build, per explicit instruction to use CI and defer reload until CI is green. 
* Release primary autosave after unregister The primary autosave name should be released only after the closing window is confirmed as a registered main window. This keeps the promotion fix precise: unregister the closing context, clear the primary autosave registration from that closing NSWindow, then promote a survivor into the stable primary slot. Constraint: WindowWillClose still contains the closing NSWindow in NSApp.windows, so promotion must explicitly release the old autosave owner first. Rejected: Clear the primary autosave name before unregistering | it mutates a window even if context unregister unexpectedly fails. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: Local XCTest or app build, per explicit instruction to use CI and defer reload until CI is green. * Preserve survivor frame during autosave promotion AppKit reloads an autosaved frame when a window adopts that name, so the primary-slot promotion now seeds the stable name with the survivor frame before registration and restores the survivor frame if AppKit applies anything during the transition. Constraint: setFrameAutosaveName can reload the associated frame as part of registration Rejected: Assign primary autosave name first and save afterward | that can capture the closed primary window frame after moving the survivor Confidence: high Scope-risk: narrow Directive: Keep primary autosave promotion frame-neutral; do not register the survivor under a stale primary name before seeding the saved frame Tested: git diff --check Not-tested: Local XCTest/builds intentionally not run per project instruction; CI will run on PR * Prove promoted autosave frame payload The promotion test now loads the stable primary autosave entry into a probe window so it proves the stored payload matches the survivor frame, not just that the key exists. The autosave-name helper also owns the live-primary check directly instead of accepting a redundant caller-computed flag. 
Constraint: Review gate required proving AppKit's stored primary autosave frame, not only the live survivor frame Rejected: Keep the key-presence assertion alone | it would still pass with a stale primary autosave payload Confidence: high Scope-risk: narrow Directive: Primary autosave promotion tests must verify both live-frame preservation and the restorable autosave payload Tested: git diff --check Not-tested: Local XCTest/builds intentionally not run per project instruction; CI will run on PR
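The promotion order described in these commits (release the closing owner, seed the stable slot, register, then retire the UUID frame only on success) can be sketched with a toy Python model. Everything here is hypothetical: `FrameAutosaveModel` and `promote_survivor` stand in for AppKit state and the close path, and frames are plain tuples.

```python
class FrameAutosaveModel:
    """Toy model of frame autosave: saved frames plus live name ownership."""

    def __init__(self):
        self.saved_frames = {}  # autosave name -> frame tuple
        self.live_owner = {}    # autosave name -> id of the window holding it

    def register(self, name, window_id):
        # Refuse a name held by another live window, like the Bool result
        # that the commits say must not be ignored.
        owner = self.live_owner.get(name)
        if owner is not None and owner != window_id:
            return False
        self.live_owner[name] = window_id
        return True

    def release(self, name, window_id):
        if self.live_owner.get(name) == window_id:
            del self.live_owner[name]


def promote_survivor(model, closing_id, survivor_id, survivor_name,
                     survivor_frame, primary_name="primary"):
    # 1. The closing window gives up the stable name first.
    model.release(primary_name, closing_id)
    # 2. Seed the stable slot so adopting it cannot load a stale frame.
    model.saved_frames[primary_name] = survivor_frame
    # 3. Only continue if the registration is accepted.
    if not model.register(primary_name, survivor_id):
        return False
    # 4. Retire the survivor's UUID-scoped name and its saved frame.
    model.release(survivor_name, survivor_id)
    model.saved_frames.pop(survivor_name, None)
    return True


model = FrameAutosaveModel()
model.register("primary", "A")
model.saved_frames["primary"] = (0, 0, 800, 600)
model.register("uuid-B", "B")
model.saved_frames["uuid-B"] = (50, 50, 640, 480)
promoted = promote_survivor(model, "A", "B", "uuid-B", (50, 50, 640, 480))
```

In this model a rejected registration returns early, so the survivor keeps its UUID-scoped frame, which is the failure mode the "Release primary autosave before promotion" commit guards against.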

…rkspace sidebar (#3881) * Expose inactive first-click leaks across sidebar chrome Add failing behavior coverage for the first-click focus policy where chrome and SwiftUI sidebar surfaces bypass PaneFirstClickFocusSettings. The tests mirror the existing pane body assertions and exercise the workspace sidebar through a hosted runtime path instead of checking source text. The sidebar regression discovers the target row through its accessibility identifier, proves the coordinate with an active-window click, then verifies that an inactive first click with focusPaneOnFirstClick disabled is rejected before the workspace selection changes. Constraint: Do not run local tests; CI owns unit and UI verification for this repo. Rejected: Source-shape assertions for hardcoded acceptsFirstMouse | project policy requires runtime behavior tests. Confidence: medium Scope-risk: narrow Tested: git diff --check Not-tested: Local XCTest execution, by repo and user instruction. * Honor inactive first-click policy in sidebar chrome Route minimal-mode controls, PDF chrome, and the SwiftUI workspace sidebar through PaneFirstClickFocusSettings. The sidebar gets a single AppKit hosting gate that captures inactive first clicks when the setting is off and otherwise passes through to normal SwiftUI hit testing. The pass-through overlay is a dedicated subclass instead of mutable configuration, so the default hosting gate cannot be left in a bad pass-through state. The gate converts hit-test points from the superview into local bounds before deciding whether to capture. Constraint: focusPaneOnFirstClick is the existing source of truth for pane first-click behavior. Rejected: Guard each workspace row action | duplicates the policy across SwiftUI action sites and misses future sidebar controls. Confidence: medium Scope-risk: moderate Directive: Keep first-mouse policy at AppKit boundaries; do not add per-row workarounds for inactive-window activation. 
Tested: git diff --check; source scan for acceptsFirstMouse overrides; file length check for touched budgeted files. Not-tested: Local XCTest/build execution, by repo and user instruction; CI will run on the PR.

* Expose browser Return alert-sound regression The sign-in repro showed WebKit could submit a form on Return while AppKit still treated the same key equivalent as unhandled. The regression coverage models a focused embedded WebKit responder and a reentrant performKeyEquivalent call during forwarded keyDown, so CI can prove the alert-sound path is closed by the follow-up fix. Constraint: Regression test is intentionally isolated from app code changes to preserve the two-commit bug-fix proof. Confidence: high Scope-risk: narrow Directive: Keep Return forwarding tests focused on plain/Shift Return so command shortcuts and IME composition stay outside the forced path. Tested: Not run for this failing-test-only commit; follow-up fix commit is verified with cmux-unit build-for-testing. Not-tested: Runtime Google OAuth audio behavior at this commit. * Prevent browser Return from sounding after submit WebKit can re-enter performKeyEquivalent while a forwarded Return keyDown is already submitting a form. Treat that reentrant Return as handled, and recognize embedded WKWebView responder chains, so AppKit does not play the alert sound after the form action succeeds. Settings sign-in now opens the normal browser callback flow by default, keeping ASWebAuthenticationSession available only behind CMUX_AUTH_USE_ASWEB_AUTH_SESSION for debugging. Callback and sign-out paths cancel pending auth attempts so loading state cannot linger. Constraint: ASWebAuthenticationSession owns part of its event path outside cmux, so suppressing only cmux window responders did not eliminate the audible alert. Rejected: Keep ASWebAuthenticationSession as the default | the user still reproduced the beep after the first responder forwarding patch. Confidence: medium Scope-risk: moderate Directive: Do not return false from browser Return reentry while cmuxBrowserReturnForwardingDepth is active unless you can prove AppKit will not beep on unhandled Return. 
Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Tested: ./scripts/reload.sh --tag issue-3840-signin-beep-on-enter --launch Not-tested: Full live Google OAuth audio path after this exact commit; user confirmed the previous launched build before this reentry patch still beeped. * Keep browser sign-in timeout actor-neutral The sign-in beep fix moved login through the default browser by default, and Swift flagged the shared timeout default as main actor-isolated. The constant is immutable and Sendable, so marking it nonisolated keeps the public call site warning-free without changing behavior. Constraint: Keep the follow-up scoped to the new auth timeout path before publishing the PR. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Not-tested: Runtime OAuth callback against production Stack Auth * Avoid duplicate browser sign-in deadline Greptile flagged that the external-browser sign-in timeout raced the existing socket wait deadline. The UI path now owns only a cancellable main-run-loop timer for clearing the settings spinner, while beginSignInAndAwait relies on the single existing await deadline and clears loading after that result. Constraint: Keep the sign-in button from staying disabled if a browser callback never returns. Rejected: Leave the Task.sleep timeout in beginSignIn | duplicated beginSignInAndAwait deadline and was flagged by review. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Not-tested: Full external OAuth round trip against production Stack Auth * Align embedded browser arrow routing Greptile pointed out that embedded WKWebView responders received the Return/Enter forwarding path but not the matching plain-arrow forwarding path. 
Embedded auth web views now use the same browser ownership predicate for arrows, with a regression covering the keyDown route. Constraint: Keep browser key forwarding behavior consistent across cmux-owned and embedded WebKit responders. Rejected: Add only a comment | the behavior gap was small and directly testable. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Not-tested: Manual arrow navigation inside a live ASWebAuthenticationSession page * Remove new auth timeout primitive Review feedback flagged the external-browser loading timeout as a new timing primitive in shipped Swift. The UI sign-in path now treats opening the browser as the completed handoff and clears loading immediately, while the socket await path keeps loading only until its existing waitForSignInSettled deadline resolves. Constraint: Swift blocking-runtime review rule forbids new Task.sleep, Timer, asyncAfter, or polling loops in shipped app code. Rejected: Timer-backed UI timeout | the rule treats timers as blocking/timing primitives too. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Not-tested: Full external OAuth callback against production Stack Auth * Respect embedded browser review feedback Code review found that embedded browser forwarding only widened the browser gate but still sent keyDown to the original responder. The forwarding path now targets the resolved cmux or embedded WKWebView, keeps the re-entry consumption explicitly Return/Enter-only, and clears loading when sign-out cancels an in-flight auth flow. Constraint: PR cannot merge while CodeRabbit changes-requested feedback remains valid. Rejected: Treat the re-entry warning as a pure false positive | making the Return/Enter guard explicit is cheap and documents the invariant. 
Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Not-tested: Full external OAuth callback against production Stack Auth * Keep browser callback token cache coherent External-browser sign-in seeds the durable token store through the callback path, but the fast access-token cache was only updated by the direct credential flow. Sync the cache on callback and clear it on sign-out so callers see the same session state as the token store. Constraint: Browser callback is now the default sign-in completion path. Rejected: Fall back to reading the token store in getAccessToken | that changes the fast cached-read contract and broadens the auth call surface. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Avoid browser sign-in loading noise Default browser sign-in now hands off without publishing a transient loading pulse, while the awaited sign-in helper can still keep the loading state active until the callback path settles. Document the shared callback invariant for system web-auth sessions so the session reference is cleared without a redundant cancel. Constraint: External browser handoff should feel idle after opening the browser, but beginSignInAndAwait still needs a visible pending state. Rejected: Always setting loading before choosing the sign-in transport | this creates a needless true-to-false publication on the default browser path. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Avoid caching unrefreshed browser tokens The callback path seeds persisted tokens before refreshing session state, but the fast access-token cache should only represent an authenticated in-memory session. 
Move the cache write after refreshSession succeeds and cover the failed-refresh path so getAccessToken cannot return a token while auth state remains unauthenticated. Constraint: Callback token persistence is still needed before refreshSession reads through the Stack client. Rejected: Clearing all persisted callback tokens on refresh failure | that changes recovery semantics outside the fast cache issue. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Prevent stale browser callback token reads Browser callback handling now clears the fast access-token cache before its first async boundary, so failed refreshes cannot expose a previous session token to concurrent token lookups. The auth sign-in tests also pin the external-browser branch through dependency injection instead of ambient process environment. Constraint: CodeRabbit flagged a callback race before tokenStore seeding and brittle environment-dependent auth tests. Rejected: Keep relying on CMUX_AUTH_USE_ASWEB_AUTH_SESSION in tests | ambient process state can flip the branch under test. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Resolve embedded field editor web views before forwarding keys Detached AppKit field editors can become first responder for embedded WebKit forms. Resolve their owner view before the normal responder/superview walk so Return and Enter forwarding targets the owning WKWebView instead of stopping at the detached NSTextView. Constraint: CodeRabbit identified the current embedded WebKit field-editor owner gap on PR #3843. Rejected: Delegate-based owner discovery | NSTextView.delegate is unsafe-unretained and existing routing support already centralizes owner resolution. 
Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Narrow browser return reentry suppression to one event The browser Return forwarding path still needs to suppress WebKit's same-event performKeyEquivalent reentry to avoid the sign-in submit alert sound. A depth-only guard was too broad, so the guard now records the exact forwarded event and lets different nested Return/Enter events fall through to normal AppKit handling. Constraint: PR review requested preserving legitimate nested Return/Enter feedback while keeping the sign-in submit path quiet. Rejected: Browser-accepted-submit callback | WKWebView.keyDown does not synchronously report whether page JavaScript accepted or submitted the key. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Resolve detached field editor responder owners Detached AppKit field-editor owner views can point at an embedded WKWebView through nextResponder rather than superview. Follow that responder chain for owner resolution, and keep forwarded Return event identity independent of windowNumber so synthetic or detached events still match the same forwarded key. Constraint: CodeRabbit current-head review requested responder-chain owner lookup and window-number-free forwarded event identity. Rejected: Broaden all NSView embedded web lookup immediately | the review blocker is specifically field-editor owner lookup, and keeping the change scoped reduces responder-chain side effects. Confidence: high Scope-risk: narrow Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit * Reduce duplicate embedded browser resolver work The embedded browser resolver now owns detached field-editor owner lookup, so the performKeyEquivalent call site can use the single shared resolver directly. 
This addresses the remaining current review cleanup without changing key forwarding behavior. Constraint: Keep the sign-in Return/Enter forwarding behavior unchanged while clearing current PR feedback. Rejected: Keep the redundant explicit field-editor probe | it repeated the same resolver branch and left a current review thread open. Confidence: high Scope-risk: narrow Directive: Keep detached field-editor ownership resolution centralized in cmuxOwningEmbeddedWebView(for:) and cmuxOwningEmbeddedWebViewFromFieldEditorOwner(_:). Tested: git diff --check Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Not-tested: Local unit execution, per repository policy. * Keep browser key tests focused The latest review asked to separate auth sign-in coverage from browser key forwarding coverage. Moving the auth suite into its own test file keeps each test file under the size guidance without changing test behavior. Constraint: Xcode project requires explicit test source membership Rejected: Rename the existing browser test file | larger project-file churn without improving the requested split Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep auth callback coverage in AuthManagerBrowserSignInTests and key-routing coverage in BrowserArrowKeyForwardingTests Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Tested: ./scripts/reload.sh --tag issue-3840-signin-beep-on-enter --launch Not-tested: Runtime XCTest execution, per repository policy * Keep auth callbacks scoped to initiating debug builds Parallel tagged Debug apps share the cmux-dev callback scheme, so the default sign-in path must keep the callback owned by the initiating ASWebAuthenticationSession. External-browser sign-in remains available only through an explicit environment opt-out or test injection. 
Constraint: Tagged Debug builds share the same registered callback URL scheme Constraint: Apple documents ASWebAuthenticationSession callbacks as scoped to the calling app even when schemes are shared Rejected: Per-instance callback schemes | registered URL schemes are app-bundle metadata and would be higher-churn for this PR Confidence: high Scope-risk: narrow Reversibility: clean Directive: Do not make external-browser sign-in the default while tagged Debug builds share cmux-dev callbacks Tested: git diff --check Tested: ./scripts/test-unit.sh build-for-testing -derivedDataPath /tmp/cmux-issue-3840-signin-beep-on-enter-unit Tested: ./scripts/reload.sh --tag issue-3840-signin-beep-on-enter --launch Not-tested: Runtime XCTest execution, per repository policy * Route web-auth errors through unified logging Greptile's post-CI summary flagged the new ASWebAuthenticationSession failure logs as raw NSLog calls. AuthManager now uses a file-scoped unified Logger for those production web-auth outcomes while keeping the existing DEBUG-only authLog helper unchanged for auth debug traces. Constraint: Swift logging guidance requires Logger/os.log instead of NSLog for new production logging. Rejected: Reuse authLog for these failures | authLog is DEBUG-only and intentionally writes PII-prone traces to a temporary file. Confidence: high Scope-risk: narrow Reversibility: clean Tested: git diff --check Tested: rg -n 'NSLog\(' Sources/Auth/AuthManager.swift returned no matches Not-tested: Local tests, per user instruction and repository policy.
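The token-cache ordering from the callback commits (clear the fast cache before the first async boundary, persist the token, write the cache only after a successful refresh) can be sketched in Python with `asyncio`. This is not the Swift `AuthManager`; `AuthSession`, its attributes, and the two refresh helpers are hypothetical names used only for illustration.

```python
import asyncio


class AuthSession:
    """Toy model of the callback path's token store and fast cache."""

    def __init__(self, refresh):
        self._refresh = refresh          # async callable: token -> bool
        self.token_store = None          # durable store, seeded by the callback
        self.cached_access_token = None  # fast in-memory cache
        self.authenticated = False

    def get_access_token(self):
        # Fast cached read only; never falls back to the token store.
        return self.cached_access_token

    async def handle_callback(self, token):
        # Clear the cache before the first await so concurrent lookups
        # cannot see a previous session's token.
        self.cached_access_token = None
        # Persist first, because the refresh reads through the store.
        self.token_store = token
        ok = await self._refresh(token)
        self.authenticated = ok
        if ok:
            # The cache is written only after the refresh succeeded.
            self.cached_access_token = token
        return ok


async def refresh_ok(_token):
    return True


async def refresh_fail(_token):
    return False


good = AuthSession(refresh_ok)
good_result = asyncio.run(good.handle_callback("token-1"))

stale = AuthSession(refresh_fail)
stale.cached_access_token = "old-token"
stale_result = asyncio.run(stale.handle_callback("token-2"))
```

In the failing case the persisted token remains in the store (matching the rejected alternative of clearing it), but `get_access_token` returns nothing while the session is unauthenticated.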

…nside cmux (#3714) * Add failing regression tests for Vertex/Bedrock auth env preservation Per AGENTS.md regression-test commit policy, this is the RED commit: it introduces three new test cases that fail under the current wrapper behavior, demonstrating that: 1. CLAUDE_CODE_USE_VERTEX=1 + Vertex model ids are silently scrubbed when launching claude inside cmux (#3641 root cause). 2. CLAUDE_CODE_USE_BEDROCK=1 + Bedrock model ids are silently scrubbed too (#3638 root cause). 3. Falsy CLAUDE_CODE_USE_VERTEX=0 must still be cleared (negative case; already passes — guards future fix from over-preserving). Defense-in-depth assertions on ANTHROPIC_VERTEX_PROJECT_ID, CLOUD_ML_REGION, ANTHROPIC_VERTEX_BASE_URL, ANTHROPIC_BEDROCK_BASE_URL, AWS_REGION, and AWS_PROFILE pin the current behavior that those config vars are not in CLAUDE_AUTH_SELECTION_ENV_KEYS, so a future regression that adds them to the unset list would be caught. The fix follows in the next commit. Refs: #3641, #3638 * Auto-preserve Vertex/Bedrock auth env in claude wrapper Closes #3641 (Vertex), closes #3638 (Bedrock). When IN_CMUX=1, the wrapper unsets CLAUDE_CODE_USE_VERTEX, CLAUDE_CODE_USE_BEDROCK, ANTHROPIC_MODEL, ANTHROPIC_SMALL_FAST_MODEL, and ANTHROPIC_API_KEY by default so cmux can manage which Claude account is in use. That works for users on the default Anthropic backend, but users who configure Vertex AI or Bedrock in their shell rc had no account-management surface in cmux to opt back in — they just got 'not logged in' with no warning. The undocumented opt-in (CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV=1 + CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV_KEYS=...) exists but neither bug reporter could discover it. 
Fix: should_preserve_claude_auth_selection_key() now also returns true (in addition to the existing opt-in path) when:
  * Key == CLAUDE_CODE_USE_VERTEX and its value is truthy (1/true/yes)
  * Key == CLAUDE_CODE_USE_BEDROCK and its value is truthy
  * Key == ANTHROPIC_MODEL or ANTHROPIC_SMALL_FAST_MODEL AND CLAUDE_CODE_USE_VERTEX or CLAUDE_CODE_USE_BEDROCK is truthy (Vertex/Bedrock require backend-specific full model ids that the default Anthropic backend cannot use, so the model selector is meaningless without the backend selector)
Falsy values (CLAUDE_CODE_USE_VERTEX=0 or empty) do NOT trigger preservation; the third regression test pins this so a future fix cannot over-preserve. Vertex/Bedrock supporting vars (ANTHROPIC_VERTEX_PROJECT_ID, CLOUD_ML_REGION, ANTHROPIC_VERTEX_BASE_URL, ANTHROPIC_BEDROCK_BASE_URL, AWS_*) are not in CLAUDE_AUTH_SELECTION_ENV_KEYS so they were already preserved; the regression tests assert that explicitly.
Test infra changes:
  * Updated existing test_live_socket_preserves_third_party_claude_auth_for_fresh_launch to drop CLAUDE_CODE_USE_BEDROCK=1 / CLAUDE_CODE_USE_VERTEX=1 from the inherited env. That test conflated 'no Vertex/Bedrock signal' with 'asserting they get cleared'; the dedicated auto-preserve tests now cover the truthy case correctly.
Backwards compat:
  * CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV opt-in path is unchanged.
  * Users on the default Anthropic backend continue to have selection vars scrubbed (no behavior change).
Refs: #3641, #3638
* Address PR #3714 review feedback
  * Make the wrapper-auth tests hermetic: pop CLAUDE_CODE_USE_VERTEX, CLAUDE_CODE_USE_BEDROCK, ANTHROPIC_MODEL, ANTHROPIC_SMALL_FAST_MODEL (and friends) from the inherited os.environ before applying each test's inherited_env. Without the pop, a developer or CI machine that legitimately exports CLAUDE_CODE_USE_VERTEX=1 in their shell would see the new auto-preserve fire ambiently and fail the negative tests even though the wrapper is correct (Copilot review).
* Add ANTHROPIC_API_KEY assertion to both Vertex and Bedrock auto-preserve tests: the API key must be cleared even when a Vertex/Bedrock signal is active (those backends do not consume it). Pins the scrub-API-key invariant against future regressions (CodeRabbit review). * Rename test_live_socket_does_not_auto_preserve_when_vertex_value_is_falsy to test_live_socket_does_not_auto_preserve_when_all_backends_are_falsy to reflect that the test exercises both falsy CLAUDE_CODE_USE_VERTEX=0 and empty CLAUDE_CODE_USE_BEDROCK="" (CodeRabbit nit). * Add test_live_socket_explicit_key_list_is_additive_to_vertex_auto_preserve pinning the precedence between CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV_KEYS and the new auto-preserve: the explicit key list is additive, not exclusionary. A user opting in with KEYS=ANTHROPIC_API_KEY plus CLAUDE_CODE_USE_VERTEX=1 still gets Vertex auto-preserved (Greptile + CodeRabbit raised this as a behavior-change concern; the test documents the intended semantics). * Add a fall-through comment in should_preserve_claude_auth_selection_key() explaining that the explicit-list miss intentionally falls through to auto-preserve, so a future maintainer doesn't 'fix' the perceived early-return regression (Greptile + CodeRabbit suggestion). Refs: #3641, #3638 PR: #3714 * Fix Ruff RUF059: silence unused real_argv in additive-list test The test only asserts on auth_env, so destructure the third tuple element as _, the same pattern test_live_socket_does_not_auto_preserve_when_all_backends_are_falsy already uses. Caught by CodeRabbit review. * Harden auth-env tests: AWS_* wildcard pop + truthy-variant coverage Two follow-ups from CodeRabbit on PR #3714: * AWS_* wildcard hermeticity: developer/CI shells often export AWS_ACCESS_KEY_ID, AWS_SESSION_TOKEN, AWS_DEFAULT_REGION etc. that the previous explicit pop list (AWS_PROFILE/AWS_REGION only) didn't cover.
Add a list-comp pop of every key starting with 'AWS_' so no ambient AWS_* var can leak into the wrapper subprocess and skew Bedrock-flow assertions. * Truthy-variant coverage: the auto-preserve case statement accepts 1|true|TRUE|yes|YES (matching the existing CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV parser); the focused vertex/bedrock tests only exercised '1'. Add test_live_socket_auto_preserve_accepts_all_documented_truthy_variants iterating over all 5 variants for both backends so a future simplification of the case statement cannot silently drop yes/YES. Verified locally with hermetic check: AWS_ACCESS_KEY_ID=fake ... CLAUDE_CODE_USE_VERTEX=true \ python3 Tests/test_claude_wrapper_hooks.py -> PASS (would have false-failed before this commit on the CLAUDE_CODE_USE_VERTEX=true line). Refs: #3641, #3638 PR: #3714 * Cover ANTHROPIC_SMALL_FAST_MODEL in falsy-backends negative test The wrapper treats ANTHROPIC_MODEL and ANTHROPIC_SMALL_FAST_MODEL identically in should_preserve_claude_auth_selection_key (both gated on the same vertex_on/bedrock_on flags), but test_live_socket_does_not_auto_preserve_when_all_backends_are_falsy only seeded ANTHROPIC_MODEL. Same defense-in-depth gap pattern CodeRabbit caught earlier with ANTHROPIC_API_KEY in the positive tests — pin the small-fast-model invariant on the negative path too. Refs: #3641, #3638 PR: #3714 * Remove duplicated falsy-backends assertions Commit 8a3bc1d accidentally appended a second copy of the same four expect() blocks (CLAUDE_CODE_USE_VERTEX/BEDROCK, ANTHROPIC_MODEL, ANTHROPIC_SMALL_FAST_MODEL == __UNSET__) right after the intended assertions. Functionally a no-op (each invariant is the same; the test already passes), but redundant noise that future reviewers would flag. Drops the duplicated 21-line block; the surviving assertions are the exact ones the negative test was designed to pin. Refs: #3641 PR: #3714
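The preservation rules described in the commits above can be restated as a small decision function. The real wrapper is a shell script, so this is only a Python sketch of the described semantics, not the wrapper's code: the function names `should_preserve` and `scrub_auth_selection_env` are hypothetical, and the comma/whitespace separator for `CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV_KEYS` is an assumption.

```python
import re
from typing import Dict, Mapping, Optional

# Truthy spellings the commit message lists for the auto-preserve case statement.
TRUTHY = {"1", "true", "TRUE", "yes", "YES"}

# Keys the wrapper unsets by default inside cmux, per the commit message.
CLAUDE_AUTH_SELECTION_ENV_KEYS = (
    "CLAUDE_CODE_USE_VERTEX",
    "CLAUDE_CODE_USE_BEDROCK",
    "ANTHROPIC_MODEL",
    "ANTHROPIC_SMALL_FAST_MODEL",
    "ANTHROPIC_API_KEY",
)


def is_truthy(value: Optional[str]) -> bool:
    return value in TRUTHY


def should_preserve(key: str, env: Mapping[str, str]) -> bool:
    """Hypothetical Python restatement of the described preservation rules."""
    # Explicit opt-in path. The list separator is an assumption.
    if is_truthy(env.get("CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV")):
        raw = env.get("CMUX_PRESERVE_CLAUDE_AUTH_SELECTION_ENV_KEYS", "")
        listed = {k for k in re.split(r"[,\s]+", raw) if k}
        if key in listed:
            return True
        # An explicit-list miss intentionally falls through to auto-preserve.

    vertex_on = is_truthy(env.get("CLAUDE_CODE_USE_VERTEX"))
    bedrock_on = is_truthy(env.get("CLAUDE_CODE_USE_BEDROCK"))
    if key == "CLAUDE_CODE_USE_VERTEX":
        return vertex_on
    if key == "CLAUDE_CODE_USE_BEDROCK":
        return bedrock_on
    if key in ("ANTHROPIC_MODEL", "ANTHROPIC_SMALL_FAST_MODEL"):
        # Model ids are only meaningful alongside a backend selector.
        return vertex_on or bedrock_on
    return False  # e.g. ANTHROPIC_API_KEY is cleared unless explicitly listed


def scrub_auth_selection_env(env: Mapping[str, str]) -> Dict[str, str]:
    """Drop selection keys that should not be preserved; keep everything else."""
    return {
        k: v
        for k, v in env.items()
        if k not in CLAUDE_AUTH_SELECTION_ENV_KEYS or should_preserve(k, env)
    }
```

Supporting variables such as `AWS_REGION` pass through untouched because they are not in the selection key list, which matches the defense-in-depth assertions the tests pin.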

* Expose missing right-sidebar CLI coverage The CLI had no tests proving keyboard-only right-sidebar actions were reachable through the socket surface, so add regression coverage before wiring the implementation. Constraint: Do not run local tests for this repo; CI owns verification. Confidence: medium Scope-risk: narrow Tested: Not run locally per repository policy. Not-tested: Expected to fail before the right_sidebar command family exists. * Restore right-sidebar parity for CLI automation Keyboard shortcuts could already drive every right-sidebar action, but automation had no matching socket command family. Add a single right_sidebar IPC path, one app-level dispatcher over the existing sidebar handlers, and one CLI branch that forwards the full quiet command surface with optional targeting. Constraint: Preserve the hand-rolled CLI and V1 socket style already used by sidebar commands. Constraint: Do not run local tests or direct xcodebuild in this repo; CI and tagged reload own verification. Rejected: Add ArgumentParser for this command family | inconsistent with the existing CLI dispatcher. Confidence: medium Scope-risk: moderate Directive: Keep future right-sidebar CLI actions routed through RightSidebarRemoteCommand instead of adding one-off socket branches. Tested: python3 -m json.tool Resources/Localizable.xcstrings Tested: git diff --check Not-tested: Local unit/UI tests and local xcodebuild, per repository and task policy. * Align right-sidebar CLI edge paths with review findings Remote review caught two edge-path divergences: invalid subcommands could trigger target resolution first, and show used a different hidden-to-visible path than toggle. Validate the subcommand before socket lookups and route show through the existing toggle handler when it actually reveals the sidebar. Constraint: Keep the CLI quiet and preserve the existing V1 forwarding style. Constraint: Do not run local tests or direct xcodebuild in this repo; CI owns executable verification. 
Rejected: Add a separate show-specific app handler | it would duplicate the toggle path the feature is meant to mirror. Confidence: medium Scope-risk: narrow Directive: Keep target resolution after subcommand validation so invalid commands report command errors before socket lookup errors. Tested: python3 -m json.tool Resources/Localizable.xcstrings Tested: git diff --check Not-tested: Local unit/UI tests and local xcodebuild, per repository and task policy. * Keep right-sidebar show on the shared reveal path The previous follow-up still left a stale-target fallback that made show visible by mutating FileExplorerState directly. Fail stale targeted show requests instead, and keep every hidden-to-visible show path on the existing toggle handler. Constraint: Preserve keyboard-shortcut parity for the visible-sidebar transition. Rejected: Keep a stale-context direct mutation fallback | it preserves the split behavior review flagged. Confidence: medium Scope-risk: narrow Directive: Do not add direct show-only visibility mutation unless there is a documented target-without-window use case. Tested: git diff --check Not-tested: Local unit/UI tests and local xcodebuild, per repository and task policy. * Strengthen confidence in right-sidebar CLI behavior The right-sidebar CLI was a small command surface, but it crosses CLI parsing, socket forwarding, AppDelegate targeting, and persisted sidebar state. Add behavior-level coverage for handle resolution, forwarding failure, parser validation, target scoping, and the no-socket help contract while keeping the IPC parser in its own file. Constraint: Local test execution is routed through CI/VM by repo policy; tagged reload build is queued behind an existing shared Xcode build lock. Rejected: Source-text or project-grep tests | repo test policy requires observable runtime behavior instead. 
Confidence: medium Scope-risk: narrow Directive: Keep right-sidebar CLI tests at the socket/behavior boundary so future parser rewrites preserve command contracts. Tested: git diff --check; plutil -lint GhosttyTabs.xcodeproj/project.pbxproj; existing PR CI was green before this commit Not-tested: Local unit tests and local tagged build completion; ./scripts/reload.sh --tag issue-3808-right-sidebar-cli remains queued behind another Xcode build * Clarify right-sidebar IPC isolation The right-sidebar socket parser now has its protocol types in a dedicated file, and CI review asked that these pure values follow the repo's explicit concurrency boundary convention. Marking the request, target, state, result, and mode values as nonisolated/Sendable keeps the IPC surface safe to pass across the socket-to-main-actor boundary without relying on implicit defaults. Constraint: Local tests run in CI for this repo; keep local verification to static checks and tagged builds. Rejected: Leave Sendable implicit | this would diverge from the existing SocketLineProcessingResult pattern. Confidence: medium Scope-risk: narrow Directive: Keep right-sidebar IPC values pure and explicitly sendable if the command family grows. Tested: git diff --check Not-tested: Local unit tests, per repository testing policy * Reuse shared CLI handle normalization for right sidebar The right-sidebar CLI needs UUIDs for the remote command, but it does not need its own copy of window/workspace index resolution. Compose the existing handle normalizers with one small UUID extraction helper so right-sidebar targeting follows the same handle rules as the rest of the CLI while still forwarding concrete UUIDs over the V1 socket command. Constraint: right_sidebar remote parser only accepts UUID target IDs, while public CLI flags accept UUIDs, refs, and indexes. Rejected: Keep bespoke right-sidebar index helpers | duplicates the established CLI normalization path and drew review feedback. 
Confidence: medium Scope-risk: narrow Directive: Add future target syntaxes to the shared normalizers first, then compose them here. Tested: git diff --check Not-tested: Local unit tests, per repository testing policy * Tighten right-sidebar target and validation boundaries Review found two edge cases in the right-sidebar CLI/socket path: invalid set modes could still proceed to target resolution, and explicit remote targets could fall back to active sidebar state when their registered context lacked one. Validate set modes before resolution, require explicit targets to own sidebar state, and keep usage strings aligned with the public --workspace flag. Constraint: right_sidebar IPC still accepts --tab as an internal alias, but user-facing usage should prefer --workspace. Rejected: Move the parser to a core module in this PR | the thread is already outdated and the type still depends on app-target RightSidebarMode; doing that cleanly would require a broader module-boundary change. Confidence: medium Scope-risk: narrow Directive: Keep validation before socket/target work for every right-sidebar CLI branch. Tested: git diff --check; jq empty Resources/Localizable.xcstrings Not-tested: Local unit tests, per repository testing policy * Keep targeted right-sidebar toggle on shared window path A targeted remote toggle should not bypass the same window-selection path used by show/focus when the target sidebar is hidden. Return target-not-found when a non-active target has no concrete window instead of mutating FileExplorerState directly, and lock the no-window workspace case with a regression assertion. Constraint: Right-sidebar visibility commands must preserve the socket focus policy and avoid silently acting on a different window. Rejected: Keep direct toggle mutation for no-window targets | it recreates the split-path behavior already rejected for show. 
Confidence: medium Scope-risk: narrow Directive: Visibility commands that can reveal UI should go through the shared window-aware path for explicit targets. Tested: git diff --check; jq empty Resources/Localizable.xcstrings Not-tested: Local unit tests, per repository testing policy * Make right-sidebar socket focus policy command-specific The V1 socket policy was still treating every right_sidebar command as focus-capable, even though mode, hide, invalid requests, and set --no-focus should preserve focus. Parse the right_sidebar request before installing the socket policy so only show, toggle, focus, and focused mode switches can mutate in-app focus, and add debug-level coverage for the command matrix. Constraint: V1 socket policy normally only sees the command key, so right_sidebar passes its raw args into the policy params for granular classification. Rejected: Keep right_sidebar in the broad focus-intent set | it violates the socket focus policy for state reads and no-focus writes. Confidence: medium Scope-risk: narrow Directive: New right_sidebar subcommands must update the command-specific focus policy test. Tested: git diff --check; jq empty Resources/Localizable.xcstrings Not-tested: Local unit tests, per repository testing policy
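The command-specific focus policy from the last commit above can be illustrated with a small classifier. The actual implementation lives in the Swift app; this Python sketch only mirrors the described matrix. The function name `may_change_focus` is hypothetical, the mode names are passed in because the real `RightSidebarMode` cases are not listed in the commit text, and treating the first argument after `set` as the mode is an assumption.

```python
from typing import Collection, Sequence

# Subcommands named in the commit messages.
FOCUS_CAPABLE = {"show", "toggle", "focus"}
FOCUS_PRESERVING = {"hide", "mode"}


def may_change_focus(args: Sequence[str], valid_modes: Collection[str]) -> bool:
    """Return True only for requests that are allowed to mutate in-app focus.

    Validation happens first, so invalid requests never count as focus-capable.
    """
    if not args:
        return False
    sub = args[0]
    if sub in FOCUS_CAPABLE:
        return True
    if sub in FOCUS_PRESERVING:
        return False
    if sub == "set":
        # Assumption: the mode is the first argument after "set".
        mode = args[1] if len(args) > 1 else None
        if mode not in valid_modes:
            return False  # invalid set mode: reject before any target work
        return "--no-focus" not in args[2:]
    return False  # unknown subcommand
```

For example, with a hypothetical mode set `{"files"}`, `["set", "files"]` is focus-capable while `["set", "files", "--no-focus"]`, `["hide"]`, and `["bogus"]` are not.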

* Prove remote daemon upload must use remote HOME The bootstrap regression needs an executable seam because the failing behavior lives in Process arguments, not a pure string builder. The new test stubs SSH/SCP and captures the daemon upload destination after the remote shell reports a HOME that differs from the SFTP starting directory. Constraint: Local test execution is intentionally skipped by repo policy and user instruction. Rejected: Assert source text for the scp target | repo test policy requires observable runtime behavior. Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: XCTest not run locally by policy * Anchor remote daemon bootstrap paths to remote HOME HPC and shared-cluster sshd setups can start SFTP somewhere other than the login shell HOME. The bootstrap now captures remote HOME during the existing SSH probe, builds one daemon install location from it, and reuses the absolute daemon path for upload, install, hello, relay metadata, and proxy transport. Constraint: scp resolves relative destinations against the SFTP subsystem cwd, not necessarily the shell HOME. Rejected: Patch only the scp destination | leaving shell/proxy paths relative would preserve two competing path models. Confidence: high Scope-risk: moderate Directive: Keep daemon bootstrap paths derived from RemoteDaemonInstallLocation; do not reintroduce call-site-specific HOME concatenation. Tested: git diff --check Not-tested: XCTest/build not run locally by policy and user instruction * Document remote bootstrap process test seam Greptile flagged the new nonisolated test override as under-documented shared mutable state. The seam is now DEBUG-only and documents that XCTest installs it before controller startup, clears it during teardown, and owns synchronization for captured test state. Constraint: Do not run local tests or builds for this task. Rejected: Add a broader process-runner abstraction | this hook is only a regression-test seam and should stay narrow. 
Confidence: high Scope-risk: narrow Tested: git diff --check Not-tested: XCTest/build not run locally by policy and user instruction * Clarify remote install path error
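The idea of deriving every bootstrap path from one install location anchored to the remote HOME can be sketched as follows. `RemoteDaemonInstallLocation` is the type name used in the commit message, but this Python version is only an illustration: the directory and binary names are hypothetical placeholders, and the real type is Swift.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteDaemonInstallLocation:
    """Single source of truth for remote daemon paths (illustrative sketch)."""

    remote_home: str                    # HOME reported by the remote login shell
    relative_dir: str = ".cmux/bin"     # hypothetical directory name
    binary_name: str = "cmuxd-remote"   # hypothetical binary name

    def __post_init__(self) -> None:
        # A relative HOME would reintroduce the ambiguity this change removes.
        if not posixpath.isabs(self.remote_home):
            raise ValueError(f"remote HOME must be absolute: {self.remote_home!r}")

    @property
    def directory(self) -> str:
        return posixpath.join(self.remote_home, self.relative_dir)

    @property
    def daemon_path(self) -> str:
        return posixpath.join(self.directory, self.binary_name)

    def scp_destination(self, host: str) -> str:
        # Absolute destination, so the SFTP subsystem cwd no longer matters.
        return f"{host}:{self.daemon_path}"
```

Because upload, install, and later invocations all read `daemon_path` from the same object, there is no call site left that concatenates HOME on its own.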

* Expose silent durable cron downgrade in the Claude wrapper The wrapper currently injects Claude hooks without an explicit CronCreate durability boundary, so a caller can request durable:true and still receive a session-only job. This regression test requires the generated settings to include a synchronous CronCreate guard before the fix lands. Constraint: The first commit must contain only the failing regression test for issue #3395. Confidence: high Scope-risk: narrow Tested: Reproduced the bug locally with the repo wrapper, fake cmux socket, temp HOME, and Claude Code 2.1.139; CronCreate returned durable=false and CronList showed [session-only]. Not-tested: Did not run the regression suite locally because project policy keeps tests on CI. * Reject unsupported durable Claude cron requests cmux injects Claude Code hooks but does not own Claude's durable cron scheduler. Letting CronCreate durable:true continue through that hook stack produces a session-only job, so the wrapper now installs a synchronous CronCreate PreToolUse guard and the CLI returns Claude's documented deny decision shape when durable persistence is requested. Constraint: Issue #3395 allows either end-to-end durable persistence or an explicit rejection; persistence is upstream scheduler behavior outside cmux's current ownership boundary. Rejected: Honor durable by writing scheduled_tasks.json from cmux | cmux lacks the upstream scheduler's job schema and restart execution path, so writing the file would create another split source of truth. Rejected: Detect the downgrade after CronCreate | callers would still observe an executed session-only job before cmux could warn. Confidence: high Scope-risk: narrow Directive: Keep Claude pre-tool telemetry async; blocking decisions belong in dedicated narrow synchronous guards. Tested: bash -n Resources/bin/claude; python3 -m py_compile tests/test_claude_wrapper_hooks.py; git diff --check. 
Not-tested: Did not run app or XCTest locally; CI owns regression execution and dev app launch is intentionally deferred until CI is green. * Make the Claude cron guard read raw hook input Cursor correctly caught that the first guard read cmux's compacted telemetry payload, which strips durable from tool_input. The guard now reads the raw hook JSON for the decision, while the compacted payload remains the feed/status surface. Denied durable CronCreate calls also emit PreToolUse feed telemetry so the denial is visible. Constraint: Do not widen feed telemetry to raw hook payloads; raw input is only for the local decision boundary. Rejected: Add durable to the compacted telemetry allowlist | that would leak scheduler prompt text into the feed path just to make a guard decision. Confidence: high Scope-risk: narrow Tested: bash -n Resources/bin/claude; python3 -m py_compile tests/test_claude_wrapper_hooks.py tests/test_claude_hook_stop_last_assistant.py; git diff --check. Not-tested: Did not run app or XCTest locally; CI owns test execution and launch remains blocked until green.
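A guard of the kind described above can be sketched as a function over the raw hook JSON. This is an illustration rather than the cmux implementation: the function name is hypothetical, and the input fields (`tool_name`, `tool_input`) and the deny output shape follow Claude Code's hooks documentation as best understood here, so they should be checked against that reference before being relied on.

```python
import json
from typing import Optional


def cron_create_guard(raw_hook_json: str) -> Optional[dict]:
    """Return a PreToolUse deny decision for durable CronCreate, else None.

    Reads the raw hook payload, not a compacted telemetry copy, because the
    compacted form may have dropped `durable` from tool_input.
    """
    try:
        payload = json.loads(raw_hook_json)
    except json.JSONDecodeError:
        return None  # unreadable input: make no blocking decision
    if not isinstance(payload, dict) or payload.get("tool_name") != "CronCreate":
        return None
    tool_input = payload.get("tool_input")
    if not isinstance(tool_input, dict) or tool_input.get("durable") is not True:
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": (
                "Durable cron jobs are not supported in this session; "
                "the job would silently become session-only."
            ),
        }
    }
```

Returning `None` for everything except the one rejected case keeps the guard narrow, in line with the directive that blocking decisions belong in dedicated synchronous guards.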

* Prove passive reveal must not activate cmux This regression test captures issue #3347 at the activation-owner boundary: a passive reveal that does not transfer key focus must not make cmux the Launch Services frontmost application. Constraint: Test-only commit before any fix code is required for the issue workflow Confidence: high Scope-risk: narrow Tested: Not run locally; repository policy routes tests through CI/VM Not-tested: Local XCTest execution * Keep passive focus changes from activating cmux cmux could become the Launch Services frontmost app without owning keyboard focus because passive window/model focus paths were allowed to activate the process. MainWindowVisibilityController now treats application activation as valid only when the same request transfers key focus, and TabManager.focusTab no longer performs AppKit activation as a model-selection side effect. The Task Manager row action keeps its intentional bring-to-front behavior by going through AppDelegate's explicit main-window focus path before selecting the workspace. Constraint: Fix commit must stay separate from the failing regression-test commit for issue #3347 Rejected: Add a deactivate timer after key loss | timing repair would still allow activation without key focus and would be fragile during AppKit key-window transitions Confidence: medium Scope-risk: moderate Directive: Keep AppKit activation in AppDelegate/MainWindowVisibilityController entrypoints; do not reintroduce NSApp.activate from TabManager model selection Tested: git diff --check Not-tested: Local XCTest/build per repository policy; CI will run after PR * Cover miniaturized passive reveal * Fix passive reveal of miniaturized windows * Guard find selection debug logging

* test: cover lossy Chinese pasteboard text * fix: recover faithful CJK paste text * fix: preserve paste fast path for questions

* Expose renderer row rebuild crash repro The Ghostty submodule now points at the failing regression commit for issue #3369. It preserves the unsafe shaped-cell catch-up behavior and adds a targeted test that panics when preedit covers the only shaped glyph and the row tail is empty. Constraint: Two-commit bugfix policy requires this failing test commit before the renderer fix Confidence: high Scope-risk: narrow Tested: ghostty submodule targeted Zig test fails with index out of bounds at generic.zig:52 Not-tested: Final cmux CI until the fix commit advances the submodule pointer * Prevent unsafe renderer row preedit catch-up The Ghostty pin advances from the failing regression commit to the bounded shaped-glyph cursor fix. The renderer row rebuild path now treats shaped glyph cells as a shorter bounded stream than terminal cells, which prevents preedit-covered rows with empty tails from reading past the shaped-cell slice. Constraint: Parent commit follows the failing-test submodule pointer commit for issue #3369 Rejected: Patch the Swift terminal surface bridge | the crash is in Ghostty GenericRenderer row rebuild and should be impossible independent of cmux lifecycle timing Confidence: high Scope-risk: narrow Directive: Do not assume terminal cell count equals shaped glyph count in renderer row iteration Tested: cd ghostty && zig build test -Demit-macos-app=false -Dtest-filter='renderer rebuild row preedit catch-up tolerates empty tail after covered glyph' Not-tested: cmux CI and GhosttyKit archive publication pending * Document GhosttyKit release pin
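The directive "do not assume terminal cell count equals shaped glyph count" can be illustrated with a bounded cursor over a shorter glyph stream. This is a simplified Python model, not Ghostty's Zig renderer: the data layout (a sorted list of `(column, glyph)` pairs and an inclusive preedit column range) and the function name are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Shaped = Tuple[int, str]  # (column, glyph); assumed sorted by column


def glyphs_outside_preedit(
    column_count: int,
    shaped: List[Shaped],
    preedit: Optional[Tuple[int, int]],
) -> List[Shaped]:
    """Walk terminal columns while advancing a bounded cursor over shaped glyphs."""
    out: List[Shaped] = []
    i = 0  # cursor into `shaped`, which may be shorter than the row
    x = 0
    while x < column_count:
        if preedit is not None and preedit[0] <= x <= preedit[1]:
            # Catch-up: skip glyphs covered by the preedit range.
            # The `i < len(shaped)` bound is the important part.
            while i < len(shaped) and shaped[i][0] <= preedit[1]:
                i += 1
            x = preedit[1] + 1
            continue
        # Discard any glyphs that fall before the current column.
        while i < len(shaped) and shaped[i][0] < x:
            i += 1
        if i < len(shaped) and shaped[i][0] == x:
            out.append(shaped[i])
            i += 1
        x += 1
    return out
```

With a single shaped glyph covered by preedit and an empty row tail, the cursor reaches the end of the slice and the remaining columns are simply skipped instead of indexing past it.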

* Prove stale pane control sockets break SSH startup The pane startup wrapper currently launches ssh even when the cmux-owned ControlPath is a stale Unix socket. This regression uses the generated startup command with a fake ssh binary so the first real SSH child fails if the stale path remains present. Constraint: Tests must run in CI only; local test execution is intentionally skipped. Confidence: high Scope-risk: narrow Directive: Keep this as the test-only commit before the control-socket cleanup implementation. Tested: Not run locally per instruction. Not-tested: CI has not run this new failing regression yet. * Restore pane SSH multiplexing after stale control sockets Pane startup commands are reused for new splits, so the startup wrapper now preflights the effective OpenSSH ControlPath before launching the child ssh. It resolves token-expanded paths with ssh -G, checks the master with ssh -O check, and only removes cmux-owned control paths when that check fails. Constraint: Do not run local tests or xcodebuild; CI owns test execution for this repo. Rejected: App-start sweep | misses stale sockets created after workspace configuration and before later split launches. Rejected: Make daemon transport the master | broader ownership change than needed for this first regression cut. Confidence: high Scope-risk: moderate Directive: Keep preflight in the generated pane startup path so saved split commands get the same stale-socket handling. Tested: git diff --check. Not-tested: Local unit/UI tests and local app build were intentionally not run before CI. * Fix stale control socket test path * Address SSH preflight review nits
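The preflight sequence described above (resolve with `ssh -G`, probe with `ssh -O check`, remove only cmux-owned paths) can be sketched in Python. The real logic lives in a generated shell startup command, so this is an illustration only: the function names, the `owned_dir` notion of ownership, and the injectable `run` parameter are assumptions for the example.

```python
import os
import subprocess
from typing import Callable, Optional


def parse_control_path(ssh_g_output: str) -> Optional[str]:
    """Extract the effective ControlPath from `ssh -G <host>` output."""
    for line in ssh_g_output.splitlines():
        key, _, value = line.strip().partition(" ")
        if key.lower() == "controlpath" and value and value != "none":
            return value
    return None


def is_owned(path: str, owned_dir: str) -> bool:
    """True when `path` sits under the directory treated as cmux-owned."""
    path, owned_dir = os.path.abspath(path), os.path.abspath(owned_dir)
    return os.path.commonpath([path, owned_dir]) == owned_dir


def remove_stale_control_socket(
    host: str, owned_dir: str, run: Callable = subprocess.run
) -> bool:
    """Remove an owned control socket whose master no longer answers."""
    resolved = run(["ssh", "-G", host], capture_output=True, text=True)
    path = parse_control_path(resolved.stdout)
    if path is None or not is_owned(path, owned_dir) or not os.path.lexists(path):
        return False
    # A zero exit status means a live master is listening on the socket.
    check = run(["ssh", "-O", "check", "-S", path, host],
                capture_output=True, text=True)
    if check.returncode == 0:
        return False
    os.remove(path)
    return True
```

Passing `run` as a parameter is what lets a test substitute a fake `ssh`, which is the same seam the regression test uses with a fake ssh binary.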

* Enable Feed by default * Add terminal focus redraw regression test * Fix feed reply focus handoff * Fix feed reply editor focus steal * Fix feed reply terminal focus handoff * Make feed editor focus one-shot * Fix feed editor focus intent handoff * fix: harden feed focus and activity ordering * fix: preserve feed reply drafts across list recycling * fix: reset feed reply focus requests * fix: separate feed reply focus state --------- Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

pull Bot locked and limited conversation to collaborators Mar 23, 2026

pull Bot added the ⤵️ pull and merge-conflict (Resolve conflicts manually) labels Mar 23, 2026

austinywang force-pushed the main branch from 60ba2e6 to b5fb304 on March 23, 2026 18:31

austinywang force-pushed the main branch from b06a66d to 049c653 on April 10, 2026 22:09

lawrencecchen and others added 25 commits May 1, 2026 15:07

Add Codex Feed hook regression test

05584c6

Harden custom commands rich text fallback

0d6678d

Route Codex permission approvals through Feed

1a985b3

Address tab bar debug review feedback

40e6b25

Merge remote-tracking branch 'origin/main' into task-tabbar-button-la…

b89a9b4

…yout

Make cmux.json the canonical settings file (#3409)

4d0b1db

* feat: make cmux.json the settings file * fix: address cmux.json review feedback * fix: preserve legacy settings schema URL --------- Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

Merge pull request #3406 from manaflow-ai/task-tabbar-button-layout

92bd8f6

Tune Bonsplit tab bar action lane

Merge pull request #3404 from manaflow-ai/feat-docs-custom-commands

16cd449

Fix custom commands docs formatting

Merge remote-tracking branch 'origin/main' into task-migrate-to-circleci

acdd2b6

Harden CircleCI toolchain setup

ce38778

Merge remote-tracking branch 'origin/main' into task-suppress-claude-…

24661c4

…osc-notifications

Fix CircleCI review findings

c8bbd6e

Clarify Codex Feed plan limitation

455b178

fix: address Claude hook review feedback

dcb167f

Dry up Codex Feed permission path

d5f8211

Merge pull request #3312 from manaflow-ai/task-migrate-to-circleci

5fd6451

Add CircleCI CI pipeline

Merge pull request #3420 from manaflow-ai/task-codex-feed-plans-permi…

6bb3873

…ssions Route Codex permission approvals through Feed

Share workspace pin action path (#3425)

86f912b

* Share workspace pin action path * Keep sidebar pin snapshot current * Split workspace pin helpers * Address workspace pin review feedback --------- Co-authored-by: Lawrence Chen <lawrencecchen@users.noreply.github.com>

Fix browser omnibar typing lag with many workspaces

49c1c4d

Moves open-tab suggestion matching onto a cached per-manager index and removes sleep-based omnibar focus retries.

austinywang and others added 30 commits May 11, 2026 00:38

fix(claude): preserve dev channel resume flag (#3752)

b7ce782

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Revert "Fix McBopomofo Bopomofo candidate opening (#3836)" (#3845)

ae5715e

This reverts commit d8a6f2a.

Revert "Fix Chinese IME Enter swallowed in terminal (#3762) (#3767)" (#…

a183d44

…3846) This reverts commit fe7cf04.

Revert "Fix arrow key navigation in IME candidate window during compo…

d9e27e0

…sition (#3694)" (#3849) This reverts commit 00dd7c3.

Bump version to 0.64.4 (#3853)

e78478b

Fix garbled Chinese paste text (#3929)

3e2543b

* test: cover lossy Chinese pasteboard text * fix: recover faithful CJK paste text * fix: preserve paste fast path for questions

pull[bot] wants to merge 1094 commits into ericmjl:main from manaflow-ai:main
