Skip to content

Ai stability and ux improvements#2849

Merged
abose merged 19 commits intomainfrom
ai
Apr 26, 2026
Merged

Ai stability and ux improvements#2849
abose merged 19 commits intomainfrom
ai

Conversation

@abose
Copy link
Copy Markdown
Member

@abose abose commented Apr 26, 2026

No description provided.

abose added 8 commits April 26, 2026 18:12
- New FILE_NOT_FOUND title/message strings for graceful open errors.
- Diff hunks rendered with line-number column, dim context rows, and
  a dashed separator between multi-occurrence hunks (replace_all).
- Agent batches consecutive 'Error in hook callback' stderr lines into a
  single console.error block and emits an aiHookError peer event so the
  full SDK trace is easy to inspect.
- New AI_CHAT_TOOL_APPLIED_DIRECTLY string for indicators where the SDK
  ran the tool natively (hook bypassed) — Phoenix can't capture a diff
  for these.
- Inline edit actions move to the tool-header row (margin-left: auto)
  so 'Show diff' shares a line with the 'Edit <file>' label.
- New AI_CHAT_DIFF_* strings for the per-edit options menu
  (Expand all / Collapse all / Always show) and tooltip.
- Style for the small ellipsis trigger; an invisible ::after pseudo
  pads the hit area ~10x14px around the visible icon (biased right
  toward empty header space) so it's easier to click without changing
  layout or stealing clicks from the adjacent Show diff button.
- Agent maps each SDK tool_use id to its counter at content_block_start
  and emits a new aiToolResult event when the matching tool_result
  arrives, carrying isError + a short preview.
- New strings for 'Edit rejected — file not modified' (with reason
  line) so unmatched Edit/Write indicators can be classified.
- .ai-tool-direct-note now matches the .ai-tool-detail-line indent
  (28px) and typography so 'Applied directly' / 'Edit rejected' sit
  under the icon column instead of the card edge. Rejected variant
  switches the note to red (#e88) without italics; the secondary
  reason line is dimmer.
Adds AI_CHAT_TOOL_READ_FILE_RANGE / FROM strings so Read indicators
say 'Read foo.html lines 120-200' or 'Read foo.html from line 120'
instead of dropping the offset/limit. Lets the user tell apart
consecutive reads of the same file at a glance.
Restructure Phoenix's AI hooks so the Claude Code SDK runs Read, Edit,
and Write natively against disk and our PreToolUse + PostToolUse hooks
just sync the editor buffer around it. Avoids the SDK's read/mtime
tracker tripping on Phoenix-applied edits, which was causing redundant
Read+retry cycles after every Edit in the AI panel.

PreToolUse:
- Read: flush dirty buffer to disk so the SDK reads the live content,
  return {} so SDK runs native Read and updates its read tracker.
- Edit: flush dirty buffer, capture pre-edit content for snapshot
  tracking, pre-check that old_string still exists in the file (deny
  with informative reason if the user changed the text since the last
  Read), otherwise return {}.
- Write: flush dirty buffer, capture pre-write content, return {}.

PostToolUse (new for Edit and Write):
- Skip diff-card painting when input.tool_response indicates the SDK's
  native tool itself errored (new _isToolResponseError helper).
- Edit: try applying old_string -> new_string directly to the open
  buffer via doc.replaceRange (preserves CodeMirror marks outside the
  edit region), fall back to refreshDocumentFromDisk only if the
  buffer no longer contains old_string.
- Write: refreshDocumentFromDisk reads new content and pushes it
  through doc.refreshText.
- Trigger aiToolEdit so existing diff-card and snapshot UI keep working.

Editor._resetText (used by all refreshText callers, not just AI):
- Initial document load still uses setValue + clearHistory.
- External-content reload now does an O(n) common prefix/suffix scan
  and replaceRange's only the differing middle, instead of setValue +
  clearHistory. Disk-driven reloads (file watchers, git checkout, AI
  edits) become undoable AND CodeMirror marks at unchanged head/tail
  survive. markClean still sets the new saved generation so the dirty
  flag tracks divergence from disk correctly.

Also tracks SDK tool_use_id -> our toolCounter so PostToolUse can fire
aiToolEdit for the right indicator regardless of phantom partial
re-emits.
With PreToolUse hooks now returning {} (allow), the old practice of
appending CLARIFICATION_HINT to permissionDecisionReason no longer
reaches Claude when a user types a follow-up mid-stream — Edit,
Write, and Read all fall straight through to native execution.

Add a catch-all PostToolUse hook (no matcher) that returns the hint
as additionalContext whenever _queuedClarification is set. Fires
after every tool — Bash, Grep, Glob, WebFetch, Task, the Phoenix MCP
tools, etc. — so any in-flight checkpoint prompts Claude to call
getUserClarification before continuing. Becomes a no-op once the
queue is drained.

Edit/Write specific PostToolUse hooks now return {} for all paths;
clarification is centralised in the catch-all so we can't double-fire.
abose added 11 commits April 26, 2026 18:14
When Claude attempts an Edit or Write on a user file while the panel
is in Plan Mode, surface an inline confirmation card asking the user
to either Allow & Switch to Edit Mode, or Stay in Plan Mode.
Mirrors the Bash confirmation flow (matcher 'Bash' hook) — agent
blocks on a Promise until the browser sends the user's decision back
via the new answerPlanModeWriteConfirm peer.

Allow:
  - Browser flips _permissionMode to acceptEdits and updates the
    permission bar so the panel reflects the new state.
  - Agent caches the approval in _planExitApprovedThisTurn so a
    multi-edit turn doesn't pop a dialog before every edit.
  - Hook returns permissionDecision: 'allow' to override the SDK's
    default plan-mode block for the in-flight call.
Stay:
  - Hook returns deny with reason instructing Claude to use
    ExitPlanMode to propose changes for approval first.
  - Permission bar stays on Plan Mode.

Plan-file writes (.claude/plans/) keep their existing silent-save
path — no prompt, the Proposed Plan card is the user-meaningful
surface for those.

The card uses a distinct blue accent (rgb(79,163,227)) versus the
Bash confirm card's orange so users can tell at a glance which type
of confirmation they're seeing. Allow/Deny button colours stay
green/red for action consistency.
The :hover:not(:disabled) rule had higher CSS specificity than
.selected, so hovering a selected AskUserQuestion option washed away
its green tint. Add :not(.selected) to the hover rule so the
selected state wins.
Editor._resetText now uses cm.replaceRange instead of
cm.setValue + cm.clearHistory so external content reloads (Revert,
FileSyncManager, AI hook flow) leave the undo stack intact and the
user can ctrl-z back to their pre-reset content.

Update the two affected assertions in Document-integ-test.js:
- 'should clear dirty flag AND undo when text reset' renamed to
  'should clear dirty flag but preserve undo history when text reset';
  expects undo size 2 (original Foo edit + refreshText replaceRange)
  instead of 0.
- The clean-text-reset case now expects undo size 1 (the single
  replaceRange the new code performs) instead of 0.

The third pre-existing same-text and same-text-different-line-endings
tests already expect history NOT to be cleared (because _resetText
short-circuits when content matches), so they still pass unchanged.
Adds .ai-msg-tool.ai-todo-stale modifier that lowers a TodoWrite
block's opacity to 0.65 and overrides .ai-todo-icon colour to a
neutral muted tone with reduced alpha. Paired with the browser-side
sweep that swaps the stale spinner glyph for fa-arrow-right and
strips the in_progress class, so older TodoWrite cards read as
historical snapshots rather than implying parallel activity.
The Bash PreToolUse hook closed over the permissionMode argument set
once at _runQuery start. Cycling the panel's permission bar from Edit
Mode to Full Auto mid-stream had no effect on bash calls already
queued in the same turn — they kept hitting the confirm prompt even
though the user had already opted out of confirmation.

Add a module-level _runtimePermissionMode that hooks read at decision
time and a setPermissionMode peer the browser calls on every cycle.
_runQuery still seeds it from the params at start; setPermissionMode
overrides it for the rest of the turn. The Bash hook now reads from
this mutable instead of the closure parameter.

Other hooks (Edit/Write plan-mode confirm, plan-file branches) keep
using the closure parameter intentionally — flipping into/out of plan
mode mid-stream can't undo what the SDK already permitted at query
start, so the runtime override would only confuse those paths.
User reports of dense, inconsistent panel UX prompted a full audit of
Extn-AIChatPanel.less. The panel was using five distinct font sizes
(11/12/13/14/15 px) plus relative units, with several hard-coded pixel
values bypassing the variable system. Tool indicator labels were 12 px
in muted grey (barely above readability threshold); their detail
lines (file paths) were 11 px; in-card buttons ranged from 11 px
(bash, restore, diff toggle) to 13 px (plan approve/revise) for no
semantic reason.

Changes:

- Introduce three named LESS aliases at the top of the file:
    @ai-text-body      = @sidebar-content-font-size  (13px)
    @ai-text-secondary = @sidebar-small-font-size    (12px)
    @ai-text-meta      = @sidebar-xs-font-size       (11px)
  plus @ai-line-prose (1.55) and @ai-line-compact (1.4).

- Per-surface tier remap (where the surface read like content rather
  than UI chrome):
    .ai-tool-label              12 -> 13  (body)
    .ai-tool-detail-line        11 -> 12  (secondary)
    .ai-tool-preview            11 -> 12  (secondary)
    .ai-edit-summary-header     12 -> 13  (body)
    .ai-todo-content            12 -> 13  (body)
    .ai-todo-icon               11 -> 12  (secondary)
    .ai-bash-confirm-body pre   13 -> 12  (secondary, code block)
    .ai-plan-write-confirm-body 12 -> 13  (body)

- Standardise every in-card button at @ai-text-secondary (12 px) +
  padding 4px 12px + line-height @ai-line-compact: bash allow/deny,
  plan approve/revise, plan-write allow/stay, restore, restore-point,
  diff toggle, diff more.

- Vertical rhythm: .ai-msg-tool gets a 4 px breather between
  consecutive tool indicators; .ai-tool-header padding 4/8 -> 6/10;
  .ai-tool-detail padding 0 8 4 28 -> 2 10 6 28; .ai-todo-item padding
  2/0 -> 4/0; bash/plan-write confirm body padding 8/12 -> 10/12.

- Line height standardisation: prose surfaces use @ai-line-prose
  (1.55), compact UI uses @ai-line-compact (1.4).

- Replace hard-coded literals with the aliases:
  .ai-user-file-chip 11 -> meta, .ai-chat-quota-bar 12 -> secondary,
  .ai-image-remove / .ai-screenshot-toolbar 12 -> secondary, ai-info
  / ai-quota-dismiss / file-remove pixels -> aliases, settings dialog
  labels and inputs -> aliases.

- Keep .ai-permission-bar kbd at literal 10 px with a comment
  explaining it's a visual chip, not body text.

Net: panel reads as a single coherent product instead of a collage
of independently styled cards. Panel-wide diff is ~80 lines in a
~1700-line file.
Lifts body text to 14px/1.6, widens message and card padding, raises
list-item gaps with explicit line-height, and applies the same prose
rhythm to the proposed-plan body so plan cards no longer feel denser
than the assistant prose around them. In-card buttons stay at the
12px/4x12 standard.
@abose abose merged commit 9532ce2 into main Apr 26, 2026
11 of 17 checks passed
@abose abose deleted the ai branch April 26, 2026 15:20
@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant