Skip to content

🤖 feat: sticky per-workspace auto-expand for thinking & tool blocks#3452

Merged
ammario merged 5 commits into
mainfrom
expand-state-2cdh
Jun 4, 2026
Merged

🤖 feat: sticky per-workspace auto-expand for thinking & tool blocks#3452
ammario merged 5 commits into
mainfrom
expand-state-2cdh

Conversation

@ammar-agent
Copy link
Copy Markdown
Collaborator

@ammar-agent ammar-agent commented Jun 4, 2026

Summary

Makes transcript expand/collapse for thinking (reasoning) and tool blocks legible and customizable via a single per-workspace, per-type auto-expand preference. Expanding or collapsing any block of a type sets that workspace's preference for the type; newly arriving blocks inherit it. Crucially, a preference change never mutates the expand state of an already-mounted block, so there are no layout flashes.

Background

Expand stickiness felt random: thinking blocks expanded while streaming then auto-collapsed on finish, bash auto-expanded the latest streaming command and auto-collapsed it when superseded, and each tool had its own bespoke default. Several of these behaviors mutated present blocks (a visible height tear) and fought any notion of a remembered user choice.

Implementation

New hook useStickyExpand(kind, fallbackExpanded, opts?) (src/browser/features/Messages/useStickyExpand.ts):

  • Seeds initial state once at mount from a persisted record auto-expand:{workspaceId}{ thinking?, tools? }, read via readPersistedState (no subscription). A later preference change therefore cannot retroactively expand/collapse a mounted block — only blocks mounting afterwards inherit it.
  • Writes the preference only on a user toggle, so future blocks of that kind inherit the choice.
  • Resolves workspaceId from MessageListContext (already wraps the whole transcript), so no prop-drilling; outside that context it degrades to local-only state.

Wiring:

  • useToolExpansion is now a thin wrapper over useStickyExpand("tools", …), so all ~46 existing tool call sites become sticky with no per-site edits (each tool's prior default becomes the fallback).
  • ReasoningMessage: quiet default (collapsed, even while streaming), removed the auto-collapse-on-finish useLayoutEffect, allowed expanding mid-stream, and made a collapsed streaming block hide its body while an expanded one stays height-uncontrolled (no clipping).
  • FileEditToolCall / TaskToolCall: replaced bespoke ?? default local state with the hook.
  • BashToolCall: folded into the tools preference; deleted useBashAutoExpand, the useLatestStreamingBashId store hook, and the aggregator's getLatestStreamingBashToolCallId + cache field. Live output still renders when expanded.
  • AskUserQuestionToolCall: forceExpanded while executing so a live, blocking prompt is never hidden; follows the preference once answered.
  • storage.ts: added getAutoExpandPrefsKey + AutoExpandPrefs, registered in PERSISTENT_WORKSPACE_KEY_FUNCTIONS (survives fork/rename).
  • Stable hook callbacks: useStickyExpand returns setExpanded/toggleExpanded with stable identities (useCallback), matching the old useToolExpansion's stable useState setter. Consumers pass setExpanded into effect dependency arrays (e.g. WorkflowRunToolCall's useLayoutEffect/useEffect), so an unstable identity would re-fire those effects every commit — see Pains.
  • CI: added timeout-minutes: 15 to the Test / Unit job (Test / Integration was already capped) so a hung test fails fast with logs instead of stalling the runner for GitHub's ~6h default.

Resolved UX decisions

  • Thinking defaults to quiet (collapsed) until the user expands one.
  • Bash is folded into the unified tools preference (no more special live auto-expand/collapse).

Validation

  • New useStickyExpand.test.tsx locks the present-block invariant: an external preference change does not mutate a mounted block, but a freshly mounted block inherits it; plus per-workspace isolation, write-on-toggle, forceExpanded, and null-context fallback.
  • Updated ReasoningMessage.test.tsx (quiet default, no auto-collapse on completion, preference inheritance) and FileEditToolCall.ui.test.tsx (no mutation when a failed result arrives later).
  • Updated tests/e2e/scenarios/toolFlows.spec.ts for the quiet default: a completed reasoning block now renders collapsed (summary + ellipsis), and the test verifies expanding reveals the full body. Verified locally under Electron/xvfb.

Risks

  • Behavioral change to a high-traffic surface (every reasoning + tool row). Logic is centralized in one hook, lowering per-component risk.
  • File-edit failures no longer auto-collapse — they stay as-mounted with the error visible (the old auto-collapse was itself a present-block mutation).
  • Reasoning is now collapsed by default after a turn (previously a reasoning block displaced by answer text stayed expanded). This is the intended quiet default; the summary line stays visible and one expand sticks for the workspace.
  • Toggling a live AskUserQuestion writes the tools preference; it only fires on a deliberate click, so impact is minimal.

Pains

The first revision hung Test / Unit for 40+ min, and the hang only reproduced on the PR merge commit: main had since added WorkflowRunToolCall.test.tsx, whose component lists setExpanded in a useLayoutEffect dependency array. useStickyExpand returned a fresh setExpanded each render, and because bun test does not run the React Compiler (which memoizes it in the app), the synchronous layout effect re-fired every commit and spun act() forever — past bun's 5s per-test timeout, so the whole job stalled (which also had no timeout-minutes). Reproduced locally by checking out refs/pull/3452/merge and running the file in isolation, then fixed by stabilizing the callbacks. (The suite also has pre-existing, order-dependent mock.module cross-file leaks unrelated to this PR — main itself fails them in a sorted run — left untouched.)


Generated with mux • Model: anthropic:claude-opus-4-8 • Thinking: max • Cost: $26.40

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 626c05230e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/Tools/TaskToolCall.tsx Outdated
Comment thread src/browser/features/Tools/TaskToolCall.tsx Outdated
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

Addressed both Codex P2 findings in 1d02bb0a7:

  • Restore task error auto-expansion (TaskToolCall.tsx) and Restore failed task_await expansion: these signals (!!errorResult, failedCount > 0) are now passed to useStickyExpand via forceExpanded instead of the seed-once fallback. forceExpanded is now a live, latched "must open" signal: it opens the row on the rising edge (so an error/failure arriving after mount still expands it), but a signal going false again never collapses a present block, and a user toggle still wins (the row stays collapsible). Base state continues to follow the sticky tools preference.

Added hook tests locking the behavior: "opens on a late forceExpanded trigger and never auto-collapses it (latched)" and "a user collapse still wins over an active forceExpanded signal". Full static-check + the toolFlows e2e pass locally.

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review
Please take another look — addressed the two P2 findings on task/task_await auto-expansion.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. What shall we delve into next?

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Fixed the Test / Unit hang: useStickyExpand now returns stable setExpanded/toggleExpanded (via useCallback), restoring the old useToolExpansion stable-setter contract that WorkflowRunToolCall relies on (it lists setExpanded in a useLayoutEffect dep). The hang only reproduced on the PR merge commit (newer main added that test) because bun test doesn't run the React Compiler. Also added timeout-minutes: 15 to the unit job so a future hang fails fast.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a16d31346

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/Tools/Shared/toolUtils.tsx
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Addressed the P2 "Preserve forced expansion for live advisor output" (commit 08db34e25):

  • AdvisorToolCall now uses useToolExpansion(false, { forceExpanded: isExecutingWithoutResult }), so a live advisor is force-opened while executing regardless of a stored tools: false preference, keeping its streamed advice/reasoning visible. The existing key={… "executing"/"settled"} remount means a settled row mounts fresh and then follows the sticky preference / collapsed default. forceExpanded is latched (monotonic), so the user can still collapse it.
  • Applied the same carve-out to TaskApplyGitPatchToolCall (force-open on an error result) so a failed patch's error isn't hidden by the preference — matching the existing AskUserQuestion / TaskToolCall must-open handling.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. You're on a roll.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Re-requesting review on the latest HEAD (08db34e25), which adds the forceExpanded carve-outs for AdvisorToolCall (live executing) and TaskApplyGitPatchToolCall (error result) addressing the prior P2. All CI is green and the Test / Unit hang is resolved.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Replace bespoke per-block expand defaults and present-block-mutating auto
behaviors with a single per-workspace, per-type auto-expand preference seeded
once at mount (no layout flashes). Thinking defaults quiet; bash folds into the
unified tools preference. Deletes useBashAutoExpand and useLatestStreamingBashId.
The reasoning block now starts collapsed (quiet default) instead of staying
expanded after being displaced by answer text, so the e2e asserts the summary
+ ellipsis collapsed state and verifies expanding reveals the full body.
useStickyExpand now treats forceExpanded as a live, latched 'must open' signal
layered above the seed-once preference/fallback, so a task error or failed
task_await sub-task that arrives after mount still opens the row (instead of
staying hidden behind the header). User toggle still wins; never auto-collapses.
…nd unit job

useStickyExpand returned fresh setExpanded/toggleExpanded closures every
render. Consumers pass setExpanded into effect deps — e.g. main's newer
WorkflowRunToolCall lists it in a useLayoutEffect and a useEffect. The old
useToolExpansion returned React's stable useState setter, so those effects
only re-ran on real changes. bun test does NOT run the React Compiler, so the
unstable identity re-fired the synchronous layout effect every commit and spun
act() forever. This only manifested on the PR *merge* commit (this branch +
newer main, which added WorkflowRunToolCall.test.tsx), hanging Test/Unit for
40+ min since the job had no timeout.

- Stabilize setExpanded/toggleExpanded with useCallback (restore stable-setter
  contract; correctness, not perf memoization — tests lack the compiler).
- Add timeout-minutes: 15 to the test-unit job (Test/Integration already caps);
  bounds any future hang with logs instead of stalling the runner ~6h.
Codex review: with a stored `tools: false` preference, the sticky tools
default overrode the initialExpanded fallback that AdvisorToolCall used
(useToolExpansion(isExecutingWithoutResult)) to keep streamed advice visible
while executing — so a live advisor could mount collapsed and hide its output.
Route these must-open states through forceExpanded (latched; the user can
still collapse), matching the AskUserQuestion / TaskToolCall carve-outs.

- AdvisorToolCall: forceExpanded while executing (settled rows still remount
  collapsed via the existing key swap).
- TaskApplyGitPatchToolCall: forceExpanded when there is an error result.
@ammario ammario merged commit c354c5e into main Jun 4, 2026
24 checks passed
@ammario ammario deleted the expand-state-2cdh branch June 4, 2026 20:06
@mux-bot mux-bot Bot mentioned this pull request Jun 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants