🤖 perf: automate workspace-open perf + React render profiling #2397

Merged
ammario merged 3 commits into main from perf-testing-d6tn
Feb 13, 2026

Conversation

@ammar-agent (Collaborator)

Summary

Automate workspace-open performance profiling in e2e so runs produce repeatable Chrome/Electron and React artifacts without manual DevTools interaction.

Background

We needed repeatable perf coverage for large-history workspace load flows, plus machine-readable artifacts that can run in CI/nightly and be compared over time.

Implementation

  • Added reusable CDP profiling helper (tests/e2e/utils/perfProfile.ts) that captures:
    • Chrome trace
    • CPU profile
    • metrics + heap usage
    • perf-summary.json
  • Added deterministic history fixture profiles (tests/e2e/utils/historyFixture.ts):
    • small, medium, large, tool-heavy, reasoning-heavy
  • Added perf scenario (tests/e2e/scenarios/perf.workspaceOpen.spec.ts) that:
    • seeds a selected profile
    • opens workspace
    • writes artifacts under artifacts/perf/**
    • verifies React profiler data exists for interesting render paths
  • Added React profiling collector (src/browser/utils/perf/reactProfileCollector.ts) and preload/browser flag plumbing (MUX_PROFILE_REACT) so e2e can collect React render data.
  • Instrumented ChatPane with perf markers for key render paths:
    • chat-pane
    • chat-pane.header
    • chat-pane.transcript
    • chat-pane.input
  • Added Make target:
    • make test-e2e-perf
  • Added manual/scheduled workflow:
    • .github/workflows/perf-profiles.yml
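The metrics-to-summary step of a helper like `perfProfile.ts` can be sketched as a pure function. The summary field names and overall shape below are assumptions, but `TaskDuration`, `ScriptDuration`, and `JSHeapUsedSize` are real counters returned by CDP `Performance.getMetrics` (in seconds and bytes respectively):

```typescript
// Hypothetical shape of the raw data captured over CDP.
interface CdpMetric {
  name: string;
  value: number;
}

// Hypothetical perf-summary.json shape (an assumption, not the real schema).
interface PerfSummary {
  taskDurationMs: number;
  scriptDurationMs: number;
  jsHeapUsedMB: number;
}

// Convert CDP Performance.getMetrics output (seconds / bytes) into
// rounded, human-friendly units for the summary artifact.
function summarizeMetrics(metrics: CdpMetric[]): PerfSummary {
  const get = (name: string) =>
    metrics.find((m) => m.name === name)?.value ?? 0;
  return {
    taskDurationMs: Math.round(get("TaskDuration") * 1000),
    scriptDurationMs: Math.round(get("ScriptDuration") * 1000),
    jsHeapUsedMB: Math.round(get("JSHeapUsedSize") / (1024 * 1024)),
  };
}
```

Keeping the summarization pure makes it trivial to diff `perf-summary.json` artifacts across runs.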

Validation

  • make fmt
  • make typecheck
  • make lint
  • make static-check
  • MUX_E2E_PERF_PROFILES=small xvfb-run -a make test-e2e-perf
    • verified the run passes and confirmed the React profile includes all render-path IDs listed above.

Risks

Low-to-moderate. Changes are mostly test/perf infrastructure. Runtime impact is gated behind window.api.enableReactPerfProfile / MUX_PROFILE_REACT, so normal user flows should remain unaffected.
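A minimal sketch of what a flag-gated collector along the lines of `reactProfileCollector.ts` could look like (the class name mirrors the file, but the method names and sample shape are assumptions; the key property is that `onRender` is a no-op when the `MUX_PROFILE_REACT`-style flag is off, so normal user flows pay essentially nothing):

```typescript
// Phases reported by React's <Profiler> onRender callback.
type RenderPhase = "mount" | "update" | "nested-update";

interface RenderSample {
  id: string; // profiler id, e.g. "chat-pane.transcript"
  phase: RenderPhase;
  actualDurationMs: number;
}

class ReactProfileCollector {
  private samples: RenderSample[] = [];

  constructor(private enabled: boolean) {}

  // Intended to be wired to <Profiler onRender={...}>; no-op unless the
  // profiling flag enabled this collector at startup.
  onRender(id: string, phase: RenderPhase, actualDurationMs: number): void {
    if (!this.enabled) return;
    this.samples.push({ id, phase, actualDurationMs });
  }

  // Distinct render-path ids seen so far, matching the e2e assertion
  // that every instrumented path actually rendered.
  renderPathIds(): string[] {
    return [...new Set(this.samples.map((s) => s.id))];
  }
}
```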


Generated with mux • Model: openai:gpt-5.3-codex • Thinking: xhigh • Cost: $0.00

@ammar-agent (Collaborator, Author)

Perf follow-up ideas from the latest automated workspace-open profiles (prioritized for "instant open" UX):

  1. Coalesce caught-up updates into one render pass

    • In WorkspaceStore caught-up handling, we currently do multiple post-replay bumps/derived updates in sequence.
    • Batch UI invalidation and defer non-critical work (usageStore.bump, consumer recalc) to idle.
  2. Stabilize per-row props in ChatPane render loop

    • taskReportLinking/group/nav-derived props can cause broad MessageRenderer churn.
    • Only pass these props to rows that actually need them; keep row props referentially stable.
  3. Precompute bash output group metadata once per message array

    • Replace per-row backward/forward scans (computeBashOutputGroupInfo) with one linear preprocessing pass.
  4. Defer non-critical header work

    • WorkspaceHeader side effects (skills diagnostics + misc listeners/UI state) contend with first paint.
    • Move optional work behind idle/post-first-paint gates.
  5. Replay history as batches, not per-message events

    • agentSession.emitHistoricalEvents currently emits replay events one-by-one.
    • Add a batched replay payload to cut IPC/event pressure during workspace open.
  6. Paginate history for open path (latest-N first)

    • Load recent messages immediately; hydrate older epochs on demand while scrolling.
    • This is likely the biggest lever for consistent sub-100ms perceived open time at scale.
  7. Virtualize transcript + lazy markdown hydration

    • Render only visible rows and defer heavy markdown/tool rendering offscreen.
    • Important once histories/tool outputs grow.

I can implement items 1–3 first in a focused follow-up to get measurable improvement with low risk.
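Item 3 can be sketched as a single linear pass; the message shape and field names below are hypothetical stand-ins for the real types:

```typescript
// Hypothetical message shape: only the fields the grouping pass needs.
interface Msg {
  id: string;
  kind: "bash" | "bash_output" | "text";
}

interface GroupInfo {
  groupStart: boolean;
  groupEnd: boolean;
}

// One O(n) pass replacing per-row backward/forward scans: each
// bash_output row learns whether it starts/ends a contiguous run by
// looking only at its immediate neighbors.
function computeBashOutputGroupInfos(msgs: Msg[]): Map<string, GroupInfo> {
  const infos = new Map<string, GroupInfo>();
  for (let i = 0; i < msgs.length; i++) {
    if (msgs[i].kind !== "bash_output") continue;
    infos.set(msgs[i].id, {
      groupStart: i === 0 || msgs[i - 1].kind !== "bash_output",
      groupEnd: i === msgs.length - 1 || msgs[i + 1].kind !== "bash_output",
    });
  }
  return infos;
}
```

Rows then do a cheap `Map` lookup instead of re-scanning the array, and the map can be memoized per message-array snapshot.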

Add automated CDP + React render profiling for workspace-open e2e perf scenarios.

- add deterministic history profile fixtures (small/medium/large/tool-heavy/reasoning-heavy)
- add reusable Chrome trace/CPU/metrics/heap artifact capture helpers
- add perf workspace-open scenario with artifact output + assertions
- add React render-path profiling for chat-pane/header/transcript/input
- add Makefile target and scheduled/manual workflow for perf collection

---

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$0.00`_

<!-- mux-attribution: model=openai:gpt-5.3-codex thinking=xhigh costs=0.00 -->
ammario merged commit cc86325 into main Feb 13, 2026
23 checks passed
ammario deleted the perf-testing-d6tn branch February 13, 2026 16:16
ammario pushed a commit that referenced this pull request Feb 14, 2026
## Summary

Follow-up perf pass for workspace-open rendering that reduces
critical-path work in the transcript UI and message utilities. This
improves startup paint times for loaded workspaces, especially on long
tool-heavy histories.

## Background

PR #2397 added automated perf profiling for workspace-open flows. Those
artifacts showed recurring renderer-side hot spots during initial
transcript paint (message-row prop churn, per-row grouping scans, and
unnecessary live subscriptions after completion).

## Implementation

- **ChatPane render-path reductions**
  - Precompute `bash_output` grouping once per message snapshot (`computeBashOutputGroupInfos`) instead of per-row scans.
  - Pass `taskReportLinking` only to `task` / `task_await` rows.
  - Precompute stable `userMessageNavigation` objects by `historyId` so non-message state bumps stop invalidating row props.
- **Workspace caught-up sequencing**
  - Defer the caught-up `usageStore.bump()` to idle (`requestIdleCallback` with timeout fallback) so initial transcript paint is prioritized.
- **Reasoning row rendering**
  - Use a plain truncated summary line in collapsed headers.
  - Avoid rendering the full collapsed markdown body unless expanded/streaming.
- **Bash row subscriptions**
  - Gate live-output / latest-streaming / foreground-id subscriptions to rows that still need live state (executing or completed-without-final-output).
- **Transcript truncation policy**
  - Reduce `MAX_DISPLAYED_MESSAGES` from 128 → 64.
  - Preserve user prompts + structural markers by default; allow older assistant/tool/reasoning rows to stay behind `history-hidden` until "Load all".
- **Desktop dist/e2e stability fix**
  - Replace `@/...` alias imports in `src/desktop/main.ts` with relative imports so dist runs do not emit unresolved alias `require()` calls.
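The idle deferral mentioned under caught-up sequencing can be sketched as follows (the helper name is an assumption; the pattern is `requestIdleCallback` with a `timeout` option, falling back to `setTimeout` where the API is unavailable):

```typescript
// Run non-critical work when the renderer is idle, but no later than
// timeoutMs after scheduling. Falls back to setTimeout in environments
// without requestIdleCallback (e.g. Node, some test runners).
function deferToIdle(work: () => void, timeoutMs = 500): void {
  const ric = (
    globalThis as {
      requestIdleCallback?: (
        cb: () => void,
        opts?: { timeout: number }
      ) => number;
    }
  ).requestIdleCallback;
  if (typeof ric === "function") {
    // Browser/Electron renderer path: yield to paint, with a deadline.
    ric(() => work(), { timeout: timeoutMs });
  } else {
    // Fallback: next macrotask, still off the critical paint path.
    setTimeout(work, 0);
  }
}
```

The `timeout` guarantees the deferred bump still runs even if the renderer never goes fully idle during a long streaming session.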

## Validation

- `bun test src/browser/utils/messages/messageUtils.test.ts src/browser/utils/messages/StreamingMessageAggregator.test.ts src/browser/stores/WorkspaceStore.test.ts`
- `make static-check`
- Perf scenario (dist build path):
  - `MUX_E2E_RUN_PERF=1 MUX_PROFILE_REACT=1 MUX_E2E_LOAD_DIST=1 xvfb-run -a bun x playwright test tests/e2e/scenarios/perf.workspaceOpen.spec.ts --project=electron --workers=1`

### Perf wall-time (isolated profile runs)

| Profile | Before (ms) | After (ms) | Delta |
|---|---:|---:|---:|
| small | 211 | 202 | -4.3% |
| medium | 212 | 186 | -12.3% |
| large | 244 | 200 | -18.0% |
| tool-heavy | 252 | 185 | -26.6% |
| reasoning-heavy | 174 | 171 | -1.7% |

## Risks

- **Transcript visibility tradeoff (intentional):** older assistant rows can now be omitted behind `history-hidden` by default; users can restore them with "Load all".
- **Reasoning header formatting tradeoff (intentional):** the collapsed summary is plain text (no markdown formatting) to avoid markdown render cost in headers.
- **Bash subscription gating:** completed rows now skip live subscriptions unless needed; behavior still preserves the executing and completed-without-output cases.
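The subscription-gating condition can be expressed as a pure predicate, which makes the preserved cases easy to test in isolation; the names here are illustrative, not the actual mux internals:

```typescript
// Hypothetical row status; mirrors the two cases called out above.
type BashRowStatus = "executing" | "completed";

// Only rows whose output can still change need live-output /
// latest-streaming / foreground-id subscriptions.
function needsLiveSubscriptions(
  status: BashRowStatus,
  hasFinalOutput: boolean
): boolean {
  if (status === "executing") return true; // still streaming output
  return !hasFinalOutput; // completed-without-final-output edge case
}
```

Rows where this returns `false` can render from the static snapshot and skip store subscriptions entirely.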

---

_Generated with `mux` • Model: `openai:gpt-5.3-codex` • Thinking: `xhigh` • Cost: `$4.95`_

<!-- mux-attribution: model=openai:gpt-5.3-codex thinking=xhigh costs=4.95 -->