[FOLLOW-UP — parked] Perf instrumentation, harness, cross-commit report by SuuBro · Pull Request #579 · SuuBro/bobbit

SuuBro · 2026-05-13T22:20:06Z

Parked as a follow-up. The user-facing perf win + the dropped-keystroke fix were extracted into a much smaller, targeted PR: #584 (`feat/defer-offscreen-render`). This branch contains the supporting infrastructure that enabled the discovery and would let future engineers reproduce / extend the analysis:

Client perf-trace primitive (`src/app/perf-trace.ts`)
Server `[timing]` log extension
Sidebar/render/api/ws instrumentation hooks
Manual harness `tests/manual-integration/perf-sidebar-nav.spec.ts` with realistic-corpus-tuned fixture
Cross-commit comparison report (`scripts/perf-{bench,report,progression}.mjs` + `docs/perf/sidebar-nav-report.html`)
Phase 3 budget E2E test
Real-session JSONL corpus profile
`docs/perf/HOW-TO-REPEAT.md` workflow doc
`docs/perf/sidebar-nav-baseline.md` with all the postmortems for the experiments that didn't pay off (Opt-B / C / D / F / G / H)

Not for merge as-is. Either:

Land a curated subset (e.g. just the perf-trace primitive + server timing) when there's appetite for ongoing perf work, or
Close this PR if the team prefers to keep master lean and revive the harness from scratch when the next perf goal arrives.

🤖 Generated with Bobbit

- src/app/perf-trace.ts: tiny client-side span/mark primitive with ring buffer, cost-when-disabled invariant (no-op singleton handle), localStorage / ?perf=1 opt-in, window.__bobbitPerf surface. Pinned by tests/perf-trace.spec.ts (12 tests, including 100k startSpan heap-growth check).
- src/app/perf-flags.ts: feature-flag helper for Phase 2 experiments.
- Instrumentation hooks (Phase 1 owner, instrumentation only):
  - main.ts: 'app.boot' mark as first statement.
  - api.ts: gatewayFetch wrapper dispatches api.session.fetch / api.goal.fetch / api.goal.gates.fetch / api.goal.agents.fetch / paint.tool-content.lazy by URL pattern. Cheap when perf disabled.
  - sidebar-nav.ts: nav.click + nav.session.ready/nav.goal.ready opened on openForNavItem(); pending span stashed on state.
  - routing.ts: closes nav.click on setHashRoute completion.
  - render.ts: paint.first span wrapping doRenderApp; ends pending nav span on next rAF once the destination view's sentinel is in the DOM (pi-chat-panel for session, wf-checklist-row/.gate-detail-panel/.tab-empty for goal). Sets data-perf-ready on #app for harness wait.
  - message-reducer.ts: reducer.rehydrate wraps the snapshot case.
  - state.ts: pendingNavSpan field.
- src/server/server.ts: extended _timingEnabled block: always-on logging with BOBBIT_TIMING_LOG_MIN_MS threshold env var; response wrapper tallies bytes; per-request io counter bumped at entry to the five hot endpoints (GET /api/sessions/:id, /api/goals/:id, /api/goals/:id/gates, /api/goals/:id/team/agents, /api/sessions/:id/tool-content/:mi/:bi). Log format: '[timing] METHOD path Xms bytes=B io=N'.
- tests/manual-integration/perf-sidebar-nav.spec.ts: Playwright harness that boots gateway, seeds 10 sessions + 1 goal via REST, drives cold/warm/goal passes, dumps client perf entries + server [timing] tail to .perf-out/ JSON + HTML report. Hard-fails (process.exit(1)) when any of the five canonical gate spans has zero samples. NOT in CI.
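As an illustration of the "ring buffer + no-op singleton handle" idea described for perf-trace.ts, here is a minimal sketch. It is not the real module: the class name `PerfTrace`, the entry shape, the default capacity and the injectable clock are all assumptions made for the example.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the real perf-trace.ts API.
interface TraceEntry {
  label: string;
  startMs: number;
  durationMs: number;
}

interface SpanHandle {
  end(): void;
}

// One shared no-op handle, so the disabled path allocates nothing per call.
const NOOP_SPAN: SpanHandle = Object.freeze({ end(): void {} });

class PerfTrace {
  private readonly ring: (TraceEntry | undefined)[];
  private readonly enabled: boolean;
  private readonly clock: () => number;
  private writes = 0;

  constructor(enabled: boolean, capacity = 1024, clock: () => number = () => Date.now()) {
    this.enabled = enabled;
    this.clock = clock;
    this.ring = new Array<TraceEntry | undefined>(capacity).fill(undefined);
  }

  startSpan(label: string): SpanHandle {
    if (!this.enabled) return NOOP_SPAN;
    const startMs = this.clock();
    let ended = false;
    return {
      end: (): void => {
        if (ended) return; // ending twice records only once
        ended = true;
        this.push({ label, startMs, durationMs: this.clock() - startMs });
      },
    };
  }

  mark(label: string): void {
    if (!this.enabled) return;
    this.push({ label, startMs: this.clock(), durationMs: 0 });
  }

  // Oldest-first snapshot of whatever is still held in the buffer.
  entries(): TraceEntry[] {
    const cap = this.ring.length;
    const count = Math.min(this.writes, cap);
    const out: TraceEntry[] = [];
    for (let i = this.writes - count; i < this.writes; i++) {
      const entry = this.ring[i % cap];
      if (entry !== undefined) out.push(entry);
    }
    return out;
  }

  private push(entry: TraceEntry): void {
    // Overwrites the oldest slot once the buffer is full.
    this.ring[this.writes % this.ring.length] = entry;
    this.writes++;
  }
}
```

Returning the same frozen handle on every disabled call is what keeps heap growth flat under a tight `startSpan` loop, which is the property the heap-growth test is described as pinning.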
Status:
- npm run check + test:unit + test:e2e all green when last run.
- Perf-trace unit suite (12 tests) passes.
- Harness boots and produces api.* / reducer.rehydrate / paint.first samples but nav.click / nav.session.ready / nav.goal.ready don't fire yet because the sidebar row selectors don't match the seeded sessions (sessions land under an 'ungrouped' header that may need expansion, or the seeded REST sessions render in a sidebar shape the harness doesn't click into).

Follow-up coder needs to: get the sidebar row click path working, then produce the docs/perf/sidebar-nav-baseline.md with real numbers, then build the cross-commit comparison report (docs/perf/history/ + scripts/perf-report.mjs + docs/perf/sidebar-nav-report.html) per the scope addition. Harness exit-1-on-missing-spans invariant is intentional and protects against silently-broken instrumentation.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
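The exit-1-on-missing-spans invariant can be pictured as a small pure check that the harness runs over its collected entries. This is a sketch under assumptions: the function name, the entry shape and the span names in the usage comment are illustrative, and the real harness may structure the check differently.

```typescript
// Hypothetical sketch of the "no required span may have zero samples" check.
interface SampleEntry {
  label: string;
  durationMs: number;
}

// Returns the required span labels that have no sample at all.
function findMissingSpans(required: readonly string[], samples: readonly SampleEntry[]): string[] {
  const seen = new Set<string>();
  for (const sample of samples) seen.add(sample.label);
  return required.filter((label) => !seen.has(label));
}

// In a harness this would typically be followed by something like:
//   const missing = findMissingSpans(["nav.session.ready", "paint.first"], entries);
//   if (missing.length > 0) { console.error("missing spans:", missing); process.exit(1); }
```

Failing hard here is the point: a run that silently produces no samples for a span would otherwise look like a run where that span was simply fast.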

All five canonical gate spans produce non-zero samples on a single harness run. nav.session.ready p50 ~89ms is the dominant hotspot on click→ready; see docs/perf/sidebar-nav-baseline.md for the full ranking + reproduction recipe. Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Static design reference for the cross-commit perf report at docs/perf/mockups/sidebar-nav-report.html.
- Synthetic 8-commit data telling a mixed story (improvements, regressions, flat spans, a newly-appearing span).
- Headlines strip surfaces top movers by |Δp50|.
- Summary table grouped by nav / api / render with green / red tinted Δ pills and inline p50 sparklines.
- Per-span trend cards with inline-SVG line charts (p50 solid, p95 dashed), auto-scaled Y axis, commit SHAs on X.

Uses Bobbit CSS tokens only (--chart-1/4, --positive, --negative, surface tokens) with :root fallbacks for the preview-bridge HMR race per defaults/docs/html-rendering.md. No hardcoded colours, no prefers-color-scheme, no external libs.

Also relax .gitignore's '*-report.html' rule (which silently covered docs/perf reports) by re-including docs/perf/**/*-report.html so the committed report stays under version control.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Bring the generated docs/perf/sidebar-nav-report.html in line with the static mockup at docs/perf/mockups/sidebar-nav-report.html:
- Header line: 'Generated YYYY-MM-DD HH:MM from N commits' + range + branch.
- Headlines strip: top 6 spans by |Δp50 ms|, classified good/bad/flat, with green/red border accent and tinted delta caption.
- Summary table: grouped by category (nav / api / render) with sub-headers, rows sorted by |Δp50 ms| within each group; Δ cells rendered as tinted pills; per-span p50 sparkline column (skips gaps for missing samples).
- Per-span charts: inline-SVG line charts with auto-scaled 'nice' Y range, 4 gridlines + tabular Y labels, p50 solid fill + line, p95 dashed line, hover <title> tooltips on every dot, evenly-spaced SHA ticks on X.
- Runs table: sortable visual, latest row highlighted with '← latest' tag.
- Empty / single-run states render a clean explanatory card instead of a misleading 'no data' table.
- Classifier treats <1ms AND <5% movement as 'flat' so reducer.rehydrate doesn't flash red for sub-ms jitter.

All theming via Bobbit CSS tokens with :root fallbacks for the preview-bridge HMR race (see defaults/docs/html-rendering.md). No hardcoded colours, no prefers-color-scheme, no external libs.

Regenerated docs/perf/sidebar-nav-report.html against the single existing history entry (commit 999bdc2) is included so the in-repo report matches the new generator. Re-running the manual harness will refresh it.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
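The good/bad/flat classifier can be sketched as a pure function over two p50 values. This follows the wording above literally (a move is flat only when it is both under 1 ms and under 5%); the function name, the treatment of a zero baseline and the "lower is better" direction are assumptions for the example, not a copy of the generator's code.

```typescript
// Hypothetical sketch of the report's movement classifier.
type Movement = "good" | "bad" | "flat";

function classifyDelta(beforeMs: number, afterMs: number): Movement {
  const deltaMs = afterMs - beforeMs;
  if (deltaMs === 0) return "flat";
  const absMs = Math.abs(deltaMs);
  // With a zero baseline there is no meaningful percentage, so only identical values are flat.
  const relative = beforeMs > 0 ? absMs / beforeMs : Number.POSITIVE_INFINITY;
  if (absMs < 1 && relative < 0.05) return "flat";
  // Spans are latencies, so a decrease counts as an improvement.
  return deltaMs < 0 ? "good" : "bad";
}
```

Because both conditions must hold, a sub-millisecond move on a very small baseline (say 10 ms to 10.9 ms, which is 9%) is still classified as a regression under this reading.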

Scaffolds tests/e2e/ui/perf-sidebar-nav.spec.ts. Drives one cold session load (for api.session.fetch + reducer.rehydrate), one warm sidebar-row click (for nav.session.ready), and one warm goal-dashboard click (for nav.goal.ready + api.goal.fetch). Reads window.__bobbitPerf.entries() and asserts each of the five canonical gate spans has at least one sample below a generous regression-net budget derived from docs/perf/sidebar-nav-baseline.md (commit 999bdc2). Budgets are inflated ~10-100x p95 so transient CI slowness never trips the assert; a real regression still trips. Each budget cites its source baseline number inline. Skips cleanly (test.skip) if window.__bobbitPerf is gated off so the test doesn't silently no-op. Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
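A regression-net budget of this kind can be sketched as "baseline p95 times an inflation factor, and at least one sample must come in under it". The types, function names and the default factor of 10 are assumptions for illustration, and the numbers in any usage are placeholders rather than values from the baseline doc.

```typescript
// Hypothetical sketch of a generous per-span budget check.
interface BaselineRow {
  label: string;
  baselineP95Ms: number;
}

interface TimedSample {
  label: string;
  durationMs: number;
}

// The budget is the recorded baseline p95 multiplied by a generous safety factor.
function budgetFor(row: BaselineRow, inflation: number): number {
  return row.baselineP95Ms * inflation;
}

// Returns labels with no sample under budget (either no samples at all, or all too slow).
function spansOverBudget(
  baseline: readonly BaselineRow[],
  samples: readonly TimedSample[],
  inflation = 10,
): string[] {
  const failing: string[] = [];
  for (const row of baseline) {
    const budget = budgetFor(row, inflation);
    const ok = samples.some((s) => s.label === row.label && s.durationMs < budget);
    if (!ok) failing.push(row.label);
  }
  return failing;
}
```

Requiring only one sample under budget, rather than every sample, is what makes the check tolerant of a single slow outlier on a loaded CI machine.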

Behind the new `lazyToolContent` perf flag:
- Server: `GET /api/sessions/:id?stripToolContent=1` opts in to include a `messages` array with tool-call content blocks above a configurable threshold (default 4KB) replaced by the existing `{ _truncated, _originalLength, preview }` shape. The renderer + fetchToolContent flow already lazy-load via the existing `/tool-content/:mi/:bi` endpoint. Default response unchanged.
- Strip helper: src/server/agent/strip-tool-content.ts. Pure data-shape function, referential-equality fast path when no strip is needed.
- Client: gatewayFetch rewrites GET /api/sessions/:id to add `?stripToolContent=1` when the flag is on. Idempotent.
- Pinning test: tests/session-strip-tool-content.test.ts (12 cases covering both tool_use and toolCall shapes, custom thresholds, referential equality, parseStripThreshold).
- docs/perf/sidebar-nav-baseline.md: Phase 2B A/B section with run instructions and decision rule.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
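The "pure data-shape function with a referential-equality fast path" can be sketched as follows. Only the `{ _truncated, _originalLength, preview }` shape comes from the description above; the message and block types, the character-based threshold and the 200-character preview are assumptions for the example and will not match the real transcript shapes.

```typescript
// Hypothetical sketch; message/block shapes are simplified stand-ins.
interface TruncatedContent {
  _truncated: boolean;
  _originalLength: number;
  preview: string;
}

interface ContentBlock {
  type: string;
  content?: string | TruncatedContent;
}

interface TranscriptMessage {
  role: string;
  blocks: ContentBlock[];
}

const DEFAULT_THRESHOLD_CHARS = 4 * 1024; // stands in for the 4KB default
const PREVIEW_CHARS = 200; // assumption: preview length is not stated in the source

function stripToolContent(
  messages: TranscriptMessage[],
  thresholdChars = DEFAULT_THRESHOLD_CHARS,
): TranscriptMessage[] {
  let result: TranscriptMessage[] | null = null; // copied lazily on first change
  for (let i = 0; i < messages.length; i++) {
    const msg = messages[i];
    if (msg === undefined) continue;
    let blocks: ContentBlock[] | null = null;
    for (let j = 0; j < msg.blocks.length; j++) {
      const block = msg.blocks[j];
      if (block === undefined) continue;
      const body = block.content;
      if (typeof body !== "string" || body.length <= thresholdChars) continue;
      if (blocks === null) blocks = msg.blocks.slice();
      blocks[j] = {
        ...block,
        content: { _truncated: true, _originalLength: body.length, preview: body.slice(0, PREVIEW_CHARS) },
      };
    }
    if (blocks !== null) {
      if (result === null) result = messages.slice();
      result[i] = { ...msg, blocks };
    }
  }
  // Fast path: nothing exceeded the threshold, so hand back the very same array.
  return result ?? messages;
}
```

Returning the input array untouched when nothing is stripped lets callers use `===` to detect a no-op, and untouched messages keep their identity even when a sibling message is rewritten.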

Closes the design-doc §2.2 gap flagged by implementation gate verification: the canonical `nav.session.cold` and `nav.goal.cold` spans were specified but never wired up. `mark("app.boot")` existed; no consumer did. main.ts now captures BOOT_T0 immediately after the boot mark and installs a MutationObserver on `#app` for `data-perf-ready` transitions. The first transition to "session" or "goal" records the corresponding cold-load perf span (with sessionId/goalId pulled from location.hash) and disconnects — it is a one-shot, only meaningful on hard refresh. Cheap when disabled: returns a noop disposer without installing the observer when `perfIsEnabled()` is false. Tests in `tests/perf-trace-cold-spans.spec.ts` extract the function from a transpile of main.ts (no bundling — main.ts is wired into the UI graph that parallel coders own) and exercise it in a real browser against the real perf-trace module. Covers: session/goal sentinels, one-shot behaviour, non-sentinel ignored, pre-set attribute synchronous emission, disabled path noop, null target, and hash-based detail capture. Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Replace the empty-session warm/cold passes with a deterministic, seeded archived-session fixture so the harness measures reducer.rehydrate, api.session.fetch, and nav.session.ready against realistic transcript sizes, not the artificial messages:0 baseline.

Mechanism (no src/ changes):
- After project registration, stop the gateway, write N archived rows to <projectStateDir>/sessions.json pointing at synthetic JSONL files, restart. ProjectContext.SessionStore reads them on boot.
- WS archived-attach (getArchivedMessages) parses the JSONLs and emits real messages frames, driving reducer.rehydrate with non-trivial work.
- Warm pass drives nav via window.__bobbitOpenForNavItem (the keyboard path) so nav.click + nav.session.ready fire identically for archived and live rows. Direct row clicks on archived sessions bypass openForNavItem (see render-helpers.ts:501).

Fixture mix per session: ~50% user/assistant text, 5–10 tool_use + tool_result pairs, plus one >=50 KB tool-result blob (deterministic ASCII so JSONL sizes are stable across runs).

BOBBIT_PERF_FIXTURE_SIZE = small | medium | large selects 10 / 50 / 200 messages per session. Default medium.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
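A deterministic fixture of this kind might be generated along the following lines. This is a simplified sketch, not the harness's buildRealisticJsonl(): the line shape, the LCG-based generator, the filler text and the omission of the tool_use pairs are all assumptions; only the size mapping (10 / 50 / 200, default medium) and the single blob of at least 50 KB come from the description above.

```typescript
// Hypothetical, simplified fixture generator.
type FixtureSize = "small" | "medium" | "large";

const MESSAGES_PER_SESSION: Record<FixtureSize, number> = { small: 10, medium: 50, large: 200 };

// Unknown or missing values fall back to the documented default.
function resolveFixtureSize(raw: string | undefined): FixtureSize {
  return raw === "small" || raw === "large" ? raw : "medium";
}

// Small linear congruential generator so the same seed always yields the same bytes.
function makeRng(seed: number): () => number {
  let s = seed >>> 0;
  return () => {
    s = (Math.imul(s, 1664525) + 1013904223) >>> 0;
    return s / 0x100000000;
  };
}

function buildFixtureJsonl(seed: number, size: FixtureSize): string {
  const rng = makeRng(seed);
  const count = MESSAGES_PER_SESSION[size];
  const lines: string[] = [];
  for (let i = 0; i < count - 1; i++) {
    const role = i % 2 === 0 ? "user" : "assistant";
    const words = 5 + Math.floor(rng() * 40);
    lines.push(JSON.stringify({ role, text: "lorem ".repeat(words).trim() }));
  }
  // One large ASCII blob so transcript size is dominated by a stable, known quantity.
  lines.push(JSON.stringify({ role: "tool_result", text: "x".repeat(50 * 1024) }));
  return lines.join("\n") + "\n";
}
```

A harness could then select the size with something like `resolveFixtureSize(process.env.BOBBIT_PERF_FIXTURE_SIZE)`; avoiding `Math.random()` is what keeps JSONL sizes identical from run to run.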

Append Realistic-fixture baseline section to sidebar-nav-baseline.md. Key findings from the seeded archived fixture (medium = 50 msgs/session, large = 200 msgs/session):
- reducer.rehydrate is decisively NOT a hotspot. 0.2ms p50 at medium, 0.4ms p50 at large, max 3.6ms across all runs. The Phase 1 candidate 'LRU-cache reducer state by session id' can be deprioritised.
- paint.first is the new transcript-scaling hotspot: p95 = 27.5ms medium → 103ms large; max 73ms → 177ms. Synchronous markdown / syntax-highlight render of the whole transcript on click dominates at scale.
- nav.session.ready p50 stays under the 100ms snappy threshold at medium (34.1ms) but p95 clears it at large (208ms), and the driver is paint.first scaling, which is the real perceived-snappiness lever.
- Doc explicitly flags the live-vs-archived caveat: archived attach is lighter than live (no rpcClient, no event-buffer subscribe), so absolute numbers improve vs Phase 1's empty-live baseline. Once Phase 2B lazy-tool-content lands the harness should add a live-fixture pass.

Cross-commit JSON files:
- docs/perf/history/c25e40be730b.json (medium, canonical)
- docs/perf/history/c25e40be730b-large.json (stress)

Harness side-tweak: history filename now suffixes non-medium fixture sizes so multiple runs at the same SHA don't overwrite each other.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
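The filename rule in the side-tweak can be illustrated with a one-line helper. The function name is hypothetical, and the 12-character SHA prefix is inferred from the two example filenames above rather than stated anywhere.

```typescript
// Hypothetical sketch: medium (the canonical size) keeps the bare SHA name,
// any other fixture size is appended as a suffix.
function historyFileName(sha: string, fixtureSize: string): string {
  const shortSha = sha.slice(0, 12); // assumption: 12-char prefix, as in the example filenames
  return fixtureSize === "medium" ? `${shortSha}.json` : `${shortSha}-${fixtureSize}.json`;
}
```

Keeping the medium run unsuffixed means existing history entries stay valid while stress runs at the same commit get their own file.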

Read-only profiling of ~1.94K agent-CLI session transcripts under ~/.bobbit/agent/sessions. Produces docs/perf/real-session-profile.md covering corpus stats, message-type and role distribution, per-tool result-byte distribution, top-10 large-blob shapes, and concrete recommendations + anti-recommendations for buildRealisticJsonl() in tests/manual-integration/perf-sidebar-nav.spec.ts.

Adds scripts/perf-profile-real-sessions.mjs, a one-shot Node helper that emits the underlying JSON aggregates. Filters out e2e/manual/observe/restart-harness fixture directories.

No source under src/ or tests/ touched. No PII or raw transcript content included in the report.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

- New docs/perf/README.md: orientation, harness env vars, opt-in flags, cross-commit report, workflow for adding optimisations.
- docs/debugging.md: new 'Sidebar nav feels slow' walkthrough next to Render performance, linking to perf docs.
- AGENTS.md: one architecture-map bullet + footer link to docs/perf/README.md.
- docs/perf/sidebar-nav-baseline.md: one-line cross-link to README at top.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Adds 4 rapid-nav sub-passes (cached/uncached × 150ms/50ms cadence) that fire the canonical Ctrl+ArrowDown shortcut without awaiting the previous nav's sentinel. Derives rapidnav.keystroke.{cached,uncached}, rapidnav.gap, and rapidnav.stall.ms spans from the existing nav.session.ready / nav.goal.ready entries.

Fixture seed count bumped from 10 to 32 with disjoint zones so each cadence pass gets 10 run-wide-fresh rows on lap 1 and 10 cached rows on lap 2 with no boundary contamination.

§5.6 verdict: walking the sidebar with Ctrl+Down does NOT feel smooth: median keystroke→ready is 100-170ms across all 8 cells, with no path under the 16.7ms one-frame budget. Render-side cost (paint.first) dominates even the cached path; cache misses add only ~10-40ms p50. Opt-A (defer off-screen paint) becomes the headline target.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

…l+Down keystrokes

`getActiveNavId` previously discarded `state.keyboardNavActiveId` whenever the URL hash hadn't caught up to the override's expected hash. Session navigation goes through an async dynamic import + connectToSession, so rapid Ctrl+Down keystrokes landing on a live session at the top of the sidebar (~200ms attach) would each fall back to a cold start in `navigateSidebar` and re-open the same row, eating 3-4 keystrokes during the attach window.

The override is installed synchronously by `openForNavItem` and reflects the most recent user intent. `installKeyboardNavOverrideClearListener` continues to clear it on any subsequent hashchange whose URL doesn't match the override, so staleness is bounded.

Pinned by tests/rapid-keystroke-nav.spec.ts:
- behavioural mirror with buggy + fixed `getActiveNavId` proves the drop pre-fix and 10-for-10 distinct rows post-fix
- source-level grep asserts the buggy `window.location.hash === expected` gate doesn't get reintroduced

Updates docs/perf/sidebar-nav-baseline.md §5.6 with before/after numbers.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
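The before/after behaviour can be mirrored in a small simulation, in the spirit of the "behavioural mirror" test mentioned above. Everything here is a stand-in: the state shape, the `#/session/<id>` hash format and the helper names are assumptions, and the real `getActiveNavId` reads from module state and `window.location` rather than taking arguments.

```typescript
// Hypothetical mirror of the bug and the fix; not the real sidebar code.
interface NavState {
  keyboardNavActiveId: string | null;
}

type ActiveIdFn = (state: NavState, currentHash: string) => string | null;

const HASH_PREFIX = "#/session/"; // assumption: illustrative hash format

function hashForNavId(id: string): string {
  return HASH_PREFIX + id;
}

function idFromHash(hash: string): string | null {
  return hash.startsWith(HASH_PREFIX) ? hash.slice(HASH_PREFIX.length) : null;
}

// Pre-fix: the override is only trusted once the URL hash has caught up.
const getActiveNavIdBuggy: ActiveIdFn = (state, currentHash) => {
  const override = state.keyboardNavActiveId;
  if (override !== null && currentHash === hashForNavId(override)) return override;
  return idFromHash(currentHash);
};

// Post-fix: the synchronously installed override always wins while it is set.
const getActiveNavIdFixed: ActiveIdFn = (state, currentHash) =>
  state.keyboardNavActiveId ?? idFromHash(currentHash);

// Presses "down" repeatedly while the hash stays stale, as during a slow attach.
function simulateRapidKeys(getActive: ActiveIdFn, rows: string[], presses: number): string[] {
  const state: NavState = { keyboardNavActiveId: null };
  const staleHash = hashForNavId(rows[0] ?? "");
  const opened: string[] = [];
  for (let i = 0; i < presses; i++) {
    const active = getActive(state, staleHash);
    const idx = active === null ? -1 : rows.indexOf(active);
    const next = rows[Math.min(idx + 1, rows.length - 1)];
    if (next === undefined) break;
    state.keyboardNavActiveId = next; // what openForNavItem is described as doing
    opened.push(next);
  }
  return opened;
}
```

With eleven rows and ten presses, the buggy getter keeps re-opening the second row because it keeps resolving the active id from the stale hash, while the fixed getter advances one row per keystroke.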

Phase 2 Opt-A: target paint.first p95 on large transcripts (medium p95 27.5ms → large p95 103ms, max 177ms per docs/perf/sidebar-nav-baseline.md §5.4). Synchronous markdown / syntax-highlight render of every message dominates first paint when the session has 100+ messages; rendering only the bottom-tail synchronously and deferring the rest via IntersectionObserver + requestIdleCallback should cut large-fixture paint.first p95 dramatically without affecting median.
- New <deferred-block> Lit element wraps each transcript item when the flag is on. Eager items (last 8 in <message-list>) render inline; the rest render a height-preserving placeholder until IO (rootMargin 500px) fires and rIC swaps in the real template.
- Ctrl+F / Cmd+F / F3 trigger DeferredBlock.forceResolveAll() so native browser-find sees the full transcript.
- Perf-flag OFF path is unchanged (no <deferred-block> wrapper at all).
- 7 new unit tests under tests/defer-offscreen-render.spec.ts pin the eager path, deferred-then-intersect resolve, Ctrl+F escape hatch, and the perf-flag-OFF historical behaviour.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
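The eager/deferred split behind Opt-A reduces to partitioning the transcript at a fixed tail length. A minimal sketch follows; the function name and return shape are illustrative, and the constant of 8 mirrors the "last 8" stated above. The IntersectionObserver and requestIdleCallback wiring is deliberately left out.

```typescript
// Hypothetical sketch of the first-paint partition only.
const DEFER_EAGER_TAIL = 8;

function partitionForFirstPaint<T>(
  items: readonly T[],
  eagerTail: number = DEFER_EAGER_TAIL,
): { deferred: T[]; eager: T[] } {
  // Everything before the cut renders as a placeholder; the tail renders inline.
  const cut = Math.max(0, items.length - eagerTail);
  return { deferred: items.slice(0, cut), eager: items.slice(cut) };
}
```

For a 200-message transcript this yields 192 placeholders and 8 synchronously rendered messages, while a transcript of 8 or fewer messages is rendered entirely inline.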

Postmortem retained in docs/perf/sidebar-nav-baseline.md §6.1. Root cause is architectural: REST is metadata-only and the transcript ships over WS, so ?stripToolContent=1 doubles the work for negative gain. Fixing properly is out of scope for this goal.
- DELETE src/server/agent/strip-tool-content.ts
- DELETE tests/session-strip-tool-content.test.ts
- src/server/server.ts: drop ?stripToolContent=1 parsing + invocation
- src/app/api.ts: drop _maybeLazyToolContent URL rewrite; keep perf-trace dispatch and Opt-C prefetch logic intact
- src/app/perf-flags.ts: remove lazyToolContent registry entry + PERF_FLAG_LAZY_TOOL_CONTENT const

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

n=5 confirmed within-noise; original "win" was a cold-cache artefact. Postmortem retained in docs/perf/sidebar-nav-baseline.md §6.3.
- Restore loadDashboardData to pre-Opt-D 7-fetch Promise.all + sequential getTeamState await.
- Drop src/app/goal-dashboard-fetches.ts helper.
- Drop parallelGoalFetches entry + PERF_FLAG_PARALLEL_GOAL_FETCHES const from src/app/perf-flags.ts.
- Drop tests/parallel-goal-fetches.spec.ts + fixtures.
- Drop one-off analysers scripts/opt-d-analyse.mjs + scripts/opt-c-summary.mjs (their aggregated results live in the cross-commit report).

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Per task spec, append a one-line revert note pointing at the revert commit. Postmortem stays intact. Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

n=5 confirmed only ~19ms median gain on nav.session.ready, far below the 100ms ship bar. Removing the cache + listener complexity per docs/perf/sidebar-nav-baseline.md §6.2 disposition revisited.
- src/app/api.ts: remove prefetchUrl/Session/Goal + 20-entry LRU + the _consumePrefetch consultation in gatewayFetch. Phase 1 perf-trace URL-dispatch and Opt-B's _maybeLazyToolContent are untouched.
- src/app/sidebar.ts: remove installSidebarPrefetchListener and its pointerover/focusin delegated handler.
- src/app/main.ts: remove the prefetch listener install call.
- src/app/perf-flags.ts: drop the prefetchOnHover registry entry and PERF_FLAG_PREFETCH_ON_HOVER const.
- tests/prefetch-on-hover.spec.ts: deleted.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Add <deferred-code-block> wrapper that renders a plain <pre><code> placeholder synchronously and upgrades to a real <code-block> (which runs hljs.highlight()) on requestIdleCallback (200ms timeout fallback to setTimeout(0)).

Transcript-path renderers now go through codeBlock(code, lang) helper: flag OFF emits <code-block> directly (byte-identical to today); flag ON emits the deferred wrapper. Eager-tail messages (Opt-A) can paint their visible code blocks immediately as plain monospace, freeing the click → first-paint critical path of hljs work.

Files:
- src/ui/components/syntax-highlight.ts (new, owns the element + helper)
- src/app/perf-flags.ts (deferSyntaxHighlight, default OFF)
- Swaps in Messages.ts + all transcript tool renderers
- Unit test tests/defer-syntax-highlight.spec.ts (4 cases)

Artifact viewers (src/ui/tools/artifacts/*) still call hljs directly via unsafeHTML; they sit on a separate panel off the sidebar-nav critical path and are out of scope.

Not touched: MessageList.ts (Opt-H), DeferredBlock.ts (Opt-A frozen), src/app/* other than perf-flags.ts.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Replaces Opt-A's fixed DEFER_EAGER_TAIL = 8 with a viewport-driven eager set when the new virtualiseTail perf flag is on: walk items bottom-up, accumulating estimateMessageHeight, eager only the bottom-most messages whose cumulative height fills window.innerHeight (plus the one that partially overflows the top edge). On a 1280x800 desktop with chunky messages (~400px each) this is 2-3 eager messages instead of 8 -> fewer synchronous renders at first paint.

OFF path is byte-for-byte the Opt-A baseline (verified by existing defer-offscreen-render.spec.ts).

New tests/virtualise-tail.spec.ts pins:
- flag-on, 200 fat msgs, 800px viewport -> 2 eager / 198 placeholders
- flag-on, short transcript -> all eager
- flag-on, single message taller than viewport -> bottom-most stays eager
- flag-off -> 8 eager regardless of message size

Append-only flag entry in src/app/perf-flags.ts (default-OFF, experiment). A/B benchmarking still to run.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
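The bottom-up walk can be sketched as a function over estimated heights. This is one reading of the description above, not the reverted implementation: the function name is hypothetical, heights are passed in directly instead of being computed by estimateMessageHeight, and the message that crosses the viewport edge is counted as eager.

```typescript
// Hypothetical sketch of the viewport-driven eager count.
// heightsPx is in transcript order (oldest first), so the walk starts at the end.
function countViewportEager(heightsPx: readonly number[], viewportPx: number): number {
  let filledPx = 0;
  let eager = 0;
  for (let i = heightsPx.length - 1; i >= 0; i--) {
    eager++; // the message that crosses the top edge is still rendered eagerly
    filledPx += heightsPx[i] ?? 0;
    if (filledPx >= viewportPx) break;
  }
  return eager;
}
```

With 400px messages and an 800px viewport the walk stops after two messages, and with 300px messages it stops after three because the third one partially overflows the top edge.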

Opt-C revert (commit 1fd1563) renamed the BOBBIT_PERF_FLAGS local from perfFlagsArg to perfFlagsCsv but missed the wantHoverWarmup reference on line 843, leaving a ReferenceError that fails every harness run. Blocks all parallel Phase 2 A/B experiments (Opt-F WS-attach, Opt-G defer-highlight, Opt-H virtualise-tail) on the goal branch. One-token rename. Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

A/B'd at commit 6edd880 (large fixture, n=5 per arm, interleaved). Hypothesis was that Opt-A's fixed eager-tail of 8 was still over-eager and shrinking it to a viewport-driven 2-3 would cut paint.first further. Data says no:

    span                        OFF p95   ON p95    Δmed
    paint.first                    57.7     56.2    −1.5
    nav.session.ready             182.3    188.9    +6.6
    rapidnav.keystroke.cached     174.9    170.0    −4.9
    nav.session.cold p50          330.6    315.1   −15.5

Largest delta is −1.5ms on paint.first p95 with fully overlapping replicate ranges. No critical span moves ≥100ms or past the 100ms snappy threshold.

Postmortem in docs/perf/sidebar-nav-baseline.md §6.4 explains why: Opt-A's win came from collapsing the 190+ off-screen messages to placeholders (200 → 8); shaving 8 → 2-3 is rounding-error because the Lit reconciler + IO bookkeeping over 200 wrappers dominates whatever per-message render cost we save in the tail.

Files reverted:
- src/app/perf-flags.ts: flag entry + const removed
- src/ui/components/MessageList.ts: back to DEFER_EAGER_TAIL = 8

Files deleted:
- tests/virtualise-tail.spec.ts
- tests/fixtures/virtualise-tail-{entry.ts,.html}
- .gitignore entries for the test bundle

Data retained: docs/perf/history/6edd880cb47b-opt-h-{off,on}-{1..5}.json as the durable record behind the postmortem.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

…ints

n=5 replicates each on the canonical realistic-large fixture, SHA d9750ca.

step 0 (baseline, Opt-A off via -deferOffscreenRender):
- nav.session.ready p50 median 140.1ms
- paint.first p50 median 25.3ms
- rapidnav.keystroke.cached p50 median 140.1ms

step 1 (+Opt-A, default flags):
- nav.session.ready p50 median 132.8ms
- paint.first p50 median 22.9ms
- rapidnav.keystroke.cached p50 median 133.5ms

Future steps land via scripts/perf-progression.mjs --step N+1.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

Tried deferring `RemoteAgent.connect()` off the click → first-paint critical path, with pre-`auth_ok` send buffering and a small inline indicator on the chat panel. n=5 medium A/B at SHA 552fb246b4b4 (10 history JSONs under docs/perf/history/).

Headline numbers (median across replicates):
- nav.session.ready p50: 115 → 107 ms (−8 ms, within noise)
- nav.session.ready p95: 156 → 171 ms (+15 ms, within noise)
- paint.first p50: 25 → 23 ms (noise)
- ws.attach p50: 46 → 59 ms (+13 ms, opposite direction)
- rapidnav.keystroke.cached p50: 131 → 117 ms (−13 ms, within noise)

No span clears the ≥100 ms p50 reduction bar, none move from >100 ms to <100 ms, and all median deltas sit inside the per-arm min/max ranges (i.e. inside the noise floor).

Why the hypothesis missed: `connectToSession()` already constructs the ChatPanel and calls `renderApp()` BEFORE `await remote.connect()`, so the `nav.session.ready` sentinel (`pi-chat-panel` committed + `appView === 'authenticated'`) closes on the first paint and never sees ws.attach on its critical path. Removing the await can't move a span that didn't include it.

Per the HOW-TO-REPEAT §7 discipline:
- Reverted all src changes (remote-agent.ts, session-manager.ts, perf-flags.ts entry + const).
- Deleted the unit test (tests/defer-ws-attach.spec.ts + fixtures/defer-ws-attach.html); the plumbing is gone, so the test has nothing to pin.
- Kept the 10 history JSONs + §6.4 postmortem in docs/perf/sidebar-nav-baseline.md as the durable record.
- docs/perf/sidebar-nav-report.html regenerated by the harness with the opt-f-{off,on} A/B pair included.

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

…ure)

A/B at ea634d6, n=5, realistic-large fixture (200 msgs x 32 sessions). Critical-span p50 medians:
- nav.session.ready 127.0 -> 116.4 (-10.6 ms, within noise)
- nav.session.cold 332.5 -> 318.2 (-14.3 ms, within noise)
- nav.goal.ready 31.4 -> 32.3 (+0.9 ms, noise)
- nav.goal.cold 1752.8 -> 1753.2 (+0.4 ms, noise)
- paint.first 23.3 -> 22.9 (-0.4 ms, noise)
- rapidnav.keystroke.cached 135.4 -> 126.0 (-9.4 ms, within noise)
- rapidnav.keystroke.uncached 136.6 -> 138.4 (+1.8 ms, noise)

Largest critical-span move is -14 ms (nav.session.cold), an order of magnitude below the >=100 ms p50 threshold from HOW-TO-REPEAT section 5. Every delta sits inside the run-to-run noise floor (off/on ranges overlap on every row). No span crosses the 100 ms snappy threshold under either arm. The report marks all four opt-g pair-rows as "within noise".

Why the theoretical win didn't materialise: Opt-A already defers off-screen messages behind an IntersectionObserver, so the bulk of code-block density (which lives off-screen in realistic transcripts) is already deferred. The eager-tail messages that render synchronously are dominated by markdown / DOM layout cost, not hljs tokenisation -- paint.first p50 is flat +/-0.4 ms across arms. Opt-G stacked deferral on deferral and ran out of marginal wins on the metric the harness keys the decision on.

Reverts 5f0fdeb (Opt-G implementation). Keeps:
- docs/perf/history/ea634d62dc3b-opt-g-{off,on}-{1..5}.json (10 history JSONs -- durable evidence behind the decision)
- docs/perf/sidebar-nav-baseline.md section 6.6 -- postmortem
- regenerated docs/perf/sidebar-nav-report.html (now 4 A/B pairs)

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>
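The decision rule applied in these A/B verdicts (overlapping replicate ranges mean "within noise"; otherwise ship only on a p50 reduction of at least 100 ms or a move from above to below the 100 ms snappy threshold) can be sketched as below. The function names, the "inconclusive" bucket and the use of simple min/max range overlap as the noise test are assumptions for illustration, not the report generator's code.

```typescript
// Hypothetical sketch of the A/B verdict over per-replicate p50 values.
interface ArmStats {
  median: number;
  min: number;
  max: number;
}

type Verdict = "ship" | "within noise" | "inconclusive";

function summarise(replicates: readonly number[]): ArmStats {
  const sorted = [...replicates].sort((a, b) => a - b);
  const n = sorted.length;
  if (n === 0) throw new Error("need at least one replicate");
  const mid = Math.floor(n / 2);
  const median = n % 2 === 1 ? sorted[mid]! : (sorted[mid - 1]! + sorted[mid]!) / 2;
  return { median, min: sorted[0]!, max: sorted[n - 1]! };
}

function judge(
  offMs: readonly number[],
  onMs: readonly number[],
  shipBarMs = 100,
  snappyMs = 100,
): Verdict {
  const off = summarise(offMs);
  const on = summarise(onMs);
  // Overlapping min/max ranges are treated as indistinguishable from run-to-run noise.
  const rangesOverlap = off.min <= on.max && on.min <= off.max;
  if (rangesOverlap) return "within noise";
  const gainMs = off.median - on.median;
  const crossesSnappy = off.median > snappyMs && on.median < snappyMs;
  return gainMs >= shipBarMs || crossesSnappy ? "ship" : "inconclusive";
}
```

Checking range overlap before looking at the median delta is what stops a roughly 10 ms median improvement from being read as a win when individual replicates in each arm vary by more than that.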

SuuBro and others added 30 commits May 13, 2026 21:19

merge: Phase 1 instrumentation + baseline

8ccbde6

feat(perf): harness expand ungrouped, ws.attach span, cross-commit re…

0179e20

…port Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

perf(harness): re-seed warm sessions, widen goal sentinel

999bdc2

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

docs(perf): drop pre-fix history entry (broken nav.session.ready samp…

9b726ff

…les) Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

merge: Phase 1 finishing — harness fixes, baseline numbers, cross-com…

e87c70f

…mit report

merge: polished cross-commit perf report

b5b341b

merge: Phase 3 budget E2E test scaffold

05a923d

merge: Phase 2B lazy tool-content optimisation behind feature flag

e0b4527

merge: nav.session.cold / nav.goal.cold spans

fedeeb8

merge: Phase 2A richer transcript fixture + re-baseline

4e7ee12

Co-authored-by: bobbit-ai <bobbit@bobbit.ai> # Conflicts: # docs/perf/sidebar-nav-baseline.md

merge: real-session JSONL profile

0479d4c

merge: perf documentation pass

be4da3e

Tune perf fixture to real-session corpus profile

5309f93

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

docs: regenerate cross-commit perf report with realistic-fixture runs

294bd68

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

merge: tune perf fixture to real-session profile + Ctrl+Down rapid-na…

d6585b4

…v measurement

merge: Opt-E fix dropped-keystroke bug on rapid Ctrl+Down

975c267

SuuBro and others added 27 commits May 14, 2026 08:49

merge: fix Opt-A test regressions

2939e0b

merge: revert Opt-B (lazyToolContent)

b74d2ab

docs(perf): mark Opt-D §6.3 as code-reverted

6332d0b

merge: revert Opt-D (parallelGoalFetches)

854826a

docs(perf): note Opt-C code revert at 26d9132 in postmortem

1fd1563

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

merge: revert Opt-C (prefetchOnHover)

b34de6d

Co-authored-by: bobbit-ai <bobbit@bobbit.ai> # Conflicts: # src/app/api.ts # src/app/perf-flags.ts

docs(perf): delete reverted-experiment history JSONs + add HOW-TO-REPEAT

691b8ef

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

docs(perf): add §9 HOW-TO-REPEAT link, §6.x revert footers, regen cro…

da66faf

…ss-commit report Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

merge: HOW-TO-REPEAT doc + history cleanup + regen report

8fe64a9

Co-authored-by: bobbit-ai <bobbit@bobbit.ai> # Conflicts: # docs/perf/sidebar-nav-baseline.md

docs(perf-report): label p50/p95 lines in legend, add y-axis label

2cc6837

Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

merge: Opt-G defer syntax highlighting A/B

0f5e5f4

fix(perf): remove dead Opt-C hover-warmup; use shell:true on Windows …

2868463

…for bench spawn Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

perf(report): add Shipped Progression panel + perf-progression.mjs wr…

d9750ca

…apper Co-authored-by: bobbit-ai <bobbit@bobbit.ai>

merge: shipped-progression panel

0848b64

merge: Opt-H virtualise eager-tail A/B

e9079a5

Co-authored-by: bobbit-ai <bobbit@bobbit.ai> # Conflicts: # docs/perf/sidebar-nav-report.html # tests/manual-integration/perf-sidebar-nav.spec.ts

merge: Opt-F defer WS attach A/B

ea634d6

Co-authored-by: bobbit-ai <bobbit@bobbit.ai> # Conflicts: # docs/perf/sidebar-nav-baseline.md # docs/perf/sidebar-nav-report.html

merge: Opt-G A/B verdict

e3e03a1

SuuBro mentioned this pull request May 14, 2026

perf: defer off-screen transcript render + fix dropped Ctrl+Down keystroke #584

Merged

SuuBro changed the title ~~Profile sidebar nav perf — instrumentation, baseline, cross-commit report~~ [FOLLOW-UP — parked] Perf instrumentation, harness, cross-commit report May 14, 2026

Merge remote-tracking branch 'origin/master' into goal/profile-si-320…

2f989ba

…532b0 Co-authored-by: bobbit-ai <bobbit@bobbit.ai> # Conflicts: # .gitignore

[FOLLOW-UP — parked] Perf instrumentation, harness, cross-commit report#579
SuuBro wants to merge 76 commits into master from goal/profile-si-320532b0
