feat(demo): record the hero + Codex/Claude demos beside a live dashboard #116

Merged
ThomasK33 merged 10 commits into main from feat/record-with-live-dashboard
Jun 3, 2026

Conversation

@ThomasK33
Member

Summary

Reworks both demo recordings so they show a coding agent and the live agent-tty dashboard side by side in a tmux split — the agent drives a session on the left, the dashboard mirrors it on the right. Covers the README hero GIF (assets/hero.tape) and the dogfood/agent-uses-agent-tty/ Codex/Claude proof bundle, regenerates the bundle, and embeds the recordings as inline GitHub <video> players.

Builds on the already-merged read-only Session Dashboard (#113).

What's in here

Recordings as a tmux split (agent ⟷ live dashboard)

  • Both panes share one AGENT_TTY_HOME so the dashboard auto-follows the newest session; status bar off so VHS's whole-screen Wait+Screen scrape stays unambiguous; each run uses an isolated, reaped tmux server socket.
  • The hero hides the split plumbing and launches the dashboard on camera (types agent-tty dashboard in the right pane, hops back with the tmux prefix), and showcases deterministic wait --regex on output text rather than sleeping. Panes/session run bash --norc with a minimal prompt so the mirror stays free of personal shell-prompt clutter.
  • Runs against this checkout's freshly-built dev CLI, since agent-tty dashboard is unreleased.
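For readers who want to reproduce the split by hand, here is a minimal sketch of the tmux plumbing described in the bullets above. It is not the PR's generateTape output: the helper names, the session name, and the `-f /dev/null` isolation flag are assumptions, and `split-window -l 50%` assumes tmux 3.1 or newer.

```typescript
// Hypothetical helpers; they only build argv arrays and never spawn tmux.
interface SplitOptions {
  socket: string; // per-run server socket name, passed to `tmux -L`
  session: string; // tmux session name (illustrative)
  dashboardPercent: number; // width of the right-hand dashboard pane
}

function splitCommands(o: SplitOptions): string[][] {
  // Assumption: `-f /dev/null` skips the user's tmux.conf for isolation.
  const tmux = ["tmux", "-L", o.socket, "-f", "/dev/null"];
  return [
    // Detached session whose first pane is a shell without personal rc files.
    [...tmux, "new-session", "-d", "-s", o.session, "bash", "--norc"],
    // Status bar off, so a whole-screen scrape only sees pane content.
    [...tmux, "set-option", "-g", "status", "off"],
    // Right-hand pane that will host the dashboard.
    [...tmux, "split-window", "-h", "-l", `${o.dashboardPercent}%`, "-t", o.session, "bash", "--norc"],
    // Move focus back to the left (agent) pane.
    [...tmux, "select-pane", "-L", "-t", o.session],
  ];
}

// Reaping the isolated server after the recording ends.
function reapCommand(socket: string): string[] {
  return ["tmux", "-L", socket, "kill-server"];
}
```

If the first command is spawned with AGENT_TTY_HOME set in its environment, both panes should inherit it from the newly started server, which is what lets the dashboard follow the agent's session.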

Dashboard polish

  • The session list scales with the terminal and shows full session IDs when width permits (src/dashboard/sessionListLayout.ts); the title also gets a dark-terminal-friendly color.

Recording workflow

  • mise run demo:agent-uses-agent-tty now defaults to a full promote run (both agents ×3, 180s) — no more forgetting --agent both --runs 3 --promote.
  • Runs record concurrently via a bounded worker pool (--concurrency, default 2 = Codex + Claude overlapping). Each run is mostly an idle review-window sleep, so overlapping roughly halves wall-clock; same-agent attempts stay serialized so two sessions of one account never record at once.
  • New mise run demo:hero task (depends on build); tmux added as a recorder prerequisite; ttyd OS-gated to Linux/CI (macOS uses brew install ttyd).

Video embedding

  • Both READMEs embed the uploaded H.264 MP4s as inline <video> players. renderReadme() is restored to emit <video> (it had regressed to thumbnail links, breaking apply-video-urls), and the upload encoder now targets the recording's probed resolution instead of a hardcoded 1600×900 (which was squishing the 1920×900 recordings).
  • Generated bundle outputs are excluded from oxfmt (integrity is still enforced by validate-bundles).

Testing

  • typecheck, lint, format:check, unit tests (incl. new sessionListLayout, mapWithConcurrency, renderReadme/apply-video-urls contract, and posterConcatFilter tests), and validate-bundle:canonical all pass locally.
  • The live recordings themselves were regenerated and reviewed by hand.

Manual verification still needed

Per dogfood/agent-uses-agent-tty/VIDEO_PLAYBACK.md, open this branch's README while logged out of GitHub and confirm both <video> elements stream (don't navigate to the user-attachments/... URL directly — it 404s anonymously by design).

🤖 Generated with Claude Code

ThomasK33 and others added 10 commits June 3, 2026 13:29
The README hero and the Codex/Claude recordings now record inside a tmux
two-pane split: the agent (or, in the hero, plain agent-tty CLI calls)
drives a session on the left while `agent-tty dashboard` live-mirrors it
on the right.

- hero.tape: hidden tmux split with clean `bash --norc` panes, the
  dashboard launched on camera (typed in the right pane), a
  deterministic-wait showcase (`run --no-wait` + `wait --regex` for the
  printed digits, so it blocks on real output not the echoed command), a
  50/50 split, and a hidden teardown so the GIF ends on the split.
- hero-demo.ts (Codex/Claude): generateTape builds the split off-camera
  so the recording opens directly on it; a preflight asserts the dashboard
  renderer is installed, runOne asserts the dashboard actually rendered,
  and the per-run tmux server is reaped after VHS.
- dashboard: the title is now bold cyan instead of a white-on-blue bar,
  which washed out on dark themes that remap ANSI blue to a light shade.
- mise: pin tmux; add a `demo:hero` task (depends on `build`); ttyd is
  os-gated to Linux (no macOS binary — install via brew on macOS).
- regenerate assets/hero.gif.
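To illustrate why the hero waits on the printed digits rather than on the command text, here is a small sketch of the matching idea. The command, the screen snapshots, and the helper name are invented for illustration; this is not how `wait --regex` is implemented.

```typescript
// Returns the index of the first screen snapshot the pattern matches, or -1.
function firstMatchingFrame(frames: readonly string[], pattern: RegExp): number {
  // Drop the g/y flags so RegExp.prototype.test stays stateless across frames.
  const stateless = new RegExp(pattern.source, pattern.flags.replace(/[gy]/g, ""));
  return frames.findIndex((screen) => stateless.test(screen));
}

// Hypothetical snapshots: first the typed command, then the command plus output.
const heroFrames = [
  "$ echo $((19 * 23))",
  "$ echo $((19 * 23))\n437\n$ ",
];

// The digits 437 never appear in the typed command, so this only matches
// once the shell has actually printed the result.
const matchedOnOutput = firstMatchingFrame(heroFrames, /^437$/m);

// A pattern that also occurs in the command text matches too early.
const matchedOnEcho = firstMatchingFrame(heroFrames, /echo/);
```

Here `matchedOnOutput` is 1 and `matchedOnEcho` is 0, which is the difference between blocking on real output and returning as soon as the command is echoed.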

Change-Id: Ie9fa66e0ef2827bc82be053010a2641d57b2e009
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
… ids

The session list pane was a fixed 28 columns, so session ids always
showed as the truncated `…last9` form. It now scales with the terminal
width — floored at 28 so it stays usable at 80 columns, capped at 40 so
the Live View keeps the bulk of the screen — and shows the full 26-char
ULID once the list is wide enough to fit it, falling back to the compact
form on narrow terminals. The layout math lives in a pure
`sessionListLayout` module with unit tests.
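A rough sketch of what such a pure layout module could look like follows. The 28/40 bounds, the 26-character ULID, and the 9-character tail come from the message above; the scaling fraction, the per-row chrome allowance, and the function names are assumptions, not the contents of src/dashboard/sessionListLayout.ts.

```typescript
const MIN_LIST_WIDTH = 28; // floor: usable at 80 columns
const MAX_LIST_WIDTH = 40; // cap: the Live View keeps most of the screen
const COMPACT_TAIL = 9; // characters kept in the compact form

// Assumptions for illustration only.
const LIST_FRACTION = 0.3; // share of the terminal given to the list
const ROW_CHROME = 4; // columns reserved for borders, marker, padding

// Width of the session list pane for a given terminal width.
function sessionListWidth(terminalColumns: number): number {
  const scaled = Math.floor(terminalColumns * LIST_FRACTION);
  return Math.min(MAX_LIST_WIDTH, Math.max(MIN_LIST_WIDTH, scaled));
}

// Full id when it fits, otherwise an ellipsis plus the last 9 characters.
function formatSessionId(id: string, listWidth: number): string {
  const room = listWidth - ROW_CHROME;
  return id.length <= room ? id : `…${id.slice(-COMPACT_TAIL)}`;
}
```

Under these assumed constants an 80-column terminal gets the 28-column floor and the compact id, while a 120-column terminal gets a 36-column list with room for the full ULID.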

Change-Id: I5dd85b246dbb43df80672674e8c0e7d997f68e33
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
`mise run demo:agent-uses-agent-tty` (no flags) did a single, non-promote
run that recorded into a temp debug dir and never updated the bundle — so
unless you happened to pass `--agent both --runs 3 --promote`, the run
went nowhere. The defaults are now the canonical regeneration
(`--agent both --runs 3 --record-seconds 180 --promote`), and a new
`--no-promote` flag turns it back into a quick test run
(e.g. `-- --no-promote --runs 1 --agent codex`). The promote-constraint
errors now point at `--no-promote`, and the generated reproduce.sh /
manifest command drop the now-default flags.
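The new defaults can be pictured with a small option parser. This is a sketch under stated assumptions: the flag names and default values are the ones quoted above, while the parser itself, its type names, and its error messages are hypothetical and not taken from the demo script.

```typescript
type AgentChoice = "codex" | "claude" | "both";

interface DemoOptions {
  agent: AgentChoice;
  runs: number;
  recordSeconds: number;
  promote: boolean;
}

// Defaults as described: the canonical regeneration run.
const DEMO_DEFAULTS: DemoOptions = { agent: "both", runs: 3, recordSeconds: 180, promote: true };

function positiveInt(flag: string | undefined, raw: string | undefined): number {
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) throw new Error(`${flag} expects a positive integer`);
  return n;
}

// Hypothetical parser: flags override the defaults one by one.
function parseDemoArgs(argv: readonly string[]): DemoOptions {
  const opts: DemoOptions = { ...DEMO_DEFAULTS };
  for (let i = 0; i < argv.length; i++) {
    const flag = argv[i];
    switch (flag) {
      case "--promote":
        opts.promote = true;
        break;
      case "--no-promote":
        opts.promote = false;
        break;
      case "--agent": {
        const value = argv[++i];
        if (value !== "codex" && value !== "claude" && value !== "both") {
          throw new Error("--agent expects codex, claude, or both");
        }
        opts.agent = value;
        break;
      }
      case "--runs":
        opts.runs = positiveInt(flag, argv[++i]);
        break;
      case "--record-seconds":
        opts.recordSeconds = positiveInt(flag, argv[++i]);
        break;
      default:
        throw new Error(`unknown flag: ${flag}`);
    }
  }
  return opts;
}
```

With no flags this yields the full promote run; passing `--no-promote --runs 1 --agent codex` yields the quick test run from the example above.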

Change-Id: Ic30529d1e6b7fcbcf07b15aff8b11ccbeceeaf17
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
The Claude recording blacks out its top ~1/5 to hide Claude's account
header, but the box was full-width (`w=iw`). With the new tmux split that
also covered the dashboard's title bar in the right pane, so the Claude
recording looked like it had no dashboard (Codex, which gets no redaction,
showed it fine). Limit the box to the left ~(100 - DASHBOARD_PANE_PERCENT)%
(the agent pane) so Claude's header is still hidden while the dashboard
stays visible. Re-record to apply it to the bundle.
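The narrowed box can be expressed as an ffmpeg `drawbox` filter string. The sketch below is illustrative: DASHBOARD_PANE_PERCENT is named in the message above, but its value here, the helper name, and the exact top-fifth height expression are assumptions.

```typescript
// Assumed value; the message above only names the constant.
const DASHBOARD_PANE_PERCENT = 50;

// Builds a drawbox filter covering the top fifth of the left (agent) pane only.
function redactionBoxFilter(dashboardPanePercent: number): string {
  if (!(dashboardPanePercent > 0 && dashboardPanePercent < 100)) {
    throw new RangeError("dashboardPanePercent must be between 0 and 100");
  }
  // Fraction of the frame width occupied by the agent pane.
  const agentFraction = (100 - dashboardPanePercent) / 100;
  // iw/ih are ffmpeg's input width/height; t=fill paints a solid box.
  return `drawbox=x=0:y=0:w=iw*${agentFraction}:h=ih/5:color=black:t=fill`;
}

const claudeRedaction = redactionBoxFilter(DASHBOARD_PANE_PERCENT);
```

With a 50% dashboard pane the width term becomes `iw*0.5` instead of the previous full-width `w=iw`, so the dashboard title in the right pane is left untouched.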

Change-Id: Ida1bc68846647ad27b20f33c3ad54c9cc38eab09
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
`mise run demo:agent-uses-agent-tty` writes README.md, the per-agent
`*-prompt.md` copies, promoted-run-summary.md, and manifest.json — whose
exact byte layout isn't oxfmt-canonical, so `format-check` failed after
every re-record unless you ran `npm run format` by hand. These are
generated proof artifacts (like dist/ and design/, already ignored), and
their integrity is enforced by `validate-bundles` (sha256 + bytes), so
format-checking them is redundant. Reformatting them *after* the promote
hashes them would also break those manifest hashes. Ignore them in oxfmt;
the hand-written VIDEO_PLAYBACK.md stays checked.
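To make the hash argument concrete, here is a minimal sketch of a sha256 + bytes check. The manifest entry shape and function names are assumptions and do not reflect the actual manifest.json schema or the `validate-bundles` implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical manifest entry shape.
interface ManifestEntry {
  path: string;
  sha256: string;
  bytes: number;
}

// Records the digest and byte length of a generated file's content.
function describeArtifact(path: string, content: string): ManifestEntry {
  const data = new TextEncoder().encode(content);
  const sha256 = createHash("sha256").update(data).digest("hex");
  return { path, sha256, bytes: data.byteLength };
}

// True only when both the digest and the byte count still match.
function matchesManifest(entry: ManifestEntry, content: string): boolean {
  const actual = describeArtifact(entry.path, content);
  return actual.sha256 === entry.sha256 && actual.bytes === entry.bytes;
}
```

Any whitespace change a formatter makes after the entry is recorded, such as realigning a table, changes the bytes and therefore fails this check.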

Change-Id: I4a20da2fe077876dee84710b3212a9c790701cbd
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Each agent run is mostly an idle review-window sleep, so the sequential
loop spent most of its ~30+ min wall-clock just waiting. Add a
--concurrency flag (default 2) and a bounded worker pool: the first
`runs` attempts per agent are interleaved across agents, so the default
overlaps one Codex + one Claude without ever running two sessions of the
same account at once.

The --promote retry budget (up to runs*2 attempts to reach the required
successes) is preserved as a batched top-up; same-agent retries stay
serial across rounds so concurrent sessions never pile onto one account.
Promotion selection stays deterministic (lowest passing index per agent).
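The invariant described here (overlap across agents, never within one agent) can be sketched with a bounded pool over per-agent lanes. This is an illustration, not the PR's scheduler: mapWithConcurrency is named in the test list, but its signature below is assumed, `recordAll` is hypothetical, and the retry top-up is not modeled.

```typescript
type Agent = "codex" | "claude";

// Bounded worker pool: at most `limit` calls to `fn` are in flight at once.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    while (true) {
      const i = next++; // claim the next unprocessed index
      if (i >= items.length) return;
      results[i] = await fn(items[i] as T, i);
    }
  };
  const poolSize = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  return results;
}

// One lane per agent: lanes overlap, attempts inside a lane run serially,
// so two sessions of the same account never record at the same time.
async function recordAll<R>(
  agents: readonly Agent[],
  runs: number,
  concurrency: number,
  recordOne: (agent: Agent, attempt: number) => Promise<R>,
): Promise<Map<Agent, R[]>> {
  const lanes = await mapWithConcurrency(agents, concurrency, async (agent) => {
    const results: R[] = [];
    for (let attempt = 1; attempt <= runs; attempt++) {
      results.push(await recordOne(agent, attempt));
    }
    return [agent, results] as const;
  });
  return new Map(lanes);
}
```

With two agents and a concurrency of 2, one Codex and one Claude attempt overlap at any moment, which matches the roughly halved wall-clock described above.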

Change-Id: I46652b51ad50817899c33832f4587b228417a6a7
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
The Claude outer WebM and thumbnail were re-encoded with an ffmpeg
`drawbox` over the top-left to hide the account header, which looked
odd in the split. Copy the Claude artifacts directly like Codex's
instead — no visual redaction. Text artifacts are still sanitized and
the promote leak check still runs, so committed transcripts/prompts
keep emails, home paths, and secrets scrubbed.

Change-Id: I586f4e72e5e50a85c9d3c5632150c94bf30767de
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Re-recorded Codex and Claude runs beside the live dashboard, with the
Claude pane now shown unredacted (no black bar). The manifest sha256/
bytes match the new artifacts (validate-bundle:canonical passes). Also
collapses the two demo-task run-commands in mise.toml onto one line.

Change-Id: I04c01dcb9a9365f9189ff8a0d94454e9c0ebe0f1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
…E videos

Two issues made the videos render wrong after a bundle regeneration:

- renderReadme had regressed to a thumbnail-link table, so promote
  stripped the inline <video> elements that apply-video-urls rewrites,
  and the task failed with "expected 2 <video> src attributes, found 0".
  Restore the <table>/<video> structure (export renderReadme plus a
  cross-module contract test so it can't drift again).

- The upload encoder hardcoded a 1600x900 frame, silently squishing the
  now-1920x900 recordings. Encode at the recording's own probed
  resolution instead, so the upload always preserves the source aspect
  ratio. Update VIDEO_PLAYBACK.md and add a posterConcatFilter test.
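As a worked example of the second issue, the sketch below derives the encode size from a probed resolution and quantifies the old distortion. The helper names are hypothetical, and rounding down to even dimensions is an assumption based on H.264 with 4:2:0 chroma requiring even width and height.

```typescript
interface FrameSize {
  width: number;
  height: number;
}

// Assumption: keep dimensions even, as 4:2:0 H.264 encoders require.
const roundDownToEven = (n: number): number => Math.max(2, Math.floor(n / 2) * 2);

// Encode at the recording's own size instead of a fixed 1600x900.
function encodeSize(probed: FrameSize): FrameSize {
  return { width: roundDownToEven(probed.width), height: roundDownToEven(probed.height) };
}

// ffmpeg scale filter for the chosen size.
function scaleFilter(probed: FrameSize): string {
  const size = encodeSize(probed);
  return `scale=${size.width}:${size.height}`;
}

// Ratio of the target aspect to the source aspect; 1 means no distortion.
function aspectDistortion(source: FrameSize, target: FrameSize): number {
  return target.width / target.height / (source.width / source.height);
}
```

For a 1920x900 recording forced into 1600x900 the distortion is 1600/1920, about 0.83, so everything appears roughly 17% narrower; encoding at the probed size keeps the ratio at 1.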

Point both the root and bundle README <video> elements at the freshly
uploaded Codex/Claude user-attachment MP4s. validate-bundle:canonical
passes (README.md and VIDEO_PLAYBACK.md manifest entries refreshed).
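The contract between the rendered README and the URL rewrite can be sketched as follows. The error message is the one quoted above; the function name, its signature, and the matching regex are assumptions rather than the actual apply-video-urls code.

```typescript
// Rewrites each <video ... src="..."> in order, failing loudly when the
// README does not contain exactly as many players as there are URLs.
function applyVideoUrls(readme: string, urls: readonly string[]): string {
  const pattern = /(<video\b[^>]*?\bsrc=")([^"]*)(")/g;
  const found = readme.match(pattern)?.length ?? 0;
  if (found !== urls.length) {
    throw new Error(`expected ${urls.length} <video> src attributes, found ${found}`);
  }
  let index = 0;
  return readme.replace(
    pattern,
    (_match: string, open: string, _oldSrc: string, close: string) => `${open}${urls[index++]}${close}`,
  );
}
```

A README rendered as thumbnail links contains no `<video>` elements, so this check fails with "found 0", which is the failure the contract test is meant to catch before promote runs.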

Change-Id: I755a67729f51676b3c160f3ef4eb70d87a60a316
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
ttyd was re-added to mise.toml (Linux-gated) but never written to the
lockfile, so `mise install --locked` failed during "Set up mise" on
Linux CI ("ttyd@1.7.7 is not in the lockfile") — failing test-unit
before any test ran. Lock ttyd@1.7.7 for the four Linux platforms with
checksums from upstream SHA256SUMS (x86_64/aarch64). macOS stays
unlocked since ttyd ships no macOS binary (brew provides it there).

Change-Id: Ic5bc2315ce11444fa2f2a90f7528c9a0b3a0780c
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
@ThomasK33 ThomasK33 merged commit 11b3b16 into main Jun 3, 2026
12 checks passed
@ThomasK33 ThomasK33 deleted the feat/record-with-live-dashboard branch June 3, 2026 17:18
@ThomasK33 ThomasK33 mentioned this pull request Jun 3, 2026