feat(demo): record the hero + Codex/Claude demos beside a live dashboard #116
Merged
The README hero and the Codex/Claude recordings now record inside a tmux two-pane split: the agent (or, in the hero, plain agent-tty CLI calls) drives a session on the left while `agent-tty dashboard` live-mirrors it on the right.
- hero.tape: hidden tmux split with clean `bash --norc` panes, the dashboard launched on camera (typed in the right pane), a deterministic-wait showcase (`run --no-wait` + `wait --regex` for the printed digits, so it blocks on real output, not the echoed command), a 50/50 split, and a hidden teardown so the GIF ends on the split.
- hero-demo.ts (Codex/Claude): generateTape builds the split off-camera so the recording opens directly on it; a preflight asserts the dashboard renderer is installed, runOne asserts the dashboard actually rendered, and the per-run tmux server is reaped after VHS.
- dashboard: the title is now bold cyan instead of a white-on-blue bar, which washed out on dark themes that remap ANSI blue to a light shade.
- mise: pin tmux; add a `demo:hero` task (depends on `build`); ttyd is os-gated to Linux (no macOS binary; install via brew on macOS).
- regenerate assets/hero.gif.
Change-Id: Ie9fa66e0ef2827bc82be053010a2641d57b2e009
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
… ids
The session list pane was a fixed 28 columns, so session ids always showed as the truncated `…last9` form. It now scales with the terminal width (floored at 28 so it stays usable at 80 columns, capped at 40 so the Live View keeps the bulk of the screen) and shows the full 26-char ULID once the list is wide enough to fit it, falling back to the compact form on narrow terminals. The layout math lives in a pure `sessionListLayout` module with unit tests.
Change-Id: I5dd85b246dbb43df80672674e8c0e7d997f68e33
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
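The width rule in this commit can be illustrated with a small pure function. This is only a sketch, not the PR's `sessionListLayout` module: the names `sketchSessionListLayout` and `formatSessionId`, the one-third width ratio, and the 8 reserved columns are assumptions. Only the 28-column floor, the 40-column cap, the 26-char ULID, and the `…last9` compact form come from the commit message.

```typescript
// Hypothetical sketch of the layout math; not the actual sessionListLayout module.
const MIN_LIST_WIDTH = 28; // floor from the commit message (usable at 80 columns)
const MAX_LIST_WIDTH = 40; // cap from the commit message (Live View keeps most of the screen)
const ULID_LENGTH = 26; // length of a full session id
const COMPACT_TAIL = 9; // compact form is "…" plus the last 9 characters
const WIDTH_RATIO = 1 / 3; // ASSUMPTION: share of the terminal given to the list
const RESERVED_COLUMNS = 8; // ASSUMPTION: columns used by borders and status markers

interface SessionListLayout {
  listWidth: number;
  showFullId: boolean;
}

function sketchSessionListLayout(terminalColumns: number): SessionListLayout {
  const scaled = Math.floor(terminalColumns * WIDTH_RATIO);
  // Clamp the scaled width into the [28, 40] range.
  const listWidth = Math.min(MAX_LIST_WIDTH, Math.max(MIN_LIST_WIDTH, scaled));
  // Show the full ULID only when it fits next to the reserved columns.
  const showFullId = listWidth - RESERVED_COLUMNS >= ULID_LENGTH;
  return { listWidth, showFullId };
}

function formatSessionId(id: string, layout: SessionListLayout): string {
  if (layout.showFullId || id.length <= COMPACT_TAIL) return id;
  return `…${id.slice(-COMPACT_TAIL)}`;
}
```

Under these assumed constants an 80-column terminal keeps the 28-column list with compact ids, while a wide terminal reaches the 40-column cap and shows the full id. Keeping the function pure is what makes it straightforward to unit test.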
`mise run demo:agent-uses-agent-tty` (no flags) did a single, non-promote run that recorded into a temp debug dir and never updated the bundle, so unless you happened to pass `--agent both --runs 3 --promote`, the run went nowhere. The defaults are now the canonical regeneration (`--agent both --runs 3 --record-seconds 180 --promote`), and a new `--no-promote` flag turns it back into a quick test run (e.g. `-- --no-promote --runs 1 --agent codex`). The promote-constraint errors now point at `--no-promote`, and the generated reproduce.sh / manifest command drop the now-default flags.
Change-Id: Ic30529d1e6b7fcbcf07b15aff8b11ccbeceeaf17
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
The Claude recording blacks out its top ~1/5 to hide Claude's account header, but the box was full-width (`w=iw`). With the new tmux split that also covered the dashboard's title bar in the right pane, so the Claude recording looked like it had no dashboard (Codex, which gets no redaction, showed it fine). Limit the box to the left ~(100 - DASHBOARD_PANE_PERCENT)% (the agent pane) so Claude's header is still hidden while the dashboard stays visible. Re-record to apply it to the bundle.
Change-Id: Ida1bc68846647ad27b20f33c3ad54c9cc38eab09
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
`mise run demo:agent-uses-agent-tty` writes README.md, the per-agent `*-prompt.md` copies, promoted-run-summary.md, and manifest.json. Their exact byte layout isn't oxfmt-canonical, so `format-check` failed after every re-record unless you ran `npm run format` by hand. These are generated proof artifacts (like dist/ and design/, already ignored), and their integrity is enforced by `validate-bundles` (sha256 + bytes), so format-checking them is redundant. Reformatting them *after* the promote hashes them would also break those manifest hashes. Ignore them in oxfmt; the hand-written VIDEO_PLAYBACK.md stays checked.
Change-Id: I4a20da2fe077876dee84710b3212a9c790701cbd
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Each agent run is mostly an idle review-window sleep, so the sequential loop spent most of its ~30+ min wall-clock just waiting. Add a `--concurrency` flag (default 2) and a bounded worker pool: the first `runs` attempts per agent are interleaved across agents, so the default overlaps one Codex + one Claude without ever running two sessions of the same account at once. The `--promote` retry budget (up to runs*2 attempts to reach the required successes) is preserved as a batched top-up; same-agent retries stay serial across rounds so concurrent sessions never pile onto one account. Promotion selection stays deterministic (lowest passing index per agent).
Change-Id: I46652b51ad50817899c33832f4587b228417a6a7
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
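One way to get the "overlap agents, never overlap the same account" behavior is to run attempts in rounds, with each round containing at most one attempt per agent and a bounded pool inside the round. The sketch below is an illustration under assumptions, not the PR's scheduler. `mapWithConcurrency` is named in the PR's test list, but its signature here is a guess, and `Attempt` and `runInRounds` are hypothetical. The `--promote` retry top-up is not modeled.

```typescript
// Hypothetical sketch; signatures are assumptions, not the PR's code.

// Bounded worker pool: at most `limit` calls to `fn` are in flight at once,
// and results keep the order of `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  const worker = async (): Promise<void> => {
    // Each worker claims the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index], index);
    }
  };
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

interface Attempt {
  agent: string;
  index: number;
}

// Each round holds one attempt per agent, so two attempts for the same agent
// can never run at the same time; rounds themselves run one after another.
async function runInRounds<R>(
  agents: readonly string[],
  runs: number,
  concurrency: number,
  runOne: (attempt: Attempt) => Promise<R>,
): Promise<R[]> {
  const all: R[] = [];
  for (let index = 0; index < runs; index++) {
    const round = agents.map((agent): Attempt => ({ agent, index }));
    all.push(...(await mapWithConcurrency(round, concurrency, runOne)));
  }
  return all;
}
```

With two agents and a concurrency of 2, every round overlaps one Codex and one Claude attempt, which is where the rough halving of wall-clock time would come from. The trade-off of strict rounds is that a fast agent waits for the slow one before its next attempt starts.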
The Claude outer WebM and thumbnail were re-encoded with an ffmpeg `drawbox` over the top-left to hide the account header, which looked odd in the split. Copy the Claude artifacts directly like Codex's instead, with no visual redaction. Text artifacts are still sanitized and the promote leak check still runs, so committed transcripts/prompts keep emails, home paths, and secrets scrubbed.
Change-Id: I586f4e72e5e50a85c9d3c5632150c94bf30767de
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Re-recorded Codex and Claude runs beside the live dashboard, with the Claude pane now shown unredacted (no black bar). The manifest sha256/bytes match the new artifacts (validate-bundle:canonical passes). Also collapses the two demo-task run-commands in mise.toml onto one line.
Change-Id: I04c01dcb9a9365f9189ff8a0d94454e9c0ebe0f1
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
…E videos
Two issues made the videos render wrong after a bundle regeneration:
- renderReadme had regressed to a thumbnail-link table, so promote stripped the inline <video> elements that apply-video-urls rewrites, and the task failed with "expected 2 <video> src attributes, found 0". Restore the <table>/<video> structure (export renderReadme plus a cross-module contract test so it can't drift again).
- The upload encoder hardcoded a 1600x900 frame, silently squishing the now-1920x900 recordings. Encode at the recording's own probed resolution instead, so the upload always preserves the source aspect ratio. Update VIDEO_PLAYBACK.md and add a posterConcatFilter test.
Point both the root and bundle README <video> elements at the freshly uploaded Codex/Claude user-attachment MP4s. validate-bundle:canonical passes (README.md and VIDEO_PLAYBACK.md manifest entries refreshed).
Change-Id: I755a67729f51676b3c160f3ef4eb70d87a60a316
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
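To make the encoder fix concrete, here is a small sketch of deriving the scale filter from a probed resolution instead of a fixed frame. The helper names (`parseProbedResolution`, `scaleFilterFor`, `aspectRatio`) are hypothetical, and the ffprobe invocation in the comment is a common idiom rather than the command this PR is known to use. Rounding down to even dimensions is an added assumption, since H.264 with yuv420p generally requires them.

```typescript
// Hypothetical helpers; the PR's encoder code is not shown on this page.
interface Resolution {
  width: number;
  height: number;
}

// Parses output such as "1920x900", e.g. as printed by:
//   ffprobe -v error -select_streams v:0 \
//     -show_entries stream=width,height -of csv=s=x:p=0 <file>
function parseProbedResolution(probeOutput: string): Resolution {
  const match = /^(\d+)x(\d+)$/.exec(probeOutput.trim());
  if (!match) {
    throw new Error(`unexpected probe output: ${JSON.stringify(probeOutput)}`);
  }
  return { width: Number(match[1]), height: Number(match[2]) };
}

// ASSUMPTION: round down to even values for yuv420p-friendly dimensions.
function toEven(value: number): number {
  return value - (value % 2);
}

// Builds an ffmpeg `scale` filter that keeps the source frame size.
function scaleFilterFor(resolution: Resolution): string {
  return `scale=${toEven(resolution.width)}:${toEven(resolution.height)}`;
}

function aspectRatio(resolution: Resolution): number {
  return resolution.width / resolution.height;
}
```

A 1920x900 source has an aspect ratio of about 2.13, while a hardcoded 1600x900 frame is about 1.78, which is why the fixed frame squeezed the picture horizontally.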
ttyd was re-added to mise.toml (Linux-gated) but never written to the
lockfile, so `mise install --locked` failed during "Set up mise" on
Linux CI ("ttyd@1.7.7 is not in the lockfile") — failing test-unit
before any test ran. Lock ttyd@1.7.7 for the four Linux platforms with
checksums from upstream SHA256SUMS (x86_64/aarch64). macOS stays
unlocked since ttyd ships no macOS binary (brew provides it there).
Change-Id: Ic5bc2315ce11444fa2f2a90f7528c9a0b3a0780c
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Summary
Reworks both demo recordings so they show a coding agent and the live `agent-tty dashboard` side by side in a tmux split: the agent drives a session on the left, the dashboard mirrors it on the right. Covers the README hero GIF (`assets/hero.tape`) and the `dogfood/agent-uses-agent-tty/` Codex/Claude proof bundle, regenerates the bundle, and embeds the recordings as inline GitHub `<video>` players.
Builds on the already-merged read-only Session Dashboard (#113).
What's in here
Recordings as a tmux split (agent ⟷ live dashboard)
- … `AGENT_TTY_HOME` so the dashboard auto-follows the newest session; status bar off so VHS's whole-screen `Wait+Screen` scrape stays unambiguous; each run uses an isolated, reaped tmux server socket.
- … `agent-tty dashboard` in the right pane, hops back with the tmux prefix), and showcases deterministic `wait --regex` on output text rather than sleeping. Panes/session run `bash --norc` with a minimal prompt so the mirror stays free of personal shell-prompt clutter.
- … `agent-tty dashboard` is unreleased.
Dashboard polish
- … `src/dashboard/sessionListLayout.ts`), and a dark-terminal-friendly title color.
Recording workflow
- `mise run demo:agent-uses-agent-tty` now defaults to a full promote run (both agents ×3, 180s), so there is no more forgetting `--agent both --runs 3 --promote`.
- … `--concurrency`, default 2 = Codex + Claude overlapping). Each run is mostly an idle review-window sleep, so overlapping roughly halves wall-clock; same-agent attempts stay serialized so two sessions of one account never record at once.
- `mise run demo:hero` task (depends on `build`); `tmux` added as a recorder prerequisite; `ttyd` OS-gated to Linux/CI (macOS uses `brew install ttyd`).
Video embedding
- … `<video>` players. `renderReadme()` is restored to emit `<video>` (it had regressed to thumbnail links, breaking `apply-video-urls`), and the upload encoder now targets the recording's probed resolution instead of a hardcoded 1600×900 (which was squishing the 1920×900 recordings).
- … `oxfmt` (integrity is still enforced by `validate-bundles`).
Testing
`typecheck`, `lint`, `format:check`, unit tests (incl. new `sessionListLayout`, `mapWithConcurrency`, `renderReadme` ↔ `apply-video-urls` contract, and `posterConcatFilter` tests), and `validate-bundle:canonical` all pass locally.
Manual verification still needed
Per `dogfood/agent-uses-agent-tty/VIDEO_PLAYBACK.md`, open this branch's README while logged out of GitHub and confirm both `<video>` elements stream (don't navigate to the `user-attachments/...` URL directly; it 404s anonymously by design).
🤖 Generated with Claude Code