Skip to content

feat: add TUI e2e test harness#3252

Merged
dgageot merged 4 commits into
mainfrom
worktree-board-526bf697506cdadb
Jun 26, 2026
Merged

feat: add TUI e2e test harness#3252
dgageot merged 4 commits into
mainfrom
worktree-board-526bf697506cdadb

Conversation

@dgageot

@dgageot dgageot commented Jun 26, 2026

Copy link
Copy Markdown
Member

End-to-end testing of a terminal UI is notoriously awkward: you either need a real PTY and a human to look at it, or you skip coverage entirely. This change introduces pkg/tui/tuitest, a renderer-less driver that runs the real Bubble Tea model in-process against VCR cassettes, so the full message loop executes without any display hardware or timing tricks.

The driver exposes a fluent API for sending input, waiting on conditions, and asserting against rendered frames. Golden-snapshot support pins full-screen output to a file so regressions are caught at a glance. Two debugging escape hatches are included: --tuitest-live streams every rendered frame to stderr in real time, and --tuitest-dump writes a frame archive to disk — both useful when a test fails in CI and you need to see what the screen looked like. Mouse helpers (MoveMouseToText, ClickText, MustFindText) and an injectable clipboard spy round out the harness, letting the click-to-copy flow be tested end-to-end without touching the system clipboard.

The first batch of e2e tests under e2e/tui/ covers basic chat, editable prompt, command palette, golden snapshot, and copy-to-clipboard. Alongside those, two data races were fixed: scroll-to-bottom calls that were fired from command goroutines are now delivered via a message, and session fields are snapshotted under mu.RLock before the SQLite upsert, eliminating races surfaced by -race runs of the new tests.

task lint and task build pass. Focused go test -race passes for pkg/tui/tuitest and e2e/tui. task test currently fails on an unrelated pre-existing failure in pkg/teamloader (TestLoadRetainsAgentConfig: unknown provider type openai).

dgageot added 4 commits June 25, 2026 19:33
Introduces pkg/tui/tuitest — a renderer-less, fluent driver that runs
the real Bubble Tea model offline against VCR cassettes — and a first
batch of e2e tests under e2e/tui/ covering basic chat, editable prompt,
command palette, and a golden snapshot.

Assisted-By: Claude
Move scrollToBottom calls off command goroutines via scrollToBottomMsg,
and snapshot session fields under mu.RLock before the SQLite upsert.

Assisted-By: claude-sonnet-4-5
Injectable clipboard writer and spy let tests assert copy-to-clipboard
behavior without touching the real system clipboard. New mouse helpers
(MoveMouseToText, ClickText, MustFindText) and an e2e click-to-copy
test exercise the full flow end-to-end.

Assisted-By: docker-agent
@dgageot dgageot requested a review from a team as a code owner June 26, 2026 08:58
@aheritier aheritier added area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) area/testing Test infrastructure, CI/CD, test runners, evaluation area/tui For features/issues/fixes related to the TUI kind/feat PR adds a new feature (maps to feat:). Use on PRs only. labels Jun 26, 2026

@docker-agent docker-agent left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

This PR introduces a well-designed TUI test harness with a fluent driver API, golden-snapshot support, debug escape hatches, mouse helpers, and a clipboard spy. The two data-race fixes in pkg/session/store.go and pkg/tui/components/messages/messages.go are correct: both InMemorySessionStore.UpdateSession and SQLiteSessionStore.UpdateSession now snapshot all concurrent-writer fields under session.mu.RLock before the SQLite upsert. One reliability issue in the new harness is flagged below.

Comment thread pkg/tui/tuitest/driver.go
// captured, so a following Assert observes the post-message frame rather than
// racing the program's goroutine. captureModel pushes a frame on every Update,
// so exactly one new frame appears per message.
func (d *Driver) sendSync(msg tea.Msg) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] sendSync silently swallows timeout — no test failure when frame never arrives

sendSync polls d.frames.count() until a new frame appears or d.waitTimeout (10 s) elapses. When the deadline is exceeded without a new frame being captured, the function returns silently without calling d.tb.Errorf or d.tb.Fatalf. The test goroutine then continues as if the message was processed, and any immediately following Assert call evaluates the stale pre-message frame — potentially turning a real failure into a spurious pass.

Suggested fix: add a timeout path that fails the test with context, matching what WaitFor already does:

func (d *Driver) sendSync(msg tea.Msg) {
    before := d.frames.count()
    d.program.Send(msg)
    deadline := time.Now().Add(d.waitTimeout)
    for time.Now().Before(deadline) {
        if d.frames.count() > before {
            return
        }
        time.Sleep(pollInterval)
    }
    d.tb.Helper()
    d.tb.Fatalf("tuitest: sendSync timed out after %s waiting for frame\n--- last frame ---\n%s",
        d.waitTimeout, d.frames.latest())
}

Comment thread pkg/tui/tuitest/driver.go
// A blocking Send guarantees Run() has started before we return, so later
// sends are never dropped. The window size also flips the model to its
// ready state.
p.Send(tea.WindowSizeMsg{Width: width, Height: height})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[LOW] Comment incorrectly claims p.Send is "blocking" — startup race may exist

The comment on line 159 states "A blocking Send guarantees Run() has started before we return", but tea.Program.Send() is non-blocking: it enqueues the message and returns immediately. The window size is already seeded via tea.WithWindowSize(width, height) passed to tea.NewProgram, so the model initialization itself is not at risk.

The practical risk is low because bubbletea v2's Send() typically uses a buffered channel, meaning the WindowSizeMsg is held until the event loop drains it rather than dropped. However, if the channel is unbuffered or the program exits very quickly before draining, the message could be silently lost with no error surfaced.

The comment should be corrected to avoid misleading future readers, e.g.:

// Send the initial window-size message. p.Send enqueues it; the event loop
// will drain it once Run() starts. WithWindowSize already seeds the model's
// initial size, so this is belt-and-suspenders to trigger any ready-state logic.
p.Send(tea.WindowSizeMsg{Width: width, Height: height})

@dgageot dgageot merged commit 2dfb61b into main Jun 26, 2026
14 checks passed
@dgageot dgageot deleted the worktree-board-526bf697506cdadb branch June 26, 2026 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/sessions For features/issues/fixes related to session lifecycle (resume, persistence, export) area/testing Test infrastructure, CI/CD, test runners, evaluation area/tui For features/issues/fixes related to the TUI kind/feat PR adds a new feature (maps to feat:). Use on PRs only.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants