feat: add TUI e2e test harness#3252
Conversation
Introduces pkg/tui/tuitest — a renderer-less, fluent driver that runs the real Bubble Tea model offline against VCR cassettes — and a first batch of e2e tests under e2e/tui/ covering basic chat, editable prompt, command palette, and a golden snapshot. Assisted-By: Claude
Move scrollToBottom calls off command goroutines via scrollToBottomMsg, and snapshot session fields under mu.RLock before the SQLite upsert. Assisted-By: claude-sonnet-4-5
Assisted-By: docker-agent
Injectable clipboard writer and spy let tests assert copy-to-clipboard behavior without touching the real system clipboard. New mouse helpers (MoveMouseToText, ClickText, MustFindText) and an e2e click-to-copy test exercise the full flow end-to-end. Assisted-By: docker-agent
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
This PR introduces a well-designed TUI test harness with a fluent driver API, golden-snapshot support, debug escape hatches, mouse helpers, and a clipboard spy. The two data-race fixes in pkg/session/store.go and pkg/tui/components/messages/messages.go are correct: both InMemorySessionStore.UpdateSession and SQLiteSessionStore.UpdateSession now snapshot all concurrent-writer fields under session.mu.RLock before the SQLite upsert. One reliability issue in the new harness is flagged below.
| // captured, so a following Assert observes the post-message frame rather than | ||
| // racing the program's goroutine. captureModel pushes a frame on every Update, | ||
| // so exactly one new frame appears per message. | ||
| func (d *Driver) sendSync(msg tea.Msg) { |
There was a problem hiding this comment.
[MEDIUM] sendSync silently swallows timeout — no test failure when frame never arrives
sendSync polls d.frames.count() until a new frame appears or d.waitTimeout (10 s) elapses. When the deadline is exceeded without a new frame being captured, the function returns silently without calling d.tb.Errorf or d.tb.Fatalf. The test goroutine then continues as if the message was processed, and any immediately following Assert call evaluates the stale pre-message frame — potentially turning a real failure into a spurious pass.
Suggested fix: add a timeout path that fails the test with context, matching what WaitFor already does:
func (d *Driver) sendSync(msg tea.Msg) {
before := d.frames.count()
d.program.Send(msg)
deadline := time.Now().Add(d.waitTimeout)
for time.Now().Before(deadline) {
if d.frames.count() > before {
return
}
time.Sleep(pollInterval)
}
d.tb.Helper()
d.tb.Fatalf("tuitest: sendSync timed out after %s waiting for frame\n--- last frame ---\n%s",
d.waitTimeout, d.frames.latest())
}| // A blocking Send guarantees Run() has started before we return, so later | ||
| // sends are never dropped. The window size also flips the model to its | ||
| // ready state. | ||
| p.Send(tea.WindowSizeMsg{Width: width, Height: height}) |
There was a problem hiding this comment.
[LOW] Comment incorrectly claims p.Send is "blocking" — startup race may exist
The comment on line 159 states "A blocking Send guarantees Run() has started before we return", but tea.Program.Send() is non-blocking: it enqueues the message and returns immediately. The window size is already seeded via tea.WithWindowSize(width, height) passed to tea.NewProgram, so the model initialization itself is not at risk.
The practical risk is low because bubbletea v2's Send() typically uses a buffered channel, meaning the WindowSizeMsg is held until the event loop drains it rather than dropped. However, if the channel is unbuffered or the program exits very quickly before draining, the message could be silently lost with no error surfaced.
The comment should be corrected to avoid misleading future readers, e.g.:
// Send the initial window-size message. p.Send enqueues it; the event loop
// will drain it once Run() starts. WithWindowSize already seeds the model's
// initial size, so this is belt-and-suspenders to trigger any ready-state logic.
p.Send(tea.WindowSizeMsg{Width: width, Height: height})
End-to-end testing of a terminal UI is notoriously awkward: you either need a real PTY and a human to look at it, or you skip coverage entirely. This change introduces
pkg/tui/tuitest, a renderer-less driver that runs the real Bubble Tea model in-process against VCR cassettes, so the full message loop executes without any display hardware or timing tricks.The driver exposes a fluent API for sending input, waiting on conditions, and asserting against rendered frames. Golden-snapshot support pins full-screen output to a file so regressions are caught at a glance. Two debugging escape hatches are included:
--tuitest-livestreams every rendered frame to stderr in real time, and--tuitest-dumpwrites a frame archive to disk — both useful when a test fails in CI and you need to see what the screen looked like. Mouse helpers (MoveMouseToText,ClickText,MustFindText) and an injectable clipboard spy round out the harness, letting the click-to-copy flow be tested end-to-end without touching the system clipboard.The first batch of e2e tests under
e2e/tui/covers basic chat, editable prompt, command palette, golden snapshot, and copy-to-clipboard. Alongside those, two data races were fixed: scroll-to-bottom calls that were fired from command goroutines are now delivered via a message, and session fields are snapshotted undermu.RLockbefore the SQLite upsert, eliminating races surfaced by-raceruns of the new tests.task lintandtask buildpass. Focusedgo test -racepasses forpkg/tui/tuitestande2e/tui.task testcurrently fails on an unrelated pre-existing failure inpkg/teamloader(TestLoadRetainsAgentConfig: unknown provider typeopenai).