v1.4.3 — click-anchored narration + always-release pan-zoom
Two structural fixes for the "I can't follow what's happening" feedback on the videos.
Narration — click-anchored windows
The body is now partitioned into windows by the agent's clicks (with very-close clicks merged so we don't get sub-second slivers). Each window has a fixed start, end, duration. The narrator writes ONE beat per window: they pick the words, the editor pins the timing to the click. Because clicks are when meaningful state changes happen on screen, narration anchored to a click is automatically anchored to the moment the viewer cares about. There's no possible drift between voice and visuals.
The narrator prompt was rewritten with much stronger guidance:
- Audience framing: "imagine the viewer is a teammate who has never seen this codebase. They have ~60 seconds and need to leave understanding what was built, why it matters, how it works, and whether the tests prove it works."
- Plan the story arc first: read PR body, map goals to windows, decide setup / execute / confirm for each window, THEN write beats.
- Name UI elements specifically in the developer's own vocabulary ("the new Bulk Archive button I added at the top") not generically ("a button").
- Concrete good vs bad examples for setup beats, confirm beats, transitions, and silent-investigation moments.
- Beats may be empty for windows with nothing meaningful to say — silent gap beats 4-word filler.
Goal-aware structuring: the narrator gets the test plan goals as the demo's chapter list and is told to march through them in order, each chapter spanning multiple windows.
Plus: fixed an upstream issue where the agent recorded ref=e123 (Playwright's internal handle) instead of the human-readable element description, so the narrator can now actually say "clicked Sign In" instead of "clicked ref=e123".
Pan-zoom — always release between clicks
The previous "ride zoom between clicks" mode held a 2× zoom on the last-clicked region for up to 12 seconds. If anything happened far from that region during the hold — a toast in the corner, a counter at the top, items reordering elsewhere — it was clipped out of frame.
The new cycle is per-click and always releases:
- APPROACH (0.5s before click): ease zoom 1.0 → 2.0
- click fires
- HOLD (0.7s): stay at 2.0 on click site so the viewer reads the immediate reaction
- RELEASE (0.5s): ease zoom 2.0 → 1.0, neutral framing
- neutral 1.0 until next APPROACH
During the gap between clicks the viewer always sees the full page after the hold releases — any visible change anywhere on screen is in frame by construction.
When approach overlaps previous click's hold/release (very tight click pairs), approach wins so the camera arrives on time.
Versions
- CLI:
1.4.3 - Plugin:
1.4.3
v1 tag force-updated.