Releases · marcushyett/tik-test

WordCaption was clamping each word to a 0.28s minimum frame budget. When narration sped up (up to 1.5×) to fit a click window, captions visibly lagged the voice — every page would finish a few hundred ms after the spoken line. Fixes:

Drop the per-word floor when voice is provided; let the highlight track the voice exactly.
Cap the effective caption span at the segment duration so a sequence cut-off can't make captions outrun the on-screen voice.
Pick the latest started caption page over an earlier still-fading one — a slow page-out can no longer visually block a page that's already started speaking.

2. Verification stamps over the video

New animated overlay drawn at the EXACT moment the agent declared each goal's outcome — a green check (pass), red cross (fail), or grey dash (skipped) with the goal's shortLabel below it. 1.8s lifetime, de-overlaps when goals end close together. Punctuates the recording so viewers see "this thing was just verified" without having to wait for the outro checklist.

Hand-drawn SVG (no emoji) with stroke-dash drawing animation, spring scale-in, and label slide-up.

3. Demo discipline — kills the "click and re-check the same thing" loop

The agent was clicking + screenshotting + re-snapshotting + re-clicking the same surface multiple times per goal. From the viewer's perspective, that's dead video — the same thing happening over and over with no forward progress.

Both fast and meticulous prompts now open with an explicit:

ONE GOAL = ONE DEMO BEAT — ACT (1–3 user actions) → VERIFY (one check) → OUTCOME → STOP. The first verification IS the verdict. Once you've seen the success state once, your next message is OUTCOME. No 'just to be sure' second screenshot.

The stuck-loop tripwire now fires on the second occurrence (was third) and explicitly names the cases: re-screenshotting / re-snapshotting / re-clicking / re-typing a surface you already touched in this goal.

Versions kept in sync

package.json → 1.4.7
plugin/.claude-plugin/plugin.json → 1.4.7
v1 tag force-pushed to point at this commit

Assets 2

01 May 22:03

marcushyett

v1.4.6

303d65d

v1.4.6 — missing-content protocol + realistic desktop viewport

Two fixes bundled.

1. Missing-content protocol — kills the most common agent loop

When the agent took an action and the verification snapshot didn't show what it expected, it would loop: retry, screenshot, retry, screenshot. Especially bad on chat replies that scroll off, spinners that finish too fast, toasts that fade.

Both fast and meticulous prompts (src/goal-agent.ts) now have a mandatory rule: if the snapshot doesn't show the expected content, the next tool call MUST be __tikHistory.find({ text, sinceMs: 30000 }) — not a retry, not another screenshot.

The 30-second mutation buffer (added in v1.4.5) disambiguates the failure for the agent — it doesn't have to guess whether the element was ephemeral, scrolled off, or never rendered. Buffer hits → pass with "verified programmatically (time-travel): …". Buffer empty → real failure.

Mandatory for any goal whose success criterion mentions a chat/message/reply, a notification/toast/banner, a loading/spinner state, an animation, or any text appearing on a feed-style or scrolling surface.

2. Realistic 2026 desktop viewport (1920×1080)

Default desktop fallback bumped from 1280×800 → 1920×1080 to match the actual resolution most US users have in 2026 (StatCounter: ~21.5% share, top resolution). The snapWidth function gains a 1920 wide-desktop bucket so consumers who request 1920 keep the resolution end-to-end (1920 → 1080 canvas at 9/16 — clean repeating fraction). Aligned across runner.ts, config.ts, plan.ts, single-video-editor.ts, remotion/Root.tsx.

Consumers who explicitly set viewport: 1280x800 (or any other) in their tiktest.md are unaffected — only the unspecified default changes.

Versions kept in sync

package.json → 1.4.6
plugin/.claude-plugin/plugin.json → 1.4.6
v1 tag force-pushed to point at this commit

Assets 2

01 May 21:49

marcushyett

v1.4.5

0992ce5

v1.4.5 — page-side time-travel buffer for ephemeral UI

What changed

Agent loops on ephemeral state are the most expensive failure mode tik-test had: a chat reply scrolls out, a spinner completes, a toast fades — the agent takes a snapshot, sees nothing, retries, sees nothing again, eventually gives up or marks the feature broken when it actually worked.

Fixed permanently with one small page script:

`window.__tikHistory` — 30-second time-travel buffer

Records every meaningful DOM mutation (added / removed / text-changed / attr-changed) with text, bbox, timestamp, tag, testid, containerTestid, aria-label, role. Three queries:

__tikHistory.find({ text, tag, testid, containerTestid, kind, sinceMs })
__tikHistory.transients(sinceMs)  // elements that appeared then disappeared
__tikHistory.stats()

The agent uses these via the existing browser_evaluate tool — no new MCP infra. Trims itself on every push so memory stays ~900KB.

`window.__tikFreeze` — pause / resume animations

await __tikFreeze.pause()    // pauses CSS animations + Web Animations API
// ... browser_take_screenshot ...
await __tikFreeze.resume()

One-liner replacement for the inline freeze recipe the prompt already taught.

Agent prompt updates

Both fast and meticulous prompts gain a new tier 3 in the verification hierarchy: TIME-TRAVEL via __tikHistory, slotted between freeze-then-screenshot and generic programmatic fallback. The prompt includes literal example queries for the four canonical ephemeral cases:

Chat-bot reply scrolled off: __tikHistory.find({ text: 'expected reply', sinceMs: 30000 })
Spinner completed too fast: __tikHistory.transients(8000)
Toast faded: __tikHistory.find({ containerTestid: 'toast-region', kind: 'added' })
aria-busy flipped: __tikHistory.find({ kind: 'attr-changed' }) and look for changedAttr === 'aria-busy'

Outcomes verified this way start with verified programmatically (time-travel): so reviewers can tell which tier produced the result.

Why this is the simple solution

One page-side script (~120 lines), no external deps, no MCP changes, no tracing infra.
Survives navigations because addInitScript runs before page scripts.
Agent reaches it via the tool it already uses (browser_evaluate).
Buffer is bounded (~900KB max) and self-trimming.

Versions

CLI: 1.4.5
Plugin: 1.4.5

v1 tag force-updated.

Assets 2

01 May 06:51

marcushyett

v1.4.4

96a6753

v1.4.4 — ride zoom by default, release on off-target DOM mutations

What changed

v1.4.3 always-released zoom between clicks, which broke tight click sequences (form-fills, multi-step wizards) where the camera should stay magnified across consecutive nearby actions.

This restores ride-mode as the default and adds a real signal for when the camera SHOULD release: page-side MutationObserver data capturing which regions of the page mutate after each click.

The rule

Mutation lands inside the clicked element's bbox (button press feedback, focus ring, dropdown on the click site) → expected, keep ride.
Mutation lands outside the clicked element's bbox (toast in a corner, counter at the top updates, items reorder elsewhere) → editor flags that gap for release; pan-zoom drops to neutral so the viewer sees the full page during the change.

Implementation

runner.ts injects two new page scripts via addInitScript:
- The interaction recorder now also pushes event.target.getBoundingClientRect() on each click via __tikRecordClickBbox, so we know exactly what region was clicked.
- A new MutationObserver script (__tikRecordMutation) reports the bounding rect of every meaningful DOM change. Filters out invisible nodes, zero-size rects, off-screen rects, and throttles same-position mutations within 200ms so a long type-into-input flurry doesn't flood.
single-video-editor.ts maps clicks + mutations to body-relative time, then for each click scans the post-click window (3.5s) for mutations whose bbox isn't contained in the padded click bbox. Off-target mutations build zoom-release intervals; overlapping intervals merge.
SingleVideoReel.tsx pan-zoom v5: ride zoom by default (hold + pan to next click), release only when (a) the gap intersects a release interval from the editor, or (b) the next click is far from the previous one (> 40% of the viewport's shorter dimension).

Versions

CLI: 1.4.4
Plugin: 1.4.4

v1 tag force-updated.

Assets 2

01 May 06:39

marcushyett

v1.4.3

68709ee

v1.4.3 — click-anchored narration + always-release pan-zoom

Two structural fixes for the "I can't follow what's happening" feedback on the videos.

Narration — click-anchored windows

The body is now partitioned into windows by the agent's clicks (with very-close clicks merged so we don't get sub-second slivers). Each window has a fixed start, end, duration. The narrator writes ONE beat per window: they pick the words, the editor pins the timing to the click. Because clicks are when meaningful state changes happen on screen, narration anchored to a click is automatically anchored to the moment the viewer cares about. There's no possible drift between voice and visuals.

The narrator prompt was rewritten with much stronger guidance:

Audience framing: "imagine the viewer is a teammate who has never seen this codebase. They have ~60 seconds and need to leave understanding what was built, why it matters, how it works, and whether the tests prove it works."
Plan the story arc first: read PR body, map goals to windows, decide setup / execute / confirm for each window, THEN write beats.
Name UI elements specifically in the developer's own vocabulary ("the new Bulk Archive button I added at the top") not generically ("a button").
Concrete good vs bad examples for setup beats, confirm beats, transitions, and silent-investigation moments.
Beats may be empty for windows with nothing meaningful to say — silent gap beats 4-word filler.

Goal-aware structuring: the narrator gets the test plan goals as the demo's chapter list and is told to march through them in order, each chapter spanning multiple windows.

Plus: fixed an upstream issue where the agent recorded ref=e123 (Playwright's internal handle) instead of the human-readable element description, so the narrator can now actually say "clicked Sign In" instead of "clicked ref=e123".

Pan-zoom — always release between clicks

The previous "ride zoom between clicks" mode held a 2× zoom on the last-clicked region for up to 12 seconds. If anything happened far from that region during the hold — a toast in the corner, a counter at the top, items reordering elsewhere — it was clipped out of frame.

The new cycle is per-click and always releases:

APPROACH (0.5s before click): ease zoom 1.0 → 2.0
click fires
HOLD (0.7s): stay at 2.0 on click site so the viewer reads the immediate reaction
RELEASE (0.5s): ease zoom 2.0 → 1.0, neutral framing
neutral 1.0 until next APPROACH

During the gap between clicks the viewer always sees the full page after the hold releases — any visible change anywhere on screen is in frame by construction.

When approach overlaps previous click's hold/release (very tight click pairs), approach wins so the camera arrives on time.

Versions

CLI: 1.4.3
Plugin: 1.4.3

v1 tag force-updated.

Assets 2

01 May 05:49

marcushyett

v1.4.2

7332273

v1.4.2 — narrator-defined timed body beats

What changed

The body narration is back to per-beat audio, but the beats are now defined by the narrator instead of the editor. Each beat carries an explicit startS + durS chosen against the body-moment timeline, gets its own TTS file, and is placed at the narrator-declared timestamp. The spoken word lands exactly when the corresponding visual happens — anchored sync by construction.

Why v1.4.0/v1.4.1 was wrong

A single continuous audio file with a global playbackRate has no anchor to the visuals. Even when the narrator perfectly summarised the journey, the words landed at unrelated times — feedback was 'narration talking about something completely different' from what was on screen.

The new contract

timed-narration.ts asks Claude for the body as { beats: [{ startS, durS, text, captionText }, ...] }, sorted, non-overlapping, covering [0, masterDurS].
normaliseBeats snaps the narrator's beats into a clean contiguous timeline (preserves text + proportional duration; first beat starts at 0, last ends at bodyDurS).
The editor TTSes each beat separately, computes per-beat playbackRate (clamped 0.7–1.5×), builds one BodyChunk per beat.
The compositor renders one <Sequence> per chunk with its own <Audio> + <WordCaption>.

Badges for silent investigative moments keep the 1.4.0 standalone-Sequence model so they stay tied to the original tool-window timestamps regardless of beat pacing.

Bonus fix

The agent was recording ref=e123 (Playwright's internal handle) as the action target, so the narrator had no idea what was actually clicked. Switched the priority so the human-readable element description wins — narrations can say 'clicked Sign In' instead of 'clicked ref=e123'.

Versions

CLI: 1.4.2
Plugin: 1.4.2

v1 tag force-updated.

Assets 2

Releases: marcushyett/tik-test

tik-test review — PR #34

Uh oh!

tik-test review — PR #34

Uh oh!

tik-test review — PR #34

Uh oh!

tik-test review — PR #34

Uh oh!

v1.4.7 — voice-paced captions + verification stamps + demo discipline

1. Captions now track voice exactly

2. Verification stamps over the video

3. Demo discipline — kills the "click and re-check the same thing" loop

Versions kept in sync

Uh oh!

v1.4.6 — missing-content protocol + realistic desktop viewport

1. Missing-content protocol — kills the most common agent loop

2. Realistic 2026 desktop viewport (1920×1080)

Versions kept in sync

Uh oh!

v1.4.5 — page-side time-travel buffer for ephemeral UI

What changed

window.__tikHistory — 30-second time-travel buffer

window.__tikFreeze — pause / resume animations

Agent prompt updates

Why this is the simple solution

Versions

Uh oh!

v1.4.4 — ride zoom by default, release on off-target DOM mutations

What changed

The rule

Implementation

Versions

Uh oh!

v1.4.3 — click-anchored narration + always-release pan-zoom

Narration — click-anchored windows

Pan-zoom — always release between clicks

Versions

Uh oh!

v1.4.2 — narrator-defined timed body beats

What changed

Why v1.4.0/v1.4.1 was wrong

The new contract

Bonus fix

Versions

Uh oh!

`window.__tikHistory` — 30-second time-travel buffer

`window.__tikFreeze` — pause / resume animations