Release v0.11.0 · Hyperyond/Hover

Highlights

⟳ Re-record — the answer to "my UI changed and my spec broke"

When a Hover-generated Playwright spec turns red because its semantic selectors no longer match (a button renamed to "Sign in", a label split, a role swapped), you don't have to edit the .spec.ts by hand: the agent regenerates it. Two entry points:

1. Widget: Open the new 📜 Saved sessions overlay, switch to the Specs tab, and click ⟳ Re-record next to any spec. The agent reads the Original prompt: line in the spec's JSDoc header (for example, "log in then add a todo") and drives the current UI; Hover then overwrites the file with the new selectors.

2. CLI: Run pnpm hover re-record <spec> from a terminal. It boots a temporary service, replays the prompt, prints the resulting git diff, and tells you the accept/reject commands. Flags: --dry-run (run without overwriting), --cwd <path> (monorepo workspaces), --port <n> (service port).
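The CLI's flag surface is small enough to sketch as a parser. The following is a minimal illustration in TypeScript, not the code in `packages/cli/src/re-record.ts`: `ReRecordArgs` and `parseReRecordArgs` are hypothetical names, and no default port is assumed because these notes do not state one.

```typescript
// Hypothetical shape for the parsed `hover re-record` arguments.
interface ReRecordArgs {
  spec: string;              // the <spec> positional
  dryRun: boolean;           // --dry-run: run without overwriting
  cwd: string | undefined;   // --cwd <path>: monorepo workspace
  port: number | undefined;  // --port <n>: service port
}

function parseReRecordArgs(argv: string[]): ReRecordArgs {
  let spec: string | undefined;
  let dryRun = false;
  let cwd: string | undefined;
  let port: number | undefined;

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--dry-run") {
      dryRun = true;
    } else if (arg === "--cwd" || arg === "--port") {
      // Both flags consume the next token as their value.
      const value = argv[++i];
      if (value === undefined) throw new Error(`${arg} requires a value`);
      if (arg === "--cwd") {
        cwd = value;
      } else {
        port = Number(value);
        if (!Number.isInteger(port) || port <= 0) {
          throw new Error(`--port expects a positive integer, got "${value}"`);
        }
      }
    } else if (arg.startsWith("--")) {
      throw new Error(`unknown flag: ${arg}`);
    } else if (spec === undefined) {
      spec = arg;
    } else {
      throw new Error(`unexpected extra argument: ${arg}`);
    }
  }

  if (spec === undefined) throw new Error("usage: hover re-record <spec>");
  return { spec, dryRun, cwd, port };
}
```

Under these assumptions, `parseReRecordArgs(["login-todo", "--dry-run", "--cwd", "apps/web"])` would describe a dry run of a spec named `login-todo` (an invented name) inside the `apps/web` workspace.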

Each re-record takes about 30 seconds and costs about $0.10 per spec. CI itself stays pure Playwright: AI cost is concentrated at authoring time rather than spread across every test run.

Saved-sessions overlay (Skills + Specs tabs)

The widget's old single-purpose "Saved skills" overlay becomes the Saved sessions overlay with two tabs:

Skills — replayable agent instructions under .claude/skills/. Self-adapt to UI changes (the agent re-resolves selectors at runtime). Same UX as before.
Specs: Playwright tests under __vibe_tests__/. Each row carries the spec slug, the truncated original prompt, a relative mtime ("2h ago"), and a ⟳ Re-record button. The button is disabled, with an explanatory tooltip, when the spec has no Original prompt: header (hand-authored specs).
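The enable/disable rule for the Re-record button comes down to whether a spec's leading JSDoc block carries an Original prompt: line. Here is a minimal sketch of that check in TypeScript. It is not the shipped `parseSpecHeader()`; the return shape, the `Sketch` suffix, the helper names, and the assumption that the prompt sits on a single line such as ` * Original prompt: log in then add a todo` are all illustrative.

```typescript
// Hypothetical result type; the real return shape is not shown in these notes.
interface SpecHeader {
  originalPrompt: string | null; // null for hand-authored specs
}

// Assumes the prompt occupies one line of the file's leading /** ... */ block.
function parseSpecHeaderSketch(source: string): SpecHeader {
  const block = /^\s*\/\*\*([\s\S]*?)\*\//.exec(source);
  if (!block) return { originalPrompt: null };

  for (const rawLine of block[1].split(/\r?\n/)) {
    // Strip the leading " * " decoration JSDoc lines usually carry.
    const line = rawLine.replace(/^\s*\*?\s?/, "");
    const match = /^Original prompt:\s*(.*)$/.exec(line);
    if (match) {
      const prompt = match[1].trim();
      return { originalPrompt: prompt.length > 0 ? prompt : null };
    }
  }
  return { originalPrompt: null };
}

// The Re-record button is only usable when there is a prompt to replay.
function canReRecord(header: SpecHeader): boolean {
  return header.originalPrompt !== null;
}

// Shorten a prompt for display in a list row; `max` is an arbitrary choice.
function truncatePrompt(prompt: string, max: number = 40): string {
  return prompt.length <= max ? prompt : prompt.slice(0, max - 1) + "…";
}
```

A spec that starts with an import statement instead of a JSDoc block yields `originalPrompt: null`, which is the case the notes describe as hand-authored.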

Per-tab hint paragraphs explain the distinction explicitly so users don't conflate the two artefacts.

Top-level FAQ in README + docs site

New FAQ section in both READMEs and a dedicated docs/faq.md page. Q1 is the load-bearing one: "My UI changed and my saved spec breaks. What now?" Covers the three-layer answer:

1. Semantic selectors absorb most UI churn (the existing design).
2. When semantics actually shift, ⟳ Re-record, hand-edit, or treat as regression.
3. Why no auto-heal at CI time: Hover's stance against the Stagehand/Midscene model. CI tokens accumulate; concentrating LLM cost at deliberate Re-record moments is cheaper and more deterministic over a project's lifetime.

Also covers: Skill vs Spec semantics, why we don't ship re-record --all / --failed in v0.11, headless-Chromium concerns, data-upload boundaries, and production-build no-op behaviour.
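The cost argument behind the no-auto-heal stance can be made concrete with a little arithmetic. The sketch below is a back-of-the-envelope model in TypeScript, not something Hover ships: only the figure of roughly 10 cents per re-record comes from these notes, and `healCentsPerCiRun` is a hypothetical parameter you would have to estimate for an auto-healing setup.

```typescript
// Costs are in integer cents to avoid floating-point noise.
interface CostScenario {
  reRecords: number;          // deliberate re-records over the period
  centsPerReRecord: number;   // about 10, per the release notes
  ciRuns: number;             // CI runs over the same period
  healCentsPerCiRun: number;  // HYPOTHETICAL average LLM spend per CI run
}

// LLM spend when cost is concentrated at authoring time.
function authoringTimeCents(s: CostScenario): number {
  return s.reRecords * s.centsPerReRecord;
}

// LLM spend when every CI run may call a model.
function ciTimeCents(s: CostScenario): number {
  return s.ciRuns * s.healCentsPerCiRun;
}

// Number of CI runs at which CI-time spend equals the re-record spend.
function breakEvenCiRuns(s: CostScenario): number {
  if (s.healCentsPerCiRun <= 0) return Infinity;
  return authoringTimeCents(s) / s.healCentsPerCiRun;
}
```

With made-up inputs of 30 re-records at 10 cents each and 1 cent of model spend per CI run, the re-record total is 300 cents, so CI-time healing overtakes it after 300 runs. The real crossover depends entirely on the per-run estimate.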

What's NOT in this release

re-record --all / --failed. Rejected on purpose for v0.11. --all burns LLM tokens on specs that are fine; --failed is the right shape but needs a first-class "run Playwright, collect failures" step the CLI doesn't yet ship. Rationale in the FAQ. On the v0.12+ roadmap.
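To illustrate what a "run Playwright, collect failures" step involves, here is one possible shape in TypeScript. It is not a preview of the planned feature. It assumes the report comes from Playwright's JSON reporter, whose nested `suites` contain `specs` with `ok` and `file` fields; the trimmed interfaces and the `collectFailedFiles` name are illustrative, and mapping a file path back to a Hover spec slug is left out because these notes do not describe it.

```typescript
// Trimmed, assumed view of Playwright's JSON reporter output.
interface ReportSpec {
  ok: boolean;   // false when the spec did not pass
  file: string;  // path of the test file containing the spec
}

interface ReportSuite {
  specs: ReportSpec[];
  suites?: ReportSuite[]; // nested describe blocks
}

interface Report {
  suites: ReportSuite[];
}

// Walk the suite tree and return the distinct files with a failing spec.
function collectFailedFiles(report: Report): string[] {
  const failed = new Set<string>();

  const visit = (suite: ReportSuite): void => {
    for (const spec of suite.specs) {
      if (!spec.ok) failed.add(spec.file);
    }
    for (const child of suite.suites ?? []) {
      visit(child);
    }
  };

  for (const suite of report.suites) {
    visit(suite);
  }
  return Array.from(failed).sort();
}
```

The result would then have to be filtered to specs that carry an Original prompt: header before any of them could be re-recorded.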

Implementation

| Layer | Change | Where |
| --- | --- | --- |
| Core lib | `listSpecs()` + `parseSpecHeader()` (145 lines, 13 vitest cases) | `packages/core/src/specs/listSpecs.ts` |
| WS protocol | New `list-specs` request + `specs-list` response; new `command.reRecord.slug` field | `packages/core/src/service.ts` |
| Service | Collects `SkillStep[]` from `tool_use` events when `reRecord.slug` is set; on clean `session_end` calls `writeSpec({ overwrite: true })` | invocation loop in `service.ts` |
| CLI | `hover re-record <spec>` subcommand (~340 lines) | `packages/cli/src/re-record.ts` |
| Widget | Tabbed overlay, Specs list renderer, Re-record button → `command { reRecord }` (~280 lines across `client.js` + `style.css` + `template.html`) | `packages/widget-bootstrap/src/widget/` |
| Docs | README + zh-CN FAQ, `docs/faq.md`, `docs/features/re-record.md`, new `save-as-spec.md` content, nav/sidebar updates | (see CHANGELOG) |
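The Service row describes a small state machine: collect steps while a re-record is in flight, then overwrite the spec only if the session ends cleanly. A minimal TypeScript sketch of that flow follows. Apart from `reRecord.slug`, `tool_use`, `session_end`, `SkillStep`, and `writeSpec({ overwrite: true })`, which appear in the table, every field name, the `ok` flag, and the `runInvocation` function are assumptions, not the contents of `service.ts`.

```typescript
// Assumed minimal step shape; the real SkillStep fields are not shown here.
interface SkillStep {
  tool: string;
  input: unknown;
}

// Assumed event shapes; only the `type` values come from the notes.
type AgentEvent =
  | { type: "tool_use"; tool: string; input: unknown }
  | { type: "session_end"; ok: boolean };

interface Command {
  prompt: string;
  reRecord?: { slug: string }; // set when the command is a re-record
}

interface WriteSpecArgs {
  slug: string;
  steps: SkillStep[];
  overwrite: boolean;
}

// Returns true when the spec was overwritten.
function runInvocation(
  command: Command,
  events: AgentEvent[],
  writeSpec: (args: WriteSpecArgs) => void
): boolean {
  const slug = command.reRecord?.slug;
  const steps: SkillStep[] = [];

  for (const event of events) {
    if (event.type === "tool_use") {
      // Steps are only collected for re-record commands.
      if (slug !== undefined) {
        steps.push({ tool: event.tool, input: event.input });
      }
    } else {
      // session_end: write only after a clean finish of a re-record.
      if (slug !== undefined && event.ok) {
        writeSpec({ slug, steps, overwrite: true });
        return true;
      }
      return false;
    }
  }
  return false; // no session_end seen: treated here as not clean
}
```

Passing `writeSpec` in as a callback keeps the sketch free of file-system access; a failed or interrupted session leaves the existing spec file untouched.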

Roadmap reshuffle

v0.11 ✓ Spec resilience (this release)
v0.12 → Security mode recording semantics (was v0.11)
v0.13+ or sibling repo → Chrome extension (was v0.12+)
"Re-record --failed / --all" added to Beyond v0.12.x

Validation

pnpm typecheck clean across all 10 publishable packages.
pnpm --filter @hover-dev/core test: 176 tests pass (was 163; +13 for parseSpecHeader / listSpecs).
pnpm test:e2e: 5 Playwright tests pass on examples/basic-app — no regressions from the overlay rewrite.
Manual smoke (Re-record actually round-tripping against a running service) deferred to post-release — same calculus as v0.10's bench-multi-tab: ship the surface first, iterate based on real usage.

Full diff

v0.10.0...v0.11.0
