feat: intent-aware verify loop + one-command setup (v0.6.0)#25
Merged
Conversation
Turn the behavioral verify loop from a generic checklist into an intent-aware verifier, and make it install in one command without ever trapping the user. - intent.ts: parse the actual human request from the transcript and infer the work kind (web/api/cli/library) from the evidence, surfacing the concrete run command (`npm run dev`) and local URL. Detection only tailors the protocol — groundtruth still never judges the work, so no new false positives. - loop.ts: buildProtocol now grounds the check in the quoted request and leads with the right verification — web → start it, open the URL, take a screenshot and READ it; api → hit the endpoint; cli → run it; library → tests + smoke call. - cli.ts: new `setup` command (hook + loop + SessionEnd digest + status line, idempotent, global by default); GROUNDTRUTH_NO_LOOP=1 kill-switch so the loop can never trap a turn; help documents the env vars. - pipeline.ts: add analyze() returning report + evidence + turn; runPipeline kept as a thin wrapper. - plugin.json bumped 0.4.0 → 0.6.0 (sync with package.json). - +16 tests (intent, grounded/tailored protocol, request capture); 136 pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
groundtruth dogfooding its own PR veltiq#25 caught a real false positive: bare tokens like `web/UI`, `stdout/stderr`, `grounded/tailored`, `client/server` were being extracted as file paths and flagged `unsupported`. A bare (un-backticked) slash token now counts as a path only when it has a real code extension OR its first segment is a known source root (src, lib, packages, tests, …). Backticked paths keep the more trusting route. Real paths (`src/db/client.ts`, `src/auth`) are unaffected. +2 regression tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Evolves the behavioral verify loop from a generic checklist into an intent-aware verifier, and makes it install in one command — without ever trapping the user.
When an agent finishes a work turn, the loop now:
npm run dev), open the URL, take a screenshot and READ it, compare pixel-for-intent, check the consoleIt surfaces the concrete run command and local URL when discoverable (from
package.jsonscripts + ports in the diff).GROUNDTRUTH_NO_LOOP=1kill-switch instantly pauses the loop regardless of config; the per-session round cap still guarantees a turn always finishes.Detection only tailors the guidance — groundtruth still never judges the work, so it adds no new false positives (the project's core invariant).
Also
groundtruth setup— one idempotent command: Stop hook + verify loop + SessionEnd digest + status line, global by default.installstays for fine-grained control.analyze()(returns report + evidence + turn),detectWorkKind(),summarizeRequest()..claude-plugin/plugin.json0.4.0 → 0.6.0, synced withpackage.json.[Unreleased](verify loop + precision fixes) cut to 0.6.0.Tests
+16 tests (intent classification, request capture, grounded/tailored protocol). 136 pass, typecheck + biome clean. Behaviorally smoke-tested end-to-end: hook returns exit 2 when blocking, renders the web-tailored screenshot protocol with the right URL, and the kill-switch cleanly disables blocking.
🤖 Generated with Claude Code