
feat: intent-aware verify loop + one-command setup (v0.6.0) #25

Merged
youcefzemmar merged 2 commits into veltiq:main from youcefzemmar:feat/agi-verify-loop
May 31, 2026


@youcefzemmar
Collaborator

What

Evolves the behavioral verify loop from a generic checklist into an intent-aware verifier, and makes it install in one command — without ever trapping the user.

When an agent finishes a work turn, the loop now:

  1. Grounds the check in the actual request. It parses the human prompt that opened the turn and quotes it back, so the agent verifies against what was asked, not its own restatement.
  2. Tailors the verification to the work. It infers the kind of work from the diff and leads with the right check:
    • web/UI → start it (npm run dev), open the URL, take a screenshot and READ it, compare pixel-for-intent, check the console
    • API → hit the affected endpoint(s), check status + body (incl. error paths)
    • CLI → run it with realistic args, check stdout/stderr + exit code
    • library → run the tests + a smoke call of the changed function
     Whatever the work kind, it surfaces the concrete run command and local URL when discoverable (from package.json scripts and ports in the diff).
  3. Can never trap you. New GROUNDTRUTH_NO_LOOP=1 kill-switch instantly pauses the loop regardless of config; the per-session round cap still guarantees a turn always finishes.
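
The two escape hatches in step 3 can be sketched as a single decision function. This is a minimal sketch, not the code in this PR: `LoopState`, `shouldBlock`, and `exitCodeFor` are hypothetical names, and the environment is passed in as a plain object so the example stays self-contained. Only `GROUNDTRUTH_NO_LOOP=1` and the blocking exit code 2 come from the description.

```typescript
// Hypothetical state shape; the real loop state is not shown in this PR.
interface LoopState {
  enabled: boolean;    // whether the verify loop is turned on in config
  roundsUsed: number;  // verification rounds already spent in this session
  maxRounds: number;   // per-session round cap
}

type Env = Record<string, string | undefined>;

// Returns true only when the hook should block the turn.
function shouldBlock(state: LoopState, env: Env): boolean {
  // The kill-switch wins over everything, including config.
  if (env["GROUNDTRUTH_NO_LOOP"] === "1") return false;
  if (!state.enabled) return false;
  // Once the cap is reached, the turn is always allowed to finish.
  return state.roundsUsed < state.maxRounds;
}

// Maps the decision to a hook exit code: 2 blocks, 0 lets the turn end.
function exitCodeFor(state: LoopState, env: Env): number {
  return shouldBlock(state, env) ? 2 : 0;
}
```

In a real hook the second argument would presumably be `process.env`. Checking the kill-switch first is what makes it hold "regardless of config".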

Detection only tailors the guidance — groundtruth still never judges the work, so it adds no new false positives (the project's core invariant).
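
To illustrate "detection only tailors the guidance", here is a rough sketch of classifying a diff by its changed paths and picking the lead check. The function name `guessWorkKind`, the regexes, and the `leadCheck` table are assumptions made for illustration; the actual `detectWorkKind()` heuristics are not shown in this description. Note that the result only selects wording and never produces a verdict on the work.

```typescript
type WorkKind = "web" | "api" | "cli" | "library";

// Hypothetical heuristic: classify by file paths touched in the diff.
function guessWorkKind(changedPaths: string[]): WorkKind {
  const has = (re: RegExp): boolean => changedPaths.some((p) => re.test(p));
  if (has(/\.(tsx|jsx|vue|svelte|css|html)$/)) return "web";
  if (has(/(^|\/)(routes|api|controllers)\//)) return "api";
  if (has(/(^|\/)(bin\/|cli\.[cm]?[jt]s$)/)) return "cli";
  // Fallback: anything else is treated as library work.
  return "library";
}

// The kind only chooses which check leads the protocol text.
const leadCheck: Record<WorkKind, string> = {
  web: "Start it, open the URL, take a screenshot and read it, check the console.",
  api: "Hit the affected endpoint(s); check status and body, including error paths.",
  cli: "Run it with realistic args; check stdout/stderr and the exit code.",
  library: "Run the tests plus a smoke call of the changed function.",
};

function leadCheckFor(changedPaths: string[]): string {
  return leadCheck[guessWorkKind(changedPaths)];
}
```

A misclassification here costs only a less relevant prompt, which is consistent with the claim that detection adds no new false positives.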

Also

  • groundtruth setup — one idempotent command: Stop hook + verify loop + SessionEnd digest + status line, global by default. install stays for fine-grained control.
  • Public API: analyze() (returns report + evidence + turn), detectWorkKind(), summarizeRequest().
  • Fixes the long-standing version drift: .claude-plugin/plugin.json 0.4.0 → 0.6.0, synced with package.json.
  • CHANGELOG [Unreleased] (verify loop + precision fixes) cut to 0.6.0.
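
As a rough illustration of the request grounding behind `summarizeRequest()`, the sketch below finds the human prompt that opened the turn and quotes it back. Everything here is hypothetical: the `TranscriptEntry` shape, `openingRequest`, and `quoteRequest` are stand-ins, since the real transcript format and function signatures are not shown in this PR. It assumes the current turn is everything after the most recent human entry.

```typescript
// Hypothetical transcript entry; the real format is not shown in this PR.
interface TranscriptEntry {
  role: "human" | "assistant" | "tool";
  text: string;
}

// Returns the most recent human entry, i.e. the prompt that opened the turn.
function openingRequest(transcript: TranscriptEntry[]): string | undefined {
  for (let i = transcript.length - 1; i >= 0; i--) {
    const entry = transcript[i];
    if (entry && entry.role === "human") return entry.text.trim();
  }
  return undefined;
}

// Quotes the request on one line, clipped so the protocol stays short.
function quoteRequest(request: string, maxChars = 200): string {
  const flat = request.replace(/\s+/g, " ").trim();
  const clipped = flat.length > maxChars ? flat.slice(0, maxChars) + "..." : flat;
  return `> ${clipped}`;
}
```

Quoting the original words, rather than a paraphrase, is what lets the agent verify against what was asked.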

Tests

+16 tests (intent classification, request capture, grounded/tailored protocol). 136 pass, typecheck + biome clean. Behaviorally smoke-tested end-to-end: hook returns exit 2 when blocking, renders the web-tailored screenshot protocol with the right URL, and the kill-switch cleanly disables blocking.

🤖 Generated with Claude Code

youcefzemmar and others added 2 commits May 31, 2026 22:49
Turn the behavioral verify loop from a generic checklist into an
intent-aware verifier, and make it install in one command without ever
trapping the user.

- intent.ts: parse the actual human request from the transcript and infer
  the work kind (web/api/cli/library) from the evidence, surfacing the
  concrete run command (`npm run dev`) and local URL. Detection only
  tailors the protocol — groundtruth still never judges the work, so no
  new false positives.
- loop.ts: buildProtocol now grounds the check in the quoted request and
  leads with the right verification — web → start it, open the URL, take a
  screenshot and READ it; api → hit the endpoint; cli → run it; library →
  tests + smoke call.
- cli.ts: new `setup` command (hook + loop + SessionEnd digest + status
  line, idempotent, global by default); GROUNDTRUTH_NO_LOOP=1 kill-switch
  so the loop can never trap a turn; help documents the env vars.
- pipeline.ts: add analyze() returning report + evidence + turn; runPipeline
  kept as a thin wrapper.
- plugin.json bumped 0.4.0 → 0.6.0 (sync with package.json).
- +16 tests (intent, grounded/tailored protocol, request capture); 136 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

groundtruth dogfooding its own PR veltiq#25 caught a real false positive: bare
tokens like `web/UI`, `stdout/stderr`, `grounded/tailored`, `client/server`
were being extracted as file paths and flagged `unsupported`.

A bare (un-backticked) slash token now counts as a path only when it has a
real code extension OR its first segment is a known source root (src, lib,
packages, tests, …). Backticked paths keep the more trusting route. Real
paths (`src/db/client.ts`, `src/auth`) are unaffected. +2 regression tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
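
The bare-token rule from the second commit can be sketched as below. This is an illustration under stated assumptions, not the shipped code: `looksLikeBarePath` is a hypothetical name, the extension set is invented for the example, and the source-root set contains only the four roots the commit message names (its list is elided with "…").

```typescript
// Assumed extension list, for illustration only.
const CODE_EXTENSIONS = new Set(["ts", "tsx", "js", "jsx", "py", "go", "rs"]);
// Only the roots named in the commit message; the real list is longer.
const SOURCE_ROOTS = new Set(["src", "lib", "packages", "tests"]);

// Decides whether an un-backticked slash token should be treated as a path.
function looksLikeBarePath(token: string): boolean {
  if (!token.includes("/")) return false;
  const segments = token.split("/");
  const first = segments[0] ?? "";
  const last = segments[segments.length - 1] ?? "";
  // Take the extension after the final dot, ignoring dotfiles like ".env".
  const dot = last.lastIndexOf(".");
  const ext = dot > 0 ? last.slice(dot + 1).toLowerCase() : "";
  return CODE_EXTENSIONS.has(ext) || SOURCE_ROOTS.has(first);
}

const samples = ["web/UI", "stdout/stderr", "client/server", "src/db/client.ts", "src/auth"];
for (const s of samples) {
  console.log(`${s} -> ${looksLikeBarePath(s)}`);
}
```

Under these assumptions, the prose tokens that triggered the false positive are rejected, while `src/db/client.ts` (code extension) and `src/auth` (known root) still count as paths.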
@youcefzemmar youcefzemmar merged commit 2d6ec04 into veltiq:main May 31, 2026
3 of 4 checks passed