rgr is a no-dependency Red-Green-Refactor gate for coding agents. It gives agents a human-operator style CLI that records the failing test first, freezes that test with hashes and snapshots, and refuses to mark Green or Refactor if the Red test was edited.
Clone the repo and run it with Bun:
git clone https://github.com/kingbootoshi/rgr.git
cd rgr
bun run rgr -- --helpOptional: link the CLI onto your PATH:
bun link
rgr --helpFrom this repo:
bun run rgr -- --helpFrom another repo:
bun run /path/to/rgr/src/cli/index.ts --root . --helpThe workflow examples below assume rgr is on your PATH. If it is not, replace rgr with bun run /path/to/rgr/src/cli/index.ts.
RGR ships two skills as both Claude Code and Codex plugins, the two halves of trustless engineering:
rgr— proves behavior and scope at one CI verdict (this tool).intent-contract— turns a request into a signed Locked Intent Boundary before any planning, so an agent cannot redefine the goal. It compiles to an IntentLock thatrgr verify --intent-lockenforces. Seeskills/intent-contract/SKILL.md.
Together: intent-contract writes the law, rgr is the court that proves the work stayed inside it. See Trustless Enforcement.
Claude Code:
claude plugin marketplace add kingbootoshi/rgr
claude plugin install rgr@rgrFor local Claude Code testing from a clone:
claude --plugin-dir .Codex:
- Plugin manifest:
.codex-plugin/plugin.json - Shared skill:
skills/rgr/SKILL.md - Codex UI metadata:
skills/rgr/agents/openai.yaml
Install the repo root as the plugin directory in Codex, then invoke $rgr when a code change should use strict Red-Green-Refactor proof. The bundled intent-contract skill is discovered the same way (skills/intent-contract/).
The endgame: accept an agent's work without trusting the agent, because a gate it cannot bribe has already proven the work stayed inside the human's intent and functions.
- Lock intent.
intent-contractproduces a small, human-signed Locked Intent Boundary: required deltas, invariants, non-substitutions, and authorized-change rights (every diff op must map to a row). It compiles to a hash-pinned IntentLock. - Freeze it.
rgr lock-intent --intent-lock <path> --expect-sha256 <H>verifies the lock's payload hash and stores a copy under.rgr/as evidence. - Enforce one verdict.
rgr verify --ci --replay --intent-lock <trusted> --expect-intent-sha256 <H>proves behavior (Red-Green-Replay) and scope (everygit diff --no-renames --name-status <lockedBase>...HEADop maps to an authorized row) in a single pass.
The load-bearing property: .rgr/ is evidence, not authority. CI reads the trusted lock from a path the agent does not control, hash/signature-verifies it, and audits the real diff against that. Tampering the in-tree copy changes nothing. A non-ancestor base or a dirty working tree fails closed. Deny rows (ops: []) fail on contact.
A signed boundary lives at docs/prds/intent-lock-scope-audit.intent-boundary.md — this feature was built under its own boundary and self-audited.
# 1. Initialize a manifest for this goal.
rgr init --goal-id billing-scope
# 2. Write only the failing test, then capture Red.
rgr red --strict --goal-id billing-scope --test src/billing.test.ts -- bun test src/billing.test.ts
# Optional: explicitly freeze a helper, fixture, snapshot, or config file that defines the Red oracle.
rgr red --strict --goal-id billing-scope --test src/billing.test.ts --protect tests/fixtures/billing.ts -- bun test src/billing.test.ts
# 3. Implement production code, then prove Green.
rgr green
# 4. Refactor only while the Red test is still byte-for-byte unchanged.
rgr refactor -- bun test
# 5. Final gate for local CI or sandbox authority.
rgr verify --ci --replay -- bun testEvery run writes .rgr/manifest.json, .rgr/events.jsonl, snapshots, diffs, and command output logs.
When an older completed cycle has known replay-only environment noise, targeted replay can keep later cycles moving while still requiring Green receipts, protected heads, and the final verify command. Treat this as an exception path and report the replay scope.
# Replay only the latest active cycle.
rgr verify --ci --replay --cycle latest -- bun test
# Replay active cycles from a known cycle onward.
rgr verify --ci --replay --from-cycle 004 -- bun test- Red must fail.
- Red defaults to test-surface changes only.
- Red records protected test files with SHA-256 hashes and snapshots.
- Green refuses to run if any protected Red file changed.
- Refactor and Verify refuse to pass if protected Red files changed.
- Every command proof uses argv after
--, currently directbun testonly. - Green runs the exact Red command.
- Strict Red protects imported test helpers, fixtures, snapshots, package/test config, and lockfiles that can change what the test means.
--testis only for root assertion-bearing tests; use--protectfor helpers, fixtures, snapshots, and test config.inspect-testinspects root tests and reports protected support separately, so fixtures are not warned for lackingexpect().verify --ci --replayreconstructs the Red proof from the recorded git base commit and protected snapshots.verify --ci --replay --cycle latestreplays only the latest active cycle;--from-cycle <id>replays active cycles from that id onward and reports skipped active cycles.- Same-file multi-cycle work is supported through current protected heads: each Red hash is frozen through its Green, then a later Red can intentionally advance the file.
- Wrong tests must be superseded through
rgr revise-test, then replaced by a new Red proof. verify --cirequires every active cycle to have Red and Green receipts.--allow-source-changespermits reviewed source dirtiness that already existed before Red starts and snapshots that dirty source into replay; the Red command itself may not create commits or modify source files.
This tool gives honest agents and CI a deterministic contract. If an agent has unrestricted write access to the same repo, it can still delete .rgr or bypass the CLI. Treat local use as a discipline gate and make rgr verify --ci --replay -- bun test mandatory inside CI, sandboxes, or agent harnesses when the result must be authoritative.
For authority, use strict replay:
rgr verify --ci --replay -- bun testUse:
rgr promptIt prints a compact instruction block for agents that need to write meaningful tests instead of shallow mock-echo tests.
