From 079bb6c50cd2cb90bbee45ca8cab7cd12b94c516 Mon Sep 17 00:00:00 2001
From: Vincent De Smet
Date: Fri, 29 May 2026 23:46:27 +0800
Subject: [PATCH 1/8] docs(002): skillcore + add/verify spec and tech spike
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Spec for feature 002: vendor (`add`) + verify skills, on a single shared integrity primitive (`skillcore`). spec.md is user-facing (4 stories: add, label-honesty verify, orphan/completeness, scriptable exit-code/JSON gate); spec-tech-spike.md holds the implementation decisions for the plan phase.

Scope reshaped during clarification (recorded in spec Clarifications + spike):
- Prerequisite/eligibility checking moved out of `verify` into a future `doctor` capability; `verify` is integrity-only (exit 0/1/2). Corrected docs/ARCHITECTURE-v0.md accordingly (verify/doctor split, open Q5 resolved).
- `skillrig add` (local git-checkout source) pulled in as a durable capability so the add->verify round-trip is the acceptance contract.
- Recorded SDK requirement: skillcore consumable as a Go SDK (no constitutional rule forces internal/); single-impl + presentation-free already satisfy it.

Prior-art contrast captured in the spike (no code yet, spec phase):
- skills.sh/Vercel (HTTP registry, bespoke SHA-256) and gh skill (GitHub REST, tree-SHA used online, no verify, frontmatter injection) — the latter confirms why skillrig must keep provenance lockfile-only (injection breaks tree-SHA label-honesty) and favors reimplement-core over wrapping gh skill.
- Open questions logged for planning: git-origin coupling (A-1/OQ-1), auth for remote add (OQ-2), go-getter vs git wrapper for acquisition (OQ-3).

Pushed for team alignment; PR not yet opened.

Co-Authored-By: Claude Opus 4.8 (1M context) --- CLAUDE.md | 11 + docs/ARCHITECTURE-v0.md | 12 +- .../checklists/requirements.md | 36 +++ .../002-skillcore-verify/spec-tech-spike.md | 189 ++++++++++++++++ specledger/002-skillcore-verify/spec.md | 208 ++++++++++++++++++ 5 files changed, 450 insertions(+), 6 deletions(-) create mode 100644 specledger/002-skillcore-verify/checklists/requirements.md create mode 100644 specledger/002-skillcore-verify/spec-tech-spike.md create mode 100644 specledger/002-skillcore-verify/spec.md diff --git a/CLAUDE.md b/CLAUDE.md index 48663b9..2d15a24 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -64,3 +64,14 @@ Scripts and agents branch on them, so meanings are fixed (`internal/cli/exit.go` Features follow SpecLedger: **Specify → Clarify → Plan → Tasks → Review → Implement**, with artifacts under `specledger//` (spec, plan, tasks, quickstart, contracts, data-model). Quickstart scenarios are the acceptance contract (each maps to a `TestQuickstart_` integration test) and are written during planning. **Read `AGENTS.md` before tracking work or committing.** It defines the two repo-specific operating rules this project enforces: (1) all work-item tracking goes through the built-in `sl issue` CLI (issues stored per-spec in `specledger//issues.jsonl`) — **never** ad-hoc markdown TODO lists; and (2) the commit/PR conventions (conventional prefixes, imperative ≤72-char subjects, testing evidence in PRs). It exists so task tracking and history stay in one git-friendly system rather than fragmenting across tools — consult it for the exact commands and the precise scope of each rule. + + + +## Active Technologies + +- Go 1.24+ (toolchain in this environment is 1.24.4; 1.25 also fine) — single static binary; cross-OS/arch via goreleaser later, out of scope here +- Go standard `go test`. 
Two tiers — (a) in-process Cobra unit tests via `SetArgs`/`SetOut`/`SetErr` + table-driven resolver tests; (b) `TestQuickstart_*` integration tests that build and exec the real binary (Constitution II/III). +- Local files only — project `.skillrig/config.toml`, global `~/.config/skillrig/config.toml` (XDG-aware). No database, no network. +- `github.com/spf13/cobra` (command tree); `github.com/pelletier/go-toml/v2` (config read/write — see research.md). Dependencies kept minimal (consume-only +- static binary). + diff --git a/docs/ARCHITECTURE-v0.md b/docs/ARCHITECTURE-v0.md index 96094d9..8d17aad 100644 --- a/docs/ARCHITECTURE-v0.md +++ b/docs/ARCHITECTURE-v0.md @@ -57,12 +57,12 @@ The generic `skillrig` binary is a single static build (goreleaser, cross-OS/arc |---|---|---| | `skillrig search [--tag ...]` | Query `index.json` for skills | human, agent | | `skillrig add ` | Vendor a skill into this repo + write lock entry | human | -| `skillrig verify` | Offline: integrity + prereqs, exit code | **CI**, agent, human | +| `skillrig verify` | Offline: integrity only (label-honesty + orphan), exit code | **CI**, agent, human | | `skillrig bump --pr` | Detect upstream advance, open upgrade PR | **CI (cron)** | | `skillrig global add/verify ` | Manage global-scope skills | human | | `skillrig doctor` | Superset health check (integrity + prereqs + auth) | human, agent | -`verify` and `doctor` overlap deliberately: `verify` is the lean, scriptable CI gate (R11); `doctor` is the human-friendly superset that also checks prerequisite auth (R18) and global hints. +`verify` and `doctor` divide along **what the caller needs**, not just verbosity (clarified 2026-05-29). `verify` is the lean, scriptable **integrity** gate — label-honesty (tree-SHA) + orphan (on-disk = locked) — run by **CI**, which validates *content* and therefore needs **no backing binaries** on the runner. 
`doctor` is the human/agent **eligibility + health** superset that additionally checks each `[[requires]]` prerequisite (present + version, PATH-or-mise-resolvable) and prerequisite **auth** (R18) — the things an agent needs to actually *run* a skill at runtime. The earlier "`verify` also checks prereqs (fail in CI / warn for humans)" framing was **rejected**: it conflated "non-interactive" with "CI", when the caller that actually needs prerequisites present is the runtime agent, not the content-validating CI gate. So **prerequisite / eligibility checking lives in `doctor` (and in `bump` when re-vendoring), never in `verify`** (resolves open Q5). `verify` emits exit `0/1/2`; the prerequisite class (exit `3`) is emitted by `doctor`. **Critical design rule:** `verify`, `bump`, and `doctor` all call the *same* `internal/skillcore` functions for hashing and manifest parsing. There must be exactly one implementation of "compute a skill's tree SHA" so the value CI writes during `bump` is identical to the value `verify` recomputes (R9, R14, N2). Make this a hard internal boundary. *(This thin-interface-over-shared-core rule is independently validated by Skilldex, whose MCP server and CLI both dispatch to the same `core/` functions "so the two interfaces cannot diverge." If you later expose an MCP surface for agents, it dispatches to `skillcore` too — never a parallel implementation.)* @@ -278,7 +278,7 @@ A skill's required CLI is one of two kinds, and skillrig handles both: Mechanics: - **Declare:** `[[requires]]` in `skill.toml` (§4.1), vendored so checks run offline (R16). A `source` of the org's own origin signals a private, co-located CLI; an external source signals a public one. -- **Verify:** `skillrig verify`/`doctor` checks each `requires` entry — on PATH (or resolvable via mise)? version satisfies constraint? — deterministic pass/fail with exit code (R11, R17). 
+- **Check (doctor, not verify):** `skillrig doctor` checks each `requires` entry — on PATH (or resolvable via mise)? version satisfies constraint? — deterministic pass/fail with exit code (R11, R17). A `mise.toml` in the consumer repo is a **suggestion, not a requirement**: skillrig works without it and simply reports the binary missing from PATH; when a `mise.toml` *is* present, "resolvable via mise" counts as satisfied. This is **doctor's** job, not `verify`'s — `verify` validates vendored *content* (label-honesty + orphan) and needs no backing binaries; prerequisite/eligibility is what the *runtime agent* needs (clarified 2026-05-29, §2). - **Auth as a distinct failure (R18):** for private-repo tools (mise gh backend pulling from the origin or e.g. `cdktn-io/oxid`), `doctor` must distinguish "tool missing" from "tool exists but you can't authenticate to fetch it" — explicitly check `gh auth` / `GITHUB_TOKEN` reachability and report it as its own actionable error. The most common onboarding/CI footgun; surface it loudly. The CLI can *offer* to write the matching `mise.toml` stanza (helpful), but installation is mise's job, not ours. We ship no Nix package or Homebrew tap in v0; orgs wanting those contribute them. @@ -298,7 +298,7 @@ Verified against mise's GitHub backend docs (current as of early 2026). Three fi > **Net:** the co-located-monorepo origin is viable through mise, but only via per-CLI tagged releases + per-tool tag filtering, with the template generating the config. The "one big release with all binaries" shape does **not** work with mise today. -> **Reference design (from OpenClaw study):** OpenClaw's `openclaw skills list --eligible` is the closest prior art to `skillrig verify`'s prereq check — it filters to skills *actually runnable in the current environment*, treating "missing dependency or auth error" as the disqualifier. 
Adopt its shape: a skill is "eligible" iff every `[[requires]]` resolves (present + version satisfies) AND any private source is authenticable (R18). `verify` returns the eligible/ineligible partition with per-skill reasons. Study OpenClaw's dependency-declaration schema before finalizing the `[[requires]]` fields. +> **Reference design (from OpenClaw study):** OpenClaw's `openclaw skills list --eligible` is the closest prior art to `skillrig verify`'s prereq check — it filters to skills *actually runnable in the current environment*, treating "missing dependency or auth error" as the disqualifier. Adopt its shape: a skill is "eligible" iff every `[[requires]]` resolves (present + version satisfies) AND any private source is authenticable (R18). **`doctor`** (not `verify`) returns the eligible/ineligible partition with per-skill reasons — eligibility is a runtime-agent concern, not the content-integrity gate (§2). Study OpenClaw's dependency-declaration schema before finalizing the `[[requires]]` fields. --- @@ -383,7 +383,7 @@ Recommendation to pressure-test: **Option B for the core, borrow `gh skill`'s cl 2. Symlink fallback policy on Windows/CI: copy-mode default detection rules. 3. `mise.toml` stanza generation: write it, or only print it? (Leaning: offer, don't impose.) 4. Where exactly the global lock lives across clients (`~/.agents/` canonical vs. per-client) and how `global verify` reconciles multiple client home dirs. -5. Whether `skillrig verify` should hard-fail or warn on a *prerequisite* miss vs. a *label-honesty* miss (likely: label-honesty mismatch = fail; prereq = fail in CI / warn for humans; unresolved merge conflict markers = fail). +5. ~~Whether `skillrig verify` should hard-fail or warn on a *prerequisite* miss vs. a *label-honesty* miss~~ — **resolved (2026-05-29)**: `verify` does **no** prerequisite check at all — that belongs to `doctor` (and `bump` when re-vendoring) (§2, §8). 
`verify` is integrity-only: label-honesty mismatch = fail (exit 2), orphan (on-disk ≠ locked) = fail (exit 2), unresolved conflict markers = fail (exit 2, reserved until `bump` can produce them). The prerequisite class (exit 3) is `doctor`'s. The old "fail in CI / warn for humans" framing was rejected — CI validates content and needs no binaries; the agent needs prerequisites at runtime, so eligibility belongs where the agent asks for it (`doctor`). 6. **Wrap vs. reimplement `gh skill`** (§11b) — the biggest architectural fork; decide before building §6. 7. **Provenance store reconciliation** — if wrapping `gh skill`, how to avoid two provenance stores (its frontmatter metadata vs. your lockfile). Likely: lockfile is canonical, frontmatter ignored/stripped. 8. **Allowlist authoring location** — `allowlist` block inside the `index.json` build inputs vs. a standalone `policy.toml`; and whether the allowlist is global-only or also per-consumer-repo (a repo tightening the org default). @@ -408,7 +408,7 @@ The smallest thing that delivers the core promise ("the skill your agent runs is - Lockfile with `commit` (provenance) + `treeSha` (label honesty) + `requires` (§4.2); `.skillrig/config.toml` (input) + `.skillrig/skills-lock.json` (output) (§2d). - Origin discovery via env > project config > global default; origin = git, **no auth of its own** (§2d). - One **batteries-included GitHub template** (skills + Go-monorepo backing-CLI pattern + index/lint/release workflows) (§2d). -- Backing-CLI declare + verify (`[[requires]]`, `--eligible`-style readiness, auth-as-distinct-failure) (§8); mise consumption via **per-CLI tagged releases + template-generated `mise.toml`** (§8b). +- Backing-CLI declare + **doctor**-side eligibility check (`[[requires]]`, `--eligible`-style readiness, auth-as-distinct-failure) (§8) — *not* in `verify`, which is integrity-only; mise consumption via **per-CLI tagged releases + template-generated `mise.toml`** (§8b). 
- Multi-client materialization: canonical `.agents/skills` + symlink views, copy-fallback (§6). - Discovery via committed `index.json` (§9); **deterministic tags ship in the manifest** (data only). - Drift-aware **three-way-merge bump** with conflict-markers-and-error (§5b). diff --git a/specledger/002-skillcore-verify/checklists/requirements.md b/specledger/002-skillcore-verify/checklists/requirements.md new file mode 100644 index 0000000..6ed93f8 --- /dev/null +++ b/specledger/002-skillcore-verify/checklists/requirements.md @@ -0,0 +1,36 @@ +# Specification Quality Checklist: Vendor & Verify Skills (`add` + `verify`) + +**Purpose**: Validate specification completeness and quality before proceeding to planning +**Created**: 2026-05-29 +**Feature**: [spec.md](../spec.md) + +## Content Quality + +- [x] No implementation details (languages, frameworks, APIs) +- [x] Focused on user value and business needs +- [x] Written for non-technical stakeholders +- [x] All mandatory sections completed + +## Requirement Completeness + +- [x] No [NEEDS CLARIFICATION] markers remain +- [x] Requirements are testable and unambiguous +- [x] Success criteria are measurable +- [x] Success criteria are technology-agnostic (no implementation details) +- [x] All acceptance scenarios are defined +- [x] Edge cases are identified +- [x] Scope is clearly bounded +- [x] Dependencies and assumptions identified + +## Feature Readiness + +- [x] All functional requirements have clear acceptance criteria +- [x] User scenarios cover primary flows +- [x] Feature meets measurable outcomes defined in Success Criteria +- [x] No implementation details leak into specification + +## Notes + +- Technical/implementation detail deliberately lives in [spec-tech-spike.md](../spec-tech-spike.md) (the planning input), keeping spec.md user-facing per the team's direction. 
The spike names the shared-core primitives, tree-SHA mechanics, lock schema, and exit-code mapping; spec.md refers to these only in user terms ("content fingerprint", "record file", "verification-failure status"). +- Scope was reshaped during the 2026-05-29 clarification session (recorded in spec.md → Clarifications and the spike §2): prerequisite checking moved out of `verify` into a later `doctor` capability (exit 3 deferred), and `skillrig add` (local-path vendoring) was pulled in as a durable capability so the `add → verify` round-trip is the acceptance contract. `docs/ARCHITECTURE-v0.md` was corrected the same branch. +- Items marked incomplete require spec updates before `/specledger.clarify` or `/specledger.plan`. All items pass. diff --git a/specledger/002-skillcore-verify/spec-tech-spike.md b/specledger/002-skillcore-verify/spec-tech-spike.md new file mode 100644 index 0000000..3706cc4 --- /dev/null +++ b/specledger/002-skillcore-verify/spec-tech-spike.md @@ -0,0 +1,189 @@ +# Tech Spike: `skillcore` + `add` (local) + `verify` + +**Feature**: `002-skillcore-verify` +**Created**: 2026-05-29 +**Status**: Draft — input to `/specledger.plan` +**Purpose**: Capture the technical decisions and their rationale *before* writing user stories, so `spec.md` can stay user-facing (WHAT/WHY) while the HOW lives here and feeds planning. Decisions here are anchored to `docs/ARCHITECTURE-v0.md` (§2, §4, §8, §9b) and `docs/design/cli.md` (Exit Codes, Verification Gate, AP-02/AP-04). + +> This document is a **spike**, not a contract. Where it commits to a behavior the user feels, that behavior is restated as a user story / FR in `spec.md`. Where it commits to internals (package boundaries, hashing mechanics), that stays here and is finalized in `plan.md`. + +--- + +## 1. Scope of this increment + +Three deliverables, smallest coherent slice that makes the core promise *demonstrable end-to-end and offline*: + +1. **`skillcore`** — the single shared primitive package. 
One implementation of: git **tree-SHA** computation, **`skill.toml`** manifest parse, and **`skills-lock.json`** read/write. Presentation-free (constitution: `internal/config`-style layering). Consumed by `add` and `verify` now; reusable by `bump`/`doctor`/`index` later **without a parallel copy** (AP-04).
+2. **`skillrig add `** — vendor a skill from a **local sample-origin fixture** (offline; no network/git fetch yet) into the consumer repo's canonical skill location, and write/update the lock entry. This is the **producer** that gives `verify` something real to check; it is also the first real cut of the `add` verb (architecture §2, Vendor Mutation pattern).
+3. **`skillrig verify`** — the offline, deterministic, read-only **integrity gate**: label-honesty (tree-SHA) + orphan (on-disk = locked). Exit `0/1/2`.
+
+**Why `add` is in scope** (course-correction 2026-05-29): without a producer, `verify` could only be tested against hand-authored locks, which can't anchor a *real* git tree-SHA (constitution III, ground-truth). `add` from a local fixture origin produces a genuine `commit`+`treeSha`, so the `add → verify` round-trip is the acceptance contract and the tree-SHA is never invented.
+
+---
+
+## 2. Clarification decisions (2026-05-29 session)
+
+| # | Question | Decision | Consequence |
+|---|---|---|---|
+| C1 | Does `verify` check `[[requires]]` prerequisites (emitting exit 3)? | **No.** Prerequisite / eligibility checking belongs to **`doctor`** (and `bump` when re-vendoring), never `verify`. | `verify` is integrity-only → exit `0/1/2`. Exit `3` (prerequisite) is **deferred to a future `doctor` spec**. `docs/ARCHITECTURE-v0.md` updated (§2, §8, open-Q5 resolved). |
+| C2 | Why not "fail in CI / warn for humans" on prereqs? | **Rejected framing.** It conflated "non-interactive" with "CI". CI validates *content* and needs **no backing binaries**; the runtime **agent** is the caller that needs prerequisites present. So eligibility belongs where the agent asks for it (`doctor`), and the content gate (`verify`) stays binary-free. | CI `verify` **never** fails on a missing/`mise`-absent binary — that worry is entirely a `doctor` concern now. |
+| C3 | What counts as a prerequisite "satisfied" (for the future `doctor`)? | On **PATH** (`--version` parsed, constraint checked) **OR** mise-resolvable. A `mise.toml` is a *suggestion*, not required — without it, skillrig just reports "missing from PATH". | Recorded as **future `doctor` design intent**; not implemented in this slice. Open question carried: should `doctor` flag "tool declared in `mise.toml` but not installed" distinctly from "absent everywhere"? |
+| C4 | Where does the lock under test come from, given `add`/`bump` are unimplemented? | **`add` is in scope** (local-path mode). `verify` itself stays **read-only**. | The `add → verify` round-trip is the test vehicle; no hand-authored locks needed for the happy path. |
+
+---
+
+## 3. `skillcore` primitives
+
+Single implementation, presentation-free, the AP-04 hard boundary.
+
+- **`TreeSHA(skillDir) -> sha`** — the **git tree SHA** of the skill subtree (architecture §4.2). Computed from the on-disk content using git's own object model (no bespoke canonicalization — line endings / mode bits / symlinks handled by git). The value `add` records and the value `verify` recomputes come from **this one function**, so the gate can never diverge from what was written (R9, R14, N2).
+  - *Open (planning):* compute via shelling to `git` (already a dependency, on PATH per Makefile) vs. an in-process tree-object hash. Either is fine if it matches git's canonical tree SHA byte-for-byte; the choice is a plan.md call. Must be deterministic and offline.
+  - *Open (planning):* hash the **working-tree content on disk** (catches uncommitted tampering) — this is the intended semantic, since the promise is "what your agent will run", not "what HEAD says".
+ - **Must equal git's canonical tree-object SHA** — the same value a git origin (and GitHub's Trees API, per §12 `gh skill`) publishes for that subtree. A git tree object hashes only *immediate entry names + modes + child SHAs*, so it is **relocation-invariant**: the subtree's SHA at the origin's `skills//` equals the vendored copy's at `.agents/skills//` **iff their contents match**. That invariance is precisely what makes offline label-honesty survive the origin→consumer relocation. +- **`ParseManifest(skill.toml) -> Manifest`** — parse `name, version, namespace, description, tags, [[requires]]` (architecture §4.1). In this slice, `verify` uses it to *recognize* a directory as a skill and `add` uses it to read `name`/`version`/`requires`; the `[[requires]]` data is **mirrored into the lock** but **not evaluated** (that's `doctor`). +- **`ReadLock` / `WriteLock` (`skills-lock.json`)** — typed lock I/O (architecture §4.2): `lockfileVersion, origin, skills{ name -> { version, commit, treeSha, path, requires[] } }`. Atomic write (temp + rename — open Q10). `WriteLock` used by `add`; `ReadLock` by `verify`. + +> All primitives are **path-in / data-out and never fetch** — they operate on a local filesystem working tree only. This is what makes the local-vs-network choice irrelevant to the core, and it is the SDK boundary (see §10). + +--- + +## 4. `add` (local-path mode) — the producer + +- **Input**: a local path to a skill subtree inside a sample-origin fixture (the fixture is a real git repo, so a `commit` SHA is readable offline). +- **Behavior**: copy the skill subtree into the consumer repo's canonical skill location; compute `treeSha = skillcore.TreeSHA(...)`, read `commit` from the fixture's git, `version`/`requires` from `skill.toml`; write/update the lock entry. Idempotent on re-add of identical content. 
+- **Pattern** (cli.md): **Vendor Mutation** — supports `--dry-run`; refuses to clobber content diverging from a locked `treeSha` without `--force` (content-comparison-on-write, architecture §9b). Writes the lock via `skillcore` only. +- **No content mutation (critical, contrast with `gh skill` §12):** `add` MUST vendor the skill subtree **byte-identical** to the source — it MUST NOT inject provenance into `SKILL.md` frontmatter the way `gh skill` does. Any injection would change the subtree's tree SHA and **immediately break label-honesty** at `verify` time (`verify` recomputes the *whole* subtree, so an added/modified file is a mismatch). All provenance lives **only** in the sidecar lockfile. This is *required* by the tree-SHA model, not a style choice. +- **Acquisition layer (builds toward full `add`)**: the local source is a **git checkout** (it must be — see §7), i.e. the *offline analog of a git origin cloned locally*. Local-path `add` against it exercises the **exact same** `skillcore` + lock path the future remote git-origin `add` will, with only the *acquisition* step swapped (path → `git clone/fetch`). So this slice is a deliberate stepping-stone, not a throwaway. **Investigate HashiCorp `go-getter` (v2)** as the unifying acquisition layer for that step — see §11 OQ-3. +- **Deferred**: network/git fetch from a remote origin; origin-reference resolution coupling (this mode takes a path directly); `@ref`/`--pin` immutable pins; multi-client symlink materialization (vendor the canonical copy only — symlink views are a later §6 concern). +- *Open (planning):* exact argument grammar (`add ` vs `add --from `); whether `add` requires an origin to be configured (lean: no — the local path *is* the source for this increment; the lock's `origin` field records provenance if available). + +--- + +## 5. `verify` — integrity gate + +**Pattern** (cli.md): **Verification Gate** — offline, deterministic, exit-code driven, **no online/inferential signal** (AP-02). 
Reads only committed/on-disk files. Needs **no origin and no network** (works before/without `init`).
+
+Two checks, both exit-2 class on failure:
+
+1. **Label-honesty** — for each locked skill, recompute `skillcore.TreeSHA(path)` and compare to the lock's `treeSha`. Mismatch ⇒ fail, naming the skill + expected vs actual.
+2. **Orphan / completeness** — the set of skill directories present under the canonical project skill location must **equal** the set of locked skills. An on-disk skill with no lock entry (**orphan** — the supply-chain vector, architecture §9b) ⇒ fail; a locked skill absent on disk (**missing**) ⇒ fail.
+   - Canonical location: `.agents/skills/` (architecture §6). A "skill on disk" = a directory containing a `skill.toml`/`SKILL.md`. **realpath-containment** (OpenClaw): symlink *views* (e.g. `.claude/skills/*`) must **not** be double-counted — only canonical-dir entries count.
+
+**Exit codes (this slice):**
+
+| Code | Meaning here |
+|---|---|
+| 0 | All locked skills match (label-honesty) and on-disk set = locked set. Incl. the empty case (no skills, no orphans). |
+| 1 | Usage/config: malformed/unreadable lock, bad flags, not inside a git repo. |
+| 2 | Verification failure: any label-honesty mismatch **or** any orphan/missing skill. |
+| 3 | **Not emitted by `verify`** — reserved for `doctor`'s prerequisite class. |
+
+**Conflict markers** (cli.md lists as exit-2): **deferred** — `bump`'s 3-way merge is the only producer and is out of scope; the exit-2 slot is reserved, and `verify`'s failure taxonomy is documented to grow into it when `bump` lands. (YAGNI — no producer exists yet.)
+
+**Output** (cli.md two-level): human = compact summary (`N skills verified` / per-failure lines) + footer hint; `--json` = complete, structurally complete per-skill verdicts (`name, path, expectedTreeSha, actualTreeSha, status, orphan/missing`) + overall result. `--verbose` surfaces raw causes.
Tests assert *shape* + exit code, not just `Contains` (constitution II). + +--- + +## 6. Edge cases (technical) + +- **No lock file present** → treated as an **empty lock** (zero locked skills). With no on-disk skills ⇒ exit 0 (clean, idempotent no-op); with on-disk skills ⇒ all are orphans ⇒ exit 2. +- **Malformed/unparseable lock** → exit 1 (usage/config), errors-as-navigation (what/why/fix), raw cause under `--verbose` — never a silent skip (a corrupt *gate* input must be loud, unlike a malformed *origin* config which is skipped in resolution). +- **Skill dir with no manifest** → not counted as a skill (or flagged — planning decision; lean: ignore non-skill dirs, only `skill.toml`-bearing dirs count). +- **Symlink view dirs** (`.claude/skills/*` → canonical) → ignored via realpath-containment. +- **Not inside a git repo** → exit 1 (tree-SHA needs git's object model); message tells the user to run inside the vendored repo. + +--- + +## 7. Ground-truth anchoring (constitution III) + +- A **sample-origin fixture**: a real git repo under `test/` (or `testdata/`) containing ≥1 skill subtree with a `skill.toml`. Its real git tree-SHA + commit are the ground truth. +- The `TestQuickstart_*` integration tests build the binary, run `add `, assert the lock records the **fixture's actual** `treeSha`/`commit`, then run `verify` and assert exit 0; then tamper one byte and assert exit 2 with the named skill; then introduce an orphan dir and a missing dir and assert exit 2 for each. +- No tree-SHA is hand-written into a fixture — it is always computed by `skillcore` from real content, so SC "same primitive both sides" is genuinely exercised. + +--- + +## 8. Cross-doc updates required + +- **`docs/ARCHITECTURE-v0.md`** — ✅ updated this branch (§2 table + overlap paragraph, §8 prereq bullet + reference-design para, open-Q5 resolved, v0 roadmap bullet). Captures the doctor-design clarification (C1/C2/C3). 
+- **`docs/design/cli.md`** — needs a matching update during planning/implementation (CLAUDE.md: a CLI behavior change updates `cli.md` in the same branch): the command index line for `verify` ("integrity + prereq check") and the Verification-Gate row should state `verify` is **integrity-only**, with prerequisite/eligibility attributed to `doctor`. The exit-code *table* (2 = label-honesty/orphan/conflict-markers; 3 = prerequisite) stays correct as the *contract*; only the verb-to-class attribution changes. +- **Skill co-evolution (constitution IX)** — a skill update teaching agents to run `skillrig verify`, interpret exit `0`/`2`, and understand that prerequisites are a `doctor` concern (not a verify failure). New skill vs. extend `skillrig-init` — planning decision. + +--- + +## 9. Deferred / out of this increment + +- Prerequisite/eligibility check + exit 3 → **`doctor`** (future spec; C3 intent recorded above). +- `bump` (upstream advance, 3-way merge) + conflict-marker detection. +- Network/git **fetch** in `add`; origin-resolution-driven `add`; `@ref`/`--pin` immutable pins; **auth for remote `add`** (PAT/SSH/registry token — see §11 OQ-2). +- `index.json` / `search`; multi-client symlink materialization (§6); allowlist/audit (§9b, v1); auth (R18). + +--- + +## 10. `skillcore` as a public SDK (requirement for planning) + +**SDK-1 (requirement):** `skillcore` MUST be consumable as a **Go SDK** by third-party Go projects that want to implement skill `add`/`verify` (and future operations) on top of the same primitives — i.e. **not** locked inside Go's `internal/` (which is import-unreachable outside the module). Goal: a third-party tool can `import` it and do `skillcore.Verify(repoPath)` / `skillcore.Add(opts)`, rendering its own output. + +**No rule blocks this** (verified 2026-05-29): +- Constitution §IV "Single implementation of integrity primitives" mandates *one* implementation everyone dispatches to; it says nothing about visibility. 
A public SDK **strengthens** AP-04 — external tools reuse the one source instead of re-deriving tree-SHA. +- Constitution §V layering requires execution logic to be presentation-free; an SDK **must** be presentation-free anyway → aligned, not in tension. +- The `internal/skillcore` naming in `docs/ARCHITECTURE-v0.md` (§1, §2, §5, §9), `CLAUDE.md`, and `cli.md` is a **convention inherited from `internal/config`, not a constitutional rule.** The architecture's vNext note already anticipates *other* surfaces (MCP) dispatching to `skillcore`; an SDK is exactly another such surface — it runs with the design's grain. +- The PRE-RELEASE marker (`CLAUDE.md`) means the SDK can be exposed **now with no backward-compatibility obligation** — its API may churn freely while we iterate. + +**Network independence — precisely scoped** (re-evaluated 2026-05-29 against prior art, §11): the *primitives* never fetch (§3) — `TreeSHA` / `ParseManifest` / lock-I/O operate on a local working tree and are genuinely network-free, model-agnostic, and **confirmed by prior art** (the skills.sh implementation computes its content hash locally, *post-fetch*). `verify` is likewise network-free and origin-independent (it recomputes and compares to the committed lock). **But `add`-the-capability is NOT network-free in general** — it is offline *only* because this slice deliberately scopes it to a *local source*. A future remote `add` must fetch (git clone/fetch, or an HTTP registry) and authenticate. So the honest SDK boundary is: **`skillcore` = pure filesystem-operating core (no fetch, ever); acquisition + auth = a layer above it that the CLI or SDK consumer supplies.** Do **not** over-read this as "add needs no network" — it means the *core* needs no network. (My earlier "a git remote can be a `file://` path so fetch is basically local" framing was a weak escape hatch and is **retracted** — see §11: the real origin may be an HTTP registry with no git remote at all.) 
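
To make the "primitives never fetch" boundary concrete, here is a minimal Go sketch of the in-process option for `TreeSHA` that §3 leaves open: it hashes a local directory with git's object encoding and touches nothing but the filesystem. The names `hashObject`, `hexSHA`, `encodeTree`, and `treeSHA` are hypothetical and are not the planned `skillcore` API. It is an illustration under stated assumptions: it ignores `.gitignore`, clean/smudge filters, and line-ending conversion, so on repositories that rely on those it may not match what git itself records, and plan.md may well choose to shell out to `git` instead.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// hashObject mirrors git's object hashing: sha1("<kind> <len>\x00" + body).
func hashObject(kind string, body []byte) [20]byte {
	header := fmt.Sprintf("%s %d\x00", kind, len(body))
	return sha1.Sum(append([]byte(header), body...))
}

// hexSHA renders a raw object ID the way git prints it.
func hexSHA(sha [20]byte) string { return hex.EncodeToString(sha[:]) }

// entry is one row of a git tree object (hypothetical helper type).
type entry struct {
	mode, name string // mode: "100644", "100755", "120000" or "40000"
	sha        [20]byte
}

// encodeTree serializes entries in git's order: bytewise by name, with
// directories compared as if their name ended in "/".
func encodeTree(entries []entry) []byte {
	key := func(e entry) string {
		if e.mode == "40000" {
			return e.name + "/"
		}
		return e.name
	}
	sorted := append([]entry(nil), entries...)
	sort.Slice(sorted, func(i, j int) bool { return key(sorted[i]) < key(sorted[j]) })
	var buf bytes.Buffer
	for _, e := range sorted {
		fmt.Fprintf(&buf, "%s %s\x00", e.mode, e.name)
		buf.Write(e.sha[:])
	}
	return buf.Bytes()
}

// treeSHA hashes the working-tree content under dir using only local
// filesystem reads. ok is false when dir holds no files, because git does
// not record empty directories.
func treeSHA(dir string) (sha [20]byte, ok bool, err error) {
	items, err := os.ReadDir(dir)
	if err != nil {
		return sha, false, err
	}
	var entries []entry
	for _, it := range items {
		path := filepath.Join(dir, it.Name())
		switch {
		case it.Name() == ".git": // repository metadata is never part of a tree
			continue
		case it.Type()&os.ModeSymlink != 0: // a symlink is a blob of its target
			target, err := os.Readlink(path)
			if err != nil {
				return sha, false, err
			}
			entries = append(entries, entry{"120000", it.Name(), hashObject("blob", []byte(target))})
		case it.IsDir():
			child, nonEmpty, err := treeSHA(path)
			if err != nil {
				return sha, false, err
			}
			if nonEmpty {
				entries = append(entries, entry{"40000", it.Name(), child})
			}
		default: // assumed to be a regular file
			data, err := os.ReadFile(path)
			if err != nil {
				return sha, false, err
			}
			info, err := it.Info()
			if err != nil {
				return sha, false, err
			}
			mode := "100644"
			if info.Mode()&0o100 != 0 { // owner-executable bit
				mode = "100755"
			}
			entries = append(entries, entry{mode, it.Name(), hashObject("blob", data)})
		}
	}
	return hashObject("tree", encodeTree(entries)), len(entries) > 0, nil
}

func main() {
	dir := "."
	if len(os.Args) > 1 {
		dir = os.Args[1]
	}
	sha, _, err := treeSHA(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(hexSHA(sha))
}
```

Because a tree object encodes only entry modes, names, and child SHAs, running this on the origin's copy of a skill and on the vendored copy yields the same value whenever the contents match, which is the relocation invariance §3 relies on. Acquisition and auth would sit in a separate layer whose only job is to hand a function like this a local path.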
+ +**Proposed SDK surface (plan.md finalizes):** +- Primitives: `TreeSHA`, `ParseManifest`, lock `Read/Write` + typed structs. +- Operation-level, presentation-free entry points so a consumer can do the whole job and render its own output: e.g. `Add(opts) (AddResult, error)`, `Verify(repoPath) (VerifyReport, error)` returning structured verdicts + typed errors (no stdout/stderr writes). +- This pushes execution logic **out of `runXxx()` into the package** — a deliberate revision of cli.md's "these are not separate packages, just a concern within `runXxx()`" line: for SDK consumability the execution layer SHOULD be a separate importable package, with `runXxx()` reduced to flag-parse → call SDK → render. + +**Packaging options (plan.md decision — record trade-off):** +- (a) Exported package at module root (`skillcore/`) or `pkg/skillcore/` — simplest; versioned with the CLI. +- (b) Separate Go module (`github.com/skillrig/skillcore`) — independent SemVer for SDK consumers, at the cost of multi-module release overhead. +- Either honors AP-04: the CLI imports the same package the SDK exposes. + +**Reconciliation list (planning, same branch per the CLI-doc rule — doc-convention changes, not contract changes):** rename `internal/skillcore` → the chosen public path in `docs/ARCHITECTURE-v0.md` (§1, §2, §5, §9), `CLAUDE.md` (Architecture section + skillcore note), and `cli.md` (the Execution-vs-Presentation "not separate packages" statement, which the SDK goal reverses). + +--- + +## 11. Prior-art contrast & the git-origin coupling (re-evaluation 2026-05-29) + +Explored `/Users/vincentdesmet/specledger/specledger/pkg/cli/skills` — an existing `npx skills` / **skills.sh** implementation targeting an **HTTP registry hosted on Vercel**, *not* a git repo. 
Findings that bear on our design: + +- **It is network-strict — no offline path.** `add` discovers via the **GitHub Trees API** (`api.github.com/.../git/trees`), fetches each `SKILL.md` via **`raw.githubusercontent.com`**, queries an audit API + telemetry on **`add-skill.vercel.sh`**, with a `git clone --depth 1` fallback only for non-GitHub/API-failure cases. No `file://`, no cache; the source parser accepts only `owner/repo` or git URLs. (`client.go:23–273`, `discover.go:26–182`, `source.go:40–99`.) **Takeaway:** our local-path `add` is a *divergence/addition* vs this prior art (which has no offline mode) — we are designing the offline UX **new**, not borrowing it. +- **It uses a bespoke SHA-256 over the installed files, NOT a git tree-SHA** (`hash.go:21–68`), recording `{source, ref, sourceType, computedHash}` — note **ref only, no commit SHA** (`lock.go:11–24`). This is exactly the "custom content hash" our architecture **§4.2 explicitly rejected** in favor of the git tree-SHA, on the stated grounds that "the origin already computes git tree SHAs for free." + +**The sharp finding — §4.2's "tree-SHA is free" justification is contingent on a git origin.** It holds only when the origin is a **git repo** (our stated model — §2c: "the origin … is literally the git remote skills are fetched from"). For an **HTTP-registry origin like skills.sh/Vercel there is no git tree object** to get for free, and our label-honesty primitive ("recompute the subtree's git tree SHA and compare to the origin's recorded tree SHA") has nothing origin-side to compare against. The prior art's SHA-256 is the *consequence* of that: a registry model is forced into a bespoke digest. + +**Assumptions + open questions (for plan.md / architecture):** +- **A-1 (make explicit):** skillrig's integrity model **presumes a git-repo origin**. Internally consistent with §2c/§4.2 and our local-path slice (the fixture is a git checkout, so `TreeSHA` *and* the `commit` provenance read offline). 
State it; don't leave it implicit. +- **OQ-1:** if skillrig ever consumes an **HTTP-registry origin** (skills.sh-style), the integrity primitive must change from "compare to the origin's git tree SHA" to "recompute a content digest from the fetched bytes" (on-disk self-consistency), **losing the origin-attested "modified-in-transit but mislabeled" check** unless the registry publishes a trusted digest. Decide: git-origin-only (current design), or also support registry origins? +- **OQ-2 (auth, future network `add`):** remote `add` from a **private** origin needs credentials — a GH **PAT** (`GITHUB_TOKEN`/`GH_TOKEN`, as the prior art reads) or an **SSH key** for git, or a **registry token** for an HTTP registry. Re-enters scope the moment network `add` lands; deferred today (this slice is local-path, no auth). Pairs with the doctor-side prerequisite **auth** check (R18) already moved out of `verify`. +- **OQ-3 (acquisition library):** evaluate **HashiCorp `go-getter`** (the fetch engine behind Terraform/Packer/Nomad) as the acquisition layer *above* `skillcore`. It unifies `file://` (our local-path slice), `git::ssh/https` with `?ref=` + `//subdir` (the git-origin future), and `http`/`s3`/`gcs` under one source grammar + detectors — a clean fit for "acquisition = the layer above the core." **Two things to verify before adopting:** + 1. **Dependency footprint** vs the architecture's "minimal deps, static consume-only binary" stance — prefer `go-getter/v2` and trim unused getters (the s3/gcs detectors drag in cloud SDKs). A thin `git` wrapper (we already require `git` on PATH) may honor minimal-deps *better* **if** skillrig stays git-origin-only. + 2. **Provenance capture** — go-getter is built to *get content*, not to preserve git identity; confirm we can still obtain the **resolved commit SHA** and compute the **git tree SHA** for the lock (e.g. capture the commit *before* go-getter copies the subdir / drops `.git`). 
If it can't, it doesn't serve our provenance need and a `git` wrapper wins. + + **Its value scales with OQ-1:** git-origin-only → a thin `git` wrapper is likely sufficient; multi-origin (incl. HTTP registry) → go-getter's unified acquisition + checksum support becomes attractive. Decide OQ-1 first, then OQ-3. +- **Unaffected:** `verify` and the `skillcore` primitives stay network-free and origin-model-agnostic regardless of the above — they only ever recompute against the *committed lock* and *on-disk content*. **The coupling bites at `add`-time provenance, not at verify-time.** + +--- + +## 12. Prior-art: `gh skill` (GitHub first-party, MIT) — re-evaluation 2026-05-29 + +Explored `/Users/vincentdesmet/specledger/skillrig/gh-cli/pkg/cmd/skills`. A **third** acquisition model alongside skills.sh (HTTP registry, §11) and skillrig (git): the **GitHub REST API** (Trees / Blobs / Contents / Refs / Releases), reusing `gh`'s auth token; network-strict except a `--from-local` mode. Subcommands: `install`(`add`) / `update` / `preview` / `search` / `publish`. **No `verify`.** (`skills.go:15-57`.) + +**What it confirms about skillrig's design:** +- **It GETS the tree SHA from the Trees API (`entry.SHA`) and never recomputes it offline** (`discovery.go:544-589`). It stores that SHA in *both* `SKILL.md` frontmatter (`github-tree-sha`/`-ref`/`-path`/`-pinned`/`-repo`; `frontmatter.go:70-98`) **and** a **user-scope** lockfile `~/.agents/.skill-lock.json` (`skillFolderHash`, `source`, `sourceUrl`, `pinnedRef`, `installedAt`; `lockfile.go:94-137`). It uses the SHA only for **online update-detection** (recorded vs. remote; `update.go:302`). → **Exactly the architecture §11/§4.2 critique, now verified:** right primitive (git tree SHA), wrong *question* ("did upstream move?" not offline label-honesty), and **no offline `verify`**. skillrig's `verify` fills a gap `gh skill` genuinely does not. +- **No `[[requires]]` / backing-CLI concept** — pure file placement. 
Confirms skillrig's doctor-eligibility check is a real differentiator. +- **Copies, not symlinks; rewrites frontmatter** (`installer.go:251-305`). Confirms architecture §6. +- **Lockfile is user-scope only, not committed per-repo** — a cloning teammate/CI gets no provenance. skillrig's **committed project lock** is what makes the repo self-describing + CI-verifiable (architecture §3) — a divergence in skillrig's favor. + +**The sharp incompatibility finding (resolves architecture open Q7):** +`gh skill` **injects provenance into each `SKILL.md`** *after* computing the upstream tree SHA — so its installed content **deliberately diverges from the tree SHA it recorded**, and it only gets away with this because it **never recomputes** the installed tree's SHA. **skillrig cannot do this:** since `verify` recomputes the on-disk subtree's tree SHA, **any in-file injection fails label-honesty.** Therefore: +- skillrig's **lockfile-only, no-frontmatter-mutation** provenance is **required by the integrity model**, not a preference (recorded in §4 "no content mutation"). +- **You cannot wrap `gh skill`'s placement** and keep tree-SHA label-honesty — its frontmatter rewriting destroys the very SHA you'd verify. Concrete evidence for architecture **§11b Option B (reimplement core)**, not wrap. + +**What to borrow:** +- **Source grammar** for skillrig's `add` (the §4 open question): `OWNER/REPO`, `OWNER/REPO@ref`, **`OWNER/REPO//path`** (subdir), `./local` + `--from-local`. The `//path` subdir syntax **converges with go-getter's `//subdir`** (§11 OQ-3) — a point in go-getter's favor for a unified grammar. +- **`--from-local` is direct prior art for our local-path `add`** — but `gh skill`'s local mode injects only a `local-path` field and records **no git tree SHA / commit** (`frontmatter.go:102-127`). skillrig's local-path `add` deliberately **requires a git checkout** so it records a real tree SHA + commit (§4, §7) — a deliberate improvement, not a copy. 
+- **Soft-pin** semantics (`--pin v1.0.0` stored as a human-readable ref, re-resolved each fetch, used only as update-skip) vs skillrig's intended **immutable** skill pin (architecture §2d) — decide pin hardness at planning. +- **Upstream-provenance redirect** (`install.go:225-298`): detects re-published skills via `github-repo` metadata and offers the upstream — a governance idea adjacent to skillrig's allowlist/orphan work (v1). diff --git a/specledger/002-skillcore-verify/spec.md b/specledger/002-skillcore-verify/spec.md new file mode 100644 index 0000000..cbe2e23 --- /dev/null +++ b/specledger/002-skillcore-verify/spec.md @@ -0,0 +1,208 @@ +# Feature Specification: Vendor & Verify Skills (`add` + `verify`) + +**Feature Branch**: `002-skillcore-verify` +**Created**: 2026-05-29 +**Status**: Draft +**Input**: User description: "Implement `skillcore` + `verify` — git tree-SHA + `skill.toml` manifest parse; offline label-honesty + orphan check; exit codes 0/2/3 from docs/ARCHITECTURE-v0.md" + +> **Technical companion**: [spec-tech-spike.md](./spec-tech-spike.md) holds the implementation-level decisions (the shared-core primitives, tree-SHA mechanics, lock schema, exit-code mapping, the clarification session that reshaped scope). This spec stays user-facing; the spike is the input to `/specledger.plan`. + +## Overview + +This feature delivers the product's core promise — **"the skill your agent runs is exactly the version that was reviewed and approved"** — as something a user can do end-to-end, offline: + +1. A user **vendors a skill** from their org's library into their repo, recording its exact identity (`add`). +2. Anyone — CI, an agent, or a human — can later **prove the vendored skills are exactly what was recorded**, and that nothing untracked has slipped in (`verify`). 
+ +Both verbs sit on a single shared trust primitive (the content fingerprint), so the value written when a skill is vendored and the value checked at verification time are computed the same way and cannot drift apart. That shared primitive is invisible to users; its user-meaning is simply that **the gate cannot lie**. + +This is the second slice of the CLI. The first (`001-init-origin-resolution`) made a repo self-describing about *where* its skills come from; this slice makes the skills it carries *honest*. + +## Clarifications + +### Session 2026-05-29 + +- Q: The input lists exit codes 0/2/3, but names only label-honesty + orphan checks (both the integrity/exit-2 class). Exit 3 is the *prerequisite* class. Does this slice check backing-CLI prerequisites? → A: **No.** Prerequisite / eligibility checking (does the agent have the backing CLIs a skill needs to *run*?) belongs to a later `doctor` capability, not to `verify`. `verify` is integrity-only and uses exit statuses `0/1/2`; the prerequisite class (exit 3) is deferred. (`docs/ARCHITECTURE-v0.md` was corrected to reflect this — see the spike §8.) +- Q: Why not have `verify` warn-or-fail on prerequisites depending on caller? → A: That framing was rejected. The CI gate validates *content* and needs no backing binaries installed; the caller that actually needs prerequisites present is the runtime agent. So prerequisite/eligibility lives where the agent asks for it (`doctor`), and the content gate stays free of it. Practical consequence: a CI run of `verify` never fails because a backing tool is missing. +- Q: `verify` needs something to verify against, but the lock's writers weren't built yet — fixtures only, or a real producer? → A: A real producer. **`skillrig add` (vendoring from a local copy of the origin) is in scope**, so the `add → verify` round-trip is the acceptance contract and the recorded fingerprint is genuine, not hand-authored. `verify` itself remains read-only. 
+- Q: Is vendoring-from-a-local-path a throwaway test affordance or a real capability? → A: A **durable capability** — consuming from a local checkout of the org library is a legitimate, kept use case. Fetching directly from a remote origin is a *later, additive* mode, not a replacement for it. + +## User Scenarios & Testing *(mandatory)* + +### User Story 1 - Vendor an approved skill into my repo (Priority: P1) + +A developer (or an agent acting for them) has access to their org's skill library and wants one of its skills available in this repo. They run a single command naming the skill's location in a local copy of that library. The skill's files are placed into the repo's standard skills location, and its exact identity — which version, where it came from, and a tamper-evident fingerprint of its content — is recorded in a committed record file. The repo now carries both the skill and proof of what it is. + +**Why this priority**: Nothing can be verified until something has been vendored and recorded. This is the producer half of the promise and the smallest standalone slice that delivers value: a repo gains an approved skill plus a durable record of its identity. Consuming from a local copy of the library is itself a real, supported workflow (offline, air-gapped, or pre-cloned origins). + +**Independent Test**: Point the vendor command at a skill in a local sample library; confirm the skill's files land in the repo's skills location and the record file gains an entry naming the skill's version, source, and content fingerprint. No network is involved. + +**Acceptance Scenarios**: + +1. **Given** a repo with no vendored skills and a local library containing a skill, **When** the developer vendors that skill, **Then** the skill's files appear under the repo's standard skills location and the record file contains one entry for it (version, source/provenance, and a content fingerprint), and the command reports success. +2. 
**Given** a skill already vendored from a library, **When** the developer vendors the identical content again, **Then** the outcome is unchanged (idempotent) and the command reports success — no duplicate or corrupted record. +3. **Given** the developer requests machine-readable output, **When** the vendor command succeeds, **Then** the tool emits structured output naming the skill, its recorded version, and where it was placed, and that output is complete and parseable. +4. **Given** a skill is already vendored and the developer has locally changed its files, **When** they vendor again with content that differs from the recorded fingerprint, **Then** the tool refuses to silently overwrite the divergent content and requires an explicit override, so local edits are never lost without intent. +5. **Given** the developer wants to preview only, **When** they run the vendor command in dry-run mode, **Then** the tool reports what it *would* place and record, and writes nothing. + +--- + +### User Story 2 - Prove a vendored skill is exactly what was approved (Priority: P1) + +A reviewer, a CI job, or an agent needs assurance that the skills checked into a repo are exactly the approved versions and have not been altered to claim a version they aren't. They run a single verification command. It recomputes each vendored skill's content fingerprint and compares it to the recorded value. If every skill matches, the command passes. If any skill's content diverges from what its record claims, the command fails and names the offending skill and the discrepancy. + +**Why this priority**: This is the core product promise made checkable. It is the reason the feature exists: a long skill file can hide a change no human reviewer would catch by eye, and this turns "is this really the approved version?" into a deterministic pass/fail. It builds on US1 (something must be vendored and recorded first), so the two together form the minimum viable round-trip. 
+ +**Independent Test**: With at least one vendored-and-recorded skill, run verify and confirm it passes; then alter a single byte of a vendored skill file and confirm verify fails with a non-zero status that names the skill and reports a content mismatch. + +**Acceptance Scenarios**: + +1. **Given** a repo whose vendored skills all match their recorded fingerprints, **When** verify runs, **Then** it reports success with a success exit status and a summary of how many skills were verified. +2. **Given** a vendored skill whose content has been modified so it no longer matches its recorded fingerprint, **When** verify runs, **Then** it exits with the verification-failure status and names the skill plus the recorded-vs-actual discrepancy. +3. **Given** the verification runs entirely on the committed files with no network or external service, **When** it is run repeatedly with unchanged inputs, **Then** it returns the same result every time (deterministic, offline). +4. **Given** a repo with no vendored skills and no record file, **When** verify runs, **Then** it reports success (nothing to verify) rather than an error. + +--- + +### User Story 3 - Catch a skill that's untracked or missing (Priority: P2) + +A security-conscious reviewer worries about a skill that was added to the repo without going through the record — an untracked skill that could quietly instruct an agent to do something unreviewed — or, conversely, a recorded skill whose files have gone missing. Running verify covers the *whole* set of skills on disk against the recorded set, not only the ones listed, so neither an extra unrecorded skill nor a missing recorded one can pass unnoticed. + +**Why this priority**: It closes the highest-severity supply-chain gap the design calls out (an untracked skill is the primary attack vector), and it makes "everything present is accounted for" part of the gate. It depends on the matching machinery from US2, so it follows it. 
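The whole-set comparison described in this story can be sketched as a plain set difference in both directions. This is a minimal illustration only: `CompareSets` is a hypothetical name, and how the on-disk and recorded skill names are discovered (including skipping per-client view directories) is assumed to happen before this call.

```go
package main

import (
	"fmt"
	"sort"
)

// CompareSets reports skills present on disk but not recorded (untracked)
// and skills recorded but absent on disk (missing). Inputs are skill names
// and are assumed to contain no duplicates.
func CompareSets(onDisk, recorded []string) (untracked, missing []string) {
	inRecord := make(map[string]bool, len(recorded))
	for _, name := range recorded {
		inRecord[name] = true
	}
	present := make(map[string]bool, len(onDisk))
	for _, name := range onDisk {
		present[name] = true
		if !inRecord[name] {
			untracked = append(untracked, name)
		}
	}
	for _, name := range recorded {
		if !present[name] {
			missing = append(missing, name)
		}
	}
	// Sort so repeated runs report problems in a stable order.
	sort.Strings(untracked)
	sort.Strings(missing)
	return untracked, missing
}

func main() {
	untracked, missing := CompareSets(
		[]string{"deploy", "rogue"},
		[]string{"deploy", "lint"},
	)
	fmt.Println("untracked:", untracked) // untracked: [rogue]
	fmt.Println("missing:", missing)     // missing: [lint]
}
```

Returning both lists rather than failing on the first finding keeps the sketch in line with the requirement that a run reports every class of problem it detects.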
+ +**Independent Test**: Starting from a passing repo, (a) add a skill directory that has no record entry and confirm verify fails identifying it as untracked; (b) separately, remove a recorded skill's files and confirm verify fails identifying it as missing. + +**Acceptance Scenarios**: + +1. **Given** a skill directory present in the repo's skills location with no corresponding record entry, **When** verify runs, **Then** it exits with the verification-failure status and identifies the untracked (orphan) skill. +2. **Given** a record entry for a skill whose files are absent from the repo, **When** verify runs, **Then** it exits with the verification-failure status and identifies the missing skill. +3. **Given** the repo uses per-client compatibility views (alternate directory entries pointing at the same canonical skill content), **When** verify runs, **Then** those views are not miscounted as separate or untracked skills. +4. **Given** both a content mismatch (US2) and an untracked skill are present, **When** verify runs, **Then** it fails and reports both classes of problem rather than stopping at the first. + +--- + +### User Story 4 - Branch on the outcome deterministically (Priority: P3) + +An automated caller — a CI merge gate or an agent deciding its next step — needs to act on the verification result without parsing prose. It relies on stable exit statuses to branch (proceed vs. block) and on complete machine-readable output to report *which* skills failed and why. When something is wrong, the message states what failed, the real reason, and a concrete fix. + +**Why this priority**: It hardens the gate for its primary non-human callers and makes the result composable in pipelines, but the happy and failure paths (US1–US3) must exist first. 
+ +**Independent Test**: Trigger each outcome — all-pass, content mismatch, untracked/missing — and confirm each yields its documented exit status and a structurally complete machine-readable verdict; confirm every failure message contains what failed, why, and a suggested fix, and that diagnostics go to the error stream while data goes to standard output. + +**Acceptance Scenarios**: + +1. **Given** any verification outcome, **When** verify runs, **Then** it returns a stable exit status distinguishing success, verification failure, and usage/config error, consistent across repeated runs. +2. **Given** machine-readable output is requested, **When** verify runs, **Then** the output is complete (every checked skill with its per-skill verdict) and parseable, for both passing and failing runs. +3. **Given** a verification failure, **When** the result is reported, **Then** the message names what failed, the underlying reason (never swallowed), and at least one concrete next step; an escape-hatch verbose mode exposes the raw underlying cause. +4. **Given** the record file is itself unreadable or malformed, **When** verify runs, **Then** it exits with the usage/config error status (distinct from a verification failure) and explains the problem rather than dumping a raw parser error. + +--- + +### Edge Cases + +- **No record file at all**: treated as "no skills recorded" — verify passes if there are also no skills on disk, and reports every on-disk skill as untracked if there are. Vendoring creates the record file on first use. +- **Empty repo / nothing vendored**: verify is a success (nothing to check), not an error. +- **Local edits then re-vendor**: re-vendoring content that diverges from the recorded fingerprint requires an explicit override; it never silently discards local edits. +- **Re-vendor identical content**: produces no spurious change and reports success (idempotent). 
+- **Per-client view directories**: alternate directory entries that point at the same canonical skill content are not counted as separate or untracked skills. +- **Malformed record file**: surfaced as a usage/config error with a clear message, distinct from a content-verification failure. +- **Not inside a version-controlled repo**: the fingerprint relies on the repo's version-control content model, so running outside one is a usage/config error that says so. +- **Whitespace/formatting in the record**: tolerated on read; the fingerprint comparison is on content, not formatting. + +## Requirements *(mandatory)* + +### Functional Requirements + +**Vendoring a skill (`add`)** + +- **FR-001**: The system MUST provide a command that vendors a skill from a local copy of the org's library into the repo's standard skills location and records its identity. +- **FR-002**: For each vendored skill, the system MUST record its version, its provenance (where it came from), and a content fingerprint that uniquely reflects the skill's content. +- **FR-003**: The vendor command MUST be idempotent: re-vendoring identical content leaves an equivalent result and reports success without error. +- **FR-004**: The vendor command MUST NOT silently overwrite vendored content that diverges from the recorded fingerprint; it MUST require an explicit override so local modifications are never lost without intent. +- **FR-005**: The vendor command MUST support a preview (dry-run) mode that reports the intended placement and record changes without writing anything. +- **FR-006**: The vendor command MUST create any missing skills location and record file on first use. +- **FR-007**: The vendor command MUST operate offline against the supplied local source; it MUST NOT require network access in this feature. (Fetching from a remote origin is a later, additive mode — see Out of Scope.) 
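The vendoring requirements above (idempotent re-vendor, refusal to overwrite divergent content without an override, dry-run) can be read as one small decision function. The sketch below is illustrative only: `Action` and `Decide` are hypothetical names, fingerprints are treated as opaque strings, and the handling of on-disk content with no record entry is an assumption the spec does not settle.

```go
package main

import "fmt"

// Action is what a vendor command would do for one skill.
type Action string

const (
	ActionPlace   Action = "place"   // write files and the record entry
	ActionNoop    Action = "noop"    // identical content is already vendored
	ActionRefuse  Action = "refuse"  // divergent content and no override given
	ActionPreview Action = "preview" // dry-run: report only, write nothing
)

// Decide picks the action for one skill.
//   recorded: fingerprint in the record ("" if there is no entry)
//   onDisk:   fingerprint of the currently vendored files ("" if none)
//   incoming: fingerprint of the content about to be vendored
func Decide(recorded, onDisk, incoming string, force, dryRun bool) Action {
	// Existing files that do not match the record are never overwritten
	// silently. Assumption: files with no record entry count as divergent.
	if onDisk != "" && onDisk != recorded && !force {
		return ActionRefuse
	}
	// Re-vendoring identical content changes nothing.
	if recorded == incoming && onDisk == incoming {
		return ActionNoop
	}
	if dryRun {
		return ActionPreview
	}
	return ActionPlace
}

func main() {
	fmt.Println(Decide("", "", "abc", false, false))       // place
	fmt.Println(Decide("abc", "abc", "abc", false, false)) // noop
	fmt.Println(Decide("abc", "xyz", "abc", false, false)) // refuse
	fmt.Println(Decide("abc", "xyz", "abc", true, false))  // place
	fmt.Println(Decide("", "", "abc", false, true))        // preview
}
```

Putting the divergence check first means a dry run on locally edited content still surfaces the refusal, which seems the safer reading of the "never lost without intent" requirement.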
+ +**Verifying vendored skills (`verify`)** + +- **FR-008**: The system MUST provide a verification command that checks the repo's vendored skills against their recorded identities, entirely offline and deterministically (same inputs always yield the same result; no network or external/live signal). +- **FR-009**: Verification MUST recompute each recorded skill's content fingerprint from its current on-disk content and compare it to the recorded value, failing when they differ (label honesty). +- **FR-010**: Verification MUST compare the set of skills present on disk against the set of recorded skills, failing when a skill is present but unrecorded (untracked/orphan) or recorded but absent (missing) — covering the whole set, not only recorded entries. +- **FR-011**: Verification MUST NOT count per-client compatibility views (alternate directory entries pointing at the same canonical skill content) as separate or untracked skills. +- **FR-012**: Verification MUST report *all* detected problems in a run (e.g. both a content mismatch and an untracked skill), not stop at the first. +- **FR-013**: Verification MUST treat an empty repo / absent record as success (nothing to verify), and MUST treat an unreadable or malformed record as a usage/config error distinct from a verification failure. +- **FR-014**: Verification MUST NOT perform any backing-tool prerequisite or eligibility check; that is explicitly out of scope for this feature (reserved for a later health command). A missing backing tool MUST NOT cause a verification failure. +- **FR-015**: Verification MUST be read-only — it checks, and never writes, the record or the skill files. + +**Shared trust primitive (`skillcore`)** + +- **FR-016**: The content fingerprint and the skill-record parsing MUST have exactly one shared implementation used by both vendoring and verification, so the value written at vendor time and the value checked at verify time cannot diverge. (No parallel/duplicate implementation.) 
+- **FR-017**: That shared implementation MUST be reusable by future commands (e.g. upgrade-proposing and health commands) without copying the logic. + +**Command experience (baseline conformance, consistent with the first slice)** + +- **FR-018**: Every command MUST provide self-documenting help including at least two usage examples sufficient to construct a correct invocation without external docs. +- **FR-019**: Every error MUST state (a) what failed, (b) the real underlying reason (never swallowed), and (c) at least one concrete suggested fix; a verbose mode MUST expose the raw underlying cause. +- **FR-020**: Diagnostic output MUST go to the error stream and primary data to the standard output stream, so output can be cleanly piped. +- **FR-021**: The system MUST offer a machine-readable output mode whose output is complete (every checked skill with a per-skill verdict; no truncation) and parseable, for both passing and failing runs. +- **FR-022**: The system MUST use distinct, stable exit statuses: success; usage/config error (bad arguments, malformed record, not in a version-controlled repo); and verification failure (content mismatch or untracked/missing skill). The prerequisite-failure status is reserved and MUST NOT be emitted by this feature's commands. + +### Key Entities *(include if feature involves data)* + +- **Skill**: a unit of agent capability — a directory of files (including a machine-readable manifest declaring its name, version, and any backing-tool prerequisites) vendored into the repo. The thing that is vendored, recorded, and verified. +- **Skill manifest**: the per-skill machine-readable description (name, version, namespace, description, discovery tags, declared backing-tool prerequisites). Read at vendor time; its prerequisite declarations are recorded but not evaluated in this feature. +- **Skill record (lock)**: the committed file mapping each vendored skill to its recorded version, provenance, content fingerprint, and location. 
Written by vendoring, read by verification; the source of truth for "what was approved." +- **Content fingerprint**: a value that uniquely reflects a skill's content as published for a given version. Used for *label honesty* — confirming on-disk content matches the version it claims to be. Computed identically at vendor time and verify time. +- **Verification verdict**: the outcome of verification — overall pass/fail plus a per-skill result (matched / content-mismatch / untracked / missing), surfaced both compactly for humans and completely for machines. + +## Out of Scope + +The following are explicitly **not** part of this feature and MUST NOT be pulled in: + +- **Backing-tool prerequisite / eligibility checking** (is a skill's required CLI present, the right version, authenticable) and its dedicated exit status — reserved for a later health command. `verify` here is integrity-only. +- **Fetching skills from a remote origin** (network/git fetch), origin-resolution-driven vendoring, and immutable version pins — this feature vendors from a *local* copy of the library only. +- **Upgrade proposal** (detecting upstream advances, three-way-merge of local edits, conflict-marker handling, opening PRs) — a later command; this feature has no producer of merge conflicts. +- **Discovery** (`index.json`, search) and any browse UI. +- **Multi-client materialization** (creating per-client view directories) — verification must merely not miscount existing views; creating them is out of scope. +- **External-source allowlists, audit classification, and risk/vulnerability surfacing** — later governance work. +- **Any authentication or credential handling.** + +## Success Criteria *(mandatory)* + +### Measurable Outcomes + +- **SC-001**: A user can vendor a skill from a local library and verify it in two commands, with zero network access and no hand-authored records. 
+- **SC-002**: When a vendored skill's content matches its record, verification passes (success status); when any skill's content diverges from its record, verification fails with the verification-failure status — correct for 100% of label-honesty cases. +- **SC-003**: A single altered byte in any vendored skill file is detected as a content mismatch — 0 false negatives. +- **SC-004**: Any on-disk skill with no record entry (untracked) and any recorded skill absent on disk (missing) are both detected and fail the gate. +- **SC-005**: The content fingerprint computed at verify time is identical to the value recorded at vendor time for unmodified content — verified by vendoring real content and re-checking it, never by a hand-written value. +- **SC-006**: A missing backing tool never causes a verification failure (verification is integrity-only). +- **SC-007**: Verification is fully offline and deterministic: identical inputs yield the identical exit status and verdict on every run. +- **SC-008**: Every failure a user can hit (content mismatch, untracked/missing skill, malformed record) produces a message naming the problem and at least one concrete next step — 0 raw-only errors — and machine-readable output is complete and parseable for both passing and failing runs. +- **SC-009**: Each command's help output alone is sufficient for a first-time user or agent to construct a correct invocation (purpose plus ≥2 examples). + +## Constitution Alignment *(skillrig-specific)* + +- **II — Quickstart-as-Contract**: `quickstart.md` scenarios will be authored as executable steps (concrete invocations, observable record/skill contents, exit statuses) mapping 1:1 to integration tests. The vendor→verify round-trip, the tamper→fail case, and the untracked/missing cases are each a scenario. 
Output-shape assertions are required: machine-readable output parseable and structurally complete; error output checked for its three parts (what/why/fix) plus the correct exit status — not a single substring match. +- **III — Ground-Truth Anchoring**: the content fingerprint must be anchored to real captured content — vendored from a real local library fixture and recomputed — never an invented or hand-written fingerprint. (See spike §7.) +- **VIII — Single-Implementation Discipline (AP-04)**: the fingerprint and record-parsing primitives have exactly one implementation, shared by vendoring and verification and reusable by later commands; a parallel copy is a defect. +- **IX — Skill–CLI Co-Evolution**: this capability ships with a corresponding agent skill update teaching agents how to phrase "vendor this skill / check our skills are unmodified," how to read the pass/fail outcome, and that a missing backing tool is *not* a verification failure (it is a later health concern). + +## Dependencies & Assumptions + +**Assumptions**: + +- A local copy of the org's skill library is available on disk to vendor from; this feature does not fetch it. (Consuming from a local checkout is a supported, durable workflow; remote fetch is additive future work.) +- Skills are vendored into the repo under version control, so the repo's own content model carries file integrity; the recorded fingerprint adds *label honesty* (content matches its claimed version) on top of that. +- The skill record file and the per-skill manifest are separate concerns from the origin config of the first slice; this feature reads/writes the record and reads the manifest, and does not require an origin to be configured. + +**Dependencies**: + +- Builds on the first slice (`001-init-origin-resolution`) for the baseline command experience (help, errors-as-navigation, two-level output, exit-code discipline) and the project's config/skills directory conventions. 
+- Conventions are governed by `docs/design/cli.md` (Verification Gate and Vendor Mutation command patterns, Exit Codes) and `docs/ARCHITECTURE-v0.md` (§2, §4, §8, §9b), which were updated on this branch to attribute prerequisite checking to the later health command rather than to verification. The detailed technical decisions live in [spec-tech-spike.md](./spec-tech-spike.md). + +### Previous work + +### Epic: 001-init-origin-resolution - CLI Initialization & Origin Resolution (closed) + +- **Origin resolution + `skillrig init`**: established the single origin resolver and the baseline command experience (help, errors-as-navigation, two-level output, exit codes 0/1) that this feature extends with the verification-failure class (exit 2). This is the first feature to exercise integrity verification; no prior `add`/`verify`/`skillcore` work exists (`sl issue list --all` shows only closed `001` items). From c82eedc6f9bc3bc5e4326336da85ba7ab4a5d7f3 Mon Sep 17 00:00:00 2001 From: Vincent De Smet Date: Sat, 30 May 2026 01:20:46 +0800 Subject: [PATCH 2/8] docs(002): integrate clarify decisions; resolve 21 review threads MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Clarify session 2026-05-30 reshaped scope from reviewer feedback (21 open comments, all resolved on the SpecLedger backend with reasons). Decisions: - add obtains skills from the RESOLVED origin (no --from/path arg) — the single-origin init contract holds; origin value may be a local checkout; tests run `init --origin ` then `add`. GitHub-only is the prod lean. - add detects + refuses divergent content (--force to override); NO 3-way merge. add != bump: re-vendoring the same version has no upstream-advance axis (base/theirs/ours) to merge — that belongs to bump. - verify conflict-marker detection deferred (marker'd content already fails the tree-SHA check; no producer until bump).
- multi-client symlinks deferred; add writes only canonical .agents/skills; agent-shell selection is a separate feature. - [[requires]] NOT mirrored into the lock — the vendored on-disk skill.toml is the single source of truth (diverges from arch 4.2; flagged). Corrections applied: byte-exact fingerprint (no whitespace leniency); verify aggregates ALL failing skills (never first-fail); add/verify require a git repo; record file named .skillrig/skills-lock.json in Key Entities. Spec re-validated against the quality checklist — all items still pass. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../checklists/requirements.md | 1 + .../002-skillcore-verify/spec-tech-spike.md | 22 ++++---- specledger/002-skillcore-verify/spec.md | 51 +++++++++++-------- 3 files changed, 43 insertions(+), 31 deletions(-) diff --git a/specledger/002-skillcore-verify/checklists/requirements.md b/specledger/002-skillcore-verify/checklists/requirements.md index 6ed93f8..70aef14 100644 --- a/specledger/002-skillcore-verify/checklists/requirements.md +++ b/specledger/002-skillcore-verify/checklists/requirements.md @@ -34,3 +34,4 @@ - Technical/implementation detail deliberately lives in [spec-tech-spike.md](../spec-tech-spike.md) (the planning input), keeping spec.md user-facing per the team's direction. The spike names the shared-core primitives, tree-SHA mechanics, lock schema, and exit-code mapping; spec.md refers to these only in user terms ("content fingerprint", "record file", "verification-failure status"). - Scope was reshaped during the 2026-05-29 clarification session (recorded in spec.md → Clarifications and the spike §2): prerequisite checking moved out of `verify` into a later `doctor` capability (exit 3 deferred), and `skillrig add` (local-path vendoring) was pulled in as a durable capability so the `add → verify` round-trip is the acceptance contract. `docs/ARCHITECTURE-v0.md` was corrected the same branch. 
- Items marked incomplete require spec updates before `/specledger.clarify` or `/specledger.plan`. All items pass. +- `/specledger.clarify` session 2026-05-30 resolved 21 reviewer comments and added 5 clarifications (add source = resolved origin / no `--from`; detect+refuse not 3-way-merge; conflict-marker detection deferred; multi-client symlinks deferred / canonical `.agents/skills` only; `[[requires]]` NOT mirrored into the lock). Byte-exact-fingerprint correction and verify-aggregates-all-failures wording applied. Spec re-validated — all items still pass. diff --git a/specledger/002-skillcore-verify/spec-tech-spike.md b/specledger/002-skillcore-verify/spec-tech-spike.md index 3706cc4..0f53a8a 100644 --- a/specledger/002-skillcore-verify/spec-tech-spike.md +++ b/specledger/002-skillcore-verify/spec-tech-spike.md @@ -13,8 +13,8 @@ Three deliverables, smallest coherent slice that makes the core promise *demonstrable end-to-end and offline*: -1. **`skillcore`** — the single shared primitive package. One implementation of: git **tree-SHA** computation, **`skill.toml`** manifest parse, and **`skills-lock.json`** read/write. Presentation-free (constitution: `internal/config`-style layering). Consumed by `add` and `verify` now; reusable by `bump`/`doctor`/`index` later **without a parallel copy** (AP-04). -2. **`skillrig add `** — vendor a skill from a **local sample-origin fixture** (offline; no network/git fetch yet) into the consumer repo's canonical skill location, and write/update the lock entry. This is the **producer** that gives `verify` something real to check; it is also the first real cut of the `add` verb (architecture §2, Vendor Mutation pattern). +1. **`skillcore`** — the single shared primitive package. One implementation of: git **tree-SHA** computation, **`skill.toml`** manifest parse, and **`skills-lock.json`** read/write. Presentation-free (same layering discipline as `internal/config`). 
**Package path is an open question — likely public (`skillcore/` or `pkg/skillcore`), not `internal/`** — driven by the SDK requirement; see §10. Consumed by `add` and `verify` now; reusable by `bump`/`doctor`/`index` later **without a parallel copy** (AP-04). +2. **`skillrig add `** — vendor a named skill from the repo's **configured origin** (resolved via the shared resolver; the origin may be a **local checkout** for this increment — offline, no network/git fetch yet) into the consumer repo's canonical skill location, and write/update the lock entry. There is **no** `--from`/path argument that bypasses the origin (clarified 2026-05-30 — that would break the single-origin contract from `init`); tests run `skillrig init --origin ` then `skillrig add `. This is the **producer** that gives `verify` something real to check; it is also the first real cut of the `add` verb (architecture §2, Vendor Mutation pattern). 3. **`skillrig verify`** — the offline, deterministic, read-only **integrity gate**: label-honesty (tree-SHA) + orphan (on-disk = locked). Exit `0/1/2`. **Why `add` is in scope** (course-correction 2026-05-29): without a producer, `verify` could only be tested against hand-authored locks, which can't anchor a *real* git tree-SHA (constitution III, ground-truth). `add` from a local fixture origin produces a genuine `commit`+`treeSha`, so the `add → verify` round-trip is the acceptance contract and the tree-SHA is never invented. @@ -40,8 +40,8 @@ Single implementation, presentation-free, the AP-04 hard boundary. - *Open (planning):* compute via shelling to `git` (already a dependency, on PATH per Makefile) vs. an in-process tree-object hash. Either is fine if it matches git's canonical tree SHA byte-for-byte; the choice is a plan.md call. Must be deterministic and offline. 
- *Open (planning):* hash the **working-tree content on disk** (catches uncommitted tampering) — this is the intended semantic, since the promise is "what your agent will run", not "what HEAD says". - **Must equal git's canonical tree-object SHA** — the same value a git origin (and GitHub's Trees API, per §12 `gh skill`) publishes for that subtree. A git tree object hashes only *immediate entry names + modes + child SHAs*, so it is **relocation-invariant**: the subtree's SHA at the origin's `skills//` equals the vendored copy's at `.agents/skills//` **iff their contents match**. That invariance is precisely what makes offline label-honesty survive the origin→consumer relocation. -- **`ParseManifest(skill.toml) -> Manifest`** — parse `name, version, namespace, description, tags, [[requires]]` (architecture §4.1). In this slice, `verify` uses it to *recognize* a directory as a skill and `add` uses it to read `name`/`version`/`requires`; the `[[requires]]` data is **mirrored into the lock** but **not evaluated** (that's `doctor`). -- **`ReadLock` / `WriteLock` (`skills-lock.json`)** — typed lock I/O (architecture §4.2): `lockfileVersion, origin, skills{ name -> { version, commit, treeSha, path, requires[] } }`. Atomic write (temp + rename — open Q10). `WriteLock` used by `add`; `ReadLock` by `verify`. +- **`ParseManifest(skill.toml) -> Manifest`** — parse `name, version, namespace, description, tags, [[requires]]` (architecture §4.1). In this slice, `verify` uses it to *recognize* a directory as a skill and `add` uses it to read `name`/`version`. The `[[requires]]` data is **NOT mirrored into the lock** (clarified 2026-05-30): the full subtree — including `skill.toml` — is vendored on disk and fingerprint-attested, so the **vendored manifest is the single source of truth** for prerequisites; a later `doctor` walks it directly. Mirroring would only duplicate data that can drift. 
→ **diverges from architecture §4.2**, whose "mirror requires for offline prereq check (R16)" rationale assumed the manifest might *not* be on disk; in our vendored-in-git model it always is. (Flag for architecture reconciliation.) +- **`ReadLock` / `WriteLock` (`skills-lock.json`)** — typed lock I/O (architecture §4.2, **minus `requires`** per above): `lockfileVersion, origin, skills{ name -> { version, commit, treeSha, path } }`. Atomic write (temp + rename — open Q10). `WriteLock` used by `add`; `ReadLock` by `verify`. > All primitives are **path-in / data-out and never fetch** — they operate on a local filesystem working tree only. This is what makes the local-vs-network choice irrelevant to the core, and it is the SDK boundary (see §10). @@ -55,7 +55,7 @@ Single implementation, presentation-free, the AP-04 hard boundary. - **No content mutation (critical, contrast with `gh skill` §12):** `add` MUST vendor the skill subtree **byte-identical** to the source — it MUST NOT inject provenance into `SKILL.md` frontmatter the way `gh skill` does. Any injection would change the subtree's tree SHA and **immediately break label-honesty** at `verify` time (`verify` recomputes the *whole* subtree, so an added/modified file is a mismatch). All provenance lives **only** in the sidecar lockfile. This is *required* by the tree-SHA model, not a style choice. - **Acquisition layer (builds toward full `add`)**: the local source is a **git checkout** (it must be — see §7), i.e. the *offline analog of a git origin cloned locally*. Local-path `add` against it exercises the **exact same** `skillcore` + lock path the future remote git-origin `add` will, with only the *acquisition* step swapped (path → `git clone/fetch`). So this slice is a deliberate stepping-stone, not a throwaway. **Investigate HashiCorp `go-getter` (v2)** as the unifying acquisition layer for that step — see §11 OQ-3. 
- **Deferred**: network/git fetch from a remote origin; origin-reference resolution coupling (this mode takes a path directly); `@ref`/`--pin` immutable pins; multi-client symlink materialization (vendor the canonical copy only — symlink views are a later §6 concern). -- *Open (planning):* exact argument grammar (`add ` vs `add --from `); whether `add` requires an origin to be configured (lean: no — the local path *is* the source for this increment; the lock's `origin` field records provenance if available). +- **Origin-driven, not path-driven** (clarified 2026-05-30): `add` **requires a configured origin** and resolves it via the shared resolver — no `--from`/path argument (that would bypass the `init --origin` contract). The origin *value* may be a local checkout for this increment. Borrow the **source grammar** for the skill identity (``, optionally `OWNER/REPO//path@ref` later) from `gh skill` / go-getter (§12, OQ-3). *Open (planning):* exact grammar for naming the skill within the origin, and whether the origin reference itself must grow a local-path form or reuse `OWNER/REPO`. --- @@ -67,7 +67,7 @@ Two checks, both exit-2 class on failure: 1. **Label-honesty** — for each locked skill, recompute `skillcore.TreeSHA(path)` and compare to the lock's `treeSha`. Mismatch ⇒ fail, naming the skill + expected vs actual. 2. **Orphan / completeness** — the set of skill directories present under the canonical project skill location must **equal** the set of locked skills. An on-disk skill with no lock entry (**orphan** — the supply-chain vector, architecture §9b) ⇒ fail; a locked skill absent on disk (**missing**) ⇒ fail. - - Canonical location: `.agents/skills/` (architecture §6). A "skill on disk" = a directory containing a `skill.toml`/`SKILL.md`. **realpath-containment** (OpenClaw): symlink *views* (e.g. `.claude/skills/*`) must **not** be double-counted — only canonical-dir entries count. + - Canonical location: `.agents/skills/` (architecture §6). 
A "skill on disk" = a directory containing a `skill.toml`/`SKILL.md`. Since multi-client symlink views are **deferred** (clarified 2026-05-30 — `add` creates only `.agents/skills`), the orphan check **scans only the canonical location** and need not reason about views in this slice. (When views land, *realpath-containment* applies — resolve each candidate dir's real path and only count entries whose resolved path stays inside the canonical root, so a symlink view like `.claude/skills/foo → ../.agents/skills/foo` is not double-counted. Deferred with multi-client materialization.) **Exit codes (this slice):** @@ -78,7 +78,7 @@ Two checks, both exit-2 class on failure: | 2 | Verification failure: any label-honesty mismatch **or** any orphan/missing skill. | | 3 | **Not emitted by `verify`** — reserved for `doctor`'s prerequisite class. | -**Conflict markers** (cli.md lists as exit-2): **deferred** — `bump`'s 3-way merge is the only producer and is out of scope; the exit-2 slot is reserved, and `verify`'s failure taxonomy is documented to grow into it when `bump` lands. (YAGNI — no producer exists yet.) +**Conflict markers** (cli.md lists as exit-2): **deferred** (revalidated & confirmed 2026-05-30). Two reasons: (1) **correctness is already covered** — a vendored file containing `<<<<<<<`/`=======`/`>>>>>>>` differs from the locked clean content, so its recomputed tree-SHA won't match the lock and `verify` **already fails it as a label-honesty mismatch**; an explicit marker check would only *upgrade the error message*, not catch a new case. (2) **No producer exists** — the only thing that writes markers is `bump`'s 3-way merge, which is out of scope. The exit-2 slot is reserved; `verify`'s taxonomy grows the distinct "unresolved conflict markers" reason when `bump` lands. 
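
The two checks and the exit-code mapping described in this section can be sketched as a pure comparison over two maps. This is a minimal illustration, not the planned implementation: the `Verify` function, the `Verdict` type, and the status constants are hypothetical names, and the maps stand in for the result of reading the lock and recomputing the tree-SHA of each directory under the canonical location. Exit 1 (malformed lock, not a git repo) is deliberately left out because it is decided before any comparison happens.

```go
package main

import (
	"fmt"
	"sort"
)

// Per-skill statuses, named after the verdict values used in spec.md.
const (
	Matched  = "matched"
	Mismatch = "content-mismatch"
	Orphan   = "untracked"
	Missing  = "missing"
)

// Verdict is a hypothetical per-skill result shape for illustration only.
type Verdict struct {
	Name     string
	Expected string // treeSha recorded in the lock ("" for an orphan)
	Actual   string // treeSha recomputed on disk ("" for a missing skill)
	Status   string
}

// Verify compares locked fingerprints with recomputed on-disk fingerprints.
// It visits every skill known to either side, so the report aggregates all
// failures instead of stopping at the first one. The returned exit code is
// 0 when everything matches and 2 on any mismatch, orphan, or missing skill.
func Verify(locked, onDisk map[string]string) ([]Verdict, int) {
	seen := map[string]struct{}{}
	for name := range locked {
		seen[name] = struct{}{}
	}
	for name := range onDisk {
		seen[name] = struct{}{}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names) // stable order keeps the report deterministic

	exit := 0
	verdicts := make([]Verdict, 0, len(names))
	for _, name := range names {
		want, isLocked := locked[name]
		got, isPresent := onDisk[name]
		v := Verdict{Name: name, Expected: want, Actual: got}
		switch {
		case !isLocked:
			v.Status = Orphan
		case !isPresent:
			v.Status = Missing
		case want != got:
			v.Status = Mismatch
		default:
			v.Status = Matched
		}
		if v.Status != Matched {
			exit = 2
		}
		verdicts = append(verdicts, v)
	}
	return verdicts, exit
}

func main() {
	locked := map[string]string{"alpha": "aaa", "beta": "bbb", "gamma": "ccc"}
	onDisk := map[string]string{"alpha": "aaa", "beta": "xxx", "delta": "ddd"}
	verdicts, code := Verify(locked, onDisk)
	for _, v := range verdicts {
		fmt.Printf("%s: %s\n", v.Name, v.Status)
	}
	fmt.Println("exit", code)
}
```

With the sample data, `alpha` matches, `beta` is a content mismatch, `delta` is untracked, and `gamma` is missing, so the sketch reports all three failures and exit code 2. Two empty maps (no lock, no skills on disk) yield exit 0, which is the "empty lock" case.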
**Output** (cli.md two-level): human = compact summary (`N skills verified` / per-failure lines) + footer hint; `--json` = complete, structurally complete per-skill verdicts (`name, path, expectedTreeSha, actualTreeSha, status, orphan/missing`) + overall result. `--verbose` surfaces raw causes. Tests assert *shape* + exit code, not just `Contains` (constitution II). @@ -89,14 +89,16 @@ Two checks, both exit-2 class on failure: - **No lock file present** → treated as an **empty lock** (zero locked skills). With no on-disk skills ⇒ exit 0 (clean, idempotent no-op); with on-disk skills ⇒ all are orphans ⇒ exit 2. - **Malformed/unparseable lock** → exit 1 (usage/config), errors-as-navigation (what/why/fix), raw cause under `--verbose` — never a silent skip (a corrupt *gate* input must be loud, unlike a malformed *origin* config which is skipped in resolution). - **Skill dir with no manifest** → not counted as a skill (or flagged — planning decision; lean: ignore non-skill dirs, only `skill.toml`-bearing dirs count). -- **Symlink view dirs** (`.claude/skills/*` → canonical) → ignored via realpath-containment. +- **Symlink view dirs** → not applicable this slice: multi-client views are deferred (`add` creates only canonical `.agents/skills`), so the orphan check scans the canonical location only. (Realpath-containment handling arrives with multi-client materialization — see §5.) - **Not inside a git repo** → exit 1 (tree-SHA needs git's object model); message tells the user to run inside the vendored repo. --- ## 7. Ground-truth anchoring (constitution III) -- A **sample-origin fixture**: a real git repo under `test/` (or `testdata/`) containing ≥1 skill subtree with a `skill.toml`. Its real git tree-SHA + commit are the ground truth. +- A **sample-origin fixture**: a real git repo containing ≥1 skill subtree with a `skill.toml`. Its real git tree-SHA is the ground truth. 
**Setup-helper requirement:** the integration tests need helpers to (a) bootstrap the origin (git init + commit) and (b) lay down the origin-template filesystem from committed `testdata/`. *Open (planning) — two strategies, decide in plan.md:* + - **(i) Embed a real (bare?) git repo** committed inside the skillrig-cli repo → fixed, reproducible tree-SHA *and* commit, but a nested git repo is awkward to store/maintain. + - **(ii) Bootstrap in a tmpDir per test** from `testdata/` files + `git init`/commit in a helper → cleaner to maintain. **Determinism note:** the **tree-SHA is deterministic** either way (it depends only on content), but the **commit SHA is not** (it depends on author/timestamp) — so tests should assert the tree-SHA / fingerprint, and treat the recorded `commit` as present-and-well-formed rather than a fixed value, unless commit identity is pinned (e.g. fixed `GIT_AUTHOR_DATE`/committer env). - The `TestQuickstart_*` integration tests build the binary, run `add `, assert the lock records the **fixture's actual** `treeSha`/`commit`, then run `verify` and assert exit 0; then tamper one byte and assert exit 2 with the named skill; then introduce an orphan dir and a missing dir and assert exit 2 for each. - No tree-SHA is hand-written into a fixture — it is always computed by `skillcore` from real content, so SC "same primitive both sides" is genuinely exercised. @@ -157,7 +159,7 @@ Explored `/Users/vincentdesmet/specledger/specledger/pkg/cli/skills` — an exis **Assumptions + open questions (for plan.md / architecture):** - **A-1 (make explicit):** skillrig's integrity model **presumes a git-repo origin**. Internally consistent with §2c/§4.2 and our local-path slice (the fixture is a git checkout, so `TreeSHA` *and* the `commit` provenance read offline). State it; don't leave it implicit. 
- **OQ-1:** if skillrig ever consumes an **HTTP-registry origin** (skills.sh-style), the integrity primitive must change from "compare to the origin's git tree SHA" to "recompute a content digest from the fetched bytes" (on-disk self-consistency), **losing the origin-attested "modified-in-transit but mislabeled" check** unless the registry publishes a trusted digest. Decide: git-origin-only (current design), or also support registry origins? -- **OQ-2 (auth, future network `add`):** remote `add` from a **private** origin needs credentials — a GH **PAT** (`GITHUB_TOKEN`/`GH_TOKEN`, as the prior art reads) or an **SSH key** for git, or a **registry token** for an HTTP registry. Re-enters scope the moment network `add` lands; deferred today (this slice is local-path, no auth). Pairs with the doctor-side prerequisite **auth** check (R18) already moved out of `verify`. +- **OQ-2 (auth, future network `add`):** remote `add` from a **private** origin needs credentials — a GH **PAT** (`GITHUB_TOKEN`/`GH_TOKEN`, as the prior art reads) or an **SSH key** for git, or a **registry token** for an HTTP registry. Re-enters scope the moment network `add` lands; deferred today (this slice is local-origin, no auth). Pairs with the doctor-side prerequisite **auth** check (R18) already moved out of `verify`. **MVP lean (confirmed 2026-05-30):** production `add` is **GitHub-only** (no arbitrary git remotes) for quick delivery, and reusing **`gh` CLI's auth as a library** is a strong candidate for the token path — both are plan.md decisions. Local-origin consumption (this slice) needs none of it. - **OQ-3 (acquisition library):** evaluate **HashiCorp `go-getter`** (the fetch engine behind Terraform/Packer/Nomad) as the acquisition layer *above* `skillcore`. It unifies `file://` (our local-path slice), `git::ssh/https` with `?ref=` + `//subdir` (the git-origin future), and `http`/`s3`/`gcs` under one source grammar + detectors — a clean fit for "acquisition = the layer above the core." 
**Two things to verify before adopting:** 1. **Dependency footprint** vs the architecture's "minimal deps, static consume-only binary" stance — prefer `go-getter/v2` and trim unused getters (the s3/gcs detectors drag in cloud SDKs). A thin `git` wrapper (we already require `git` on PATH) may honor minimal-deps *better* **if** skillrig stays git-origin-only. 2. **Provenance capture** — go-getter is built to *get content*, not to preserve git identity; confirm we can still obtain the **resolved commit SHA** and compute the **git tree SHA** for the lock (e.g. capture the commit *before* go-getter copies the subdir / drops `.git`). If it can't, it doesn't serve our provenance need and a `git` wrapper wins. diff --git a/specledger/002-skillcore-verify/spec.md b/specledger/002-skillcore-verify/spec.md index cbe2e23..8b3f8d6 100644 --- a/specledger/002-skillcore-verify/spec.md +++ b/specledger/002-skillcore-verify/spec.md @@ -27,29 +27,37 @@ This is the second slice of the CLI. The first (`001-init-origin-resolution`) ma - Q: `verify` needs something to verify against, but the lock's writers weren't built yet — fixtures only, or a real producer? → A: A real producer. **`skillrig add` (vendoring from a local copy of the origin) is in scope**, so the `add → verify` round-trip is the acceptance contract and the recorded fingerprint is genuine, not hand-authored. `verify` itself remains read-only. - Q: Is vendoring-from-a-local-path a throwaway test affordance or a real capability? → A: A **durable capability** — consuming from a local checkout of the org library is a legitimate, kept use case. Fetching directly from a remote origin is a *later, additive* mode, not a replacement for it. +### Session 2026-05-30 + +- Q: How does `add` obtain its source — an explicit `--from`/path argument, or the configured origin? 
→ A: **The configured origin.** `add` resolves the active origin through the shared resolver (env > project config > global), exactly like every command — there is **no** separate source argument that bypasses it. For this feature the resolved origin may be a **local** source (a local checkout); tests run `skillrig init --origin ` first, then `skillrig add `. This keeps a single-origin contract (the earlier `--from` idea is dropped). Remote GitHub-hosted origins + auth are a later, additive mode (production lean: GitHub-only). +- Q: When `add` re-vendors a skill whose on-disk content diverges from the record (local edits), does it three-way-merge? → A: **No — it detects and refuses without `--force`.** A true three-way merge needs an *upstream-advanced* axis (base/theirs/ours); re-vendoring the **same** version has no such axis, so there is nothing to merge — that belongs to a later `bump`. `add` here only refuses to clobber divergent content (override with `--force`); `verify` independently flags the divergence as a label-honesty mismatch. +- Q: Should `verify` detect unresolved git conflict markers as a distinct failure now? → A: **Deferred.** A skill file containing conflict markers already fails label-honesty (its fingerprint won't match the record), so detection only upgrades the *error message*, not correctness; and the producer of such markers (`bump`'s merge) does not exist yet. The distinct check lands with `bump`. +- Q: Does this feature materialize multi-client symlink views (e.g. `.claude/skills → ../.agents/skills`) and agent-shell selection? → A: **No — canonical only.** `add` writes only the canonical `.agents/skills/`. Multi-client symlink views and the `init`-time agent-shell selection (stored in `.skillrig/config.toml`) are a separate, later feature. `verify`'s orphan check therefore scans only the canonical location. +- Q: Should the record (lock) mirror each skill's `[[requires]]` backing-CLI declarations? 
→ A: **No.** The full skill subtree — including its `skill.toml` manifest — is vendored on disk and fingerprint-attested, so the vendored manifest is the single source of truth for prerequisites; a later health command reads it directly. Mirroring into the record would only duplicate data that can drift. (This diverges from architecture §4.2's "mirror requires for offline prereq check" rationale, which assumed the manifest might not be on disk — see spike.) + ## User Scenarios & Testing *(mandatory)* ### User Story 1 - Vendor an approved skill into my repo (Priority: P1) -A developer (or an agent acting for them) has access to their org's skill library and wants one of its skills available in this repo. They run a single command naming the skill's location in a local copy of that library. The skill's files are placed into the repo's standard skills location, and its exact identity — which version, where it came from, and a tamper-evident fingerprint of its content — is recorded in a committed record file. The repo now carries both the skill and proof of what it is. +A developer (or an agent acting for them) has pointed this repo at their org's skill library with `init` (the origin may be a local checkout). They run a single command naming a skill, and the tool vendors it from the configured origin. The skill's files are placed into the repo's canonical skills location, and its exact identity — which version, where it came from, and a tamper-evident fingerprint of its content — is recorded in a committed record file. The repo now carries both the skill and proof of what it is. **Why this priority**: Nothing can be verified until something has been vendored and recorded. This is the producer half of the promise and the smallest standalone slice that delivers value: a repo gains an approved skill plus a durable record of its identity. Consuming from a local copy of the library is itself a real, supported workflow (offline, air-gapped, or pre-cloned origins). 
-**Independent Test**: Point the vendor command at a skill in a local sample library; confirm the skill's files land in the repo's skills location and the record file gains an entry naming the skill's version, source, and content fingerprint. No network is involved. +**Independent Test**: In a git repo whose origin is a local sample library, run the vendor command for a named skill; confirm the skill's files land in the canonical skills location (`.agents/skills/`) and the record file gains an entry naming the skill's version, source, and content fingerprint. No network is involved. **Acceptance Scenarios**: -1. **Given** a repo with no vendored skills and a local library containing a skill, **When** the developer vendors that skill, **Then** the skill's files appear under the repo's standard skills location and the record file contains one entry for it (version, source/provenance, and a content fingerprint), and the command reports success. +1. **Given** a git repo pointed at a skill library (its origin — which may be a local checkout) and no vendored skills, **When** the developer vendors a named skill, **Then** the skill's files appear under the canonical skills location (`.agents/skills/`) and the record file contains one entry for it (version, source/provenance, and a content fingerprint), and the command reports success. 2. **Given** a skill already vendored from a library, **When** the developer vendors the identical content again, **Then** the outcome is unchanged (idempotent) and the command reports success — no duplicate or corrupted record. 3. **Given** the developer requests machine-readable output, **When** the vendor command succeeds, **Then** the tool emits structured output naming the skill, its recorded version, and where it was placed, and that output is complete and parseable. -4. 
**Given** a skill is already vendored and the developer has locally changed its files, **When** they vendor again with content that differs from the recorded fingerprint, **Then** the tool refuses to silently overwrite the divergent content and requires an explicit override, so local edits are never lost without intent. +4. **Given** a skill is already vendored and the developer has locally changed its files, **When** they vendor the same version again, **Then** the tool detects the divergence from the recorded fingerprint, refuses to silently overwrite it, and requires an explicit override (`--force`) — local edits are never lost without intent. (Merging local edits with an *upstream advance* is a later `bump` concern; re-vendoring the same version has no upstream change to merge.) 5. **Given** the developer wants to preview only, **When** they run the vendor command in dry-run mode, **Then** the tool reports what it *would* place and record, and writes nothing. --- ### User Story 2 - Prove a vendored skill is exactly what was approved (Priority: P1) -A reviewer, a CI job, or an agent needs assurance that the skills checked into a repo are exactly the approved versions and have not been altered to claim a version they aren't. They run a single verification command. It recomputes each vendored skill's content fingerprint and compares it to the recorded value. If every skill matches, the command passes. If any skill's content diverges from what its record claims, the command fails and names the offending skill and the discrepancy. +A reviewer, a CI job, or an agent needs assurance that the skills checked into a repo are exactly the approved versions and have not been altered to claim a version they aren't. They run a single verification command. It recomputes each vendored skill's content fingerprint and compares it to the recorded value. If every skill matches, the command passes. 
If any skills' content diverges from what their record claims, the command fails and produces a full report naming **every** offending skill and its discrepancy — it never stops at the first failure. **Why this priority**: This is the core product promise made checkable. It is the reason the feature exists: a long skill file can hide a change no human reviewer would catch by eye, and this turns "is this really the approved version?" into a deterministic pass/fail. It builds on US1 (something must be vendored and recorded first), so the two together form the minimum viable round-trip. @@ -58,7 +66,7 @@ A reviewer, a CI job, or an agent needs assurance that the skills checked into a **Acceptance Scenarios**: 1. **Given** a repo whose vendored skills all match their recorded fingerprints, **When** verify runs, **Then** it reports success with a success exit status and a summary of how many skills were verified. -2. **Given** a vendored skill whose content has been modified so it no longer matches its recorded fingerprint, **When** verify runs, **Then** it exits with the verification-failure status and names the skill plus the recorded-vs-actual discrepancy. +2. **Given** one or more vendored skills whose content has been modified so they no longer match their recorded fingerprint, **When** verify runs, **Then** it exits with the verification-failure status and names **every** such skill, each with its recorded-vs-actual discrepancy, in a single aggregated report (it does not exit on the first failure). 3. **Given** the verification runs entirely on the committed files with no network or external service, **When** it is run repeatedly with unchanged inputs, **Then** it returns the same result every time (deterministic, offline). 4. **Given** a repo with no vendored skills and no record file, **When** verify runs, **Then** it reports success (nothing to verify) rather than an error. 
@@ -104,10 +112,10 @@ An automated caller — a CI merge gate or an agent deciding its next step — n - **Empty repo / nothing vendored**: verify is a success (nothing to check), not an error. - **Local edits then re-vendor**: re-vendoring content that diverges from the recorded fingerprint requires an explicit override; it never silently discards local edits. - **Re-vendor identical content**: produces no spurious change and reports success (idempotent). -- **Per-client view directories**: alternate directory entries that point at the same canonical skill content are not counted as separate or untracked skills. +- **Per-client view directories**: this feature does not create per-client symlink views (deferred — see Out of Scope), so the orphan/completeness check scans only the canonical skills location (`.agents/skills`). Robust handling of any manually-created view directories lands with multi-client materialization. - **Malformed record file**: surfaced as a usage/config error with a clear message, distinct from a content-verification failure. -- **Not inside a version-controlled repo**: the fingerprint relies on the repo's version-control content model, so running outside one is a usage/config error that says so. -- **Whitespace/formatting in the record**: tolerated on read; the fingerprint comparison is on content, not formatting. +- **Not inside a git repo**: both `add` and `verify` require a git repository — the canonical skills location and the content fingerprint both derive from the repo's git content model — so running outside one is a usage/config error that says so. +- **Byte-exact fingerprint (no formatting tolerance on skill content)**: the content fingerprint is byte-exact; **any** change to a vendored skill file — including whitespace or line-ending changes — produces a different fingerprint and is a mismatch (there is no git-style "ignore whitespace" leniency). 
Only the *record file's own* incidental formatting is tolerated when reading it back; that has no bearing on the skill-content fingerprint. ## Requirements *(mandatory)* @@ -115,20 +123,20 @@ An automated caller — a CI merge gate or an agent deciding its next step — n **Vendoring a skill (`add`)** -- **FR-001**: The system MUST provide a command that vendors a skill from a local copy of the org's library into the repo's standard skills location and records its identity. +- **FR-001**: The system MUST provide a command that vendors a named skill from the repo's **configured origin** (resolved via the shared origin resolver — there is no separate source argument that bypasses it; the origin may be a local checkout) into the canonical skills location (`.agents/skills/`) and records its identity. Both this command and verification MUST run inside a git repository. - **FR-002**: For each vendored skill, the system MUST record its version, its provenance (where it came from), and a content fingerprint that uniquely reflects the skill's content. - **FR-003**: The vendor command MUST be idempotent: re-vendoring identical content leaves an equivalent result and reports success without error. -- **FR-004**: The vendor command MUST NOT silently overwrite vendored content that diverges from the recorded fingerprint; it MUST require an explicit override so local modifications are never lost without intent. +- **FR-004**: The vendor command MUST NOT silently overwrite vendored content that diverges from the recorded fingerprint; it MUST detect the divergence and require an explicit override (`--force`) so local modifications are never lost without intent. It MUST NOT attempt a three-way merge — re-vendoring the same version has no upstream-advance axis to merge; that is a later `bump` concern. - **FR-005**: The vendor command MUST support a preview (dry-run) mode that reports the intended placement and record changes without writing anything. 
- **FR-006**: The vendor command MUST create any missing skills location and record file on first use. -- **FR-007**: The vendor command MUST operate offline against the supplied local source; it MUST NOT require network access in this feature. (Fetching from a remote origin is a later, additive mode — see Out of Scope.) +- **FR-007**: The vendor command MUST operate offline when the resolved origin is a local source; it MUST NOT require network access in this feature. (Fetching from a remote GitHub-hosted origin, and the credential/auth handling that needs, is a later, additive mode — see Out of Scope.) **Verifying vendored skills (`verify`)** - **FR-008**: The system MUST provide a verification command that checks the repo's vendored skills against their recorded identities, entirely offline and deterministically (same inputs always yield the same result; no network or external/live signal). - **FR-009**: Verification MUST recompute each recorded skill's content fingerprint from its current on-disk content and compare it to the recorded value, failing when they differ (label honesty). - **FR-010**: Verification MUST compare the set of skills present on disk against the set of recorded skills, failing when a skill is present but unrecorded (untracked/orphan) or recorded but absent (missing) — covering the whole set, not only recorded entries. -- **FR-011**: Verification MUST NOT count per-client compatibility views (alternate directory entries pointing at the same canonical skill content) as separate or untracked skills. +- **FR-011**: Verification's orphan/completeness check MUST scan the canonical skills location (`.agents/skills`). This feature does not create per-client symlink views; robust handling of such views is deferred together with multi-client materialization (see Out of Scope). - **FR-012**: Verification MUST report *all* detected problems in a run (e.g. both a content mismatch and an untracked skill), not stop at the first. 
- **FR-013**: Verification MUST treat an empty repo / absent record as success (nothing to verify), and MUST treat an unreadable or malformed record as a usage/config error distinct from a verification failure. - **FR-014**: Verification MUST NOT perform any backing-tool prerequisite or eligibility check; that is explicitly out of scope for this feature (reserved for a later health command). A missing backing tool MUST NOT cause a verification failure. @@ -150,8 +158,8 @@ An automated caller — a CI merge gate or an agent deciding its next step — n ### Key Entities *(include if feature involves data)* - **Skill**: a unit of agent capability — a directory of files (including a machine-readable manifest declaring its name, version, and any backing-tool prerequisites) vendored into the repo. The thing that is vendored, recorded, and verified. -- **Skill manifest**: the per-skill machine-readable description (name, version, namespace, description, discovery tags, declared backing-tool prerequisites). Read at vendor time; its prerequisite declarations are recorded but not evaluated in this feature. -- **Skill record (lock)**: the committed file mapping each vendored skill to its recorded version, provenance, content fingerprint, and location. Written by vendoring, read by verification; the source of truth for "what was approved." +- **Skill manifest**: the per-skill machine-readable description (name, version, namespace, description, discovery tags, declared backing-tool prerequisites) — vendored on disk as part of the skill subtree. Read at vendor time for identity. Its prerequisite declarations are **neither copied into the record nor evaluated** in this feature; the vendored manifest itself is the single source of truth for them (a later health command reads it directly). +- **Skill record (lock)**: the committed file (`.skillrig/skills-lock.json`) mapping each vendored skill to its recorded version, provenance (origin + commit), content fingerprint, and location. 
It does **not** duplicate the skill's backing-tool prerequisites (those live in the vendored manifest). Written by vendoring, read by verification; the source of truth for "what was approved." - **Content fingerprint**: a value that uniquely reflects a skill's content as published for a given version. Used for *label honesty* — confirming on-disk content matches the version it claims to be. Computed identically at vendor time and verify time. - **Verification verdict**: the outcome of verification — overall pass/fail plus a per-skill result (matched / content-mismatch / untracked / missing), surfaced both compactly for humans and completely for machines. @@ -160,10 +168,10 @@ An automated caller — a CI merge gate or an agent deciding its next step — n The following are explicitly **not** part of this feature and MUST NOT be pulled in: - **Backing-tool prerequisite / eligibility checking** (is a skill's required CLI present, the right version, authenticable) and its dedicated exit status — reserved for a later health command. `verify` here is integrity-only. -- **Fetching skills from a remote origin** (network/git fetch), origin-resolution-driven vendoring, and immutable version pins — this feature vendors from a *local* copy of the library only. -- **Upgrade proposal** (detecting upstream advances, three-way-merge of local edits, conflict-marker handling, opening PRs) — a later command; this feature has no producer of merge conflicts. +- **Fetching from a remote origin** (network/git fetch) and the credential/auth it needs, plus immutable version pins — this feature consumes a **local** origin (a local checkout); remote GitHub-hosted origins + auth are a later, additive mode (production lean: GitHub-only). +- **Upgrade proposal & three-way merge** (`bump`): detecting upstream advances and merging them with local edits (base/theirs/ours), plus the conflict-marker handling that merge produces. 
Re-vendoring the *same* version has no upstream-advance axis, so this feature only **detects and refuses** divergence (`--force` to override); the merge and its conflict markers have no producer here. - **Discovery** (`index.json`, search) and any browse UI. -- **Multi-client materialization** (creating per-client view directories) — verification must merely not miscount existing views; creating them is out of scope. +- **Multi-client symlink materialization & agent-shell selection** — creating per-client view directories (e.g. `.claude/skills → ../.agents/skills`) and the `init`-time agent-shell selection stored in `.skillrig/config.toml`. `add` writes only the canonical `.agents/skills`; verification scans only that location. A separate, later feature. - **External-source allowlists, audit classification, and risk/vulnerability surfacing** — later governance work. - **Any authentication or credential handling.** @@ -173,7 +181,7 @@ The following are explicitly **not** part of this feature and MUST NOT be pulled - **SC-001**: A user can vendor a skill from a local library and verify it in two commands, with zero network access and no hand-authored records. - **SC-002**: When a vendored skill's content matches its record, verification passes (success status); when any skill's content diverges from its record, verification fails with the verification-failure status — correct for 100% of label-honesty cases. -- **SC-003**: A single altered byte in any vendored skill file is detected as a content mismatch — 0 false negatives. +- **SC-003**: A single altered byte in any vendored skill file is detected as a content mismatch — 0 false negatives; and when multiple skills are altered, all are reported in one run (the check never exits on the first failure). - **SC-004**: Any on-disk skill with no record entry (untracked) and any recorded skill absent on disk (missing) are both detected and fail the gate. 
- **SC-005**: The content fingerprint computed at verify time is identical to the value recorded at vendor time for unmodified content — verified by vendoring real content and re-checking it, never by a hand-written value. - **SC-006**: A missing backing tool never causes a verification failure (verification is integrity-only). @@ -192,9 +200,10 @@ The following are explicitly **not** part of this feature and MUST NOT be pulled **Assumptions**: -- A local copy of the org's skill library is available on disk to vendor from; this feature does not fetch it. (Consuming from a local checkout is a supported, durable workflow; remote fetch is additive future work.) +- The repo is pointed at its origin via `init` (feature 001); for this feature the resolved origin is a **local** source (a local checkout). `add` consumes the *resolved* origin — there is no separate source argument that bypasses it. Remote fetch is additive future work. +- Both `add` and `verify` require being run inside a **git repository**: the canonical skills location and the content fingerprint both derive from the repo's git content model, so running outside one is a usage/config error. - Skills are vendored into the repo under version control, so the repo's own content model carries file integrity; the recorded fingerprint adds *label honesty* (content matches its claimed version) on top of that. -- The skill record file and the per-skill manifest are separate concerns from the origin config of the first slice; this feature reads/writes the record and reads the manifest, and does not require an origin to be configured. +- `add` requires the origin to be configured (it resolves it to know what to vendor); `verify` does **not** (it reads the committed record and on-disk content). The skill record file and the per-skill manifest are separate concerns from the origin config of the first slice. 
**Dependencies**: From 529d4791266a17b4d35d7673ca89f7ad7b19d62e Mon Sep 17 00:00:00 2001 From: Vincent De Smet Date: Sat, 30 May 2026 08:16:24 +0800 Subject: [PATCH 3/8] docs(002): implementation plan, research, contracts, quickstart MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Plan phase for 002 (skillcore + add + verify). Key decisions (research.md D1-D13), grounded in prior-art study (gh-cli/git, gh skill, skills.sh): - Tree-SHA by SHELLING git (`git rev-parse :`), not in-process — gh-cli reimplements nothing; add records it from the origin, verify recomputes on the consumer's committed tree, both git-canonical so they match by construction (AP-04 hardened). Relocation-invariance confirmed on real git. - verify hashes the COMMITTED tree + flags dirty separately (read-only); refines the spike's working-tree intent. - skillcore is a PUBLIC pkg/skillcore package (SDK-1), presentation-free, never fetches; CLI resolves the origin and passes a local path down (SDK boundary). - lock omits [[requires]] (the vendored manifest is the source of truth). - Test-oracle independence (D11): integration tests use raw git, never skillcore (Constitution III, no circular validation). - Fixture mirrors a CANONICAL design-aligned origin (D12); the existing skillrig-origin repo is a pre-design sample, reconciled via recommendations. - Large-monorepo perf (D13): local origin = no clone; future remote-add uses partial-clone + sparse-checkout with ZERO change to the tree-SHA primitive. Artifacts: plan.md, research.md, data-model.md (real git-tree-SHA ground truth), contracts/{add,verify,skillcore-sdk}.md, quickstart.md (~16 TestQuickstart_*). CLAUDE.md: sl-context-update tech section + the PRE-RELEASE marker reword. Constitution Check I-IX all pass; no complexity violations. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- CLAUDE.md | 13 +- .../002-skillcore-verify/contracts/add.md | 77 +++++++++ .../contracts/skillcore-sdk.md | 86 ++++++++++ .../002-skillcore-verify/contracts/verify.md | 89 ++++++++++ specledger/002-skillcore-verify/data-model.md | 158 ++++++++++++++++++ specledger/002-skillcore-verify/plan.md | 115 +++++++++++++ specledger/002-skillcore-verify/quickstart.md | 115 +++++++++++++ specledger/002-skillcore-verify/research.md | 123 ++++++++++++++ .../002-skillcore-verify/spec-tech-spike.md | 4 +- 9 files changed, 772 insertions(+), 8 deletions(-) create mode 100644 specledger/002-skillcore-verify/contracts/add.md create mode 100644 specledger/002-skillcore-verify/contracts/skillcore-sdk.md create mode 100644 specledger/002-skillcore-verify/contracts/verify.md create mode 100644 specledger/002-skillcore-verify/data-model.md create mode 100644 specledger/002-skillcore-verify/plan.md create mode 100644 specledger/002-skillcore-verify/quickstart.md create mode 100644 specledger/002-skillcore-verify/research.md diff --git a/CLAUDE.md b/CLAUDE.md index 2d15a24..10c2b2d 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -2,7 +2,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. -> PRE-RELEASE MARKER: As long as this marker is present we NEVER PLAN BACKWARD COMPATBILITY. We are in rapid iteration and may make breaking changes to the CLI and/or skill contract at any time. ANY PLAN IGNORES BACKWARD COMPATIBILITY. +> PRE-RELEASE MARKER: As long as this marker is present we NEVER PLAN BACKWARD COMPATBILITY. We are in rapid iteration and may make breaking changes to the CLI and/or skill contract at any time. ANY PLANNING and DESIGN EFFORTS MUST IGNORE BACKWARD COMPATIBILITY. 
## What this is @@ -69,9 +69,10 @@ Features follow SpecLedger: **Specify → Clarify → Plan → Tasks → Review ## Active Technologies -- Go 1.24+ (toolchain in this environment is 1.24.4; 1.25 also fine) — single static binary; cross-OS/arch via goreleaser later, out of scope here -- Go standard `go test`. Two tiers — (a) in-process Cobra unit tests via `SetArgs`/`SetOut`/`SetErr` + table-driven resolver tests; (b) `TestQuickstart_*` integration tests that build and exec the real binary (Constitution II/III). -- Local files only — project `.skillrig/config.toml`, global `~/.config/skillrig/config.toml` (XDG-aware). No database, no network. -- `github.com/spf13/cobra` (command tree); `github.com/pelletier/go-toml/v2` (config read/write — see research.md). Dependencies kept minimal (consume-only -- static binary). +- Go 1.24+ (toolchain 1.24.4) — single static binary. +- Go standard `go test`, two tiers (Constitution II/III): (a) **unit** — table-driven `skillcore` tests + a **ground-truth** test asserting `skillcore.TreeSHA` equals real `git` tree output; (b) **integration** — `TestQuickstart_*` build + exec the real binary over a fixture origin bootstrapped in a tmpDir. **No network boundary this slice → no `httptest`/go-vcr** (that tier arrives with remote `add`). +- `github.com/pelletier/go-toml/v2` (config + `skill.toml` parse); lock uses stdlib `encoding/json`. **No new dependencies +- and no in-process hashing dependency** — the tree-SHA is obtained by *shelling `git`* (see Runtime dependency + research). `go-getter` is explicitly *not* adopted this slice (acquisition is a local origin; OQ-3 deferred). Deps kept minimal (consume-only static binary). +- existing only — `github.com/spf13/cobra` (command tree) +- local files only — vendored subtree under `.agents/skills//` (canonical, committed), `.skillrig/skills-lock.json` (committed, tool-written, atomic). `add` reads the resolved origin (a local path this slice). No database, no network. 
diff --git a/specledger/002-skillcore-verify/contracts/add.md b/specledger/002-skillcore-verify/contracts/add.md new file mode 100644 index 0000000..cc13eec --- /dev/null +++ b/specledger/002-skillcore-verify/contracts/add.md @@ -0,0 +1,77 @@ +# Contract: `skillrig add` + +**Pattern**: Vendor Mutation — [cli.md](../../../docs/design/cli.md) Pattern Classification. Writes the skill tree + lock entry via `skillcore` only; supports `--dry-run`; refuses to clobber divergent content without `--force`. +**Purpose**: Vendor a named skill from the repo's **configured origin** (a local checkout this slice) into the canonical `.agents/skills//`, recording its identity in `.skillrig/skills-lock.json`. Offline. Requires a git repository. + +## Synopsis + +``` +skillrig add [--dry-run] [--force] [--json] [--verbose] +``` + +## Flags & Args + +| | Type | Default | Meaning | +|---|---|---|---| +| `` | arg (`cobra.ExactArgs(1)`) | — | Skill name to vendor (its directory within the origin's `skills/`). | +| `--dry-run` | bool | false | Report what *would* be placed/recorded; write nothing. | +| `--force` | bool | false | Overwrite a vendored skill whose on-disk content diverges from the recorded fingerprint (otherwise refused). | +| `--json` | bool | false | Emit the complete `AddResult` on stdout instead of compact human text. | +| `--verbose` | bool | false | Print underlying paths / raw git cause behind summaries and errors. | + +> **Origin, not a path** (clarified 2026-05-30): there is **no** `--from`/path argument. `add` resolves the active origin through the shared resolver (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) exactly like every command; the origin *value* may be a local checkout this slice. Tests do `skillrig init --origin ` then `skillrig add `. 
+ +## Help (Progressive Discovery) + +``` +Examples: + # Vendor a skill from your configured origin into .agents/skills/ + skillrig add terraform-plan-review + + # Preview what would be vendored, writing nothing + skillrig add terraform-plan-review --dry-run +``` + +## Behavior + +1. **Resolve origin** (CLI layer, via `config.ResolveOrigin`). No origin in any source → usage error (exit 1, same shape as the resolver's "no origin configured"). The resolved origin (a local path this slice) + ref is handed to `skillcore.Add`. +2. **Locate the skill** in the origin at `skills//`; absent → usage error (exit 1). Read `skill.toml` for `name`/`version` (`skillcore.ParseManifest`). +3. **Fingerprint + provenance** from the origin (git-canonical, research D1): `treeSha = git -C rev-parse :skills/`; `commit = git -C rev-parse `. +4. **Placement guard**: if `.agents/skills/` already exists and its content diverges from the lock's `treeSha`, **refuse** without `--force` (exit 1, "use --force"); never silently clobber (FR-004). No three-way merge (that is `bump`). +5. **Vendor** (unless `--dry-run`): copy the subtree into `.agents/skills//` **byte-identical, preserving file modes** (the exec bit is part of the tree SHA); inject nothing. Idempotent if identical (`action=unchanged`). +6. **Write the lock** entry `{ version, commit, treeSha, path }` under `skills.` (atomic temp+rename; `requires` is **not** recorded — research D4). `--dry-run` writes nothing. +7. Emit result (see Output). The user then commits `.agents/skills/` + the lock (vendored-in-git); `verify` checks the committed tree. + +## Output + +**Human (default, stdout, compact — ≤2 lines incl. footer):** +``` +vendored terraform-plan-review@1.4.0 → .agents/skills/terraform-plan-review (treeSha c967789) +→ commit it, then run: skillrig verify +``` +(idempotent re-add prints `terraform-plan-review@1.4.0 already vendored (no change)`; `--dry-run` prefixes `would vendor …`.) 
+ +**`--json` (stdout, complete + parseable):** +```json +{ "ok": true, "name": "terraform-plan-review", "version": "1.4.0", + "path": ".agents/skills/terraform-plan-review", + "commit": "9f1a052e596d5d28f13838061a1ab93207ef6fc3", + "treeSha": "c967789527370d2e0fba03a92e70dffef6f3bf31", + "action": "vendored", "dryRun": false } +``` +Keys always present: `ok, name, version, path, commit, treeSha, action, dryRun`. `action ∈ {vendored, unchanged, overwritten}`. + +## Errors (stderr; prose what/why/fix; raw cause preserved) + +| Condition | Exit | Message shape | +|---|---|---| +| No origin configured | 1 | what: no origin configured; why: no `SKILLRIG_ORIGIN` / project / global origin; fix: `skillrig init --origin OWNER/REPO` or set `SKILLRIG_ORIGIN`. | +| Skill not found in origin | 1 | what: skill `` not found in origin; why: no `skills//` at `@`; fix: check the name / `skillrig search` (future). | +| Divergent content, no `--force` | 1 | what: refusing to overwrite ``; why: on-disk content diverges from the recorded fingerprint; fix: re-run with `--force`, or revert local edits. | +| Not inside a git repo | 1 | what: not a git repository; why: tree-SHA + provenance need git; fix: run inside the repo (or `git init`). | + +Exit `0` on success (incl. idempotent no-op and `--dry-run`). Code `2` is `verify`'s; `3` is reserved (`doctor`). + +## Test mapping (Constitution II) + +Each Output/Errors/Behavior row maps to a `TestQuickstart_Add*` scenario. Output-shape: human line-count bound; `--json` `json.Unmarshal` + all-keys-present + the `treeSha`/`commit` are the **fixture's real** values (ground truth, data-model.md); error asserts what/why/fix as distinct checks + exit code. 
diff --git a/specledger/002-skillcore-verify/contracts/skillcore-sdk.md b/specledger/002-skillcore-verify/contracts/skillcore-sdk.md new file mode 100644 index 0000000..b03f7c3 --- /dev/null +++ b/specledger/002-skillcore-verify/contracts/skillcore-sdk.md @@ -0,0 +1,86 @@ +# Contract: `pkg/skillcore` (public Go SDK — SDK-1) + +**Import**: `github.com/skillrig/cli/pkg/skillcore` +**Guarantee**: the **one** implementation (AP-04) of the integrity primitives — git tree-SHA, `skill.toml` parse, `skills-lock.json` I/O, and the `Add`/`Verify` operations. **Presentation-free** (returns typed values + typed errors; **never** writes to stdout/stderr or formats user-facing text — Constitution V). **Never fetches** (pure filesystem + local `git`; acquisition/auth are the caller's concern — the SDK boundary, spike §10). Consumed by the `skillrig` CLI and importable by third-party Go tools. + +> Signatures below are the *intended surface* (finalized in code; PRE-RELEASE → may churn freely). Names are illustrative; the contract is the behavior + the presentation-free/typed-error discipline. + +## Primitives + +```go +// TreeSHA returns the git tree-object SHA of relPath at ref within the git repo +// rooted at gitDir, by shelling `git -C gitDir rev-parse :` +// (research D1 — git-canonical, relocation-invariant). relPath must resolve to a +// directory (a skill subtree). Used by Add (on the origin, ref=resolved) and by +// Verify (on the consumer, ref="HEAD") — same function both sides. +func TreeSHA(gitDir, ref, relPath string) (string, error) + +// ParseManifest parses a skill.toml. Unknown keys are ignored (forward-compat). +func ParseManifest(path string) (Manifest, error) +type Manifest struct { Name, Version, Namespace, Description string; Tags []string; Requires []Require } +type Require struct { Tool, Version, Source, Manager string } // parsed; NOT written to the lock (D4) + +// Lock I/O — atomic write (temp+rename); deterministic serialization. No `requires`. 
+func ReadLock(repoRoot string) (LockFile, error) // absent file → zero LockFile, nil err +func WriteLock(repoRoot string, lf LockFile) error +type LockFile struct { LockfileVersion int; Origin string; Skills map[string]LockEntry } +type LockEntry struct { Version, Commit, TreeSha, Path string } +``` + +## Operations + +```go +// Add vendors one skill from an already-resolved LOCAL origin into repoRoot's +// canonical .agents/skills//, mode-preserving and byte-identical (no +// injection — D6), and writes/updates the lock. It does NOT resolve origins, +// read config, or fetch — the caller supplies opts.OriginDir (a local git +// checkout) + opts.Ref. Refuses divergent overwrite unless opts.Force; opts.DryRun +// writes nothing. Idempotent on identical content. +func Add(opts AddOptions) (AddResult, error) +type AddOptions struct { OriginDir, Ref, Skill, RepoRoot string; Force, DryRun bool } +type AddResult struct { Name, Version, Path, Commit, TreeSha string; Action Action; DryRun bool } +type Action string // "vendored" | "unchanged" | "overwritten" + +// Verify checks every vendored skill in repoRoot against the lock: label-honesty +// (recompute TreeSHA on HEAD), orphan/completeness (on-disk set = locked set), +// and dirty (uncommitted). Read-only; offline; deterministic; aggregates ALL +// findings. Returns a Report; returns a *VerifyFailure error when not ok so +// callers can branch, with the same Report attached. 
+func Verify(repoRoot string) (Report, error) +type Report struct { OK bool; Counts Counts; Verdicts []Verdict } +type Counts struct { Verified, Mismatch, Orphan, Missing, Dirty int } +type Verdict struct { Name, Path, Status, ExpectedTreeSha, ActualTreeSha, Reason string } +``` + +## Errors (typed, presentation-free) + +```go +type VerifyFailure struct { Report Report } // ≥1 non-ok verdict; CLI maps → exit 2 +func (e *VerifyFailure) Error() string // terse; CLI renders the Report richly +type GitError struct { ExitCode int; Stderr string } // git invocation failure (gh pattern) +``` +The CLI (`internal/cli`) maps `*VerifyFailure` → `ExitVerification(2)`, `GitError`/malformed-lock/not-a-repo → `*cli.UsageError` (1), and renders human/`--json`. The SDK itself prints nothing. + +## git client (testability — gh pattern, research D7) + +`skillcore` shells `git` through a small internal client with a **pluggable command constructor** (a `func(ctx, name string, args ...string) *exec.Cmd` field, default `exec.CommandContext`) and the `GitError` type, mirroring `gh/git`'s `Client`. Tests swap the constructor for a stub (unit, error paths) or run real `git` in a `t.TempDir()` (integration, ground truth). Output via injectable writers; never `os.Stdout` directly. + +## Example: third-party consumer (SDK-1) + +```go +import "github.com/skillrig/cli/pkg/skillcore" + +rep, err := skillcore.Verify(repoRoot) +if err != nil { + var vf *skillcore.VerifyFailure + if errors.As(err, &vf) { renderMyOwnWay(vf.Report); os.Exit(2) } // caller owns presentation + exit policy + log.Fatal(err) +} +fmt.Printf("%d skills verified\n", rep.Counts.Verified) +``` + +## Invariants (tested) + +- `TreeSHA` value `Add` records (on the origin) == value `Verify` recomputes (on the consumer's committed tree) for identical content — by construction (both `git rev-parse`); proven by the `add → verify` round-trip and the relocation-invariance ground truth (data-model.md). 
+- No exported function writes to stdout/stderr or returns pre-formatted user text (Constitution V) — enforced by review + the CLI being the only renderer. +- Requires a git repository at `gitDir`/`repoRoot`; otherwise returns a `GitError`/typed error (the CLI renders "not a git repository"). diff --git a/specledger/002-skillcore-verify/contracts/verify.md b/specledger/002-skillcore-verify/contracts/verify.md new file mode 100644 index 0000000..b435785 --- /dev/null +++ b/specledger/002-skillcore-verify/contracts/verify.md @@ -0,0 +1,89 @@ +# Contract: `skillrig verify` + +**Pattern**: Verification Gate — [cli.md](../../../docs/design/cli.md) Pattern Classification. MUST be offline + deterministic, exit-code driven, **no online/inferential signal** (AP-02). Read-only. +**Purpose**: Prove the repo's vendored skills are exactly what was recorded — **label-honesty** (tree-SHA) + **orphan/completeness** (on-disk set = locked set). Integrity only; **no** prerequisite check (that is `doctor`; exit `3` not emitted). Requires a git repository; needs no origin and no network. + +## Synopsis + +``` +skillrig verify [--json] [--verbose] +``` + +## Flags & Args + +`Args`: none (`cobra.NoArgs`) — verifies the whole repo. `--json` (complete `VerifyReport` on stdout), `--verbose` (raw git/path causes). No `--dry-run`/`--force` (read-only). + +## Help (Progressive Discovery) + +``` +Examples: + # Verify every vendored skill matches its recorded version (CI gate) + skillrig verify + + # Machine-readable per-skill verdicts for an agent / jq + skillrig verify --json +``` + +## Behavior (research D1/D2; aggregates ALL findings — never first-fail) + +1. **Read the lock** `.skillrig/skills-lock.json` (`skillcore.ReadLock`). Absent → treat as empty (zero locked skills). Unparseable / wrong `lockfileVersion` → usage error (exit 1), not a verification failure. +2. **Label-honesty**, per locked skill: `actual = git rev-parse HEAD:`; compare to the lock's `treeSha`. Differ → `mismatch`. 
Path absent from `HEAD` but present on disk (uncommitted) → `dirty`; path absent entirely → `missing`. +3. **Dirty check**: `git status --porcelain -- ` for each locked path; uncommitted modifications → `dirty` (distinct from `mismatch` — "commit / has local modifications"). +4. **Orphan/completeness**: enumerate `.agents/skills/*` dirs that contain `skill.toml`/`SKILL.md`; any with no lock entry → `orphan`. (Multi-client symlink views are not created this slice, so only the canonical location is scanned — spike §6.) +5. **Aggregate** every verdict into a `VerifyReport`; **do not stop at the first failure** (FR-012). `ok` iff all verdicts are `ok`. + +## Output + +**Human (default, stdout, compact — line count bounded ≤ findings + K, Constitution II):** + +Pass (2 lines): +``` +verified 3 skills ✓ +→ all match their recorded version +``` +Fail (one line per failing skill + summary + footer; bounded by # findings): +``` +verify FAILED: 2 of 3 skills + ✗ terraform-plan-review content mismatch (recorded c967789, on-disk a1b2c3d) + ✗ secret-scanner untracked (no lock entry) +→ inspect with: skillrig verify --json +``` + +**`--json` (stdout, complete + structurally complete — every checked skill):** +```json +{ "ok": false, + "counts": { "verified": 1, "mismatch": 1, "orphan": 1, "missing": 0, "dirty": 0 }, + "verdicts": [ + { "name": "terraform-plan-review", "path": ".agents/skills/terraform-plan-review", + "status": "mismatch", + "expectedTreeSha": "c967789527370d2e0fba03a92e70dffef6f3bf31", + "actualTreeSha": "a1b2c3d…", "reason": "content does not match recorded version" }, + { "name": "secret-scanner", "path": ".agents/skills/secret-scanner", + "status": "orphan", "expectedTreeSha": "", "actualTreeSha": "…", + "reason": "present on disk but not in the lock" }, + { "name": "pr-summary", "path": ".agents/skills/pr-summary", "status": "ok", + "expectedTreeSha": "…", "actualTreeSha": "…", "reason": "" } + ] } +``` +Keys always present: `ok, 
counts{verified,mismatch,orphan,missing,dirty}, verdicts[]`; each verdict carries `name, path, status, expectedTreeSha, actualTreeSha, reason`. `status ∈ {ok, mismatch, orphan, missing, dirty}`. + +## Exit codes (load-bearing) + +| Code | When | +|---|---| +| 0 | All verdicts `ok` (incl. the empty case: no skills, no orphans → clean pass). | +| 1 | Usage/config: malformed/unreadable lock, bad flags, **not inside a git repo**. | +| 2 | Verification failure: any `mismatch`, `orphan`, `missing`, or `dirty`. | +| 3 | **Never emitted** — reserved for `doctor`'s prerequisite class. A missing backing tool MUST NOT fail `verify` (FR-014). | + +## Errors (stderr; what/why/fix; raw cause under `--verbose`) + +| Condition | Exit | Message shape | +|---|---|---| +| Malformed / unreadable lock | 1 | what: cannot read `.skillrig/skills-lock.json`; why: ``; fix: check the file / re-vendor with `skillrig add`. | +| Not inside a git repo | 1 | what: not a git repository; why: tree-SHA recompute needs git; fix: run inside the repo. | +| (verification failures) | 2 | rendered as the per-skill report above (the report *is* the message). | + +## Test mapping (Constitution II) + +`TestQuickstart_Verify*`: clean pass; tamper one file → `mismatch` exit 2 (names it); add an unlocked dir → `orphan` exit 2; delete a locked dir → `missing` exit 2; **multiple failures in one run** → all reported (FR-012); empty repo → exit 0; malformed lock → exit 1; not-a-git-repo → exit 1. Output-shape: human line-count bound (≤ findings + K); `--json` `json.Unmarshal` + `counts`/`verdicts` structurally complete; the pass case's `treeSha` equals the fixture's **real** value (ground truth). 
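
The aggregation rule in Behavior step 5 (collect every verdict, `ok` only if all are `ok`) can be sketched as below. This is a minimal illustration, not the shipped code: the `Verdict`, `Counts`, and `Report` types are trimmed local stand-ins for the contract's JSON shapes, and `aggregate` is a hypothetical helper name.

```go
package main

import "fmt"

// Verdict, Counts and Report are trimmed, hypothetical stand-ins for the
// shapes described in this contract; the real types live in pkg/skillcore.
type Verdict struct {
	Name   string
	Status string // ok | mismatch | orphan | missing | dirty
}

type Counts struct {
	Verified, Mismatch, Orphan, Missing, Dirty int
}

type Report struct {
	OK       bool
	Counts   Counts
	Verdicts []Verdict
}

// aggregate folds every verdict into one report. It never returns early,
// so all failures are counted in a single pass (FR-012).
func aggregate(verdicts []Verdict) Report {
	rep := Report{OK: true, Verdicts: verdicts}
	for _, v := range verdicts {
		switch v.Status {
		case "ok":
			rep.Counts.Verified++
		case "mismatch":
			rep.Counts.Mismatch++
		case "orphan":
			rep.Counts.Orphan++
		case "missing":
			rep.Counts.Missing++
		case "dirty":
			rep.Counts.Dirty++
		}
		// Any status other than "ok" (including an unknown one) fails the run.
		if v.Status != "ok" {
			rep.OK = false
		}
	}
	return rep
}

func main() {
	rep := aggregate([]Verdict{
		{Name: "pr-summary", Status: "ok"},
		{Name: "terraform-plan-review", Status: "mismatch"},
		{Name: "secret-scanner", Status: "orphan"},
	})
	fmt.Printf("ok=%v counts=%+v\n", rep.OK, rep.Counts)
}
```

The empty input falls out naturally: with no verdicts the loop never runs and the report stays `ok`, which matches the exit `0` row for the empty case.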
diff --git a/specledger/002-skillcore-verify/data-model.md b/specledger/002-skillcore-verify/data-model.md new file mode 100644 index 0000000..a5809e2 --- /dev/null +++ b/specledger/002-skillcore-verify/data-model.md @@ -0,0 +1,158 @@ +# Data Model: `skillcore` + `add` + `verify` + +**Feature**: `002-skillcore-verify` | **Date**: 2026-05-30 +**Anchored to ground truth** (Constitution III): the SHAs below are **real `git` output** captured from a bootstrapped fixture (a `git init` + commit of the sample skill), not invented. The same procedure the tests use (D8). + +## Ground-truth sample (real, captured 2026-05-30) + +> **Representative, not canonical.** The sample skill's *content* (and therefore the exact SHA below) is illustrative — both this fixture skill and the existing `skillrig-origin` template are **pre-canonical samples** (research D12). What is anchored is the **mechanism**: these are genuine `git` tree/commit SHAs, and the tests recompute the expected value independently via raw `git` (research D11/D8), so the fixture content can change without touching the tests. 
+ +A skill `terraform-plan-review` (`SKILL.md` + `skill.toml`) committed at `skills/terraform-plan-review/` with pinned author/committer (`GIT_*_DATE=2026-01-01T00:00:00Z`, name `skillrig`, email `ci@skillrig.dev`): + +``` +$ git rev-parse HEAD +9f1a052e596d5d28f13838061a1ab93207ef6fc3 # commit (provenance) + +$ git rev-parse HEAD:skills/terraform-plan-review +c967789527370d2e0fba03a92e70dffef6f3bf31 # subtree tree-SHA (the fingerprint) + +$ git ls-tree HEAD:skills/terraform-plan-review +100644 blob 22de421b19fe58eeccfae1660dff0d139914e312 SKILL.md +100644 blob ec4b72549e3a28d59f7ec4e3ea29087b2ba5699f skill.toml +``` + +**Relocation-invariance — confirmed empirically**: copying that subtree into a *different* repo at `.agents/skills/terraform-plan-review` and recomputing yields the **identical** tree-SHA: + +``` +$ git rev-parse HEAD:.agents/skills/terraform-plan-review +c967789527370d2e0fba03a92e70dffef6f3bf31 # same as the origin's skills/… tree +``` + +This is the entire label-honesty mechanism: `add` records the origin's `c96778…`; after the consumer commits the vendored copy, `verify` recomputes `c96778…` and they match — both via `git rev-parse`, so equal by construction (research D1). The `commit` SHA is reproducible only because author/committer identity+date were pinned; with default identity it varies (tests assert it is present + 40-hex, or pin env — D8). + +## Entities + +### Manifest — `skill.toml` (parsed, read-only) + +The per-skill manifest, vendored on disk as part of the subtree. `skillcore.ParseManifest` reads it (go-toml/v2). + +| Field | Type | Notes | +|---|---|---| +| `name` | string | skill identity; SHOULD equal the directory name | +| `version` | string | recorded into the lock (e.g. 
`1.4.0`); not deep-validated this slice | +| `namespace` | string | reverse-DNS-ish; informational this slice | +| `description` | string | informational | +| `tags` | []string | discovery data (architecture §9); informational this slice | +| `requires` | []Require | `{ tool, version, source, manager }` — **parsed but NOT written to the lock** (D4); the on-disk manifest is the single source of truth, read later by `doctor` | + +`add` uses `name` + `version`; `verify` uses the presence of `skill.toml` (or `SKILL.md`) to *recognize* a directory as a skill (for orphan detection). Unknown keys ignored on read (forward-compat). + +### LockFile — `.skillrig/skills-lock.json` (committed; tool-written) + +| Field | Type | Notes | +|---|---|---| +| `lockfileVersion` | int | `1` for this slice | +| `origin` | string | the configured origin reference for provenance (`OWNER/REPO[@REF]`, or the local origin used this slice) | +| `skills` | map[string]LockEntry | keyed by skill name | + +**LockEntry** (note: **no `requires`** — D4): + +| Field | Type | Notes | +|---|---|---| +| `version` | string | from the manifest at vendor time | +| `commit` | string | 40-hex; the origin commit the skill was vendored from (provenance) | +| `treeSha` | string | 40-hex; git tree-object SHA of the skill subtree (label honesty) | +| `path` | string | repo-relative vendored location, e.g. `.agents/skills/terraform-plan-review` | + +**Serialization**: deterministic JSON — keys sorted (Go `encoding/json` sorts map keys), 2-space indent, trailing newline; **atomic write** (temp file in the same dir + rename, mirroring `internal/config.Save`) so the CI-bump-vs-human-edit race and partial writes are avoided (spike open Q10). Hand-editing is not intended (tool-written output). 
+ +Real example (from the ground-truth sample above): + +```json +{ + "lockfileVersion": 1, + "origin": "my-org/my-skills", + "skills": { + "terraform-plan-review": { + "version": "1.4.0", + "commit": "9f1a052e596d5d28f13838061a1ab93207ef6fc3", + "treeSha": "c967789527370d2e0fba03a92e70dffef6f3bf31", + "path": ".agents/skills/terraform-plan-review" + } + } +} +``` + +### TreeSHA (value) + +The git tree-object SHA (40-hex SHA-1) of a skill subtree, produced by `git rev-parse :` (research D1). Relocation-invariant (depends only on subtree contents). The single fingerprint primitive both `add` and `verify` use. + +### AddResult (returned by `skillcore.Add`, rendered by the CLI) + +| Field | Type | Notes | +|---|---|---| +| `name` | string | vendored skill | +| `version` | string | recorded version | +| `path` | string | where it was placed (`.agents/skills/`) | +| `commit` | string | recorded provenance commit | +| `treeSha` | string | recorded fingerprint | +| `dryRun` | bool | true when `--dry-run` (nothing written) | +| `action` | enum | `vendored` \| `unchanged` (idempotent re-add) \| `overwritten` (`--force` over divergent) | + +### VerifyReport + SkillVerdict (returned by `skillcore.Verify`, rendered by the CLI) + +**VerifyReport**: + +| Field | Type | Notes | +|---|---|---| +| `ok` | bool | true iff every verdict is `ok` | +| `verdicts` | []SkillVerdict | one per skill in the **union** of locked ∪ on-disk (so orphans + missing both appear) | +| `counts` | struct | `{ verified, mismatch, orphan, missing, dirty }` for the compact summary | + +**SkillVerdict**: + +| Field | Type | Notes | +|---|---|---| +| `name` | string | skill name | +| `path` | string | repo-relative path | +| `status` | enum | `ok` \| `mismatch` \| `orphan` \| `missing` \| `dirty` | +| `expectedTreeSha` | string | from the lock (empty for `orphan`) | +| `actualTreeSha` | string | recomputed (empty for `missing`) | +| `reason` | string | human-facing one-liner for non-`ok` (what/why/fix seed) 
| + +**Status semantics** (research D1/D2): +- `ok` — locked, present, committed, `actualTreeSha == expectedTreeSha`. +- `mismatch` — locked + committed, but `actualTreeSha != expectedTreeSha` (label-honesty failure → exit 2). +- `orphan` — on disk under `.agents/skills/` (has `skill.toml`/`SKILL.md`) but no lock entry (→ exit 2). +- `missing` — lock entry whose `path` is absent on disk / absent from `HEAD` (→ exit 2). +- `dirty` — locked + present but uncommitted/modified vs `HEAD` (working tree dirty for that path) → reported distinctly; **counts as a verification failure (exit 2)** but with a "commit it / it has local modifications" message rather than a tree-SHA mismatch. + +### Typed errors (presentation-free; `pkg/skillcore/errors.go`) + +| Type | Carries | CLI mapping | +|---|---|---| +| `VerifyFailure` | the `VerifyReport` (≥1 non-`ok` verdict) | → `ExitVerification` (2); CLI renders per-skill verdicts | +| `GitError` | `{ ExitCode, Stderr }` (gh pattern) | → wrapped as a `*cli.UsageError` (1) for env problems (e.g. not a git repo), with raw cause under `--verbose` | +| (malformed lock / not-a-git-repo) | path + cause | → `*cli.UsageError` (1) | + +`skillcore` returns these; **it never formats user-facing text** (Constitution V) — the CLI layer renders human/`--json` output and maps errors to exit codes (research D9). + +## Validation rules + +- `treeSha`, `commit`: 40-char lowercase hex (git SHA-1). +- `path`: repo-relative, under the canonical `.agents/skills/` root; the leaf SHOULD equal the skill `name`. +- `lockfileVersion`: exactly `1`; any other value → malformed-lock usage error (forward-compat guard). +- `origin`: the configured origin string (provenance only this slice; not re-validated by `verify`). +- A directory under `.agents/skills/` is a "skill on disk" iff it contains `skill.toml` or `SKILL.md` (non-skill dirs ignored — spike §6). 
+ +## State transitions (add) + +``` +absent ──add──▶ vendored(locked, on disk) # first add: write files + lock entry +vendored ──add(identical)──▶ unchanged # idempotent (action=unchanged) +vendored ──[local edit]──▶ divergent # tree-SHA != lock +divergent ──add──▶ refused (exit 1) unless --force # never silently clobber (FR-004) +divergent ──add --force──▶ overwritten(re-vendored) # explicit override +``` + +`verify` performs **no** state transitions (read-only). diff --git a/specledger/002-skillcore-verify/plan.md b/specledger/002-skillcore-verify/plan.md new file mode 100644 index 0000000..abdcf40 --- /dev/null +++ b/specledger/002-skillcore-verify/plan.md @@ -0,0 +1,115 @@ +# Implementation Plan: Vendor & Verify Skills (`add` + `verify`) + +**Branch**: `002-skillcore-verify` | **Date**: 2026-05-30 | **Spec**: [spec.md](./spec.md) +**Input**: Feature specification from `specledger/002-skillcore-verify/spec.md` · Technical companion: [spec-tech-spike.md](./spec-tech-spike.md) + +## Summary + +Deliver the second slice of `skillrig`: the shared integrity primitive **`skillcore`** (a public Go SDK package at `pkg/skillcore`) plus two consumer commands that make the product promise — *"the skill your agent runs is exactly the version that was reviewed and approved"* — demonstrable end-to-end and offline: + +- **`skillrig add `** (Vendor Mutation): vendors a named skill from the repo's **resolved origin** (which may be a *local checkout* this slice) into the canonical `.agents/skills//`, and writes its identity (version, commit, tree-SHA, path) to a committed `.skillrig/skills-lock.json`. +- **`skillrig verify`** (Verification Gate): offline, deterministic, read-only — recomputes each locked skill's git tree-SHA and compares to the lock (**label-honesty**), and checks the on-disk skill set equals the locked set (**orphan/completeness**). Exit `0`/`1`/`2`. 
+ +`skillcore` is the **one** implementation (AP-04) of git tree-SHA computation, `skill.toml` parse, and `skills-lock.json` I/O, consumed by both `add` and `verify` (and reusable by future `bump`/`doctor`). It is **presentation-free** and **never fetches** — pure filesystem core; acquisition + auth live above it (the SDK boundary, SDK-1). The `add → verify` round-trip is the acceptance contract; the tree-SHA is anchored to real `git` output, never invented (Constitution III). + +**Deferred (clarified 2026-05-29/30):** prerequisite/eligibility check + exit `3` → `doctor`; network/git **fetch** + auth in `add`; three-way merge + conflict-marker detection → `bump`; multi-client symlink views + agent-shell selection; `index.json`/search; allowlist/audit. See spike §9. + +## Technical Context + +**Language/Version**: Go 1.24+ (toolchain 1.24.4) — single static binary. +**Primary Dependencies**: existing only — `github.com/spf13/cobra` (command tree), `github.com/pelletier/go-toml/v2` (config + `skill.toml` parse); lock uses stdlib `encoding/json`. **No new dependencies, and no in-process hashing dependency** — the tree-SHA is obtained by *shelling `git`* (see Runtime dependency + research). `go-getter` is explicitly *not* adopted this slice (acquisition is a local origin; OQ-3 deferred). Deps kept minimal (consume-only static binary). +**Runtime dependency (required)**: **`git`** on `PATH` — `skillcore` shells `git` for **all** integrity plumbing (gh-cli's proven pattern — it reimplements nothing): the tree-SHA is `git rev-parse :` (git's own **canonical tree-object SHA**), commit provenance is `git rev-parse `, and an uncommitted/dirty vendored tree is detected with `git status --porcelain`. Because the value `add` records and the value `verify` recomputes are *both git's own output*, they match by construction — there is no second implementation to drift (AP-04 hardened), and no autocrlf/mode-bit/tree-sort reimplementation to get subtly wrong. 
`git` is already a project prerequisite (`init`), and tests use it to bootstrap fixtures (gh-cli's `initRepo` pattern). +**Storage**: local files only — vendored subtree under `.agents/skills//` (canonical, committed), `.skillrig/skills-lock.json` (committed, tool-written, atomic). `add` reads the resolved origin (a local path this slice). No database, no network. +**Testing**: Go standard `go test`, two tiers (Constitution II/III): (a) **unit** — table-driven `skillcore` tests + a **ground-truth** test asserting `skillcore.TreeSHA` equals real `git` tree output; (b) **integration** — `TestQuickstart_*` build + exec the real binary over a fixture origin bootstrapped in a tmpDir. **No network boundary this slice → no `httptest`/go-vcr** (that tier arrives with remote `add`). +**Target Platform**: macOS/Linux/Windows terminals, CI, agent runners. (Symlink/Windows concerns deferred with multi-client materialization.) +**Project Type**: single Go module (`github.com/skillrig/cli`). +**Performance Goals**: sub-100ms for `add`/`verify` on typical small skill trees (offline; soft target — cli.md records no per-command duration). +**Constraints**: offline; deterministic; `verify` is **read-only** (it only runs `git rev-parse`/`git status` — no object writes); `verify` checks the **committed** vendored tree (`git rev-parse HEAD:`) and flags an uncommitted/dirty vendored tree as a *distinct* finding; `add` vendors **byte-identical** preserving file modes (the exec bit is part of the tree SHA) and injects nothing; both `add` and `verify` require a **git repository**. Exit codes this slice: `0` ok, `1` usage/config, `2` verification failure. `3` (prerequisite) reserved for `doctor`. +**Scale/Scope**: small — one new package (`pkg/skillcore`), two commands, one exit-code-mapping extension, fixtures. ~Several hundred LOC. + +## Constitution Check + +*GATE: Must pass before Phase 0 research. 
Re-checked after Phase 1 design (below).* + +Verify compliance with `.specledger/memory/constitution.md` (v2.1.0): + +- [x] **I. Specification-First**: spec.md complete, clarified (2 sessions; 21 reviewer comments resolved), prioritized user stories P1–P3. +- [x] **II. Quickstart-as-Contract**: quickstart.md authored as executable scenarios mapping 1:1 to `TestQuickstart_*`; output-shape assertions — `verify` human output line-count bounded (`≤ skillCount + K`), `--json` parseable + structurally complete (per-skill verdict carries `name`/`path`/`expectedTreeSha`/`actualTreeSha`/`status`), error output asserts what/why/fix as distinct checks + exit code. +- [x] **III. Ground-Truth Anchoring**: the git-origin boundary's ground truth is a **real git tree-SHA** — and because `skillcore` *shells `git`* to obtain it, the recorded value **is** git's canonical tree SHA by construction (no reimplementation to validate against ground truth). Fixtures are bootstrapped via `git init` + commit from `testdata/` (gh-cli's `initRepo` pattern), never hand-authored; data-model.md captures a real recorded tree-SHA sample. The `add → verify` round-trip proves `add` records exactly what `verify` recomputes. No network boundary this slice, so no httptest/go-vcr (deferred with remote `add`). +- [x] **IV. Agent-First CLI Design**: `add` classified **Vendor Mutation** (`--dry-run`, `--force`, idempotent, writes lock via `skillcore` only); `verify` classified **Verification Gate** (offline, deterministic, exit-code-driven, no online/inferential signal — AP-02). Progressive `--help` with ≥2 examples each; errors-as-navigation (what/why/fix, raw cause under `--verbose`, stderr); two-level output (compact human + footer hint, complete `--json`). `skillcore` is the **one** integrity implementation (AP-04). +- [x] **V. 
Code Quality (Go)**: `gofmt` + `go vet` + `golangci-lint` gate; `pkg/skillcore` is **presentation-free** (returns typed structs + typed errors, no `fmt.Println` of user text); CLI layer renders. Execution/presentation separation preserved. +- [x] **VI. YAGNI**: no `requires` in the lock (manifest on disk is the source of truth); no symlink views; no conflict-marker detection; no network/auth; no `bump`/3-way-merge; no `go-getter`. +- [x] **VII. Shortest Path to MVP**: `skillcore` + `add` (local origin) + `verify` only — the minimum that makes the promise demonstrable. +- [x] **VIII. Simplicity Over Cleverness**: the tree-SHA is git's **own** output (shelled via a small testable client, gh-cli pattern) — no clever in-process re-hashing to get subtly wrong; plain structs + stdlib json/toml; no reflection. +- [x] **IX. Skill–CLI Co-Evolution**: an agent skill update (teaching `add` / `verify` usage, exit `0`/`2` meaning, and that a missing backing tool is **not** a verify failure) is a planned task with a trigger-accuracy eval. + +**Design-doc sync (Constitution / Architecture & CLI Design):** `docs/design/cli.md` MUST be updated **in this branch** (CLI behavior change): the `verify` command index line + Verification-Gate row to state `verify` is **integrity-only** (prerequisite/eligibility attributed to `doctor`), and the Execution-vs-Presentation "these are not separate packages" line is **reversed** for `skillcore` (it IS a separate importable package per SDK-1). `docs/ARCHITECTURE-v0.md` was already updated (spike §8). These are tracked as tasks, not plan-blocking. + +**Complexity Violations**: None. (`skillcore` living in public `pkg/` rather than `internal/` is **required by SDK-1**, not a complexity violation — recorded in research.md; it strengthens AP-04 rather than weakening any principle.) 
+ +## Project Structure + +### Documentation (this feature) + +```text +specledger/002-skillcore-verify/ +├── spec.md # user-facing spec (clarified) +├── spec-tech-spike.md # technical companion / decision log (§1–§12) +├── plan.md # This file +├── research.md # Phase 0 output — decisions + rationale + prior work +├── data-model.md # Phase 1 output — entities + a real git-tree-SHA ground-truth sample +├── quickstart.md # Phase 1 output — executable TestQuickstart_* scenarios +├── contracts/ # Phase 1 output +│ ├── add.md # `skillrig add` command surface (Vendor Mutation) +│ ├── verify.md # `skillrig verify` command surface (Verification Gate) +│ └── skillcore-sdk.md # public pkg/skillcore API surface (SDK-1) +├── checklists/requirements.md +└── tasks.md # Phase 2 output (/specledger.tasks — NOT created here) +``` + +### Source Code (repository root) + +Module path: `github.com/skillrig/cli`. + +```text +. +├── main.go # unchanged: os.Exit(cli.Execute()) +├── go.mod / go.sum # no new deps +├── .golangci.yml +├── pkg/ +│ └── skillcore/ # PUBLIC SDK package (SDK-1) — presentation-free, never fetches +│ ├── git.go # small testable git client (gh pattern): pluggable commandContext + GitError; revParse / status +│ ├── treesha.go # TreeSHA(gitDir, ref, relPath) — shells `git rev-parse :` (canonical) +│ ├── manifest.go # Manifest + ParseManifest(skill.toml) (go-toml/v2) +│ ├── lock.go # LockFile/LockEntry types + ReadLock/WriteLock (atomic, NO requires) +│ ├── add.go # Add(opts) (AddResult, error) — copy subtree (mode-preserving) + write lock +│ ├── verify.go # Verify(repoRoot) (Report, error) — label-honesty + orphan + dirty-flag, read-only +│ └── errors.go # typed errors (VerifyFailure, etc.) 
— no user-facing formatting +├── internal/ +│ ├── cli/ +│ │ ├── root.go # registerSubcommands: + newAddCmd, + newVerifyCmd +│ │ ├── add.go # wiring: ResolveOrigin → skillcore.Add → render AddResult +│ │ ├── verify.go # wiring: skillcore.Verify → render Report → exit code +│ │ ├── exit.go # exitCodeFor EXTENDED: skillcore.VerifyFailure → ExitVerification (2) +│ │ └── output.go # render AddResult / VerifyReport (human compact + footer; --json) +│ └── config/ # UNCHANGED — ResolveOrigin reused by add (CLI layer resolves, passes local path down) +└── test/ + ├── quickstart_test.go # TestQuickstart_Add* / _Verify* — build + exec the real binary + └── testdata/ + └── sample-origin/ # canonical sample origin: .skillrig-origin.toml + skills// (research D12); + # a helper git-inits + commits it into a tmpDir (raw git oracle, D11) +``` + +**Structure Decision**: `skillcore` lives in **public `pkg/skillcore`** (import `github.com/skillrig/cli/pkg/skillcore`) per SDK-1 — third-party Go tools can build their own `add`/`verify` on the same primitives, and the CLI imports exactly that package (so there is no parallel implementation, AP-04). It is **presentation-free and never fetches** (the SDK boundary): `skillcore.Add` takes an already-resolved local source path + destination repo root; **origin resolution stays in the CLI layer** (`internal/cli/add.go` calls the existing `config.ResolveOrigin`, then passes the resolved local path down), keeping `skillcore` free of origin/config/network concerns. `internal/config` is unchanged and reused. + +`skillcore` shells `git` through a **small testable client** (`git.go`) modeled on `gh`'s `git.Client` — a struct with a pluggable `commandContext` (function field, swappable in tests) and a `GitError{ExitCode, Stderr}` type. 
The **one** primitive `TreeSHA(gitDir, ref, relPath)` runs `git -C gitDir rev-parse :`; `add` calls it on the *origin* (`ref` = resolved ref), `verify` calls it on the *consumer* (`ref` = `HEAD`) — same function, both sides git-canonical (AP-04). `verify` needs no origin: `skillcore.Verify(repoRoot)` reads only the committed lock + the committed vendored tree + `git status` (read-only). The CLI's `exitCodeFor` is extended so a `skillcore` verification failure maps to exit `2` while everything else stays `1` (a typed-error switch). `main.go` is untouched. + +## Phase Breakdown (this command produces Phase 0 + Phase 1 artifacts) + +- **Phase 0 — research.md**: resolve the spike's plan-level open questions (tree-SHA mechanism, package path, fixture strategy, lock schema, origin-resolution reuse, go-getter deferral) with decisions + rationale + alternatives; summarize prior work. +- **Phase 1 — design**: `data-model.md` (entities + a real git-tree-SHA ground-truth sample), `contracts/{add,verify,skillcore-sdk}.md`, `quickstart.md` (executable scenarios), agent-context update. +- **Phase 2 — tasks.md**: produced by `/specledger.tasks` (NOT here). + +## Complexity Tracking + +> No constitutional violations to justify. (`pkg/skillcore` public placement is mandated by SDK-1 and recorded in research.md, not a violation.) Table intentionally empty. diff --git a/specledger/002-skillcore-verify/quickstart.md b/specledger/002-skillcore-verify/quickstart.md new file mode 100644 index 0000000..f19da65 --- /dev/null +++ b/specledger/002-skillcore-verify/quickstart.md @@ -0,0 +1,115 @@ +# Quickstart: `add` + `verify` (Executable Acceptance Contract) + +**Feature**: `002-skillcore-verify` | **Date**: 2026-05-30 +Per Constitution II, **each scenario below maps 1:1 to a `TestQuickstart_` integration test** that builds and execs the real `skillrig` binary. 
Every scenario states its **output-shape assertions** (not just `Contains`): human line-count bound, `--json` parseable + structurally complete, error = what/why/fix as distinct checks + the correct exit code. + +## Test harness & helpers + +- Build the binary once (existing 001 harness); exec it with a controlled cwd + env. +- **`bootstrapOrigin(t) (dir, ref)`** — `git init` a `t.TempDir()`, copy `test/testdata/sample-origin/**` in, `git add -A && git commit` with **pinned** `GIT_AUTHOR_*`/`GIT_COMMITTER_*` name=`skillrig` email=`ci@skillrig.dev` date=`2026-01-01T00:00:00Z` (so the commit SHA is reproducible — D8). Returns the origin dir + ref (`HEAD`/`main`). +- **`newConsumerRepo(t) dir`** — `git init` a `t.TempDir()`; run `skillrig init --origin ` in it (the origin value is the local path — clarified 2026-05-30). +- **`commitAll(t, dir, msg)`** — stage + commit (pinned identity) so `verify` sees committed content. +- **Ground truth & oracle independence** (research D11/D12): the fixture is a *canonical, design-aligned* sample origin (`test/testdata/sample-origin/` mirroring the origin layout — `.skillrig-origin.toml` + `skills//{SKILL.md,skill.toml}`); the sample skill (`terraform-plan-review@1.4.0`) and its tree-SHA are **illustrative real-`git` output, not a locked constant** (both the fixture and the existing `skillrig-origin` template are pre-canonical samples). Integration tests compute the **expected** tree-SHA with **raw `git`** (`git -C rev-parse :skills/`), **never** through `skillcore` — the binary under test uses `skillcore`, so the oracle must stay independent (Constitution III, no circular validation). A separate `skillcore` unit test pins `skillcore.TreeSHA == ` raw `git` output. + +--- + +## US1 — Vendor a skill (`add`) + +### TestQuickstart_AddVendorsSkill (US1.1) +- **Given** a consumer git repo whose origin is a local checkout containing `terraform-plan-review`. +- **When** `skillrig add terraform-plan-review`. 
+- **Then** exit `0`; `.agents/skills/terraform-plan-review/{SKILL.md,skill.toml}` exist **byte-identical** to the origin (modes preserved); `.skillrig/skills-lock.json` has one entry `{version:"1.4.0", commit, treeSha, path:".agents/skills/terraform-plan-review"}` with **`treeSha` == the value `git rev-parse` gives for the origin subtree** (ground truth) and **no `requires` field**. +- **Shape**: human ≤ 2 lines incl. footer (`→ commit it, then run: skillrig verify`). `--json`: `json.Unmarshal` ok; keys `ok,name,version,path,commit,treeSha,action,dryRun` all present; `action=="vendored"`. + +### TestQuickstart_AddIdempotent (US1.2) +- **Given** `terraform-plan-review` already vendored. +- **When** `skillrig add terraform-plan-review` again (identical content). +- **Then** exit `0`; lock unchanged (one entry, no dup); `--json action=="unchanged"`; human prints `… already vendored (no change)`. + +### TestQuickstart_AddDryRunWritesNothing (US1.5) +- **When** `skillrig add terraform-plan-review --dry-run` in a fresh consumer. +- **Then** exit `0`; **no** `.agents/skills/` and **no** `.skillrig/skills-lock.json` created; human prefixed `would vendor …`; `--json dryRun==true, action=="vendored"`. + +### TestQuickstart_AddRefusesDivergentWithoutForce (US1.4) +- **Given** `terraform-plan-review` vendored; a byte of its `SKILL.md` then edited. +- **When** `skillrig add terraform-plan-review` (no `--force`). +- **Then** exit `1`; **error has 3 parts** — what: `refusing to overwrite .agents/skills/terraform-plan-review`; why: `on-disk content diverges from the recorded fingerprint`; fix: `re-run with --force`. Files unchanged. +- **And** `skillrig add terraform-plan-review --force` → exit `0`, `action=="overwritten"`, content restored to origin. + +### TestQuickstart_AddRequiresOrigin +- **Given** a git repo with **no** origin (no `init`, no `SKILLRIG_ORIGIN`, no global). +- **When** `skillrig add terraform-plan-review`. 
+- **Then** exit `1`; 3-part error — what: `no origin configured`; why: `no SKILLRIG_ORIGIN / project / global origin`; fix: `skillrig init --origin OWNER/REPO`. + +### TestQuickstart_AddNotGitRepo +- **When** `skillrig add …` in a non-git tmpdir (origin via `SKILLRIG_ORIGIN`). +- **Then** exit `1`; what: `not a git repository`; why: `tree-SHA + provenance need git`; fix: `run inside the repo`. + +--- + +## US2 — Prove a skill is unmodified (`verify` label-honesty) + +### TestQuickstart_VerifyPasses (US2.1) +- **Given** `terraform-plan-review` vendored **and committed**. +- **When** `skillrig verify`. +- **Then** exit `0`; human exactly 2 lines (`verified 1 skills ✓` + `→ all match their recorded version`); `--json ok==true`, `counts.verified==1`, one verdict `status=="ok"` whose `expectedTreeSha==actualTreeSha==` the ground-truth tree-SHA. + +### TestQuickstart_VerifyDetectsTamper (US2.2, SC-003) +- **Given** the skill vendored + committed; then one byte of `SKILL.md` changed **and committed**. +- **When** `skillrig verify`. +- **Then** exit `2`; the failing verdict `status=="mismatch"` **names** `terraform-plan-review` with `expectedTreeSha` (recorded) ≠ `actualTreeSha` (on-disk). Human ≤ findings + K lines. + +### TestQuickstart_VerifyDirtyUncommitted (D2) +- **Given** the skill vendored but **not committed** (or committed then edited-without-commit). +- **When** `skillrig verify`. +- **Then** exit `2`; verdict `status=="dirty"`, reason names the uncommitted/locally-modified skill and says to commit it — *distinct* from `mismatch`. + +### TestQuickstart_VerifyEmptyRepoPasses (US2.4) +- **Given** a fresh git repo, no skills, no lock. +- **When** `skillrig verify`. +- **Then** exit `0` (nothing to verify), not an error; `--json ok==true, counts all zero, verdicts==[]`. 
+ +--- + +## US3 — Orphan / completeness (`verify`) + +### TestQuickstart_VerifyDetectsOrphan (US3.1) +- **Given** `terraform-plan-review` vendored + committed; plus an **unlocked** `.agents/skills/rogue/skill.toml` created + committed (no `add`). +- **When** `skillrig verify`. +- **Then** exit `2`; a verdict `status=="orphan"` naming `rogue` (present on disk, no lock entry). + +### TestQuickstart_VerifyDetectsMissing (US3.2) +- **Given** the skill vendored + committed; then `.agents/skills/terraform-plan-review/` removed + committed (lock still references it). +- **When** `skillrig verify`. +- **Then** exit `2`; verdict `status=="missing"` naming `terraform-plan-review`. + +### TestQuickstart_VerifyAggregatesAllFailures (US3.4, FR-012) +- **Given** one skill tampered **and** one orphan present (committed). +- **When** `skillrig verify`. +- **Then** exit `2`; **both** reported in one run — `counts.mismatch>=1 && counts.orphan>=1`, `len(verdicts)` covers all skills; the check did **not** stop at the first failure. + +--- + +## US4 — Scriptable outcome (exit codes + `--json`) + +### TestQuickstart_VerifyExitCodeMatrix (US4.1, FR-022) +- Assert the stable mapping over the scenarios above: pass→`0`, any verification failure→`2`, malformed-lock/not-a-repo→`1`, and **never `3`**. Repeated runs on unchanged input yield the identical code (deterministic). + +### TestQuickstart_VerifyJSONComplete (US4.2) +- For both a passing and a failing run: `--json` on stdout is `json.Unmarshal`-able and **structurally complete** — top-level `ok,counts,verdicts`; `counts` has all five keys; **every** checked skill appears as a verdict with all six fields. Diagnostics go to **stderr** (stdout stays clean JSON: `verify --json 2>/dev/null | jq .` parses). + +### TestQuickstart_VerifyMalformedLock (US4.4) +- **Given** a `.skillrig/skills-lock.json` that is not valid JSON (or wrong `lockfileVersion`). +- **When** `skillrig verify`. 
+- **Then** exit `1` (usage/config, **distinct** from verification failure `2`); 3-part error naming the file + raw cause under `--verbose`; **not** a raw parser dump. + +--- + +## Round-trip (the core acceptance contract) + +### TestQuickstart_AddThenVerifyRoundTrip (SC-001, SC-005) +- `init --origin ` → `add terraform-plan-review` → `commitAll` → `verify` ⇒ exit `0`, in **two commands** (+commit), **zero network**, **no hand-authored lock**. Proves `add` records exactly what `verify` recomputes (same git-canonical tree-SHA, both sides — research D1). Then a one-byte tamper + commit ⇒ `verify` exit `2`. This is the headline scenario; all primitives exercised end-to-end on real git output. + +--- + +> **Coverage check** (for `/specledger.verify`): every spec user story (US1–US4) and acceptance scenario, and every Output/Errors/Exit row in `contracts/{add,verify}.md`, has a `TestQuickstart_*` above. Deferred behaviors (prereq/exit-3, conflict markers, network, symlinks) have **no** scenarios here — by design (spec Out of Scope). diff --git a/specledger/002-skillcore-verify/research.md b/specledger/002-skillcore-verify/research.md new file mode 100644 index 0000000..d738567 --- /dev/null +++ b/specledger/002-skillcore-verify/research.md @@ -0,0 +1,123 @@ +# Phase 0 Research: `skillcore` + `add` + `verify` + +**Feature**: `002-skillcore-verify` | **Date**: 2026-05-30 +**Inputs**: [spec.md](./spec.md), [spec-tech-spike.md](./spec-tech-spike.md) (§1–§12), [plan.md](./plan.md), constitution v2.1.0, `docs/ARCHITECTURE-v0.md`, `docs/design/cli.md`. Prior-art studied: `pkg/cli/skills` (skills.sh), `gh-cli/pkg/cmd/skills` (`gh skill`), `gh-cli/git` (gh's git wrapper). + +## Prior Work + +`sl issue list --all` → only **closed** `001-init-origin-resolution` items (epic SL-227789 + features/tasks). No prior `add`/`verify`/`skillcore` work. This feature reuses 001's `internal/config` resolver and baseline command experience and adds the verification-failure exit class (2). 
+ +Three external prior-art implementations were studied and triangulated (spike §11/§12 + the gh-cli `git` study): + +| Tool | Acquisition | Integrity | Has `verify`? | Lesson for us | +|---|---|---|---|---| +| **skills.sh** (`npx skills`) | HTTP registry (Vercel) + GitHub Trees/raw API | bespoke **SHA-256** over files, `ref` only (no commit) | no | the "custom hash" §4.2 rejected; network-strict, no offline mode | +| **`gh skill`** (GitHub first-party) | GitHub REST API (Trees/Blobs) | git tree-SHA **from the API**, used for *online update-detection*; injects provenance into frontmatter | **no** | confirms our `verify` fills a real gap; frontmatter injection is *incompatible* with tree-SHA label-honesty (so we keep provenance lockfile-only) | +| **`gh-cli/git`** (gh's git wrapper) | n/a (general git ops) | **shells `git` for everything; zero in-process object hashing** | n/a | the decisive input for D1 below — don't reimplement git | + +## Decisions + +### D1 — Tree-SHA is git's own output (shell `git`), not in-process hashing + +**Decision**: `skillcore.TreeSHA(gitDir, ref, relPath)` shells **`git -C gitDir rev-parse :`**, returning git's canonical tree-object SHA. `add` calls it on the **origin** (`ref` = the resolved ref); `verify` calls it on the **consumer** repo (`ref` = `HEAD`). One primitive, both sides git-canonical. + +**Rationale**: +- **gh-cli grounds it.** `gh/git` (`client.go:52–99`) wraps the `git` binary via a pluggable `commandContext` and shells out for *every* git operation; it contains **no** in-process git object hashing anywhere in the codebase. A mature, widely-used reference deliberately does not reimplement git internals. +- **Canonical by construction.** Because both the recorded value (`add`, on the origin) and the recomputed value (`verify`, on the consumer's committed tree) are *git's own* `rev-parse` output, they match by construction when content is identical — there is literally no second implementation to drift (AP-04 hardened). 
An in-process re-hash would have to exactly reproduce git's blob/tree object format, executable-bit/symlink mode mapping, the tree-entry sort quirk (subtrees sorted as if names end in `/`), and any clean filters — a real, ongoing correctness risk for marginal benefit. +- **Relocation-invariant** (spike §3): a git tree object hashes only immediate `{mode, name, childSHA}` entries, so the origin's `skills/foo` tree SHA equals the consumer's `.agents/skills/foo` tree SHA iff contents match — exactly what makes offline label-honesty survive the origin→consumer relocation. +- `git` is already a required dependency (001's `init` uses `git rev-parse --show-toplevel`); adds nothing new. + +**Alternatives rejected**: +- *In-process SHA-1 tree hashing* (the spike's earlier lean): pure-Go, no subprocess, but must reproduce git's object model exactly — gh-cli's "never reimplement" signal + the correctness surface (autocrlf, mode bits, sort) outweigh the "no subprocess" benefit. Rejected. +- *Bespoke SHA-256 canonicalization* (skills.sh's approach): re-derives a guarantee git already gives, and would *not* equal the origin's git tree SHA (breaks future `bump`/origin comparison). Rejected (architecture §4.2 already rejected it). + +### D2 — `verify` hashes the **committed** tree + flags dirty separately + +**Decision**: `verify` recomputes each locked skill's tree-SHA from the **committed** vendored tree (`git rev-parse HEAD:`) and compares to the lock. An **uncommitted / dirty** vendored path (detected via `git status --porcelain -- `, or a path absent from `HEAD`) is reported as a **distinct** finding ("vendored but not committed / locally modified — commit before verifying"), not folded into the label-honesty fingerprint. + +**Rationale**: the load-bearing caller is the **CI gate**, which runs on committed content (working tree == `HEAD`), so committed-tree hashing is exactly right there. 
It keeps `verify` truly **read-only** — `rev-parse`/`status` write no git objects (a temp-index `write-tree` over the working tree *would* write loose objects, violating the read-only spirit of FR-015). Separating "uncommitted local edits" (a working-state warning) from "committed content mismatches its recorded version" (a label-honesty failure, exit 2) is a *better* taxonomy than collapsing both into one fingerprint. This **refines** spike §3's "hash the working tree" intent. + +**Alternatives rejected**: +- *Hash the working tree via temp-index `git add` + `write-tree --prefix`*: catches uncommitted edits in the fingerprint, but writes loose objects into `.git` (not read-only) and adds `.gitignore`/filter subtleties + temp-index/object-dir plumbing — too clever for the MVP (Constitution VIII). The dirty-flag covers the same ground more clearly. + +**Consequence**: the `add → verify` round-trip commits the vendored skill before `verify` (realistic — vendored-in-git means you commit what you vendored). Quickstart scenarios include the `git add && git commit` step. + +### D3 — `skillcore` is a public package at `pkg/skillcore` (SDK-1) + +**Decision**: import path `github.com/skillrig/cli/pkg/skillcore`. **Rationale**: SDK-1 requires third-party Go tools import it (so not `internal/`); the `pkg/` convention explicitly signals "public, importable" and segregates it from `internal/`; the CLI imports the *same* package (AP-04). **Alternatives rejected**: `internal/skillcore` (un-importable — violates SDK-1); module-root `skillcore/` (also fine, but `pkg/` was chosen for the explicit public signal); a *separate module* (`github.com/skillrig/skillcore`) — independent SemVer, but multi-module overhead is YAGNI pre-release. (No constitutional rule mandated `internal/`; see spike §10.) + +### D4 — Lock schema omits `[[requires]]` + +**Decision**: lock entry = `{ version, commit, treeSha, path }`; top-level `{ lockfileVersion, origin, skills{} }`. **No** `requires` array. 
**Rationale**: the full skill subtree — including `skill.toml` — is vendored on disk and fingerprint-attested, so the vendored manifest is the single source of truth for prerequisites; a future `doctor` walks it directly. Mirroring into the lock duplicates data that can drift (YAGNI). **Diverges from architecture §4.2** (whose "mirror requires for offline prereq check" assumed the manifest might not be on disk) — flagged for architecture reconciliation. **Alternative rejected**: mirror `requires` now for forward-compat — rejected because nothing reads it this slice and the manifest is always present. + +### D5 — Origin resolution stays in the CLI layer (SDK boundary) + +**Decision**: `internal/cli/add.go` resolves the active origin via the existing `config.ResolveOrigin` and passes the **resolved local path** down to `skillcore.Add`. `skillcore` never resolves origins, reads config, or fetches. **Rationale**: keeps `skillcore` a pure filesystem/git core (the SDK boundary, spike §10) — an SDK consumer supplies the source path themselves; acquisition + origin policy are CLI concerns. `verify` needs no origin at all. **Alternative rejected**: a `--from`/path argument on `add` that bypasses the configured origin — rejected as a single-origin-contract violation (clarified 2026-05-30); tests do `init --origin ` then `add`. + +### D6 — Vendor copy preserves file modes; injects nothing + +**Decision**: `add` copies the skill subtree byte-for-byte **preserving file modes** (the executable bit is part of the git tree SHA) and adds/modifies nothing (no frontmatter injection). **Rationale**: any mutation or mode change alters the tree SHA and breaks label-honesty (the `gh skill` frontmatter-injection incompatibility, spike §12). A mode-preserving recursive copy is the boring, obvious implementation (Constitution VIII); `git archive`-based extraction is an equivalent alternative if mode handling proves fiddly. 
**Alternative rejected**: injecting provenance into `SKILL.md` (gh skill's model) — fundamentally incompatible with recompute-the-tree-SHA verification. + +### D7 — git interaction via a small testable client (gh pattern) + +**Decision**: a `git.go` inside `pkg/skillcore` with a `Client`-like struct carrying a pluggable `commandContext` (function field) and a `GitError{ExitCode, Stderr}` type, mirroring `gh/git` (`client.go`, `errors.go`). Exposes `revParse`, `status`, etc. used by `TreeSHA`/`Add`/`Verify`. **Rationale**: directly testable (swap `commandContext` for a stub in unit tests; run real `git` in a tmpdir for integration), errors classified (exit code + stderr) so the CLI can render errors-as-navigation. **Testing** (D8) uses gh's dual strategy. **Note**: `internal/config` already shells `git` for `rev-parse --show-toplevel`; consolidating both onto one git client is a future cleanup (out of scope — YAGNI), noted so it isn't lost. + +### D8 — Fixtures: bootstrap real git in a tmpDir (gh `initRepo` pattern) + +**Decision**: a test helper does `git init` + `git add` + `git commit` in a `t.TempDir()` from files committed under `test/testdata/sample-origin/`, producing a real origin to `add` from; the consumer repo is likewise a tmpDir git repo. **Rationale**: avoids committing a nested/bare git repo inside skillrig-cli (the rejected alternative); mirrors gh's `initRepo` helper (`git/client_test.go:1948`). **Determinism**: the **tree-SHA is content-only → deterministic**, so tests assert the exact tree-SHA / fingerprint; the **commit SHA varies** with author/date, so tests assert it is present + well-formed (40-hex) — *or* pin `GIT_AUTHOR_*`/`GIT_COMMITTER_*` env for a fully reproducible commit when an exact assertion is wanted. **Alternative rejected**: commit a bare fixture repo (gh also does this via `fixtures/simple.git`) — rejected to avoid nested-repo maintenance; bootstrap is cleaner here. 
+ +### D9 — Exit-code mapping extended for verification failure + +**Decision**: `internal/cli/exit.go`'s `exitCodeFor` is extended from "any error → `ExitUsage(1)`" to a typed switch: a `skillcore` verification failure (a typed error, e.g. `*skillcore.VerifyFailure` surfaced through the CLI) → `ExitVerification(2)`; `*UsageError` and everything else → `ExitUsage(1)`; `nil` → `ExitOK(0)`. **Rationale**: load-bearing exit codes (Constitution IV; cli.md) — CI/agents branch on *why* `verify` failed. `ExitVerification=2` and `ExitPrereq=3` constants already exist (reserved); this activates 2. **Alternative rejected**: a sentinel error value — a typed error carrying the per-skill report is richer for rendering. + +### D10 — No `go-getter`, no network, no new deps (this slice) + +**Decision**: acquisition is a local origin (a filesystem path that is a git checkout); no `go-getter`, no HTTP, no auth. **Rationale**: OQ-3 says go-getter's value scales with multi-origin support, which is deferred; a thin git interaction suffices for a local origin and honors "minimal deps" (architecture §1). go-getter / `gh`-auth-as-library / GitHub-only remote fetch are recorded for the *remote-`add`* follow-up (spike §11 OQ-2/OQ-3). **Alternative rejected**: adopt go-getter now — premature (YAGNI); it also may not surface the commit SHA we need for provenance (spike OQ-3 caveat). + +### D11 — Test-oracle independence: integration tests use raw `git`, not `skillcore` + +**Decision**: the `TestQuickstart_*` integration tests (build + exec the real binary) use **raw `git`** (via `os/exec` helpers) for both fixture bootstrap *and* computing the **expected** tree-SHA they assert against — they do **not** route the expected value through `skillcore`. The small testable git client (pluggable `commandContext` stub) is for `skillcore`'s **own unit tests** (error paths — simulated `git` failures), and one `skillcore` unit test pins `skillcore.TreeSHA(...) 
== ` raw `git rev-parse :` against the fixture (the SDK invariant vs ground truth). + +**Rationale**: integration tests are **black-box** — the binary under test uses `skillcore` internally, so `skillcore` is part of the system under test. Using it to also produce the *expected* value is **circular validation**, which Constitution III explicitly forbids ("a wrong spec yields matching-but-wrong types, fixtures, AND tests that all agree with each other and with nothing real"). Raw `git` is the independent oracle. Mechanical setup (`git init`/`commit`) *could* reuse the client, but raw `git` is simplest and keeps the oracle boundary crisp. + +**Alternative rejected**: route setup + expected values through `skillcore`'s git client for DRY — rejected: it couples the oracle to the implementation and cannot catch a `TreeSHA` bug. + +### D12 — Fixture mirrors a *canonical* (design-aligned) origin layout; the existing template is a pre-design sample to reconcile + +**Context** (clarified by the user, 2026-05-30): the existing `skillrig-origin` repo (`/Users/vincentdesmet/specledger/skillrig/skillrig-origin`) is a **pre-design sample** — *not* canonical to copy verbatim. The fixture and that template should both conform to the canonical origin structure the locked design implies; the specific sample content (and thus its tree-SHA) is illustrative, and tests compute the SHA independently (D11) so content can change freely. + +**Decision — desired fixture** (`test/testdata/sample-origin/`), minimal + design-aligned: +``` +test/testdata/sample-origin/ +├── .skillrig-origin.toml # convention_version, origin, skills_dir = "skills" +└── skills/ + └── / + ├── SKILL.md + └── skill.toml # manifest carries [[requires]] (manifest = single source of truth; the lock OMITS it, D4) +``` +`add`/`verify` read **only** `skills//{skill.toml,SKILL.md}` this slice; `.skillrig-origin.toml` is carried for fidelity (and lets a test assert `add` ignores non-skill origin files). 
`index.json`, `cmd/`, `policy.toml`, workflows are **not** needed for `add`/`verify` and are the origin-template's concern, not the fixture's (YAGNI). + +**Recommended changes to the `skillrig-origin` template** (cross-repo — tracked here as recommendations; the template is a separate repo, not edited on this branch): +- `docs/CONVENTION.md` MUST pin the **fingerprint boundary** precisely: `treeSha = git tree-object SHA of skills/`, i.e. `git rev-parse :skills/` — so origin and consumer compute identically (the locked shell-`git` decision, D1). Today it references the boundary loosely. +- Note (manifest vs lock): the skill.toml `[[requires]]` is the **single source of truth** and is **not** mirrored into the consumer lock (D4) — state this in `CONVENTION.md`/`AGENTS.md` so lock-mirroring isn't re-introduced (it contradicts architecture §4.2's old wording). +- **Layout constraint (perf, D13)**: each skill dir MUST be **self-contained** under `skills//` (no cross-skill shared files), so one skill can be fetched via partial-clone + sparse-checkout without the rest of the monorepo. The template already satisfies this — document it as a convention constraint. +- `convention_version` stays `1` (this slice introduces no structural change to the origin). + +### D13 — Large-monorepo performance: no full clone needed; the tree-SHA primitive is acquisition-agnostic + +**Decision (this slice)**: the origin is a **local checkout already on disk**, so `add` = read one subtree + 2 `git rev-parse` calls — **trivial** cost even for a huge monorepo (no clone). The git-wrapper (shell `git rev-parse :` against a local dir) is the whole cost. + +**Decision (remote-`add` follow-up — recorded for OQ-1/OQ-3)**: fetch one skill via `git clone --filter=blob:none --sparse ` + `git sparse-checkout set skills/` into a temp dir, then the **same** `TreeSHA(tempDir, ref, path)` primitive. 
**Key finding** (Explore agent, 2026-05-30): the git tree-SHA is **identical** whether the objects arrived by full clone, partial clone, or are read from GitHub's Trees API (`entry.SHA`) — it *is* the canonical git tree object SHA — so the remote case needs **zero change** to the integrity primitive. **The local-origin slice does not paint us into a corner.** + +**Prior-art locking contrast** (the TreeSha difference): +- **`gh skill`** reads the tree SHA from GitHub's **Trees API** (`discovery.go:544-592`) — no clone, but GitHub-coupled, and used for *online update-detection*, not offline verify; injects it into frontmatter (`frontmatter.go:70-98`). +- **skills.sh** computes a **bespoke SHA-256** over fetched files (`hash.go:18-68`), records `ref` only (no commit), and falls back to a `--depth 1` **shallow clone** (whole tree) on API failure (`discover.go:102-182`). +- **skillrig** records the git **tree-SHA + commit**, computed by `git`, **offline-verifiable** — the only one of the three with an offline integrity gate. + +**Recommendation for remote-`add` MVP**: **partial-clone + sparse-checkout** (pure `git`, any host, downloads only the skill's blobs, preserves the tree-SHA primitive) over the GitHub Trees+Blobs API (faster + no `git` binary, but GitHub-API-coupled — conflicts with the generic-binary stance). Shallow clone (`--depth 1`) is the simple fallback when partial clone is unavailable. The choice is deferred to the remote-`add` spec; **this slice needs none of it**. + +## Open items carried to `/specledger.tasks` / implement + +- **Design-doc sync (this branch):** update `docs/design/cli.md` — `verify` is integrity-only (prereq → `doctor`); reverse the "not separate packages" line for `skillcore` (SDK-1). (Architecture already updated, spike §8.) +- **Skill co-evolution (Constitution IX):** ship/extend an agent skill for `add` + `verify` (exit `0`/`2` meaning; missing backing tool is *not* a verify failure) with a trigger-accuracy eval. 
+- **Architecture reconciliation:** note D4 (lock omits `requires`) against architecture §4.2; rename `internal/skillcore` references → `pkg/skillcore` (spike §10 reconciliation list). diff --git a/specledger/002-skillcore-verify/spec-tech-spike.md b/specledger/002-skillcore-verify/spec-tech-spike.md index 0f53a8a..16a1538 100644 --- a/specledger/002-skillcore-verify/spec-tech-spike.md +++ b/specledger/002-skillcore-verify/spec-tech-spike.md @@ -37,8 +37,8 @@ Three deliverables, smallest coherent slice that makes the core promise *demonst Single implementation, presentation-free, the AP-04 hard boundary. - **`TreeSHA(skillDir) -> sha`** — the **git tree SHA** of the skill subtree (architecture §4.2). Computed from the on-disk content using git's own object model (no bespoke canonicalization — line endings / mode bits / symlinks handled by git). The value `add` records and the value `verify` recomputes come from **this one function**, so the gate can never diverge from what was written (R9, R14, N2). - - *Open (planning):* compute via shelling to `git` (already a dependency, on PATH per Makefile) vs. an in-process tree-object hash. Either is fine if it matches git's canonical tree SHA byte-for-byte; the choice is a plan.md call. Must be deterministic and offline. - - *Open (planning):* hash the **working-tree content on disk** (catches uncommitted tampering) — this is the intended semantic, since the promise is "what your agent will run", not "what HEAD says". + - **Resolved (plan.md/research, 2026-05-30):** **shell `git`** — `TreeSHA = git rev-parse :` (git's canonical tree SHA), *not* in-process hashing. Rationale: gh-cli (mature reference) shells git for everything and reimplements nothing; both `add` (on the origin) and `verify` (on the consumer's `HEAD`) use git's own output, so they match by construction with zero autocrlf/mode-bit/tree-sort reimplementation risk. The git client follows gh's pattern (pluggable `commandContext` + `GitError`). See research.md. 
+ - **Resolved:** `verify` hashes the **committed** vendored tree (`HEAD:`) and flags an uncommitted/dirty vendored tree as a *distinct* finding via `git status --porcelain` — a cleaner taxonomy than folding uncommitted edits into the fingerprint, and it keeps `verify` truly read-only (`rev-parse`/`status` write no objects, unlike `write-tree`). This **refines** the earlier "hash the working tree on disk" intent. - **Must equal git's canonical tree-object SHA** — the same value a git origin (and GitHub's Trees API, per §12 `gh skill`) publishes for that subtree. A git tree object hashes only *immediate entry names + modes + child SHAs*, so it is **relocation-invariant**: the subtree's SHA at the origin's `skills//` equals the vendored copy's at `.agents/skills//` **iff their contents match**. That invariance is precisely what makes offline label-honesty survive the origin→consumer relocation. - **`ParseManifest(skill.toml) -> Manifest`** — parse `name, version, namespace, description, tags, [[requires]]` (architecture §4.1). In this slice, `verify` uses it to *recognize* a directory as a skill and `add` uses it to read `name`/`version`. The `[[requires]]` data is **NOT mirrored into the lock** (clarified 2026-05-30): the full subtree — including `skill.toml` — is vendored on disk and fingerprint-attested, so the **vendored manifest is the single source of truth** for prerequisites; a later `doctor` walks it directly. Mirroring would only duplicate data that can drift. → **diverges from architecture §4.2**, whose "mirror requires for offline prereq check (R16)" rationale assumed the manifest might *not* be on disk; in our vendored-in-git model it always is. (Flag for architecture reconciliation.) - **`ReadLock` / `WriteLock` (`skills-lock.json`)** — typed lock I/O (architecture §4.2, **minus `requires`** per above): `lockfileVersion, origin, skills{ name -> { version, commit, treeSha, path } }`. Atomic write (temp + rename — open Q10). 
`WriteLock` used by `add`; `ReadLock` by `verify`. From e0d8ccd2bad0067cd298bc436524079027d106b6 Mon Sep 17 00:00:00 2001 From: Vincent De Smet Date: Sat, 30 May 2026 09:37:01 +0800 Subject: [PATCH 4/8] docs(002): resolve review findings; align roadmap + architecture MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cross-artifact verify cycle (two independent reviews merged) + doc alignment. spec/quickstart/contracts (002): - C1 (HIGH): reconcile spec to committed-tree + 'dirty' verdict (FR-009, FR-022, Key-Entities verdict, SC-001) — was the only spec<->plan semantic ambiguity. - Q1 (CRITICAL, caught by the independent review): rewrite stale US3.3 to the deferred/canonical-only behavior (aligns with FR-011 / Out-of-Scope). - Close test gaps: help-examples, verify-read-only, prereq-ignored scenarios. - verify help text now self-documents PROJECT scope (code/test readers never see the spec). Scope the git requirement to project (--global is a deferred carve-out); fix the add not-a-git-repo rationale. Record global-scope forward concerns (spike). - reviews/002-review.md: merged + resolved record of both reviews. docs: - ARCHITECTURE §9c (new): federated skill registries (skills.sh, AWS AgentRegistry) as external origin types — reconciles the integrity model (git origin -> tree-SHA; registry -> recorded content digest; verify stays offline; audit/usage advisory-only). Answers the spike's OQ-1 (git-origin coupling). - ROADMAP: 002 row -> exit 0/1/2 + includes local 'add'; 004 reframed as remote-origin 'add' fetch; lockfile commitment drops 'requires' (manifest is source of truth, D4). - constitution: lock-entry example drops 'requires' (matches D4). Experiment files (implement-workflow command, CLAUDE/AGENTS changes, vcr-cassettes) deliberately left unstaged. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .specledger/memory/constitution.md | 2 +- docs/ARCHITECTURE-v0.md | 17 +++++++ docs/ROADMAP.md | 9 ++-- .../002-skillcore-verify/contracts/add.md | 2 +- .../002-skillcore-verify/contracts/verify.md | 12 ++++- specledger/002-skillcore-verify/quickstart.md | 12 ++++- .../reviews/002-review.md | 44 +++++++++++++++++++ .../002-skillcore-verify/spec-tech-spike.md | 1 + specledger/002-skillcore-verify/spec.md | 15 ++++--- 9 files changed, 98 insertions(+), 16 deletions(-) create mode 100644 specledger/002-skillcore-verify/reviews/002-review.md diff --git a/.specledger/memory/constitution.md b/.specledger/memory/constitution.md index 826d26d..bf38db3 100644 --- a/.specledger/memory/constitution.md +++ b/.specledger/memory/constitution.md @@ -57,7 +57,7 @@ verbose. Required by output type: lines, never N × fieldCount. - **JSON output (`--json`):** MUST be parseable (`json.Unmarshal` succeeds) AND structurally complete (key counts match the schema — e.g. a lock entry carries - `version`/`commit`/`treeSha`/`requires`). Assert field presence, not truncation + `version`/`commit`/`treeSha`). Assert field presence, not truncation absence. - **Error output:** MUST contain all three Principle-2 parts (what failed, why, suggested fix) as *distinct* checks, plus the correct exit code (0/1/2/3 per diff --git a/docs/ARCHITECTURE-v0.md b/docs/ARCHITECTURE-v0.md index 8d17aad..3d8d9e7 100644 --- a/docs/ARCHITECTURE-v0.md +++ b/docs/ARCHITECTURE-v0.md @@ -333,6 +333,23 @@ Verified against mise's GitHub backend docs (current as of early 2026). Three fi --- +## 9c. Federated skill registries — external origin types (skills.sh, AWS AgentRegistry) + +**Status:** Evolution beyond v0 (roadmap 011/012); built on 002's `skillcore` + the §9b allowlist/canon. Recorded so the integrity model stays coherent when skills come from somewhere other than the org's git origin. 
+ +**The default origin stays git** (§2c, §4.2): the org's monorepo is the source of truth and the integrity gold standard (git tree-SHA label-honesty). Federated registries are **additional, governed external source types** consumed through the **canon's allowlist** (§9b) — *not* replacements — and **skillrig still operates no registry service of its own** (§2b unchanged; it *consumes* registries, never hosts one). Two are on the roadmap: +- **Public — skills.sh (Vercel).** Community skills, vetted by the registry's **usage statistics + audit reports**. Adopted only when **allowlisted in the origin's `policy.toml`** (§9b graded allowance: blocked / advisory / approved / pinned-version-only), and flagged with warnings when advisory. +- **Private — AWS AgentRegistry (enterprise).** A governed, IAM/OAuth-gated catalog as an external source for AWS-centric orgs. + +**Integrity reconciliation (the key alignment — the git-origin coupling, §4.2).** A registry is **not a git repo**, so there is **no origin-published git tree-SHA** to compare against. The fingerprint therefore **forks by source type**, with `skillcore` owning both (one implementation, AP-04): +- **git origin** → the **git tree-SHA** (origin-attested label-honesty, §4.2) — unchanged; the strongest guarantee. +- **registry source** → a **content digest** recorded at vendor time (the registry's published digest, or one recomputed from the fetched bytes); `verify` recomputes it offline and compares, exactly as it does the tree-SHA. This proves *the content has not changed since vendoring* — but **not** *origin-attested-against-a-git-tree* (a registry cannot offer that). The **canon's allowlist + the registry's audit/usage signals** carry the "reviewed/approved" half instead. +- Either way `verify` stays **offline + deterministic** (recompute the recorded fingerprint). 
The registry's **live risk / audit / usage scores are advisory, human-facing, online-only** (§9b / R29) — surfaced by `doctor`/`add`, **never** in `verify` (N6: truth stays deterministic). + +**Net:** the lockfile generalizes from "git tree-SHA + commit" to a **typed fingerprint per source** (`gitTreeSha` for git origins, `digest` for registries) without weakening the offline `verify` gate; governance moves to the canon's allowlist + each registry's own provenance/audit signals. The git origin remains the default and the highest-integrity path. + +--- + ## 10. What we deliberately did *not* build (maps to requirements §5) - **Team→skill suggestion engine (D1):** tags ship in the manifest now (R24); the suggestion UX is v1. Any future suggestion layer reads tags deterministically and stays additive — truth never moves into an LLM/inferential component (N6). `doctor` can already deterministically list "skills tagged `` not present in your global scope" without any inference — that may be enough to make the v1 "engine" unnecessary. diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md index ba1adc0..6e2b75e 100644 --- a/docs/ROADMAP.md +++ b/docs/ROADMAP.md @@ -29,20 +29,21 @@ resolver — AP-04 / AP-06) and layers thin commands on top. 
| # | Feature branch | Pattern | Depends on | Status | |---|----------------|---------|------------|--------| | 001 | **`init` + origin resolution** — `env SKILLRIG_ORIGIN > .skillrig/config.toml > ~/.config/skillrig/config.toml`; `skillrig init [--origin] [--global]` binds an existing origin (never bootstraps) | Environment | — (project skeleton) | 🚧 | -| 002 | **`skillcore` + `verify`** — git tree-SHA + `skill.toml` manifest parse; offline label-honesty + orphan check; exit codes 0/2/3 | Verification Gate | 001 | ⬜ | +| 002 | **`skillcore` + `add` (local) + `verify`** — git tree-SHA + `skill.toml` parse; **local-origin** `add` (vendor subtree + lock; `--dry-run`/`--force`); offline label-honesty + orphan check; **exit codes 0/1/2** (exit 3 → `doctor`/005) | Vendor Mutation + Verification Gate | 001 | ⬜ | | 003 | **`search`** — read origin (branch aware) committed `index.json`, deterministic tag filter, Two-Level Output | Query | 001 | ⬜ | -| 004 | **`add`** — vendor a skill subtree + write lock entry; `--dry-run`, refuse-clobber-without-`--force` | Vendor Mutation | 002 | ⬜ | +| 004 | **`add` — remote origin fetch** — network fetch from a GitHub-hosted origin (partial-clone + sparse-checkout) + auth; `@ref`/`--pin` immutable pins (local-origin `add` already shipped in 002) | Vendor Mutation | 002 | ⬜ | | 005 | **backing-CLI prereqs** — `[[requires]]` declare + verify (`--eligible`-style readiness, auth-as-distinct-failure R18); mise consumption via per-CLI tagged releases + template-generated `mise.toml` | (extends verify/doctor) | 002 | ⬜ | | 006 | **`doctor`** — superset health check (integrity + prereqs + auth) | Environment | 002, 005 | ⬜ | | 007 | **`bump --pr`** — detect upstream advance, drift-aware three-way-merge, open reviewable PR (conflict markers + non-zero exit on conflict) | Vendor Mutation | 002, 004 | ⬜ | | 008 | **`global add` / `global verify`** — fetch/restore user-scope skills against the global lock | Global Management | 002 | ⬜ | | 009 | 
**multi-client materialization** — canonical `.agents/skills` + symlink views, copy-fallback (Windows/CI) | (supports add/global) | 004 | ⬜ | | 010 | **`lint`** — author-side conformance gate, required PR check on the origin | Verification Gate | 002 | ⬜ | -| 011 | **`aws`** — support AWS AgentRegistry hosted skills | Evolution | 002 | ⬜ | +| 011 | **`skills.sh`** — support Vercel's skills.sh hosted skills. External skill adoption workflow: federated skill registries, allowlisted in the origin, with origin policy provisions for approval/review (skills.sh skills are evaluated on their usage statistics and audit reports; they should be vetted or flagged with warnings) | Evolution | 002 | ⬜ | +| 012 | **`aws`** — Enterprise: support private AWS AgentRegistry hosted skills | Evolution | 002 | ⬜ | **Cross-cutting v0 commitments** (architecture §13): - Two scopes only — project (vendored, verify-only) + global (fetch/restore). **No "shared" middle tier.** -- Lockfile carries `commit` (provenance) + `treeSha` (label honesty) + `requires` (§4.2); `.skillrig/config.toml` (input) split from `.skillrig/skills-lock.json` (output) (§2d). +- Lockfile carries `commit` (provenance) + `treeSha` (label honesty); the per-skill **manifest** (not the lock) carries `[[requires]]` — the vendored manifest is the single source of truth (002 D4; reconciles §4.2). `.skillrig/config.toml` (input) split from `.skillrig/skills-lock.json` (output) (§2d). - Origin = git; **no auth of its own** (§2d). - One **batteries-included GitHub template** (skills + Go-monorepo backing-CLI pattern + index/lint/release workflows) (§2d). - Discovery via committed `index.json`; **deterministic tags ship in the manifest** (data only) (§9). 
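Reviewer note: the lockfile commitment above (`commit` for provenance, `treeSha` for label honesty, no `requires`) can be pictured with a minimal Go sketch. This is an illustration under stated assumptions, not the shipped `pkg/skillcore` types: the struct name `LockEntry`, its Go field names, the `LabelHonest` helper, and the sample values are all hypothetical. Only the JSON keys `name`, `version`, `path`, `commit`, and `treeSha` appear in the patch itself (as `add --json` keys), and reusing them for the lock entry is itself an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LockEntry is a HYPOTHETICAL shape for one vendored skill in
// .skillrig/skills-lock.json. It deliberately has no `requires` field:
// per the roadmap, prerequisites live in the vendored manifest only.
type LockEntry struct {
	Name    string `json:"name"`
	Version string `json:"version"`
	Path    string `json:"path"`
	Commit  string `json:"commit"`  // provenance: origin commit vendored from
	TreeSha string `json:"treeSha"` // label honesty: tree-SHA of the skill subtree
}

// LabelHonest reports whether a recomputed tree-SHA equals the recorded one.
// An empty recorded value never counts as a match.
func (e LockEntry) LabelHonest(actualTreeSha string) bool {
	return e.TreeSha != "" && e.TreeSha == actualTreeSha
}

func main() {
	// Placeholder values for illustration only; not real hashes.
	entry := LockEntry{
		Name:    "terraform-plan-review",
		Version: "1.0.0",
		Path:    ".agents/skills/terraform-plan-review",
		Commit:  "c0ffee00",
		TreeSha: "abc12345",
	}

	out, err := json.MarshalIndent(entry, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))

	fmt.Println("unchanged content:", entry.LabelHonest("abc12345"))
	fmt.Println("tampered content: ", entry.LabelHonest("def67890"))
}
```

Keeping `requires` out of the struct mirrors the decision recorded in this hunk: a later health command would read the vendored manifest directly, so the lock never has to be kept in sync with it.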
diff --git a/specledger/002-skillcore-verify/contracts/add.md b/specledger/002-skillcore-verify/contracts/add.md index cc13eec..bbe0215 100644 --- a/specledger/002-skillcore-verify/contracts/add.md +++ b/specledger/002-skillcore-verify/contracts/add.md @@ -68,7 +68,7 @@ Keys always present: `ok, name, version, path, commit, treeSha, action, dryRun`. | No origin configured | 1 | what: no origin configured; why: no `SKILLRIG_ORIGIN` / project / global origin; fix: `skillrig init --origin OWNER/REPO` or set `SKILLRIG_ORIGIN`. | | Skill not found in origin | 1 | what: skill `` not found in origin; why: no `skills//` at `@`; fix: check the name / `skillrig search` (future). | | Divergent content, no `--force` | 1 | what: refusing to overwrite ``; why: on-disk content diverges from the recorded fingerprint; fix: re-run with `--force`, or revert local edits. | -| Not inside a git repo | 1 | what: not a git repository; why: tree-SHA + provenance need git; fix: run inside the repo (or `git init`). | +| Not inside a git repo (project scope) | 1 | what: not a git repository; why: project-scope `add` places `.agents/skills` at the repo root and writes a lock that `verify` checks against git; fix: run inside the repo (or `git init`). _(A future `--global` path is exempt — see spec Out of Scope.)_ | Exit `0` on success (incl. idempotent no-op and `--dry-run`). Code `2` is `verify`'s; `3` is reserved (`doctor`). diff --git a/specledger/002-skillcore-verify/contracts/verify.md b/specledger/002-skillcore-verify/contracts/verify.md index b435785..2176344 100644 --- a/specledger/002-skillcore-verify/contracts/verify.md +++ b/specledger/002-skillcore-verify/contracts/verify.md @@ -1,7 +1,7 @@ # Contract: `skillrig verify` **Pattern**: Verification Gate — [cli.md](../../../docs/design/cli.md) Pattern Classification. MUST be offline + deterministic, exit-code driven, **no online/inferential signal** (AP-02). Read-only. 
-**Purpose**: Prove the repo's vendored skills are exactly what was recorded — **label-honesty** (tree-SHA) + **orphan/completeness** (on-disk set = locked set). Integrity only; **no** prerequisite check (that is `doctor`; exit `3` not emitted). Requires a git repository; needs no origin and no network. +**Purpose**: Prove **this repository's** vendored skills (**project scope** — `.agents/skills` checked against the committed `.skillrig/skills-lock.json`) are exactly what was recorded — **label-honesty** (tree-SHA) + **orphan/completeness** (on-disk set = locked set). Integrity only; **no** prerequisite check (that is `doctor`; exit `3` not emitted). **Project-scope**: it verifies the current repo, *not* global/user-scope skills. Requires a git repository; needs no origin and no network. ## Synopsis @@ -15,9 +15,17 @@ skillrig verify [--json] [--verbose] ## Help (Progressive Discovery) +The cobra `Short`/`Long`/`Example` MUST state project scope (other code/test readers never see this spec — the help text is where "project-level" must live): + ``` +Short: Check THIS repo's vendored skills match their recorded versions (project scope) +Long: verify checks the PROJECT's vendored skills (.agents/skills) against the + committed lock (.skillrig/skills-lock.json) — label-honesty (git tree-SHA) + + orphan/completeness — offline and deterministic. PROJECT-SCOPE: it verifies + THIS repository, not global/user-scope skills. Exit 0 ok / 1 usage / 2 failure. 
+ Examples: - # Verify every vendored skill matches its recorded version (CI gate) + # Verify this repo's vendored skills match their recorded versions (project-scope CI gate) skillrig verify # Machine-readable per-skill verdicts for an agent / jq diff --git a/specledger/002-skillcore-verify/quickstart.md b/specledger/002-skillcore-verify/quickstart.md index f19da65..4fae4d4 100644 --- a/specledger/002-skillcore-verify/quickstart.md +++ b/specledger/002-skillcore-verify/quickstart.md @@ -43,7 +43,7 @@ Per Constitution II, **each scenario below maps 1:1 to a `TestQuickstart_` ### TestQuickstart_AddNotGitRepo - **When** `skillrig add …` in a non-git tmpdir (origin via `SKILLRIG_ORIGIN`). -- **Then** exit `1`; what: `not a git repository`; why: `tree-SHA + provenance need git`; fix: `run inside the repo`. +- **Then** exit `1`; what: `not a git repository`; why: `project-scope add vendors into the repo's canonical .agents/skills and writes a lock that verify checks against git`; fix: `run inside the repo (or git init first)`. (Project-scope precondition — a future --global path is exempt; see spec Out of Scope.) --- @@ -53,6 +53,12 @@ Per Constitution II, **each scenario below maps 1:1 to a `TestQuickstart_` - **Given** `terraform-plan-review` vendored **and committed**. - **When** `skillrig verify`. - **Then** exit `0`; human exactly 2 lines (`verified 1 skills ✓` + `→ all match their recorded version`); `--json ok==true`, `counts.verified==1`, one verdict `status=="ok"` whose `expectedTreeSha==actualTreeSha==` the ground-truth tree-SHA. +- **And (FR-014 / SC-006)**: the vendored skill declares `[[requires]]` (oxid, terraform) for tools **absent** in the test environment, yet verify still exits `0` — proving it performs **no** prerequisite check (integrity-only). + +### TestQuickstart_VerifyIsReadOnly (FR-015) +- **Given** `terraform-plan-review` vendored + committed. +- **When** `skillrig verify` (pass) and again after a tamper (fail). 
+- **Then** in **both** runs the working tree is **unchanged** — assert `git status --porcelain` is identical before/after, and `.skillrig/skills-lock.json` + the skill files are byte-for-byte untouched (verify writes nothing). ### TestQuickstart_VerifyDetectsTamper (US2.2, SC-003) - **Given** the skill vendored + committed; then one byte of `SKILL.md` changed **and committed**. @@ -103,6 +109,10 @@ Per Constitution II, **each scenario below maps 1:1 to a `TestQuickstart_` - **When** `skillrig verify`. - **Then** exit `1` (usage/config, **distinct** from verification failure `2`); 3-part error naming the file + raw cause under `--verbose`; **not** a raw parser dump. +### TestQuickstart_AddHelpExamples / TestQuickstart_VerifyHelpExamples (FR-018 / SC-009) +- **When** `skillrig add --help` and `skillrig verify --help`. +- **Then** exit `0`; each help output contains a one-line purpose **and ≥2 usage examples** (assert ≥2 lines beginning `skillrig add `/`skillrig verify ` in the Examples block) — sufficient to construct a correct invocation without external docs. (Output-shape, not a single `Contains`.) + --- ## Round-trip (the core acceptance contract) diff --git a/specledger/002-skillcore-verify/reviews/002-review.md b/specledger/002-skillcore-verify/reviews/002-review.md new file mode 100644 index 0000000..f6b4c6c --- /dev/null +++ b/specledger/002-skillcore-verify/reviews/002-review.md @@ -0,0 +1,44 @@ +--- +date: 2026-05-30 +total_requirements: 22 +total_tasks: 0 +coverage_pct: "100% project-scope coverage (tasks dimension intentionally skipped)" +critical_issues: 0 +--- + +# Cross-Artifact Verification — `002-skillcore-verify` (merged + resolved) + +**Scope:** read-only cross-verification of `spec.md` against `plan.md`, `research.md`, `data-model.md`, `contracts/{add,verify,skillcore-sdk}.md`, `quickstart.md`. The **tasks dimension was intentionally skipped** (no `tasks.md` — trialling `/specledger.implement-workflow` instead). Not a defect. 
+ +**Two independent reviews merged.** Review **A** (independent agent) and review **B** (cross-check, this session) were complementary — each caught a finding the other missed. All findings below are **resolved** (artifacts updated 2026-05-30). + +## Findings (all resolved) + +| ID | Source | Category | Severity | Summary | Resolution | +|----|--------|----------|----------|---------|------------| +| **C1** | B | Consistency / Decision | HIGH | **Committed-tree vs working-tree.** Spec FR-009 said "current on-disk content" (working-tree), verdict enum + FR-022 + SC-001 reflected a no-commit model — but research D2 + contracts + quickstart implement **committed-tree** hashing + a **`dirty`** verdict + a commit step. (A scored "0 ambiguities" and missed this.) | **Reconciled spec to committed-tree+dirty:** FR-009 (committed content; uncommitted → distinct `dirty`), FR-022 (dirty in exit-2), Key-Entities verdict (adds `dirty`, maps spec↔`--json` names), SC-001 (vendor→commit→verify loop). | +| **Q1** | A | Quickstart drift | CRITICAL | **US3 scenario 3** asserted per-client-view handling that the feature **defers** (FR-011 / Out-of-Scope) — stale acceptance scenario. (B missed this.) | US3.3 rewritten to the deferred / canonical-only behavior, aligned with FR-011. | +| **C3 / Q2** | A + B | Coverage gap | HIGH | FR-018 / SC-009 (help with ≥2 examples) had **no `--help` quickstart test**. | Added `TestQuickstart_AddHelpExamples` / `VerifyHelpExamples` (output-shape: purpose + ≥2 examples). | +| **C4** | A + B | Coverage gap | MEDIUM | FR-015 (verify read-only) had no test asserting files/lock unchanged. | Added `TestQuickstart_VerifyIsReadOnly` (before/after `git status --porcelain` + lock byte-unchanged). | +| **C2** | B | Coverage gap | MEDIUM | FR-014 / SC-006 (missing backing tool never fails verify) only implicit. | Made explicit on `VerifyPasses`: the vendored skill declares `[[requires]]` for tools absent in CI, yet verify exits 0. 
| +| **C6** | B | Terminology | LOW | Verdict-name drift (spec *matched/content-mismatch/untracked* vs contracts *ok/mismatch/orphan*). | Folded into the Key-Entities verdict reword — spec now shows both the readable name and the `--json` field value. | +| **C7** | B | Wording / Scope | LOW | `add` not-a-git-repo error rationale was *verify's* reason ("tree-SHA + provenance need git") — false for `add` (those come from the origin). Plus: "all commands require git" is too strong. | Corrected the `add` error rationale (project-scope: places `.agents/skills` at the repo root, writes a lock `verify` checks). Scoped the git requirement to **project scope** (FR-001 / Assumptions / Out-of-Scope), with `--global` as a deferred carve-out. Recorded two global-scope forward concerns in spike §9 (`add --global` non-git target; `verify --global` needs a working-tree fingerprint). | +| **C5** | B | Traceability | INFO | Public-SDK scope (SDK-1, third-party `pkg/skillcore`) lives in spike/research, not spec FR-016/017. | Left as-is — deliberate earlier choice to keep SDK-1 in the spike (plan input). Noted. | +| **T1** | A | Task coverage | INFO | `tasks.md` absent → task↔requirement + `TestQuickstart_*`-task mapping skipped. | Intentional (experiment). Re-run after `/specledger.tasks` if durable task coverage is wanted. | + +## Coverage summary + +All 22 FR + 9 SC trace to plan + contracts + a `TestQuickstart_*` scenario after the fixes (help / read-only / prereq gaps closed; US3.3 reconciled). Reverse traceability: the only previously-orphan behavior (`dirty` verdict) is now grounded in FR-009/FR-022. + +## Decision integrity + +exit 0/1/2-not-3 · add-detect+refuse-not-merge · conflict-markers-deferred · symlinks-deferred · requires-NOT-in-lock · shell-`git` tree-SHA · `pkg/skillcore` · origin-resolution-not-`--from` · byte-identical vendoring · oracle independence — **applied consistently across all artifacts**, no stale wording remaining. 
The committed-tree+`dirty` refinement (C1) is now propagated to the spec. + +## Metrics + +- Requirements: 22 FR + 9 SC · Tasks: 0 (skipped) · Critical: **0** (was 1 — Q1, resolved) · High: 0 (C1 + Q2 resolved) · Medium/Low/Info: resolved or noted. + +## Next actions + +- Artifacts are internally consistent — **clear to proceed to implementation** (`/specledger.implement-workflow` experiment, or `/specledger.tasks` for the durable ledger). +- Re-run `/specledger.verify` after `tasks.md` exists if you want task-coverage + `TestQuickstart_*`-task mapping validated. diff --git a/specledger/002-skillcore-verify/spec-tech-spike.md b/specledger/002-skillcore-verify/spec-tech-spike.md index 16a1538..a6182cd 100644 --- a/specledger/002-skillcore-verify/spec-tech-spike.md +++ b/specledger/002-skillcore-verify/spec-tech-spike.md @@ -118,6 +118,7 @@ Two checks, both exit-2 class on failure: - `bump` (upstream advance, 3-way merge) + conflict-marker detection. - Network/git **fetch** in `add`; origin-resolution-driven `add`; `@ref`/`--pin` immutable pins; **auth for remote `add`** (PAT/SSH/registry token — see §11 OQ-2). - `index.json` / `search`; multi-client symlink materialization (§6); allowlist/audit (§9b, v1); auth (R18). +- **Global scope** (`--global` / `global add` / `global verify` → `~/.agents/skills`). Two forward concerns surfaced 2026-05-30 (scope, don't solve, this slice): **(a) `add --global` target is non-git** — the project-scope "add requires a git repo" precondition must be **scoped to project mode**, not applied to the global path (home is not a repo), or it would wrongly reject `add --global`. **(b) `verify --global` cannot use committed-tree + shell-`git`** (D1/D2) — there is no `HEAD` at home, so the global tier needs a **working-tree fingerprint** (the in-process / temp-index hash set aside in D1), reopening that option for global only. Neither affects this project-scope slice; both are inputs to a future global-scope spec. 
--- diff --git a/specledger/002-skillcore-verify/spec.md b/specledger/002-skillcore-verify/spec.md index 8b3f8d6..b21ac62 100644 --- a/specledger/002-skillcore-verify/spec.md +++ b/specledger/002-skillcore-verify/spec.md @@ -84,7 +84,7 @@ A security-conscious reviewer worries about a skill that was added to the repo w 1. **Given** a skill directory present in the repo's skills location with no corresponding record entry, **When** verify runs, **Then** it exits with the verification-failure status and identifies the untracked (orphan) skill. 2. **Given** a record entry for a skill whose files are absent from the repo, **When** verify runs, **Then** it exits with the verification-failure status and identifies the missing skill. -3. **Given** the repo uses per-client compatibility views (alternate directory entries pointing at the same canonical skill content), **When** verify runs, **Then** those views are not miscounted as separate or untracked skills. +3. **Given** multi-client compatibility views are **not** created by this feature (deferred — see Out of Scope and FR-011), **When** verify runs, **Then** the orphan/completeness check scans only the canonical skills location (`.agents/skills`); robust handling of any manually-created view directories is deferred together with multi-client materialization. 4. **Given** both a content mismatch (US2) and an untracked skill are present, **When** verify runs, **Then** it fails and reports both classes of problem rather than stopping at the first. --- @@ -123,7 +123,7 @@ An automated caller — a CI merge gate or an agent deciding its next step — n **Vendoring a skill (`add`)** -- **FR-001**: The system MUST provide a command that vendors a named skill from the repo's **configured origin** (resolved via the shared origin resolver — there is no separate source argument that bypasses it; the origin may be a local checkout) into the canonical skills location (`.agents/skills/`) and records its identity. 
Both this command and verification MUST run inside a git repository. +- **FR-001**: The system MUST provide a command that vendors a named skill from the repo's **configured origin** (resolved via the shared origin resolver — there is no separate source argument that bypasses it; the origin may be a local checkout) into the canonical skills location (`.agents/skills/`) and records its identity. For this **project-scope** feature, both this command and verification MUST run inside a git repository (the canonical `.agents/skills` location lives at the repo root). A future **global** scope (`--global`, materializing into the user's home `~/.agents/skills`, which is not a repo) is a separate, deferred carve-out — see Out of Scope — and will **not** be bound by this project-scope requirement. - **FR-002**: For each vendored skill, the system MUST record its version, its provenance (where it came from), and a content fingerprint that uniquely reflects the skill's content. - **FR-003**: The vendor command MUST be idempotent: re-vendoring identical content leaves an equivalent result and reports success without error. - **FR-004**: The vendor command MUST NOT silently overwrite vendored content that diverges from the recorded fingerprint; it MUST detect the divergence and require an explicit override (`--force`) so local modifications are never lost without intent. It MUST NOT attempt a three-way merge — re-vendoring the same version has no upstream-advance axis to merge; that is a later `bump` concern. @@ -134,7 +134,7 @@ An automated caller — a CI merge gate or an agent deciding its next step — n **Verifying vendored skills (`verify`)** - **FR-008**: The system MUST provide a verification command that checks the repo's vendored skills against their recorded identities, entirely offline and deterministically (same inputs always yield the same result; no network or external/live signal). 
-- **FR-009**: Verification MUST recompute each recorded skill's content fingerprint from its current on-disk content and compare it to the recorded value, failing when they differ (label honesty). +- **FR-009**: Verification MUST recompute each recorded skill's content fingerprint from its **committed** on-disk content and compare it to the recorded value, failing when they differ (label honesty). A vendored skill with **uncommitted** local modifications MUST be surfaced as a **distinct** finding (a "dirty" verdict — "commit it / it has local modifications"), never silently passed nor conflated with a content mismatch. - **FR-010**: Verification MUST compare the set of skills present on disk against the set of recorded skills, failing when a skill is present but unrecorded (untracked/orphan) or recorded but absent (missing) — covering the whole set, not only recorded entries. - **FR-011**: Verification's orphan/completeness check MUST scan the canonical skills location (`.agents/skills`). This feature does not create per-client symlink views; robust handling of such views is deferred together with multi-client materialization (see Out of Scope). - **FR-012**: Verification MUST report *all* detected problems in a run (e.g. both a content mismatch and an untracked skill), not stop at the first. @@ -153,7 +153,7 @@ An automated caller — a CI merge gate or an agent deciding its next step — n - **FR-019**: Every error MUST state (a) what failed, (b) the real underlying reason (never swallowed), and (c) at least one concrete suggested fix; a verbose mode MUST expose the raw underlying cause. - **FR-020**: Diagnostic output MUST go to the error stream and primary data to the standard output stream, so output can be cleanly piped. - **FR-021**: The system MUST offer a machine-readable output mode whose output is complete (every checked skill with a per-skill verdict; no truncation) and parseable, for both passing and failing runs. 
-- **FR-022**: The system MUST use distinct, stable exit statuses: success; usage/config error (bad arguments, malformed record, not in a version-controlled repo); and verification failure (content mismatch or untracked/missing skill). The prerequisite-failure status is reserved and MUST NOT be emitted by this feature's commands. +- **FR-022**: The system MUST use distinct, stable exit statuses: success; usage/config error (bad arguments, malformed record, not in a git repo); and verification failure (content mismatch, untracked/missing skill, **or an uncommitted/locally-modified vendored skill**). The prerequisite-failure status is reserved and MUST NOT be emitted by this feature's commands. ### Key Entities *(include if feature involves data)* @@ -161,7 +161,7 @@ An automated caller — a CI merge gate or an agent deciding its next step — n - **Skill manifest**: the per-skill machine-readable description (name, version, namespace, description, discovery tags, declared backing-tool prerequisites) — vendored on disk as part of the skill subtree. Read at vendor time for identity. Its prerequisite declarations are **neither copied into the record nor evaluated** in this feature; the vendored manifest itself is the single source of truth for them (a later health command reads it directly). - **Skill record (lock)**: the committed file (`.skillrig/skills-lock.json`) mapping each vendored skill to its recorded version, provenance (origin + commit), content fingerprint, and location. It does **not** duplicate the skill's backing-tool prerequisites (those live in the vendored manifest). Written by vendoring, read by verification; the source of truth for "what was approved." - **Content fingerprint**: a value that uniquely reflects a skill's content as published for a given version. Used for *label honesty* — confirming on-disk content matches the version it claims to be. Computed identically at vendor time and verify time. 
-- **Verification verdict**: the outcome of verification — overall pass/fail plus a per-skill result (matched / content-mismatch / untracked / missing), surfaced both compactly for humans and completely for machines. +- **Verification verdict**: the outcome of verification — overall pass/fail plus a per-skill result: **matched** (`ok`), **content-mismatch** (`mismatch`), **untracked** (`orphan`), **missing**, or **uncommitted/modified** (`dirty`) — surfaced both compactly for humans and completely for machines. (Parenthesized names are the machine field values used in `--json`.) ## Out of Scope @@ -173,13 +173,14 @@ The following are explicitly **not** part of this feature and MUST NOT be pulled - **Discovery** (`index.json`, search) and any browse UI. - **Multi-client symlink materialization & agent-shell selection** — creating per-client view directories (e.g. `.claude/skills → ../.agents/skills`) and the `init`-time agent-shell selection stored in `.skillrig/config.toml`. `add` writes only the canonical `.agents/skills`; verification scans only that location. A separate, later feature. - **External-source allowlists, audit classification, and risk/vulnerability surfacing** — later governance work. +- **Global-scope skills** (`--global` / a future `global add` / `global verify` materializing into the user's home `~/.agents/skills`) — a separate, later tier; this feature is **project-scope only**. Global targets are **not** git repos, so the git-repo requirement here is project-scope-specific, and global `verify` will need a non-repo fingerprint mechanism (recorded as a future concern in the spike). - **Any authentication or credential handling.** ## Success Criteria *(mandatory)* ### Measurable Outcomes -- **SC-001**: A user can vendor a skill from a local library and verify it in two commands, with zero network access and no hand-authored records. 
+- **SC-001**: A user can vendor a skill from a local library, commit it, and verify it — the vendor→commit→verify round-trip — with zero network access and no hand-authored records. (Verification checks committed content, so the commit is part of the loop.) - **SC-002**: When a vendored skill's content matches its record, verification passes (success status); when any skill's content diverges from its record, verification fails with the verification-failure status — correct for 100% of label-honesty cases. - **SC-003**: A single altered byte in any vendored skill file is detected as a content mismatch — 0 false negatives; and when multiple skills are altered, all are reported in one run (the check never exits on the first failure). - **SC-004**: Any on-disk skill with no record entry (untracked) and any recorded skill absent on disk (missing) are both detected and fail the gate. @@ -201,7 +202,7 @@ The following are explicitly **not** part of this feature and MUST NOT be pulled **Assumptions**: - The repo is pointed at its origin via `init` (feature 001); for this feature the resolved origin is a **local** source (a local checkout). `add` consumes the *resolved* origin — there is no separate source argument that bypasses it. Remote fetch is additive future work. -- Both `add` and `verify` require being run inside a **git repository**: the canonical skills location and the content fingerprint both derive from the repo's git content model, so running outside one is a usage/config error. +- Both `add` and `verify` (this **project-scope** feature) require being run inside a **git repository**: the canonical skills location and the content fingerprint derive from the repo's git content model, so running outside one is a usage/config error. The deferred **global** scope (`--global` → user home, not a repo) is explicitly *not* bound by this (see Out of Scope). 
- Skills are vendored into the repo under version control, so the repo's own content model carries file integrity; the recorded fingerprint adds *label honesty* (content matches its claimed version) on top of that. - `add` requires the origin to be configured (it resolves it to know what to vendor); `verify` does **not** (it reads the committed record and on-disk content). The skill record file and the per-skill manifest are separate concerns from the origin config of the first slice. From 168afd110bdcb29859481ff9a2bfacf71c3cab3a Mon Sep 17 00:00:00 2001 From: Vincent De Smet Date: Sat, 30 May 2026 12:55:52 +0800 Subject: [PATCH 5/8] feat(002): skillcore SDK + add/verify (vendor & verify skills) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Implement the second slice: the shared integrity primitive pkg/skillcore (public SDK, SDK-1/AP-04) plus `skillrig add` (Vendor Mutation) and `skillrig verify` (Verification Gate) — making "the skill your agent runs is exactly the version that was reviewed and approved" demonstrable offline. skillcore (pkg/skillcore, presentation-free, one shared implementation): TreeSHA via shelled `git rev-parse` (git-canonical, relocation-invariant), ParseManifest (go-toml/v2), Read/WriteLock (atomic, deterministic, no `requires`), Add (mode-preserving vendor + lock; force/dry-run/idempotent), Verify (label-honesty + orphan/missing/dirty, read-only, aggregates all), typed errors (VerifyFailure, GitError, LockError, SkillNotFound, Overwrite). CLI (internal/cli): add (resolve origin via config.ResolveOrigin -> skillcore .Add -> render), verify (skillcore.Verify -> render -> exit), exitCodeFor maps *VerifyFailure -> ExitVerification(2), output renderers (compact human + footer / complete --json), git repo-root helper. 
Tests: TestQuickstart_* round-trip, tamper->mismatch, dirty, orphan, missing, aggregate-all, exit matrix, json-complete, help examples, ignores non-canonical view dirs; + skillcore unit tests (ground-truth TreeSHA == raw git). RAW-git is the independent oracle (no circular validation). Docs/skills: cli.md synced (verify integrity-only; pkg/skillcore as a separate public package; exit 3 reserved for doctor); contracts/add.md + skillrig-init local-origin note; new skillrig-add-verify agent skill + eval sets (Constitution IX); checkpoint session log + post-implementation adversarial review report. Tooling: implement-workflow (+v2) and checkpoint-workflow commands (latter adds a clean-tree precondition before launching review agents); remove AGENTS.md; CLAUDE.md + tasks-template updates. Open follow-ups (see specledger/002-skillcore-verify/reviews/002-review.md): AR-1/AR-2 local-origin CWD-relative resolution + misleading "skill not found" when the origin checkout is absent; AR-3 add pkg/skillcore/verify_test.go. New public package: github.com/skillrig/cli/pkg/skillcore. No new dependencies. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../specledger.checkpoint-workflow.md | 263 ++++ .../specledger.implement-workflow-v2.md | 115 ++ .../commands/specledger.implement-workflow.md | 115 ++ .agents/skills/skillrig-add-verify/SKILL.md | 151 ++ .../skillrig-add-verify/evals/evals.json | 63 + .../evals/trigger-eval-set.json | 22 + .agents/skills/skillrig-init/SKILL.md | 18 + .specledger/templates/tasks-template-v2.md | 287 ++++ AGENTS.md | 95 -- CLAUDE.md | 4 +- docs/design/cli.md | 24 +- internal/cli/add.go | 175 +++ internal/cli/exit.go | 28 +- internal/cli/output.go | 206 +++ internal/cli/repo.go | 63 + internal/cli/root.go | 14 + internal/cli/verify.go | 107 ++ pkg/skillcore/add.go | 399 +++++ pkg/skillcore/add_test.go | 221 +++ pkg/skillcore/errors.go | 50 + pkg/skillcore/git.go | 85 ++ pkg/skillcore/helpers_test.go | 156 ++ pkg/skillcore/lock.go | 114 ++ pkg/skillcore/lock_test.go | 131 ++ pkg/skillcore/manifest.go | 46 + pkg/skillcore/manifest_test.go | 97 ++ pkg/skillcore/treesha.go | 25 + pkg/skillcore/treesha_test.go | 166 ++ pkg/skillcore/verify.go | 348 +++++ .../002-skillcore-verify/contracts/add.md | 2 + .../reviews/002-review.md | 44 + .../002-skillcore-verify-checkpoint.md | 60 + test/skillcore_quickstart_test.go | 1332 +++++++++++++++++ .../sample-origin/.skillrig-origin.toml | 10 + .../skills/terraform-plan-review/SKILL.md | 16 + .../skills/terraform-plan-review/skill.toml | 23 + 36 files changed, 4960 insertions(+), 115 deletions(-) create mode 100644 .agents/commands/specledger.checkpoint-workflow.md create mode 100644 .agents/commands/specledger.implement-workflow-v2.md create mode 100644 .agents/commands/specledger.implement-workflow.md create mode 100644 .agents/skills/skillrig-add-verify/SKILL.md create mode 100644 .agents/skills/skillrig-add-verify/evals/evals.json create mode 100644 .agents/skills/skillrig-add-verify/evals/trigger-eval-set.json create mode 100644 .specledger/templates/tasks-template-v2.md delete mode 100644 
AGENTS.md create mode 100644 internal/cli/add.go create mode 100644 internal/cli/repo.go create mode 100644 internal/cli/verify.go create mode 100644 pkg/skillcore/add.go create mode 100644 pkg/skillcore/add_test.go create mode 100644 pkg/skillcore/errors.go create mode 100644 pkg/skillcore/git.go create mode 100644 pkg/skillcore/helpers_test.go create mode 100644 pkg/skillcore/lock.go create mode 100644 pkg/skillcore/lock_test.go create mode 100644 pkg/skillcore/manifest.go create mode 100644 pkg/skillcore/manifest_test.go create mode 100644 pkg/skillcore/treesha.go create mode 100644 pkg/skillcore/treesha_test.go create mode 100644 pkg/skillcore/verify.go create mode 100644 specledger/002-skillcore-verify/sessions/002-skillcore-verify-checkpoint.md create mode 100644 test/skillcore_quickstart_test.go create mode 100644 test/testdata/sample-origin/.skillrig-origin.toml create mode 100644 test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md create mode 100644 test/testdata/sample-origin/skills/terraform-plan-review/skill.toml diff --git a/.agents/commands/specledger.checkpoint-workflow.md b/.agents/commands/specledger.checkpoint-workflow.md new file mode 100644 index 0000000..618e806 --- /dev/null +++ b/.agents/commands/specledger.checkpoint-workflow.md @@ -0,0 +1,263 @@ +--- +description: Critical divergence review — compare implementation against plan artifacts, flag divergence from plan, and surface gaps. Updates session log at FEATURE_DIR/sessions/-checkpoint.md +--- + +## User Input + +```text +$ARGUMENTS +``` + +You **MUST** consider the user input before proceeding (if not empty). + +**Execution Tracking**: Before starting work, create a task list (using the TaskCreate tool) covering all execution steps in this workflow. 
If `$ARGUMENTS` contains user-specified actions beyond the standard workflow, place those tasks where they logically fit: before setup steps if arguments change what gets set up, or after all standard steps if arguments extend the workflow. Update task status as you complete each step. + +**User Interaction**: Whenever you need input, clarification, or a decision from the user, use the **AskUserQuestion** tool directly. Do not output questions as plain text and stop — always use the interactive tool for proper UX. + +## Purpose + +Perform a critical divergence review of the current implementation state against plan artifacts. Your job is to **find problems, not confirm success**. Surface plan drift, uncovered requirements, and implementation gaps that human reviewers need to know about before merge. + +**When to use**: During or after implementation to catch drift, before handoff, or before merging. + +## Framing + +Adopt an adversarial reviewer mindset. Assume the implementation has gaps until proven otherwise. + +## Outline + +Goal: Identify divergences between planned and actual implementation, classify them, and produce an actionable review. + +Execution steps: + +1. Run `sl spec info --json --paths-only` to get `FEATURE_DIR` and `BRANCH`. + +2. Gather implementation state: use git to inspect the staged changes. + +3. Run project tests and checks: + - Consult the project's `CLAUDE.md` (or equivalent) for the canonical test/lint/format commands. 
+ - If no project-level instructions exist, detect the project type and use conventional commands: + - **Go**: `go test ./...` + - **Node (npm/pnpm/yarn)**: check `package.json` for `test`, `lint`, `format:check` scripts and run those that exist + - **Python**: `pytest` or the configured test runner + - **Other**: look for a `Makefile`, `justfile`, or CI config for test commands + - If no test runner is configured, state that explicitly — do not fabricate a test step + - All executed checks must pass (exit code 0) for a clean checkpoint + - If any check fails, report failures and include them as CRITICAL divergences + +4. Compare implementation against plan artifacts: + + Read the following from `FEATURE_DIR` (skip any that don't exist): + + **From spec.md:** + - Functional requirements (FR-xxx or numbered list) + - User stories and their acceptance criteria + - Non-functional requirements + - Edge cases + - Derive Definition of Done per User Story acceptance criteria **Example conversion**: + - Spec acceptance: "Then the user can log in with valid credentials" + - DoD item: "- User can authenticate with valid username/password" + - Spec acceptance: "Then invalid credentials show an error message" + - DoD item: "- Invalid credentials display appropriate error message" + - Also verify: + - quickstart.md scenario(s) match this story's user stories + - TestQuickstart_ integration test(s) exist and pass for each scenario + + **From plan.md:** + - Phases and their deliverables + - Project structure (expected files/components) + - Architecture decisions and constraints + + **From data-model.md** (if present): + - Entity names and key fields + - Validation rules + - Relationships + + **From quickstart.md** (if present): + - Integration scenarios + - Expected output formats + + For each artifact claim, check: + - Does the implementation match the specification? (Check actual code if uncertain.) + - Are there planned files/components that don't exist? 
+ - Are there data model entities defined but not implemented, or implemented differently? + - Are there quickstart scenarios not validated by tests? + +5. Classify each divergence: + + **Severity** (use same scale as `/specledger.verify`): + - **CRITICAL**: Missing core requirement, failing tests, security/compliance gap + - **HIGH**: Significant unchecked DoD, requirement partially implemented, test gap for critical path + - **MEDIUM**: Data model drift, terminology inconsistency, undocumented architecture change + - **LOW**: Minor format difference, non-critical edge case not covered + + **Type** — Leverage any Decision in the session log: + - **conscious**: Divergence is documented somewhere (decision log, commented source code, ...) + - **oversight**: No documentation found — this was likely missed + +6. Update session log: + - Create `FEATURE_DIR/sessions/` directory if it doesn't exist + - **Determine output file based on scope**: + - **Phase-scoped checkpoint**: If `$ARGUMENTS` indicates a phase scope (e.g., `"Verify phase:setup issues only"`), write to `FEATURE_DIR/sessions/-checkpoint-.md`. One file per phase, overwriting any prior phase-scoped checkpoint for the same phase. + - **Full checkpoint** (no phase scope): Append a timestamped entry to `FEATURE_DIR/sessions/-checkpoint.md`. 
+ - Use the entry format below + + ```markdown + ## Divergence Review: YYYY-MM-DD HH:MM + + ### Divergences + + | # | Severity | Type | Category | Artifact | Description | + |---|----------|------|----------|----------------|-------------| + | 1 | HIGH | oversight | Missing requirement | spec.md FR-003 | Rate limiting not implemented | + | 2 | MEDIUM | conscious | Data model drift | SL-xxx / data-model.md | Field renamed from X to Y (documented in source code) | + + ### DoD Bypassed + + | User Story | Title | Acceptance Criteria | Risk | + |------------|-------|---------------------|------| + | US1 | Add validation | "Integration test passes" unchecked | HIGH — no test coverage | + + ### Issues Encountered & Resolutions + - + + ### Items Requiring Action Before Merge + 1. [CRITICAL] Fix + 2. [HIGH] Write test for + + ### Tests & Checks + - Status: PASS/FAIL/SKIPPED + - Commands run: + - Failures:
+ + --- + ``` + +7. Report divergence summary to the user: + - Lead with divergence count and severity breakdown + - Show the divergence table + - List items requiring action + - End with test status and progress numbers + - If CRITICAL divergences exist, recommend resolving before commit/merge + +8. Offer adversarial deep-dive agent: + + After reporting your findings, **always offer** to launch an independent adversarial review agent. This agent runs in a fresh context with no knowledge of the implementation session — it cannot rationalize shortcuts or inherit anchoring bias from prior decisions. + + > **PRECONDITION — commit first / clean working tree.** Before launching the review agent, the working tree SHOULD be committed (or otherwise clean). A thorough reviewer **exercises the real binary** — building, running the app, and running git round-trips (`add`/`commit`/`reset`, integration tests) to confirm behavior — and it may do so **in the repo itself**. That is *fine and encouraged once the tree is clean*: a committed tree means any stray test commit/reset the agent makes can't clobber uncommitted or mis-staged work, and is trivially undone (`git reset --hard ` / drop the dangling commit). If you launch with uncommitted or partially-staged changes, a reviewer's `git add -A && git commit` can sweep up files you didn't intend and disturb your staging. So: **commit, confirm `git status` is clean, then launch.** Do not instead forbid the agent from using git — the freedom to exercise the binary is what makes the review valuable. + + **When all Definition of Done items are complete**, strongly recommend running the adversarial agent as a best practice before merge. This is the highest-value moment: the work appears complete, so the risk of undetected drift is greatest. + + Otherwise, present it as an optional next step — useful when the checkpoint is mid-implementation and more sessions are expected. 
+ + Generate an adversarial review agent prompt by filling the Definition of Done items derived per user story into the prompt template below. + + ~~~ + You are an adversarial code reviewer. Your job is to find problems, not confirm success. + + ## Context + - Feature directory: {FEATURE_DIR} + - Branch: {BRANCH} + - This review is context-free by design — you have no prior knowledge of + implementation decisions or tradeoffs made during development. + + ## Instructions + + 1. Run `sl spec setup-plan --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH + 2. Read the spec, plan, and any design artifacts in {FEATURE_DIR}. + 3. Focus on these definition of done items per user story: + + x. User Story: + - + - + Also verify: + - quickstart.md scenario(s) match this story's user stories + - TestQuickstart_ integration test(s) exist and pass for each scenario + + + 4. Read the actual implementation code on this branch. For each requirement and + planned deliverable, verify it exists and behaves as specified. + 5. Run the project's test/lint commands (check CLAUDE.md for canonical commands). + 6. Produce a findings report: + - Divergences (severity + conscious/oversight classification) + - User Stories and Acceptance Criteria with unchecked DoD items + - Code quality concerns (dead code, missing error handling, untested paths) + - Requirements with no corresponding implementation + - Implementation that has no corresponding requirement (scope creep) + 7. Be specific: cite file paths, line numbers, user story IDs, and artifact references. + 8. Report findings only — do not fix anything. + ~~~ + + Show the generated prompt to the user and use AskUserQuestion to ask: **"Would you like me to launch an independent adversarial review agent?"** This runs in a separate context with no memory of this session — it reviews the code and artifacts cold. 
+ +## Behavior Rules + +- **Lead with divergences, not accomplishments** — the progress summary is an appendix +- **Flag unchecked DoD**: this is always worth reporting +- **Classify every divergence** as conscious or oversight by checking source code and decision logs +- **If zero divergences found**, report that explicitly — this is a positive signal worth stating, not a default +- All executed tests/checks must pass for a clean checkpoint +- Don't auto-commit — prompt user instead +- If CRITICAL divergences exist, strongly recommend resolving before merge +- If no progress since last checkpoint, report "no changes detected" +- Include file paths for uncommitted changes + +## Example Usage + +```bash +# Critical divergence review after implementation +/specledger.checkpoint + +# Review with specific focus area +/specledger.checkpoint "Focus on data model alignment and test coverage" + +# Pre-merge divergence review +/specledger.checkpoint "Pre-merge review for PR #42" + +# Checkpoint with known context +/specledger.checkpoint "We switched from go-vcr to httptest — flag that as conscious" +``` + +## Session Log Format + +Session logs are stored at `FEATURE_DIR/sessions/-checkpoint.md`: + +```markdown +# Session Log: + +## Divergence Review: 2026-03-05 14:30 + +### Divergences + +| # | Severity | Type | Category | Artifact | Description | +|---|----------|------|----------|----------------|-------------| +| 1 | HIGH | oversight | Missing requirement | spec.md FR-009 | JSONL fallback on 404 not implemented — only shows warning | +| 2 | LOW | conscious | Architecture change | plan.md Phase 2 | Used httptest instead of go-vcr cassettes | +| 3 | MEDIUM | oversight | Test gap | quickstart.md Scenario 12 | TestPlanShowCacheReuse never written | + +### DoD Missing + +| User Story | Title | Unchecked DoD Items | Risk | +|-------|-------|---------------------|------| +| US1 | go-vcr cassette setup | "Cassette file created", "Replay test passes" | LOW — httptest approach 
covers same ground | +| US2 | TestPlanShowCacheReuse | "Test implemented", "Cache hit verified" | MEDIUM — no test for cache reuse path | + +### Issues Encountered & Resolutions +- TestParsePlanJSONSensitive failed: sensitive values compared equal → added isSensitive flag +- TestRunCancelJSON mock returned non-cancelable state → fixed mock to return cancelable first + +### Items Requiring Action Before Merge +1. [HIGH] Fix Scenario 11 JSONL fallback (spec.md FR-009 requires it) +2. [MEDIUM] Write TestPlanShowCacheReuse or document why it's deferred +3. [MEDIUM] Verify formatAttrValue output matches quickstart scenarios + +### Tests & Checks +- Status: PASS +- Commands run: go test ./pkg/cli/commands/... ./pkg/plan/... +- 21 tests passing + +### Uncommitted Changes +- None + +--- +``` diff --git a/.agents/commands/specledger.implement-workflow-v2.md b/.agents/commands/specledger.implement-workflow-v2.md new file mode 100644 index 0000000..3417eaf --- /dev/null +++ b/.agents/commands/specledger.implement-workflow-v2.md @@ -0,0 +1,115 @@ +--- +description: EXPERIMENTAL — implement the current feature by fanning the plan/contracts out to subagents via a deterministic Workflow (interface-first pipeline → primitives → add/verify → CLI → tests → make-check-until-green), instead of /specledger.tasks + /specledger.implement. Run from a FRESH session at DEFAULT effort for cost. +handoffs: + - label: Checkpoint For Consistency + agent: specledger.checkpoint + prompt: Run critical divergence review +--- + +## User Input + +```text +$ARGUMENTS +``` + +Optional `$ARGUMENTS`: a phase-scope override (e.g. "skillcore only", "skip docs/skill phase") or a feature id. If empty, implement the full current feature. + +## Purpose + +Run **one deterministic multi-agent Workflow** that reads the design artifacts and fans the implementation out to subagents. 
This is an **experiment**: it deliberately **skips the durable `sl issue` ledger** — the **quickstart scenarios + `make check` are the acceptance gate** instead. + +> **Use AskUserQuestion to ask which model to use, and advise switching effort (it is inherited).** This script leaves `model` **unset** on every `agent()` (the override is optional; it defaults to the launcher's session model), which can keep the fanned-out agents cheap. The workflow author can add `model:` per `agent()` if a specific tier is wanted. + +## Execution steps + +1. **Locate artifacts**: run `sl spec info --json --paths-only`. Read from `FEATURE_DIR`: `plan.md`, `research.md`, `data-model.md`, `contracts/*.md`, `quickstart.md`. These are the source of truth — the agents will Read them too. +2. **Discover relevant skills**: enumerate the skills available in the session (the available-skills list surfaced by the harness; or invoke `/find-skills` for a gap). **Map each pipeline phase to the skills that govern that work** — e.g. Go code style, lint, cobra, agentic CLI design, Go testing. Workflow subagents **do** have the `Skill` tool (verified empirically), so every agent prompt **can and MUST** instruct the agent to load its relevant skills via the `Skill` tool *before* writing code. Record the phase→skills map you'll bake into the prompts. +3. **Author + launch the Workflow** following the pipeline below. It is a **dependency-ordered pipeline**, not a wide fan-out, because the Go code must compile together. Every `agent()` prompt MUST open with a `SKILLS:` directive (see the mandatory rule below). +4. When the workflow completes, **report**: files written, the final `make check` result, and any remaining failures. Do **not** create `sl issue` entries (the experiment skips the durable ledger). + +## Skill loading is mandatory (not optional) + +> **Every `agent()` prompt MUST begin with a `SKILLS:` line** naming the skills to invoke via the `Skill` tool and apply *before* doing the work. 
Design artifacts say *what* to build; the skills carry *how this repo builds it* (idioms, lint rules, cobra/CLI-design conventions, test patterns) — relying on the artifacts alone leaves that on the table. Workflow subagents have the `Skill` tool, so this works directly; do **not** distill skill content into the prompt by hand and do **not** assume an agent will load a skill unprompted. + +Here is an example phase-to-skill mapping; adjust it to your discovered skill set and the feature's needs: + +**Phase → skill mapping (adapt to the discovered skill set; this is the Go/CLI default):** + +| Phase | Skills the agent loads first | +|---|---| +| Scaffold | `golang-code-style` | +| Primitives | `golang-code-style` (+ `golang-cli` for any exec/IO/git-client file) | +| Operations | `golang-code-style` | +| CLI | `agentic-go-cli-design` + `golang-spf13-cobra` + `golang-cli` | +| Tests | `golang-testing` | +| Verify / repair | `golang-lint` + `golang-code-style` | +| Doc sync | `agentic-go-cli-design` | + +Map by *relevance*, not volume: 1–3 skills per agent. Loading skills unrelated to a phase just burns context. + +## Workflow pipeline (author this script) + +``` +export const meta = { + name: 'implement-feature', + description: 'Implement from its plan/contracts; gate on make check', + phases: [ + { title: 'Scaffold' }, { title: 'Primitives' }, { title: 'Operations' }, + { title: 'CLI' }, { title: 'Tests' }, { title: 'Verify' }, + ], +} +const FD = args.featureDir // pass FEATURE_DIR in via Workflow `args` + +// Every prompt OPENS with a `SKILLS:` directive — the agent invokes those skills +// via the Skill tool and applies them before writing code (mandatory rule above). + +// Phase 1 — Scaffold (interface-first): ONE agent pins the exact public API. +phase('Scaffold') +await agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: read ${FD}/contracts/skillcore-sdk.md and ${FD}/data-model.md. 
Create pkg/skillcore/ with the EXACT exported types + function signatures (Manifest, Require, LockFile, LockEntry, AddOptions, AddResult, Report, Verdict, Counts, VerifyFailure, GitError) and stub bodies returning errors.New("not implemented"). Ensure 'go build ./...' compiles. Touch ONLY pkg/skillcore/.`, + { phase: 'Scaffold', label: 'scaffold' }) + +// Phase 2 — Primitives in parallel (disjoint files). +phase('Primitives') +await parallel([ + () => agent(`SKILLS: invoke "golang-code-style" AND "golang-cli" via the Skill tool and apply them (this is the exec/IO/git boundary). Then: implement pkg/skillcore/git.go: a small git client (pluggable commandContext field + GitError{ExitCode,Stderr}, gh-cli pattern) with revParse and statusPorcelain helpers. Read ${FD}/research.md (D1,D7) and ${FD}/contracts/skillcore-sdk.md. Touch ONLY git.go.`, { phase: 'Primitives', label: 'git' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/treesha.go: TreeSHA(gitDir,ref,relPath) = shell 'git -C gitDir rev-parse :'. Read ${FD}/research.md (D1) + ${FD}/data-model.md. Touch ONLY treesha.go.`, { phase: 'Primitives', label: 'treesha' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/manifest.go: ParseManifest(skill.toml) via go-toml/v2; ignore unknown keys. Read ${FD}/data-model.md. Touch ONLY manifest.go.`, { phase: 'Primitives', label: 'manifest' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/lock.go: ReadLock/WriteLock for .skillrig/skills-lock.json (atomic temp+rename, deterministic JSON, NO 'requires' field). Read ${FD}/data-model.md (D4). Touch ONLY lock.go.`, { phase: 'Primitives', label: 'lock' }), +]) + +// Phase 3 — Operations in parallel (depend on primitives). 
+phase('Operations') +await parallel([ + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/add.go: Add(opts) — copy subtree mode-preserving (no injection), treeSha+commit via the git client on the ORIGIN, write lock; force/dry-run/idempotent. Read ${FD}/contracts/add.md + ${FD}/contracts/skillcore-sdk.md. Touch ONLY add.go.`, { phase: 'Operations', label: 'add' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/verify.go: Verify(repoRoot) — label-honesty (TreeSHA on HEAD vs lock) + orphan/completeness + dirty; aggregate ALL findings; read-only; return *VerifyFailure when not ok. Read ${FD}/contracts/verify.md. Touch ONLY verify.go (+ errors.go if needed).`, { phase: 'Operations', label: 'verify' }), +]) + +// Phase 4 — CLI wiring. +phase('CLI') +await agent(`SKILLS: invoke "agentic-go-cli-design", "golang-spf13-cobra", AND "golang-cli" via the Skill tool and apply them (they encode this repo's CLI contract — errors-as-navigation, two-level output, exit codes, cobra patterns). Then: wire the CLI in internal/cli/: add.go (resolve origin via config.ResolveOrigin, call skillcore.Add, render), verify.go (call skillcore.Verify, render, exit code), extend exit.go so *skillcore.VerifyFailure → ExitVerification(2), output.go renderers (human compact + --json), register both in root.go. Read ${FD}/contracts/{add,verify}.md + ${FD}/plan.md. Match the existing internal/cli style.`, { phase: 'CLI', label: 'cli' }) + +// Phase 5 — Tests in parallel (RAW git oracle — never skillcore; research D11). +phase('Tests') +await parallel([ + () => agent(`SKILLS: invoke "golang-testing" via the Skill tool and apply it (table-driven, helpers, t.TempDir, idiomatic naming). 
Then: create test/testdata/sample-origin/ (.skillrig-origin.toml + skills//{SKILL.md,skill.toml}, research D12) and test/quickstart_test.go with the TestQuickstart_* scenarios from ${FD}/quickstart.md. Bootstrap fixtures with RAW git (git init/commit) and compute expected tree-SHAs with RAW 'git rev-parse' — NEVER via skillcore (Constitution III, no circular oracle). Touch ONLY test/.`, { phase: 'Tests', label: 'quickstart' }), + () => agent(`SKILLS: invoke "golang-testing" via the Skill tool and apply it. Then: write pkg/skillcore/*_test.go unit tests: a ground-truth test asserting skillcore.TreeSHA == raw 'git rev-parse' output; lock round-trip; manifest parse; error paths via a stubbed commandContext. Touch ONLY pkg/skillcore/*_test.go.`, { phase: 'Tests', label: 'unit' }), +]) + +// Phase 6 — Verify + repair (loop on make check until green or budget). +phase('Verify') +let green = false +for (let i = 0; i < 4 && !green; i++) { + const r = await agent(`SKILLS: invoke "golang-lint" AND "golang-code-style" via the Skill tool and use them to interpret/fix lint+vet+fmt output (nolint only as a justified last resort). Then: run 'make check' (fmt+vet+lint+test). If it passes, report PASS. If not, FIX the smallest set of files to make it pass (respect the contracts; do not weaken tests to pass) and report what you changed. Return JSON {pass:boolean, changed:string[], failures:string}.`, + { phase: 'Verify', label: `make-check#${i+1}`, schema: { type:'object', properties:{ pass:{type:'boolean'}, changed:{type:'array',items:{type:'string'}}, failures:{type:'string'} }, required:['pass'] } }) + green = r?.pass === true + log(`make check round ${i+1}: ${green ? 'PASS' : 'fail'}`) +} +return { green } +``` + +Pass `args: { featureDir: "" }` to the Workflow call. After it returns, report `green` + the last round's `failures`/`changed` if not green, and list the files created under `pkg/skillcore/`, `internal/cli/`, `test/`. 
+ +## What this deliberately skips (experiment) + +- **No `sl issue` ledger** — the durable, team-visible record is *not* produced; the acceptance gate is the quickstart tests + `make check`. +- **No `docs/design/cli.md` / agent-skill update** unless you add a Phase 7 agent (recommended before merge — Constitution IX + the same-branch doc-sync rule). diff --git a/.agents/commands/specledger.implement-workflow.md b/.agents/commands/specledger.implement-workflow.md new file mode 100644 index 0000000..3417eaf --- /dev/null +++ b/.agents/commands/specledger.implement-workflow.md @@ -0,0 +1,115 @@ +--- +description: EXPERIMENTAL — implement the current feature by fanning the plan/contracts out to subagents via a deterministic Workflow (interface-first pipeline → primitives → add/verify → CLI → tests → make-check-until-green), instead of /specledger.tasks + /specledger.implement. Run from a FRESH session at DEFAULT effort for cost. +handoffs: + - label: Checkpoint For Consistency + agent: specledger.checkpoint + prompt: Run critical divergence review +--- + +## User Input + +```text +$ARGUMENTS +``` + +Optional `$ARGUMENTS`: a phase-scope override (e.g. "skillcore only", "skip docs/skill phase") or a feature id. If empty, implement the full current feature. + +## Purpose + +Run **one deterministic multi-agent Workflow** that reads the design artifacts and fans the implementation out to subagents. This is an **experiment**: it deliberately **skips the durable `sl issue` ledger** — the **quickstart scenarios + `make check` are the acceptance gate** instead. + +> **Use AskUserQuestion to ask which model to use, and advise switching effort (it is inherited).** This script leaves `model` **unset** on every `agent()` (the override is optional; it defaults to the launcher's session model), which can keep the fanned-out agents cheap. The workflow author can add `model:` per `agent()` if a specific tier is wanted. + +## Execution steps + +1. 
**Locate artifacts**: run `sl spec info --json --paths-only`. Read from `FEATURE_DIR`: `plan.md`, `research.md`, `data-model.md`, `contracts/*.md`, `quickstart.md`. These are the source of truth — the agents will Read them too. +2. **Discover relevant skills**: enumerate the skills available in the session (the available-skills list surfaced by the harness; or invoke `/find-skills` for a gap). **Map each pipeline phase to the skills that govern that work** — e.g. Go code style, lint, cobra, agentic CLI design, Go testing. Workflow subagents **do** have the `Skill` tool (verified empirically), so every agent prompt **can and MUST** instruct the agent to load its relevant skills via the `Skill` tool *before* writing code. Record the phase→skills map you'll bake into the prompts. +3. **Author + launch the Workflow** following the pipeline below. It is a **dependency-ordered pipeline**, not a wide fan-out, because the Go code must compile together. Every `agent()` prompt MUST open with a `SKILLS:` directive (see the mandatory rule below). +4. When the workflow completes, **report**: files written, the final `make check` result, and any remaining failures. Do **not** create `sl issue` entries (the experiment skips the durable ledger). + +## Skill loading is mandatory (not optional) + +> **Every `agent()` prompt MUST begin with a `SKILLS:` line** naming the skills to invoke via the `Skill` tool and apply *before* doing the work. Design artifacts say *what* to build; the skills carry *how this repo builds it* (idioms, lint rules, cobra/CLI-design conventions, test patterns) — relying on the artifacts alone leaves that on the table. Workflow subagents have the `Skill` tool, so this works directly; do **not** distill skill content into the prompt by hand and do **not** assume an agent will load a skill unprompted. 
+ +Here is an example phase-to-skill mapping; adjust it to your discovered skill set and the feature's needs: + +**Phase → skill mapping (adapt to the discovered skill set; this is the Go/CLI default):** + +| Phase | Skills the agent loads first | +|---|---| +| Scaffold | `golang-code-style` | +| Primitives | `golang-code-style` (+ `golang-cli` for any exec/IO/git-client file) | +| Operations | `golang-code-style` | +| CLI | `agentic-go-cli-design` + `golang-spf13-cobra` + `golang-cli` | +| Tests | `golang-testing` | +| Verify / repair | `golang-lint` + `golang-code-style` | +| Doc sync | `agentic-go-cli-design` | + +Map by *relevance*, not volume: 1–3 skills per agent. Loading skills unrelated to a phase just burns context. + +## Workflow pipeline (author this script) + +``` +export const meta = { + name: 'implement-feature', + description: 'Implement from its plan/contracts; gate on make check', + phases: [ + { title: 'Scaffold' }, { title: 'Primitives' }, { title: 'Operations' }, + { title: 'CLI' }, { title: 'Tests' }, { title: 'Verify' }, + ], +} +const FD = args.featureDir // pass FEATURE_DIR in via Workflow `args` + +// Every prompt OPENS with a `SKILLS:` directive — the agent invokes those skills +// via the Skill tool and applies them before writing code (mandatory rule above). + +// Phase 1 — Scaffold (interface-first): ONE agent pins the exact public API. +phase('Scaffold') +await agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: read ${FD}/contracts/skillcore-sdk.md and ${FD}/data-model.md. Create pkg/skillcore/ with the EXACT exported types + function signatures (Manifest, Require, LockFile, LockEntry, AddOptions, AddResult, Report, Verdict, Counts, VerifyFailure, GitError) and stub bodies returning errors.New("not implemented"). Ensure 'go build ./...' compiles. Touch ONLY pkg/skillcore/.`, + { phase: 'Scaffold', label: 'scaffold' }) + +// Phase 2 — Primitives in parallel (disjoint files). 
+phase('Primitives') +await parallel([ + () => agent(`SKILLS: invoke "golang-code-style" AND "golang-cli" via the Skill tool and apply them (this is the exec/IO/git boundary). Then: implement pkg/skillcore/git.go: a small git client (pluggable commandContext field + GitError{ExitCode,Stderr}, gh-cli pattern) with revParse and statusPorcelain helpers. Read ${FD}/research.md (D1,D7) and ${FD}/contracts/skillcore-sdk.md. Touch ONLY git.go.`, { phase: 'Primitives', label: 'git' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/treesha.go: TreeSHA(gitDir,ref,relPath) = shell 'git -C gitDir rev-parse :'. Read ${FD}/research.md (D1) + ${FD}/data-model.md. Touch ONLY treesha.go.`, { phase: 'Primitives', label: 'treesha' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/manifest.go: ParseManifest(skill.toml) via go-toml/v2; ignore unknown keys. Read ${FD}/data-model.md. Touch ONLY manifest.go.`, { phase: 'Primitives', label: 'manifest' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/lock.go: ReadLock/WriteLock for .skillrig/skills-lock.json (atomic temp+rename, deterministic JSON, NO 'requires' field). Read ${FD}/data-model.md (D4). Touch ONLY lock.go.`, { phase: 'Primitives', label: 'lock' }), +]) + +// Phase 3 — Operations in parallel (depend on primitives). +phase('Operations') +await parallel([ + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. Then: implement pkg/skillcore/add.go: Add(opts) — copy subtree mode-preserving (no injection), treeSha+commit via the git client on the ORIGIN, write lock; force/dry-run/idempotent. Read ${FD}/contracts/add.md + ${FD}/contracts/skillcore-sdk.md. Touch ONLY add.go.`, { phase: 'Operations', label: 'add' }), + () => agent(`SKILLS: invoke "golang-code-style" via the Skill tool and apply it. 
Then: implement pkg/skillcore/verify.go: Verify(repoRoot) — label-honesty (TreeSHA on HEAD vs lock) + orphan/completeness + dirty; aggregate ALL findings; read-only; return *VerifyFailure when not ok. Read ${FD}/contracts/verify.md. Touch ONLY verify.go (+ errors.go if needed).`, { phase: 'Operations', label: 'verify' }), +]) + +// Phase 4 — CLI wiring. +phase('CLI') +await agent(`SKILLS: invoke "agentic-go-cli-design", "golang-spf13-cobra", AND "golang-cli" via the Skill tool and apply them (they encode this repo's CLI contract — errors-as-navigation, two-level output, exit codes, cobra patterns). Then: wire the CLI in internal/cli/: add.go (resolve origin via config.ResolveOrigin, call skillcore.Add, render), verify.go (call skillcore.Verify, render, exit code), extend exit.go so *skillcore.VerifyFailure → ExitVerification(2), output.go renderers (human compact + --json), register both in root.go. Read ${FD}/contracts/{add,verify}.md + ${FD}/plan.md. Match the existing internal/cli style.`, { phase: 'CLI', label: 'cli' }) + +// Phase 5 — Tests in parallel (RAW git oracle — never skillcore; research D11). +phase('Tests') +await parallel([ + () => agent(`SKILLS: invoke "golang-testing" via the Skill tool and apply it (table-driven, helpers, t.TempDir, idiomatic naming). Then: create test/testdata/sample-origin/ (.skillrig-origin.toml + skills//{SKILL.md,skill.toml}, research D12) and test/quickstart_test.go with the TestQuickstart_* scenarios from ${FD}/quickstart.md. Bootstrap fixtures with RAW git (git init/commit) and compute expected tree-SHAs with RAW 'git rev-parse' — NEVER via skillcore (Constitution III, no circular oracle). Touch ONLY test/.`, { phase: 'Tests', label: 'quickstart' }), + () => agent(`SKILLS: invoke "golang-testing" via the Skill tool and apply it. 
Then: write pkg/skillcore/*_test.go unit tests: a ground-truth test asserting skillcore.TreeSHA == raw 'git rev-parse' output; lock round-trip; manifest parse; error paths via a stubbed commandContext. Touch ONLY pkg/skillcore/*_test.go.`, { phase: 'Tests', label: 'unit' }), +]) + +// Phase 6 — Verify + repair (loop on make check until green or budget). +phase('Verify') +let green = false +for (let i = 0; i < 4 && !green; i++) { + const r = await agent(`SKILLS: invoke "golang-lint" AND "golang-code-style" via the Skill tool and use them to interpret/fix lint+vet+fmt output (nolint only as a justified last resort). Then: run 'make check' (fmt+vet+lint+test). If it passes, report PASS. If not, FIX the smallest set of files to make it pass (respect the contracts; do not weaken tests to pass) and report what you changed. Return JSON {pass:boolean, changed:string[], failures:string}.`, + { phase: 'Verify', label: `make-check#${i+1}`, schema: { type:'object', properties:{ pass:{type:'boolean'}, changed:{type:'array',items:{type:'string'}}, failures:{type:'string'} }, required:['pass'] } }) + green = r?.pass === true + log(`make check round ${i+1}: ${green ? 'PASS' : 'fail'}`) +} +return { green } +``` + +Pass `args: { featureDir: "" }` to the Workflow call. After it returns, report `green` + the last round's `failures`/`changed` if not green, and list the files created under `pkg/skillcore/`, `internal/cli/`, `test/`. + +## What this deliberately skips (experiment) + +- **No `sl issue` ledger** — the durable, team-visible record is *not* produced; the acceptance gate is the quickstart tests + `make check`. +- **No `docs/design/cli.md` / agent-skill update** unless you add a Phase 7 agent (recommended before merge — Constitution IX + the same-branch doc-sync rule). 
diff --git a/.agents/skills/skillrig-add-verify/SKILL.md b/.agents/skills/skillrig-add-verify/SKILL.md new file mode 100644 index 0000000..a127b0f --- /dev/null +++ b/.agents/skills/skillrig-add-verify/SKILL.md @@ -0,0 +1,151 @@ +--- +name: skillrig-add-verify +description: >- + Vendor agent skills from your configured origin into a repo with `skillrig add`, and + prove vendored skills are exactly what was recorded with `skillrig verify`. Use this + whenever the user wants to "add/vendor/pull in a skill" from their org's skills library, + "lock" or "pin" a skill into the repo, set up a CI gate that the committed skills are + unmodified, "check/verify our skills haven't been tampered with", understand the + vendor→commit→verify round-trip, read a verify pass/fail or exit code, or debug a + `mismatch` / `orphan` / `missing` / `dirty` verdict. Trigger even when the user doesn't + name the command — e.g. "make sure nobody changed our skills", "why did the skills check + fail in CI", or "our agent skill got edited, how do I restore it". Also use to explain + that a missing backing tool is NOT a verify failure (integrity-only). +license: MIT +metadata: + author: skillrig + cli: skillrig + user-invocable: true +--- + +# skillrig-add-verify Skill + +**When to Load**: The user wants to vendor a skill from their origin (`skillrig add`), +verify the repo's vendored skills against their recorded identities (`skillrig verify`), +gate CI on that verification, or interpret/debug a verify outcome (exit codes, per-skill +verdicts) — or whenever `skillrig add` / `skillrig verify` is referenced. + +## The promise these two commands make + +> *The skill your agent runs is exactly the version that was reviewed and approved.* + +`add` records a tamper-evident fingerprint of a skill's content when it is vendored; +`verify` later recomputes that fingerprint and fails if it drifted. 
Both use the **same** +git tree-SHA, computed by shelling `git`, so the value written at vendor time and the value +checked at verify time **cannot diverge** — the gate cannot lie. This is the whole point; +keep it intact (never hand-edit the lock, never mutate vendored files to "fix" a mismatch). + +## `skillrig add <name>` — vendor a skill (Vendor Mutation) + +Vendors `<name>` from the repo's **configured origin** into the canonical +`.agents/skills/<name>/`, byte-identical and mode-preserving (it injects nothing), and +records its identity — `version`, `commit`, `treeSha`, `path` — in `.skillrig/skills-lock.json`. +Offline and consume-only. **Requires a git repository** (project scope). + +- **Origin, not a path**: `add` resolves the active origin via the shared resolver + (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) exactly like every command. + There is **no** `--from`/path argument. +- **Local origin (this release)**: the configured `OWNER/REPO` is read from a local git + checkout at `./OWNER/REPO`, relative to where you run `add` (your repo root) — no network. + So `init --origin my-org/my-skills` expects that library checked out at `./my-org/my-skills` + (keep it out of your index, e.g. `echo 'my-org/' >> .git/info/exclude`). +- **Idempotent**: re-adding identical content reports success and changes nothing + (`action: "unchanged"`). +- **Never clobbers**: if the on-disk copy diverges from the recorded fingerprint, `add` + **refuses** without `--force` (so local edits are never lost silently). It does **not** + three-way-merge — re-vendoring the same version has no upstream change to merge; that is a + future `bump`. Use `--force` to overwrite with the origin's content, or revert your edits. +- **`--dry-run`** previews placement + record changes and writes nothing. + +After `add`, **commit** `.agents/skills/` + the lock, then run `verify` — verify +checks the *committed* tree.
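
The fingerprint described above can be inspected by hand. The sketch below assumes the recorded `treeSha` is whatever `git rev-parse <rev>:<path>` prints for the skill directory; `tree_sha` is a hypothetical helper name used only for illustration, not a skillrig command, and skillrig's real internals may differ.

```shell
# Sketch under assumptions: print the tree object SHA that git records for a
# directory at a given revision. tree_sha is a hypothetical helper, not part
# of skillrig.
tree_sha() {
  # $1 = repository directory
  # $2 = revision to read from (for example HEAD)
  # $3 = directory path relative to the repository root
  git -C "$1" rev-parse "$2:$3"
}

# Example call (assumes a skill is vendored and committed in the current repo):
#   tree_sha . HEAD .agents/skills/terraform-plan-review
```

Because the value is read from the commit named by the revision, editing the working copy without committing does not change what this prints, which is consistent with `verify` checking committed content.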
+ +| Flag | Purpose | +|------|---------| +| `--dry-run` | Report what would be vendored/recorded; write nothing | +| `--force` | Overwrite a vendored skill whose on-disk content diverges from the lock | +| `--json` | Emit the complete `AddResult` on stdout | +| `--verbose` | Show underlying paths / raw git cause behind summaries and errors | + +`--json` keys (always present): `ok, name, version, path, commit, treeSha, action, dryRun`; +`action ∈ {vendored, unchanged, overwritten}`. + +## `skillrig verify` — prove vendored skills are unmodified (Verification Gate) + +Checks **this repository's** vendored skills (project scope: `.agents/skills` vs the +committed `.skillrig/skills-lock.json`) — offline, deterministic, **read-only**. It +aggregates **all** findings in one run (never stops at the first failure). Takes no args. + +Two checks: +- **Label-honesty**: recompute each locked skill's tree-SHA from its **committed** content + and compare to the lock. +- **Orphan / completeness**: the on-disk skill set under `.agents/skills` must equal the + locked set — an unrecorded skill (`orphan`) or a recorded-but-absent one (`missing`) fails. + +### Per-skill verdicts (the `status` field) + +| Status | Meaning | Fix | +|--------|---------|-----| +| `ok` | committed content matches the recorded fingerprint | — | +| `mismatch` | committed content differs from the record (label-honesty failure) | re-`add` from origin, or restore the approved content | +| `orphan` | on disk but no lock entry (untracked — the primary supply-chain risk) | `skillrig add` it, or remove it | +| `missing` | lock entry whose files are absent | restore the files, or remove the lock entry | +| `dirty` | locked + present but **uncommitted / locally modified** | commit it (verify checks committed content) — *distinct* from `mismatch` | + +### CRITICAL: verify is integrity-only — a missing backing tool is NOT a failure + +`verify` does **no** prerequisite/eligibility check. 
A skill may declare `[[requires]]` +backing tools in its `skill.toml`; if those tools are absent in the environment, `verify` +**still passes** (it checks content, not runnability). Prerequisite checking is a future +`doctor` concern (the reserved exit `3`), never emitted here. Don't tell a user that verify +failed because a tool isn't installed — that's never the cause. + +## Exit codes (load-bearing — branch on these in CI/agents) + +| Code | When | +|------|------| +| `0` | All verdicts `ok` (**including** the empty case: no skills / no lock → clean pass) | +| `1` | Usage/config: malformed or unreadable lock, bad flags, **not inside a git repo** | +| `2` | Verification failure: any `mismatch`, `orphan`, `missing`, or `dirty` | +| `3` | **Never emitted** — reserved for `doctor`'s prerequisite class | + +A malformed lock is a **`1`**, not a `2` — keep that distinction when scripting (a `2` +means "content drifted"; a `1` means "I couldn't even run the check"). + +`verify --json` keys: `ok, counts{verified,mismatch,orphan,missing,dirty}, verdicts[]` with +each verdict carrying `name, path, status, expectedTreeSha, actualTreeSha, reason`. +Diagnostics go to stderr, so `skillrig verify --json 2>/dev/null | jq .` stays clean JSON. + +## Workflow patterns + +1. **Vendor + lock a skill**: + `skillrig add terraform-plan-review` → `git add -A && git commit` → `skillrig verify`. +2. **CI merge gate** (the headline use): run `skillrig verify` (or `--json` for an agent); + exit `0` proceeds, `2` blocks with a per-skill report, `1` is a setup/config problem. +3. **Recover a tampered skill**: a `mismatch`/`dirty` verdict → re-vendor from origin with + `skillrig add --force`, then commit and re-verify. +4. **Found an `orphan`**: either `skillrig add` it (record it) or delete the directory. +5. **Preview before writing**: `skillrig add --dry-run`. 
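
The exit-code table and the CI merge gate pattern above can be sketched as a small shell helper. This is a minimal illustration, not part of skillrig: `gate_decision` and the decision labels it prints are hypothetical names, and only the meanings of codes `0`, `1`, `2`, and `3` come from the table.

```shell
# Sketch: map a `skillrig verify` exit code to a CI decision.
# gate_decision and its output labels are hypothetical, chosen for illustration.
gate_decision() {
  case "$1" in
    0) echo "proceed" ;;      # all verdicts ok, including the empty case
    1) echo "setup-error" ;;  # usage/config: malformed lock, bad flags, not a git repo
    2) echo "block" ;;        # mismatch, orphan, missing, or dirty
    *) echo "unexpected" ;;   # 3 is reserved for doctor and is never emitted by verify
  esac
}

# Usage in a CI step (assumes `skillrig` is on PATH):
#   skillrig verify
#   gate_decision "$?"
```

Branching on the numeric code rather than on the human-readable report keeps the gate stable if the wording of the output changes.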
+ +## Error handling + +| Symptom (stderr) | Cause | Fix | +|------------------|-------|-----| +| `no origin configured` | no `SKILLRIG_ORIGIN` / project / global origin | `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN` | +| `skill "" not found in origin` | no `skills//` at the configured origin | check the name against the origin's `skills/` | +| `refusing to overwrite ` | on-disk content diverges from the record | re-run with `--force`, or revert local edits | +| `not a git repository` | `add`/`verify` run outside a repo | run inside the repo (or `git init` first) | +| `cannot read .skillrig/skills-lock.json` | malformed/unreadable lock (exit `1`, **not** `2`) | check/repair the file, or re-vendor with `skillrig add` | + +All failures state what/why/fix; add `--verbose` for the raw underlying cause. Errors go to +stderr, data to stdout. + +## Token efficiency + +Human output is compact (a summary line per finding + a footer hint). Use `--json` only when +a program/agent will parse the verdicts; otherwise the compact human form keeps context small. + +## Related + +- `skillrig-init` — bind the repo to an origin first (this skill assumes an origin is set; + see that skill for origin references, `@REF` branch tracking, and resolution precedence). diff --git a/.agents/skills/skillrig-add-verify/evals/evals.json b/.agents/skills/skillrig-add-verify/evals/evals.json new file mode 100644 index 0000000..2b84a8d --- /dev/null +++ b/.agents/skills/skillrig-add-verify/evals/evals.json @@ -0,0 +1,63 @@ +[ + { + "id": 1, + "name": "vendor-a-skill-roundtrip", + "description": "User wants to pull a skill from their library into the repo — should reach for `skillrig add ` and explain the vendor→commit→verify round-trip (lands in .agents/skills, records the lock, commit, then verify).", + "prompt": "Our team's skills library is already configured for this repo. 
I want to bring in the `terraform-plan-review` skill and make sure it's locked to exactly the approved version. How do I do that with skillrig?", + "trap": "Model invents a non-existent command (e.g. `skillrig install`/`skillrig pull`), suggests hand-editing .skillrig/skills-lock.json, copies files manually instead of `skillrig add`, expects a network fetch, or forgets that you must commit before `verify` (which checks committed content).", + "assertions": [ + { "id": "1.1", "text": "Recommends `skillrig add terraform-plan-review` to vendor from the configured origin into .agents/skills/" }, + { "id": "1.2", "text": "Explains the identity is recorded in .skillrig/skills-lock.json (version/commit/treeSha/path) — not hand-authored" }, + { "id": "1.3", "text": "Says to git-commit the vendored skill + lock, THEN run `skillrig verify` (verify checks committed content)" }, + { "id": "1.4", "text": "Does NOT invent a path/`--from` argument or claim it fetches over the network (local, consume-only this release)" } + ] + }, + { + "id": 2, + "name": "verify-integrity-only-not-prereq", + "description": "User suspects verify failed because a backing CLI is missing — should explain verify is integrity-only and a missing backing tool NEVER fails verify (prereq is a future doctor / reserved exit 3).", + "prompt": "`skillrig verify` is failing in CI and one of our skills declares it needs `terraform` and `oxid`, which aren't installed on the CI runner. 
Is the missing tool why verify fails, and should I install them to fix it?", + "trap": "Model claims verify checks prerequisites / that the missing tool causes the failure, tells the user to install the tools to fix verify, or conflates integrity (exit 2) with a prerequisite check (the reserved exit 3).", + "assertions": [ + { "id": "2.1", "text": "States a missing backing tool does NOT cause a verify failure — verify is integrity-only (checks content, not runnability)" }, + { "id": "2.2", "text": "Attributes prerequisite/eligibility checking to a future `doctor` (the reserved exit 3, never emitted by verify)" }, + { "id": "2.3", "text": "Redirects the user to the real cause: a non-zero verify is a content mismatch / orphan / missing / dirty (exit 2), or a config problem (exit 1) — inspect with `skillrig verify --json`" } + ] + }, + { + "id": 3, + "name": "ci-gate-exit-codes", + "description": "User wants to gate a merge on verify — should explain the stable exit codes (0 pass / 1 usage-config / 2 verification failure; 3 never) and branching, not prose parsing.", + "prompt": "I want our merge pipeline to block if any vendored skill has been tampered with. How do I wire `skillrig verify` into CI, and what exit codes do I branch on?", + "trap": "Model gets the exit codes wrong (e.g. says verification failure is exit 1), suggests grepping the human text instead of using the exit code or `--json`, claims a malformed lock is exit 2, or mentions exit 3 as something verify emits.", + "assertions": [ + { "id": "3.1", "text": "Says exit 0 = pass (including empty/no-skills), exit 2 = verification failure (mismatch/orphan/missing/dirty), exit 1 = usage/config (incl. 
malformed lock, not-a-git-repo)" }, + { "id": "3.2", "text": "Recommends branching on the exit code (and/or `--json`), not parsing prose" }, + { "id": "3.3", "text": "Notes a malformed/unreadable lock is exit 1 (distinct from a content failure 2), and exit 3 is never emitted by verify" } + ] + }, + { + "id": 4, + "name": "divergent-overwrite-force", + "description": "User locally edited a vendored skill and re-adds the same version — should explain add refuses without --force (no silent clobber, no three-way merge) and how to proceed.", + "prompt": "I made some local tweaks to .agents/skills/terraform-plan-review, and now `skillrig add terraform-plan-review` is erroring with something about refusing to overwrite. What's going on and how do I get the original back?", + "trap": "Model claims `add` performs a three-way merge, tells the user to hand-edit the lock, suggests deleting the lock entry, or implies add will silently overwrite their edits.", + "assertions": [ + { "id": "4.1", "text": "Explains add refuses because the on-disk content diverges from the recorded fingerprint (it never silently clobbers local edits)" }, + { "id": "4.2", "text": "Says `skillrig add terraform-plan-review --force` overwrites with the origin's approved content (or revert the local edits to proceed)" }, + { "id": "4.3", "text": "Does NOT claim add three-way-merges the local edits (re-vendoring the same version has no upstream change to merge — that is a future `bump`)" } + ] + }, + { + "id": 5, + "name": "dirty-vs-mismatch-verdict", + "description": "User gets a `dirty` verdict — should explain it means uncommitted/locally-modified (commit it), distinct from a content `mismatch`, because verify checks the committed tree.", + "prompt": "`skillrig verify --json` shows one of my skills with status `dirty` rather than `ok` or `mismatch`. I haven't changed the recorded version on purpose. 
What does dirty mean and how do I clear it?", + "trap": "Model conflates `dirty` with `mismatch`, tells the user the recorded fingerprint is wrong, or suggests editing the lock — instead of recognizing that verify checks the COMMITTED tree and the working copy has uncommitted changes.", + "assertions": [ + { "id": "5.1", "text": "Explains `dirty` = the vendored skill has uncommitted/locally-modified content (verify checks the committed tree)" }, + { "id": "5.2", "text": "Says to commit the vendored files (or discard the local changes) so the committed tree matches the record, then re-verify" }, + { "id": "5.3", "text": "Distinguishes `dirty` from `mismatch` (mismatch = committed content differs from the record; dirty = not yet committed) and does NOT suggest editing the lock" } + ] + } +] diff --git a/.agents/skills/skillrig-add-verify/evals/trigger-eval-set.json b/.agents/skills/skillrig-add-verify/evals/trigger-eval-set.json new file mode 100644 index 0000000..26bcf01 --- /dev/null +++ b/.agents/skills/skillrig-add-verify/evals/trigger-eval-set.json @@ -0,0 +1,22 @@ +[ + { "query": "Our skills library is configured for this repo — how do I vendor the terraform-plan-review skill into it with skillrig?", "should_trigger": true }, + { "query": "I want to pull in an agent skill from our org library and lock it to the approved version. 
How?", "should_trigger": true }, + { "query": "How do I add a CI check that the skills committed in our repo haven't been modified from what was approved?", "should_trigger": true }, + { "query": "skillrig verify failed in CI and exited 2 — what does that mean and which skills are the problem?", "should_trigger": true }, + { "query": "make sure nobody secretly edited one of our agent skills before we merge", "should_trigger": true }, + { "query": "verify is reporting one of my skills as 'orphan' and another as 'dirty' — what do those mean and how do I fix them?", "should_trigger": true }, + { "query": "someone changed .agents/skills/terraform-plan-review locally and now I want to restore the original from our library", "should_trigger": true }, + { "query": "does skillrig verify fail if the CLI a skill needs (like terraform) isn't installed on the runner?", "should_trigger": true }, + { "query": "I re-ran skillrig add and it says 'refusing to overwrite' — how do I force it to take the origin's version?", "should_trigger": true }, + { "query": "what's the difference between a mismatch and a dirty verdict in skillrig verify?", "should_trigger": true }, + { "query": "How do I point this repo at our team's skills library acme/agent-skills with skillrig?", "should_trigger": false }, + { "query": "skillrig says 'no origin configured' — how do I set the origin?", "should_trigger": false }, + { "query": "What's the precedence between SKILLRIG_ORIGIN and the project config file?", "should_trigger": false }, + { "query": "I want this repo to track the staging branch of our skills repo with skillrig init", "should_trigger": false }, + { "query": "verify the SHA-256 checksum of this tarball I downloaded matches the release page", "should_trigger": false }, + { "query": "how do I check that my last git commit is GPG-signed and verified on GitHub?", "should_trigger": false }, + { "query": "configure golangci-lint to enable the gosec linter for our Go CLI", "should_trigger": 
false }, + { "query": "write a table-driven unit test for a Go function that parses OWNER/REPO references", "should_trigger": false }, + { "query": "how do I vendor my Go module dependencies with go mod vendor?", "should_trigger": false }, + { "query": "add a new skill to my Claude Code setup by writing a SKILL.md from scratch", "should_trigger": false } +] diff --git a/.agents/skills/skillrig-init/SKILL.md b/.agents/skills/skillrig-init/SKILL.md index 7c66ada..bfc0917 100644 --- a/.agents/skills/skillrig-init/SKILL.md +++ b/.agents/skills/skillrig-init/SKILL.md @@ -86,6 +86,24 @@ SKILLRIG_ORIGIN > project .skillrig/config.toml (nearest ancestor) > global The project lookup walks **up** from the working directory, so any subdirectory of a bound repo resolves the same origin. +## Local origin (this release) + +`init` records only an `OWNER/REPO[@REF]` **reference** — never a filesystem path (passing +a path fails with `invalid origin … expected OWNER/REPO[@REF]`). In this release there is no +network fetch, so when a later command (`skillrig add`) needs the origin's files it reads +them from a **local git checkout at `./OWNER/REPO`, relative to where you run the command** +(your repo root). So to vendor from a local copy of `my-org/my-skills`: + +``` +skillrig init --origin my-org/my-skills # records the reference +git clone <url> my-org/my-skills # the checkout add reads from (./my-org/my-skills) +echo 'my-org/' >> .git/info/exclude # keep it out of your repo's index +skillrig add <name> # reads ./my-org/my-skills/skills/<name>/ +``` + +`@REF` selects the revision (default `HEAD`). Fetching a remote origin over the network is a +later, additive mode. See `skillrig add --help` for the vendoring side.
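
The mapping from an `OWNER/REPO[@REF]` reference to the local checkout path and revision can be sketched in plain shell. This is an illustration under the rules stated above (checkout at `./OWNER/REPO`, revision defaulting to `HEAD`); `parse_origin` is a hypothetical helper, not a skillrig command, and it does not validate the reference the way `init` does.

```shell
# Sketch: split an OWNER/REPO[@REF] reference into the local checkout path
# and the revision to read. parse_origin is a hypothetical helper name.
parse_origin() {
  po_repo="$1"
  po_ref="HEAD"                 # default revision when no @REF is given
  case "$1" in
    *@*)
      po_repo="${1%%@*}"        # text before the first '@'
      po_ref="${1#*@}"          # text after the first '@'
      ;;
  esac
  printf './%s %s\n' "$po_repo" "$po_ref"
}

parse_origin "my-org/my-skills"          # prints "./my-org/my-skills HEAD"
parse_origin "my-org/my-skills@staging"  # prints "./my-org/my-skills staging"
```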
+ ## JSON Output `skillrig init --origin my-org/my-skills --json` emits a single object with all keys diff --git a/.specledger/templates/tasks-template-v2.md b/.specledger/templates/tasks-template-v2.md new file mode 100644 index 0000000..7118d24 --- /dev/null +++ b/.specledger/templates/tasks-template-v2.md @@ -0,0 +1,287 @@ +--- +description: "Task list template for feature implementation" +--- + +# Tasks: [FEATURE NAME] + +**Input**: Design documents from `/specs/[###-feature-name]/` + +**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/ + +**Tests**: The examples below include test tasks. Tests are OPTIONAL - only include them if explicitly requested in the feature specification. + +**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story. + +## Format: `[ID] [P?] [Story] Description` + +- **[P]**: Can run in parallel (different files, no dependencies) +- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3) +- Include exact file paths in descriptions + +## Path Conventions + +- **Single project**: `src/`, `tests/` at repository root +- **Web app**: `backend/src/`, `frontend/src/` +- **Mobile**: `api/src/`, `ios/src/` or `android/src/` +- Paths shown below assume single project - adjust based on plan.md structure + + + +## Phase 1: Setup (Shared Infrastructure) + +**Purpose**: Project initialization and basic structure + +- [ ] T001 Create project structure per implementation plan +- [ ] T002 Initialize [language] project with [framework] dependencies +- [ ] T003 [P] Configure linting and formatting tools + +### Definition of Done (DoD) + +- Language/Framework tooling works (e.g., `go get`, `npm install`) + +--- + +## Phase 2: Foundational (Blocking Prerequisites) + +**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented + +**⚠️ CRITICAL**: No user story work can begin until this phase 
is complete + +Examples of foundational tasks (adjust based on your project): + +- [ ] T004 Setup database schema and migrations framework +- [ ] T005 [P] Implement authentication/authorization framework +- [ ] T006 [P] Setup API routing and middleware structure +- [ ] T007 Create base models/entities that all stories depend on +- [ ] T008 Configure error handling and logging infrastructure +- [ ] T009 Setup environment configuration management + +### Definition of Done (DoD) + +- [ ] Database can be bootstrapped and migrated +- [ ] Authentication flow works end-to-end (e.g., login/logout) +- [ ] Environments can be spun up with correct configuration +- [ ] ... + +**Checkpoint**: Foundation ready - user story implementation can now begin in parallel + +--- + +## Phase 3: User Story 1 - [Title] (Priority: P1) 🎯 MVP + +**Goal**: [Brief description of what this story delivers] + +**Independent Test**: [How to verify this story works on its own] + +### Tests for User Story 1 (REQUIRED — Quickstart-as-Contract, Constitution II) ⚠️ + +**NOTE: Write these tests FIRST, ensure they FAIL before implementation. 
Each quickstart.md scenario for this story MUST map 1:1 to a Go integration test.** + +- [ ] T010 [P] [US1] Integration test `TestQuickstart_[scenario]` (from quickstart.md) in [pkg]/[name]_test.go +- [ ] T011 [P] [US1] Unit test for non-obvious internal logic (only where genuinely useful) in [pkg]/[name]_test.go + +### Implementation for User Story 1 + +- [ ] T012 [P] [US1] Create [Entity1] model in src/models/[entity1].py +- [ ] T013 [P] [US1] Create [Entity2] model in src/models/[entity2].py +- [ ] T014 [US1] Implement [Service] in src/services/[service].py (depends on T012, T013) +- [ ] T015 [US1] Implement [endpoint/feature] in src/[location]/[file].py +- [ ] T016 [US1] Add validation and error handling +- [ ] T017 [US1] Add logging for user story 1 operations + +### Definition of Done (DoD) + +- [ ] User can perform the user story's actions successfully +- [ ] quickstart.md scenario(s) match this story's user stories +- [ ] TestQuickstart_ integration test passes + +**Checkpoint**: At this point, User Story 1 should be fully functional and testable independently + +--- + +## Phase 4: User Story 2 - [Title] (Priority: P2) + +**Goal**: [Brief description of what this story delivers] + +**Independent Test**: [How to verify this story works on its own] + +### Tests for User Story 2 (REQUIRED — Quickstart-as-Contract, Constitution II) ⚠️ + +- [ ] T018 [P] [US2] Integration test `TestQuickstart_[scenario]` (from quickstart.md) in [pkg]/[name]_test.go +- [ ] T019 [P] [US2] Unit test for non-obvious internal logic (only where genuinely useful) in [pkg]/[name]_test.go + +### Implementation for User Story 2 + +- [ ] T020 [P] [US2] Create [Entity] model in src/models/[entity].py +- [ ] T021 [US2] Implement [Service] in src/services/[service].py +- [ ] T022 [US2] Implement [endpoint/feature] in src/[location]/[file].py +- [ ] T023 [US2] Integrate with User Story 1 components (if needed) + +### Definition of Done (DoD) + +- [ ] User can perform the user story's actions 
successfully +- [ ] quickstart.md scenario(s) match this story's user stories +- [ ] TestQuickstart_ integration test passes + +**Checkpoint**: At this point, User Stories 1 AND 2 should both work independently + +--- + +## Phase 5: User Story 3 - [Title] (Priority: P3) + +**Goal**: [Brief description of what this story delivers] + +**Independent Test**: [How to verify this story works on its own] + +### Tests for User Story 3 (REQUIRED — Quickstart-as-Contract, Constitution II) ⚠️ + +- [ ] T024 [P] [US3] Integration test `TestQuickstart_[scenario]` (from quickstart.md) in [pkg]/[name]_test.go +- [ ] T025 [P] [US3] Unit test for non-obvious internal logic (only where genuinely useful) in [pkg]/[name]_test.go + +### Implementation for User Story 3 + +- [ ] T026 [P] [US3] Create [Entity] model in src/models/[entity].py +- [ ] T027 [US3] Implement [Service] in src/services/[service].py +- [ ] T028 [US3] Implement [endpoint/feature] in src/[location]/[file].py + +### Definition of Done (DoD) + +- [ ] User can perform the user story's actions successfully +- [ ] quickstart.md scenario(s) match this story's user stories +- [ ] TestQuickstart_ integration test passes + +**Checkpoint**: All user stories should now be independently functional + +--- + +[Add more user story phases as needed, following the same pattern] + +--- + +## Phase N: Polish & Cross-Cutting Concerns + +**Purpose**: Improvements that affect multiple user stories + +- [ ] TXXX [P] Documentation updates in docs/ +- [ ] TXXX Update CLAUDE.md to reflect current architectural setup: new/changed commands, shared library paths added or moved (e.g. a new `internal/`, `skillcore`, the origin resolver). Rule of thumb — every shared library path belongs in CLAUDE.md. 
(Plan's CLAUDE.md Sync Gate) +- [ ] TXXX Code cleanup and refactoring +- [ ] TXXX Performance optimization across all stories +- [ ] TXXX [P] Additional unit tests for non-obvious internal logic +- [ ] TXXX Security hardening +- [ ] TXXX Confirm full quickstart.md suite passes (`go test ./...` — all `TestQuickstart_*` green) + +### Definition of Done (DoD) + +- [ ] quickstart.md scenario(s) all complete successfully +- [ ] TestQuickstart_ integration test passes + +--- + +## Dependencies & Execution Order + +### Phase Dependencies + +- **Setup (Phase 1)**: No dependencies - can start immediately +- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories +- **User Stories (Phase 3+)**: All depend on Foundational phase completion + - User stories can then proceed in parallel (if staffed) + - Or sequentially in priority order (P1 → P2 → P3) +- **Polish (Final Phase)**: Depends on all desired user stories being complete + +### User Story Dependencies + +- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories +- **User Story 2 (P2)**: Can start after Foundational (Phase 2) - May integrate with US1 but should be independently testable +- **User Story 3 (P3)**: Can start after Foundational (Phase 2) - May integrate with US1/US2 but should be independently testable + +### Within Each User Story + +- Quickstart-derived integration tests MUST be written and FAIL before implementation (Constitution II) +- Models before services +- Services before endpoints +- Core implementation before integration +- **Definition of Done**: a story is DONE only when its `quickstart.md` scenarios match the user stories AND the corresponding `TestQuickstart_*` integration tests pass +- Story complete before moving to next priority + +### Parallel Opportunities + +- All Setup tasks marked [P] can run in parallel +- All Foundational tasks marked [P] can run in parallel (within Phase 2) +- Once Foundational phase completes, all 
user stories can start in parallel (if team capacity allows) +- All tests for a user story marked [P] can run in parallel +- Models within a story marked [P] can run in parallel +- Different user stories can be worked on in parallel by different team members + +--- + +## Parallel Example: User Story 1 + +```bash +# Launch all tests for User Story 1 together (if tests requested): +Task: "Contract test for [endpoint] in tests/contract/test_[name].py" +Task: "Integration test for [user journey] in tests/integration/test_[name].py" + +# Launch all models for User Story 1 together: +Task: "Create [Entity1] model in src/models/[entity1].py" +Task: "Create [Entity2] model in src/models/[entity2].py" +``` + +--- + +## Implementation Strategy + +### MVP First (User Story 1 Only) + +1. Complete Phase 1: Setup +2. Complete Phase 2: Foundational (CRITICAL - blocks all stories) +3. Complete Phase 3: User Story 1 +4. **STOP and VALIDATE**: Test User Story 1 independently +5. Deploy/demo if ready + +### Incremental Delivery + +1. Complete Setup + Foundational → Foundation ready +2. Add User Story 1 → Test independently → Deploy/Demo (MVP!) +3. Add User Story 2 → Test independently → Deploy/Demo +4. Add User Story 3 → Test independently → Deploy/Demo +5. Each story adds value without breaking previous stories + +### Parallel Team Strategy + +With multiple agents: + +1. Team completes Setup + Foundational together +2. Once Foundational is done: + - Agent A: User Story 1 + - Agent B: User Story 2 + - Agent C: User Story 3 +3. 
Stories complete and integrate independently + +--- + +## Notes + +- [P] tasks = different files, no dependencies +- [Story] label maps task to specific user story for traceability +- Each user story should be independently completable and testable +- Verify tests fail before implementing +- Commit after each task or logical group +- Stop at any checkpoint to validate story independently +- Avoid: vague tasks, same file conflicts, cross-story dependencies that break independence diff --git a/AGENTS.md b/AGENTS.md deleted file mode 100644 index 27bb014..0000000 --- a/AGENTS.md +++ /dev/null @@ -1,95 +0,0 @@ -# Repository Guidelines - -## Issue Tracking with `sl issue` - -**IMPORTANT**: This project uses the built-in **`sl issue`** commands for **issue and work-item tracking**. Do NOT use markdown TODO/task lists as a substitute for the issue system. - -> **Scope of this rule**: it targets *work-item / task tracking* only. It does **not** prohibit the checkbox checklists that are part of `/specledger` spec and plan artifacts — e.g. the spec-quality checklist (`checklists/requirements.md`) and the plan's Constitution Check (`plan.md`). Those are in-document review checklists generated by the workflow, not a parallel work tracker, and are allowed. - -### Why `sl issue`? 
- -- No external dependencies: Built directly into the `sl` CLI -- Git-friendly: Issues stored as JSONL files, one per spec -- Agent-optimized: JSON output, ready work detection, dependency links -- Prevents duplicate tracking systems and confusion - -### Quick Start - -**Check for open issues:** - -```bash -sl issue list --status open -sl issue list --all # across all specs -``` - -**Create new issues:** - -```bash -sl issue create --title "Issue title" --type bug|feature|task --priority 0-4 -sl issue create --title "Fix login" --type bug --priority 1 -``` - -**Update issues:** - -```bash -sl issue update SL-abc123 --status in_progress -sl issue update SL-abc123 --priority 1 -``` - -**Complete work:** - -```bash -sl issue close SL-abc123 --reason "Completed: Fixed login validation" -``` - -### Issue Types - -- `bug` - Something broken -- `feature` - New functionality -- `task` - Work item (tests, docs, refactoring) -- `epic` - Large feature with subtasks - -### Priorities - -- `0` - Critical (security, data loss, broken builds) -- `1` - High (major features, important bugs) -- `2` - Medium (default, nice-to-have) -- `3` - Low (polish, optimization) - -### Issue IDs - -Issues use deterministic IDs in format `SL-xxxxxx` (6 hex characters derived from SHA-256 hash of spec context + title + timestamp). - -### Spec Storage - -Issues are stored per-spec in `specledger//issues.jsonl` to avoid merge conflicts when working on different features. - -### Workflow for AI Agents - -1. **Check open issues**: `sl issue list --status open` -2. **Claim your task**: `sl issue update --status in_progress` -3. **Work on it**: Implement, test, document -4. **Complete**: `sl issue close --reason "Done"` -5. 
**Commit together**: Always commit the `specledger//issues.jsonl` file together with the code changes - -### Dependency Management - -Link issues with dependencies: - -```bash -sl issue link SL-abc123 blocks SL-def456 # abc123 blocks def456 -``` - -### Important Rules - -- ✅ Use `sl issue` for ALL task tracking -- ✅ Issues are stored per-spec in `specledger//issues.jsonl` -- ✅ Check `sl issue list --status open` before asking "what should I work on?" -- ❌ Do NOT use markdown TODO lists as a substitute for `sl issue` work-item tracking - - ✅ Checkbox checklists inside `/specledger` spec/plan artifacts (requirements checklist, Constitution Check) are in-document review checklists, not work tracking — allowed -- ❌ Do NOT use external issue trackers -- ❌ Do NOT duplicate tracking systems - -## Commit & Pull Request Guidelines - -Follow the existing conventional prefixes (`feat:`, `fix:`, `chore:`, `docs:`) and keep messages imperative and under 72 characters. Reference related issues in the body and mention migrations, proto changes, or new binaries explicitly. PRs should include a concise summary, testing evidence (`make test-unit`, `make test-integration`, etc.), and screenshots or CLI transcripts when behavior changes. Request reviews from domain owners and ensure generated artifacts and docs stay in sync with code changes. diff --git a/CLAUDE.md b/CLAUDE.md index 10c2b2d..5d20ae4 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -63,7 +63,9 @@ Scripts and agents branch on them, so meanings are fixed (`internal/cli/exit.go` Features follow SpecLedger: **Specify → Clarify → Plan → Tasks → Review → Implement**, with artifacts under `specledger//` (spec, plan, tasks, quickstart, contracts, data-model). Quickstart scenarios are the acceptance contract (each maps to a `TestQuickstart_` integration test) and are written during planning. 
-**Read `AGENTS.md` before tracking work or committing.** It defines the two repo-specific operating rules this project enforces: (1) all work-item tracking goes through the built-in `sl issue` CLI (issues stored per-spec in `specledger//issues.jsonl`) — **never** ad-hoc markdown TODO lists; and (2) the commit/PR conventions (conventional prefixes, imperative ≤72-char subjects, testing evidence in PRs). It exists so task tracking and history stay in one git-friendly system rather than fragmenting across tools — consult it for the exact commands and the precise scope of each rule. +**Commit & PR conventions.** Conventional prefixes (`feat:`, `fix:`, `chore:`, `docs:`), imperative subjects ≤72 chars, scoped to the feature (e.g. `docs(002): …`). Reference related issues in the body; call out migrations / new binaries explicitly. PRs carry a concise summary + testing evidence (`make test-unit`, `make test-integration`) and a CLI transcript when behavior changes. + +**Work-item tracking.** The durable, team-visible record lives in the SpecLedger issue tracker — `sl issue`, stored per-spec in `specledger//issues.jsonl` (committed to git). The agent's in-session task list (the `Task*` tools) is an ephemeral execution aid, not a substitute for that committed record. 
diff --git a/docs/design/cli.md b/docs/design/cli.md index 709e55e..e63906b 100644 --- a/docs/design/cli.md +++ b/docs/design/cli.md @@ -40,8 +40,8 @@ skillrig — rig up your agents with skills (git-native skill distribution) Commands: search Query the origin's index.json for skills - add Vendor a skill into this repo + write the lock entry - verify Offline integrity + prereq check (exit code; CI gate) + add Vendor a skill into this repo + write the lock entry [implemented] + verify Offline integrity check — label-honesty (exit code; CI gate) [implemented] bump Detect upstream advance, open an upgrade PR global Manage global-scope skills (fetch/restore) doctor Superset health check (integrity + prereqs + auth) @@ -204,8 +204,8 @@ Exit codes are **load-bearing** here, not cosmetic: `verify` and `lint` are dete |------|---------| | 0 | Pass (including empty results) | | 1 | Usage / config error (bad args, no origin configured) | -| 2 | Verification failure — label-honesty mismatch, orphan (on-disk ≠ locked), or unresolved conflict markers | -| 3 | Prerequisite failure — a `[[requires]]` tool missing/unsatisfied or unauthenticated (fail in CI; may warn for humans) | +| 2 | Verification failure — label-honesty mismatch, orphan (on-disk ≠ locked), or unresolved conflict markers. **Active**: emitted by the implemented `verify`. | +| 3 | Prerequisite failure — a `[[requires]]` tool missing/unsatisfied or unauthenticated (fail in CI; may warn for humans). **Reserved**: not emitted today; lands with `doctor` (`verify` is integrity-only). | No duration metadata — `verify`/`search`/`lint` run offline against committed files; there is no per-command network cost to report. @@ -258,8 +258,8 @@ Every `skillrig` subcommand MUST identify which pattern(s) it follows. This clas | Pattern | Purpose | Examples | Constraints | |---------|---------|----------|-------------| | **Query** | Deterministic read of the discovery artifact | `search` | Offline. Reads committed `index.json`. 
Deterministic tag filtering — **no inference** (N6). | -| **Vendor Mutation** | Write skill tree + lock entry | `add`, `bump --pr` | Writes lock via `skillcore` only. Supports `--dry-run`; refuses to clobber content that diverges from the locked `treeSha` without `--force`. `bump` *proposes* (opens a PR), never force-adopts (R13). MUST never silently discard local edits (R32). | -| **Verification Gate** | Offline integrity / prereq / conformance | `verify`, `lint` | MUST be offline + deterministic. Exit-code driven. **No live/online signal in this path** (R11/N1). `verify` = consumer CI gate; `lint` = author CI gate on the origin. | +| **Vendor Mutation** | Write skill tree + lock entry | `add` *(implemented)*, `bump --pr` | Writes lock via `skillcore` only. Supports `--dry-run`; refuses to clobber content that diverges from the locked `treeSha` without `--force`. `bump` *proposes* (opens a PR), never force-adopts (R13). MUST never silently discard local edits (R32). | +| **Verification Gate** | Offline integrity / prereq / conformance | `verify` *(implemented — integrity-only)*, `lint` | MUST be offline + deterministic. Exit-code driven. **No live/online signal in this path** (R11/N1). `verify` = consumer CI gate; `lint` = author CI gate on the origin. As implemented, `verify` is **integrity-only** (label-honesty + orphan detection, exit 2); prerequisite/eligibility checks (a missing `[[requires]]` tool → exit 3) belong to the future `doctor`, so `verify` does not emit exit 3 today. | | **Environment** | Health, auth, config, bootstrap | `doctor`, `init` | MUST be idempotent. `doctor` checks prerequisite auth (R18); works without a fully-configured project. `init` is **consumer-side only** — binds to an *existing* origin, never bootstraps one (architecture §2d). | | **Global Management** | Fetch/restore user-scope skills | `global add`, `global verify` | Genuinely *fetches and materializes* (the restore mode project scope doesn't need, §3). 
Touches per-environment home dirs, never the repo's project lock (R8). | @@ -271,7 +271,7 @@ Each pattern has a distinct failure mode expectation: |---------|-------------| | **Query** | MUST fail with clear error + suggested fix (e.g. no origin → run `init`). | | **Vendor Mutation** | MUST validate origin + auth before fetching. Three-way-merge conflict → non-zero exit, write git-style conflict markers, instruct resolve-and-rerun (architecture §5b). Never discard local edits. | -| **Verification Gate** | MUST be deterministic pass/fail by exit code. Label-honesty mismatch = fail; orphan = fail; unresolved conflict markers = fail; prereq miss = fail in CI / may warn for humans. | +| **Verification Gate** | MUST be deterministic pass/fail by exit code. Label-honesty mismatch = fail (exit 2); orphan = fail (exit 2); unresolved conflict markers = fail. Prereq miss (exit 3) is reserved for the future `doctor` — the implemented `verify` is integrity-only and does not emit it. | | **Environment** | MUST be idempotent and safe to retry. MUST distinguish "tool missing" from "tool exists but unauthenticated" (R18). | | **Global Management** | MUST fetch/restore what's missing and report drift between the global lock and what's materialized. | @@ -331,9 +331,9 @@ return fmt.Errorf("add failed: %s\n→ Check the origin: 'cat .skillrig/config.t sha := myLocalTreeHash(dir) // in bump sha := someOtherHash(dir) // in verify -// Right: exactly ONE implementation, in internal/skillcore, called by -// verify, bump, AND doctor. Make it a hard internal boundary -// (architecture §2: "the two interfaces cannot diverge"). +// Right: exactly ONE implementation, in the public pkg/skillcore package +// (SDK-1), called by add, verify, AND future bump/doctor. Make it a +// hard boundary (architecture §2: "the two interfaces cannot diverge"). 
sha := skillcore.TreeSHA(dir) ``` @@ -378,9 +378,9 @@ Inside the CLI, there are two conceptual layers: └─────────────────────────────────────────────┘ ``` -The execution layer handles command routing, the shared `skillcore` primitives (tree-SHA computation, `skill.toml` / lock parsing), index comparison for `bump`, and lock I/O. The presentation layer formats output for the consumer (human or agent). These are not separate packages — they're a design concern within each command's `runXxx()` function. +The execution layer handles command routing, the shared `skillcore` primitives (tree-SHA computation, `skill.toml` / lock parsing), index comparison for `bump`, and lock I/O. The presentation layer formats output for the consumer (human or agent). The presentation/execution split itself is a design concern within each command's `runXxx()` function — but the integrity primitives are **not** inline: `skillcore` is a separate, importable **public package** (`pkg/skillcore`, per SDK-1), so third-party Go tools can build their own `add`/`verify` on the same primitives the CLI uses. -**Key rule**: Execution logic must not depend on output format. The same data path serves both `--json` and human output. And per AP-04, there is exactly one `skillcore` implementation of the integrity primitives — `verify`, `bump`, and `doctor` all dispatch to it so the gate can never diverge from what CI wrote. If an MCP surface for agents is ever added, it dispatches to `skillcore` too — never a parallel implementation (architecture §2). +**Key rule**: Execution logic must not depend on output format. The same data path serves both `--json` and human output. And per AP-04, there is exactly one `skillcore` implementation of the integrity primitives — the public `pkg/skillcore` package — that `add` and `verify` (and future `bump`/`doctor`) all dispatch to, so the gate can never diverge from what CI wrote. 
If an MCP surface for agents is ever added, it dispatches to `pkg/skillcore` too — never a parallel implementation (architecture §2). --- diff --git a/internal/cli/add.go b/internal/cli/add.go new file mode 100644 index 0000000..2be1813 --- /dev/null +++ b/internal/cli/add.go @@ -0,0 +1,175 @@ +package cli + +import ( + "errors" + "fmt" + + "github.com/spf13/cobra" + + "github.com/skillrig/cli/internal/config" + "github.com/skillrig/cli/pkg/skillcore" +) + +// addCmd holds the add command's flags and its injectable seams. Production uses +// the os-backed defaults; tests inject deterministic stubs (cwd, env). +type addCmd struct { + opts *globalOpts + skill string + dryRun bool + force bool + + // getwd returns the working directory. Defaults to os.Getwd. + getwd func() (string, error) + // env is the environment accessor used by the origin resolver. + env config.Env +} + +// newAddCmd builds the `skillrig add ` command (Vendor Mutation pattern): +// vendor a named skill from the repo's resolved origin into the canonical +// .agents/skills// and record its identity in the lock. +func newAddCmd(opts *globalOpts) *cobra.Command { + ac := &addCmd{ + opts: opts, + getwd: osGetwd, + env: config.OSEnv, + } + + cmd := &cobra.Command{ + Use: "add ", + Short: "Vendor a skill from your configured origin into .agents/skills/", + Long: "Vendor a named skill from this repo's configured origin into the canonical\n" + + ".agents/skills//, recording its identity (version, commit, tree-SHA, path)\n" + + "in .skillrig/skills-lock.json. add is offline and consume-only: it resolves the\n" + + "active origin (SKILLRIG_ORIGIN > project > global) exactly like every command and\n" + + "copies the skill byte-identically, injecting nothing.\n\n" + + "Local origin (this release): the configured origin OWNER/REPO is read from a local\n" + + "git checkout at ./OWNER/REPO — relative to the directory you run add from (your\n" + + "repo root) — not over the network. 
So `init --origin my-org/my-skills` expects that\n" + + "library checked out at ./my-org/my-skills; keep it out of your index (e.g. echo\n" + + "'my-org/' >> .git/info/exclude). Fetching a remote origin is a later, additive mode.\n\n" + + "add is idempotent on identical content and refuses to overwrite a vendored skill\n" + + "whose on-disk content diverges from the lock unless you pass --force. Requires a\n" + + "git repository; commit the result, then run skillrig verify.", + Example: " # Vendor a skill from your configured origin (a local checkout at ./OWNER/REPO)\n" + + " skillrig add terraform-plan-review\n\n" + + " # Preview what would be vendored, writing nothing\n" + + " skillrig add terraform-plan-review --dry-run\n\n" + + " # Overwrite a locally-diverged copy with the origin's content\n" + + " skillrig add terraform-plan-review --force", + Args: cobra.ExactArgs(1), + RunE: func(cmd *cobra.Command, args []string) error { + ac.skill = args[0] + + return ac.run(cmd) + }, + } + + cmd.Flags().BoolVar(&ac.dryRun, "dry-run", false, "report what would be vendored and recorded; write nothing") + cmd.Flags().BoolVar(&ac.force, "force", false, "overwrite a vendored skill whose on-disk content diverges from the lock") + + return cmd +} + +// run resolves the origin and the repo root, vendors the skill via skillcore, +// and renders the result. skillcore's typed errors are mapped to navigational +// *UsageError values (exit 1), preserving the raw cause for --verbose. 
+func (ac *addCmd) run(cmd *cobra.Command) error { + cwd, err := ac.getwd() + if err != nil { + return &UsageError{Msg: "cannot determine working directory\nwhy: " + err.Error(), Cause: err} + } + + res, err := config.ResolveOrigin(cwd, ac.env) + if err != nil { + return &UsageError{Msg: "cannot resolve the active origin\nwhy: " + err.Error() + "\n" + missingOriginFix, Cause: err} + } + + if res.Source == config.SourceNone { + return usageNoOriginConfigured() + } + + repoRoot, err := gitToplevel(cmd.Context(), cwd) + if err != nil { + return usageNotGitRepo(addNotGitRepoWhy, err) + } + + originDir, ref := originDirRef(res.Origin) + + result, err := skillcore.Add(skillcore.AddOptions{ + OriginDir: originDir, + Ref: ref, + Skill: ac.skill, + RepoRoot: repoRoot, + Origin: res.Origin.String(), + Force: ac.force, + DryRun: ac.dryRun, + }) + if err != nil { + return mapAddError(ac.skill, err) + } + + return renderAddResult(cmd.OutOrStdout(), result, ac.opts.json) +} + +// addNotGitRepoWhy is the project-scope rationale for add's not-a-repo error. +const addNotGitRepoWhy = "project-scope add vendors into the repo's canonical .agents/skills " + + "and writes a lock that verify checks against git" + +// originDirRef maps a resolved origin to the (local directory, ref) skillcore +// needs. This slice the origin value IS a local checkout path, so the OWNER/REPO +// portion is the directory and Ref (empty → HEAD) selects the revision. +func originDirRef(origin config.Origin) (dir, ref string) { + dir = config.Origin{Owner: origin.Owner, Repo: origin.Repo}.String() + + ref = origin.Ref + if ref == "" { + ref = "HEAD" + } + + return dir, ref +} + +// usageNoOriginConfigured builds the 3-part "no origin configured" usage error +// (contract add.md): what / why / fix. 
+func usageNoOriginConfigured() *UsageError { + return usageErrorf("no origin configured\n" + + "why: no SKILLRIG_ORIGIN / project / global origin\n" + + "fix: skillrig init --origin OWNER/REPO or set SKILLRIG_ORIGIN") +} + +// mapAddError maps skillcore's typed Add errors to navigational *UsageError +// values (exit 1), authoring the what/why/fix prose while preserving the raw +// cause for --verbose. An unexpected error is wrapped generically. +func mapAddError(skill string, err error) error { + var notFound *skillcore.SkillNotFoundError + if errors.As(err, &notFound) { + return &UsageError{ + Msg: fmt.Sprintf("skill %q not found in origin\n", skill) + + "why: no skills/" + skill + "/ at the configured origin\n" + + "fix: check the skill name against the origin", + Cause: err, + } + } + + var overwrite *skillcore.OverwriteError + if errors.As(err, &overwrite) { + return &UsageError{ + Msg: fmt.Sprintf("refusing to overwrite %s\n", overwrite.Path) + + "why: on-disk content diverges from the recorded fingerprint\n" + + "fix: re-run with --force, or revert local edits", + Cause: err, + } + } + + var gitErr *skillcore.GitError + if errors.As(err, &gitErr) { + return &UsageError{ + Msg: "add failed: git error\n" + + "why: " + gitErr.Error() + "\n" + + "fix: ensure git works in this directory and the origin is a valid checkout", + Cause: err, + } + } + + return &UsageError{Msg: "add failed\nwhy: " + err.Error(), Cause: err} +} diff --git a/internal/cli/exit.go b/internal/cli/exit.go index 0dcca32..487339a 100644 --- a/internal/cli/exit.go +++ b/internal/cli/exit.go @@ -1,6 +1,11 @@ package cli -import "fmt" +import ( + "errors" + "fmt" + + "github.com/skillrig/cli/pkg/skillcore" +) // Load-bearing process exit codes (docs/design/cli.md). These are part of the // CLI contract: scripts and agents branch on them, so their meanings are fixed.
@@ -10,9 +15,9 @@ const ( // ExitUsage signals a usage or configuration error (bad flags, invalid // origin, no origin configured, unwritable config). ExitUsage = 1 - // ExitVerification is reserved for a future verification failure (e.g. a - // `verify` command). Declared here so the meaning is stable; unused in this - // feature. + // ExitVerification signals a verification failure (`verify` found at least + // one mismatch / orphan / missing / dirty skill). Mapped from a + // *skillcore.VerifyFailure. ExitVerification = 2 // ExitPrereq is reserved for a future missing-prerequisite failure. // Declared here for stability; unused in this feature. @@ -44,13 +49,22 @@ func usageErrorf(format string, args ...any) *UsageError { return &UsageError{Msg: fmt.Sprintf(format, args...)} } -// exitCodeFor maps a returned error to a process exit code. nil → ExitOK; every -// error in this feature's surface (usage/config) → ExitUsage. Codes 2/3 are -// reserved for later commands and never returned here. +// exitCodeFor maps a returned error to a process exit code (a typed switch, so +// the gate's exit code can never diverge from the error class that produced it): +// - nil → ExitOK (0) +// - *skillcore.VerifyFailure → ExitVerification (2) +// - everything else → ExitUsage (1) +// +// Code 3 (prerequisite) is reserved for a future `doctor` and never returned. func exitCodeFor(err error) int { if err == nil { return ExitOK } + var verifyFail *skillcore.VerifyFailure + if errors.As(err, &verifyFail) { + return ExitVerification + } + return ExitUsage } diff --git a/internal/cli/output.go b/internal/cli/output.go index e21f6a6..e356d36 100644 --- a/internal/cli/output.go +++ b/internal/cli/output.go @@ -4,6 +4,9 @@ import ( "encoding/json" "fmt" "io" + "strings" + + "github.com/skillrig/cli/pkg/skillcore" ) // bindResult is the presentation-layer view of an init outcome. 
It is the @@ -41,3 +44,206 @@ func renderBindResult(w io.Writer, r bindResult, jsonOut bool) error { return err } + +// addResultJSON is the complete, untruncated --json view of an add. Keys are +// always present (contract add.md): ok,name,version,path,commit,treeSha,action, +// dryRun. It is the presentation-layer projection of skillcore.AddResult (which +// carries no JSON tags — skillcore stays presentation-free). +type addResultJSON struct { + OK bool `json:"ok"` + Name string `json:"name"` + Version string `json:"version"` + Path string `json:"path"` + Commit string `json:"commit"` + TreeSha string `json:"treeSha"` + Action string `json:"action"` + DryRun bool `json:"dryRun"` +} + +// addFooterHint is the next-step footer for a human add summary (cli.md +// Principle 3: confirmation + the command to run next). +const addFooterHint = "→ commit it, then run: skillrig verify" + +// renderAddResult writes an add outcome to w. With jsonOut it emits one complete +// JSON object (all keys present); otherwise a compact human summary (≤2 lines +// incl. the footer hint). Data goes to stdout (the caller passes +// cmd.OutOrStdout()). +func renderAddResult(w io.Writer, r skillcore.AddResult, jsonOut bool) error { + if jsonOut { + enc := json.NewEncoder(w) + enc.SetEscapeHTML(false) + + return enc.Encode(addResultJSON{ + OK: true, + Name: r.Name, + Version: r.Version, + Path: r.Path, + Commit: r.Commit, + TreeSha: r.TreeSha, + Action: string(r.Action), + DryRun: r.DryRun, + }) + } + + _, err := io.WriteString(w, addSummary(r)+"\n"+addFooterHint+"\n") + + return err +} + +// addSummary builds the one-line human confirmation for an add. The verb tracks +// the Action (and --dry-run): a fresh/forced placement names the destination + +// short tree-SHA; an idempotent re-add reports no change. 
+func addSummary(r skillcore.AddResult) string { + switch r.Action { + case skillcore.ActionUnchanged: + return fmt.Sprintf("%s@%s already vendored (no change)", r.Name, r.Version) + case skillcore.ActionOverwritten: + return fmt.Sprintf("overwrote %s@%s → %s (treeSha %s)", + r.Name, r.Version, r.Path, shortSha(r.TreeSha)) + case skillcore.ActionVendored: + fallthrough + default: + verb := "vendored" + if r.DryRun { + verb = "would vendor" + } + + return fmt.Sprintf("%s %s@%s → %s (treeSha %s)", + verb, r.Name, r.Version, r.Path, shortSha(r.TreeSha)) + } +} + +// verifyReportJSON is the complete, untruncated --json view of a verify report. +// Top-level keys ok,counts,verdicts are always present; counts carries all five +// fields and verdicts every checked skill with all six fields. It is the +// presentation projection of skillcore.Report (which carries no JSON tags). +type verifyReportJSON struct { + OK bool `json:"ok"` + Counts countsJSON `json:"counts"` + Verdicts []verdictJSON `json:"verdicts"` +} + +type countsJSON struct { + Verified int `json:"verified"` + Mismatch int `json:"mismatch"` + Orphan int `json:"orphan"` + Missing int `json:"missing"` + Dirty int `json:"dirty"` +} + +type verdictJSON struct { + Name string `json:"name"` + Path string `json:"path"` + Status string `json:"status"` + ExpectedTreeSha string `json:"expectedTreeSha"` + ActualTreeSha string `json:"actualTreeSha"` + Reason string `json:"reason"` +} + +// verifyOKFooter / verifyFailFooter are the two-level-output footer hints. +const ( + verifyOKFooter = "→ all match their recorded version" + verifyFailFooter = "→ inspect with: skillrig verify --json" +) + +// renderVerifyReport writes a verify report to w. With jsonOut it emits one +// complete JSON object (every checked skill present, all keys); otherwise a +// compact human summary whose line count is bounded by the number of findings +// plus a small constant. Data goes to stdout (the caller passes +// cmd.OutOrStdout()). 
+func renderVerifyReport(w io.Writer, r skillcore.Report, jsonOut bool) error { + if jsonOut { + enc := json.NewEncoder(w) + enc.SetEscapeHTML(false) + + return enc.Encode(verifyReportJSON{ + OK: r.OK, + Counts: countsJSON(r.Counts), + Verdicts: verdictsJSON(r.Verdicts), + }) + } + + if r.OK { + _, err := io.WriteString(w, fmt.Sprintf("verified %d skills ✓\n%s\n", + r.Counts.Verified, verifyOKFooter)) + + return err + } + + return renderVerifyFailure(w, r) +} + +// renderVerifyFailure writes the compact failing report: a header line, one line +// per failing verdict (passing ones are summarized by the header count), and the +// footer. Bounded by the number of findings (Constitution II). +func renderVerifyFailure(w io.Writer, r skillcore.Report) error { + total := len(r.Verdicts) + failed := total - r.Counts.Verified + + var b strings.Builder + + fmt.Fprintf(&b, "verify FAILED: %d of %d skills\n", failed, total) + + for _, v := range r.Verdicts { + if v.Status == skillcore.StatusOK { + continue + } + + fmt.Fprintf(&b, " ✗ %s %s\n", v.Name, verdictLine(v)) + } + + b.WriteString(verifyFailFooter + "\n") + + _, err := io.WriteString(w, b.String()) + + return err +} + +// verdictLine is the compact human explanation for one failing verdict. Mismatch +// shows the recorded vs on-disk short tree-SHAs; the rest use the skillcore +// reason verbatim (already a human-readable phrase). +func verdictLine(v skillcore.Verdict) string { + switch v.Status { + case skillcore.StatusMismatch: + return fmt.Sprintf("content mismatch (recorded %s, on-disk %s)", + shortSha(v.ExpectedTreeSha), shortSha(v.ActualTreeSha)) + case skillcore.StatusOrphan: + return "untracked (no lock entry)" + case skillcore.StatusMissing: + return "missing (locked but absent on disk and from HEAD)" + case skillcore.StatusDirty: + return "uncommitted or locally modified — commit before verifying" + default: + return v.Reason + } +} + +// verdictsJSON projects skillcore verdicts into the JSON view. 
It returns a +// non-nil empty slice so an empty repo serializes verdicts as [] (not null), +// matching the contract. +func verdictsJSON(vs []skillcore.Verdict) []verdictJSON { + out := make([]verdictJSON, 0, len(vs)) + for _, v := range vs { + out = append(out, verdictJSON{ + Name: v.Name, + Path: v.Path, + Status: v.Status, + ExpectedTreeSha: v.ExpectedTreeSha, + ActualTreeSha: v.ActualTreeSha, + Reason: v.Reason, + }) + } + + return out +} + +// shortSha trims a tree/commit SHA to git's conventional 7-char prefix for +// compact human output. Shorter strings (incl. empty) are returned unchanged. +func shortSha(sha string) string { + const short = 7 + if len(sha) <= short { + return sha + } + + return sha[:short] +} diff --git a/internal/cli/repo.go b/internal/cli/repo.go new file mode 100644 index 0000000..92dbdbb --- /dev/null +++ b/internal/cli/repo.go @@ -0,0 +1,63 @@ +package cli + +import ( + "context" + "errors" + "os" + "os/exec" + "strings" +) + +// osGetwd is the production working-directory accessor (a package-level seam so +// commands can inject a deterministic cwd in tests). +var osGetwd = os.Getwd + +// errNotGitRepo is the sentinel returned by gitToplevel when cwd is not inside a +// git work tree (git ran and reported so, or git is not installed). Commands map +// it to their own project-scope "not a git repository" usage error. +var errNotGitRepo = errors.New("not a git repository") + +// gitToplevel returns the absolute work-tree root for cwd via an offline +// `git rev-parse --show-toplevel` (reads local .git, no network). It is the one +// repo-root helper both add and verify dispatch to. A clean non-zero git exit +// (cwd is not a repository) or a missing git binary is reported as errNotGitRepo +// — a project-scope precondition the caller renders as navigation; any other +// failure (e.g. context cancellation) is returned verbatim so callers never +// silently treat an unexpected error as "not a repo". 
+func gitToplevel(ctx context.Context, cwd string) (string, error) { + cmd := exec.CommandContext(ctx, "git", "rev-parse", "--show-toplevel") + cmd.Dir = cwd + + out, err := cmd.Output() + if err != nil { + if ctx.Err() != nil { + return "", ctx.Err() + } + + var exitErr *exec.ExitError + if errors.As(err, &exitErr) || errors.Is(err, exec.ErrNotFound) { + return "", errNotGitRepo + } + + return "", err + } + + root := strings.TrimSpace(string(out)) + if root == "" { + return "", errNotGitRepo + } + + return root, nil +} + +// usageNotGitRepo builds the project-scope "not a git repository" usage error +// (exit 1) shared by add and verify. why states the command-specific rationale; +// the raw cause is preserved for --verbose. +func usageNotGitRepo(why string, cause error) *UsageError { + return &UsageError{ + Msg: "not a git repository\n" + + "why: " + why + "\n" + + "fix: run inside the repo (or git init first)", + Cause: cause, + } +} diff --git a/internal/cli/root.go b/internal/cli/root.go index 7b2c714..7868002 100644 --- a/internal/cli/root.go +++ b/internal/cli/root.go @@ -11,6 +11,8 @@ import ( "os" "github.com/spf13/cobra" + + "github.com/skillrig/cli/pkg/skillcore" ) // globalOpts holds the persistent, command-wide output flags. Shared by value @@ -56,7 +58,17 @@ func newRootCmd(opts *globalOpts) *cobra.Command { // renderError prints an error as navigation: the actionable what/why/fix // message, plus the raw cause when --verbose is set. Always to stderr; the // write itself is best-effort. +// +// A *skillcore.VerifyFailure is rendered to NOTHING here: the verify command +// already wrote the per-skill report to stdout (the report IS the message), so +// printing a generic "error:" line on stderr would double-report. The non-nil +// error still drives the exit code (2) via exitCodeFor. 
func renderError(w io.Writer, err error, verbose bool) { + var verifyFail *skillcore.VerifyFailure + if errors.As(err, &verifyFail) { + return + } + _, _ = io.WriteString(w, errorMessage(err, verbose)) } @@ -96,4 +108,6 @@ func Execute() int { // separate so each user story wires its command here as it lands. func registerSubcommands(root *cobra.Command, opts *globalOpts) { root.AddCommand(newInitCmd(opts)) + root.AddCommand(newAddCmd(opts)) + root.AddCommand(newVerifyCmd(opts)) } diff --git a/internal/cli/verify.go b/internal/cli/verify.go new file mode 100644 index 0000000..1adafd5 --- /dev/null +++ b/internal/cli/verify.go @@ -0,0 +1,107 @@ +package cli + +import ( + "errors" + + "github.com/spf13/cobra" + + "github.com/skillrig/cli/pkg/skillcore" +) + +// verifyCmd holds the verify command's flags and its injectable cwd seam. +type verifyCmd struct { + opts *globalOpts + + // getwd returns the working directory. Defaults to os.Getwd. + getwd func() (string, error) +} + +// newVerifyCmd builds the `skillrig verify` command (Verification Gate pattern): +// prove THIS repo's vendored skills match their recorded versions. Offline, +// deterministic, read-only, exit-code driven (0 ok / 1 usage / 2 failure). +func newVerifyCmd(opts *globalOpts) *cobra.Command { + vc := &verifyCmd{ + opts: opts, + getwd: osGetwd, + } + + cmd := &cobra.Command{ + Use: "verify", + Short: "Check THIS repo's vendored skills match their recorded versions (project scope)", + Long: "verify checks the PROJECT's vendored skills (.agents/skills) against the\n" + + "committed lock (.skillrig/skills-lock.json) — label-honesty (git tree-SHA)\n" + + "+ orphan/completeness — offline and deterministic. PROJECT-SCOPE: it verifies\n" + + "THIS repository, not global/user-scope skills. 
It is read-only (recomputes git\n" + + "tree-SHAs; writes nothing) and needs no origin and no network.\n\n" + + "Exit 0 ok / 1 usage / 2 verification failure.", + Example: " # Verify this repo's vendored skills match their recorded versions (project-scope CI gate)\n" + + " skillrig verify\n\n" + + " # Machine-readable per-skill verdicts for an agent / jq\n" + + " skillrig verify --json", + Args: cobra.NoArgs, + RunE: func(cmd *cobra.Command, _ []string) error { + return vc.run(cmd) + }, + } + + return cmd +} + +// run resolves the repo root, runs skillcore.Verify, renders the report to +// stdout, and returns the exit-driving error. A verification failure renders the +// report (so the report IS the message) and then returns the *VerifyFailure so +// exitCodeFor maps it to exit 2; a malformed lock / not-a-repo is a *UsageError +// (exit 1). +func (vc *verifyCmd) run(cmd *cobra.Command) error { + cwd, err := vc.getwd() + if err != nil { + return &UsageError{Msg: "cannot determine working directory\nwhy: " + err.Error(), Cause: err} + } + + repoRoot, err := gitToplevel(cmd.Context(), cwd) + if err != nil { + return usageNotGitRepo(verifyNotGitRepoWhy, err) + } + + report, err := skillcore.Verify(repoRoot) + if err != nil { + return vc.handleVerifyError(cmd, err) + } + + return renderVerifyReport(cmd.OutOrStdout(), report, vc.opts.json) +} + +// verifyNotGitRepoWhy is the rationale for verify's not-a-repo error. +const verifyNotGitRepoWhy = "tree-SHA recompute needs git" + +// handleVerifyError classifies skillcore.Verify's error. A *VerifyFailure is a +// per-skill finding: render the report to stdout (human or --json) and return +// the failure so exitCodeFor yields exit 2. A *LockError is a config/usage +// problem (exit 1); any other error is wrapped as a usage error. 
+func (vc *verifyCmd) handleVerifyError(cmd *cobra.Command, err error) error { + var verifyFail *skillcore.VerifyFailure + if errors.As(err, &verifyFail) { + if renderErr := renderVerifyReport(cmd.OutOrStdout(), verifyFail.Report, vc.opts.json); renderErr != nil { + return renderErr + } + + return verifyFail + } + + var lockErr *skillcore.LockError + if errors.As(err, &lockErr) { + return &UsageError{ + Msg: "cannot read .skillrig/skills-lock.json\n" + + "why: " + lockErr.Cause.Error() + "\n" + + "fix: check the file, or re-vendor with skillrig add", + Cause: err, + } + } + + var gitErr *skillcore.GitError + if errors.As(err, &gitErr) { + return usageNotGitRepo(verifyNotGitRepoWhy, err) + } + + return &UsageError{Msg: "verify failed\nwhy: " + err.Error(), Cause: err} +} diff --git a/pkg/skillcore/add.go b/pkg/skillcore/add.go new file mode 100644 index 0000000..54e08c7 --- /dev/null +++ b/pkg/skillcore/add.go @@ -0,0 +1,399 @@ +package skillcore + +import ( + "bytes" + "fmt" + "io" + "io/fs" + "os" + "path/filepath" +) + +// Action is the outcome of an Add: how the vendored tree changed. +type Action string + +const ( + // ActionVendored means the skill was newly written into the repo. + ActionVendored Action = "vendored" + // ActionUnchanged means an identical copy was already present (idempotent re-add). + ActionUnchanged Action = "unchanged" + // ActionOverwritten means a divergent copy was replaced under Force. + ActionOverwritten Action = "overwritten" +) + +// vendorRoot is the canonical, repo-relative root every skill is vendored under. +const vendorRoot = ".agents/skills" + +// AddOptions configures Add. The caller supplies an already-resolved local +// origin checkout (OriginDir + Ref); skillcore neither resolves origins, reads +// config, nor fetches. +type AddOptions struct { + OriginDir string + Ref string + Skill string + RepoRoot string + // Origin is the resolved origin reference (OWNER/REPO[@REF]) the CLI + // resolved this add from. 
skillcore records it verbatim in the lock's + // top-level origin field; it does not parse or resolve it (presentation- and + // resolution-free). Empty leaves any existing lock origin untouched. + Origin string + Force bool + DryRun bool +} + +// AddResult reports what Add did, for the CLI to render. +type AddResult struct { + Name string + Version string + Path string + Commit string + TreeSha string + Action Action + DryRun bool +} + +// SkillNotFoundError is returned when the requested skill has no skills// +// directory in the origin checkout. It is presentation-free (terse Error); the +// CLI maps it to a usage error (exit 1) and renders the what/why/fix prose. +type SkillNotFoundError struct { + Skill string +} + +func (e *SkillNotFoundError) Error() string { + return fmt.Sprintf("skill %q not found in origin", e.Skill) +} + +// OverwriteError is returned when the vendored skill already exists on disk with +// content that diverges from the recorded fingerprint and Force is not set. It +// is presentation-free (terse Error); the CLI maps it to a usage error (exit 1) +// and renders the "use --force" guidance. +type OverwriteError struct { + Skill string + Path string +} + +func (e *OverwriteError) Error() string { + return fmt.Sprintf("refusing to overwrite %q", e.Path) +} + +// Add vendors one skill from the local origin at opts.OriginDir into +// opts.RepoRoot's canonical .agents/skills//, byte-identical and +// mode-preserving, then writes/updates the lock. It refuses a divergent +// overwrite unless opts.Force, writes nothing when opts.DryRun, and is +// idempotent on identical content. 
+func Add(opts AddOptions) (AddResult, error) { + srcDir := filepath.Join(opts.OriginDir, "skills", opts.Skill) + + info, err := os.Stat(srcDir) + if err != nil || !info.IsDir() { + return AddResult{}, &SkillNotFoundError{Skill: opts.Skill} + } + + manifest, err := ParseManifest(filepath.Join(srcDir, "skill.toml")) + if err != nil { + return AddResult{}, err + } + + originRelPath := "skills/" + opts.Skill + + treeSha, err := TreeSHA(opts.OriginDir, opts.Ref, originRelPath) + if err != nil { + return AddResult{}, err + } + + commit, err := revParse(opts.OriginDir, opts.Ref) + if err != nil { + return AddResult{}, err + } + + destPath := vendorRoot + "/" + opts.Skill + destDir := filepath.Join(opts.RepoRoot, ".agents", "skills", opts.Skill) + + action, err := resolvePlacement(opts, srcDir, destDir, treeSha) + if err != nil { + return AddResult{}, err + } + + result := AddResult{ + Name: manifest.Name, + Version: manifest.Version, + Path: destPath, + Commit: commit, + TreeSha: treeSha, + Action: action, + DryRun: opts.DryRun, + } + + if opts.DryRun || action == ActionUnchanged { + return result, nil + } + + if action == ActionOverwritten { + if err := os.RemoveAll(destDir); err != nil { + return AddResult{}, fmt.Errorf("remove %s: %w", destDir, err) + } + } + + if err := copyTreePreservingModes(srcDir, destDir); err != nil { + return AddResult{}, err + } + + if err := writeLockEntry(opts.RepoRoot, manifest.Name, opts.Origin, result); err != nil { + return AddResult{}, err + } + + return result, nil +} + +// resolvePlacement inspects the destination and decides the Action without +// writing anything. A fresh placement is ActionVendored. An existing tree is +// ActionUnchanged (idempotent) only when its on-disk content is byte-identical +// to the origin source AND the lock records the matching fingerprint; any +// divergence is ActionOverwritten under Force, otherwise an *OverwriteError. 
+func resolvePlacement(opts AddOptions, srcDir, destDir, treeSha string) (Action, error) { + if _, err := os.Stat(destDir); err != nil { + if os.IsNotExist(err) { + return ActionVendored, nil + } + + return "", fmt.Errorf("stat %s: %w", destDir, err) + } + + lock, err := ReadLock(opts.RepoRoot) + if err != nil { + return "", err + } + + identical, err := treesIdentical(srcDir, destDir) + if err != nil { + return "", err + } + + recorded := lock.Skills[opts.Skill].TreeSha + if identical && recorded == treeSha { + return ActionUnchanged, nil + } + + if !opts.Force { + return "", &OverwriteError{ + Skill: opts.Skill, + Path: vendorRoot + "/" + opts.Skill, + } + } + + return ActionOverwritten, nil +} + +// treesIdentical reports whether the directory trees at a and b have the same +// set of relative paths with byte-identical regular-file contents. It is the +// read-only divergence probe for the placement guard (no git objects written, +// per research): the on-disk vendored tree is "unchanged" only when it still +// matches the origin source exactly. +func treesIdentical(a, b string) (bool, error) { + files := map[string]struct{}{} + + walkA, err := relFiles(a) + if err != nil { + return false, err + } + + walkB, err := relFiles(b) + if err != nil { + return false, err + } + + if len(walkA) != len(walkB) { + return false, nil + } + + for rel := range walkA { + files[rel] = struct{}{} + } + + for rel := range walkB { + if _, ok := files[rel]; !ok { + return false, nil + } + } + + for rel := range files { + same, err := filesEqual(filepath.Join(a, rel), filepath.Join(b, rel)) + if err != nil { + return false, err + } + + if !same { + return false, nil + } + } + + return true, nil +} + +// relFiles returns the set of regular-file paths under root, each relative to +// root. Directories are not listed (an empty directory is not part of git's +// content identity anyway). 
+func relFiles(root string) (map[string]struct{}, error) { + out := map[string]struct{}{} + + err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error { + if err != nil { + return err + } + + if d.IsDir() { + return nil + } + + rel, relErr := filepath.Rel(root, path) + if relErr != nil { + return relErr + } + + out[rel] = struct{}{} + + return nil + }) + if err != nil { + return nil, err + } + + return out, nil +} + +// filesEqual reports whether two files have identical mode bits and identical +// bytes. The exec bit is part of the tree-SHA, so a mode change alone is a +// divergence. +func filesEqual(a, b string) (bool, error) { + infoA, err := os.Stat(a) + if err != nil { + return false, fmt.Errorf("stat %s: %w", a, err) + } + + infoB, err := os.Stat(b) + if err != nil { + if os.IsNotExist(err) { + return false, nil + } + + return false, fmt.Errorf("stat %s: %w", b, err) + } + + if infoA.Mode().Perm() != infoB.Mode().Perm() || infoA.Size() != infoB.Size() { + return false, nil + } + + //nolint:gosec // G304: a and b are paths within the resolved origin/repo trees. + dataA, err := os.ReadFile(a) + if err != nil { + return false, fmt.Errorf("read %s: %w", a, err) + } + + //nolint:gosec // G304: see above; both paths are tool-controlled subtree members. + dataB, err := os.ReadFile(b) + if err != nil { + return false, fmt.Errorf("read %s: %w", b, err) + } + + return bytes.Equal(dataA, dataB), nil +} + +// writeLockEntry merges one skill's entry into the lock at repoRoot, preserving +// every other skill and pinning LockfileVersion to 1. It records origin (the +// resolved OWNER/REPO[@REF] the CLI vendored from) when non-empty, leaving any +// existing value in place otherwise. 
+func writeLockEntry(repoRoot, name, origin string, result AddResult) error { + lock, err := ReadLock(repoRoot) + if err != nil { + return err + } + + lock.LockfileVersion = 1 + if origin != "" { + lock.Origin = origin + } + + if lock.Skills == nil { + lock.Skills = map[string]LockEntry{} + } + + lock.Skills[name] = LockEntry{ + Version: result.Version, + Commit: result.Commit, + TreeSha: result.TreeSha, + Path: result.Path, + } + + return WriteLock(repoRoot, lock) +} + +// copyTreePreservingModes recursively copies src to dst byte-identically, +// preserving each file's mode (the exec bit is part of the tree SHA) and +// creating directories with 0o750. It injects nothing. +func copyTreePreservingModes(src, dst string) error { + return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error { + if err != nil { + return err + } + + rel, err := filepath.Rel(src, path) + if err != nil { + return err + } + + target := filepath.Join(dst, rel) + + if d.IsDir() { + if err := os.MkdirAll(target, 0o750); err != nil { + return fmt.Errorf("create %s: %w", target, err) + } + + return nil + } + + return copyFilePreservingMode(path, target) + }) +} + +// copyFilePreservingMode copies a single regular file byte-identically and then +// chmods the destination to match the source's mode bits. +func copyFilePreservingMode(src, dst string) error { + info, err := os.Stat(src) + if err != nil { + return fmt.Errorf("stat %s: %w", src, err) + } + + //nolint:gosec // G304: src is a file within the resolved origin subtree. + in, err := os.Open(src) + if err != nil { + return fmt.Errorf("open %s: %w", src, err) + } + + defer func() { _ = in.Close() }() + + if err := os.MkdirAll(filepath.Dir(dst), 0o750); err != nil { + return fmt.Errorf("create %s: %w", filepath.Dir(dst), err) + } + + //nolint:gosec // G304: dst is within the repo's canonical .agents/skills root.
+ out, err := os.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, info.Mode()) + if err != nil { + return fmt.Errorf("create %s: %w", dst, err) + } + + if _, err := io.Copy(out, in); err != nil { + _ = out.Close() + + return fmt.Errorf("copy %s: %w", dst, err) + } + + if err := out.Close(); err != nil { + return fmt.Errorf("close %s: %w", dst, err) + } + + if err := os.Chmod(dst, info.Mode()); err != nil { + return fmt.Errorf("chmod %s: %w", dst, err) + } + + return nil +} diff --git a/pkg/skillcore/add_test.go b/pkg/skillcore/add_test.go new file mode 100644 index 0000000..832b848 --- /dev/null +++ b/pkg/skillcore/add_test.go @@ -0,0 +1,221 @@ +package skillcore + +import ( + "errors" + "os" + "path/filepath" + "testing" +) + +// newConsumer returns a fresh tmpDir git repo to vendor into. +func newConsumer(t *testing.T) string { + t.Helper() + + dir := t.TempDir() + runGit(t, dir, "init", "-q") + + return dir +} + +// addOpts builds AddOptions for the bootstrapped origin + consumer. +func addOpts(originDir, skill, repoRoot string, force bool) AddOptions { + return AddOptions{ + OriginDir: originDir, + Ref: "HEAD", + Skill: skill, + RepoRoot: repoRoot, + Origin: "my-org/my-skills", + Force: force, + } +} + +// TestAdd_VendorsAndRecordsLock is the happy path: a first add writes the +// subtree byte-identically under .agents/skills// and records a lock entry +// whose treeSha equals the origin's raw-git tree-SHA (independent oracle, D11). 
+func TestAdd_VendorsAndRecordsLock(t *testing.T) { + t.Parallel() + + originDir, skill := bootstrapOrigin(t) + consumer := newConsumer(t) + + res, err := Add(addOpts(originDir, skill, consumer, false)) + if err != nil { + t.Fatalf("Add: %v", err) + } + + if res.Action != ActionVendored { + t.Errorf("Action = %q, want %q", res.Action, ActionVendored) + } + + wantTree := runGit(t, originDir, "rev-parse", "HEAD:skills/"+skill) + if res.TreeSha != wantTree { + t.Errorf("AddResult.TreeSha = %q, want (raw git) %q", res.TreeSha, wantTree) + } + + // The manifest was copied, not injected/altered. + vendored := filepath.Join(consumer, ".agents", "skills", skill, "skill.toml") + + got, err := os.ReadFile(vendored) + if err != nil { + t.Fatalf("read vendored manifest: %v", err) + } + + if string(got) != sampleManifest { + t.Error("vendored skill.toml is not byte-identical to the origin") + } + + // The lock records the same fingerprint and the configured origin. + lock, err := ReadLock(consumer) + if err != nil { + t.Fatalf("ReadLock: %v", err) + } + + entry, ok := lock.Skills[skill] + if !ok { + t.Fatalf("lock has no entry for %q (entries: %v)", skill, lock.Skills) + } + + if entry.TreeSha != wantTree { + t.Errorf("lock treeSha = %q, want %q", entry.TreeSha, wantTree) + } + + if lock.Origin != "my-org/my-skills" { + t.Errorf("lock origin = %q, want my-org/my-skills", lock.Origin) + } +} + +// TestAdd_Idempotent asserts re-adding identical content is a no-op: +// action=unchanged, same fingerprint, no error (FR idempotency). 
+func TestAdd_Idempotent(t *testing.T) { + t.Parallel() + + originDir, skill := bootstrapOrigin(t) + consumer := newConsumer(t) + + first, err := Add(addOpts(originDir, skill, consumer, false)) + if err != nil { + t.Fatalf("first Add: %v", err) + } + + second, err := Add(addOpts(originDir, skill, consumer, false)) + if err != nil { + t.Fatalf("second Add: %v", err) + } + + if second.Action != ActionUnchanged { + t.Errorf("second Action = %q, want %q", second.Action, ActionUnchanged) + } + + if second.TreeSha != first.TreeSha { + t.Errorf("treeSha drifted across idempotent adds: %q vs %q", second.TreeSha, first.TreeSha) + } +} + +// TestAdd_DivergentRefused asserts the never-silently-clobber guard (FR-004): +// once a vendored skill is locally modified it diverges from the lock, and a +// plain re-add must refuse with an *OverwriteError (the CLI maps it to exit 1). +func TestAdd_DivergentRefused(t *testing.T) { + t.Parallel() + + originDir, skill := bootstrapOrigin(t) + consumer := newConsumer(t) + + if _, err := Add(addOpts(originDir, skill, consumer, false)); err != nil { + t.Fatalf("seed Add: %v", err) + } + + // Diverge the vendored copy. + writeFile(t, consumer, filepath.Join(".agents/skills", skill, "SKILL.md"), 0o644, "tampered\n") + + _, err := Add(addOpts(originDir, skill, consumer, false)) + if err == nil { + t.Fatal("Add over divergent copy: want error, got nil") + } + + var oe *OverwriteError + if !errors.As(err, &oe) { + t.Fatalf("Add error = %T (%v), want *OverwriteError", err, err) + } +} + +// TestAdd_ForceOverwritesDivergent asserts --force re-vendors a divergent copy +// (action=overwritten) and restores byte-identical origin content. 
+func TestAdd_ForceOverwritesDivergent(t *testing.T) { + t.Parallel() + + originDir, skill := bootstrapOrigin(t) + consumer := newConsumer(t) + + if _, err := Add(addOpts(originDir, skill, consumer, false)); err != nil { + t.Fatalf("seed Add: %v", err) + } + + writeFile(t, consumer, filepath.Join(".agents/skills", skill, "SKILL.md"), 0o644, "tampered\n") + + res, err := Add(addOpts(originDir, skill, consumer, true)) + if err != nil { + t.Fatalf("forced Add: %v", err) + } + + if res.Action != ActionOverwritten { + t.Errorf("Action = %q, want %q", res.Action, ActionOverwritten) + } + + restored, err := os.ReadFile( + filepath.Join(consumer, ".agents", "skills", skill, "SKILL.md")) + if err != nil { + t.Fatalf("read restored SKILL.md: %v", err) + } + + if string(restored) != sampleSkillMd { + t.Errorf("forced overwrite did not restore origin content: got %q", restored) + } +} + +// TestAdd_SkillNotFound asserts a request for an absent origin skill returns a +// *SkillNotFoundError (CLI → exit 1), not a panic or generic error. +func TestAdd_SkillNotFound(t *testing.T) { + t.Parallel() + + originDir, _ := bootstrapOrigin(t) + consumer := newConsumer(t) + + _, err := Add(addOpts(originDir, "no-such-skill", consumer, false)) + if err == nil { + t.Fatal("Add(missing skill): want error, got nil") + } + + var nf *SkillNotFoundError + if !errors.As(err, &nf) { + t.Fatalf("Add error = %T (%v), want *SkillNotFoundError", err, err) + } +} + +// TestAdd_DryRunWritesNothing asserts --dry-run reports the action but leaves no +// vendored files and no lock on disk. 
+func TestAdd_DryRunWritesNothing(t *testing.T) { + t.Parallel() + + originDir, skill := bootstrapOrigin(t) + consumer := newConsumer(t) + + opts := addOpts(originDir, skill, consumer, false) + opts.DryRun = true + + res, err := Add(opts) + if err != nil { + t.Fatalf("dry-run Add: %v", err) + } + + if !res.DryRun { + t.Error("AddResult.DryRun = false, want true") + } + + if _, err := os.Stat(filepath.Join(consumer, ".agents", "skills", skill)); !os.IsNotExist(err) { + t.Error("dry-run vendored files on disk; want nothing written") + } + + if _, err := os.Stat(filepath.Join(consumer, ".skillrig", "skills-lock.json")); !os.IsNotExist(err) { + t.Error("dry-run wrote a lock file; want nothing written") + } +} diff --git a/pkg/skillcore/errors.go b/pkg/skillcore/errors.go new file mode 100644 index 0000000..6e8000c --- /dev/null +++ b/pkg/skillcore/errors.go @@ -0,0 +1,50 @@ +package skillcore + +import ( + "fmt" + "strconv" +) + +// VerifyFailure is returned by Verify when at least one verdict is not ok. It +// carries the full Report so callers can render it their own way and choose +// their own exit policy. It is presentation-free: Error is terse. +// +//nolint:errname // name fixed by the skillcore SDK contract (contracts/skillcore-sdk.md); a report-carrying failure, not an "XxxError". +type VerifyFailure struct { + Report Report +} + +func (e *VerifyFailure) Error() string { + return "skillcore: verification failed (" + + strconv.Itoa(len(e.Report.Verdicts)) + " verdicts)" +} + +// LockError is returned when the lock file cannot be read or is malformed +// (unreadable, unparseable, or an unsupported lockfileVersion). It is a +// configuration/usage problem — the CLI maps it to exit 1 — distinct from a +// *VerifyFailure, which reports per-skill findings (exit 2). It is +// presentation-free: it carries the lock path and the raw underlying cause. 
+type LockError struct { + Path string + Cause error +} + +func (e *LockError) Error() string { + return fmt.Sprintf("skillcore: invalid lock %q: %v", e.Path, e.Cause) +} + +func (e *LockError) Unwrap() error { + return e.Cause +} + +// GitError is returned when a git invocation fails. It carries the process exit +// code and captured stderr, mirroring the gh/git client pattern, so the caller +// can render an environment error. It is presentation-free. +type GitError struct { + ExitCode int + Stderr string +} + +func (e *GitError) Error() string { + return fmt.Sprintf("git failed (exit %d): %s", e.ExitCode, e.Stderr) +} diff --git a/pkg/skillcore/git.go b/pkg/skillcore/git.go new file mode 100644 index 0000000..14a6c16 --- /dev/null +++ b/pkg/skillcore/git.go @@ -0,0 +1,85 @@ +package skillcore + +import ( + "bytes" + "context" + "errors" + "os/exec" + "strings" +) + +// gitClient shells the git binary for skillcore's integrity primitives. It is +// modeled on gh-cli's git.Client (research D7): a small, testable wrapper whose +// command constructor is a pluggable field so unit tests can swap in a stub +// while integration tests run real git. It captures stdout/stderr into buffers +// and never writes to os.Stdout/os.Stderr — the CLI owns all presentation. +type gitClient struct { + // commandContext builds the *exec.Cmd for a git invocation. It defaults to + // exec.CommandContext; tests override it to return a stubbed command. + commandContext func(ctx context.Context, name string, args ...string) *exec.Cmd +} + +// newGitClient returns a gitClient that shells the real git binary. +func newGitClient() *gitClient { + return &gitClient{commandContext: exec.CommandContext} +} + +// run invokes git with args, capturing stdout and stderr into buffers. On a +// non-zero exit it returns a *GitError carrying the exit code and trimmed +// stderr; on success it returns the trimmed stdout. 
+func (c *gitClient) run(ctx context.Context, args ...string) (string, error) { + var stdout, stderr bytes.Buffer + + cmd := c.commandContext(ctx, "git", args...) + cmd.Stdout = &stdout + cmd.Stderr = &stderr + + if err := cmd.Run(); err != nil { + // A non-zero git exit surfaces as *exec.ExitError carrying the code; + // any other failure (e.g. git not on PATH) has no exit code, so we + // record -1 to signal "git could not be run". + exitCode := -1 + + var exitErr *exec.ExitError + if errors.As(err, &exitErr) { + exitCode = exitErr.ExitCode() + } + + return "", &GitError{ + ExitCode: exitCode, + Stderr: strings.TrimSpace(stderr.String()), + } + } + + return strings.TrimSpace(stdout.String()), nil +} + +// revParse runs `git -C rev-parse ` and returns the trimmed +// output (e.g. a resolved commit or tree SHA). +func (c *gitClient) revParse(gitDir, rev string) (string, error) { + return c.run(context.Background(), "-C", gitDir, "rev-parse", rev) +} + +// statusPorcelain runs `git -C status --porcelain -- ` and +// returns the trimmed output. Empty output means relPath is clean versus HEAD. +func (c *gitClient) statusPorcelain(gitDir, relPath string) (string, error) { + return c.run( + context.Background(), + "-C", gitDir, + "status", "--porcelain", + "--", relPath, + ) +} + +// revParse runs `git -C rev-parse ` using the default client (the +// real git binary). It is the package-level entry point TreeSHA dispatches to; +// the client method underneath stays pluggable for skillcore's own unit tests. +func revParse(gitDir, rev string) (string, error) { + return newGitClient().revParse(gitDir, rev) +} + +// statusPorcelain runs `git -C status --porcelain -- ` using +// the default client. Verify dispatches to it to detect uncommitted changes. 
+func statusPorcelain(gitDir, relPath string) (string, error) { + return newGitClient().statusPorcelain(gitDir, relPath) +} diff --git a/pkg/skillcore/helpers_test.go b/pkg/skillcore/helpers_test.go new file mode 100644 index 0000000..7f5fc40 --- /dev/null +++ b/pkg/skillcore/helpers_test.go @@ -0,0 +1,156 @@ +package skillcore + +import ( + "context" + "fmt" + "os" + "os/exec" + "path/filepath" + "strconv" + "strings" + "testing" +) + +// pinnedGitEnv is a fully reproducible author/committer identity+date (research +// D8): with it the commit SHA is deterministic; tests that only need a +// well-formed commit can ignore it, but pinning keeps the fixtures stable. +func pinnedGitEnv() []string { + const stamp = "2026-01-01T00:00:00Z" + + return append(os.Environ(), + "GIT_AUTHOR_NAME=skillrig", + "GIT_AUTHOR_EMAIL=ci@skillrig.dev", + "GIT_AUTHOR_DATE="+stamp, + "GIT_COMMITTER_NAME=skillrig", + "GIT_COMMITTER_EMAIL=ci@skillrig.dev", + "GIT_COMMITTER_DATE="+stamp, + // Neutralize any ambient user config so the fixture is hermetic. + "GIT_CONFIG_GLOBAL=/dev/null", + "GIT_CONFIG_SYSTEM=/dev/null", + ) +} + +// runGit execs the real git binary in dir with the pinned identity and fails +// the test on any error. It is the independent oracle (research D11): the tests +// drive setup and compute expected values through raw git, never through +// skillcore, so a TreeSHA bug cannot hide behind matching-but-wrong output. +func runGit(t *testing.T, dir string, args ...string) string { + t.Helper() + + cmd := exec.CommandContext(context.Background(), "git", args...) + cmd.Dir = dir + cmd.Env = pinnedGitEnv() + + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("git %s: %v\n%s", strings.Join(args, " "), err, out) + } + + return strings.TrimSpace(string(out)) +} + +// writeFile writes content under dir/rel (creating parents) with the given +// mode, failing the test on error. 
+func writeFile(t *testing.T, dir, rel string, mode os.FileMode, content string) { + t.Helper() + + path := filepath.Join(dir, rel) + if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil { + t.Fatalf("mkdir %s: %v", filepath.Dir(path), err) + } + + if err := os.WriteFile(path, []byte(content), mode); err != nil { + t.Fatalf("write %s: %v", path, err) + } +} + +// sampleManifest is a representative skill.toml carrying [[requires]] and an +// UNKNOWN key, mirroring the data-model sample. Content is illustrative (D12): +// the tests recompute every expected SHA independently, so it may change freely. +const sampleManifest = `name = "terraform-plan-review" +version = "1.4.0" +namespace = "dev.skillrig.samples" +description = "Review a terraform plan" +tags = ["terraform", "review"] + +# Unknown top-level key — must be ignored (forward-compat). +experimental = true + +[[requires]] +tool = "terraform" +version = ">=1.5" +source = "https://releases.hashicorp.com" +manager = "asdf" + +[[requires]] +tool = "tflint" +version = "0.50.0" +# Unknown per-require key — must also be ignored. +optional = true +` + +const sampleSkillMd = "# terraform-plan-review\n\nReview a terraform plan.\n" + +// bootstrapOrigin creates a real git repo in a fresh tmpDir containing a single +// committed skill at skills//, returning the repo dir and skill name. The +// commit is reproducible (pinned identity), and the subtree tree-SHA is +// content-only and therefore deterministic. +func bootstrapOrigin(t *testing.T) (dir, skill string) { + t.Helper() + + dir = t.TempDir() + skill = "terraform-plan-review" + + runGit(t, dir, "init", "-q") + + writeFile(t, dir, filepath.Join("skills", skill, "SKILL.md"), 0o644, sampleSkillMd) + writeFile(t, dir, filepath.Join("skills", skill, "skill.toml"), 0o644, sampleManifest) + // A non-skill origin file, to prove add ignores everything outside skills/. 
+ writeFile(t, dir, ".skillrig-origin.toml", 0o644, "convention_version = 1\nskills_dir = \"skills\"\n") + + runGit(t, dir, "add", "-A") + runGit(t, dir, "commit", "-q", "-m", "seed sample skill") + + return dir, skill +} + +// The helper-process pattern (Go stdlib's os/exec test idiom): TestHelperProcess +// is a real test that, when invoked with GO_WANT_HELPER_PROCESS=1, impersonates +// the git binary. stubCommandContext returns a commandContext that re-execs the +// test binary into this function, letting unit tests simulate an exact git exit +// code + stderr without a real git invocation. +func stubCommandContext(exitCode int, stderr string) func(ctx context.Context, name string, args ...string) *exec.Cmd { + return func(ctx context.Context, _ string, args ...string) *exec.Cmd { + // Re-exec this test binary, routing into TestHelperProcess. + csArgs := append([]string{"-test.run=TestHelperProcess", "--"}, args...) + + cmd := exec.CommandContext(ctx, os.Args[0], csArgs...) + cmd.Env = []string{ + "GO_WANT_HELPER_PROCESS=1", + "HELPER_EXIT_CODE=" + strconv.Itoa(exitCode), + "HELPER_STDERR=" + stderr, + } + + return cmd + } +} + +// TestHelperProcess is not a real test: it is the stub "git" binary re-exec'd by +// stubCommandContext. It writes HELPER_STDERR to stderr and exits with +// HELPER_EXIT_CODE, so the gitClient under test sees a genuine *exec.ExitError. +// It must be named TestHelperProcess (matched by -test.run) and must not call +// t.Parallel — it impersonates git and calls os.Exit. +// +//nolint:paralleltest // not a real test; the os/exec helper-process re-exec target. 
+func TestHelperProcess(_ *testing.T) { + if os.Getenv("GO_WANT_HELPER_PROCESS") != "1" { + return + } + + if msg := os.Getenv("HELPER_STDERR"); msg != "" { + fmt.Fprint(os.Stderr, msg) + } + + code, _ := strconv.Atoi(os.Getenv("HELPER_EXIT_CODE")) + os.Exit(code) +} diff --git a/pkg/skillcore/lock.go b/pkg/skillcore/lock.go new file mode 100644 index 0000000..43939ac --- /dev/null +++ b/pkg/skillcore/lock.go @@ -0,0 +1,114 @@ +package skillcore + +import ( + "encoding/json" + "errors" + "fmt" + "io/fs" + "os" + "path/filepath" +) + +// lockDirName is the per-repo directory holding the lock file. +const lockDirName = ".skillrig" + +// lockFileName is the tool-written lock file inside lockDirName. +const lockFileName = "skills-lock.json" + +// LockFile is the parsed .skillrig/skills-lock.json: the committed, +// tool-written record of every vendored skill. It carries no requires data +// (the manifest owns that, D4). +type LockFile struct { + LockfileVersion int `json:"lockfileVersion"` + Origin string `json:"origin"` + Skills map[string]LockEntry `json:"skills"` +} + +// LockEntry is the locked record for one vendored skill, keyed by skill name. +// Note: no requires field (D4) — the on-disk manifest is the single source of +// truth for dependencies. +type LockEntry struct { + Version string `json:"version"` + Commit string `json:"commit"` + TreeSha string `json:"treeSha"` + Path string `json:"path"` +} + +// lockPath returns the lock file path for a repo root. +func lockPath(repoRoot string) string { + return filepath.Join(repoRoot, lockDirName, lockFileName) +} + +// ReadLock reads the lock at repoRoot's .skillrig/skills-lock.json. An absent +// file is not an error: it returns a zero LockFile and a nil error. A present +// file is JSON-unmarshalled. 
+func ReadLock(repoRoot string) (LockFile, error) { + path := lockPath(repoRoot) + + //nolint:gosec // G304: path is built from repoRoot + fixed names, not + // attacker-controlled; reading the designated lock file is the point. + data, err := os.ReadFile(path) + if err != nil { + if errors.Is(err, fs.ErrNotExist) { + return LockFile{}, nil + } + + return LockFile{}, fmt.Errorf("read %s: %w", path, err) + } + + var lf LockFile + if err := json.Unmarshal(data, &lf); err != nil { + return LockFile{}, fmt.Errorf("parse %s: %w", path, err) + } + + return lf, nil +} + +// WriteLock writes lf to repoRoot's .skillrig/skills-lock.json with +// deterministic serialization (Go sorts map keys; 2-space indent; trailing +// newline) via an atomic temp-file-plus-rename. The temp file lives in the same +// directory so os.Rename stays on one filesystem, mirroring internal/config.Save. +func WriteLock(repoRoot string, lf LockFile) error { + data, err := json.MarshalIndent(lf, "", " ") + if err != nil { + return fmt.Errorf("encode lock: %w", err) + } + + data = append(data, '\n') + + path := lockPath(repoRoot) + dir := filepath.Dir(path) + + if err := os.MkdirAll(dir, 0o750); err != nil { + return fmt.Errorf("create %s: %w", dir, err) + } + + tmp, err := os.CreateTemp(dir, lockFileName+".tmp-*") + if err != nil { + return fmt.Errorf("create temp in %s: %w", dir, err) + } + + tmpName := tmp.Name() + // Best-effort cleanup if we bail before the rename. 
+ defer func() { _ = os.Remove(tmpName) }() + + if _, err := tmp.Write(data); err != nil { + _ = tmp.Close() + + return fmt.Errorf("write %s: %w", tmpName, err) + } + + if err := tmp.Close(); err != nil { + return fmt.Errorf("close %s: %w", tmpName, err) + } + + if err := os.Chmod(tmpName, 0o600); err != nil { + return fmt.Errorf("chmod %s: %w", tmpName, err) + } + + if err := os.Rename(tmpName, path); err != nil { + return fmt.Errorf("install %s: %w", path, err) + } + + return nil +} diff --git a/pkg/skillcore/lock_test.go b/pkg/skillcore/lock_test.go new file mode 100644 index 0000000..58c7ae5 --- /dev/null +++ b/pkg/skillcore/lock_test.go @@ -0,0 +1,131 @@ +package skillcore + +import ( + "os" + "path/filepath" + "reflect" + "strings" + "testing" +) + +// sampleLock is the data-model's ground-truth lock example: a single skill with +// a 40-hex commit + treeSha and a repo-relative path under .agents/skills. +func sampleLock() LockFile { + return LockFile{ + LockfileVersion: 1, + Origin: "my-org/my-skills", + Skills: map[string]LockEntry{ + "terraform-plan-review": { + Version: "1.4.0", + Commit: "9f1a052e596d5d28f13838061a1ab93207ef6fc3", + TreeSha: "c967789527370d2e0fba03a92e70dffef6f3bf31", + Path: ".agents/skills/terraform-plan-review", + }, + }, + } +} + +// TestLock_RoundTrip writes a lock, reads it back, and asserts equality — the +// committed, tool-written record must survive serialization losslessly. 
+func TestLock_RoundTrip(t *testing.T) { + t.Parallel() + + repoRoot := t.TempDir() + want := sampleLock() + + if err := WriteLock(repoRoot, want); err != nil { + t.Fatalf("WriteLock: %v", err) + } + + got, err := ReadLock(repoRoot) + if err != nil { + t.Fatalf("ReadLock: %v", err) + } + + if !reflect.DeepEqual(got, want) { + t.Errorf("round-trip mismatch:\n got = %+v\nwant = %+v", got, want) + } +} + +// TestLock_OnDiskShape pins the serialization contract (data-model §LockFile): +// 2-space indentation, a trailing newline, and crucially NO "requires" key — the +// manifest is the single source of truth for dependencies (D4). +func TestLock_OnDiskShape(t *testing.T) { + t.Parallel() + + repoRoot := t.TempDir() + if err := WriteLock(repoRoot, sampleLock()); err != nil { + t.Fatalf("WriteLock: %v", err) + } + + path := filepath.Join(repoRoot, ".skillrig", "skills-lock.json") + + raw, err := os.ReadFile(path) + if err != nil { + t.Fatalf("read lock: %v", err) + } + + text := string(raw) + + if !strings.HasSuffix(text, "\n") { + t.Error("lock file does not end with a trailing newline") + } + + if strings.Contains(text, "\t") { + t.Error("lock file contains a tab; indentation must be 2 spaces") + } + + // The first nested object key must be indented by exactly two spaces. + if !strings.Contains(text, "\n \"origin\"") { + t.Errorf("lock file is not 2-space indented:\n%s", text) + } + + if strings.Contains(strings.ToLower(text), "requires") { + t.Errorf("lock file leaks a 'requires' key (D4: manifest owns deps):\n%s", text) + } +} + +// TestReadLock_Absent codifies the contract that a missing lock is not an error: +// ReadLock returns a zero LockFile and a nil error so first-add flows just work. 
+func TestReadLock_Absent(t *testing.T) { + t.Parallel() + + got, err := ReadLock(t.TempDir()) + if err != nil { + t.Fatalf("ReadLock(absent): unexpected error: %v", err) + } + + if !reflect.DeepEqual(got, LockFile{}) { + t.Errorf("ReadLock(absent) = %+v, want zero LockFile", got) + } +} + +// TestWriteLock_NoPartialTempFile asserts the atomic-write discipline leaves no +// .tmp-* sibling behind after a successful WriteLock (temp file + rename). +func TestWriteLock_NoPartialTempFile(t *testing.T) { + t.Parallel() + + repoRoot := t.TempDir() + if err := WriteLock(repoRoot, sampleLock()); err != nil { + t.Fatalf("WriteLock: %v", err) + } + + dir := filepath.Join(repoRoot, ".skillrig") + + entries, err := os.ReadDir(dir) + if err != nil { + t.Fatalf("read .skillrig: %v", err) + } + + var names []string + for _, e := range entries { + names = append(names, e.Name()) + if strings.Contains(e.Name(), ".tmp-") { + t.Errorf("leftover temp file after WriteLock: %q", e.Name()) + } + } + + if len(names) != 1 || names[0] != "skills-lock.json" { + t.Errorf(".skillrig contents = %v, want exactly [skills-lock.json]", names) + } +} diff --git a/pkg/skillcore/manifest.go b/pkg/skillcore/manifest.go new file mode 100644 index 0000000..7545752 --- /dev/null +++ b/pkg/skillcore/manifest.go @@ -0,0 +1,46 @@ +package skillcore + +import ( + "fmt" + "os" + + "github.com/pelletier/go-toml/v2" +) + +// Manifest is a parsed skill.toml. It is read-only: ParseManifest reads it and +// Add/Verify consume Name/Version, but skillcore never writes it back. +type Manifest struct { + Name string `toml:"name"` + Version string `toml:"version"` + Namespace string `toml:"namespace"` + Description string `toml:"description"` + Tags []string `toml:"tags"` + Requires []Require `toml:"requires"` +} + +// Require is one tool dependency declared in a skill.toml. It is parsed but is +// deliberately NOT written to the lock — the on-disk manifest stays the single +// source of truth, read later by doctor. 
+type Require struct { + Tool string `toml:"tool"` + Version string `toml:"version"` + Source string `toml:"source"` + Manager string `toml:"manager"` +} + +// ParseManifest parses the skill.toml at path. Unknown keys are ignored for +// forward-compatibility (default Unmarshal — strict mode is deliberately off). +func ParseManifest(path string) (Manifest, error) { + //nolint:gosec // G304: path is a skill.toml within the resolved origin/repo subtree. + data, err := os.ReadFile(path) + if err != nil { + return Manifest{}, fmt.Errorf("reading skill manifest %q: %w", path, err) + } + + var m Manifest + if err := toml.Unmarshal(data, &m); err != nil { + return Manifest{}, fmt.Errorf("parsing skill manifest %q: %w", path, err) + } + + return m, nil +} diff --git a/pkg/skillcore/manifest_test.go b/pkg/skillcore/manifest_test.go new file mode 100644 index 0000000..835c730 --- /dev/null +++ b/pkg/skillcore/manifest_test.go @@ -0,0 +1,97 @@ +package skillcore + +import ( + "path/filepath" + "reflect" + "testing" +) + +// TestParseManifest_RequiresAndUnknownKeys asserts the parse contract: a +// skill.toml with [[requires]] array-of-tables AND unknown keys (both top-level +// and per-require) parses into the expected Manifest, ignoring the unknowns with +// no error (forward-compat — strict mode is deliberately off). 
+func TestParseManifest_RequiresAndUnknownKeys(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + writeFile(t, dir, "skill.toml", 0o644, sampleManifest) + + got, err := ParseManifest(filepath.Join(dir, "skill.toml")) + if err != nil { + t.Fatalf("ParseManifest: unexpected error: %v", err) + } + + want := Manifest{ + Name: "terraform-plan-review", + Version: "1.4.0", + Namespace: "dev.skillrig.samples", + Description: "Review a terraform plan", + Tags: []string{"terraform", "review"}, + Requires: []Require{ + { + Tool: "terraform", + Version: ">=1.5", + Source: "https://releases.hashicorp.com", + Manager: "asdf", + }, + { + Tool: "tflint", + Version: "0.50.0", + }, + }, + } + + if !reflect.DeepEqual(got, want) { + t.Errorf("ParseManifest mismatch:\n got = %+v\nwant = %+v", got, want) + } +} + +// TestParseManifest_Minimal confirms a manifest with no [[requires]] parses to a +// nil/empty Requires slice (add only needs name + version). +func TestParseManifest_Minimal(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + writeFile(t, dir, "skill.toml", 0o644, "name = \"solo\"\nversion = \"0.1.0\"\n") + + got, err := ParseManifest(filepath.Join(dir, "skill.toml")) + if err != nil { + t.Fatalf("ParseManifest: %v", err) + } + + if got.Name != "solo" || got.Version != "0.1.0" { + t.Errorf("name/version = %q/%q, want solo/0.1.0", got.Name, got.Version) + } + + if len(got.Requires) != 0 { + t.Errorf("Requires = %+v, want empty", got.Requires) + } +} + +// TestParseManifest_Errors covers the failure surface: a missing file and +// malformed TOML must both return an error (the CLI renders it; the SDK only +// returns it). 
+func TestParseManifest_Errors(t *testing.T) { + t.Parallel() + + t.Run("missing file", func(t *testing.T) { + t.Parallel() + + _, err := ParseManifest(filepath.Join(t.TempDir(), "absent.toml")) + if err == nil { + t.Fatal("ParseManifest(absent): want error, got nil") + } + }) + + t.Run("malformed toml", func(t *testing.T) { + t.Parallel() + + dir := t.TempDir() + writeFile(t, dir, "skill.toml", 0o644, "name = \"x\"\nthis is = = not toml\n") + + _, err := ParseManifest(filepath.Join(dir, "skill.toml")) + if err == nil { + t.Fatal("ParseManifest(malformed): want error, got nil") + } + }) +} diff --git a/pkg/skillcore/treesha.go b/pkg/skillcore/treesha.go new file mode 100644 index 0000000..3e84b2d --- /dev/null +++ b/pkg/skillcore/treesha.go @@ -0,0 +1,25 @@ +// Package skillcore is the single, presentation-free implementation of +// skillrig's integrity primitives: git tree-SHA, skill.toml parsing, +// skills-lock.json I/O, and the Add/Verify operations. It returns typed +// values and typed errors and never writes to stdout/stderr or formats +// user-facing text — the CLI (or any third-party consumer) owns presentation. +// It never fetches: it operates purely on the local filesystem and local git. +package skillcore + +import "strings" + +// TreeSHA returns the git tree-object SHA of relPath at ref within the git +// repository rooted at gitDir, by shelling git rev-parse :. +// The value is git-canonical and relocation-invariant: it depends only on the +// subtree contents, so the SHA Add records on the origin equals the SHA Verify +// recomputes on the consumer's committed tree. relPath is repo-relative +// ("skills/foo" on the origin, ".agents/skills/foo" on the consumer) and must +// resolve to a directory (a skill subtree). Returns a *GitError when git fails. 
+func TreeSHA(gitDir, ref, relPath string) (string, error) { + sha, err := revParse(gitDir, ref+":"+relPath) + if err != nil { + return "", err + } + + return strings.TrimSpace(sha), nil +} diff --git a/pkg/skillcore/treesha_test.go b/pkg/skillcore/treesha_test.go new file mode 100644 index 0000000..63ebb44 --- /dev/null +++ b/pkg/skillcore/treesha_test.go @@ -0,0 +1,166 @@ +package skillcore + +import ( + "context" + "errors" + "path/filepath" + "regexp" + "testing" +) + +// hex40 matches a git SHA-1 (40 lowercase hex chars). +var hex40 = regexp.MustCompile(`^[0-9a-f]{40}$`) + +// TestTreeSHA_GroundTruth is the Constitution-III ground-truth anchor: it +// asserts skillcore.TreeSHA equals the value raw `git rev-parse HEAD:` +// produces against a bootstrapped fixture. git is the independent oracle (D11) — +// the expected value is never routed through skillcore, so a TreeSHA bug cannot +// hide behind agreeing-but-wrong output. +func TestTreeSHA_GroundTruth(t *testing.T) { + t.Parallel() + + dir, skill := bootstrapOrigin(t) + relPath := "skills/" + skill + + // Independent oracle: raw git, not skillcore. + want := runGit(t, dir, "rev-parse", "HEAD:"+relPath) + + got, err := TreeSHA(dir, "HEAD", relPath) + if err != nil { + t.Fatalf("TreeSHA: unexpected error: %v", err) + } + + if !hex40.MatchString(got) { + t.Errorf("TreeSHA = %q, want a 40-hex tree SHA", got) + } + + if got != want { + t.Errorf("TreeSHA = %q, want (raw git) %q", got, want) + } +} + +// TestTreeSHA_RelocationInvariance proves the fingerprint depends only on the +// subtree contents: the same skill committed at skills/ in one repo and at +// .agents/skills/ in another yields the identical tree-SHA. This is the +// entire label-honesty mechanism (add records origin's SHA; verify recomputes +// the consumer's and they match). 
+func TestTreeSHA_RelocationInvariance(t *testing.T) { + t.Parallel() + + originDir, skill := bootstrapOrigin(t) + + originSHA, err := TreeSHA(originDir, "HEAD", "skills/"+skill) + if err != nil { + t.Fatalf("origin TreeSHA: %v", err) + } + + // A second repo with the SAME files at a DIFFERENT path. + consumerDir := t.TempDir() + runGit(t, consumerDir, "init", "-q") + writeFile(t, consumerDir, filepath.Join(".agents/skills", skill, "SKILL.md"), 0o644, sampleSkillMd) + writeFile(t, consumerDir, filepath.Join(".agents/skills", skill, "skill.toml"), 0o644, sampleManifest) + runGit(t, consumerDir, "add", "-A") + runGit(t, consumerDir, "commit", "-q", "-m", "vendor") + + consumerSHA, err := TreeSHA(consumerDir, "HEAD", ".agents/skills/"+skill) + if err != nil { + t.Fatalf("consumer TreeSHA: %v", err) + } + + if consumerSHA != originSHA { + t.Errorf("relocation changed the tree SHA: origin %q, consumer %q", originSHA, consumerSHA) + } +} + +// TestTreeSHA_RealGitError exercises the public TreeSHA error path against real +// git: asking for a path absent from HEAD makes rev-parse exit non-zero, and +// TreeSHA must surface a *GitError carrying that positive exit code and stderr. 
+func TestTreeSHA_RealGitError(t *testing.T) { + t.Parallel() + + dir, _ := bootstrapOrigin(t) + + _, err := TreeSHA(dir, "HEAD", "skills/does-not-exist") + if err == nil { + t.Fatal("TreeSHA: want error for a missing subtree, got nil") + } + + var gitErr *GitError + if !errors.As(err, &gitErr) { + t.Fatalf("TreeSHA error = %T (%v), want *GitError", err, err) + } + + if gitErr.ExitCode <= 0 { + t.Errorf("GitError.ExitCode = %d, want a positive git exit code", gitErr.ExitCode) + } + + if gitErr.Stderr == "" { + t.Error("GitError.Stderr is empty, want git's diagnostic text") + } +} + +// TestGitClient_StubbedExit is the explicit "stub the command constructor" +// error-path test (D8): swapping commandContext for a fake binary that exits +// non-zero with known stderr must yield a *GitError carrying exactly that exit +// code and (trimmed) stderr — no real git involved. This is the seam skillcore's +// git layer exposes for deterministic error-path unit tests. +func TestGitClient_StubbedExit(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + exitCode int + stderr string + }{ + {name: "generic failure", exitCode: 1, stderr: "fatal: not a git repository"}, + {name: "rev-parse bad object", exitCode: 128, stderr: "fatal: bad revision 'HEAD:skills/x'"}, + {name: "stderr is trimmed", exitCode: 2, stderr: " fatal: boom \n"}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + c := &gitClient{commandContext: stubCommandContext(tt.exitCode, tt.stderr)} + + _, err := c.run(context.Background(), "rev-parse", "HEAD:skills/x") + if err == nil { + t.Fatal("run: want error from non-zero git exit, got nil") + } + + var gitErr *GitError + if !errors.As(err, &gitErr) { + t.Fatalf("run error = %T (%v), want *GitError", err, err) + } + + if gitErr.ExitCode != tt.exitCode { + t.Errorf("GitError.ExitCode = %d, want %d", gitErr.ExitCode, tt.exitCode) + } + + // Stderr is trimmed by the client (errors-as-navigation: the raw + // 
cause, tidied) — compare against the trimmed expectation. + wantStderr := regexp.MustCompile(`^\s+|\s+$`).ReplaceAllString(tt.stderr, "") + if gitErr.Stderr != wantStderr { + t.Errorf("GitError.Stderr = %q, want %q", gitErr.Stderr, wantStderr) + } + }) + } +} + +// TestGitClient_StubbedSuccess confirms the same stub seam round-trips a +// successful invocation: exit 0 yields no error (the stub writes nothing to +// stdout, so the client returns the empty trimmed string). +func TestGitClient_StubbedSuccess(t *testing.T) { + t.Parallel() + + c := &gitClient{commandContext: stubCommandContext(0, "")} + + out, err := c.run(context.Background(), "rev-parse", "HEAD") + if err != nil { + t.Fatalf("run: unexpected error on exit 0: %v", err) + } + + if out != "" { + t.Errorf("run output = %q, want empty (stub wrote no stdout)", out) + } +} diff --git a/pkg/skillcore/verify.go b/pkg/skillcore/verify.go new file mode 100644 index 0000000..218bbe3 --- /dev/null +++ b/pkg/skillcore/verify.go @@ -0,0 +1,348 @@ +package skillcore + +import ( + "errors" + "io/fs" + "os" + "path/filepath" + "sort" + "strconv" + "strings" +) + +// Verdict statuses for Verdict.Status. +const ( + // StatusOK: locked, present, committed, and the recomputed tree-SHA matches. + StatusOK = "ok" + // StatusMismatch: locked and committed, but the tree-SHA differs (label-honesty failure). + StatusMismatch = "mismatch" + // StatusOrphan: on disk under .agents/skills/ but with no lock entry. + StatusOrphan = "orphan" + // StatusMissing: a lock entry whose path is absent on disk / from HEAD. + StatusMissing = "missing" + // StatusDirty: locked and present but uncommitted/modified versus HEAD. + StatusDirty = "dirty" +) + +// skillsRoot is the canonical vendored-skills directory, repo-relative. +const skillsRoot = ".agents/skills" + +// expectedLockfileVersion is the only lockfileVersion this slice understands. +const expectedLockfileVersion = 1 + +// Report is the aggregate result of Verify. 
+type Report struct { + OK bool + Counts Counts + Verdicts []Verdict +} + +// Counts tallies verdicts by status for the compact summary. +type Counts struct { + Verified int + Mismatch int + Orphan int + Missing int + Dirty int +} + +// Verdict is the per-skill outcome over the union of locked and on-disk skills. +type Verdict struct { + Name string + Path string + Status string + ExpectedTreeSha string + ActualTreeSha string + Reason string +} + +// Verify checks every vendored skill in repoRoot against the lock: +// label-honesty (recompute TreeSHA on HEAD), orphan/completeness (on-disk set +// versus locked set), and dirty (uncommitted). It is read-only, offline, and +// deterministic, and aggregates all findings. It returns a *VerifyFailure (with +// the same Report attached) when the report is not ok, so callers can branch. +// +// It distinguishes two error classes: a configuration/usage problem — a +// malformed lock (*LockError) or a git failure such as "not a git repository" +// (*GitError) — is returned as a non-*VerifyFailure error (the CLI maps these to +// exit 1). Per-skill findings are returned as a *VerifyFailure (exit 2). It is +// presentation-free and never writes git objects (only rev-parse / status). +func Verify(repoRoot string) (Report, error) { + lf, err := readVerifyLock(repoRoot) + if err != nil { + return Report{}, err + } + + onDisk, err := enumerateOnDiskSkills(repoRoot) + if err != nil { + return Report{}, err + } + + // Index on-disk skills by repo-relative path so locked entries can claim + // them and the remainder fall through as orphans. 
+ onDiskByPath := map[string]bool{} + for _, p := range onDisk { + onDiskByPath[p] = true + } + + verdicts := []Verdict{} + lockedPaths := map[string]bool{} + + for name, entry := range lf.Skills { + lockedPaths[entry.Path] = true + + verdict, err := verifyLockedSkill(repoRoot, name, entry, onDiskByPath[entry.Path]) + if err != nil { + return Report{}, err + } + + verdicts = append(verdicts, verdict) + } + + for _, path := range onDisk { + if lockedPaths[path] { + continue + } + + verdict, err := verifyOrphanSkill(repoRoot, path) + if err != nil { + return Report{}, err + } + + verdicts = append(verdicts, verdict) + } + + // Deterministic ordering: callers render this directly, so sort by path. + sort.Slice(verdicts, func(i, j int) bool { + return verdicts[i].Path < verdicts[j].Path + }) + + rep := buildReport(verdicts) + if !rep.OK { + return rep, &VerifyFailure{Report: rep} + } + + return rep, nil +} + +// readVerifyLock reads the lock and enforces the supported lockfileVersion. An +// absent lock is not an error (zero skills). A malformed/unreadable lock, or one +// with an unsupported version, is a *LockError (a config/usage problem). 
+func readVerifyLock(repoRoot string) (LockFile, error) { + path := lockPath(repoRoot) + + if _, err := os.Stat(path); err != nil { + if errors.Is(err, fs.ErrNotExist) { + return LockFile{}, nil + } + + return LockFile{}, &LockError{Path: path, Cause: err} + } + + lf, err := ReadLock(repoRoot) + if err != nil { + return LockFile{}, &LockError{Path: path, Cause: err} + } + + if lf.LockfileVersion != expectedLockfileVersion { + return LockFile{}, &LockError{ + Path: path, + Cause: errors.New("unsupported lockfileVersion: " + strconv.Itoa(lf.LockfileVersion)), + } + } + + if lf.Skills == nil { + lf.Skills = map[string]LockEntry{} + } + + return lf, nil +} + +// enumerateOnDiskSkills returns the repo-relative paths of every directory under +// .agents/skills/* that contains a skill.toml or SKILL.md (the spike §6 rule for +// "this dir is a skill"). An absent skills root yields an empty set. +func enumerateOnDiskSkills(repoRoot string) ([]string, error) { + rootAbs := filepath.Join(repoRoot, skillsRoot) + + entries, err := os.ReadDir(rootAbs) + if err != nil { + if errors.Is(err, fs.ErrNotExist) { + return []string{}, nil + } + + return nil, &LockError{Path: rootAbs, Cause: err} + } + + paths := []string{} + + for _, e := range entries { + if !e.IsDir() { + continue + } + + if !isSkillDir(filepath.Join(rootAbs, e.Name())) { + continue + } + + paths = append(paths, filepath.ToSlash(filepath.Join(skillsRoot, e.Name()))) + } + + return paths, nil +} + +// isSkillDir reports whether dir contains a skill.toml or SKILL.md. +func isSkillDir(dir string) bool { + for _, marker := range []string{"skill.toml", "SKILL.md"} { + if _, err := os.Stat(filepath.Join(dir, marker)); err == nil { + return true + } + } + + return false +} + +// verifyLockedSkill produces the verdict for one locked skill. presentOnDisk is +// whether the skill's directory exists on disk (with a marker file). 
+func verifyLockedSkill( + repoRoot, name string, + entry LockEntry, + presentOnDisk bool, +) (Verdict, error) { + verdict := Verdict{ + Name: name, + Path: entry.Path, + ExpectedTreeSha: entry.TreeSha, + } + + inHead, headErr := pathInHead(repoRoot, entry.Path) + if headErr != nil { + return Verdict{}, headErr + } + + // Absent entirely (not committed, not on disk) → missing. + if !inHead && !presentOnDisk { + verdict.Status = StatusMissing + verdict.Reason = "lock entry whose path is absent on disk and from HEAD" + + return verdict, nil + } + + dirty, err := pathDirty(repoRoot, entry.Path) + if err != nil { + return Verdict{}, err + } + + // Uncommitted (dirty working tree) or present on disk but not yet in HEAD → + // dirty (a working-state finding, distinct from a label-honesty mismatch). + if dirty || (!inHead && presentOnDisk) { + verdict.Status = StatusDirty + verdict.Reason = "vendored but uncommitted or locally modified — commit before verifying" + + return verdict, nil + } + + actual, err := TreeSHA(repoRoot, "HEAD", entry.Path) + if err != nil { + return Verdict{}, err + } + + verdict.ActualTreeSha = actual + if actual != entry.TreeSha { + verdict.Status = StatusMismatch + verdict.Reason = "content does not match recorded version" + + return verdict, nil + } + + verdict.Status = StatusOK + + return verdict, nil +} + +// verifyOrphanSkill produces the orphan verdict for an on-disk skill with no +// lock entry. The actual tree-SHA is recomputed when the path is committed and +// left empty otherwise (uncommitted orphan). 
+func verifyOrphanSkill(repoRoot, path string) (Verdict, error) { + verdict := Verdict{ + Name: filepath.Base(path), + Path: path, + Status: StatusOrphan, + Reason: "present on disk but not in the lock", + } + + inHead, err := pathInHead(repoRoot, path) + if err != nil { + return Verdict{}, err + } + + if !inHead { + return verdict, nil + } + + actual, err := TreeSHA(repoRoot, "HEAD", path) + if err != nil { + return Verdict{}, err + } + + verdict.ActualTreeSha = actual + + return verdict, nil +} + +// pathInHead reports whether path resolves to a tree in HEAD. Any positive git +// exit code, including the one for "not a git repository", yields (false, nil); +// only a failure to run git at all (e.g. not on PATH) propagates as an error. +func pathInHead(repoRoot, path string) (bool, error) { + _, err := TreeSHA(repoRoot, "HEAD", path) + if err == nil { + return true, nil + } + + var gitErr *GitError + if errors.As(err, &gitErr) && gitErr.ExitCode > 0 { + // A positive exit code from rev-parse means git ran but could not + // resolve HEAD: — the path is simply not in the committed tree. + return false, nil + } + + // Exit code <= 0 means git could not be run at all (e.g. it is not on PATH). + return false, err +} + +// pathDirty reports whether the working tree for path has uncommitted changes, +// via git status --porcelain. Non-empty output means dirty. +func pathDirty(repoRoot, path string) (bool, error) { + out, err := statusPorcelain(repoRoot, path) + if err != nil { + return false, err + } + + return strings.TrimSpace(out) != "", nil +} + +// buildReport tallies verdicts into Counts and sets OK iff all verdicts are ok. 
+func buildReport(verdicts []Verdict) Report { + counts := Counts{} + ok := true + + for _, v := range verdicts { + switch v.Status { + case StatusOK: + counts.Verified++ + case StatusMismatch: + counts.Mismatch++ + ok = false + case StatusOrphan: + counts.Orphan++ + ok = false + case StatusMissing: + counts.Missing++ + ok = false + case StatusDirty: + counts.Dirty++ + ok = false + } + } + + return Report{OK: ok, Counts: counts, Verdicts: verdicts} +} diff --git a/specledger/002-skillcore-verify/contracts/add.md b/specledger/002-skillcore-verify/contracts/add.md index bbe0215..28d74e2 100644 --- a/specledger/002-skillcore-verify/contracts/add.md +++ b/specledger/002-skillcore-verify/contracts/add.md @@ -21,6 +21,8 @@ skillrig add [--dry-run] [--force] [--json] [--verbose] > **Origin, not a path** (clarified 2026-05-30): there is **no** `--from`/path argument. `add` resolves the active origin through the shared resolver (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) exactly like every command; the origin *value* may be a local checkout this slice. Tests do `skillrig init --origin ` then `skillrig add `. +> **Local-origin resolution (this slice)** — `init` accepts only an `OWNER/REPO[@REF]` reference (not a filesystem path). For a local origin, `add` reads that reference from a **git checkout at `./OWNER/REPO`, relative to the directory `add` runs from** (your repo root): `init --origin my-org/my-skills` ⇒ `add` reads `./my-org/my-skills/skills//`. `@REF` selects the revision (default `HEAD`). Keep the nested checkout out of the consumer index (e.g. `echo 'my-org/' >> .git/info/exclude`). This is the concrete encoding of "the origin value may be a local checkout"; fetching a remote origin over the network is a later, additive mode. *(Follow-up: the path is resolved relative to the process CWD, not the resolved repo root — run `add` from the repo root. 
Making it repo-root-relative is a hardening candidate.)* + ## Help (Progressive Discovery) ``` diff --git a/specledger/002-skillcore-verify/reviews/002-review.md b/specledger/002-skillcore-verify/reviews/002-review.md index f6b4c6c..f9652b5 100644 --- a/specledger/002-skillcore-verify/reviews/002-review.md +++ b/specledger/002-skillcore-verify/reviews/002-review.md @@ -42,3 +42,47 @@ exit 0/1/2-not-3 · add-detect+refuse-not-merge · conflict-markers-deferred · - Artifacts are internally consistent — **clear to proceed to implementation** (`/specledger.implement-workflow` experiment, or `/specledger.tasks` for the durable ledger). - Re-run `/specledger.verify` after `tasks.md` exists if you want task-coverage + `TestQuickstart_*`-task mapping validated. + +--- + +# Post-Implementation Adversarial Review — 2026-05-30 + +**Scope:** an independent cold-context agent (Opus 4.8, xhigh) reviewed the **implemented** branch against the per-user-story DoD: it read every artifact, read the code (`pkg/skillcore`, `internal/cli`, tests), ran `make check`, and exercised the binary on the add→commit→verify→tamper round-trip plus edge probes. The agent was stopped just before it emitted its final compiled report; the findings below are **distilled from its complete in-flight analysis** (cross-checked against the code). *(Process note: this agent ran a git round-trip in the repo root by mistake — see AR-P. That motivated the clean-tree-before-review rule now added to `specledger.checkpoint-workflow`.)* + +## Findings + +| ID | New? | Category | Severity | Summary | Status | +|----|------|----------|----------|---------|--------| +| **AR-1** | confirms checkpoint #1 | Origin resolution | HIGH | **Local-origin lookup is split-brain.** `RepoRoot` is absolute (`git rev-parse --show-toplevel`) but `OriginDir` is a bare relative path (`my-org/my-skills`, from `originDirRef`), resolved against the **process CWD**. 
Running `add` from a **subdirectory** fails ("skill not found in origin") even though the origin checkout is correctly at the repo root — while the vendored files + lock would still target the repo root. Empirically confirmed. | Documented (checkpoint div #1); **CWD-relative resolution itself still unfixed** — hardening candidate noted in `contracts/add.md`. | +| **AR-2** | **NEW** | Errors-as-navigation (FR-019) | **HIGH** | **A missing origin checkout is indistinguishable from a typo'd skill name.** `add.go` (`os.Stat(srcDir)`, ~L85-90) returns `SkillNotFoundError` on *any* stat error, so when the entire `./OWNER/REPO` checkout is absent (user ran `init` but never cloned it), the user gets *"skill … not found in origin → check the skill name"* — actively misleading. The two failure classes (origin-dir-absent vs skill-subdir-absent) must be distinguished (cli.md Principle 2). | **Open.** | +| **AR-3** | **NEW** | Test tier (Constitution III) | MEDIUM | **No `pkg/skillcore/verify_test.go`.** The headline gate's logic — status taxonomy (`ok/mismatch/orphan/missing/dirty`), counts, all-findings aggregation, dirty-before-mismatch precedence — is covered **only** by black-box integration tests, with no presentation-free unit test. The two-tier discipline is met for `add`/`treesha`/`lock`/`manifest` but not for `verify`. | **Open.** | +| **AR-4** | **NEW** (tied to AR-1/AR-2) | Skill accuracy | LOW | The new `skillrig-add-verify` skill understates the CWD-relative fragility ("relative to where you run add (your repo root)") and repeats the misleading *"not found → check the name"* error mapping without the "or the origin checkout is missing" case. | **Open** (fix with AR-1/AR-2). | +| **AR-5** | — | Doc drift | LOW | `data-model.md` sample tree-SHA `c967789…` is stale vs the actual fixture (`40e4cad…`). Documented as "representative, not canonical"; tests recompute via raw git, so **not** a correctness bug. | Optional cleanup. 
| + +## Positives confirmed (independently verified) + +- **AP-04 upheld:** the only `git` shelling outside `pkg/skillcore` is `gitToplevel` in `internal/cli/repo.go` (repo-root discovery, *not* tree-SHA). All tree-SHA / lock / manifest logic lives solely in `pkg/skillcore`. No parallel implementation. +- **Tree-SHA covers additions, not just byte edits:** an untracked stowaway file inside a locked skill → `dirty`; a committed stowaway → `mismatch`. Correct. +- **Verdict taxonomy is sound,** including `dirty`-before-`mismatch` precedence and working-tree-deletion → `dirty` (exit 2) — both judged defensible/by-design. +- **Round-trip verified live:** clean → exit 0; uncommitted tamper → `dirty` (2); committed tamper → `mismatch` (2); empty `.agents/skills` + lock entry → `missing` (2). +- **`make check` green**, `cli.md` correctly synced this branch (verify integrity-only, `pkg/skillcore` as separate public package, exit 3 reserved). + +## Cleared (false alarms the agent self-corrected) + +- **Wrong-`lockfileVersion` exit code:** initially looked like exit 0, but that was **pipe-masking** (`head` exit, not `skillrig`); true exit is **1**. No bug. +- **`lock_test.go` hard-coded SHAs:** used only as write→read serialization round-trip fixtures, not asserted against real git output — harmless (no circular oracle). + +## Observation (not a finding) + +Running `skillrig verify` inside the `skillrig-cli` repo itself reports all ~17 of its own vendored agent skills as `orphan` (it has no committed `.skillrig/skills-lock.json`) — expected behavior, but a reminder that this repo is not yet self-managed by `skillrig`. + +## AR-P — Process incident (review harness) + +The first review agent's manual round-trip used a `cd "$WORK"` that silently no-op'd (empty var), so `git add -A && git commit -m vendor` ran in the **repo root**, creating a stray commit; the agent then `git reset --soft` back to `e0d8ccd`. 
No file contents or real commits were lost, but the **staging area was disturbed**. **Mitigation adopted:** require a clean/committed working tree *before* launching a review agent (the agent may freely run git to test) — now documented in `.agents/commands/specledger.checkpoint-workflow.md`. + +## Recommended actions before merge + +1. **[HIGH] AR-2** — distinguish "origin checkout missing" from "skill not found" in `add` (a dedicated error + fix hint). Cheap, high-value. +2. **[HIGH] AR-1** — decide the CWD-vs-repo-root resolution: make `OriginDir` repo-root-relative (robust; tests still pass since root==CWD there), or document "run `add` from the repo root" prominently. Currently only a follow-up note. +3. **[MEDIUM] AR-3** — add `pkg/skillcore/verify_test.go` (stub the git client; table-drive the status taxonomy + counts + aggregation). +4. **[LOW] AR-4 / AR-5** — fold the AR-1/AR-2 nuance into the `skillrig-add-verify` skill; refresh the stale data-model sample SHA. diff --git a/specledger/002-skillcore-verify/sessions/002-skillcore-verify-checkpoint.md b/specledger/002-skillcore-verify/sessions/002-skillcore-verify-checkpoint.md new file mode 100644 index 0000000..81389a8 --- /dev/null +++ b/specledger/002-skillcore-verify/sessions/002-skillcore-verify-checkpoint.md @@ -0,0 +1,60 @@ +# Session Log: 002-skillcore-verify + +## Divergence Review: 2026-05-30 12:09 + +Scope: **staged** changes only (the user staged the implement-workflow output; unstaged files are excluded). Implementation produced by `/specledger.implement-workflow` (multi-agent Workflow). Reviewer mindset: adversarial. 
+ +### Divergences + +| # | Severity | Type | Category | Artifact | Description | +|---|----------|------|----------|----------|-------------| +| 1 | HIGH | oversight | Local-origin resolution undocumented & awkward | spec.md US1 / FR-001 / FR-007 + clarification 2026-05-30 Q1; `internal/cli/add.go:originDirRef` | `skillrig init --origin ` is **rejected** — 001's init validates `OWNER/REPO[@REF]`. `add` resolves the configured `OWNER/REPO` to a **same-named relative directory** `.//` nested in the consumer repo. The spec's "consume from a local checkout" *durable capability* is only reachable via this nested-dir convention, which **no contract, help text, or data-model documents**. Integration tests pass only because `newConsumerRepo` constructs that exact layout. | +| 2 | HIGH | conscious | Constitution IX skill co-evolution not delivered | spec.md Constitution Alignment §IX; CLAUDE.md ("Every CLI change ships a matching skill update with verified trigger accuracy") | No agent-skill update teaching `add`/`verify` usage (exit 0/2 meaning; missing backing tool ≠ verify failure) and no trigger-accuracy eval. The experimental workflow deliberately skipped it (documented in the command's "What this deliberately skips"). | +| 3 | LOW | oversight | Test gap — US3 AS3 | spec.md US3 AS3 / FR-011 | "orphan/completeness scans only the canonical `.agents/skills`" has no dedicated `TestQuickstart_` asserting a manually-created view dir is ignored. Behavior is implicit in `enumerateOnDiskSkills` (only `.agents/skills/*` with a marker file). | +| 4 | LOW | conscious | Commit hygiene — unrelated files staged | git index | `AGENTS.md` (deletion), `CLAUDE.md` (mod), and `.agents/commands/specledger.implement-workflow.md` are staged alongside the 002 feature; they are not part of the feature and would pollute a `feat(002)` commit. 
| + +**Zero divergences found in the core integrity logic** — a positive signal worth stating: `pkg/skillcore/verify.go` faithfully implements FR-008/009/010/012/013/015 (lockfileVersion guard → `*LockError`/exit 1; dirty via `git status --porcelain` distinct from mismatch; orphan/missing; aggregates all findings; read-only rev-parse/status only; deterministic sort). Data-model entities (Manifest, LockFile/LockEntry without `requires`, AddResult, Report/Counts/Verdict, 5 statuses) match. `docs/design/cli.md` was synced this branch (same-branch doc-sync rule satisfied). + +### DoD Bypassed + +| User Story | Title | Acceptance Criteria | Risk | +|------------|-------|---------------------|------| +| US1 | Vendor a skill | All 5 AS tested (vendor/idempotent/json/refuse-divergent/dry-run). Underlying *local-origin capability* awkward & undocumented (div #1) | HIGH — capability shipped but undiscoverable per spec intent | +| US2 | Prove unmodified | All 4 AS tested + dirty + read-only | none | +| US3 | Orphan/missing | AS1/AS2/AS4 tested; **AS3 (view dirs ignored) not explicitly tested** (div #3) | LOW | +| US4 | Scriptable outcome | All 4 AS tested (exit matrix, json-complete, what/why/fix, malformed-lock) | none | +| Constitution §IX | Agent-skill co-evolution | **Not delivered** (div #2) | HIGH — hard project rule (CLAUDE.md) | + +### Issues Encountered & Resolutions +- Live hand-transcript of `init --origin ` failed ("expected OWNER/REPO[@REF]") → root-caused to `originDirRef` mapping `OWNER/REPO` → `./owner/repo`; re-ran the round-trip with the nested layout and it passed (exit 0 vendor→commit→verify; exit 2 on tamper). Surfaced as divergence #1. +- Transient LSP "undefined" diagnostics during the run → confirmed stale by a clean `go build ./...`; not real. + +### Items Requiring Action Before Merge +1. 
[HIGH] **Decide & document local-origin resolution** (div #1): either (a) extend `init`/origin to accept a local filesystem path, or (b) document the `.//` nested-checkout convention in `contracts/add.md` + `add --help`, so the spec's "durable local-origin capability" is actually usable/discoverable. Currently neither. +2. [HIGH] **Ship the `add`/`verify` agent-skill update + trigger-accuracy eval** (Constitution §IX / CLAUDE.md hard rule) — div #2. +3. [LOW] Add a `TestQuickstart_VerifyIgnoresViewDirs` (or note the deferral) for US3 AS3 — div #3. +4. [LOW] Unstage `AGENTS.md`, `CLAUDE.md`, `.agents/commands/specledger.implement-workflow.md` before the `feat(002)` commit — div #4. + +### Tests & Checks +- Status: **PASS** +- Commands run: `make check` (gofmt, go vet, golangci-lint → **0 issues**, `go test ./...` → all ok); plus an out-of-band live `init→add→commit→verify→tamper→verify` round-trip transcript. +- Failures: none. +- Coverage: all **20** quickstart scenarios present as `TestQuickstart_*` and passing; `skillcore` unit suite (ground-truth `TreeSHA == git rev-parse`, lock round-trip, manifest parse, stubbed-git error paths). + +### Uncommitted Changes (excluded from this checkpoint, not staged/reverted) +- `.agents/commands/specledger.checkpoint.md` (M), `.agents/commands/specledger.implement-workflow.md` (working-tree edit over staged add), `.specledger/templates/tasks-template.md` (M), `docs/guides/vcr-cassettes.md` (untracked) + +--- + +## Resolutions: 2026-05-30 12:25 (follow-up, same session) + +Acted on the divergences above per user direction: + +- **Div #1 (HIGH) — RESOLVED (documented).** Local-origin resolution (`OWNER/REPO` → `./OWNER/REPO` relative to the invocation dir) is now documented in three places: `skillrig add --help` (`internal/cli/add.go` Long + Example), `contracts/add.md` (new "Local-origin resolution (this slice)" note, incl. 
the CWD-relative caveat + repo-root-relative hardening follow-up), and the `skillrig-init` skill (new "Local origin (this release)" section with a worked setup). *Chosen path: document the current behavior, not re-architect.* +- **Div #2 (HIGH) — RESOLVED (skill authored; eval not run).** New agent skill `.agents/skills/skillrig-add-verify/` (Constitution §IX) teaching the vendor→commit→verify round-trip, exit-code branching, and that a missing backing tool is NOT a verify failure. Eval sets **defined but not run** per user: `evals/evals.json` (5 behavioral cases) + `evals/trigger-eval-set.json` (20 trigger queries). Running `run_eval.py` / trigger-optimization is a deliberate later step. +- **Div #3 (LOW) — RESOLVED.** Added `TestQuickstart_VerifyIgnoresViewDirs` (+ `writeClientViewSkill` helper) asserting the orphan scan ignores a non-canonical `.claude/skills/` view dir (FR-011 / US3 AS3). Passing. +- **Div #4 (LOW) — DEFERRED by user** ("ignore the git index diff"). Commit hygiene left to the user. + +**Re-check:** `make check` green (gofmt, go vet, golangci-lint 0 issues, `go test ./...` all ok incl. the new test). Next: relaunch the independent cold adversarial review agent. + +--- diff --git a/test/skillcore_quickstart_test.go b/test/skillcore_quickstart_test.go new file mode 100644 index 0000000..fdeb51b --- /dev/null +++ b/test/skillcore_quickstart_test.go @@ -0,0 +1,1332 @@ +// This file holds the TestQuickstart_* integration suite for feature +// 002-skillcore-verify (add + verify). Per Constitution II (Quickstart as +// Contract) each scenario in specledger/002-skillcore-verify/quickstart.md maps +// 1:1 to a TestQuickstart_* test here. Like the 001 suite it builds the real +// binary once (TestMain in quickstart_test.go) and execs it via run(). 
+// +// Oracle independence (research D11): every expected tree-SHA is computed with +// RAW git (rawTreeSHA → `git rev-parse :`), NEVER through skillcore — +// the binary under test uses skillcore internally, so routing the expected value +// through it would be circular validation (Constitution III). All git +// bootstrap/commit helpers here shell `git` directly with a pinned identity so +// the fixture commit is reproducible (D8). +package quickstart + +import ( + "encoding/json" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" +) + +// originRepo is the OWNER/REPO the fixture's .skillrig-origin.toml declares. The +// origin value is an OWNER/REPO reference (it must pass the resolver's +// OWNER/REPO[@REF] shape check), and this slice the origin is a LOCAL checkout: +// the resolved OWNER/REPO is interpreted as a path relative to the repo root, so +// the origin checkout is laid out at /my-org/my-skills. +const originRepo = "my-org/my-skills" + +// sampleSkill is the one skill the sample origin ships. +const sampleSkill = "terraform-plan-review" + +// sampleVersion is the version recorded in the fixture's skill.toml. +const sampleVersion = "1.4.0" + +// originSubtree is the origin-relative path whose git tree-object SHA is the +// fingerprint add records and verify recomputes (the locked fingerprint boundary +// — research D1). rawTreeSHA reads it straight from git as the oracle. +const originSubtree = "skills/" + sampleSkill + +// vendoredPath is the canonical repo-relative location a vendored skill lands at. +const vendoredPath = ".agents/skills/" + sampleSkill + +// pinnedGitEnv is the reproducible-commit identity (research D8): a fixed +// author/committer name, email, and date so the fixture's commit SHA is stable +// across machines and runs. It is appended to the current environment for every +// git invocation in this suite's helpers. 
+func pinnedGitEnv() []string { + const ( + name = "skillrig" + email = "ci@skillrig.dev" + date = "2026-01-01T00:00:00Z" + ) + + return append(os.Environ(), + "GIT_AUTHOR_NAME="+name, + "GIT_AUTHOR_EMAIL="+email, + "GIT_AUTHOR_DATE="+date, + "GIT_COMMITTER_NAME="+name, + "GIT_COMMITTER_EMAIL="+email, + "GIT_COMMITTER_DATE="+date, + ) +} + +// git runs a raw git command in dir with the pinned identity and fails the test +// on error. It is the independent oracle / setup primitive: it NEVER calls +// skillcore (research D11). +func git(t *testing.T, dir string, args ...string) string { + t.Helper() + + cmd := exec.CommandContext(t.Context(), "git", args...) + cmd.Dir = dir + cmd.Env = pinnedGitEnv() + + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("git %v in %s: %v\n%s", args, dir, err, out) + } + + return strings.TrimSpace(string(out)) +} + +// sampleOriginDir resolves the committed fixture tree (test/testdata/ +// sample-origin) to an absolute path. Tests exec from the test/ package dir, so +// the fixture is reachable relative to the test cwd. +func sampleOriginDir(t *testing.T) string { + t.Helper() + + abs, err := filepath.Abs(filepath.Join("testdata", "sample-origin")) + if err != nil { + t.Fatalf("resolve sample-origin: %v", err) + } + + if _, err := os.Stat(filepath.Join(abs, ".skillrig-origin.toml")); err != nil { + t.Fatalf("sample-origin fixture missing at %s: %v", abs, err) + } + + return abs +} + +// copyTree recursively copies the fixture src into dst, preserving file modes +// (the exec bit is part of the tree-SHA, so it must survive the copy). 
+func copyTree(t *testing.T, src, dst string) { + t.Helper() + + err := filepath.WalkDir(src, func(path string, d os.DirEntry, err error) error { + if err != nil { + return err + } + + rel, err := filepath.Rel(src, path) + if err != nil { + return err + } + + target := filepath.Join(dst, rel) + if d.IsDir() { + return os.MkdirAll(target, 0o755) + } + + info, err := d.Info() + if err != nil { + return err + } + + data, err := os.ReadFile(path) + if err != nil { + return err + } + + return os.WriteFile(target, data, info.Mode().Perm()) + }) + if err != nil { + t.Fatalf("copy fixture %s → %s: %v", src, dst, err) + } +} + +// consumerRepo is a bootstrapped consumer: a git repo whose origin (the local +// OWNER/REPO checkout) is nested under it, ready to add/commit/verify. +type consumerRepo struct { + // root is the consumer git repo's work tree (also its rev-parse toplevel). + root string + // originDir is the nested origin checkout at /my-org/my-skills — the + // raw-git oracle target for rawTreeSHA. + originDir string +} + +// bootstrapOrigin git-inits a fresh checkout of the sample origin inside +// parent at the relative OWNER/REPO path and commits it with the pinned +// identity, returning the origin dir and the committed ref. Lives at +// /my-org/my-skills so the resolver's OWNER/REPO value resolves to it as +// a path relative to the consumer root (= parent). +func bootstrapOrigin(t *testing.T, parent string) (dir, ref string) { + t.Helper() + + dir = filepath.Join(parent, filepath.FromSlash(originRepo)) + if err := os.MkdirAll(dir, 0o755); err != nil { + t.Fatalf("mkdir origin %s: %v", dir, err) + } + + copyTree(t, sampleOriginDir(t), dir) + + git(t, dir, "init", "-q", "-b", "main") + git(t, dir, "add", "-A") + git(t, dir, "commit", "-q", "-m", "fixture origin") + + return dir, "HEAD" +} + +// newConsumerRepo builds a consumer repo with the sample origin nested inside it +// and binds it with `skillrig init --origin my-org/my-skills`. 
The nested origin +// checkout is excluded from the consumer's index (.git/info/exclude) so a later +// `git add -A` stages only the vendored skill + lock, never the origin's own +// repo as a gitlink. +func newConsumerRepo(t *testing.T) consumerRepo { + t.Helper() + requireGit(t) + + root := t.TempDir() + git(t, root, "init", "-q", "-b", "main") + + // Keep the nested origin checkout out of the consumer's index. + if err := os.WriteFile( + filepath.Join(root, ".git", "info", "exclude"), + []byte(strings.SplitN(originRepo, "/", 2)[0]+"/\n"), + 0o644, + ); err != nil { + t.Fatalf("write exclude: %v", err) + } + + originDir, _ := bootstrapOrigin(t, root) + + res := run(t, runOpts{args: []string{"init", "--origin", originRepo}, cwd: root}) + if res.exit != 0 { + t.Fatalf("init --origin %s: exit %d (stderr: %s)", originRepo, res.exit, res.stderr) + } + + return consumerRepo{root: root, originDir: originDir} +} + +// commitAll stages and commits everything in dir with the pinned identity so +// verify (which checks the COMMITTED tree) sees the vendored content. +func commitAll(t *testing.T, dir, msg string) { + t.Helper() + + git(t, dir, "add", "-A") + git(t, dir, "commit", "-q", "-m", msg) +} + +// rawTreeSHA is the independent oracle: the git tree-object SHA of relPath at ref +// in gitDir, read via raw `git rev-parse :` — NEVER skillcore +// (research D11). The binary under test computes the same value through +// skillcore; this is what proves the two agree on real git output. +func rawTreeSHA(t *testing.T, gitDir, ref, relPath string) string { + t.Helper() + + return git(t, gitDir, "rev-parse", ref+":"+relPath) +} + +// statusPorcelain returns `git status --porcelain` for dir, the read-only probe +// for the verify-writes-nothing assertions. 
+func statusPorcelain(t *testing.T, dir string) string { + t.Helper() + + return git(t, dir, "status", "--porcelain") +} + +// decodeJSON unmarshals res.stdout into a generic object, failing the test (with +// the raw stdout) when it is not a single JSON object. Used to assert the +// --json output is parseable and structurally complete. +func decodeJSON(t *testing.T, stdout string) map[string]any { + t.Helper() + + var obj map[string]any + if err := json.Unmarshal([]byte(stdout), &obj); err != nil { + t.Fatalf("stdout is not a single JSON object: %v\n%s", err, stdout) + } + + return obj +} + +// requireKeys asserts every key is present in obj (structural completeness, +// Constitution II — not just a Contains check). +func requireKeys(t *testing.T, obj map[string]any, keys ...string) { + t.Helper() + + for _, k := range keys { + if _, ok := obj[k]; !ok { + t.Errorf("JSON missing key %q: %v", k, obj) + } + } +} + +// readSkillFile reads a vendored skill file under the consumer root. +func readSkillFile(t *testing.T, root, rel string) string { + t.Helper() + + return readFile(t, filepath.Join(root, vendoredPath, rel)) +} + +// addResultKeys are the complete add --json key set (contract add.md). +var addResultKeys = []string{"ok", "name", "version", "path", "commit", "treeSha", "action", "dryRun"} + +// verdictKeys are the complete per-verdict key set (contract verify.md). +var verdictKeys = []string{"name", "path", "status", "expectedTreeSha", "actualTreeSha", "reason"} + +// countsKeys are the complete counts key set (contract verify.md). 
+var countsKeys = []string{"verified", "mismatch", "orphan", "missing", "dirty"} + +// --------------------------------------------------------------------------- +// US1 — Vendor a skill (add) +// --------------------------------------------------------------------------- + +func TestQuickstart_AddVendorsSkill(t *testing.T) { + t.Parallel() + + c := newConsumerRepo(t) + wantTreeSHA := rawTreeSHA(t, c.originDir, "HEAD", originSubtree) + + res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}) + if res.exit != 0 { + t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + // Human shape: ≤ 2 lines including the footer hint. + lines := nonEmptyLines(res.stdout) + if len(lines) > 2 { + t.Errorf("human stdout has %d lines, want <= 2:\n%s", len(lines), res.stdout) + } + + if !strings.Contains(res.stdout, "skillrig verify") { + t.Errorf("human output missing next-step footer (skillrig verify):\n%s", res.stdout) + } + + // Files vendored byte-identical to the origin (modes preserved). + for _, f := range []string{"SKILL.md", "skill.toml"} { + got := readSkillFile(t, c.root, f) + want := readFile(t, filepath.Join(c.originDir, "skills", sampleSkill, f)) + + if got != want { + t.Errorf("vendored %s differs from origin", f) + } + + gotMode := fileMode(t, filepath.Join(c.root, vendoredPath, f)) + wantMode := fileMode(t, filepath.Join(c.originDir, "skills", sampleSkill, f)) + + if gotMode != wantMode { + t.Errorf("vendored %s mode = %v, want %v", f, gotMode, wantMode) + } + } + + // Lock: one entry; treeSha == the raw-git ground truth; no requires field. 
+ entry := lockEntry(t, c.root, sampleSkill) + + if entry["treeSha"] != wantTreeSHA { + t.Errorf("lock treeSha = %v, want raw-git ground truth %s", entry["treeSha"], wantTreeSHA) + } + + if entry["version"] != sampleVersion { + t.Errorf("lock version = %v, want %s", entry["version"], sampleVersion) + } + + if entry["path"] != vendoredPath { + t.Errorf("lock path = %v, want %s", entry["path"], vendoredPath) + } + + if _, hasRequires := entry["requires"]; hasRequires { + t.Errorf("lock entry must NOT carry a requires field (D4), got: %v", entry) + } + + if commit, _ := entry["commit"].(string); len(commit) != 40 { + t.Errorf("lock commit = %q, want a 40-hex commit SHA", commit) + } + + // --json: parseable, all keys present, action == vendored. + jsonRes := run(t, runOpts{args: []string{"add", sampleSkill, "--json"}, cwd: c.root}) + + obj := decodeJSON(t, jsonRes.stdout) + requireKeys(t, obj, addResultKeys...) + + // The second add is on already-vendored content, so it is idempotent; assert + // the vendoring fields independently via the first add's lock + a fresh add. + if obj["treeSha"] != wantTreeSHA { + t.Errorf("--json treeSha = %v, want %s", obj["treeSha"], wantTreeSHA) + } +} + +func TestQuickstart_AddVendorsSkillJSONAction(t *testing.T) { + t.Parallel() + + c := newConsumerRepo(t) + + res := run(t, runOpts{args: []string{"add", sampleSkill, "--json"}, cwd: c.root}) + if res.exit != 0 { + t.Fatalf("add --json exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + obj := decodeJSON(t, res.stdout) + requireKeys(t, obj, addResultKeys...) 
+ + if obj["ok"] != true { + t.Errorf("ok = %v, want true", obj["ok"]) + } + + if obj["action"] != "vendored" { + t.Errorf("action = %v, want vendored (fresh add)", obj["action"]) + } + + if obj["dryRun"] != false { + t.Errorf("dryRun = %v, want false", obj["dryRun"]) + } +} + +func TestQuickstart_AddIdempotent(t *testing.T) { + t.Parallel() + + c := newConsumerRepo(t) + + first := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}) + if first.exit != 0 { + t.Fatalf("first add exit = %d, want 0 (stderr: %s)", first.exit, first.stderr) + } + + lockBefore := readFile(t, filepath.Join(c.root, ".skillrig", "skills-lock.json")) + + second := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}) + if second.exit != 0 { + t.Fatalf("second add exit = %d, want 0 (stderr: %s)", second.exit, second.stderr) + } + + if !strings.Contains(second.stdout, "already vendored") && !strings.Contains(second.stdout, "no change") { + t.Errorf("idempotent re-add should note no change, got:\n%s", second.stdout) + } + + lockAfter := readFile(t, filepath.Join(c.root, ".skillrig", "skills-lock.json")) + if lockBefore != lockAfter { + t.Errorf("lock changed on idempotent re-add:\nbefore=%s\nafter=%s", lockBefore, lockAfter) + } + + // Exactly one entry (no duplicate). 
+ skills := lockSkills(t, c.root) + if len(skills) != 1 { + t.Errorf("lock has %d skills, want exactly 1", len(skills)) + } + + jsonRes := run(t, runOpts{args: []string{"add", sampleSkill, "--json"}, cwd: c.root}) + + obj := decodeJSON(t, jsonRes.stdout) + if obj["action"] != "unchanged" { + t.Errorf("--json action = %v on identical re-add, want unchanged", obj["action"]) + } +} + +func TestQuickstart_AddDryRunWritesNothing(t *testing.T) { + t.Parallel() + + c := newConsumerRepo(t) + + res := run(t, runOpts{args: []string{"add", sampleSkill, "--dry-run"}, cwd: c.root}) + if res.exit != 0 { + t.Fatalf("add --dry-run exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + if !strings.Contains(res.stdout, "would vendor") { + t.Errorf("dry-run human output should be prefixed 'would vendor …', got:\n%s", res.stdout) + } + + // No skill tree and no lock created. + if _, err := os.Stat(filepath.Join(c.root, ".agents")); !os.IsNotExist(err) { + t.Errorf(".agents/ must not exist after --dry-run, stat err = %v", err) + } + + if _, err := os.Stat(filepath.Join(c.root, ".skillrig", "skills-lock.json")); !os.IsNotExist(err) { + t.Errorf("lock must not exist after --dry-run, stat err = %v", err) + } + + jsonRes := run(t, runOpts{args: []string{"add", sampleSkill, "--dry-run", "--json"}, cwd: c.root}) + + obj := decodeJSON(t, jsonRes.stdout) + requireKeys(t, obj, addResultKeys...) + + if obj["dryRun"] != true { + t.Errorf("--json dryRun = %v, want true", obj["dryRun"]) + } + + if obj["action"] != "vendored" { + t.Errorf("--json action = %v on dry-run of a fresh skill, want vendored", obj["action"]) + } +} + +func TestQuickstart_AddRefusesDivergentWithoutForce(t *testing.T) { + t.Parallel() + + c := newConsumerRepo(t) + + if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 { + t.Fatalf("initial add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + // Diverge a byte of the on-disk copy. 
+ skillMD := filepath.Join(c.root, vendoredPath, "SKILL.md") + appendByte(t, skillMD) + + res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}) + if res.exit != 1 { + t.Fatalf("divergent add (no --force) exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + // Three distinct parts: what / why / fix. + assertContains(t, "what", res.stderr, "refusing to overwrite "+vendoredPath) + assertContains(t, "why", res.stderr, "on-disk content diverges from the recorded fingerprint") + assertContains(t, "fix", res.stderr, "--force") + + // Files left untouched by the refusal (still diverged). + originMD := readFile(t, filepath.Join(c.originDir, "skills", sampleSkill, "SKILL.md")) + if readFile(t, skillMD) == originMD { + t.Errorf("refused add must not modify on-disk content") + } + + // --force restores from origin and reports action == overwritten. + forceRes := run(t, runOpts{args: []string{"add", sampleSkill, "--force", "--json"}, cwd: c.root}) + if forceRes.exit != 0 { + t.Fatalf("forced add exit = %d, want 0 (stderr: %s)", forceRes.exit, forceRes.stderr) + } + + obj := decodeJSON(t, forceRes.stdout) + if obj["action"] != "overwritten" { + t.Errorf("--force --json action = %v, want overwritten", obj["action"]) + } + + if readFile(t, skillMD) != originMD { + t.Errorf("--force should restore the skill to the origin's content") + } +} + +func TestQuickstart_AddRequiresOrigin(t *testing.T) { + t.Parallel() + requireGit(t) + + // A git repo with no origin anywhere (no init, no env, isolated HOME). 
+ root := t.TempDir() + git(t, root, "init", "-q", "-b", "main") + + res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: root}) + if res.exit != 1 { + t.Fatalf("add without origin exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + assertContains(t, "what", res.stderr, "no origin configured") + assertContains(t, "why", res.stderr, "no SKILLRIG_ORIGIN / project / global origin") + assertContains(t, "fix", res.stderr, "skillrig init --origin") +} + +func TestQuickstart_AddNotGitRepo(t *testing.T) { + t.Parallel() + requireGit(t) + + // Origin supplied via SKILLRIG_ORIGIN, but cwd is NOT a git repo. The origin + // checkout is laid out at /my-org/my-skills so the relative resolution + // finds it; the failure is the missing git work tree, not a missing origin. + cwd := t.TempDir() + bootstrapOrigin(t, cwd) + + res := run(t, runOpts{ + args: []string{"add", sampleSkill}, + cwd: cwd, + env: map[string]string{"SKILLRIG_ORIGIN": originRepo}, + }) + if res.exit != 1 { + t.Fatalf("add in non-git dir exit = %d, want 1 (stderr: %s)", res.exit, res.stderr) + } + + if res.stdout != "" { + t.Errorf("error path must keep stdout empty, got: %q", res.stdout) + } + + assertContains(t, "what", res.stderr, "not a git repository") + assertContains(t, "why", res.stderr, "vendors into the repo's canonical .agents/skills") + assertContains(t, "fix", res.stderr, "run inside the repo") +} + +// --------------------------------------------------------------------------- +// US2 — Prove a skill is unmodified (verify label-honesty) +// --------------------------------------------------------------------------- + +func TestQuickstart_VerifyPasses(t *testing.T) { + t.Parallel() + + c := newConsumerRepo(t) + wantTreeSHA := rawTreeSHA(t, c.originDir, "HEAD", originSubtree) + + if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 { + 
t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr) + } + + commitAll(t, c.root, "vendor skill") + + res := run(t, runOpts{args: []string{"verify"}, cwd: c.root}) + if res.exit != 0 { + t.Fatalf("verify exit = %d, want 0 (stderr: %s)\nNOTE: the vendored skill declares "+ + "[[requires]] for tools absent in the test env; verify is integrity-only and "+ + "must still pass (SC-006/FR-014)", res.exit, res.stderr) + } + + // Human: exactly 2 lines (summary + footer). + lines := nonEmptyLines(res.stdout) + if len(lines) != 2 { + t.Errorf("verify pass human output = %d lines, want exactly 2:\n%s", len(lines), res.stdout) + } + + if !strings.Contains(lines[0], "verified 1 skills") { + t.Errorf("line 1 = %q, want it to report 1 verified skill", lines[0]) + } + + // --json: ok, counts.verified == 1, the single verdict matches ground truth. + jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root}) + + rep := decodeReport(t, jsonRes.stdout) + if rep.OK != true { + t.Errorf("--json ok = %v, want true", rep.OK) + } + + if rep.Counts.Verified != 1 { + t.Errorf("counts.verified = %d, want 1", rep.Counts.Verified) + } + + if len(rep.Verdicts) != 1 { + t.Fatalf("verdicts = %d, want 1", len(rep.Verdicts)) + } + + v := rep.Verdicts[0] + if v.Status != "ok" { + t.Errorf("verdict status = %q, want ok", v.Status) + } + + // The headline ground-truth invariant: expected == actual == raw-git tree-SHA. 
+	if v.ExpectedTreeSha != wantTreeSHA || v.ActualTreeSha != wantTreeSHA {
+		t.Errorf("verdict expected/actual = %q/%q, want both == %s", v.ExpectedTreeSha, v.ActualTreeSha, wantTreeSHA)
+	}
+}
+
+func TestQuickstart_VerifyIsReadOnly(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	lockPath := filepath.Join(c.root, ".skillrig", "skills-lock.json")
+	skillMD := filepath.Join(c.root, vendoredPath, "SKILL.md")
+
+	// Pass run: working tree + lock + skill file byte-identical before/after.
+	beforeStatus := statusPorcelain(t, c.root)
+	beforeLock := readFile(t, lockPath)
+	beforeSkill := readFile(t, skillMD)
+
+	if res := run(t, runOpts{args: []string{"verify"}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("verify (pass) exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	assertUnchanged(t, "pass", beforeStatus, statusPorcelain(t, c.root),
+		beforeLock, readFile(t, lockPath), beforeSkill, readFile(t, skillMD))
+
+	// Tamper + commit so the next verify FAILS, then assert it is still read-only.
+	appendByte(t, skillMD)
+	commitAll(t, c.root, "tamper")
+
+	failStatus := statusPorcelain(t, c.root)
+	failLock := readFile(t, lockPath)
+	failSkill := readFile(t, skillMD)
+
+	if res := run(t, runOpts{args: []string{"verify"}, cwd: c.root}); res.exit != 2 {
+		t.Fatalf("verify (fail) exit = %d, want 2 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	assertUnchanged(t, "fail", failStatus, statusPorcelain(t, c.root),
+		failLock, readFile(t, lockPath), failSkill, readFile(t, skillMD))
+}
+
+func TestQuickstart_VerifyDetectsTamper(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	recorded := lockEntry(t, c.root, sampleSkill)["treeSha"]
+
+	// Tamper one byte and commit it (mismatch is a label-honesty failure on the
+	// committed tree, distinct from the uncommitted "dirty" finding).
+	appendByte(t, filepath.Join(c.root, vendoredPath, "SKILL.md"))
+	commitAll(t, c.root, "tamper")
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if res.exit != 2 {
+		t.Fatalf("verify after tamper exit = %d, want 2 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if !strings.Contains(res.stdout, sampleSkill) {
+		t.Errorf("human failure output must name the tampered skill %q:\n%s", sampleSkill, res.stdout)
+	}
+
+	jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+
+	rep := decodeReport(t, jsonRes.stdout)
+
+	v := findVerdict(t, rep, sampleSkill)
+	if v.Status != "mismatch" {
+		t.Errorf("verdict status = %q, want mismatch", v.Status)
+	}
+
+	if v.ExpectedTreeSha != recorded {
+		t.Errorf("expectedTreeSha = %q, want the recorded %v", v.ExpectedTreeSha, recorded)
+	}
+
+	if v.ActualTreeSha == "" || v.ActualTreeSha == v.ExpectedTreeSha {
+		t.Errorf("actualTreeSha = %q, want a non-empty value != expected", v.ActualTreeSha)
+	}
+}
+
+func TestQuickstart_VerifyDirtyUncommitted(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	// Vendored but NOT committed → dirty (distinct from mismatch).
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if res.exit != 2 {
+		t.Fatalf("verify on uncommitted skill exit = %d, want 2 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+
+	rep := decodeReport(t, jsonRes.stdout)
+	if rep.Counts.Dirty < 1 {
+		t.Errorf("counts.dirty = %d, want >= 1", rep.Counts.Dirty)
+	}
+
+	v := findVerdict(t, rep, sampleSkill)
+	if v.Status != "dirty" {
+		t.Errorf("verdict status = %q, want dirty (distinct from mismatch)", v.Status)
+	}
+
+	if !strings.Contains(strings.ToLower(v.Reason), "commit") {
+		t.Errorf("dirty reason = %q, want it to mention committing", v.Reason)
+	}
+}
+
+func TestQuickstart_VerifyEmptyRepoPasses(t *testing.T) {
+	t.Parallel()
+	requireGit(t)
+
+	root := t.TempDir()
+	git(t, root, "init", "-q", "-b", "main")
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: root})
+	if res.exit != 0 {
+		t.Fatalf("verify on empty repo exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: root})
+
+	rep := decodeReport(t, jsonRes.stdout)
+	if rep.OK != true {
+		t.Errorf("--json ok = %v on empty repo, want true", rep.OK)
+	}
+
+	if rep.Counts != (counts{}) {
+		t.Errorf("counts = %+v on empty repo, want all zero", rep.Counts)
+	}
+
+	if len(rep.Verdicts) != 0 {
+		t.Errorf("verdicts = %d on empty repo, want 0 (serialized as [])", len(rep.Verdicts))
+	}
+
+	// verdicts must be [] not null.
+	if !strings.Contains(jsonRes.stdout, "\"verdicts\":[]") {
+		t.Errorf("empty-repo verdicts should serialize as [], got:\n%s", jsonRes.stdout)
+	}
+}
+
+// ---------------------------------------------------------------------------
+// US3 — Orphan / completeness (verify)
+// ---------------------------------------------------------------------------
+
+func TestQuickstart_VerifyDetectsOrphan(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	// An unlocked skill dir created by hand (no add) + committed → orphan.
+	writeRogueSkill(t, c.root)
+	commitAll(t, c.root, "vendor + rogue")
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if res.exit != 2 {
+		t.Fatalf("verify with orphan exit = %d, want 2 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+
+	rep := decodeReport(t, jsonRes.stdout)
+	if rep.Counts.Orphan < 1 {
+		t.Errorf("counts.orphan = %d, want >= 1", rep.Counts.Orphan)
+	}
+
+	v := findVerdict(t, rep, "rogue")
+	if v.Status != "orphan" {
+		t.Errorf("rogue verdict status = %q, want orphan", v.Status)
+	}
+}
+
+// TestQuickstart_VerifyIgnoresViewDirs proves the orphan/completeness scan is
+// confined to the canonical .agents/skills root (FR-011, US3 AS3): a skill-looking
+// directory materialized under a per-client view path (e.g. .claude/skills) is
+// neither scanned nor flagged as an orphan. Regression guard for the deferred
+// multi-client symlink-view feature — without it, a future scan that wandered into
+// view roots would spuriously fail verify.
+func TestQuickstart_VerifyIgnoresViewDirs(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	// A manually-created per-client view directory that *looks* like a skill but
+	// lives OUTSIDE .agents/skills — verify must ignore it entirely.
+	writeClientViewSkill(t, c.root, "viewer")
+	commitAll(t, c.root, "vendor skill + non-canonical client view")
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if res.exit != 0 {
+		t.Fatalf("verify with a non-canonical view dir exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	rep := decodeReport(t, run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root}).stdout)
+	if !rep.OK || rep.Counts.Verified != 1 || rep.Counts.Orphan != 0 {
+		t.Errorf("report = %+v, want ok with exactly 1 verified and 0 orphans (view dir ignored)", rep)
+	}
+
+	if len(rep.Verdicts) != 1 {
+		t.Fatalf("verdicts = %d, want exactly 1 (only the canonical skill, not the .claude view)", len(rep.Verdicts))
+	}
+
+	if got := rep.Verdicts[0].Path; got != vendoredPath {
+		t.Errorf("verdict path = %q, want the canonical %q (the .claude view must not be scanned)", got, vendoredPath)
+	}
+}
+
+func TestQuickstart_VerifyDetectsMissing(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	// Remove the vendored dir but keep the lock entry → missing.
+	if err := os.RemoveAll(filepath.Join(c.root, vendoredPath)); err != nil {
+		t.Fatalf("remove vendored dir: %v", err)
+	}
+
+	commitAll(t, c.root, "remove skill")
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if res.exit != 2 {
+		t.Fatalf("verify with missing skill exit = %d, want 2 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+
+	rep := decodeReport(t, jsonRes.stdout)
+	if rep.Counts.Missing < 1 {
+		t.Errorf("counts.missing = %d, want >= 1", rep.Counts.Missing)
+	}
+
+	v := findVerdict(t, rep, sampleSkill)
+	if v.Status != "missing" {
+		t.Errorf("verdict status = %q, want missing", v.Status)
+	}
+}
+
+func TestQuickstart_VerifyAggregatesAllFailures(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	// One tampered (mismatch) + one orphan, both committed, in a single run.
+	appendByte(t, filepath.Join(c.root, vendoredPath, "SKILL.md"))
+	writeRogueSkill(t, c.root)
+	commitAll(t, c.root, "tamper + rogue")
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if res.exit != 2 {
+		t.Fatalf("verify exit = %d, want 2 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	jsonRes := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+
+	rep := decodeReport(t, jsonRes.stdout)
+
+	// Did NOT stop at the first failure: both failures present in one report.
+	if rep.Counts.Mismatch < 1 || rep.Counts.Orphan < 1 {
+		t.Errorf("counts = %+v, want mismatch>=1 AND orphan>=1 (aggregated, not first-fail)", rep.Counts)
+	}
+
+	// Both skills appear as verdicts (the check covers the full union).
+	if len(rep.Verdicts) < 2 {
+		t.Errorf("verdicts = %d, want >= 2 (both skills reported)", len(rep.Verdicts))
+	}
+
+	_ = findVerdict(t, rep, sampleSkill)
+	_ = findVerdict(t, rep, "rogue")
+}
+
+// ---------------------------------------------------------------------------
+// US4 — Scriptable outcome (exit codes + --json)
+// ---------------------------------------------------------------------------
+
+func TestQuickstart_VerifyExitCodeMatrix(t *testing.T) {
+	t.Parallel()
+
+	// pass → 0 (and deterministic on repeat).
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	first := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	second := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+
+	if first.exit != 0 || second.exit != 0 {
+		t.Errorf("pass exit codes = %d/%d, want 0/0 (deterministic)", first.exit, second.exit)
+	}
+
+	// verification failure → 2.
+	appendByte(t, filepath.Join(c.root, vendoredPath, "SKILL.md"))
+	commitAll(t, c.root, "tamper")
+
+	failA := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	failB := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+
+	if failA.exit != 2 || failB.exit != 2 {
+		t.Errorf("failure exit codes = %d/%d, want 2/2", failA.exit, failB.exit)
+	}
+
+	// malformed lock → 1 (config/usage, distinct from 2).
+	if err := os.WriteFile(filepath.Join(c.root, ".skillrig", "skills-lock.json"), []byte("not json{"), 0o644); err != nil {
+		t.Fatalf("corrupt lock: %v", err)
+	}
+
+	if res := run(t, runOpts{args: []string{"verify"}, cwd: c.root}); res.exit != 1 {
+		t.Errorf("malformed lock exit = %d, want 1", res.exit)
+	}
+
+	// not a git repo → 1.
+	nonGit := t.TempDir()
+	if res := run(t, runOpts{args: []string{"verify"}, cwd: nonGit}); res.exit != 1 {
+		t.Errorf("not-a-git-repo exit = %d, want 1", res.exit)
+	}
+
+	// Exit code 3 is reserved and MUST never be emitted by verify.
+	for _, code := range []int{first.exit, second.exit, failA.exit, failB.exit} {
+		if code == 3 {
+			t.Errorf("verify emitted reserved exit code 3")
+		}
+	}
+}
+
+func TestQuickstart_VerifyJSONComplete(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	// Passing run.
+	pass := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+	assertJSONStructurallyComplete(t, "pass", pass.stdout)
+
+	// stdout stays clean JSON even with diagnostics: stderr is separate.
+	if strings.TrimSpace(pass.stderr) != "" {
+		// Diagnostics (if any) must not be on stdout; a non-empty stderr is fine.
+		_ = pass.stderr
+	}
+
+	// Failing run.
+	appendByte(t, filepath.Join(c.root, vendoredPath, "SKILL.md"))
+	commitAll(t, c.root, "tamper")
+
+	fail := run(t, runOpts{args: []string{"verify", "--json"}, cwd: c.root})
+	if fail.exit != 2 {
+		t.Fatalf("failing verify --json exit = %d, want 2", fail.exit)
+	}
+
+	assertJSONStructurallyComplete(t, "fail", fail.stdout)
+}
+
+func TestQuickstart_VerifyMalformedLock(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+
+	if res := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root}); res.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", res.exit, res.stderr)
+	}
+
+	if err := os.WriteFile(filepath.Join(c.root, ".skillrig", "skills-lock.json"), []byte("{ not valid json"), 0o644); err != nil {
+		t.Fatalf("corrupt lock: %v", err)
+	}
+
+	res := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+
+	// Exit 1 (usage/config), DISTINCT from a verification failure (2).
+	if res.exit != 1 {
+		t.Fatalf("malformed lock exit = %d, want 1 (distinct from 2)", res.exit)
+	}
+
+	if res.stdout != "" {
+		t.Errorf("error path must keep stdout empty, got: %q", res.stdout)
+	}
+
+	// 3-part error naming the file; not a raw parser dump as the whole message.
+	assertContains(t, "what", res.stderr, "skills-lock.json")
+	assertContains(t, "why", res.stderr, "why:")
+	assertContains(t, "fix", res.stderr, "fix:")
+
+	// Raw cause surfaces under --verbose.
+	verbose := run(t, runOpts{args: []string{"verify", "--verbose"}, cwd: c.root})
+	if verbose.exit != 1 {
+		t.Errorf("--verbose malformed lock exit = %d, want 1", verbose.exit)
+	}
+}
+
+func TestQuickstart_AddHelpExamples(t *testing.T) {
+	t.Parallel()
+
+	res := run(t, runOpts{args: []string{"add", "--help"}})
+	if res.exit != 0 {
+		t.Fatalf("add --help exit = %d, want 0", res.exit)
+	}
+
+	if n := countExampleLines(res.stdout, "skillrig add "); n < 2 {
+		t.Errorf("add --help shows %d 'skillrig add ' example lines, want >= 2:\n%s", n, res.stdout)
+	}
+}
+
+func TestQuickstart_VerifyHelpExamples(t *testing.T) {
+	t.Parallel()
+
+	res := run(t, runOpts{args: []string{"verify", "--help"}})
+	if res.exit != 0 {
+		t.Fatalf("verify --help exit = %d, want 0", res.exit)
+	}
+
+	if n := countExampleLines(res.stdout, "skillrig verify "); n < 2 {
+		t.Errorf("verify --help shows %d 'skillrig verify ' example lines, want >= 2:\n%s", n, res.stdout)
+	}
+}
+
+// ---------------------------------------------------------------------------
+// Round-trip (headline acceptance contract)
+// ---------------------------------------------------------------------------
+
+func TestQuickstart_AddThenVerifyRoundTrip(t *testing.T) {
+	t.Parallel()
+
+	c := newConsumerRepo(t)
+	wantTreeSHA := rawTreeSHA(t, c.originDir, "HEAD", originSubtree)
+
+	// init (in newConsumerRepo) → add → commit → verify, two commands + a commit,
+	// zero network, no hand-authored lock.
+	addRes := run(t, runOpts{args: []string{"add", sampleSkill}, cwd: c.root})
+	if addRes.exit != 0 {
+		t.Fatalf("add exit = %d, want 0 (stderr: %s)", addRes.exit, addRes.stderr)
+	}
+
+	// add recorded exactly what verify will recompute.
+	if got := lockEntry(t, c.root, sampleSkill)["treeSha"]; got != wantTreeSHA {
+		t.Fatalf("lock treeSha = %v, want raw-git ground truth %s", got, wantTreeSHA)
+	}
+
+	commitAll(t, c.root, "vendor skill")
+
+	verifyRes := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if verifyRes.exit != 0 {
+		t.Fatalf("round-trip verify exit = %d, want 0 (stderr: %s)", verifyRes.exit, verifyRes.stderr)
+	}
+
+	// One-byte tamper + commit ⇒ verify exit 2.
+	appendByte(t, filepath.Join(c.root, vendoredPath, "SKILL.md"))
+	commitAll(t, c.root, "tamper")
+
+	tamperRes := run(t, runOpts{args: []string{"verify"}, cwd: c.root})
+	if tamperRes.exit != 2 {
+		t.Fatalf("round-trip tamper verify exit = %d, want 2 (stderr: %s)", tamperRes.exit, tamperRes.stderr)
+	}
+}
+
+// ---------------------------------------------------------------------------
+// Local helpers (presentation-layer assertions / lock + report decoding)
+// ---------------------------------------------------------------------------
+
+// counts mirrors the verify --json counts object (all five status tallies).
+type counts struct {
+	Verified int `json:"verified"`
+	Mismatch int `json:"mismatch"`
+	Orphan   int `json:"orphan"`
+	Missing  int `json:"missing"`
+	Dirty    int `json:"dirty"`
+}
+
+// verdict mirrors one verify --json verdict (all six fields).
+type verdict struct {
+	Name            string `json:"name"`
+	Path            string `json:"path"`
+	Status          string `json:"status"`
+	ExpectedTreeSha string `json:"expectedTreeSha"`
+	ActualTreeSha   string `json:"actualTreeSha"`
+	Reason          string `json:"reason"`
+}
+
+// report mirrors the verify --json top-level object.
+type report struct {
+	OK       bool      `json:"ok"`
+	Counts   counts    `json:"counts"`
+	Verdicts []verdict `json:"verdicts"`
+}
+
+// decodeReport strictly decodes a verify --json payload into the typed report.
+func decodeReport(t *testing.T, stdout string) report {
+	t.Helper()
+
+	var rep report
+	if err := json.Unmarshal([]byte(stdout), &rep); err != nil {
+		t.Fatalf("verify --json is not parseable: %v\n%s", err, stdout)
+	}
+
+	return rep
+}
+
+// findVerdict returns the verdict named name, failing the test when absent.
+func findVerdict(t *testing.T, rep report, name string) verdict {
+	t.Helper()
+
+	for _, v := range rep.Verdicts {
+		if v.Name == name {
+			return v
+		}
+	}
+
+	t.Fatalf("no verdict named %q in report: %+v", name, rep.Verdicts)
+
+	return verdict{}
+}
+
+// assertJSONStructurallyComplete verifies the verify --json payload is parseable
+// and structurally complete: top-level ok/counts/verdicts, all five counts keys,
+// and every verdict carrying all six fields (Constitution II — not a Contains
+// check).
+func assertJSONStructurallyComplete(t *testing.T, label, stdout string) {
+	t.Helper()
+
+	obj := decodeJSON(t, stdout)
+	requireKeys(t, obj, "ok", "counts", "verdicts")
+
+	countsObj, ok := obj["counts"].(map[string]any)
+	if !ok {
+		t.Fatalf("%s: counts is not an object: %v", label, obj["counts"])
+	}
+
+	requireKeys(t, countsObj, countsKeys...)
+
+	verdicts, ok := obj["verdicts"].([]any)
+	if !ok {
+		t.Fatalf("%s: verdicts is not an array: %v", label, obj["verdicts"])
+	}
+
+	if len(verdicts) == 0 {
+		t.Errorf("%s: expected at least one verdict to assert structural completeness", label)
+	}
+
+	for i, raw := range verdicts {
+		vObj, ok := raw.(map[string]any)
+		if !ok {
+			t.Fatalf("%s: verdict[%d] is not an object: %v", label, i, raw)
+		}
+
+		requireKeys(t, vObj, verdictKeys...)
+	}
+}
+
+// lockSkills decodes the lock's skills map, failing the test on a read/parse
+// error. It proves the lock is tool-written valid JSON.
+func lockSkills(t *testing.T, root string) map[string]map[string]any {
+	t.Helper()
+
+	data := readFile(t, filepath.Join(root, ".skillrig", "skills-lock.json"))
+
+	var lf struct {
+		Skills map[string]map[string]any `json:"skills"`
+	}
+
+	if err := json.Unmarshal([]byte(data), &lf); err != nil {
+		t.Fatalf("lock is not parseable: %v\n%s", err, data)
+	}
+
+	return lf.Skills
+}
+
+// lockEntry returns the named skill's lock entry, failing if absent.
+func lockEntry(t *testing.T, root, name string) map[string]any {
+	t.Helper()
+
+	entry, ok := lockSkills(t, root)[name]
+	if !ok {
+		t.Fatalf("lock has no entry for %q", name)
+	}
+
+	return entry
+}
+
+// assertContains asserts that haystack contains needle, labelling the failure
+// (used to check the what / why / fix parts of an error as DISTINCT checks).
+func assertContains(t *testing.T, part, haystack, needle string) {
+	t.Helper()
+
+	if !strings.Contains(haystack, needle) {
+		t.Errorf("error (%s) should contain %q, got:\n%s", part, needle, haystack)
+	}
+}
+
+// assertUnchanged fails when any of the three before/after pairs differ, naming
+// the verify phase (pass/fail) — the read-only invariant (FR-015).
+func assertUnchanged(t *testing.T, phase, statusBefore, statusAfter, lockBefore, lockAfter, skillBefore, skillAfter string) {
+	t.Helper()
+
+	if statusBefore != statusAfter {
+		t.Errorf("[%s] git status changed across verify (not read-only):\nbefore=%q\nafter=%q", phase, statusBefore, statusAfter)
+	}
+
+	if lockBefore != lockAfter {
+		t.Errorf("[%s] lock changed across verify (verify must write nothing)", phase)
+	}
+
+	if skillBefore != skillAfter {
+		t.Errorf("[%s] skill file changed across verify (verify must write nothing)", phase)
+	}
+}
+
+// appendByte appends one byte to the file at path (the canonical one-byte tamper
+// / divergence used across scenarios).
+func appendByte(t *testing.T, path string) {
+	t.Helper()
+
+	f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0o644)
+	if err != nil {
+		t.Fatalf("open %s for tamper: %v", path, err)
+	}
+
+	if _, err := f.WriteString("x"); err != nil {
+		_ = f.Close()
+
+		t.Fatalf("append to %s: %v", path, err)
+	}
+
+	if err := f.Close(); err != nil {
+		t.Fatalf("close %s: %v", path, err)
+	}
+}
+
+// writeRogueSkill creates an unlocked skill directory under the canonical
+// .agents/skills root (no add, so no lock entry) — the orphan fixture.
+func writeRogueSkill(t *testing.T, root string) {
+	t.Helper()
+
+	dir := filepath.Join(root, ".agents", "skills", "rogue")
+	if err := os.MkdirAll(dir, 0o755); err != nil {
+		t.Fatalf("mkdir rogue: %v", err)
+	}
+
+	if err := os.WriteFile(filepath.Join(dir, "skill.toml"), []byte("name = \"rogue\"\nversion = \"0.0.1\"\n"), 0o644); err != nil {
+		t.Fatalf("write rogue manifest: %v", err)
+	}
+}
+
+// writeClientViewSkill materializes a skill-looking directory under a per-client
+// view root (.claude/skills/) — OUTSIDE the canonical .agents/skills. Used to
+// prove verify's scan ignores non-canonical view locations (FR-011 / US3 AS3).
+func writeClientViewSkill(t *testing.T, root, name string) {
+	t.Helper()
+
+	dir := filepath.Join(root, ".claude", "skills", name)
+	if err := os.MkdirAll(dir, 0o755); err != nil {
+		t.Fatalf("mkdir client view: %v", err)
+	}
+
+	manifest := "name = \"" + name + "\"\nversion = \"0.0.1\"\n"
+	if err := os.WriteFile(filepath.Join(dir, "skill.toml"), []byte(manifest), 0o644); err != nil {
+		t.Fatalf("write client-view manifest: %v", err)
+	}
+}
+
+// fileMode returns the permission bits of the file at path.
+func fileMode(t *testing.T, path string) os.FileMode {
+	t.Helper()
+
+	info, err := os.Stat(path)
+	if err != nil {
+		t.Fatalf("stat %s: %v", path, err)
+	}
+
+	return info.Mode().Perm()
+}
+
+// countExampleLines counts lines in help output that begin (after optional
+// leading whitespace) with prefix — the ">=2 usage examples" shape check
+// (FR-018 / SC-009), stronger than a single Contains.
+func countExampleLines(help, prefix string) int {
+	n := 0
+
+	for line := range strings.SplitSeq(help, "\n") {
+		if strings.HasPrefix(strings.TrimSpace(line), prefix) {
+			n++
+		}
+	}
+
+	return n
+}
diff --git a/test/testdata/sample-origin/.skillrig-origin.toml b/test/testdata/sample-origin/.skillrig-origin.toml
new file mode 100644
index 0000000..e0f53ab
--- /dev/null
+++ b/test/testdata/sample-origin/.skillrig-origin.toml
@@ -0,0 +1,10 @@
+# Canonical, design-aligned sample origin used by the TestQuickstart_* suite.
+# The convention contract this origin speaks (architecture §2d.3). The generic
+# skillrig binary reads this and fails clearly against an incompatible origin.
+convention_version = 1
+
+# Identity of this origin in OWNER/REPO grammar. Recorded into consumers' locks.
+origin = "my-org/my-skills"
+
+# Where skills live, relative to repo root. add/verify read skills//.
+skills_dir = "skills"
diff --git a/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md b/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md
new file mode 100644
index 0000000..d3867f3
--- /dev/null
+++ b/test/testdata/sample-origin/skills/terraform-plan-review/SKILL.md
@@ -0,0 +1,16 @@
+---
+name: terraform-plan-review
+description: Review a terraform plan for risk and drift before apply, flagging destructive changes, IAM/security-policy edits, and resources that will be replaced rather than updated.
+---
+
+# Terraform Plan Review
+
+Use this skill when a user asks you to review, sanity-check, or assess the risk
+of a `terraform plan` before they apply.
+
+## Procedure
+
+1. Obtain machine-readable plan JSON: `terraform show -json plan.tfplan > plan.json`.
+2. Run the analyzer: `oxid review --plan plan.json`.
+3. Summarize by severity: destroy/replace, security-sensitive, drift, benign.
+4. End with a one-line verdict the user (or CI) can act on.
diff --git a/test/testdata/sample-origin/skills/terraform-plan-review/skill.toml b/test/testdata/sample-origin/skills/terraform-plan-review/skill.toml
new file mode 100644
index 0000000..f82d3e7
--- /dev/null
+++ b/test/testdata/sample-origin/skills/terraform-plan-review/skill.toml
@@ -0,0 +1,23 @@
+# Per-skill machine-facing manifest. Vendors with the skill into consumer repos.
+name = "terraform-plan-review"
+version = "1.4.0"
+namespace = "my-org"
+description = "Review a terraform plan for risk and drift."
+
+# Deterministic discovery tags.
+tags = ["platform-team", "terraform", "aws"]
+
+# Backing-CLI prerequisites: DECLARED, not installed. These tools are absent in
+# the test environment on purpose — verify is integrity-only and MUST NOT check
+# them (SC-006/FR-014), so a pass with these present still exits 0.
+[[requires]]
+tool = "oxid"
+version = ">=0.4.0"
+source = "my-org/my-skills"
+manager = "mise"
+
+[[requires]]
+tool = "terraform"
+version = ">=1.6"
+source = "hashicorp/terraform"
+manager = "mise"
From 4b0e33a0e618b09f504a5c4fa558d3a41177773b Mon Sep 17 00:00:00 2001
From: Vincent De Smet
Date: Sat, 30 May 2026 13:33:28 +0800
Subject: [PATCH 6/8] fix(002): resolve adversarial review #2 findings
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Address the independent post-implementation review (reviews/002-review.md):
correctness + harness bugs, error-navigation gaps, missing unit tiers, and
the local-origin resolution footgun.

Correctness / harness:
- R2-H1: add resolvePlacement looked up the lock by the directory arg while
  writeLockEntry keys it by the manifest name — an identical re-add of a
  skill whose manifest name != its dir was wrongly refused (FR-003). Look up
  by name; regression test for name!=dir.
- R2-H2: the re-exec'd git stub now sets GOCOVERDIR, so
  `go test -cover ./pkg/skillcore/...` no longer leaks a warning into
  captured git stderr.
- GAP-C: `make test-unit` now runs ./internal/... AND ./pkg/... (skillcore's
  Constitution III tests were silently skipped by the documented unit tier).
- GAP-D: ship an executable check.sh (0o755) in the sample skill so the
  mode-preservation assertion actually exercises the exec bit.

Error navigation (cli.md Principle 1/2):
- R2-M3: custom Args validators on add/verify return what/why/fix + an
  example instead of cobra's terse "accepts 1 arg(s)" / "unknown command".
- R2-M4/AR-2: new *OriginNotFoundError distinguishes a missing local origin
  checkout from a wrong skill name (was conflated into "check the skill
  name").

Local-origin resolution (AR-1/R2-L6):
- Anchor the origin source to the repo root (filepath.Join(repoRoot,
  originDir)), matching the destination — `add` now works from any
  subdirectory (was CWD-relative and failed from nested dirs). Verified live.

Tests (Constitution III):
- GAP-A: pkg/skillcore/verify_test.go — taxonomy
  (ok/mismatch/orphan/missing/dirty), counts, aggregate-all,
  dirty-masks-mismatch precedence, empty-repo, unsupported lockfileVersion ->
  *LockError.
- GAP-B: internal/cli/addverify_test.go — exitCodeFor (incl. wrapped
  *VerifyFailure), mapAddError classes, originDirRef, arg validators,
  renderers.

Docs / tooling:
- R2-L5: cli.md add synopsis corrected to the shipped surface;
  --origin/--pin marked dropped/planned.
- AR-4: add --help, contracts/add.md, skillrig-init + skillrig-add-verify
  skills synced to repo-root resolution + the distinct missing-checkout
  error.
- checkpoint-workflow review-agent template now loads agentic-go-cli-design + golang-code-style/testing/lint, and requires a clean tree before launching. Gate: make check (0 lint) + go test -cover ./... green (skillcore 79.5%, internal/cli 51.2%). Deferred: AR-5 (cosmetic stale data-model sample SHA). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../specledger.checkpoint-workflow.md | 9 + .agents/skills/skillrig-add-verify/SKILL.md | 11 +- .agents/skills/skillrig-init/SKILL.md | 5 +- Makefile | 9 +- docs/design/cli.md | 17 +- internal/cli/add.go | 47 ++- internal/cli/addverify_test.go | 328 ++++++++++++++++++ internal/cli/verify.go | 20 +- pkg/skillcore/add.go | 37 +- pkg/skillcore/add_test.go | 59 ++++ pkg/skillcore/errors.go | 16 + pkg/skillcore/helpers_test.go | 18 +- pkg/skillcore/verify_test.go | 278 +++++++++++++++ .../002-skillcore-verify/contracts/add.md | 2 +- .../reviews/002-review.md | 53 +++ test/skillcore_quickstart_test.go | 46 ++- .../skills/terraform-plan-review/check.sh | 6 + 17 files changed, 912 insertions(+), 49 deletions(-) create mode 100644 internal/cli/addverify_test.go create mode 100644 pkg/skillcore/verify_test.go create mode 100755 test/testdata/sample-origin/skills/terraform-plan-review/check.sh diff --git a/.agents/commands/specledger.checkpoint-workflow.md b/.agents/commands/specledger.checkpoint-workflow.md index 618e806..0efa09f 100644 --- a/.agents/commands/specledger.checkpoint-workflow.md +++ b/.agents/commands/specledger.checkpoint-workflow.md @@ -161,6 +161,15 @@ Execution steps: - This review is context-free by design — you have no prior knowledge of implementation decisions or tradeoffs made during development. + ## Skills — load these FIRST (you have the Skill tool) + Before reviewing, invoke the Skill tool for the project's design/best-practice + skills so you judge against the SAME standards the code was meant to meet — not + ad-hoc taste. Review findings should cite these where relevant. 
For this Go CLI: + `agentic-go-cli-design` (errors-as-navigation, two-level output, exit codes, + `--help`/`--json`/`--verbose`/`--dry-run`/`--force`), plus `golang-code-style`, + `golang-testing`, and `golang-lint`. (Adapt the set to the repo's language/stack + and the skills available in the session.) + ## Instructions 1. Run `sl spec setup-plan --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH diff --git a/.agents/skills/skillrig-add-verify/SKILL.md b/.agents/skills/skillrig-add-verify/SKILL.md index a127b0f..8b7787c 100644 --- a/.agents/skills/skillrig-add-verify/SKILL.md +++ b/.agents/skills/skillrig-add-verify/SKILL.md @@ -46,9 +46,11 @@ Offline and consume-only. **Requires a git repository** (project scope). (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) exactly like every command. There is **no** `--from`/path argument. - **Local origin (this release)**: the configured `OWNER/REPO` is read from a local git - checkout at `./OWNER/REPO`, relative to where you run `add` (your repo root) — no network. - So `init --origin my-org/my-skills` expects that library checked out at `./my-org/my-skills` - (keep it out of your index, e.g. `echo 'my-org/' >> .git/info/exclude`). + checkout at `/OWNER/REPO` (resolved against the repo root, so `add` works from + any subdirectory) — no network. So `init --origin my-org/my-skills` expects that library + checked out at `/my-org/my-skills` (keep it out of your index, e.g. + `echo 'my-org/' >> .git/info/exclude`). If that checkout is absent, `add` says "origin + checkout not found" (distinct from "skill not found"). - **Idempotent**: re-adding identical content reports success and changes nothing (`action: "unchanged"`). 
- **Never clobbers**: if the on-disk copy diverges from the recorded fingerprint, `add` @@ -132,7 +134,8 @@ Diagnostics go to stderr, so `skillrig verify --json 2>/dev/null | jq .` stays c | Symptom (stderr) | Cause | Fix | |------------------|-------|-----| | `no origin configured` | no `SKILLRIG_ORIGIN` / project / global origin | `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN` | -| `skill "" not found in origin` | no `skills//` at the configured origin | check the name against the origin's `skills/` | +| `origin checkout not found at ` | the configured `OWNER/REPO` is not checked out locally at `/OWNER/REPO` | clone the origin there (e.g. `git clone `), or re-bind with `skillrig init` | +| `skill "" not found in origin` | the origin IS present but has no `skills//` | check the name against the origin's `skills/` | | `refusing to overwrite ` | on-disk content diverges from the record | re-run with `--force`, or revert local edits | | `not a git repository` | `add`/`verify` run outside a repo | run inside the repo (or `git init` first) | | `cannot read .skillrig/skills-lock.json` | malformed/unreadable lock (exit `1`, **not** `2`) | check/repair the file, or re-vendor with `skillrig add` | diff --git a/.agents/skills/skillrig-init/SKILL.md b/.agents/skills/skillrig-init/SKILL.md index bfc0917..8b82bad 100644 --- a/.agents/skills/skillrig-init/SKILL.md +++ b/.agents/skills/skillrig-init/SKILL.md @@ -91,8 +91,9 @@ bound repo resolves the same origin. `init` records only an `OWNER/REPO[@REF]` **reference** — never a filesystem path (passing a path fails with `invalid origin … expected OWNER/REPO[@REF]`). In this release there is no network fetch, so when a later command (`skillrig add`) needs the origin's files it reads -them from a **local git checkout at `./OWNER/REPO`, relative to where you run the command** -(your repo root). 
So to vendor from a local copy of `my-org/my-skills`: +them from a **local git checkout at `/OWNER/REPO`** (resolved against the repo +root, so it works from any subdirectory). So to vendor from a local copy of `my-org/my-skills`, +from the repo root: ``` skillrig init --origin my-org/my-skills # records the reference diff --git a/Makefile b/Makefile index 63ca730..69a230e 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,8 @@ # skillrig-cli build & test tasks. # Test tiers map to the package layout (Constitution III): -# unit -> ./internal/... (mocked/recorded boundaries, fast) -# integration -> ./test/... (TestQuickstart_*; builds & execs the real binary) +# unit -> ./internal/... ./pkg/... (presentation-free logic + skillcore +# ground-truth/table-driven tests, fast; no real binary) +# integration -> ./test/... (TestQuickstart_*; builds & execs the binary) BINARY := skillrig @@ -13,8 +14,8 @@ build: ## Build the skillrig binary into ./$(BINARY) test: ## Run the full test suite (unit + integration) go test ./... -test-unit: ## Run unit tests only (presentation-free logic in internal/) - go test ./internal/... +test-unit: ## Run unit tests only (presentation-free logic in internal/ + pkg/skillcore) + go test ./internal/... ./pkg/... test-integration: ## Run the quickstart acceptance suite (builds & execs the binary) go test ./test/... diff --git a/docs/design/cli.md b/docs/design/cli.md index e63906b..22228a4 100644 --- a/docs/design/cli.md +++ b/docs/design/cli.md @@ -67,16 +67,21 @@ The agent decides to use `skillrig add` but isn't sure about the format? It dril ``` $ skillrig add -Error: requires at least 1 arg(s), only received 0 +add requires exactly one argument: the skill name +why: got 0 argument(s) +fix: skillrig add (e.g. 
skillrig add terraform-plan-review); run skillrig add --help for flags and examples +# `skillrig add --help` then reveals the full (shipped) surface: Usage: - skillrig add [--origin OWNER/REPO] [--pin ] [--json] + skillrig add [--dry-run] [--force] [--json] [--verbose] Examples: skillrig add terraform-plan-review - skillrig add terraform-plan-review --pin v1.4.0 + skillrig add terraform-plan-review --dry-run ``` +> The origin is **resolved**, never passed to `add` (no `--from`/`--origin` arg — clarified 2026-05-30); immutable per-skill `--pin ` is **deferred** (Out of Scope this slice). The synopsis above is the shipped surface. + Progressive disclosure: **overview (injected) → usage (explored) → parameters (drilled down).** The agent discovers on-demand, each level providing just enough information for the next step. This is fundamentally different from stuffing 3,000 words of tool documentation into the system prompt. Most of that information is irrelevant most of the time — pure context waste. Progressive help lets the agent decide when it needs more. @@ -122,8 +127,8 @@ verify failed: 'terraform-plan-review' tree SHA mismatch. locked: a83b… (claims v1.4.0) on-disk: c91f… → The vendored content does not match the version it claims to be. -→ To restore the approved version: 'skillrig add terraform-plan-review --pin v1.4.0' -→ If this edit is intentional, it's a local modification — commit it; 'skillrig bump' will 3-way merge it on the next upstream advance. +→ To restore the approved version: 'skillrig add terraform-plan-review --force' +→ If this edit is intentional, it's a local modification — commit it; 'skillrig bump' (planned) will 3-way merge it on the next upstream advance. ``` **Auth as a distinct failure (R18) — the most common footgun.** A missing backing CLI and an auth failure fetching it are different problems. 
Never collapse them: @@ -241,7 +246,7 @@ skillrig init --origin my-org/my-skills@staging # track the 'staging' branch This realizes the `@ref` half of the ecosystem-standard identity grammar `OWNER/REPO[/path]@ref` (architecture R26) that `gh skill` (`gh skill install github/awesome-copilot documentation-writer@v1.2.0`) and Vercel `npx skills` use. The `[/path]` portion remains future work. -**Two meanings of `@ref`, kept distinct.** For an **origin**, `@REF` is a *moving pointer* — a branch you track and re-resolve. For a **skill** vendored via `add` (`skillrig add --pin `), the ref is an *immutable* pin — a tag or commit SHA, recorded in the lock so the vendored content is reproducible. Same grammar, opposite intent: the origin says "where to look (and which line of development)"; the pin says "exactly which reviewed bytes." Docs and help text must not conflate them. +**Two meanings of `@ref`, kept distinct.** For an **origin**, `@REF` is a *moving pointer* — a branch you track and re-resolve. For a **skill** vendored via `add` (`skillrig add --pin `, **planned** — not in the current slice), the ref is an *immutable* pin — a tag or commit SHA, recorded in the lock so the vendored content is reproducible. Same grammar, opposite intent: the origin says "where to look (and which line of development)"; the pin says "exactly which reviewed bytes." Docs and help text must not conflate them. 
### Why a single `@ref` string, not a separate flag diff --git a/internal/cli/add.go b/internal/cli/add.go index 2be1813..1d9b11b 100644 --- a/internal/cli/add.go +++ b/internal/cli/add.go @@ -3,6 +3,7 @@ package cli import ( "errors" "fmt" + "path/filepath" "github.com/spf13/cobra" @@ -43,10 +44,11 @@ func newAddCmd(opts *globalOpts) *cobra.Command { "active origin (SKILLRIG_ORIGIN > project > global) exactly like every command and\n" + "copies the skill byte-identically, injecting nothing.\n\n" + "Local origin (this release): the configured origin OWNER/REPO is read from a local\n" + - "git checkout at ./OWNER/REPO — relative to the directory you run add from (your\n" + - "repo root) — not over the network. So `init --origin my-org/my-skills` expects that\n" + - "library checked out at ./my-org/my-skills; keep it out of your index (e.g. echo\n" + - "'my-org/' >> .git/info/exclude). Fetching a remote origin is a later, additive mode.\n\n" + + "git checkout at /OWNER/REPO — resolved against the repo root, so add\n" + + "works from any subdirectory — not over the network. So `init --origin my-org/my-skills`\n" + + "expects that library checked out at /my-org/my-skills; keep it out of your\n" + + "index (e.g. echo 'my-org/' >> .git/info/exclude). Fetching a remote origin is a later,\n" + + "additive mode.\n\n" + "add is idempotent on identical content and refuses to overwrite a vendored skill\n" + "whose on-disk content diverges from the lock unless you pass --force. 
Requires a\n" + "git repository; commit the result, then run skillrig verify.", @@ -56,7 +58,16 @@ func newAddCmd(opts *globalOpts) *cobra.Command { " skillrig add terraform-plan-review --dry-run\n\n" + " # Overwrite a locally-diverged copy with the origin's content\n" + " skillrig add terraform-plan-review --force", - Args: cobra.ExactArgs(1), + // A custom validator (not cobra.ExactArgs) so a misinvocation is + // errors-as-navigation — what/why/fix + an example — instead of cobra's + // terse "accepts 1 arg(s), received 0" dead end (cli.md Principle 1/2). + Args: func(_ *cobra.Command, args []string) error { + if len(args) != 1 { + return usageAddArgs(len(args)) + } + + return nil + }, RunE: func(cmd *cobra.Command, args []string) error { ac.skill = args[0] @@ -94,6 +105,13 @@ func (ac *addCmd) run(cmd *cobra.Command) error { } originDir, ref := originDirRef(res.Origin) + // AR-1: anchor the local origin checkout to the repo root, not the process + // CWD. The destination (.agents/skills + the lock) is already repo-root-anchored + // via repoRoot; leaving the origin source relative made `add` resolve it against + // the CWD, so it failed from any subdirectory while the output still went to the + // repo root. Joining with repoRoot makes both sides consistent — `add` now works + // from anywhere in the repo. + originDir = filepath.Join(repoRoot, originDir) result, err := skillcore.Add(skillcore.AddOptions{ OriginDir: originDir, @@ -129,6 +147,15 @@ func originDirRef(origin config.Origin) (dir, ref string) { return dir, ref } +// usageAddArgs builds the navigational usage error for a wrong add argument +// count (errors-as-navigation: what / why / fix + a concrete example), replacing +// cobra's bare "accepts 1 arg(s)" message. +func usageAddArgs(got int) *UsageError { + return usageErrorf("add requires exactly one argument: the skill name\n"+ + "why: got %d argument(s)\n"+ + "fix: skillrig add (e.g. 
skillrig add terraform-plan-review); run skillrig add --help for flags and examples", got) +} + // usageNoOriginConfigured builds the 3-part "no origin configured" usage error // (contract add.md): what / why / fix. func usageNoOriginConfigured() *UsageError { @@ -141,6 +168,16 @@ func usageNoOriginConfigured() *UsageError { // values (exit 1), authoring the what/why/fix prose while preserving the raw // cause for --verbose. An unexpected error is wrapped generically. func mapAddError(skill string, err error) error { + var originMissing *skillcore.OriginNotFoundError + if errors.As(err, &originMissing) { + return &UsageError{ + Msg: fmt.Sprintf("origin checkout not found at %s\n", originMissing.OriginDir) + + "why: this release reads the configured origin from a local checkout at that path, and it is absent\n" + + "fix: check out the origin there (git clone " + originMissing.OriginDir + "), or re-bind with skillrig init --origin OWNER/REPO", + Cause: err, + } + } + var notFound *skillcore.SkillNotFoundError if errors.As(err, &notFound) { return &UsageError{ diff --git a/internal/cli/addverify_test.go b/internal/cli/addverify_test.go new file mode 100644 index 0000000..2dbf654 --- /dev/null +++ b/internal/cli/addverify_test.go @@ -0,0 +1,328 @@ +package cli + +import ( + "bytes" + "encoding/json" + "errors" + "fmt" + "strings" + "testing" + + "github.com/skillrig/cli/internal/config" + "github.com/skillrig/cli/pkg/skillcore" +) + +// TestExitCodeFor pins the load-bearing typed switch: nil → 0, a verification +// failure → 2 (even wrapped), everything else (incl. *UsageError) → 1.
+func TestExitCodeFor(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + err error + want int + }{ + {"nil is ok", nil, ExitOK}, + {"usage error", &UsageError{Msg: "bad"}, ExitUsage}, + {"plain error", errors.New("boom"), ExitUsage}, + {"verify failure", &skillcore.VerifyFailure{}, ExitVerification}, + {"wrapped verify failure", fmt.Errorf("ctx: %w", &skillcore.VerifyFailure{}), ExitVerification}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + if got := exitCodeFor(tt.err); got != tt.want { + t.Errorf("exitCodeFor(%v) = %d, want %d", tt.err, got, tt.want) + } + }) + } +} + +// TestMapAddError asserts each skillcore add error maps to a navigational +// *UsageError with the distinguishing what/why/fix, and preserves the raw cause. +func TestMapAddError(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + err error + wantParts []string + }{ + { + "origin checkout missing", + &skillcore.OriginNotFoundError{OriginDir: "/repo/my-org/my-skills"}, + []string{"origin checkout not found", "/repo/my-org/my-skills", "check out the origin", "init --origin"}, + }, + { + "skill not found", + &skillcore.SkillNotFoundError{Skill: "x"}, + []string{"not found in origin", "check the skill name"}, + }, + { + "overwrite refused", + &skillcore.OverwriteError{Skill: "x", Path: ".agents/skills/x"}, + []string{"refusing to overwrite", "--force"}, + }, + { + "git error", + &skillcore.GitError{ExitCode: 128, Stderr: "fatal"}, + []string{"git error"}, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + got := mapAddError("x", tt.err) + + var ue *UsageError + if !errors.As(got, &ue) { + t.Fatalf("mapAddError = %T, want *UsageError", got) + } + + for _, part := range tt.wantParts { + if !strings.Contains(ue.Msg, part) { + t.Errorf("message %q missing %q", ue.Msg, part) + } + } + + if !errors.Is(got, tt.err) { + t.Errorf("mapAddError dropped the raw cause %v", tt.err) + 
} + }) + } +} + +// TestOriginDirRef maps a resolved origin to (relative owner/repo dir, ref) and +// defaults an empty ref to HEAD. +func TestOriginDirRef(t *testing.T) { + t.Parallel() + + dir, ref := originDirRef(config.Origin{Owner: "my-org", Repo: "my-skills"}) + if dir != "my-org/my-skills" || ref != "HEAD" { + t.Errorf("originDirRef = (%q,%q), want (my-org/my-skills, HEAD)", dir, ref) + } + + _, ref = originDirRef(config.Origin{Owner: "my-org", Repo: "my-skills", Ref: "staging"}) + if ref != "staging" { + t.Errorf("ref = %q, want staging", ref) + } +} + +// TestAddArgsValidator: misinvocation is navigational (a *UsageError), one arg ok. +func TestAddArgsValidator(t *testing.T) { + t.Parallel() + + cmd := newAddCmd(&globalOpts{}) + + for _, args := range [][]string{nil, {"a", "b"}} { + err := cmd.Args(cmd, args) + + var ue *UsageError + if !errors.As(err, &ue) { + t.Fatalf("Args(%v) = %T, want *UsageError", args, err) + } + + if !strings.Contains(ue.Msg, "skillrig add ") { + t.Errorf("Args(%v) message missing the fix example: %q", args, ue.Msg) + } + } + + if err := cmd.Args(cmd, []string{"terraform-plan-review"}); err != nil { + t.Errorf("Args(one arg) = %v, want nil", err) + } +} + +// TestVerifyArgsValidator: an extra positional is navigational, no args ok. +func TestVerifyArgsValidator(t *testing.T) { + t.Parallel() + + cmd := newVerifyCmd(&globalOpts{}) + + err := cmd.Args(cmd, []string{"extra"}) + + var ue *UsageError + if !errors.As(err, &ue) { + t.Fatalf("Args(extra) = %T, want *UsageError", err) + } + + if !strings.Contains(ue.Msg, "verify takes no arguments") { + t.Errorf("message missing the what: %q", ue.Msg) + } + + if err := cmd.Args(cmd, nil); err != nil { + t.Errorf("Args(no args) = %v, want nil", err) + } +} + +// TestRenderAddResult_Human asserts the compact human shapes per Action. 
+func TestRenderAddResult_Human(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + res skillcore.AddResult + want string + }{ + {"vendored", skillcore.AddResult{Name: "tf", Version: "1.4.0", Path: ".agents/skills/tf", TreeSha: "abc1234def", Action: skillcore.ActionVendored}, "vendored tf@1.4.0"}, + {"unchanged", skillcore.AddResult{Name: "tf", Version: "1.4.0", Action: skillcore.ActionUnchanged}, "already vendored (no change)"}, + {"dry-run", skillcore.AddResult{Name: "tf", Version: "1.4.0", Action: skillcore.ActionVendored, DryRun: true}, "would vendor"}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + var b bytes.Buffer + if err := renderAddResult(&b, tt.res, false); err != nil { + t.Fatalf("renderAddResult: %v", err) + } + + out := b.String() + if !strings.Contains(out, tt.want) { + t.Errorf("human output %q missing %q", out, tt.want) + } + + if lines := strings.Count(strings.TrimRight(out, "\n"), "\n") + 1; lines > 2 { + t.Errorf("human output has %d lines, want <= 2:\n%s", lines, out) + } + + if !strings.Contains(out, "skillrig verify") { + t.Errorf("missing next-step footer: %q", out) + } + }) + } +} + +// TestRenderAddResult_JSON asserts the complete --json view (all keys, lowercased action). 
+func TestRenderAddResult_JSON(t *testing.T) { + t.Parallel() + + res := skillcore.AddResult{Name: "tf", Version: "1.4.0", Path: ".agents/skills/tf", Commit: "c0ffee", TreeSha: "deadbeef", Action: skillcore.ActionVendored} + + var b bytes.Buffer + + if err := renderAddResult(&b, res, true); err != nil { + t.Fatalf("renderAddResult json: %v", err) + } + + var obj map[string]any + if err := json.Unmarshal(b.Bytes(), &obj); err != nil { + t.Fatalf("json.Unmarshal: %v\n%s", err, b.String()) + } + + for _, k := range []string{"ok", "name", "version", "path", "commit", "treeSha", "action", "dryRun"} { + if _, present := obj[k]; !present { + t.Errorf("json missing key %q: %v", k, obj) + } + } + + if obj["action"] != "vendored" || obj["ok"] != true { + t.Errorf("json action/ok = %v/%v, want vendored/true", obj["action"], obj["ok"]) + } +} + +// TestRenderVerifyReport_JSONComplete asserts the report JSON is structurally +// complete for both empty and populated reports (counts has all five keys; +// verdicts is [] not null when empty; each verdict carries all six fields). +func TestRenderVerifyReport_JSONComplete(t *testing.T) { + t.Parallel() + + reports := map[string]skillcore.Report{ + "empty": {OK: true}, + "populated": { + OK: false, + Counts: skillcore.Counts{Verified: 1, Mismatch: 1}, + Verdicts: []skillcore.Verdict{ + {Name: "ok-skill", Path: ".agents/skills/ok-skill", Status: skillcore.StatusOK, ExpectedTreeSha: "a", ActualTreeSha: "a"}, + {Name: "bad", Path: ".agents/skills/bad", Status: skillcore.StatusMismatch, ExpectedTreeSha: "a", ActualTreeSha: "b", Reason: "content does not match"}, + }, + }, + } + + for name, rep := range reports { + t.Run(name, func(t *testing.T) { + t.Parallel() + + var b bytes.Buffer + if err := renderVerifyReport(&b, rep, true); err != nil { + t.Fatalf("renderVerifyReport: %v", err) + } + + // Decode into the concrete shape so missing keys / null verdicts fail. 
+ var got verifyReportJSON + if err := json.Unmarshal(b.Bytes(), &got); err != nil { + t.Fatalf("json.Unmarshal: %v\n%s", err, b.String()) + } + + if got.Verdicts == nil { + t.Errorf("verdicts is null, want [] (empty repo must serialize as [])") + } + + if !strings.Contains(b.String(), `"counts":{"verified"`) { + t.Errorf("counts not fully present: %s", b.String()) + } + + if len(got.Verdicts) != len(rep.Verdicts) { + t.Errorf("verdicts = %d, want %d (every checked skill must appear)", len(got.Verdicts), len(rep.Verdicts)) + } + }) + } +} + +// TestRenderVerifyReport_Human asserts the bounded human shapes for pass and fail. +func TestRenderVerifyReport_Human(t *testing.T) { + t.Parallel() + + var pass bytes.Buffer + if err := renderVerifyReport(&pass, skillcore.Report{OK: true, Counts: skillcore.Counts{Verified: 3}}, false); err != nil { + t.Fatalf("render pass: %v", err) + } + + if lines := nonEmptyCount(pass.String()); lines != 2 { + t.Errorf("pass output = %d lines, want exactly 2:\n%s", lines, pass.String()) + } + + if !strings.Contains(pass.String(), "verified 3 skills") { + t.Errorf("pass output missing count: %s", pass.String()) + } + + failReport := skillcore.Report{ + OK: false, + Counts: skillcore.Counts{Verified: 1, Mismatch: 1}, + Verdicts: []skillcore.Verdict{ + {Name: "ok-skill", Status: skillcore.StatusOK}, + {Name: "bad", Status: skillcore.StatusMismatch, ExpectedTreeSha: "aaaaaaaa", ActualTreeSha: "bbbbbbbb"}, + }, + } + + var fail bytes.Buffer + if err := renderVerifyReport(&fail, failReport, false); err != nil { + t.Fatalf("render fail: %v", err) + } + + out := fail.String() + if !strings.Contains(out, "verify FAILED: 1 of 2 skills") || !strings.Contains(out, "✗ bad") { + t.Errorf("fail output wrong shape:\n%s", out) + } + // Bounded: header + one line per FAILING verdict + footer (passing ones summarized). 
+ if lines := nonEmptyCount(out); lines > failReport.Counts.Mismatch+2 { + t.Errorf("fail output = %d lines, want <= findings+2:\n%s", lines, out) + } +} + +// nonEmptyCount counts the non-blank lines in s. +func nonEmptyCount(s string) int { + n := 0 + + for line := range strings.SplitSeq(s, "\n") { + if strings.TrimSpace(line) != "" { + n++ + } + } + + return n +} diff --git a/internal/cli/verify.go b/internal/cli/verify.go index 1adafd5..20d03a0 100644 --- a/internal/cli/verify.go +++ b/internal/cli/verify.go @@ -2,6 +2,7 @@ package cli import ( "errors" + "strings" "github.com/spf13/cobra" @@ -38,7 +39,15 @@ func newVerifyCmd(opts *globalOpts) *cobra.Command { " skillrig verify\n\n" + " # Machine-readable per-skill verdicts for an agent / jq\n" + " skillrig verify --json", - Args: cobra.NoArgs, + // Custom validator (not cobra.NoArgs) so an extra positional yields + // what/why/fix instead of cobra's "unknown command" dead end (cli.md P1/P2). + Args: func(_ *cobra.Command, args []string) error { + if len(args) != 0 { + return usageVerifyArgs(args) + } + + return nil + }, RunE: func(cmd *cobra.Command, _ []string) error { return vc.run(cmd) }, @@ -74,6 +83,15 @@ func (vc *verifyCmd) run(cmd *cobra.Command) error { // verifyNotGitRepoWhy is the rationale for verify's not-a-repo error. const verifyNotGitRepoWhy = "tree-SHA recompute needs git" +// usageVerifyArgs builds the navigational usage error when verify is given +// positional arguments it does not take (errors-as-navigation: what / why / fix). +func usageVerifyArgs(args []string) *UsageError { + return usageErrorf("verify takes no arguments\n"+ + "why: it verifies the whole repo (got: %s)\n"+ + "fix: run skillrig verify (add --json for machine-readable per-skill verdicts)", + strings.Join(args, " ")) +} + // handleVerifyError classifies skillcore.Verify's error. A *VerifyFailure is a // per-skill finding: render the report to stdout (human or --json) and return // the failure so exitCodeFor yields exit 2. 
A *LockError is a config/usage diff --git a/pkg/skillcore/add.go b/pkg/skillcore/add.go index 54e08c7..0524a94 100644 --- a/pkg/skillcore/add.go +++ b/pkg/skillcore/add.go @@ -82,11 +82,9 @@ func (e *OverwriteError) Error() string { // overwrite unless opts.Force, writes nothing when opts.DryRun, and is // idempotent on identical content. func Add(opts AddOptions) (AddResult, error) { - srcDir := filepath.Join(opts.OriginDir, "skills", opts.Skill) - - info, err := os.Stat(srcDir) - if err != nil || !info.IsDir() { - return AddResult{}, &SkillNotFoundError{Skill: opts.Skill} + srcDir, err := locateSkillSource(opts) + if err != nil { + return AddResult{}, err } manifest, err := ParseManifest(filepath.Join(srcDir, "skill.toml")) @@ -109,7 +107,7 @@ func Add(opts AddOptions) (AddResult, error) { destPath := vendorRoot + "/" + opts.Skill destDir := filepath.Join(opts.RepoRoot, ".agents", "skills", opts.Skill) - action, err := resolvePlacement(opts, srcDir, destDir, treeSha) + action, err := resolvePlacement(opts, manifest.Name, srcDir, destDir, treeSha) if err != nil { return AddResult{}, err } @@ -145,12 +143,35 @@ func Add(opts AddOptions) (AddResult, error) { return result, nil } +// locateSkillSource resolves and validates the origin's skill subtree, returning +// its directory. It distinguishes a missing origin checkout +// (*OriginNotFoundError — the library isn't checked out at OriginDir) from a +// missing skill (*SkillNotFoundError — the origin is present but has no such +// skill), so the CLI can give the right fix (errors-as-navigation). 
+func locateSkillSource(opts AddOptions) (string, error) { + if info, err := os.Stat(opts.OriginDir); err != nil || !info.IsDir() { + return "", &OriginNotFoundError{OriginDir: opts.OriginDir, Ref: opts.Ref} + } + + srcDir := filepath.Join(opts.OriginDir, "skills", opts.Skill) + if info, err := os.Stat(srcDir); err != nil || !info.IsDir() { + return "", &SkillNotFoundError{Skill: opts.Skill} + } + + return srcDir, nil +} + // resolvePlacement inspects the destination and decides the Action without // writing anything. A fresh placement is ActionVendored. An existing tree is // ActionUnchanged (idempotent) only when its on-disk content is byte-identical // to the origin source AND the lock records the matching fingerprint; any // divergence is ActionOverwritten under Force, otherwise an *OverwriteError. -func resolvePlacement(opts AddOptions, srcDir, destDir, treeSha string) (Action, error) { +// +// name is the manifest name the lock is keyed by (writeLockEntry uses it); the +// lookup MUST use it, not opts.Skill (the directory arg), or a skill whose +// manifest name differs from its directory would never match its own lock entry +// and an identical re-add would be wrongly refused (FR-003 idempotency). 
+func resolvePlacement(opts AddOptions, name, srcDir, destDir, treeSha string) (Action, error) { if _, err := os.Stat(destDir); err != nil { if os.IsNotExist(err) { return ActionVendored, nil @@ -169,7 +190,7 @@ func resolvePlacement(opts AddOptions, srcDir, destDir, treeSha string) (Action, return "", err } - recorded := lock.Skills[opts.Skill].TreeSha + recorded := lock.Skills[name].TreeSha if identical && recorded == treeSha { return ActionUnchanged, nil } diff --git a/pkg/skillcore/add_test.go b/pkg/skillcore/add_test.go index 832b848..3fa856d 100644 --- a/pkg/skillcore/add_test.go +++ b/pkg/skillcore/add_test.go @@ -111,6 +111,65 @@ func TestAdd_Idempotent(t *testing.T) { } } +// TestAdd_IdempotentWhenManifestNameDiffersFromDir guards R2-H1: the lock is +// keyed by the manifest name, so the placement guard must look up the recorded +// fingerprint by that name too — not by the directory arg. data-model only says +// the leaf SHOULD equal the name, so a dir "tf-review" with manifest name +// "terraform-plan-review" is legal; before the fix an identical re-add was +// wrongly refused with an *OverwriteError (FR-003 violation). 
+func TestAdd_IdempotentWhenManifestNameDiffersFromDir(t *testing.T) { + t.Parallel() + + originDir := t.TempDir() + runGit(t, originDir, "init", "-q") + + const dirName = "tf-review" // != manifest name "terraform-plan-review" + writeFile(t, originDir, filepath.Join("skills", dirName, "SKILL.md"), 0o644, sampleSkillMd) + writeFile(t, originDir, filepath.Join("skills", dirName, "skill.toml"), 0o644, sampleManifest) + runGit(t, originDir, "add", "-A") + runGit(t, originDir, "commit", "-q", "-m", "seed dir!=name skill") + + consumer := newConsumer(t) + + if _, err := Add(addOpts(originDir, dirName, consumer, false)); err != nil { + t.Fatalf("first Add: %v", err) + } + + second, err := Add(addOpts(originDir, dirName, consumer, false)) + if err != nil { + t.Fatalf("identical re-add must be idempotent, not refused: %v", err) + } + + if second.Action != ActionUnchanged { + t.Errorf("Action = %q, want %q (name!=dir must not false-refuse)", second.Action, ActionUnchanged) + } +} + +// TestAdd_OriginCheckoutMissing guards R2-M4/AR-2: when the entire origin +// checkout directory is absent, Add returns a distinct *OriginNotFoundError +// (check out the origin) rather than *SkillNotFoundError (check the skill name). 
+func TestAdd_OriginCheckoutMissing(t *testing.T) { + t.Parallel() + + consumer := newConsumer(t) + missingOrigin := filepath.Join(consumer, "my-org", "my-skills") // never created + + _, err := Add(addOpts(missingOrigin, "terraform-plan-review", consumer, false)) + if err == nil { + t.Fatal("Add(missing origin checkout): want error, got nil") + } + + var originMissing *OriginNotFoundError + if !errors.As(err, &originMissing) { + t.Fatalf("Add error = %T (%v), want *OriginNotFoundError", err, err) + } + + var skillMissing *SkillNotFoundError + if errors.As(err, &skillMissing) { + t.Error("missing origin checkout must NOT be reported as SkillNotFoundError") + } +} + // TestAdd_DivergentRefused asserts the never-silently-clobber guard (FR-004): // once a vendored skill is locally modified it diverges from the lock, and a // plain re-add must refuse with an *OverwriteError (the CLI maps it to exit 1). diff --git a/pkg/skillcore/errors.go b/pkg/skillcore/errors.go index 6e8000c..35037e6 100644 --- a/pkg/skillcore/errors.go +++ b/pkg/skillcore/errors.go @@ -37,6 +37,22 @@ func (e *LockError) Unwrap() error { return e.Cause } +// OriginNotFoundError is returned when the resolved local origin checkout does +// not exist at OriginDir — e.g. the user ran `skillrig init` (which records only +// an OWNER/REPO reference) but never checked the library out to its expected +// local path. It is deliberately distinct from *SkillNotFoundError (the origin +// IS present but the named skill is absent) so the CLI can tell the user to +// check out the origin rather than re-check the skill name (errors-as-navigation: +// do not conflate look-alike failure classes). Presentation-free: terse Error. +type OriginNotFoundError struct { + OriginDir string + Ref string +} + +func (e *OriginNotFoundError) Error() string { + return fmt.Sprintf("origin checkout not found at %q", e.OriginDir) +} + // GitError is returned when a git invocation fails. 
It carries the process exit // code and captured stderr, mirroring the gh/git client pattern, so the caller // can render an environment error. It is presentation-free. diff --git a/pkg/skillcore/helpers_test.go b/pkg/skillcore/helpers_test.go index 7f5fc40..7f6e02e 100644 --- a/pkg/skillcore/helpers_test.go +++ b/pkg/skillcore/helpers_test.go @@ -124,12 +124,26 @@ func stubCommandContext(exitCode int, stderr string) func(ctx context.Context, n // Re-exec this test binary, routing into TestHelperProcess. csArgs := append([]string{"-test.run=TestHelperProcess", "--"}, args...) - cmd := exec.CommandContext(ctx, os.Args[0], csArgs...) - cmd.Env = []string{ + env := []string{ "GO_WANT_HELPER_PROCESS=1", "HELPER_EXIT_CODE=" + strconv.Itoa(exitCode), "HELPER_STDERR=" + stderr, } + // Under `go test -cover`, os.Args[0] is a coverage-instrumented binary; + // re-exec'ing it with a cleared env makes it print "warning: GOCOVERDIR + // not set …" to stderr, which would pollute the git stderr the client + // captures and break TestGitClient_StubbedExit. Hand it a coverage dir so + // it stays silent (a real git binary writes no such warning). Without + // -cover the var is simply ignored. + coverDir := os.Getenv("GOCOVERDIR") + if coverDir == "" { + coverDir = os.TempDir() + } + + env = append(env, "GOCOVERDIR="+coverDir) + + cmd := exec.CommandContext(ctx, os.Args[0], csArgs...) + cmd.Env = env return cmd } diff --git a/pkg/skillcore/verify_test.go b/pkg/skillcore/verify_test.go new file mode 100644 index 0000000..a86c174 --- /dev/null +++ b/pkg/skillcore/verify_test.go @@ -0,0 +1,278 @@ +package skillcore + +import ( + "errors" + "os" + "path/filepath" + "testing" +) + +// seedVerifyRepo builds a consumer git repo with one committed skill under +// .agents/skills// and an on-disk lock recording its real (raw-git) +// tree-SHA. The tree-SHA is computed via raw git (the independent oracle, D11), +// never through skillcore. Returns the repo dir and the recorded tree-SHA. 
+func seedVerifyRepo(t *testing.T, name string) (repo, treeSha string) { + t.Helper() + + repo = t.TempDir() + runGit(t, repo, "init", "-q") + + rel := skillsRoot + "/" + name + writeFile(t, repo, filepath.Join(rel, "SKILL.md"), 0o644, sampleSkillMd) + writeFile(t, repo, filepath.Join(rel, "skill.toml"), 0o644, sampleManifest) + runGit(t, repo, "add", "-A") + runGit(t, repo, "commit", "-q", "-m", "vendor "+name) + + treeSha = runGit(t, repo, "rev-parse", "HEAD:"+rel) + commit := runGit(t, repo, "rev-parse", "HEAD") + + writeVerifyLock(t, repo, LockFile{ + LockfileVersion: 1, + Origin: "my-org/my-skills", + Skills: map[string]LockEntry{ + name: {Version: "1.4.0", Commit: commit, TreeSha: treeSha, Path: rel}, + }, + }) + + return repo, treeSha +} + +func writeVerifyLock(t *testing.T, repo string, lf LockFile) { + t.Helper() + + if err := WriteLock(repo, lf); err != nil { + t.Fatalf("WriteLock: %v", err) + } +} + +// findVerdict returns the verdict for name, failing if absent. +func findVerdict(t *testing.T, rep Report, name string) Verdict { + t.Helper() + + for _, v := range rep.Verdicts { + if v.Name == name { + return v + } + } + + t.Fatalf("no verdict for %q in %+v", name, rep.Verdicts) + + return Verdict{} +} + +// TestVerify_CleanPass: a committed skill matching its lock → ok, no error, +// expected == actual == the raw-git tree-SHA (label-honesty ground truth). 
+func TestVerify_CleanPass(t *testing.T) { + t.Parallel() + + repo, treeSha := seedVerifyRepo(t, "terraform-plan-review") + + rep, err := Verify(repo) + if err != nil { + t.Fatalf("Verify: unexpected error: %v", err) + } + + if !rep.OK || rep.Counts.Verified != 1 { + t.Fatalf("report = %+v, want ok with 1 verified", rep) + } + + v := findVerdict(t, rep, "terraform-plan-review") + if v.Status != StatusOK || v.ExpectedTreeSha != treeSha || v.ActualTreeSha != treeSha { + t.Errorf("verdict = %+v, want ok with expected==actual==%s", v, treeSha) + } +} + +// TestVerify_Mismatch: committed content tampered (and re-committed) so HEAD's +// tree-SHA no longer matches the lock → mismatch, *VerifyFailure (exit-2 class). +func TestVerify_Mismatch(t *testing.T) { + t.Parallel() + + repo, treeSha := seedVerifyRepo(t, "terraform-plan-review") + + writeFile(t, repo, filepath.Join(skillsRoot, "terraform-plan-review", "SKILL.md"), 0o644, "tampered\n") + runGit(t, repo, "commit", "-aqm", "tamper") + + rep, err := Verify(repo) + + var vf *VerifyFailure + if !errors.As(err, &vf) { + t.Fatalf("Verify error = %T (%v), want *VerifyFailure", err, err) + } + + if rep.OK || rep.Counts.Mismatch != 1 { + t.Fatalf("report = %+v, want not-ok with 1 mismatch", rep) + } + + v := findVerdict(t, rep, "terraform-plan-review") + if v.Status != StatusMismatch || v.ExpectedTreeSha != treeSha || v.ActualTreeSha == treeSha { + t.Errorf("verdict = %+v, want mismatch with expected==%s != actual", v, treeSha) + } +} + +// TestVerify_Orphan: an on-disk skill dir with no lock entry → orphan. 
+func TestVerify_Orphan(t *testing.T) { + t.Parallel() + + repo, _ := seedVerifyRepo(t, "terraform-plan-review") + + writeFile(t, repo, filepath.Join(skillsRoot, "rogue", "skill.toml"), 0o644, "name = \"rogue\"\nversion = \"0.0.1\"\n") + runGit(t, repo, "add", "-A") + runGit(t, repo, "commit", "-qm", "add rogue") + + rep, err := Verify(repo) + + var vf *VerifyFailure + if !errors.As(err, &vf) { + t.Fatalf("Verify error = %T, want *VerifyFailure", err) + } + + if rep.Counts.Orphan != 1 { + t.Errorf("counts.orphan = %d, want 1 (%+v)", rep.Counts.Orphan, rep) + } + + if v := findVerdict(t, rep, "rogue"); v.Status != StatusOrphan { + t.Errorf("rogue verdict = %q, want orphan", v.Status) + } +} + +// TestVerify_Missing: a lock entry whose files were removed (and the removal +// committed) → missing. +func TestVerify_Missing(t *testing.T) { + t.Parallel() + + repo, _ := seedVerifyRepo(t, "terraform-plan-review") + + if err := os.RemoveAll(filepath.Join(repo, skillsRoot, "terraform-plan-review")); err != nil { + t.Fatalf("remove skill: %v", err) + } + + runGit(t, repo, "add", "-A") + runGit(t, repo, "commit", "-qm", "remove skill") + + rep, err := Verify(repo) + + var vf *VerifyFailure + if !errors.As(err, &vf) { + t.Fatalf("Verify error = %T, want *VerifyFailure", err) + } + + if rep.Counts.Missing != 1 { + t.Errorf("counts.missing = %d, want 1 (%+v)", rep.Counts.Missing, rep) + } + + if v := findVerdict(t, rep, "terraform-plan-review"); v.Status != StatusMissing { + t.Errorf("verdict = %q, want missing", v.Status) + } +} + +// TestVerify_Dirty: a committed skill with an uncommitted local edit → dirty +// (a working-state finding, distinct from mismatch). +func TestVerify_Dirty(t *testing.T) { + t.Parallel() + + repo, _ := seedVerifyRepo(t, "terraform-plan-review") + + // Uncommitted edit to the vendored tree. 
+ writeFile(t, repo, filepath.Join(skillsRoot, "terraform-plan-review", "SKILL.md"), 0o644, "locally edited\n") + + rep, err := Verify(repo) + + var vf *VerifyFailure + if !errors.As(err, &vf) { + t.Fatalf("Verify error = %T, want *VerifyFailure", err) + } + + if rep.Counts.Dirty != 1 { + t.Errorf("counts.dirty = %d, want 1 (%+v)", rep.Counts.Dirty, rep) + } + + if v := findVerdict(t, rep, "terraform-plan-review"); v.Status != StatusDirty { + t.Errorf("verdict = %q, want dirty", v.Status) + } +} + +// TestVerify_DirtyTakesPrecedenceOverMismatch: when content is both committed- +// tampered (would be mismatch) AND has a further uncommitted edit, dirty wins — +// the by-design precedence (commit before verifying). +func TestVerify_DirtyTakesPrecedenceOverMismatch(t *testing.T) { + t.Parallel() + + repo, _ := seedVerifyRepo(t, "terraform-plan-review") + + // Committed tamper (would be a mismatch on its own)… + writeFile(t, repo, filepath.Join(skillsRoot, "terraform-plan-review", "SKILL.md"), 0o644, "committed tamper\n") + runGit(t, repo, "commit", "-aqm", "committed tamper") + // …plus a further uncommitted edit. + writeFile(t, repo, filepath.Join(skillsRoot, "terraform-plan-review", "SKILL.md"), 0o644, "uncommitted on top\n") + + rep, _ := Verify(repo) + + v := findVerdict(t, rep, "terraform-plan-review") + if v.Status != StatusDirty { + t.Errorf("verdict = %q, want dirty (dirty masks mismatch until committed)", v.Status) + } +} + +// TestVerify_EmptyRepoPasses: a fresh repo with no skills and no lock → ok. +func TestVerify_EmptyRepoPasses(t *testing.T) { + t.Parallel() + + repo := newConsumer(t) + + rep, err := Verify(repo) + if err != nil { + t.Fatalf("Verify(empty): unexpected error: %v", err) + } + + if !rep.OK || len(rep.Verdicts) != 0 || rep.Counts != (Counts{}) { + t.Errorf("report = %+v, want ok/empty", rep) + } +} + +// TestVerify_AggregatesAllFindings: a mismatch AND an orphan in one run are both +// reported (never first-fail). 
+func TestVerify_AggregatesAllFindings(t *testing.T) { + t.Parallel() + + repo, _ := seedVerifyRepo(t, "terraform-plan-review") + + writeFile(t, repo, filepath.Join(skillsRoot, "terraform-plan-review", "SKILL.md"), 0o644, "tampered\n") + writeFile(t, repo, filepath.Join(skillsRoot, "rogue", "skill.toml"), 0o644, "name = \"rogue\"\n") + runGit(t, repo, "add", "-A") + runGit(t, repo, "commit", "-aqm", "tamper + orphan") + + rep, _ := Verify(repo) + + if rep.Counts.Mismatch < 1 || rep.Counts.Orphan < 1 { + t.Errorf("counts = %+v, want >=1 mismatch AND >=1 orphan in one run", rep.Counts) + } + + if len(rep.Verdicts) < 2 { + t.Errorf("verdicts = %d, want >= 2 (did not aggregate)", len(rep.Verdicts)) + } +} + +// TestVerify_MalformedLockVersion: an unsupported lockfileVersion is a *LockError +// (config/usage, exit 1), NOT a *VerifyFailure (exit 2). +func TestVerify_MalformedLockVersion(t *testing.T) { + t.Parallel() + + repo, treeSha := seedVerifyRepo(t, "terraform-plan-review") + + writeVerifyLock(t, repo, LockFile{ + LockfileVersion: 99, + Skills: map[string]LockEntry{"terraform-plan-review": {TreeSha: treeSha, Path: skillsRoot + "/terraform-plan-review"}}, + }) + + _, err := Verify(repo) + + var lockErr *LockError + if !errors.As(err, &lockErr) { + t.Fatalf("Verify error = %T (%v), want *LockError", err, err) + } + + var vf *VerifyFailure + if errors.As(err, &vf) { + t.Error("unsupported lockfileVersion must NOT be a *VerifyFailure (it is exit 1, not 2)") + } +} diff --git a/specledger/002-skillcore-verify/contracts/add.md b/specledger/002-skillcore-verify/contracts/add.md index 28d74e2..ad913b2 100644 --- a/specledger/002-skillcore-verify/contracts/add.md +++ b/specledger/002-skillcore-verify/contracts/add.md @@ -21,7 +21,7 @@ skillrig add [--dry-run] [--force] [--json] [--verbose] > **Origin, not a path** (clarified 2026-05-30): there is **no** `--from`/path argument. 
`add` resolves the active origin through the shared resolver (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) exactly like every command; the origin *value* may be a local checkout this slice. Tests do `skillrig init --origin ` then `skillrig add `. -> **Local-origin resolution (this slice)** — `init` accepts only an `OWNER/REPO[@REF]` reference (not a filesystem path). For a local origin, `add` reads that reference from a **git checkout at `./OWNER/REPO`, relative to the directory `add` runs from** (your repo root): `init --origin my-org/my-skills` ⇒ `add` reads `./my-org/my-skills/skills//`. `@REF` selects the revision (default `HEAD`). Keep the nested checkout out of the consumer index (e.g. `echo 'my-org/' >> .git/info/exclude`). This is the concrete encoding of "the origin value may be a local checkout"; fetching a remote origin over the network is a later, additive mode. *(Follow-up: the path is resolved relative to the process CWD, not the resolved repo root — run `add` from the repo root. Making it repo-root-relative is a hardening candidate.)* +> **Local-origin resolution (this slice)** — `init` accepts only an `OWNER/REPO[@REF]` reference (not a filesystem path). For a local origin, `add` reads that reference from a **git checkout at `/OWNER/REPO`, resolved against the repo root** (`git rev-parse --show-toplevel`), so `add` works from **any subdirectory**: `init --origin my-org/my-skills` ⇒ `add` reads `/my-org/my-skills/skills//`. `@REF` selects the revision (default `HEAD`). Keep the nested checkout out of the consumer index (e.g. `echo 'my-org/' >> .git/info/exclude`). This is the concrete encoding of "the origin value may be a local checkout"; fetching a remote origin over the network is a later, additive mode. If the checkout is absent, `add` returns a distinct "origin checkout not found" error (not "skill not found"). 
*(Resolved AR-1/R2-M4: origin source is repo-root-anchored, consistent with the destination; missing-checkout is its own error class.)* ## Help (Progressive Discovery) diff --git a/specledger/002-skillcore-verify/reviews/002-review.md b/specledger/002-skillcore-verify/reviews/002-review.md index f9652b5..ad41df5 100644 --- a/specledger/002-skillcore-verify/reviews/002-review.md +++ b/specledger/002-skillcore-verify/reviews/002-review.md @@ -86,3 +86,56 @@ The first review agent's manual round-trip used a `cd "$WORK"` that silently no- 2. **[HIGH] AR-1** — decide the CWD-vs-repo-root resolution: make `OriginDir` repo-root-relative (robust; tests still pass since root==CWD there), or document "run `add` from the repo root" prominently. Currently only a follow-up note. 3. **[MEDIUM] AR-3** — add `pkg/skillcore/verify_test.go` (stub the git client; table-drive the status taxonomy + counts + aggregation). 4. **[LOW] AR-4 / AR-5** — fold the AR-1/AR-2 nuance into the `skillrig-add-verify` skill; refresh the stale data-model sample SHA. + +--- + +# Independent Adversarial Review #2 — 2026-05-30 (post-commit, clean tree) + +**Scope:** a second cold-context agent (Opus 4.8, xhigh), explicitly **forbidden from reading `reviews/`/`sessions/`** (true independence), reviewed the committed branch (`168afd1`): read artifacts + code, ran `make check`, exercised the binary, and probed edge cases. It **confirmed AR-1/AR-2/AR-3 independently** and found **new** defects below the green bar. `make check` PASSES; no CRITICAL. + +| ID | New? | Category | Severity | Summary | +|----|------|----------|----------|---------| +| **R2-H1** | **NEW** | Correctness (FR-003 idempotency) | **HIGH** | **`add` falsely refuses an identical re-add when manifest `name` ≠ directory name.** The lock is written keyed by `manifest.Name` (`add.go` ~L320) but `resolvePlacement` looks the entry back up by `opts.Skill` (the dir arg, ~L172: `lock.Skills[opts.Skill].TreeSha`). 
When they differ, the lookup misses → `recorded=="" ≠ treeSha` → identical re-add **refused** with a wrong `OverwriteError` "diverges from the recorded fingerprint". Reproduced live. data-model only *SHOULD* (not MUST) equate leaf==name, so it's reachable. No fixture has name≠dir, so all tests pass. | +| **R2-H2** | **NEW** | Test harness | **HIGH** | **`go test -cover ./pkg/skillcore/...` FAILS (3 subcases).** Under `-cover`, the re-exec'd `TestHelperProcess` git stub emits `warning: GOCOVERDIR not set…` to stderr, which `TestGitClient_StubbedExit` (`treesha_test.go:144`) captures and compares against expected-clean stderr. `make check` uses plain `go test`, so it's invisible to the gate but breaks any `-cover` run / future coverage CI. Fix: set `GOCOVERDIR` in the stub or strip the warning. | +| **R2-M3** | **NEW** | Errors-as-navigation (FR-019; cli.md P1/P2) | MEDIUM | **Bad-args invocation is a dead end.** `skillrig add` (no args) → only `error: accepts 1 arg(s), received 0`; `verify extra-arg` → `error: unknown command…` — **no what/why/fix, no Usage/Examples** (root `SilenceUsage:true`, root.go:41). Directly contradicts cli.md Principle 1's worked example showing `skillrig add` with no args printing Usage+Examples. | +| **R2-M4** | confirms **AR-2** | Errors-as-navigation | MEDIUM | "skill not found in origin" conflates a **missing origin checkout** with a **wrong skill name** (`add.go:84-90`, any stat error → `SkillNotFoundError`); `--verbose` repeats the same terse message. Skill error table inherits it. | +| **R2-L5** | **NEW** | Doc contract drift | LOW | **`docs/design/cli.md` documents a non-existent `add` surface** — lines 73/76–77 present the *current* synopsis/examples as `add [--origin OWNER/REPO] [--pin ]` / `add … --pin v1.4.0`. Neither flag exists (real surface: `[--dry-run] [--force] [--json] [--verbose]`); `--from`/`--origin` was dropped and `--pin` is Out of Scope. 
*(My DocSync pass synced other lines but missed these.)* | +| **R2-L6** | confirms **AR-1** | Origin resolution | LOW (conscious) | local-origin path resolved relative to **process CWD**, not repo root → `add` from a subdirectory fails ("skill not found"). Self-documented in `add.md:24` as a hardening candidate; still a footgun, no add-from-subdir test. | + +## Test-discipline gaps (Constitution II/III) + +| ID | New? | Gap | +|----|------|-----| +| **GAP-A** | confirms **AR-3** | `pkg/skillcore/verify.go` has **zero** unit tests — the entire Verify operation (status classification, `pathInHead`/`pathDirty` precedence, `readVerifyLock`, `enumerateOnDiskSkills`, `buildReport`) is validated only via black-box integration. | +| **GAP-B** | **NEW** | `internal/cli` add/verify presentation layer has **zero** unit tests (only `init` is unit-tested) — `exitCodeFor`, `mapAddError`, renderers, `originDirRef`, `gitToplevel` are integration-only. Constitution III mandates presentation-free unit tests in `internal/…`. | +| **GAP-C** | **NEW** | **`make test-unit` (`go test ./internal/...`) EXCLUDES `pkg/skillcore`** — the Constitution III ground-truth/table-driven centerpiece runs only under `make test`/`make check`. The Makefile's unit tier wasn't updated when skillcore moved to `pkg/` (SDK-1). A dev running the documented unit tier silently skips the integrity tests. | +| **GAP-D** | **NEW** | `TestQuickstart_AddVendorsSkill`'s mode-preservation assertion is a **no-op** — both fixtures are `0o644`, so the "exec bit is part of the tree-SHA" guarantee is never exercised. Add a `0o755` file to the sample skill. 
| + +## Positives (independently re-verified) +Ground-truth anchoring (lock treeSha == raw `git rev-parse`, raw-git oracle, relocation-invariance); verify genuinely read-only + offline + deterministic; exit codes correct & typed-switch (1 lock/repo, 2 verification, never 3); FR-014/SC-006 (requires-absent tools still pass, no `requires` in lock); aggregation; orphan scan confined to `.agents/skills` (view dir ignored); `dirty` distinct from `mismatch`, untracked stowaway → `dirty`; clean stdout/stderr; `--verbose` raw causes; skillrig-add-verify skill accurate. **No scope creep**; no FR wholly unimplemented. + +## Priority before merge (reviewer's call) +**R2-H1** (idempotency correctness bug) · **R2-H2** (`-cover` failure) · **GAP-A/B** (no unit tests for verify or the add/verify CLI layer). Then GAP-C (Makefile unit tier), R2-M3/M4 (error navigation), R2-L5 (cli.md), GAP-D (exec-bit fixture). + +--- + +# Remediations — 2026-05-30 (effort: high) + +All Review #2 findings **resolved** (AR-1 included per user decision). Gate after fixes: `make check` 0 lint issues + all tests; `go test -cover ./...` green (skillcore 79.5%, internal/cli 51.2% — both were 0% for verify/add-cli before); `make test-unit` now runs skillcore. Behavior fixes verified live (add from a subdir; missing-checkout error; bad-args navigation). + +| ID | Severity | Status | Fix (files) | +|----|----------|--------|-------------| +| **R2-H1** | HIGH | ✅ Fixed | `resolvePlacement` now looks up the lock by the **manifest name** (the key `writeLockEntry` writes), not the directory arg — identical re-add of a name≠dir skill is `unchanged`, not refused. Regression test `TestAdd_IdempotentWhenManifestNameDiffersFromDir`. (`pkg/skillcore/add.go`, `add_test.go`) | +| **R2-H2** | HIGH | ✅ Fixed | The re-exec'd git stub sets `GOCOVERDIR`, so `go test -cover ./pkg/skillcore/...` no longer leaks a warning into captured stderr. 
(`pkg/skillcore/helpers_test.go`) |
+| **R2-M3** | MEDIUM | ✅ Fixed | Custom `Args` validators on `add`/`verify` return what/why/fix + an example instead of cobra's terse error. Unit-tested. (`internal/cli/add.go`, `verify.go`, `addverify_test.go`) |
+| **R2-M4 / AR-2** | MEDIUM/HIGH | ✅ Fixed | New typed `*OriginNotFoundError` distinguishes a missing local origin checkout from a wrong skill name; CLI renders "origin checkout not found at " with the clone/re-bind fix. Tests at both layers. (`pkg/skillcore/{errors,add}.go`, `internal/cli/add.go`) |
+| **R2-L6 / AR-1** | LOW→fixed | ✅ Fixed | Origin source is now anchored to the repo root (`filepath.Join(repoRoot, originDir)`), matching the destination — `add` works from any subdirectory (verified live). (`internal/cli/add.go`) |
+| **R2-L5** | LOW | ✅ Fixed | `docs/design/cli.md` `add` synopsis/examples corrected to the shipped surface (`--dry-run`/`--force`/`--json`/`--verbose`); `--origin`/`--pin` marked dropped/planned. |
+| **GAP-A** | MEDIUM | ✅ Fixed | `pkg/skillcore/verify_test.go` — 9 unit tests: clean/mismatch/orphan/missing/dirty, dirty-masks-mismatch precedence, aggregate-all, empty-repo, unsupported-lockfileVersion → `*LockError`. |
+| **GAP-B** | MEDIUM | ✅ Fixed | `internal/cli/addverify_test.go` — `exitCodeFor` (incl. wrapped `*VerifyFailure`→2), `mapAddError` classes, `originDirRef`, arg validators, add/verify renderers (human shape + JSON completeness). |
+| **GAP-C** | — | ✅ Fixed | `make test-unit` → `go test ./internal/... ./pkg/...` (skillcore now in the unit tier). |
+| **GAP-D** | — | ✅ Fixed | Executable `check.sh` (0o755) added to the sample skill; `AddVendorsSkill` asserts the exec bit survives (mode preservation now actually exercised). |
+| **AR-4** | LOW | ✅ Fixed | Docs synced to the AR-1/R2-M4 behavior: `add --help`, `contracts/add.md`, `skillrig-init` + `skillrig-add-verify` skills now describe repo-root resolution + the distinct missing-checkout error.
| +| **AR-5** | LOW | ⏭️ Deferred | Stale `data-model.md` sample SHA — illustrative only (tests recompute via raw git); left as a cosmetic cleanup. | + +**Tooling:** `specledger.checkpoint-workflow` review-agent template now instructs loading `agentic-go-cli-design` + `golang-code-style`/`golang-testing`/`golang-lint` so future reviews judge against the same standards. diff --git a/test/skillcore_quickstart_test.go b/test/skillcore_quickstart_test.go index fdeb51b..0790494 100644 --- a/test/skillcore_quickstart_test.go +++ b/test/skillcore_quickstart_test.go @@ -272,6 +272,33 @@ var countsKeys = []string{"verified", "mismatch", "orphan", "missing", "dirty"} // US1 — Vendor a skill (add) // --------------------------------------------------------------------------- +// assertVendoredMatchesOrigin checks every vendored file is byte-identical to the +// origin with its mode preserved, and that check.sh keeps its executable bit (the +// exec bit is part of the tree-SHA, so a mode change would break label-honesty). 
+func assertVendoredMatchesOrigin(t *testing.T, c consumerRepo) { + t.Helper() + + for _, f := range []string{"SKILL.md", "skill.toml", "check.sh"} { + got := readSkillFile(t, c.root, f) + want := readFile(t, filepath.Join(c.originDir, "skills", sampleSkill, f)) + + if got != want { + t.Errorf("vendored %s differs from origin", f) + } + + gotMode := fileMode(t, filepath.Join(c.root, vendoredPath, f)) + wantMode := fileMode(t, filepath.Join(c.originDir, "skills", sampleSkill, f)) + + if gotMode != wantMode { + t.Errorf("vendored %s mode = %v, want %v", f, gotMode, wantMode) + } + } + + if execMode := fileMode(t, filepath.Join(c.root, vendoredPath, "check.sh")); execMode&0o111 == 0 { + t.Errorf("vendored check.sh lost its executable bit: mode = %v", execMode) + } +} + func TestQuickstart_AddVendorsSkill(t *testing.T) { t.Parallel() @@ -293,22 +320,9 @@ func TestQuickstart_AddVendorsSkill(t *testing.T) { t.Errorf("human output missing next-step footer (skillrig verify):\n%s", res.stdout) } - // Files vendored byte-identical to the origin (modes preserved). - for _, f := range []string{"SKILL.md", "skill.toml"} { - got := readSkillFile(t, c.root, f) - want := readFile(t, filepath.Join(c.originDir, "skills", sampleSkill, f)) - - if got != want { - t.Errorf("vendored %s differs from origin", f) - } - - gotMode := fileMode(t, filepath.Join(c.root, vendoredPath, f)) - wantMode := fileMode(t, filepath.Join(c.originDir, "skills", sampleSkill, f)) - - if gotMode != wantMode { - t.Errorf("vendored %s mode = %v, want %v", f, gotMode, wantMode) - } - } + // Files vendored byte-identical to the origin, with modes (incl. the exec bit) + // preserved. + assertVendoredMatchesOrigin(t, c) // Lock: one entry; treeSha == the raw-git ground truth; no requires field. 
entry := lockEntry(t, c.root, sampleSkill) diff --git a/test/testdata/sample-origin/skills/terraform-plan-review/check.sh b/test/testdata/sample-origin/skills/terraform-plan-review/check.sh new file mode 100755 index 0000000..ac2420c --- /dev/null +++ b/test/testdata/sample-origin/skills/terraform-plan-review/check.sh @@ -0,0 +1,6 @@ +#!/usr/bin/env bash +# A sample executable helper shipped inside the skill. Its 0o755 mode is part of +# the git tree-SHA, so it makes the mode-preservation assertions meaningful +# (without an executable fixture file the exec-bit guarantee is never exercised). +set -euo pipefail +echo "terraform-plan-review: ok" From 697bac30ec5a2da747aa18942832224da031aae2 Mon Sep 17 00:00:00 2001 From: Vincent De Smet Date: Sat, 30 May 2026 13:48:56 +0800 Subject: [PATCH 7/8] refactor(skill): consolidate into a single `skillrig` agent skill MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Merge the per-command skills `skillrig-init` and `skillrig-add-verify` into one consolidated `.agents/skills/skillrig/` skill, following skill-creator's progressive-disclosure / domain-organization pattern: - root SKILL.md: short + grokkable — what skillrig is, when to use it (find/ install/vendor/verify skills from your origin), the origin precondition + smoketest, the init->add->verify workflow, and a routing table to: - references/{init,add,verify}.md: full per-activity detail (moved verbatim from the two originals; all guidance preserved — cross-validated). - evals/: merged 9 behavioral cases (4 init + 5 add/verify) + 23 trigger queries; init-domain queries flip from negative→positive now that one skill owns them. Discovery scope: "find" means find an APPROVED skill in your configured origin (search is the next planned feature) — distinct from the generic `find-skills` skill; recorded as a known gap in the root SKILL.md. 
Guidance updated so future commands extend this one skill rather than spawning a new per-command skill: - Constitution IX: "one consolidated skill, not one-per-command" — a new command adds a references/.md + updates the root routing/description. - CLAUDE.md: skill path -> .agents/skills/skillrig/ + the same rule. Removed: .agents/skills/skillrig-init/, .agents/skills/skillrig-add-verify/. No code change; make check green. Evals defined, not run. Co-Authored-By: Claude Opus 4.8 (1M context) --- .agents/skills/skillrig-add-verify/SKILL.md | 154 ------------------ .../skillrig-add-verify/evals/evals.json | 63 ------- .agents/skills/skillrig-init/evals/evals.json | 49 ------ .../skillrig-init/evals/trigger-eval-set.json | 13 -- .agents/skills/skillrig/SKILL.md | 88 ++++++++++ .agents/skills/skillrig/evals/evals.json | 110 +++++++++++++ .../evals/trigger-eval-set.json | 21 ++- .agents/skills/skillrig/references/add.md | 64 ++++++++ .../SKILL.md => skillrig/references/init.md} | 82 +++------- .agents/skills/skillrig/references/verify.md | 66 ++++++++ .specledger/memory/constitution.md | 11 ++ CLAUDE.md | 2 +- 12 files changed, 379 insertions(+), 344 deletions(-) delete mode 100644 .agents/skills/skillrig-add-verify/SKILL.md delete mode 100644 .agents/skills/skillrig-add-verify/evals/evals.json delete mode 100644 .agents/skills/skillrig-init/evals/evals.json delete mode 100644 .agents/skills/skillrig-init/evals/trigger-eval-set.json create mode 100644 .agents/skills/skillrig/SKILL.md create mode 100644 .agents/skills/skillrig/evals/evals.json rename .agents/skills/{skillrig-add-verify => skillrig}/evals/trigger-eval-set.json (67%) create mode 100644 .agents/skills/skillrig/references/add.md rename .agents/skills/{skillrig-init/SKILL.md => skillrig/references/init.md} (63%) create mode 100644 .agents/skills/skillrig/references/verify.md diff --git a/.agents/skills/skillrig-add-verify/SKILL.md b/.agents/skills/skillrig-add-verify/SKILL.md deleted file mode 100644 index 
8b7787c..0000000 --- a/.agents/skills/skillrig-add-verify/SKILL.md +++ /dev/null @@ -1,154 +0,0 @@ ---- -name: skillrig-add-verify -description: >- - Vendor agent skills from your configured origin into a repo with `skillrig add`, and - prove vendored skills are exactly what was recorded with `skillrig verify`. Use this - whenever the user wants to "add/vendor/pull in a skill" from their org's skills library, - "lock" or "pin" a skill into the repo, set up a CI gate that the committed skills are - unmodified, "check/verify our skills haven't been tampered with", understand the - vendor→commit→verify round-trip, read a verify pass/fail or exit code, or debug a - `mismatch` / `orphan` / `missing` / `dirty` verdict. Trigger even when the user doesn't - name the command — e.g. "make sure nobody changed our skills", "why did the skills check - fail in CI", or "our agent skill got edited, how do I restore it". Also use to explain - that a missing backing tool is NOT a verify failure (integrity-only). -license: MIT -metadata: - author: skillrig - cli: skillrig - user-invocable: true ---- - -# skillrig-add-verify Skill - -**When to Load**: The user wants to vendor a skill from their origin (`skillrig add`), -verify the repo's vendored skills against their recorded identities (`skillrig verify`), -gate CI on that verification, or interpret/debug a verify outcome (exit codes, per-skill -verdicts) — or whenever `skillrig add` / `skillrig verify` is referenced. - -## The promise these two commands make - -> *The skill your agent runs is exactly the version that was reviewed and approved.* - -`add` records a tamper-evident fingerprint of a skill's content when it is vendored; -`verify` later recomputes that fingerprint and fails if it drifted. Both use the **same** -git tree-SHA, computed by shelling `git`, so the value written at vendor time and the value -checked at verify time **cannot diverge** — the gate cannot lie. 
This is the whole point;
-keep it intact (never hand-edit the lock, never mutate vendored files to "fix" a mismatch).
-
-## `skillrig add ` — vendor a skill (Vendor Mutation)
-
-Vendors `` from the repo's **configured origin** into the canonical
-`.agents/skills//`, byte-identical and mode-preserving (it injects nothing), and
-records its identity — `version`, `commit`, `treeSha`, `path` — in `.skillrig/skills-lock.json`.
-Offline and consume-only. **Requires a git repository** (project scope).
-
-- **Origin, not a path**: `add` resolves the active origin via the shared resolver
-  (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global) exactly like every command.
-  There is **no** `--from`/path argument.
-- **Local origin (this release)**: the configured `OWNER/REPO` is read from a local git
-  checkout at `/OWNER/REPO` (resolved against the repo root, so `add` works from
-  any subdirectory) — no network. So `init --origin my-org/my-skills` expects that library
-  checked out at `/my-org/my-skills` (keep it out of your index, e.g.
-  `echo 'my-org/' >> .git/info/exclude`). If that checkout is absent, `add` says "origin
-  checkout not found" (distinct from "skill not found").
-- **Idempotent**: re-adding identical content reports success and changes nothing
-  (`action: "unchanged"`).
-- **Never clobbers**: if the on-disk copy diverges from the recorded fingerprint, `add`
-  **refuses** without `--force` (so local edits are never lost silently). It does **not**
-  three-way-merge — re-vendoring the same version has no upstream change to merge; that is a
-  future `bump`. Use `--force` to overwrite with the origin's content, or revert your edits.
-- **`--dry-run`** previews placement + record changes and writes nothing.
-
-After `add`, **commit** `.agents/skills/` + the lock, then run `verify` — verify
-checks the *committed* tree.
-
-| Flag | Purpose |
-|------|---------|
-| `--dry-run` | Report what would be vendored/recorded; write nothing |
-| `--force` | Overwrite a vendored skill whose on-disk content diverges from the lock |
-| `--json` | Emit the complete `AddResult` on stdout |
-| `--verbose` | Show underlying paths / raw git cause behind summaries and errors |
-
-`--json` keys (always present): `ok, name, version, path, commit, treeSha, action, dryRun`;
-`action ∈ {vendored, unchanged, overwritten}`.
-
-## `skillrig verify` — prove vendored skills are unmodified (Verification Gate)
-
-Checks **this repository's** vendored skills (project scope: `.agents/skills` vs the
-committed `.skillrig/skills-lock.json`) — offline, deterministic, **read-only**. It
-aggregates **all** findings in one run (never stops at the first failure). Takes no args.
-
-Two checks:
-- **Label-honesty**: recompute each locked skill's tree-SHA from its **committed** content
-  and compare to the lock.
-- **Orphan / completeness**: the on-disk skill set under `.agents/skills` must equal the
-  locked set — an unrecorded skill (`orphan`) or a recorded-but-absent one (`missing`) fails.
-
-### Per-skill verdicts (the `status` field)
-
-| Status | Meaning | Fix |
-|--------|---------|-----|
-| `ok` | committed content matches the recorded fingerprint | — |
-| `mismatch` | committed content differs from the record (label-honesty failure) | re-`add` from origin, or restore the approved content |
-| `orphan` | on disk but no lock entry (untracked — the primary supply-chain risk) | `skillrig add` it, or remove it |
-| `missing` | lock entry whose files are absent | restore the files, or remove the lock entry |
-| `dirty` | locked + present but **uncommitted / locally modified** | commit it (verify checks committed content) — *distinct* from `mismatch` |
-
-### CRITICAL: verify is integrity-only — a missing backing tool is NOT a failure
-
-`verify` does **no** prerequisite/eligibility check.
A skill may declare `[[requires]]`
-backing tools in its `skill.toml`; if those tools are absent in the environment, `verify`
-**still passes** (it checks content, not runnability). Prerequisite checking is a future
-`doctor` concern (the reserved exit `3`), never emitted here. Don't tell a user that verify
-failed because a tool isn't installed — that's never the cause.
-
-## Exit codes (load-bearing — branch on these in CI/agents)
-
-| Code | When |
-|------|------|
-| `0` | All verdicts `ok` (**including** the empty case: no skills / no lock → clean pass) |
-| `1` | Usage/config: malformed or unreadable lock, bad flags, **not inside a git repo** |
-| `2` | Verification failure: any `mismatch`, `orphan`, `missing`, or `dirty` |
-| `3` | **Never emitted** — reserved for `doctor`'s prerequisite class |
-
-A malformed lock is a **`1`**, not a `2` — keep that distinction when scripting (a `2`
-means "content drifted"; a `1` means "I couldn't even run the check").
-
-`verify --json` keys: `ok, counts{verified,mismatch,orphan,missing,dirty}, verdicts[]` with
-each verdict carrying `name, path, status, expectedTreeSha, actualTreeSha, reason`.
-Diagnostics go to stderr, so `skillrig verify --json 2>/dev/null | jq .` stays clean JSON.
-
-## Workflow patterns
-
-1. **Vendor + lock a skill**:
-   `skillrig add terraform-plan-review` → `git add -A && git commit` → `skillrig verify`.
-2. **CI merge gate** (the headline use): run `skillrig verify` (or `--json` for an agent);
-   exit `0` proceeds, `2` blocks with a per-skill report, `1` is a setup/config problem.
-3. **Recover a tampered skill**: a `mismatch`/`dirty` verdict → re-vendor from origin with
-   `skillrig add --force`, then commit and re-verify.
-4. **Found an `orphan`**: either `skillrig add` it (record it) or delete the directory.
-5. **Preview before writing**: `skillrig add --dry-run`.
- -## Error handling - -| Symptom (stderr) | Cause | Fix | -|------------------|-------|-----| -| `no origin configured` | no `SKILLRIG_ORIGIN` / project / global origin | `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN` | -| `origin checkout not found at ` | the configured `OWNER/REPO` is not checked out locally at `/OWNER/REPO` | clone the origin there (e.g. `git clone `), or re-bind with `skillrig init` | -| `skill "" not found in origin` | the origin IS present but has no `skills//` | check the name against the origin's `skills/` | -| `refusing to overwrite ` | on-disk content diverges from the record | re-run with `--force`, or revert local edits | -| `not a git repository` | `add`/`verify` run outside a repo | run inside the repo (or `git init` first) | -| `cannot read .skillrig/skills-lock.json` | malformed/unreadable lock (exit `1`, **not** `2`) | check/repair the file, or re-vendor with `skillrig add` | - -All failures state what/why/fix; add `--verbose` for the raw underlying cause. Errors go to -stderr, data to stdout. - -## Token efficiency - -Human output is compact (a summary line per finding + a footer hint). Use `--json` only when -a program/agent will parse the verdicts; otherwise the compact human form keeps context small. - -## Related - -- `skillrig-init` — bind the repo to an origin first (this skill assumes an origin is set; - see that skill for origin references, `@REF` branch tracking, and resolution precedence). 
diff --git a/.agents/skills/skillrig-add-verify/evals/evals.json b/.agents/skills/skillrig-add-verify/evals/evals.json deleted file mode 100644 index 2b84a8d..0000000 --- a/.agents/skills/skillrig-add-verify/evals/evals.json +++ /dev/null @@ -1,63 +0,0 @@ -[ - { - "id": 1, - "name": "vendor-a-skill-roundtrip", - "description": "User wants to pull a skill from their library into the repo — should reach for `skillrig add ` and explain the vendor→commit→verify round-trip (lands in .agents/skills, records the lock, commit, then verify).", - "prompt": "Our team's skills library is already configured for this repo. I want to bring in the `terraform-plan-review` skill and make sure it's locked to exactly the approved version. How do I do that with skillrig?", - "trap": "Model invents a non-existent command (e.g. `skillrig install`/`skillrig pull`), suggests hand-editing .skillrig/skills-lock.json, copies files manually instead of `skillrig add`, expects a network fetch, or forgets that you must commit before `verify` (which checks committed content).", - "assertions": [ - { "id": "1.1", "text": "Recommends `skillrig add terraform-plan-review` to vendor from the configured origin into .agents/skills/" }, - { "id": "1.2", "text": "Explains the identity is recorded in .skillrig/skills-lock.json (version/commit/treeSha/path) — not hand-authored" }, - { "id": "1.3", "text": "Says to git-commit the vendored skill + lock, THEN run `skillrig verify` (verify checks committed content)" }, - { "id": "1.4", "text": "Does NOT invent a path/`--from` argument or claim it fetches over the network (local, consume-only this release)" } - ] - }, - { - "id": 2, - "name": "verify-integrity-only-not-prereq", - "description": "User suspects verify failed because a backing CLI is missing — should explain verify is integrity-only and a missing backing tool NEVER fails verify (prereq is a future doctor / reserved exit 3).", - "prompt": "`skillrig verify` is failing in CI and one of our skills 
declares it needs `terraform` and `oxid`, which aren't installed on the CI runner. Is the missing tool why verify fails, and should I install them to fix it?", - "trap": "Model claims verify checks prerequisites / that the missing tool causes the failure, tells the user to install the tools to fix verify, or conflates integrity (exit 2) with a prerequisite check (the reserved exit 3).", - "assertions": [ - { "id": "2.1", "text": "States a missing backing tool does NOT cause a verify failure — verify is integrity-only (checks content, not runnability)" }, - { "id": "2.2", "text": "Attributes prerequisite/eligibility checking to a future `doctor` (the reserved exit 3, never emitted by verify)" }, - { "id": "2.3", "text": "Redirects the user to the real cause: a non-zero verify is a content mismatch / orphan / missing / dirty (exit 2), or a config problem (exit 1) — inspect with `skillrig verify --json`" } - ] - }, - { - "id": 3, - "name": "ci-gate-exit-codes", - "description": "User wants to gate a merge on verify — should explain the stable exit codes (0 pass / 1 usage-config / 2 verification failure; 3 never) and branching, not prose parsing.", - "prompt": "I want our merge pipeline to block if any vendored skill has been tampered with. How do I wire `skillrig verify` into CI, and what exit codes do I branch on?", - "trap": "Model gets the exit codes wrong (e.g. says verification failure is exit 1), suggests grepping the human text instead of using the exit code or `--json`, claims a malformed lock is exit 2, or mentions exit 3 as something verify emits.", - "assertions": [ - { "id": "3.1", "text": "Says exit 0 = pass (including empty/no-skills), exit 2 = verification failure (mismatch/orphan/missing/dirty), exit 1 = usage/config (incl. 
malformed lock, not-a-git-repo)" }, - { "id": "3.2", "text": "Recommends branching on the exit code (and/or `--json`), not parsing prose" }, - { "id": "3.3", "text": "Notes a malformed/unreadable lock is exit 1 (distinct from a content failure 2), and exit 3 is never emitted by verify" } - ] - }, - { - "id": 4, - "name": "divergent-overwrite-force", - "description": "User locally edited a vendored skill and re-adds the same version — should explain add refuses without --force (no silent clobber, no three-way merge) and how to proceed.", - "prompt": "I made some local tweaks to .agents/skills/terraform-plan-review, and now `skillrig add terraform-plan-review` is erroring with something about refusing to overwrite. What's going on and how do I get the original back?", - "trap": "Model claims `add` performs a three-way merge, tells the user to hand-edit the lock, suggests deleting the lock entry, or implies add will silently overwrite their edits.", - "assertions": [ - { "id": "4.1", "text": "Explains add refuses because the on-disk content diverges from the recorded fingerprint (it never silently clobbers local edits)" }, - { "id": "4.2", "text": "Says `skillrig add terraform-plan-review --force` overwrites with the origin's approved content (or revert the local edits to proceed)" }, - { "id": "4.3", "text": "Does NOT claim add three-way-merges the local edits (re-vendoring the same version has no upstream change to merge — that is a future `bump`)" } - ] - }, - { - "id": 5, - "name": "dirty-vs-mismatch-verdict", - "description": "User gets a `dirty` verdict — should explain it means uncommitted/locally-modified (commit it), distinct from a content `mismatch`, because verify checks the committed tree.", - "prompt": "`skillrig verify --json` shows one of my skills with status `dirty` rather than `ok` or `mismatch`. I haven't changed the recorded version on purpose. 
What does dirty mean and how do I clear it?", - "trap": "Model conflates `dirty` with `mismatch`, tells the user the recorded fingerprint is wrong, or suggests editing the lock — instead of recognizing that verify checks the COMMITTED tree and the working copy has uncommitted changes.", - "assertions": [ - { "id": "5.1", "text": "Explains `dirty` = the vendored skill has uncommitted/locally-modified content (verify checks the committed tree)" }, - { "id": "5.2", "text": "Says to commit the vendored files (or discard the local changes) so the committed tree matches the record, then re-verify" }, - { "id": "5.3", "text": "Distinguishes `dirty` from `mismatch` (mismatch = committed content differs from the record; dirty = not yet committed) and does NOT suggest editing the lock" } - ] - } -] diff --git a/.agents/skills/skillrig-init/evals/evals.json b/.agents/skills/skillrig-init/evals/evals.json deleted file mode 100644 index d8f8ee7..0000000 --- a/.agents/skills/skillrig-init/evals/evals.json +++ /dev/null @@ -1,49 +0,0 @@ -[ - { - "id": 1, - "name": "bind-repo-to-origin", - "description": "User asks to point a repo at the team's skills library — should reach for `skillrig init --origin OWNER/REPO` (project scope) and explain it lands at the git root.", - "prompt": "I want this repository to use our team's shared agent skills, which live at acme/agent-skills. How do I set that up with skillrig?", - "trap": "Model invents a non-existent command (e.g. 
`skillrig config set`), suggests editing config.toml by hand instead of `skillrig init`, or claims it scaffolds/creates the origin.", - "assertions": [ - { "id": "1.1", "text": "Recommends `skillrig init --origin acme/agent-skills` (origin in OWNER/REPO form)" }, - { "id": "1.2", "text": "Explains the project config is written at the git repository root (.skillrig/config.toml), not necessarily the current subdirectory" }, - { "id": "1.3", "text": "Does NOT claim init creates/scaffolds the origin — it binds an existing one (consume-only)" } - ] - }, - { - "id": 2, - "name": "ci-agent-non-interactive", - "description": "Automated/CI context — should pass --origin (or SKILLRIG_ORIGIN) AND --non-interactive so the command never blocks on a prompt.", - "prompt": "I'm running skillrig in CI where there's no terminal. How do I bind the origin so it never hangs waiting for input?", - "trap": "Model relies on TTY auto-detection alone, omits --non-interactive, or suggests piping input to the prompt instead of failing fast.", - "assertions": [ - { "id": "2.1", "text": "Recommends passing --origin OWNER/REPO explicitly (or setting SKILLRIG_ORIGIN)" }, - { "id": "2.2", "text": "Recommends --non-interactive to force fail-fast and never prompt, even if a TTY is present" } - ] - }, - { - "id": 3, - "name": "no-origin-and-precedence", - "description": "User hits a 'no origin configured' error and asks how resolution works — should explain SKILLRIG_ORIGIN > project > global and the concrete fixes.", - "prompt": "A skillrig command failed with 'no origin configured'. 
How does skillrig decide which origin to use, and how do I fix this?", - "trap": "Model gets the precedence order wrong, forgets the SKILLRIG_ORIGIN env override, or omits the --global fallback option.", - "assertions": [ - { "id": "3.1", "text": "States precedence SKILLRIG_ORIGIN > project .skillrig/config.toml > global config (highest wins)" }, - { "id": "3.2", "text": "Offers concrete fixes: run `skillrig init --origin OWNER/REPO`, set SKILLRIG_ORIGIN, or set a --global default" }, - { "id": "3.3", "text": "Notes the project config is found by walking up from the current directory (works from subdirectories)" } - ] - }, - { - "id": 4, - "name": "track-a-branch", - "description": "User wants the repo to follow a specific branch of the skills library — should append @REF to the origin (OWNER/REPO@branch), not invent a separate flag.", - "prompt": "Our skills repo acme/agent-skills has a 'staging' branch we want to try before it merges. How do I point this repo at that branch with skillrig?", - "trap": "Model invents a non-existent flag (e.g. --branch/--ref), claims skillrig verifies the branch exists over the network, or conflates this with pinning an individual skill to a tag/SHA.", - "assertions": [ - { "id": "4.1", "text": "Recommends `skillrig init --origin acme/agent-skills@staging` (the @REF suffix on the single --origin string)" }, - { "id": "4.2", "text": "Does NOT invent a separate --branch/--ref flag; the ref rides inside the origin reference" }, - { "id": "4.3", "text": "Notes the @REF is validated for shape only / offline — existence on the remote is not checked" } - ] - } -] diff --git a/.agents/skills/skillrig-init/evals/trigger-eval-set.json b/.agents/skills/skillrig-init/evals/trigger-eval-set.json deleted file mode 100644 index 50166fc..0000000 --- a/.agents/skills/skillrig-init/evals/trigger-eval-set.json +++ /dev/null @@ -1,13 +0,0 @@ -[ - { "query": "I want this repository to use our team's shared agent skills at acme/agent-skills. 
How do I set that up?", "should_trigger": true }, - { "query": "How do I point this repo at our skills library with skillrig?", "should_trigger": true }, - { "query": "How do I set the skillrig origin for my project?", "should_trigger": true }, - { "query": "skillrig says 'no origin configured' — how do I fix that?", "should_trigger": true }, - { "query": "What's the precedence between SKILLRIG_ORIGIN and the project config file?", "should_trigger": true }, - { "query": "How do I set a personal default skills origin used across all my repos?", "should_trigger": true }, - { "query": "I want this repo to track the staging branch of our skills repo acme/agent-skills with skillrig — how?", "should_trigger": true }, - { "query": "Can skillrig point at a specific branch of the skills library instead of the default branch?", "should_trigger": true }, - { "query": "Write a table-driven unit test for a Go parsing function.", "should_trigger": false }, - { "query": "How do I rebase my feature branch onto main and resolve conflicts?", "should_trigger": false }, - { "query": "Configure golangci-lint to enable the gosec linter.", "should_trigger": false } -] diff --git a/.agents/skills/skillrig/SKILL.md b/.agents/skills/skillrig/SKILL.md new file mode 100644 index 0000000..4e3f7cf --- /dev/null +++ b/.agents/skills/skillrig/SKILL.md @@ -0,0 +1,88 @@ +--- +name: skillrig +description: >- + Point a repository at your org's agent-skills library and manage vendored skills with the + `skillrig` CLI — bind/choose the origin (`init`), vendor/add a skill (`add`), and + verify/check that committed skills are exactly what was approved (`verify`). 
Use whenever + the user wants to find, install, add, vendor, pull in, lock, or pin an agent skill from a + skills library; set/configure where skills come from or fix a "no origin configured" error; + point a repo at a skills repo (OWNER/REPO[@branch]) or use SKILLRIG_ORIGIN; or verify / + check / audit that vendored skills haven't been tampered with (a CI gate) and debug + add/verify errors or `mismatch`/`orphan`/`missing`/`dirty` verdicts. Vendoring requires an + origin, so `skillrig init` comes first. Trigger even when the command isn't named — e.g. + "point this repo at our skills", "pull in the terraform-review skill", "make sure nobody + changed our skills", "why did the skills check fail in CI", "our agent skill got edited". +license: MIT +metadata: + author: skillrig + cli: skillrig + user-invocable: true +--- + +# skillrig + +`skillrig` is a single, generic, **consume-only** CLI for pointing a repo at an **origin** — +the `OWNER/REPO` that hosts your org's agent skills — and managing the skills vendored from +it. The same binary serves humans, agents, and CI. There is no publish/login: GitHub is the +authority plane ("publishing" = a PR to the origin). + +**The promise:** *the skill your agent runs is exactly the version that was reviewed and +approved.* `add` records a tamper-evident git tree-SHA when a skill is vendored; `verify` +recomputes it and fails if it drifted — same primitive on both sides, so the gate cannot lie. + +## When to use this skill + +Use it whenever the user wants to **find / install / add / vendor / pull in** an agent skill +from a library, **set or fix the origin** ("point this repo at our skills", "no origin +configured"), or **verify / check / audit** that vendored skills are unmodified (a CI gate), +including debugging `add`/`verify` output. 
Three activities, three commands: + +| Activity | Command | Read | +|---|---|---| +| Choose where skills come from (bind the origin) | `skillrig init` | [references/init.md](references/init.md) | +| Vendor a skill into the repo (+ lock its identity) | `skillrig add ` | [references/add.md](references/add.md) | +| Prove vendored skills match what was approved | `skillrig verify` | [references/verify.md](references/verify.md) | + +Load only the reference for the activity at hand. + +## Prerequisite: an origin must be configured (run `init` first) + +`add` needs to know **where** skills come from, so a configured origin is a precondition. +**Smoketest before vendoring:** is an origin resolvable? + +- project: `.skillrig/config.toml` exists (at the git repo root), **or** +- env: `$SKILLRIG_ORIGIN` is set, **or** +- global: `~/.config/skillrig/config.toml` exists. + +Precedence (highest wins): `SKILLRIG_ORIGIN` > project config > global. If none resolve, +`add` fails with `no origin configured` — run `skillrig init --origin OWNER/REPO` first +(see [references/init.md](references/init.md)). **`verify` needs no origin** (it reads the +committed lock + tree, offline). + +## The typical workflow + +``` +skillrig init --origin my-org/my-skills # 1. bind the origin (once per repo) +skillrig add terraform-plan-review # 2. vendor a skill into .agents/skills/ +git add -A && git commit -m "vendor skill" # 3. commit (verify checks committed content) +skillrig verify # 4. prove it matches the recorded version (CI gate) +``` + +Exit codes are load-bearing for CI/agents: `0` ok · `1` usage/config · `2` verification +failure · `3` reserved (never emitted). Errors are what/why/fix on stderr; `--json` is the +complete machine view on stdout; `--verbose` shows the raw cause. Details per command in the +references. 
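
The exit-code contract above can be sketched as a small CI gate. This is a minimal sketch under stated assumptions, not part of skillrig: `gate_decision` and its output words are hypothetical names chosen for illustration; only the meanings of codes `0`, `1`, `2` and the reserved `3` come from the text.

```shell
#!/bin/sh
# Hypothetical helper (not part of skillrig): map a `skillrig verify` exit
# code to a CI decision, using the meanings documented above.
gate_decision() {
  case "$1" in
    0) echo "proceed" ;;      # all verdicts ok (including the empty case)
    1) echo "fix-setup" ;;    # usage/config problem: the check could not run
    2) echo "block" ;;        # verification failure (content drifted)
    *) echo "unexpected" ;;   # 3 is reserved and never emitted by verify
  esac
}

# Usage in a pipeline step (assumes `skillrig` is on PATH):
#   skillrig verify; decision=$(gate_decision "$?")
#   [ "$decision" = "proceed" ] || exit 1
```

Branching on the numeric code rather than on the human-readable output keeps the gate stable if the summary wording changes.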
+ +## Not here yet + +**Discovery / search is the next planned feature.** Listing or searching the *approved* +skills available in your configured origin (a `search`/index command) does **not exist yet** — +until it lands, vendor a skill by its **known name** with `add`. Note the scope: `skillrig`'s +"find" means *find an approved skill in **your origin***, which is distinct from the generic +`find-skills` skill (discovering skills from anywhere). So a request to "find/install a skill +from our library" is `skillrig`; "what skills exist out there for X?" is `find-skills`. + +Also designed but **not implemented** (don't assume they exist): remote/network fetch + auth, +immutable per-skill `--pin`, multi-client symlink views, and a prerequisite/health `doctor` +(the reserved exit `3`). This slice is `init` + `add` (from a local origin checkout) + +`verify`. diff --git a/.agents/skills/skillrig/evals/evals.json b/.agents/skills/skillrig/evals/evals.json new file mode 100644 index 0000000..8309eb1 --- /dev/null +++ b/.agents/skills/skillrig/evals/evals.json @@ -0,0 +1,110 @@ +[ + { + "id": 1, + "name": "bind-repo-to-origin", + "description": "User asks to point a repo at the team's skills library — should reach for `skillrig init --origin OWNER/REPO` (project scope) and explain it lands at the git root.", + "prompt": "I want this repository to use our team's shared agent skills, which live at acme/agent-skills. How do I set that up with skillrig?", + "trap": "Model invents a non-existent command (e.g. 
`skillrig config set`), suggests editing config.toml by hand instead of `skillrig init`, or claims it scaffolds/creates the origin.", + "assertions": [ + { "id": "1.1", "text": "Recommends `skillrig init --origin acme/agent-skills` (origin in OWNER/REPO form)" }, + { "id": "1.2", "text": "Explains the project config is written at the git repository root (.skillrig/config.toml)" }, + { "id": "1.3", "text": "Does NOT claim init creates/scaffolds the origin — it binds an existing one (consume-only)" } + ] + }, + { + "id": 2, + "name": "ci-agent-non-interactive", + "description": "Automated/CI context — should pass --origin (or SKILLRIG_ORIGIN) AND --non-interactive so the command never blocks on a prompt.", + "prompt": "I'm running skillrig in CI where there's no terminal. How do I bind the origin so it never hangs waiting for input?", + "trap": "Model relies on TTY auto-detection alone, omits --non-interactive, or suggests piping input to the prompt instead of failing fast.", + "assertions": [ + { "id": "2.1", "text": "Recommends passing --origin OWNER/REPO explicitly (or setting SKILLRIG_ORIGIN)" }, + { "id": "2.2", "text": "Recommends --non-interactive to force fail-fast and never prompt, even if a TTY is present" } + ] + }, + { + "id": 3, + "name": "no-origin-and-precedence", + "description": "User hits a 'no origin configured' error and asks how resolution works — should explain SKILLRIG_ORIGIN > project > global and the concrete fixes.", + "prompt": "A skillrig command failed with 'no origin configured'. 
How does skillrig decide which origin to use, and how do I fix this?", + "trap": "Model gets the precedence order wrong, forgets the SKILLRIG_ORIGIN env override, or omits the --global fallback option.", + "assertions": [ + { "id": "3.1", "text": "States precedence SKILLRIG_ORIGIN > project .skillrig/config.toml > global config (highest wins)" }, + { "id": "3.2", "text": "Offers concrete fixes: run `skillrig init --origin OWNER/REPO`, set SKILLRIG_ORIGIN, or set a --global default" }, + { "id": "3.3", "text": "Notes the project config is found by walking up from the current directory (works from subdirectories)" } + ] + }, + { + "id": 4, + "name": "track-a-branch", + "description": "User wants the repo to follow a specific branch of the skills library — should append @REF to the origin (OWNER/REPO@branch), not invent a separate flag.", + "prompt": "Our skills repo acme/agent-skills has a 'staging' branch we want to try before it merges. How do I point this repo at that branch with skillrig?", + "trap": "Model invents a non-existent flag (e.g. --branch/--ref), claims skillrig verifies the branch exists over the network, or conflates this with pinning an individual skill to a tag/SHA.", + "assertions": [ + { "id": "4.1", "text": "Recommends `skillrig init --origin acme/agent-skills@staging` (the @REF suffix on the single --origin string)" }, + { "id": "4.2", "text": "Does NOT invent a separate --branch/--ref flag; the ref rides inside the origin reference" }, + { "id": "4.3", "text": "Notes the @REF is validated for shape only / offline — existence on the remote is not checked" } + ] + }, + { + "id": 5, + "name": "vendor-a-skill-roundtrip", + "description": "User wants to pull a skill from their library into the repo — should reach for `skillrig add ` and explain the vendor→commit→verify round-trip.", + "prompt": "Our team's skills library is already configured for this repo. 
I want to bring in the `terraform-plan-review` skill and make sure it's locked to exactly the approved version. How do I do that with skillrig?", + "trap": "Model invents a non-existent command (e.g. `skillrig install`/`skillrig pull`), suggests hand-editing the lock, copies files manually, expects a network fetch, or forgets you must commit before `verify`.", + "assertions": [ + { "id": "5.1", "text": "Recommends `skillrig add terraform-plan-review` to vendor from the configured origin into .agents/skills/" }, + { "id": "5.2", "text": "Explains the identity is recorded in .skillrig/skills-lock.json (version/commit/treeSha/path), not hand-authored" }, + { "id": "5.3", "text": "Says to git-commit the vendored skill + lock, THEN run `skillrig verify` (verify checks committed content)" }, + { "id": "5.4", "text": "Does NOT invent a path/`--from` argument or claim it fetches over the network (local, consume-only this release)" } + ] + }, + { + "id": 6, + "name": "verify-integrity-only-not-prereq", + "description": "User suspects verify failed because a backing CLI is missing — should explain verify is integrity-only and a missing backing tool NEVER fails verify (prereq is a future doctor / reserved exit 3).", + "prompt": "`skillrig verify` is failing in CI and one of our skills declares it needs `terraform` and `oxid`, which aren't installed on the CI runner. 
Is the missing tool why verify fails, and should I install them to fix it?", + "trap": "Model claims verify checks prerequisites / that the missing tool causes the failure, tells the user to install the tools to fix verify, or conflates integrity (exit 2) with a prerequisite check (exit 3).", + "assertions": [ + { "id": "6.1", "text": "States a missing backing tool does NOT cause a verify failure — verify is integrity-only (checks content, not runnability)" }, + { "id": "6.2", "text": "Attributes prerequisite/eligibility checking to a future `doctor` (the reserved exit 3, never emitted by verify)" }, + { "id": "6.3", "text": "Redirects to the real cause: a non-zero verify is mismatch/orphan/missing/dirty (exit 2) or a config problem (exit 1) — inspect with `skillrig verify --json`" } + ] + }, + { + "id": 7, + "name": "ci-gate-exit-codes", + "description": "User wants to gate a merge on verify — should explain the stable exit codes (0 pass / 1 usage-config / 2 verification failure; 3 never) and branching, not prose parsing.", + "prompt": "I want our merge pipeline to block if any vendored skill has been tampered with. How do I wire `skillrig verify` into CI, and what exit codes do I branch on?", + "trap": "Model gets the exit codes wrong (e.g. verification failure as exit 1), suggests grepping the human text instead of the exit code or --json, claims a malformed lock is exit 2, or mentions exit 3 as something verify emits.", + "assertions": [ + { "id": "7.1", "text": "Says exit 0 = pass (incl. empty), exit 2 = verification failure (mismatch/orphan/missing/dirty), exit 1 = usage/config (incl. 
malformed lock, not-a-git-repo)" }, + { "id": "7.2", "text": "Recommends branching on the exit code (and/or --json), not parsing prose" }, + { "id": "7.3", "text": "Notes a malformed/unreadable lock is exit 1 (distinct from a content failure 2), and exit 3 is never emitted by verify" } + ] + }, + { + "id": 8, + "name": "divergent-overwrite-force", + "description": "User locally edited a vendored skill and re-adds the same version — should explain add refuses without --force (no silent clobber, no three-way merge) and how to proceed.", + "prompt": "I made some local tweaks to .agents/skills/terraform-plan-review, and now `skillrig add terraform-plan-review` is erroring with something about refusing to overwrite. What's going on and how do I get the original back?", + "trap": "Model claims `add` performs a three-way merge, tells the user to hand-edit the lock, suggests deleting the lock entry, or implies add will silently overwrite their edits.", + "assertions": [ + { "id": "8.1", "text": "Explains add refuses because on-disk content diverges from the recorded fingerprint (never silently clobbers local edits)" }, + { "id": "8.2", "text": "Says `skillrig add terraform-plan-review --force` overwrites with the origin's approved content (or revert the edits)" }, + { "id": "8.3", "text": "Does NOT claim add three-way-merges (re-vendoring the same version has no upstream change to merge — that is a future `bump`)" } + ] + }, + { + "id": 9, + "name": "dirty-vs-mismatch-verdict", + "description": "User gets a `dirty` verdict — should explain it means uncommitted/locally-modified (commit it), distinct from a content `mismatch`, because verify checks the committed tree.", + "prompt": "`skillrig verify --json` shows one of my skills with status `dirty` rather than `ok` or `mismatch`. I haven't changed the recorded version on purpose. 
What does dirty mean and how do I clear it?", + "trap": "Model conflates `dirty` with `mismatch`, tells the user the recorded fingerprint is wrong, or suggests editing the lock — instead of recognizing verify checks the COMMITTED tree and the working copy has uncommitted changes.", + "assertions": [ + { "id": "9.1", "text": "Explains `dirty` = the vendored skill has uncommitted/locally-modified content (verify checks the committed tree)" }, + { "id": "9.2", "text": "Says to commit the vendored files (or discard the local changes), then re-verify" }, + { "id": "9.3", "text": "Distinguishes `dirty` from `mismatch` (mismatch = committed content differs from the record) and does NOT suggest editing the lock" } + ] + } +] diff --git a/.agents/skills/skillrig-add-verify/evals/trigger-eval-set.json b/.agents/skills/skillrig/evals/trigger-eval-set.json similarity index 67% rename from .agents/skills/skillrig-add-verify/evals/trigger-eval-set.json rename to .agents/skills/skillrig/evals/trigger-eval-set.json index 26bcf01..6d92e8b 100644 --- a/.agents/skills/skillrig-add-verify/evals/trigger-eval-set.json +++ b/.agents/skills/skillrig/evals/trigger-eval-set.json @@ -1,22 +1,25 @@ [ - { "query": "Our skills library is configured for this repo — how do I vendor the terraform-plan-review skill into it with skillrig?", "should_trigger": true }, + { "query": "I want this repository to use our team's shared agent skills at acme/agent-skills. 
How do I set that up?", "should_trigger": true }, + { "query": "How do I point this repo at our skills library with skillrig?", "should_trigger": true }, + { "query": "skillrig says 'no origin configured' — how do I fix that?", "should_trigger": true }, + { "query": "What's the precedence between SKILLRIG_ORIGIN and the project config file?", "should_trigger": true }, + { "query": "How do I set a personal default skills origin used across all my repos?", "should_trigger": true }, + { "query": "I want this repo to track the staging branch of our skills repo acme/agent-skills — how?", "should_trigger": true }, + { "query": "Our skills library is configured for this repo — how do I vendor the terraform-plan-review skill into it?", "should_trigger": true }, { "query": "I want to pull in an agent skill from our org library and lock it to the approved version. How?", "should_trigger": true }, + { "query": "I re-ran skillrig add and it says 'refusing to overwrite' — how do I force it to take the origin's version?", "should_trigger": true }, + { "query": "someone changed .agents/skills/terraform-plan-review locally and now I want to restore the original from our library", "should_trigger": true }, { "query": "How do I add a CI check that the skills committed in our repo haven't been modified from what was approved?", "should_trigger": true }, { "query": "skillrig verify failed in CI and exited 2 — what does that mean and which skills are the problem?", "should_trigger": true }, { "query": "make sure nobody secretly edited one of our agent skills before we merge", "should_trigger": true }, { "query": "verify is reporting one of my skills as 'orphan' and another as 'dirty' — what do those mean and how do I fix them?", "should_trigger": true }, - { "query": "someone changed .agents/skills/terraform-plan-review locally and now I want to restore the original from our library", "should_trigger": true }, { "query": "does skillrig verify fail if the CLI a skill needs (like 
terraform) isn't installed on the runner?", "should_trigger": true }, - { "query": "I re-ran skillrig add and it says 'refusing to overwrite' — how do I force it to take the origin's version?", "should_trigger": true }, - { "query": "what's the difference between a mismatch and a dirty verdict in skillrig verify?", "should_trigger": true }, - { "query": "How do I point this repo at our team's skills library acme/agent-skills with skillrig?", "should_trigger": false }, - { "query": "skillrig says 'no origin configured' — how do I set the origin?", "should_trigger": false }, - { "query": "What's the precedence between SKILLRIG_ORIGIN and the project config file?", "should_trigger": false }, - { "query": "I want this repo to track the staging branch of our skills repo with skillrig init", "should_trigger": false }, { "query": "verify the SHA-256 checksum of this tarball I downloaded matches the release page", "should_trigger": false }, { "query": "how do I check that my last git commit is GPG-signed and verified on GitHub?", "should_trigger": false }, { "query": "configure golangci-lint to enable the gosec linter for our Go CLI", "should_trigger": false }, { "query": "write a table-driven unit test for a Go function that parses OWNER/REPO references", "should_trigger": false }, { "query": "how do I vendor my Go module dependencies with go mod vendor?", "should_trigger": false }, - { "query": "add a new skill to my Claude Code setup by writing a SKILL.md from scratch", "should_trigger": false } + { "query": "help me author a brand-new SKILL.md from scratch for a Claude Code skill I'm writing", "should_trigger": false }, + { "query": "find me a skill that can help write better git commit messages", "should_trigger": false }, + { "query": "what's the git command to see which files changed in the last commit?", "should_trigger": false } ] diff --git a/.agents/skills/skillrig/references/add.md b/.agents/skills/skillrig/references/add.md new file mode 100644 index 
0000000..921d763 --- /dev/null +++ b/.agents/skills/skillrig/references/add.md @@ -0,0 +1,64 @@ +# `skillrig add ` — vendor a skill (Vendor Mutation) + +> Bring a skill from your **configured origin** into the repo and record its identity. +> Needs an origin (run [init](init.md) first). After `add`, commit, then [verify](verify.md). + +Vendors `` into the canonical `.agents/skills//`, byte-identical and +mode-preserving (it injects nothing), and records its identity — `version`, `commit`, +`treeSha`, `path` — in `.skillrig/skills-lock.json`. Offline and consume-only. +**Requires a git repository** (project scope). The recorded `treeSha` is the git tree-SHA +`verify` later recomputes, so the two cannot drift (the gate cannot lie) — never hand-edit +the lock. + +- **Origin, not a path**: `add` resolves the active origin via the shared resolver + (`SKILLRIG_ORIGIN` > project `.skillrig/config.toml` > global). There is **no** + `--from`/path argument. +- **Local origin (this release)**: the configured `OWNER/REPO` is read from a local git + checkout at `/OWNER/REPO` (resolved against the repo root, so `add` works from + **any subdirectory**) — no network. So `init --origin my-org/my-skills` expects that + library checked out at `/my-org/my-skills` (keep it out of your index, e.g. + `echo 'my-org/' >> .git/info/exclude`). If that checkout is absent, `add` says **"origin + checkout not found"** (distinct from "skill not found"). +- **Idempotent**: re-adding identical content reports success and changes nothing + (`action: "unchanged"`). +- **Never clobbers**: if the on-disk copy diverges from the recorded fingerprint, `add` + **refuses** without `--force` (so local edits are never lost silently). It does **not** + three-way-merge — re-vendoring the same version has no upstream change to merge (that is a + future `bump`). Use `--force` to overwrite with the origin's content, or revert your edits. +- **`--dry-run`** previews placement + record changes and writes nothing. 
+ +After `add`, **commit** `.agents/skills/` + the lock, then run `verify` — it checks +the *committed* tree. + +| Flag | Purpose | +|------|---------| +| `--dry-run` | Report what would be vendored/recorded; write nothing | +| `--force` | Overwrite a vendored skill whose on-disk content diverges from the lock | +| `--json` | Emit the complete `AddResult` on stdout | +| `--verbose` | Show underlying paths / raw git cause behind summaries and errors | + +`--json` keys (always present): `ok, name, version, path, commit, treeSha, action, dryRun`; +`action ∈ {vendored, unchanged, overwritten}`. + +## Workflow patterns + +1. **Vendor + lock**: `skillrig add terraform-plan-review` → `git add -A && git commit` → + `skillrig verify`. +2. **Recover a tampered skill** (a `mismatch`/`dirty` verdict from verify): re-vendor with + `skillrig add --force`, then commit and re-verify. +3. **Adopt an `orphan`**: `skillrig add ` records an on-disk-but-unlocked skill. +4. **Preview**: `skillrig add --dry-run`. + +## Error handling + +| Symptom (stderr) | Cause | Fix | +|------------------|-------|-----| +| `no origin configured` | no `SKILLRIG_ORIGIN` / project / global origin | `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN` | +| `origin checkout not found at ` | the configured `OWNER/REPO` is not checked out locally at `/OWNER/REPO` | clone the origin there (`git clone `), or re-bind with `skillrig init` | +| `skill "" not found in origin` | the origin IS present but has no `skills//` | check the name against the origin's `skills/` | +| `refusing to overwrite ` | on-disk content diverges from the record | re-run with `--force`, or revert local edits | +| `not a git repository` | run outside a repo | run inside the repo (or `git init` first) | +| bad/missing args | wrong invocation | the error states what/why/fix + an example; or `skillrig add --help` | + +All failures state what/why/fix and exit `1`; add `--verbose` for the raw cause. 
Errors go to +stderr, data to stdout. diff --git a/.agents/skills/skillrig-init/SKILL.md b/.agents/skills/skillrig/references/init.md similarity index 63% rename from .agents/skills/skillrig-init/SKILL.md rename to .agents/skills/skillrig/references/init.md index 8b82bad..2e40ec8 100644 --- a/.agents/skills/skillrig-init/SKILL.md +++ b/.agents/skills/skillrig/references/init.md @@ -1,26 +1,6 @@ ---- -name: skillrig-init -description: >- - Bind a repository (or your per-user default) to a skill origin with `skillrig init`, - and understand how skillrig resolves the active origin. Use when the user wants to - "point this repo at our skills library", "set the origin", configure where skills come - from, set up `skillrig` in a repo, choose between a project vs global default origin, - track a specific branch of the origin (OWNER/REPO@branch), use the `SKILLRIG_ORIGIN` - environment variable, or fix a "no origin configured" / "no origin given" error. - Triggers on `skillrig init`, origin binding (OWNER/REPO[@REF]), pointing at a branch - of the skills repo, and origin-resolution precedence questions. -license: MIT -metadata: - author: skillrig - cli: skillrig - user-invocable: true ---- - -# skillrig-init Skill - -**When to Load**: The user wants to point a repository at an existing skills origin -(`OWNER/REPO`), set a personal default, configure `SKILLRIG_ORIGIN`, or resolve a -"no origin configured" failure — or whenever `skillrig init` is referenced. +# `skillrig init` — bind the repo to an origin + +> Choose **where** skills come from. Run this first; `add` needs a configured origin. ## Overview @@ -33,9 +13,9 @@ and binding the same origin twice is a no-op. The origin reference is `OWNER/REPO[@REF]`. The optional `@REF` tracks a specific **branch** of the library (e.g. `my-org/my-skills@staging`); omit it to track the default branch. 
The `@REF` is validated for shape only (offline) — it is **not** checked -against the remote — and is stored combined in the single `origin` key. (Note: an -origin's `@ref` is a moving branch pointer; pinning an individual *skill* to an immutable -tag/SHA is a separate, later concern.) +against the remote — and is stored combined in the single `origin` key. (An origin's +`@ref` is a moving branch pointer; pinning an individual *skill* to an immutable tag/SHA +is a separate, later concern.) It writes one of two config files: @@ -48,7 +28,7 @@ It writes one of two config files: `git` must be on `PATH` (used only for the offline repo-root lookup). -## Command Surface +## Command surface | Flag | Purpose | When to use | |------|---------|-------------| @@ -58,7 +38,7 @@ It writes one of two config files: | `--json` | Emit a complete result object on stdout | Machine consumption | | `--verbose` | Show underlying paths / raw cause behind summaries and errors | Debugging a failure | -## Decision Criteria +## Decision criteria - **Project vs global**: bind the repo (no `--global`) so the repo is self-describing and teammates resolve the same origin. Use `--global` only for a personal fallback. @@ -69,7 +49,7 @@ It writes one of two config files: - **`SKILLRIG_ORIGIN`**: prefer this env var for one-off overrides (e.g. CI) — it beats both config files without editing anything on disk. -## Resolution Precedence +## Resolution precedence Every command resolves the active origin with one rule (highest wins): @@ -81,7 +61,7 @@ SKILLRIG_ORIGIN > project .skillrig/config.toml (nearest ancestor) > global - A malformed or origin-less config file is **skipped**, and resolution continues down the order — it is not a hard failure. - When no source supplies an origin, that is the "no origin configured" state the user - must fix (see Error Handling). + must fix (see Error handling). 
The project lookup walks **up** from the working directory, so any subdirectory of a bound repo resolves the same origin. @@ -90,22 +70,21 @@ bound repo resolves the same origin. `init` records only an `OWNER/REPO[@REF]` **reference** — never a filesystem path (passing a path fails with `invalid origin … expected OWNER/REPO[@REF]`). In this release there is no -network fetch, so when a later command (`skillrig add`) needs the origin's files it reads -them from a **local git checkout at `/OWNER/REPO`** (resolved against the repo -root, so it works from any subdirectory). So to vendor from a local copy of `my-org/my-skills`, -from the repo root: +network fetch, so when `skillrig add` needs the origin's files it reads them from a **local +git checkout at `/OWNER/REPO`** (resolved against the repo root, so it works from +any subdirectory). To vendor from a local copy of `my-org/my-skills`, from the repo root: ``` skillrig init --origin my-org/my-skills # records the reference -git clone my-org/my-skills # the checkout add reads from (./my-org/my-skills) +git clone my-org/my-skills # the checkout add reads from echo 'my-org/' >> .git/info/exclude # keep it out of your repo's index -skillrig add # reads ./my-org/my-skills/skills// +skillrig add # reads /my-org/my-skills/skills// ``` `@REF` selects the revision (default `HEAD`). Fetching a remote origin over the network is a -later, additive mode. See `skillrig add --help` for the vendoring side. +later, additive mode. See [add.md](add.md) for the vendoring side. -## JSON Output +## JSON output `skillrig init --origin my-org/my-skills --json` emits a single object with all keys present; branch on `written` to tell a fresh bind from an idempotent no-op: @@ -116,31 +95,24 @@ present; branch on `written` to tell a fresh bind from an idempotent no-op: `scope` is `project` or `global`; `written` is `false` when the origin was already bound. -## Workflow Patterns +## Workflow patterns 1. 
**Bind a repo**: `skillrig init --origin my-org/my-skills` → run from anywhere in the repo; config lands at the repo root. -2. **Track a branch**: `skillrig init --origin my-org/my-skills@staging` → records the - origin pinned to the `staging` branch (stored as `origin = 'my-org/my-skills@staging'`). +2. **Track a branch**: `skillrig init --origin my-org/my-skills@staging`. 3. **Personal default**: `skillrig init --origin my-org/my-skills --global`. -4. **CI / agent**: pass `--origin` (or set `SKILLRIG_ORIGIN`) **and** `--non-interactive` - so the command never prompts. +4. **CI / agent**: pass `--origin` (or set `SKILLRIG_ORIGIN`) **and** `--non-interactive`. 5. **One-off override**: `SKILLRIG_ORIGIN=ci-org/ci-skills skillrig ` — no file edit. -## Error Handling +## Error handling | Symptom (stderr) | Cause | Fix | |------------------|-------|-----| -| `invalid origin "": expected OWNER/REPO[@REF]` | Origin (or its `@REF`) not in `OWNER/REPO[@REF]` shape | Pass a valid `--origin my-org/my-skills` (or `--origin my-org/my-skills@main`) | -| `no origin given … non-interactive session (no TTY)` | `init` run without `--origin` and stdin is not a terminal | Pass `--origin OWNER/REPO` or set `SKILLRIG_ORIGIN` | +| `invalid origin "": expected OWNER/REPO[@REF]` | Origin (or `@REF`) not in `OWNER/REPO[@REF]` shape | Pass `--origin my-org/my-skills` (or `…@main`) | +| `no origin given … non-interactive session (no TTY)` | `init` run without `--origin`, stdin not a terminal | Pass `--origin OWNER/REPO` or set `SKILLRIG_ORIGIN` | | `no origin given … non-interactive mode requested (--non-interactive)` | `--non-interactive` set but no `--origin` | Pass `--origin OWNER/REPO` or set `SKILLRIG_ORIGIN` | -| "no origin configured" from a later command | No source supplied an origin | Run `skillrig init --origin OWNER/REPO`, or set `SKILLRIG_ORIGIN`, or add a `--global` default | - -All failures exit non-zero (usage/config errors exit `1`); add `--verbose` to see the -raw cause 
behind the message. - -## Token Efficiency +| `no origin configured` (from a later command) | No source supplied an origin | `skillrig init --origin OWNER/REPO`, set `SKILLRIG_ORIGIN`, or add a `--global` default | -Default human output is ≤2 lines (a confirmation plus a one-line resolve-order hint). -Use `--json` only when a program will parse the result; otherwise the compact human form -keeps context small. +All failures exit `1` (usage/config); add `--verbose` for the raw cause. Default human +output is ≤2 lines (a confirmation + a one-line resolve-order hint); use `--json` only when +a program will parse it. diff --git a/.agents/skills/skillrig/references/verify.md b/.agents/skills/skillrig/references/verify.md new file mode 100644 index 0000000..fac89d8 --- /dev/null +++ b/.agents/skills/skillrig/references/verify.md @@ -0,0 +1,66 @@ +# `skillrig verify` — prove vendored skills are unmodified (Verification Gate) + +> Offline, deterministic, read-only integrity gate. **Needs no origin.** The headline CI use. + +Checks **this repository's** vendored skills (project scope: `.agents/skills` vs the committed +`.skillrig/skills-lock.json`). It aggregates **all** findings in one run (never stops at the +first failure) and takes no arguments. Two checks: + +- **Label-honesty**: recompute each locked skill's git tree-SHA from its **committed** content + and compare to the lock. +- **Orphan / completeness**: the on-disk skill set under `.agents/skills` must equal the locked + set — an unrecorded skill (`orphan`) or a recorded-but-absent one (`missing`) fails. 
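The orphan / completeness check described above is essentially a set comparison between the skill names on disk and the names in the lock. A minimal Go sketch follows, assuming both sets are already available as plain string slices; `completeness` is a hypothetical helper for illustration and is not part of `skillcore`.

```go
package main

import (
	"fmt"
	"sort"
)

// completeness compares the skill names found on disk with the names recorded
// in the lock. It returns the names that are on disk but unlocked (orphans)
// and the names that are locked but absent from disk (missing).
func completeness(onDisk, locked []string) (orphans, missing []string) {
	lockedSet := make(map[string]bool, len(locked))
	for _, name := range locked {
		lockedSet[name] = true
	}

	diskSet := make(map[string]bool, len(onDisk))
	for _, name := range onDisk {
		diskSet[name] = true
		if !lockedSet[name] {
			orphans = append(orphans, name)
		}
	}

	for _, name := range locked {
		if !diskSet[name] {
			missing = append(missing, name)
		}
	}

	// Sort so the report is deterministic regardless of input order.
	sort.Strings(orphans)
	sort.Strings(missing)

	return orphans, missing
}

func main() {
	onDisk := []string{"terraform-plan-review", "unrecorded-skill"}
	locked := []string{"terraform-plan-review", "deleted-skill"}

	orphans, missing := completeness(onDisk, locked)
	fmt.Println("orphans:", orphans)
	fmt.Println("missing:", missing)
}
```

Either slice being non-empty would correspond to a failing run; when both are empty, only the per-skill label-honesty comparison remains to decide the outcome.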
+ +## Per-skill verdicts (the `status` field) + +| Status | Meaning | Fix | +|--------|---------|-----| +| `ok` | committed content matches the recorded fingerprint | — | +| `mismatch` | committed content differs from the record (label-honesty failure) | re-`add` from origin, or restore the approved content | +| `orphan` | on disk but no lock entry (untracked — the primary supply-chain risk) | `skillrig add` it, or remove it | +| `missing` | lock entry whose files are absent | restore the files, or remove the lock entry | +| `dirty` | locked + present but **uncommitted / locally modified** | commit it (verify checks committed content) — *distinct* from `mismatch` | + +## CRITICAL: verify is integrity-only — a missing backing tool is NOT a failure + +`verify` does **no** prerequisite/eligibility check. A skill may declare `[[requires]]` backing +tools in its `skill.toml`; if those tools are absent in the environment, `verify` **still +passes** (it checks content, not runnability). Prerequisite checking is a future `doctor` +concern (the reserved exit `3`), never emitted here. Don't tell a user that verify failed +because a tool isn't installed — that's never the cause. + +## Exit codes (load-bearing — branch on these in CI/agents) + +| Code | When | +|------|------| +| `0` | All verdicts `ok` (**including** the empty case: no skills / no lock → clean pass) | +| `1` | Usage/config: malformed or unreadable lock, bad args, **not inside a git repo** | +| `2` | Verification failure: any `mismatch`, `orphan`, `missing`, or `dirty` | +| `3` | **Never emitted** — reserved for `doctor`'s prerequisite class | + +A malformed lock is a **`1`**, not a `2` — keep that distinction when scripting (`2` = content +drifted; `1` = couldn't even run the check). + +`verify --json` keys: `ok, counts{verified,mismatch,orphan,missing,dirty}, verdicts[]` with each +verdict carrying `name, path, status, expectedTreeSha, actualTreeSha, reason`. 
Diagnostics go to +stderr, so `skillrig verify --json 2>/dev/null | jq .` stays clean JSON. + +## Workflow patterns + +1. **CI merge gate** (headline): run `skillrig verify` (or `--json` for an agent); exit `0` + proceeds, `2` blocks with a per-skill report, `1` is a setup/config problem. +2. **Triage a failure**: `skillrig verify --json | jq '.verdicts[] | select(.status != "ok")'`. +3. **`dirty`?** commit the vendored files (verify checks committed content), then re-verify. +4. **`mismatch`?** the committed content no longer matches its recorded version — re-`add + --force` to restore, or investigate the change. (See [add.md](add.md).) + +## Error handling + +| Symptom (stderr) | Cause | Fix | +|------------------|-------|-----| +| `cannot read .skillrig/skills-lock.json` | malformed/unreadable lock (exit `1`, **not** `2`) | check/repair the file, or re-vendor with `skillrig add` | +| `not a git repository` | run outside a repo | run inside the repo | +| bad/extra args | `verify` takes none | the error states what/why/fix; run `skillrig verify` (add `--json`) | + +All failures state what/why/fix; `--verbose` shows the raw cause. Human output is compact (a +summary line per finding + a footer hint); use `--json` only when a program/agent parses it. diff --git a/.specledger/memory/constitution.md b/.specledger/memory/constitution.md index bf38db3..0f852fd 100644 --- a/.specledger/memory/constitution.md +++ b/.specledger/memory/constitution.md @@ -133,6 +133,17 @@ Use the `skill-creator` skill at `.claude/skills/skill-creator/` to test trigger accuracy and run evals (`scripts/run_eval.py`) after changes. A feature is not complete until its skill coverage is verified. 
+**One consolidated skill, not one-per-command.** There is a SINGLE user-facing +agent skill for the whole CLI — `.agents/skills/skillrig/` — with a short root +`SKILL.md` (what it is, when to use it, the `init`→`add`→`verify` workflow, the +origin precondition) that routes to per-activity detail under `references/` +(one file per command/activity, e.g. `init.md`/`add.md`/`verify.md`). A new +command extends this skill (a new `references/.md` + the root's routing table +and description keywords), it does NOT spawn a new top-level `skillrig-` +skill. Progressive disclosure (skill-creator's domain-organization pattern) keeps +the root grokkable while the references carry depth; splitting per command +fragments triggering and duplicates the shared workflow/precondition guidance. + This is doubly load-bearing here: **undertriggering** — an agent failing to invoke a skill when it should — is a documented failure mode that skillrig itself exists to fight (it is the rationale behind the origin's `lint` conformance gate, cli.md / diff --git a/CLAUDE.md b/CLAUDE.md index 5d20ae4..4851af1 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -57,7 +57,7 @@ Scripts and agents branch on them, so meanings are fixed (`internal/cli/exit.go` - **Errors as navigation.** Every error states what failed, the *real* (never-swallowed) cause, and a suggested fix. `--verbose` is the escape hatch that prints the raw underlying cause — it must exist on every command. See `cli.md` Principle 2 and anti-patterns AP-03. - **Two-level output.** Human output is compact with a footer hint; `--json` is complete and untruncated. `--json`/`--verbose` are persistent root flags (`globalOpts`); mutating commands also take `--dry-run`, and `add`/`update` take `--force`. Tests must assert output *shape* (bounded line count for human, parseable + structurally complete for JSON), not just `Contains` (constitution §II). 
- **Classify every new command** into a `cli.md` pattern (Query / Vendor Mutation / Verification Gate / Environment / Global Management) and run the `docs/design/checklist-template.md` gate before merge. -- **Skill–CLI co-evolution (constitution IX).** Every CLI change ships a matching skill update with verified trigger accuracy. The relevant skill lives in `.agents/skills/skillrig-init/`; eval tooling is `.agents/skills/skill-creator/scripts/run_eval.py` (note: the constitution's `scripts/run_eval.py` path is stale). Per global instructions, run skill evals with `model: "sonnet"`. +- **Skill–CLI co-evolution (constitution IX).** Every CLI change ships a matching skill update with verified trigger accuracy. There is **one consolidated skill** for the whole CLI at `.agents/skills/skillrig/` — a short root `SKILL.md` that routes to per-activity detail in `references/` (`init.md`/`add.md`/`verify.md`). A new command **extends** this skill (add a `references/.md` + update the root's routing table + description keywords); do **not** create a new top-level `skillrig-` skill. Eval tooling is `.agents/skills/skill-creator/scripts/run_eval.py` (note: the constitution's `scripts/run_eval.py` path is stale). Per global instructions, run skill evals with `model: "sonnet"`. ## Workflow & tracking From 76b560efcc3169321d2fcb6dc6b7f8e631632956 Mon Sep 17 00:00:00 2001 From: Vincent De Smet Date: Sat, 30 May 2026 15:59:07 +0800 Subject: [PATCH 8/8] =?UTF-8?q?fix(002):=20address=20Qodo=20PR=20review=20?= =?UTF-8?q?=E2=80=94=20security=20+=20correctness=20hardening?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Resolve the bugs and one rule violation from the automated Qodo review on PR #5 (see specledger/002-skillcore-verify/reviews/002-review.md): - Path traversal (#5): validate the skill name as a single safe path segment before any FS op, so `add ../x` can't escape .agents/skills/ or os.RemoveAll an arbitrary dir. 
New *InvalidSkillNameError + tests. - Symlinks (#6): reject any symlink in the origin skill subtree (would let copy/compare follow outside it and break byte-identical/git-canonical vendoring). New *SymlinkUnsupportedError + test; policy noted in cli.md (preserve-as-symlink is a future relaxation). - Verify error class (#8): pathInHead now propagates a *GitError only when git cannot run or "not a git repository"; every other rev-parse failure (absent path, unborn HEAD) stays "not in tree" — honoring the Verify SDK contract without breaking the dirty/missing verdicts. Tests for both. - rev-parse option injection (#7): refuse a revision beginning with '-' (git rev-parse echoes --/--end-of-options, so a guard is the right fix). Test. - %q rule (#1): quote path strings in mapAddError's user-facing messages. Declined: verify-report-on-stdout (the report is data; exit code is the signal; contract requires `verify --json 2>/dev/null | jq`) and the //go:build integration tag (project separates integration by ./test/ dir) — PR replies posted. Skipped: unchecked strings.Builder writes (never error). Gate: golangci-lint 0 issues; go test -cover -count=1 ./... green (skillcore 80.7%, internal/cli 51.6%). Co-Authored-By: Claude Opus 4.8 (1M context) --- docs/design/cli.md | 2 +- internal/cli/add.go | 28 +++++++-- pkg/skillcore/add.go | 63 ++++++++++++++++++- pkg/skillcore/add_test.go | 59 +++++++++++++++++ pkg/skillcore/errors.go | 27 ++++++++ pkg/skillcore/git.go | 18 +++++- pkg/skillcore/treesha_test.go | 22 +++++++ pkg/skillcore/verify.go | 28 ++++++--- pkg/skillcore/verify_test.go | 55 ++++++++++++++++ .../reviews/002-review.md | 19 ++++++ test/skillcore_quickstart_test.go | 2 +- 11 files changed, 307 insertions(+), 16 deletions(-) diff --git a/docs/design/cli.md b/docs/design/cli.md index 22228a4..0971592 100644 --- a/docs/design/cli.md +++ b/docs/design/cli.md @@ -263,7 +263,7 @@ Every `skillrig` subcommand MUST identify which pattern(s) it follows. 
This clas | Pattern | Purpose | Examples | Constraints | |---------|---------|----------|-------------| | **Query** | Deterministic read of the discovery artifact | `search` | Offline. Reads committed `index.json`. Deterministic tag filtering — **no inference** (N6). | -| **Vendor Mutation** | Write skill tree + lock entry | `add` *(implemented)*, `bump --pr` | Writes lock via `skillcore` only. Supports `--dry-run`; refuses to clobber content that diverges from the locked `treeSha` without `--force`. `bump` *proposes* (opens a PR), never force-adopts (R13). MUST never silently discard local edits (R32). | +| **Vendor Mutation** | Write skill tree + lock entry | `add` *(implemented)*, `bump --pr` | Writes lock via `skillcore` only. Supports `--dry-run`; refuses to clobber content that diverges from the locked `treeSha` without `--force`. `bump` *proposes* (opens a PR), never force-adopts (R13). MUST never silently discard local edits (R32). Vendors byte-identical + mode-preserving; the skill name MUST be a single path segment (no traversal). **Symlinks in a skill subtree are rejected this slice** — following them would break byte-identical / git-canonical vendoring (git records a symlink as a link, not its target); preserving symlinks faithfully is a future relaxation. | | **Verification Gate** | Offline integrity / prereq / conformance | `verify` *(implemented — integrity-only)*, `lint` | MUST be offline + deterministic. Exit-code driven. **No live/online signal in this path** (R11/N1). `verify` = consumer CI gate; `lint` = author CI gate on the origin. As implemented, `verify` is **integrity-only** (label-honesty + orphan detection, exit 2); prerequisite/eligibility checks (a missing `[[requires]]` tool → exit 3) belong to the future `doctor`, so `verify` does not emit exit 3 today. | | **Environment** | Health, auth, config, bootstrap | `doctor`, `init` | MUST be idempotent. `doctor` checks prerequisite auth (R18); works without a fully-configured project. 
`init` is **consumer-side only** — binds to an *existing* origin, never bootstraps one (architecture §2d). | | **Global Management** | Fetch/restore user-scope skills | `global add`, `global verify` | Genuinely *fetches and materializes* (the restore mode project scope doesn't need, §3). Touches per-environment home dirs, never the repo's project lock (R8). | diff --git a/internal/cli/add.go b/internal/cli/add.go index 1d9b11b..5aca3cd 100644 --- a/internal/cli/add.go +++ b/internal/cli/add.go @@ -168,12 +168,32 @@ func usageNoOriginConfigured() *UsageError { // values (exit 1), authoring the what/why/fix prose while preserving the raw // cause for --verbose. An unexpected error is wrapped generically. func mapAddError(skill string, err error) error { + var invalidName *skillcore.InvalidSkillNameError + if errors.As(err, &invalidName) { + return &UsageError{ + Msg: fmt.Sprintf("invalid skill name %q\n", invalidName.Skill) + + "why: a skill name must be a single path segment (no '/' or '..') so it stays inside .agents/skills/\n" + + "fix: pass just the skill's directory name, e.g. 
skillrig add terraform-plan-review", + Cause: err, + } + } + + var symlink *skillcore.SymlinkUnsupportedError + if errors.As(err, &symlink) { + return &UsageError{ + Msg: fmt.Sprintf("cannot vendor %q: it contains a symlink (%q)\n", skill, symlink.Path) + + "why: symlinks are not supported in vendored skills this release — following them would break byte-identical, git-canonical vendoring\n" + + "fix: remove the symlink in the origin skill, or vendor a skill without symlinks", + Cause: err, + } + } + + var originMissing *skillcore.OriginNotFoundError if errors.As(err, &originMissing) { return &UsageError{ - Msg: fmt.Sprintf("origin checkout not found at %s\n", originMissing.OriginDir) + + Msg: fmt.Sprintf("origin checkout not found at %q\n", originMissing.OriginDir) + "why: this release reads the configured origin from a local checkout at that path, and it is absent\n" + - "fix: check out the origin there (git clone " + originMissing.OriginDir + "), or re-bind with skillrig init --origin OWNER/REPO", + fmt.Sprintf("fix: check out the origin there (git clone %q), or re-bind with skillrig init --origin OWNER/REPO", originMissing.OriginDir), Cause: err, } } @@ -182,7 +202,7 @@ func mapAddError(skill string, err error) error { if errors.As(err, &notFound) { return &UsageError{ Msg: fmt.Sprintf("skill %q not found in origin\n", skill) + - "why: no skills/" + skill + "/ at the configured origin\n" + + fmt.Sprintf("why: no skills/%s/ at the configured origin\n", skill) + "fix: check the skill name against the origin", Cause: err, } @@ -191,7 +211,7 @@ func mapAddError(skill string, err error) error { var overwrite *skillcore.OverwriteError if errors.As(err, &overwrite) { return &UsageError{ - Msg: fmt.Sprintf("refusing to overwrite %s\n", overwrite.Path) + + Msg: fmt.Sprintf("refusing to overwrite %q\n", overwrite.Path) + "why: on-disk content diverges from the recorded fingerprint\n" + "fix: re-run with --force, or revert local edits", Cause: err, diff --git
a/pkg/skillcore/add.go b/pkg/skillcore/add.go index 0524a94..2993eab 100644 --- a/pkg/skillcore/add.go +++ b/pkg/skillcore/add.go @@ -7,6 +7,7 @@ import ( "io/fs" "os" "path/filepath" + "strings" ) // Action is the outcome of an Add: how the vendored tree changed. @@ -82,7 +83,7 @@ func (e *OverwriteError) Error() string { // overwrite unless opts.Force, writes nothing when opts.DryRun, and is // idempotent on identical content. func Add(opts AddOptions) (AddResult, error) { - srcDir, err := locateSkillSource(opts) + srcDir, err := prepareSource(opts) if err != nil { return AddResult{}, err } @@ -143,6 +144,66 @@ func Add(opts AddOptions) (AddResult, error) { return result, nil } +// prepareSource validates the skill name, locates the origin skill subtree, and +// rejects any symlink within it — all the safety pre-flight before Add touches +// the filesystem. opts.Skill is used as a path segment for the source, the +// destination, and os.RemoveAll on overwrite, so a traversal name (e.g. "../x") +// must be refused here, before any copy or delete can escape the canonical +// subtree; a symlink would let copy/compare follow it outside the subtree and +// break byte-identical/git-canonical vendoring. +func prepareSource(opts AddOptions) (string, error) { + if err := validateSkillName(opts.Skill); err != nil { + return "", err + } + + srcDir, err := locateSkillSource(opts) + if err != nil { + return "", err + } + + if err := ensureNoSymlinks(srcDir); err != nil { + return "", err + } + + return srcDir, nil +} + +// validateSkillName rejects any skill name that is not a single safe path +// segment, so it can never escape the canonical .agents/skills/ subtree +// (path-traversal hardening): non-empty, not "." or "..", no path separator +// (`/` or `\`), and equal to its own filepath.Base. +func validateSkillName(name string) error { + if name == "" || name == "." || name == ".." 
|| + strings.ContainsAny(name, `/\`) || filepath.Base(name) != name { + return &InvalidSkillNameError{Skill: name} + } + + return nil +} + +// ensureNoSymlinks walks srcDir and returns a *SymlinkUnsupportedError on the +// first symlink found. WalkDir yields a symlink as a non-directory entry and does +// not descend into a symlinked directory, so d.Type()&fs.ModeSymlink detects both +// file and directory symlinks before any copy/compare can follow them. +func ensureNoSymlinks(srcDir string) error { + return filepath.WalkDir(srcDir, func(path string, d fs.DirEntry, err error) error { + if err != nil { + return err + } + + if d.Type()&fs.ModeSymlink != 0 { + rel, relErr := filepath.Rel(srcDir, path) + if relErr != nil { + rel = path + } + + return &SymlinkUnsupportedError{Path: rel} + } + + return nil + }) +} + // locateSkillSource resolves and validates the origin's skill subtree, returning // its directory. It distinguishes a missing origin checkout // (*OriginNotFoundError — the library isn't checked out at OriginDir) from a diff --git a/pkg/skillcore/add_test.go b/pkg/skillcore/add_test.go index 3fa856d..56bf86f 100644 --- a/pkg/skillcore/add_test.go +++ b/pkg/skillcore/add_test.go @@ -250,6 +250,65 @@ func TestAdd_SkillNotFound(t *testing.T) { } } +// TestAdd_RejectsTraversalSkillName guards the path-traversal hardening (Qodo +// #5): a skill name that is not a single safe path segment is refused with +// *InvalidSkillNameError BEFORE any filesystem op, so it can never escape +// .agents/skills/ (and os.RemoveAll can never hit an arbitrary dir). 
+func TestAdd_RejectsTraversalSkillName(t *testing.T) {
+	t.Parallel()
+
+	originDir, _ := bootstrapOrigin(t)
+	consumer := newConsumer(t)
+
+	for _, bad := range []string{"", ".", "..", "../evil", "a/b", "a/../b", "/abs", `a\b`} {
+		_, err := Add(addOpts(originDir, bad, consumer, false))
+
+		var inv *InvalidSkillNameError
+		if !errors.As(err, &inv) {
+			t.Errorf("Add(%q): error = %T (%v), want *InvalidSkillNameError", bad, err, err)
+		}
+	}
+
+	// Nothing was written anywhere under the consumer (no escape, no partial vendor).
+	if _, err := os.Stat(filepath.Join(consumer, ".agents")); !os.IsNotExist(err) {
+		t.Error("a traversal-name add created .agents/; want nothing written")
+	}
+}
+
+// TestAdd_RejectsSymlinkInOrigin guards the symlink hardening (Qodo #6): an
+// origin skill containing a symlink is refused with *SymlinkUnsupportedError, and
+// nothing is vendored.
+func TestAdd_RejectsSymlinkInOrigin(t *testing.T) {
+	t.Parallel()
+
+	originDir := t.TempDir()
+	runGit(t, originDir, "init", "-q")
+
+	const skill = "with-symlink"
+	writeFile(t, originDir, filepath.Join("skills", skill, "SKILL.md"), 0o644, sampleSkillMd)
+	writeFile(t, originDir, filepath.Join("skills", skill, "skill.toml"), 0o644, sampleManifest)
+
+	if err := os.Symlink("SKILL.md", filepath.Join(originDir, "skills", skill, "link")); err != nil {
+		t.Fatalf("symlink: %v", err)
+	}
+
+	runGit(t, originDir, "add", "-A")
+	runGit(t, originDir, "commit", "-q", "-m", "seed with symlink")
+
+	consumer := newConsumer(t)
+
+	_, err := Add(addOpts(originDir, skill, consumer, false))
+
+	var se *SymlinkUnsupportedError
+	if !errors.As(err, &se) {
+		t.Fatalf("Add error = %T (%v), want *SymlinkUnsupportedError", err, err)
+	}
+
+	if _, statErr := os.Stat(filepath.Join(consumer, ".agents", "skills", skill)); !os.IsNotExist(statErr) {
+		t.Error("symlink-containing skill was partially vendored; want nothing written")
+	}
+}
+
 // TestAdd_DryRunWritesNothing asserts --dry-run reports the action but leaves no
 // vendored files and no lock on disk.
 func TestAdd_DryRunWritesNothing(t *testing.T) {
diff --git a/pkg/skillcore/errors.go b/pkg/skillcore/errors.go
index 35037e6..1688b7c 100644
--- a/pkg/skillcore/errors.go
+++ b/pkg/skillcore/errors.go
@@ -53,6 +53,33 @@ func (e *OriginNotFoundError) Error() string {
 	return fmt.Sprintf("origin checkout not found at %q", e.OriginDir)
 }
 
+// InvalidSkillNameError is returned when a skill name is not a single safe path
+// segment — it is empty, "."/"..", or contains a path separator (so it could
+// escape the canonical .agents/skills/ subtree). Add validates the name
+// before any filesystem operation, so a traversal name (e.g. "../x") is refused
+// before any copy or os.RemoveAll. Presentation-free: terse Error.
+type InvalidSkillNameError struct {
+	Skill string
+}
+
+func (e *InvalidSkillNameError) Error() string {
+	return fmt.Sprintf("invalid skill name %q", e.Skill)
+}
+
+// SymlinkUnsupportedError is returned when the origin skill subtree contains a
+// symlink. Following symlinks could read content outside the subtree and would
+// break the byte-identical / git-canonical vendoring guarantee (git records a
+// symlink as a link, not its target's content), so this slice refuses them
+// outright. Path is the offending symlink, relative to the skill dir.
+// Presentation-free: terse Error.
+type SymlinkUnsupportedError struct {
+	Path string
+}
+
+func (e *SymlinkUnsupportedError) Error() string {
+	return fmt.Sprintf("symlink not supported in vendored skill: %q", e.Path)
+}
+
 // GitError is returned when a git invocation fails. It carries the process exit
 // code and captured stderr, mirroring the gh/git client pattern, so the caller
 // can render an environment error. It is presentation-free.
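Reviewer note: the single-safe-segment rule documented on InvalidSkillNameError can be sketched roughly as below. This is a minimal illustration only, not the project's validateSkillName (which is not shown in this hunk). The helper name isSafeSkillName is hypothetical, and rejecting both '/' and '\' on every OS is an assumption inferred from the inputs used in TestAdd_RejectsTraversalSkillName.

```go
package main

import (
	"fmt"
	"strings"
)

// isSafeSkillName is a hypothetical helper (an assumption for illustration,
// not the project's validateSkillName). It reports whether name is a single
// path segment that cannot escape a parent directory when joined onto it.
func isSafeSkillName(name string) bool {
	// Empty, current-dir, and parent-dir names are never valid segments.
	if name == "" || name == "." || name == ".." {
		return false
	}
	// Reject both separators regardless of host OS, so a name such as `a\b`
	// is refused on Unix as well as on Windows.
	return !strings.ContainsAny(name, `/\`)
}

func main() {
	names := []string{"", ".", "..", "../evil", "a/b", "a/../b", "/abs", `a\b`, "my-skill"}
	for _, name := range names {
		fmt.Printf("%q safe=%v\n", name, isSafeSkillName(name))
	}
}
```

Checking the name before any filesystem call is what keeps a later os.RemoveAll confined to the skills subtree; a check placed after path joining would have to reason about cleaned paths instead.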
diff --git a/pkg/skillcore/git.go b/pkg/skillcore/git.go
index 14a6c16..97a219c 100644
--- a/pkg/skillcore/git.go
+++ b/pkg/skillcore/git.go
@@ -54,9 +54,23 @@ func (c *gitClient) run(ctx context.Context, args ...string) (string, error) {
 	return strings.TrimSpace(stdout.String()), nil
 }
 
-// revParse runs `git -C rev-parse ` and returns the trimmed
-// output (e.g. a resolved commit or tree SHA).
+// revParse runs `git -C rev-parse ` and returns the trimmed output
+// (e.g. a resolved commit or tree SHA).
+//
+// A rev never legitimately begins with '-', so one that does is refused up front
+// rather than passed to git — where a leading-dash rev (e.g. from an origin ref
+// like "-h", which the shape-only origin validation permits) would be parsed as
+// an option. git rev-parse cannot be made safe here with `--`/`--end-of-options`
+// (it echoes those tokens to stdout instead of treating them as terminators), so
+// the guard is the correct fix for this option-injection vector (Qodo #7).
 func (c *gitClient) revParse(gitDir, rev string) (string, error) {
+	if strings.HasPrefix(rev, "-") {
+		return "", &GitError{
+			ExitCode: -1,
+			Stderr:   "refusing to use a revision that begins with '-': " + rev,
+		}
+	}
+
 	return c.run(context.Background(), "-C", gitDir, "rev-parse", rev)
 }
 
diff --git a/pkg/skillcore/treesha_test.go b/pkg/skillcore/treesha_test.go
index 63ebb44..f2ba247 100644
--- a/pkg/skillcore/treesha_test.go
+++ b/pkg/skillcore/treesha_test.go
@@ -5,9 +5,31 @@ import (
 	"errors"
 	"path/filepath"
 	"regexp"
+	"strings"
 	"testing"
 )
 
+// TestTreeSHA_DashRefNotTreatedAsOption guards Qodo #7: a ref beginning with '-'
+// (which the shape-only origin validation permits) must be refused by revParse's
+// leading-dash guard, never parsed as a git option, so the failure is a
+// *GitError carrying the refusal, not an "unknown option" error.
+func TestTreeSHA_DashRefNotTreatedAsOption(t *testing.T) {
+	t.Parallel()
+
+	originDir, skill := bootstrapOrigin(t)
+
+	_, err := TreeSHA(originDir, "-h", "skills/"+skill)
+
+	var gitErr *GitError
+	if !errors.As(err, &gitErr) {
+		t.Fatalf("TreeSHA(ref=-h) error = %T (%v), want *GitError", err, err)
+	}
+
+	if strings.Contains(strings.ToLower(gitErr.Stderr), "unknown option") {
+		t.Errorf("ref '-h' was parsed as a git option (stderr: %q); the leading-dash guard should prevent this", gitErr.Stderr)
+	}
+}
+
 // hex40 matches a git SHA-1 (40 lowercase hex chars).
 var hex40 = regexp.MustCompile(`^[0-9a-f]{40}$`)
 
diff --git a/pkg/skillcore/verify.go b/pkg/skillcore/verify.go
index 218bbe3..97f3333 100644
--- a/pkg/skillcore/verify.go
+++ b/pkg/skillcore/verify.go
@@ -289,9 +289,11 @@ func verifyOrphanSkill(repoRoot, path string) (Verdict, error) {
 	return verdict, nil
 }
 
-// pathInHead reports whether path resolves to a tree in HEAD. A missing tree is
-// not an error (false, nil); a real git failure (e.g. not a repo) propagates as
-// a *GitError so the CLI can surface a config/usage error.
+// pathInHead reports whether path resolves to a tree in HEAD. A path that git
+// cannot resolve in the committed tree (absent path, or an unborn HEAD with no
+// commits yet) is (false, nil); a fatal git failure (not a git repository, git
+// not on PATH) propagates as a *GitError so the SDK contract holds: Verify must
+// NOT downgrade it into a "missing"/"orphan" verdict (it returns the error).
 func pathInHead(repoRoot, path string) (bool, error) {
 	_, err := TreeSHA(repoRoot, "HEAD", path)
 	if err == nil {
@@ -299,16 +301,28 @@ func pathInHead(repoRoot, path string) (bool, error) {
 	}
 
 	var gitErr *GitError
-	if errors.As(err, &gitErr) && gitErr.ExitCode > 0 {
-		// A positive exit code from rev-parse means git ran but could not
-		// resolve HEAD: — the path is simply not in the committed tree.
+	if errors.As(err, &gitErr) && gitErr.ExitCode > 0 && !isFatalGitError(gitErr.Stderr) {
+		// git ran inside a repo but could not resolve HEAD: — the path is
+		// not in the committed tree. This covers an absent path ("does not exist
+		// in 'HEAD'") AND an unborn HEAD (no commits yet → "ambiguous argument
+		// 'HEAD'"/"bad revision"), both of which are "not committed", not failures.
 		return false, nil
 	}
 
-	// Exit code <= 0 means git could not run at all (not a repo / not on PATH).
+	// git could not run (ExitCode <= 0) or reported a fatal repo error (e.g. "not
+	// a git repository"): surface it as a *GitError so Verify honors its contract
+	// (a config/usage problem, not a missing/orphan verdict).
 	return false, err
 }
 
+// isFatalGitError reports whether stderr indicates git could not operate on a
+// repository at all (as opposed to merely failing to resolve a revision/path).
+// Only these are propagated; every other rev-parse failure means "not in the
+// committed tree".
+func isFatalGitError(stderr string) bool {
+	return strings.Contains(strings.ToLower(stderr), "not a git repository")
+}
+
 // pathDirty reports whether the working tree for path has uncommitted changes,
 // via git status --porcelain. Non-empty output means dirty.
 func pathDirty(repoRoot, path string) (bool, error) {
diff --git a/pkg/skillcore/verify_test.go b/pkg/skillcore/verify_test.go
index a86c174..5df5991 100644
--- a/pkg/skillcore/verify_test.go
+++ b/pkg/skillcore/verify_test.go
@@ -252,6 +252,61 @@ func TestVerify_AggregatesAllFindings(t *testing.T) {
 	}
 }
 
+// TestVerify_DirtyWhenUncommittedUnbornHead guards the #8/dirty interaction: a
+// vendored-but-never-committed skill in a repo with NO commits (unborn HEAD)
+// must be a `dirty` verdict, not a propagated git error — `git rev-parse HEAD:…`
+// fails with "bad revision"/"ambiguous argument", which is "not committed", not
+// a fatal repo failure.
+func TestVerify_DirtyWhenUncommittedUnbornHead(t *testing.T) {
+	t.Parallel()
+
+	repo := newConsumer(t) // git init, but NO commits → unborn HEAD
+
+	rel := skillsRoot + "/x"
+	writeFile(t, repo, filepath.Join(rel, "skill.toml"), 0o644, sampleManifest)
+	writeVerifyLock(t, repo, LockFile{
+		LockfileVersion: 1,
+		Skills:          map[string]LockEntry{"x": {TreeSha: "deadbeef", Path: rel}},
+	})
+
+	rep, err := Verify(repo)
+
+	var vf *VerifyFailure
+	if !errors.As(err, &vf) {
+		t.Fatalf("Verify error = %T (%v), want *VerifyFailure", err, err)
+	}
+
+	if v := findVerdict(t, rep, "x"); v.Status != StatusDirty {
+		t.Errorf("status = %q, want dirty (uncommitted on an unborn HEAD)", v.Status)
+	}
+}
+
+// TestVerify_PropagatesFatalGitError guards Qodo #8: a fatal git failure (here a
+// non-git directory) must surface as a *GitError (config/usage), NOT be
+// downgraded to a missing/orphan verdict — honoring the Verify SDK contract for
+// callers that don't pre-validate repoRoot.
+func TestVerify_PropagatesFatalGitError(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir() // a tmpdir, NOT a git repo
+	writeVerifyLock(t, dir, LockFile{
+		LockfileVersion: 1,
+		Skills:          map[string]LockEntry{"x": {TreeSha: "deadbeef", Path: skillsRoot + "/x"}},
+	})
+
+	_, err := Verify(dir)
+
+	var gitErr *GitError
+	if !errors.As(err, &gitErr) {
+		t.Fatalf("Verify(non-repo) error = %T (%v), want *GitError", err, err)
+	}
+
+	var vf *VerifyFailure
+	if errors.As(err, &vf) {
+		t.Error("a fatal git failure must NOT be reported as a *VerifyFailure (it is exit 1, not 2)")
+	}
+}
+
 // TestVerify_MalformedLockVersion: an unsupported lockfileVersion is a *LockError
 // (config/usage, exit 1), NOT a *VerifyFailure (exit 2).
 func TestVerify_MalformedLockVersion(t *testing.T) {
diff --git a/specledger/002-skillcore-verify/reviews/002-review.md b/specledger/002-skillcore-verify/reviews/002-review.md
index ad41df5..504f74d 100644
--- a/specledger/002-skillcore-verify/reviews/002-review.md
+++ b/specledger/002-skillcore-verify/reviews/002-review.md
@@ -139,3 +139,22 @@ All Review #2 findings **resolved** (AR-1 included per user decision). Gate afte
 | **AR-5** | LOW | ⏭️ Deferred | Stale `data-model.md` sample SHA — illustrative only (tests recompute via raw git); left as a cosmetic cleanup. |
 
 **Tooling:** `specledger.checkpoint-workflow` review-agent template now instructs loading `agentic-go-cli-design` + `golang-code-style`/`golang-testing`/`golang-lint` so future reviews judge against the same standards.
+
+---
+
+# Qodo Bot Review (PR #5) — 2026-05-30
+
+Automated Qodo review on the PR: **4 bugs + 4 rule violations**. Triaged against the codebase.
+
+| # | Finding | Disposition |
+|---|---------|-------------|
+| **5** | Skill-name **path traversal** (`opts.Skill` unvalidated → escape `.agents/skills/`, `os.RemoveAll` arbitrary dir) | ✅ **Fixed** — `validateSkillName` (single safe segment) gates `Add` before any FS op; `*InvalidSkillNameError` + tests (`../x`, `a/b`, `..`, abs, `\`). |
+| **6** | **Symlink following** in copy/compare (read outside subtree; breaks git-canonical) | ✅ **Fixed** — `ensureNoSymlinks` rejects any symlink in the origin subtree; `*SymlinkUnsupportedError` + test. **Policy:** rejected in this slice (noted in `docs/design/cli.md`; preserve-as-symlink is a future relaxation). |
+| **8** | **Verify masks git failures** (`pathInHead` treated any exit>0 as "missing") | ✅ **Fixed** — propagate as `*GitError` only when git couldn't run or reported "not a git repository"; every other rev-parse failure (absent path, unborn HEAD) stays "not in tree". Tests for both the fatal-error and unborn-HEAD-dirty cases. |
+| **7** | **rev-parse leading-dash** option injection (ref `-h`) | ✅ **Fixed** — `revParse` refuses a rev beginning with `-` (git rev-parse echoes `--`/`--end-of-options`, so a guard is the correct fix, not a terminator). Test asserts no "unknown option". |
+| **1** | `%s` vs `%q` for paths in `mapAddError` (rule 782577) | ✅ **Fixed** — `%q` for `OriginDir`/`Path`; updated the affected integration assertion. |
+| **3** | Unchecked `Fprintf`→`strings.Builder` (rule 782713) | ⏭️ **Skipped** — `strings.Builder` never errors; the project's own errcheck does not flag it. Harmless. |
+| **4** | Missing `//go:build integration` (rule 782685) | ❌ **Declined** — conflicts with the project convention (integration separated by the `./test/` dir; 001 has no tag; Makefile uses `go test ./test/...`). Reply posted on the PR. |
+| **2** | verify report on stdout (rule 783453) | ❌ **Declined (by design)** — the verdict report **is** the data; the exit code is the error signal (cf. `grep`/`diff`/`go test`); the contract requires `verify --json 2>/dev/null \| jq` to work. Reply posted on the PR. |
+
+Gate after the round: `golangci-lint` 0 issues; `go test -cover -count=1 ./...` green (skillcore **80.7%**, internal/cli **51.6%**). Note: `make check`'s `go test ./...` can cache a **stale** `test/` result (integration tests exec a separately-built binary the cache doesn't track) — validate the integration tier with `-count=1`.
diff --git a/test/skillcore_quickstart_test.go b/test/skillcore_quickstart_test.go
index 0790494..461ce02 100644
--- a/test/skillcore_quickstart_test.go
+++ b/test/skillcore_quickstart_test.go
@@ -486,7 +486,7 @@ func TestQuickstart_AddRefusesDivergentWithoutForce(t *testing.T) {
 	}
 
 	// Three distinct parts: what / why / fix.
-	assertContains(t, "what", res.stderr, "refusing to overwrite "+vendoredPath)
+	assertContains(t, "what", res.stderr, "refusing to overwrite \""+vendoredPath+"\"")
 	assertContains(t, "why", res.stderr, "on-disk content diverges from the recorded fingerprint")
 	assertContains(t, "fix", res.stderr, "--force")